Illumina top strand genotyping data

Because of the way Illumina designates strands for its genotypes (https://www.illumina.com/documents/products/technotes/technote_topbot.pdf), a top strand file can contain only the following allele pairs:

  • A/G
  • A/C
  • A/T
  • C/G

So one can check whether a file is top strand by tabulating how often each allele pair occurs. If only these four pairs exist, then it is probably top strand. If all 12 possible ordered pairs of distinct alleles (4 x 4 – 4) are seen, then it is probably not top strand.
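
The check described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function names (count_allele_pairs, looks_like_top_strand) and the input shape (an iterable of ordered (allele A, allele B) tuples, one per SNP) are assumptions made for the example, and the step of parsing a real genotype file is left out because it depends on the file layout.

```python
from collections import Counter

# The four allele pairs listed above for a top strand file.
TOP_PAIRS = {("A", "G"), ("A", "C"), ("A", "T"), ("C", "G")}
VALID_BASES = {"A", "C", "G", "T"}


def count_allele_pairs(pairs):
    """Count ordered (allele_a, allele_b) pairs of distinct bases.

    Pairs with identical alleles and anything that is not A/C/G/T
    (for example missing-call symbols) are skipped.
    """
    counts = Counter()
    for a, b in pairs:
        a, b = a.upper(), b.upper()
        if a == b or a not in VALID_BASES or b not in VALID_BASES:
            continue
        counts[(a, b)] += 1
    return counts


def looks_like_top_strand(pairs):
    """Return True if every observed pair is one of the four top strand pairs,
    False if any other pair is seen, and None if there is nothing to judge."""
    counts = count_allele_pairs(pairs)
    if not counts:
        return None
    return set(counts) <= TOP_PAIRS


# Hypothetical input: the allele pairs of three SNPs.
print(looks_like_top_strand([("A", "G"), ("C", "G"), ("A", "C")]))
```

As in the text, the result is only a heuristic: a True answer means the file is probably top strand, not that it certainly is. Inspecting the counts returned by count_allele_pairs shows which off-list pairs occur and how often.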