QCTOOL grows up

February 24, 2011

QCTOOL, the GWAS quality-control tool, has been gaining features steadily over the last few months.  Here are some of the new features:

  • Operate on several cohorts at once, treating them like one big cohort.  This works on the set of SNPs shared by all the cohorts, where SNPs are matched on the SNPID, rsid, position, and allele fields, or on a subset of these specified using the -snp-match-fields option;
  • Update SNP positions using the -translate-snp-positions option;
  • Apply strand alignment files using the -strand option;
  • Swap alleles where necessary to make the cohorts match, using the -match-alleles-to-cohort1 option;
  • Calculate pairwise relatedness Bayes factors between individuals using the -relatedness option.  The calculation follows E. A. Thompson, “The estimation of pairwise relationships”, Ann. Hum. Genet. (1975), but without the typo in the table of probabilities, and it deals correctly with uncertainty in genotypes.  This method uses allele frequencies estimated in the cohort, and although in principle it could be used for general relatedness QC, it is best suited for finding relatedness due to recent shared ancestry.  See Powell et al., “Reconciling the analysis of IBD and IBS in complex trait studies”, Nature Reviews Genetics (2010) for a good read about this and an alternative method;
  • Write PED files, instead of GEN files, using the -op option.  This preserves the pedigree structure from a pedigree read in with the -ip option.  It uses thresholded genotype calls.  I used this to make files suitable for input to QTDT; it’s not guaranteed they’d work with other programs.
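
To make the idea of matching SNPs across cohorts concrete, here is a minimal Python sketch of taking the overlap on a chosen set of fields.  This is not QCTOOL's code: the Snp record, the field names allele_a and allele_b, and the functions match_key and overlap are all made up for illustration, and the real option may spell its field names differently.

```python
from collections import namedtuple

# Hypothetical SNP record; the field names are assumptions for this sketch,
# not the names QCTOOL uses internally.
Snp = namedtuple("Snp", ["SNPID", "rsid", "position", "allele_a", "allele_b"])

ALL_FIELDS = ("SNPID", "rsid", "position", "allele_a", "allele_b")


def match_key(snp, fields):
    """Build the tuple of values that two SNPs must share to count as the same SNP."""
    return tuple(getattr(snp, field) for field in fields)


def overlap(cohorts, fields=ALL_FIELDS):
    """Return the SNPs of the first cohort that are present in every cohort.

    Two SNPs are considered the same if they agree on all of `fields`.
    The result keeps the order of the first cohort.
    """
    shared = None
    for cohort in cohorts:
        keys = {match_key(snp, fields) for snp in cohort}
        shared = keys if shared is None else shared & keys
    if shared is None:  # no cohorts were given
        return []
    return [snp for snp in cohorts[0] if match_key(snp, fields) in shared]


cohort1 = [
    Snp("SNP_1", "rs1", 100, "A", "G"),
    Snp("SNP_2", "rs2", 200, "C", "T"),
]
cohort2 = [
    Snp("SNP_1", "rs1", 100, "G", "A"),  # same SNP, alleles coded the other way round
    Snp("SNP_2", "rs2", 250, "C", "T"),  # same SNP, different position
]

strict = overlap([cohort1, cohort2])
by_rsid_and_position = overlap([cohort1, cohort2], fields=("rsid", "position"))
by_rsid = overlap([cohort1, cohort2], fields=("rsid",))
```

In this toy example nothing matches on the full set of fields, one SNP matches on rsid and position, and both match on rsid alone.  That is the sort of situation where matching on a subset of fields, and then fixing up positions or allele coding, becomes useful.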

These features are all in the development branch, hosted here.  (The released version is the original version, which concentrates on per-SNP and per-sample summary statistics.)  If you want to use these features, you’ll need to download the source code and build QCTOOL yourself.  Drop me an email and I’ll do my best to help.

