Introduction

In genetic association studies that involve extensive multiple testing, association signals below the significance threshold often explain more trait variation, in aggregate, than the signals that reach significance. Accurately quantifying these sub-threshold effects is crucial but difficult because of a selection bias known as the “Winner’s Curse”: the effects of the strongest signals tend to be overestimated. In previous work (Bigdeli et al., 2016), we developed FIQT (FDR Inverse Quantile Transformation), a simple technique that corrects for this bias. FIQT first applies a False Discovery Rate (FDR) adjustment to the SNP association p-values to account for multiple testing. It then estimates the Z-score noncentrality parameters as the Gaussian quantiles corresponding to the adjusted p-values, keeping the sign of each original Z-score.

FIQT is available in the GAUSS package as the fiqt() function. Given a vector of Z-scores and the smallest non-zero p-value that qnorm() can handle (default 10^-320), it returns a vector of Z-scores adjusted for the winner’s curse.
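The steps described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the source of gauss::fiqt(): the function name fiqt_sketch, the argument name min_p, and the toy Z-scores are assumptions made for this example, and the packaged function may treat extreme values differently.

```r
# Hypothetical helper illustrating the FIQT steps; NOT the gauss::fiqt() source.
fiqt_sketch <- function(z, min_p = 1e-320) {
  # Two-sided p-values from the observed Z-scores
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  # Floor the p-values so that qnorm() never receives an exact zero
  p <- pmax(p, min_p)
  # Multiple-testing adjustment (Benjamini-Hochberg FDR)
  p_adj <- p.adjust(p, method = "fdr")
  # Map adjusted p-values back to Gaussian quantiles, keeping the original sign
  sign(z) * qnorm(p_adj / 2, lower.tail = FALSE)
}

# Toy Z-scores, invented for illustration only
z <- c(4.5, -3.2, 1.9, -0.7, 0.2)
z_adj <- fiqt_sketch(z)
data.frame(z = z, z_adj = z_adj)
```

Because an FDR-adjusted p-value is never smaller than the raw p-value, each adjusted Z-score in this sketch is no larger in magnitude than the original one.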

Load necessary packages

library(gauss)
#> 
#> Attaching package: 'gauss'
#> The following object is masked from 'package:stats':
#> 
#>     dist
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
#>  ggplot2 3.4.1      purrr   0.3.4
#>  tibble  3.1.8      dplyr   1.0.9
#>  tidyr   1.2.0      stringr 1.4.0
#>  readr   2.1.2      forcats 0.5.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#>  dplyr::filter() masks stats::filter()
#>  dplyr::lag()    masks stats::lag()
library(data.table)
#> 
#> Attaching package: 'data.table'
#> 
#> The following objects are masked from 'package:dplyr':
#> 
#>     between, first, last
#> 
#> The following object is masked from 'package:purrr':
#> 
#>     transpose
library(kableExtra)
#> 
#> Attaching package: 'kableExtra'
#> 
#> The following object is masked from 'package:dplyr':
#> 
#>     group_rows

Example

In this example, we’ll correct for the “Winner’s Curse” bias using the fiqt() function applied to summary statistics from the Psychiatric Genomics Consortium Schizophrenia Phase 3 study. For demonstration purposes, we have limited our analysis to SNPs from the Illumina 1M Chip only.

input_file <- "../data/PGC3_SCZ_ilmn1M_Z.txt"
pgc3 <- fread(input_file)
head(pgc3)
#>          rsid chr        bp a1 a2          z
#> 1:  rs1000000  12 126890980  G  A  1.9203001
#> 2: rs10000006   4 108826383  T  C -2.3233567
#> 3:  rs1000002   3 183635768  C  T  1.2342762
#> 4: rs10000021   4 159441457  G  T -0.7111359
#> 5: rs10000023   4  95733906  G  T -2.0628172
#> 6:  rs1000003   3  98342907  A  G -1.0814432

Running fiqt()

pgc3$z.wca <- fiqt(pgc3$z)

Results

head(pgc3) %>% kable("html")
rsid        chr         bp  a1  a2           z       z.wca
rs1000000    12  126890980  G   A    1.9203001   0.9662015
rs10000006    4  108826383  T   C   -2.3233567  -1.2816898
rs1000002     3  183635768  C   T    1.2342762   0.5112610
rs10000021    4  159441457  G   T   -0.7111359  -0.2436473
rs10000023    4   95733906  G   T   -2.0628172  -1.0743353
rs1000003     3   98342907  A   G   -1.0814432  -0.4263369
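One way to read these results is to compute how much each Z-score was shrunk. The sketch below copies the z and z.wca values from the six rows shown above into a small data frame; the column name shrinkage is introduced here for illustration and is not part of the GAUSS output.

```r
# Values copied from the six result rows shown above
results <- data.frame(
  rsid  = c("rs1000000", "rs10000006", "rs1000002",
            "rs10000021", "rs10000023", "rs1000003"),
  z     = c(1.9203001, -2.3233567, 1.2342762,
            -0.7111359, -2.0628172, -1.0814432),
  z.wca = c(0.9662015, -1.2816898, 0.5112610,
            -0.2436473, -1.0743353, -0.4263369)
)

# Ratio of adjusted to original Z-score (illustrative column, not from GAUSS):
# values between 0 and 1 mean the estimate was pulled toward zero
results$shrinkage <- results$z.wca / results$z

results[order(abs(results$z)), ]
```

In these six rows the ratio ranges from roughly 0.34 to 0.55, and every adjusted value keeps the sign of the original Z-score.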

Original Z-scores vs FIQT Z-score noncentrality estimates

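# Scatter plot of original vs. adjusted Z-scores
ggplot(pgc3, aes(x = z, y = z.wca)) +
  geom_point() +
  # Red identity line (y = x): the distance from it shows the shrinkage
  geom_abline(intercept = 0, slope = 1, color = "red") +
  labs(x = "Original Z-score",
       y = "Estimated Z-score noncentrality",
       caption = "Data: PGC SCZ3 GWAS") +
  theme_classic()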

Reference

  • Bigdeli et al. A simple yet accurate correction for winner’s curse can predict signals discovered in much larger genome scans. Bioinformatics. 2016 Sep 1;32(17):2598-603. doi: 10.1093/bioinformatics/btw303.