Winner's curse adjustment • gauss

Introduction

In genetic association studies that involve extensive multiple testing, association signals that fall below the significance threshold often account, as a group, for more trait variation than the signals that reach significance. Quantifying these sub-threshold effects accurately is important but difficult, because the estimates for the strongest signals tend to be inflated, a bias known as the “Winner’s Curse.”

In previous work, we developed FIQT (FDR Inverse Quantile Transformation), a simple method for correcting this bias. FIQT works in two steps. First, it applies a False Discovery Rate (FDR) adjustment to the SNP association p-values to account for multiple testing. Second, it converts each adjusted p-value back to the corresponding Gaussian quantile and gives that quantile the sign of the original Z-score, which yields the estimated Z-score noncentrality.

FIQT is available in the GAUSS package as the fiqt() function. Given a vector of Z-scores and the minimum non-zero p-value to use with qnorm() (default 10^-320), it returns a vector of Z-scores adjusted for winner’s curse bias.
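The two steps described above can be sketched in base R. This is a minimal illustration, not the gauss implementation: the function name fiqt_sketch, the argument name min_p, and the floor value 1e-300 are assumptions made for this example, and the p-values are assumed to be two-sided.

```r
# Hypothetical helper illustrating the FIQT idea; NOT the gauss::fiqt() source.
fiqt_sketch <- function(z, min_p = 1e-300) {
  # Two-sided p-values from the observed Z-scores (assumption: two-sided test)
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  # Floor the p-values so that qnorm() never has to invert an exact zero
  p <- pmax(p, min_p)
  # Step 1: FDR (Benjamini-Hochberg) adjustment for multiple testing
  p_adj <- p.adjust(p, method = "fdr")
  # Step 2: map adjusted p-values back to Gaussian quantiles, keeping the sign
  sign(z) * qnorm(p_adj / 2, lower.tail = FALSE)
}

# Small usage example with made-up Z-scores
z_example <- c(-4, -1, 0.5, 2, 6)
fiqt_sketch(z_example)
```

Because the FDR adjustment can only increase a p-value, each estimate from this sketch is at most as large in magnitude as the Z-score it came from.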

Load necessary packages

library(gauss)
#> 
#> Attaching package: 'gauss'
#> The following object is masked from 'package:stats':
#> 
#>     dist
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
#> ✔ ggplot2 3.4.1     ✔ purrr   0.3.4
#> ✔ tibble  3.1.8     ✔ dplyr   1.0.9
#> ✔ tidyr   1.2.0     ✔ stringr 1.4.0
#> ✔ readr   2.1.2     ✔ forcats 0.5.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()
library(data.table)
#> 
#> Attaching package: 'data.table'
#> 
#> The following objects are masked from 'package:dplyr':
#> 
#>     between, first, last
#> 
#> The following object is masked from 'package:purrr':
#> 
#>     transpose
library(kableExtra)
#> 
#> Attaching package: 'kableExtra'
#> 
#> The following object is masked from 'package:dplyr':
#> 
#>     group_rows

Example

In this example, we’ll correct for the “Winner’s Curse” bias using the fiqt() function applied to summary statistics from the Psychiatric Genomics Consortium Schizophrenia Phase 3 study. For demonstration purposes, we have limited our analysis to SNPs from the Illumina 1M Chip only.

input_file <- "../data/PGC3_SCZ_ilmn1M_Z.txt"
pgc3 <- fread(input_file)
head(pgc3)
#>          rsid chr        bp a1 a2          z
#> 1:  rs1000000  12 126890980  G  A  1.9203001
#> 2: rs10000006   4 108826383  T  C -2.3233567
#> 3:  rs1000002   3 183635768  C  T  1.2342762
#> 4: rs10000021   4 159441457  G  T -0.7111359
#> 5: rs10000023   4  95733906  G  T -2.0628172
#> 6:  rs1000003   3  98342907  A  G -1.0814432

Running fiqt()

pgc3$z.wca <- fiqt(pgc3$z)

Results

head(pgc3) %>% kable("html")

rsid	chr	bp	a1	a2	z	z.wca
rs1000000	12	126890980	G	A	1.9203001	0.9662015
rs10000006	4	108826383	T	C	-2.3233567	-1.2816898
rs1000002	3	183635768	C	T	1.2342762	0.5112610
rs10000021	4	159441457	G	T	-0.7111359	-0.2436473
rs10000023	4	95733906	G	T	-2.0628172	-1.0743353
rs1000003	3	98342907	A	G	-1.0814432	-0.4263369
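To make the link between the two Z columns concrete, the rows shown above can be converted back to p-values. The sketch below assumes two-sided p-values, so that z.wca corresponds to the FDR-adjusted p-value on the Z scale. The variable names are illustrative and the numbers are copied from the table.

```r
# Values copied from the first six rows of the results table
z     <- c(1.9203001, -2.3233567, 1.2342762, -0.7111359, -2.0628172, -1.0814432)
z_wca <- c(0.9662015, -1.2816898, 0.5112610, -0.2436473, -1.0743353, -0.4263369)

# Two-sided p-values implied by each column (assumption: two-sided test)
p_raw <- 2 * pnorm(abs(z), lower.tail = FALSE)
p_adj <- 2 * pnorm(abs(z_wca), lower.tail = FALSE)

# Side-by-side view: the adjusted p-value is larger in every row,
# which is why every adjusted Z-score sits closer to zero
data.frame(z, z_wca, p_raw = signif(p_raw, 3), p_adj = signif(p_adj, 3))
```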

Original Z-scores vs FIQT Z-score noncentrality estimates

ggplot() +
  geom_point(aes(x=z, y=z.wca), data=pgc3) +
  geom_abline(intercept = 0, slope = 1, color="red") +
  labs(x = "Original Z-score",
       y = "Estimated Z-score noncentrality",
       caption = "Data: PGC SCZ3 GWAS") +
  theme_classic()
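With real data the true noncentralities are unknown, so the plot can only show how far each estimate moves from its original Z-score. A small simulation with known true values gives a reference point. The sketch below is an illustration under assumed settings (the proportion of true signals, their spread, and the selection threshold of 4 are all arbitrary) and uses a base R FIQT-style adjustment rather than gauss::fiqt().

```r
set.seed(1)
n <- 10000

# Assumed design: about 5% of markers have a non-zero true noncentrality
is_signal <- runif(n) < 0.05
mu <- ifelse(is_signal, rnorm(n, mean = 0, sd = 3), 0)
z  <- mu + rnorm(n)  # observed Z-score = true noncentrality + unit Gaussian noise

# FIQT-style adjustment in base R (two-sided p-values, floor of 1e-300 assumed)
p     <- pmax(2 * pnorm(abs(z), lower.tail = FALSE), 1e-300)
z_adj <- sign(z) * qnorm(p.adjust(p, method = "fdr") / 2, lower.tail = FALSE)

# Among markers selected for a large observed |Z|, compare average magnitudes
top <- abs(z) > 4
c(n_selected   = sum(top),
  mean_abs_z   = mean(abs(z[top])),
  mean_abs_adj = mean(abs(z_adj[top])),
  mean_abs_mu  = mean(abs(mu[top])))
```

Comparing mean_abs_z with mean_abs_mu shows the selection bias in the raw Z-scores for this simulated setting, and mean_abs_adj shows where the adjusted values fall relative to both.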

Reference

Bigdeli et al. A simple yet accurate correction for winner’s curse can predict signals discovered in much larger genome scans. Bioinformatics. 2016 Sep 1;32(17):2598-603. doi: 10.1093/bioinformatics/btw303.