Introduction
In genetic association studies that involve extensive multiple testing,
it is often observed that association signals below the statistical
significance threshold explain more trait variation, in aggregate, than
those that reach significance. Accurately quantifying these
sub-threshold effects is crucial but challenging because of the bias
known as the “Winner’s Curse.” In previous work, we developed FIQT (FDR
Inverse Quantile Transformation), a straightforward technique to correct
for this bias. FIQT first uses the False Discovery Rate (FDR) to adjust
the p-values of SNP associations for multiple testing. It then derives
Z-score noncentrality estimates from the Gaussian quantiles that
correspond to these adjusted p-values, keeping the sign of the original
Z-score. FIQT is available in the GAUSS package as the fiqt() function.
Given a vector of Z-scores and, optionally, the minimum non-zero
p-value to pass to qnorm() (default: 10^-320), the function returns a
vector of Z-scores adjusted for the winner’s curse bias.
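The two steps described above (FDR adjustment, then an inverse Gaussian quantile with the original sign) can be sketched in base R. This is a minimal illustration under stated assumptions, not the source of the fiqt() function in the gauss package: the name fiqt_sketch and its argument min_p are hypothetical, two-sided p-values are assumed, and the 10^-320 floor simply mirrors the default quoted above.

```r
# Hypothetical sketch of the FIQT idea; NOT the gauss package implementation.
fiqt_sketch <- function(z, min_p = 1e-320) {
  # Two-sided p-values from the observed Z-scores (assumption: two-sided tests)
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  # Floor the p-values so that qnorm() never receives exactly zero
  p <- pmax(p, min_p)
  # Benjamini-Hochberg FDR adjustment for multiple testing
  p_adj <- p.adjust(p, method = "fdr")
  # Map adjusted p-values back to Gaussian quantiles, keeping the original sign
  sign(z) * qnorm(p_adj / 2, lower.tail = FALSE)
}

# Toy Z-scores, made up purely for illustration
z <- c(5.2, -3.1, 1.9, -0.7, 0.2)
z_adj <- fiqt_sketch(z)
print(data.frame(z = z, z_adj = z_adj))
```

Because FDR-adjusted p-values are never smaller than the raw p-values, the adjusted Z-scores in this sketch are never larger in magnitude than the inputs, which is the shrinkage expected from a winner's curse correction.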
Load necessary packages
library(gauss)
#>
#> Attaching package: 'gauss'
#> The following object is masked from 'package:stats':
#>
#> dist
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
#> ✔ ggplot2 3.4.1 ✔ purrr 0.3.4
#> ✔ tibble 3.1.8 ✔ dplyr 1.0.9
#> ✔ tidyr 1.2.0 ✔ stringr 1.4.0
#> ✔ readr 2.1.2 ✔ forcats 0.5.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
library(data.table)
#>
#> Attaching package: 'data.table'
#>
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
#>
#> The following object is masked from 'package:purrr':
#>
#> transpose
library(kableExtra)
#>
#> Attaching package: 'kableExtra'
#>
#> The following object is masked from 'package:dplyr':
#>
#> group_rows
Example
In this example, we’ll correct for the “Winner’s Curse” bias using the fiqt() function applied to summary statistics from the Psychiatric Genomics Consortium Schizophrenia Phase 3 study. For demonstration purposes, we have limited our analysis to SNPs from the Illumina 1M Chip only.
input_file <- "../data/PGC3_SCZ_ilmn1M_Z.txt"
pgc3 <- fread(input_file)
head(pgc3)
#>          rsid chr        bp a1 a2          z
#> 1:  rs1000000  12 126890980  G  A  1.9203001
#> 2: rs10000006   4 108826383  T  C -2.3233567
#> 3:  rs1000002   3 183635768  C  T  1.2342762
#> 4: rs10000021   4 159441457  G  T -0.7111359
#> 5: rs10000023   4  95733906  G  T -2.0628172
#> 6:  rs1000003   3  98342907  A  G -1.0814432
Running fiqt()
pgc3$z.wca <- fiqt(pgc3$z)
Results
rsid | chr | bp | a1 | a2 | z | z.wca |
---|---|---|---|---|---|---|
rs1000000 | 12 | 126890980 | G | A | 1.9203001 | 0.9662015 |
rs10000006 | 4 | 108826383 | T | C | -2.3233567 | -1.2816898 |
rs1000002 | 3 | 183635768 | C | T | 1.2342762 | 0.5112610 |
rs10000021 | 4 | 159441457 | G | T | -0.7111359 | -0.2436473 |
rs10000023 | 4 | 95733906 | G | T | -2.0628172 | -1.0743353 |
rs1000003 | 3 | 98342907 | A | G | -1.0814432 | -0.4263369 |
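As a quick sanity check on the rows shown above, the relative shrinkage of each Z-score can be computed directly. The sketch below re-enters the six displayed values by hand; the data frame name res and the column shrinkage are hypothetical and not part of the gauss package.

```r
# The six rows displayed in the results table, re-entered by hand
res <- data.frame(
  rsid  = c("rs1000000", "rs10000006", "rs1000002",
            "rs10000021", "rs10000023", "rs1000003"),
  z     = c(1.9203001, -2.3233567, 1.2342762,
            -0.7111359, -2.0628172, -1.0814432),
  z.wca = c(0.9662015, -1.2816898, 0.5112610,
            -0.2436473, -1.0743353, -0.4263369)
)

# Fraction of the original magnitude removed by the adjustment
res$shrinkage <- 1 - abs(res$z.wca) / abs(res$z)

# The adjustment should keep the sign and pull every value toward zero
same_sign   <- all(sign(res$z) == sign(res$z.wca))
all_shrunk  <- all(abs(res$z.wca) < abs(res$z))
print(res)
```

The same two checks can be run on the full pgc3 table by replacing res with pgc3.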
Original Z-scores vs FIQT Z-score noncentrality estimates
ggplot() +
  geom_point(aes(x = z, y = z.wca), data = pgc3) +
  geom_abline(intercept = 0, slope = 1, color = "red") +
  labs(x = "Original Z-score",
       y = "Estimated Z-score noncentrality",
       caption = "Data: PGC SCZ3 GWAS") +
  theme_classic()