HomeScienceGeneticsIdentifying and ranking causal biochemical biomarkers for breast cancer: a Mendelian randomization...

Identifying and ranking causal biochemical biomarkers for breast cancer: a Mendelian randomization study BMC medicine

Analysis plan

Our prospective plan was to perform a variety of two-sample univariable MR (UVMR) analyzes to examine the associations of each of the UKB biochemical biomarkers with overall, ER-positive, and ER-negative breast cancer liability. After our UVMR analyzes showed significant associations, we performed further multivariable MR (MVMR) analyzes to adjust for known risk factors and bidirectional analyses. We finally ranked our nominally significant biomarkers by importance using a multivariable Bayesian approach [4]. Our analysis follows the guidelines for conducting MR exams [5] and our reporting follows the Guidelines for Strengthening Reporting of Mendelian Randomization Studies (STROBE-MR) (Supplementary File 2: Checklist S1) [6]. We did not pre-register the research protocol.

Study populations

Our study used summary level exposure data from the UKB study [7] and summary-level outcome data from the Breast Cancer Association Consortium (BCAC) [8]. The BCAC contains ~6000 samples from the UK [8], which equates to a maximum sample overlap of ~1.4% between the exposure and outcome samples. Our data only included women of European descent to reduce bias from population stratification.

Exposure data

We obtained publicly available summary-level genome-wide association study (GWAS) statistics on 34 biomarker levels in serum, urine, and red blood cells; body mass index (BMI); and the frequency of alcohol consumption of unrelated female participants of White British descent (n = 194,174) in the UKB cohort study by Neale et al. [9]. The genotypes and 34 biomarker levels were collected by the UKB study at baseline between 2006 and 2010 using different laboratory techniques and instruments from different vendors [7, 10]. The GWASs were performed using age, age^2, and the first 20 principal components (PCs) as covariates [11]. Inverse-rank normalized GWAS data were used because many of the quantitative biomarker features were non-normally distributed. Most women (at least 59%) in the UKB cohort were postmenopausal [12]. More information on the panel of UKB biomarkers and the original UKB study can be found elsewhere [3, 7].

Outcome data

Publicly available GWAS summary statistics on overall breast cancer cases (n = 122,977) and controls (n = 105,974) of European descent were obtained from the BCAC [13]. Of the breast cancer cases, 69,501 were ER positive, 21,468 were ER negative, and the majority developed after menopause. More details about the original studies are described elsewhere [8, 14, 15].

Must Read
PSMA PET Imaging Market Growing Globally with Leading Key Players -Fisher Scientific, Aytu BioScience, Abzena PLC, Agilent Technologie, Endocyte, Bio Rad Laboratories, Diagnostic BioSystems, Blue Earth Diagnostics, Bayer Health, Seattle Genetics

static analysis

Selection of instrumental variables

For each exposure, we selected associated single-nucleotide polymorphisms (SNPs) with genome-wide significance (P < 5 × 10−8) and ensured their independence by removing those in disequilibrium from the linkage using the PLINK method (r
2 < 0.001, clumping distance = 10,000 kb). We then harmonized the directions of the effect alleles between exposures and outcomes.

In all of our MR analyses, SNPs must meet three assumptions to be considered valid IVs. Genetic variants must (1) be strongly related to exposure (the relevance assumption), (2) be independent of confounders (the independence assumption), and (3) influence the outcome only through their effect on exposure (the exclusionary constraint assumption ). .

Univariable analyses

The main univariable analysis consisted of inverse variance weighted (IVW) MR between each exposure and each outcome. The IVW method first estimates the Wald ratio for each SNP by dividing the SNP-outcome association by the SNP-exposure association and then combines these ratios in a fixed effect meta-analysis where each ratio is weighted by its inverse of the variance of the SNP-outcome association [16]. We used P < 0.05 as the nominal significance threshold. We also derived the corrected false discovery rate (FDR). Pvalues ​​with the Benjamini-Hochberg (BH) method and used P < 0.05 as the FDR-corrected significance threshold. For exposures for which only 1 IV could be identified, we estimated the Wald ratio [17]. Our results are reported as odds ratios (OR) per standard deviation (SD) change in genetically predicted biomarker concentration.

A common violation of the exclusionary constraint IV condition is caused by horizontal pleiotropy, where a genetic variant has an effect on the outcome that does not occur due to the exposure [18]. Therefore, we used several additional univariable approaches with different underlying assumptions about the structure of the pleiotropy for all exposures, including the MR-Egger [19]weighted median [20]and MR Pleiotropy RESIDUAL SUM AND BLOWER (MR-PRESSO) [21]. The MR-Egger allows for some directional pleiotropy in its estimation of the causal effect by making the additional assumption of Instrument Strength Independent of Direct Effect (InSIDE), which states that for all instruments the magnitude of the pleiotropic effect is independent of strength of the genetic variant-exposure association [19]. The weighted median provides sparse or balanced pleiotropy by weighting down outliers [20]. The MR-PRESSO method allows for some directional pleiotropy by identifying and adjusting for outliers [21].

Must Read
ROSEN, A TOP COMPANY, Encourages Fulgent Genetics, Inc.

Sensitivity analyses

We tested the robustness of our univariable findings by performing MVMR [22, 23] and bidirectional MR. MVMR was used to adjust previously reported risk factors, while bidirectional MR was used to rule out possible reverse causation.

We performed two-sample MVMR analyzes for all seven biomarkers that were nominally significantly associated with overall breast cancer in IVW MR. We looked for associations P < 10−8 of all variants used as IVs in Phenoscanner [24, 25] (Supplementary File 3: T1-T7), a database of summarized GWASs, and adjusted for properties that can be considered reasons for horizontal pleiotropy. MVMR assumes that pleiotropic pathways act through the risk factors included in the model [18]. For all MVMR analyses, we included SNPs that were significantly associated genome-wide (P < 5 × 10−8) with any exposure or risk factor taken into account in an MVMR model and not linkage disequilibrium (r
2 < 0.001, clumping distance = 10,000 kb).

Since lipids are correlated [26]we included HDL cholesterol, low-density lipoprotein (LDL) cholesterol, triglycerides, and lipoprotein A in MVMR models to observe the direct associations of each lipid with each outcome.

If BMI [27] and alcohol consumption [28] are associated with breast cancer risk, we included BMI and frequency of alcohol intake in MVMR models for each of the seven biomarkers that we found to be nominally significantly associated with overall breast cancer in IVW MR.

Because estrogen decreases the expression and activity of alkaline phosphatase (ALP) in breast cancer cells [29] and we could not obtain enough genetic variants for estradiol, we corrected for testosterone and SHBG in an MVMR model with ALP.

After adjustment for BMI in MVMR, significant associations were found between SHBG and breast cancer risk [28]So [30] we included BMI and SHBG in MVMR models.

Due to the low prior probability of association between ALP and breast cancer, we performed a two-way univariable MR analysis of genetically predicted overall, ER-positive, and ER-negative breast cancer liability and ALP levels.

Must Read
Genes can predict coronary heart disease risk: Study | Health

Exposure rankings

We used MR Bayesian model averaging (MR-BMA) to agnostically rank the causal importance of the seven biomarkers that were found to be nominally significantly associated with overall breast cancer in IVW MR, while accounting for possible pleiotropy [4]. Empirical Pvalues ​​were calculated using a permutation approach [31] and adapted for multiple testing using the BH method with P < 0.05 as significance threshold. All independent (r
2 < 0.001) genetic variants associated with one of the biomarkers with genome-wide significance were included in the analysis (n = 460).

We used MR-BMA to consider each combination of biomarkers (all single biomarkers, all pairs of biomarkers, all triplets, and so on) as a candidate model in an MVMR analysis using weighted regression. Each candidate model was assigned a posterior probability (PP) expressing the probability that the candidate model contains the true set of causal biomarkers using the goodness-of-fit measure of the regression.

We then used MR-BMA to perform model averaging to assign each biomarker a marginal inclusion probability (MIP) and report the model-mean causal effect (MACE) of each biomarker on overall breast cancer. The MIP represents the probability that the biomarker is a causal determinant of breast cancer risk, and the MACE represents the weighted mean direct causal effect of the biomarker on risk across all candidate models. The MIP was calculated by summing the posterior probabilities of all candidate models in which the biomarker is present. The MACE underestimates the true causal effect of a biomarker on breast cancer in general and should not be interpreted in absolute terms, but as an indication of the direction of the effect and to compare the relative causal effects between biomarkers.

We used 0.5 as the prior probability for inclusion in the main analysis, reflecting an a priori belief that half of the candidate models or that half of the nominally significant biomarkers are causal, and priors of 0.25 and 0, 75 as sensitivity analyses.


We used the TwoSampleMR [31]Mendelian randomization [32]MRPRESSO [33]and youth washer [34] R packages, as well as the GitHub repository for MR-BMA for our analyzes with R (version 4.0.5). We searched for secondary trait associations using Phenoscanner [24, 25].



Please enter your comment!
Please enter your name here

Most Popular

Recent Comments