Background Low-frequency variants play an important role in breast cancer (BC) susceptibility. Gene-based methods can increase power by combining multiple variants in the same gene and help identify target genes. Methods We evaluated the potential of gene-based aggregation in the Breast Cancer Association Consortium cohorts including 83,471 cases and 59,199 controls. Low-frequency variants were aggregated for individual genes' coding and regulatory regions. Association results in European ancestry samples were compared to single-marker association results in the same cohort. Gene-based associations were also combined in meta-analysis across individuals with European, Asian, African, and Latin American and Hispanic ancestry. Results In European ancestry samples, 14 genes were significantly associated (q < 0.05) with BC. Of those, two genes, FMNL3 (P = 6.11 × 10^-6) and AC058822.1 (P = 1.47 × 10^-4), represent new associations. High FMNL3 expression has previously been linked to poor prognosis in several other cancers. Meta-analysis of samples with diverse ancestry discovered further associations including established candidate genes ESR1 and CBLB. Furthermore, literature review and database query found further support for a biologically plausible link with cancer for genes CBLB, FMNL3, FGFR2, LSP1, MAP3K1, and SRGAP2C. Conclusions Using extended gene-based aggregation tests including coding and regulatory variation, we report identification of plausible target genes for previously identified single-marker associations with BC as well as the discovery of novel genes implicated in BC development. 
Including multi-ancestral cohorts in this study enabled the identification of otherwise missed disease associations, such as ESR1 (P = 1.31 × 10^-5), demonstrating the importance of diversifying study cohorts.
Objectives Physical inactivity and sedentary behaviour are associated with higher breast cancer risk in observational studies, but ascribing causality is difficult. Mendelian randomisation (MR) assesses causality by simulating randomised trial groups using genotype. We assessed whether lifelong physical activity or sedentary time, assessed using genotype, may be causally associated with breast cancer risk overall, pre/post-menopause, and by case-groups defined by tumour characteristics. Methods We performed two-sample inverse-variance-weighted MR using individual-level Breast Cancer Association Consortium case-control data from 130 957 European-ancestry women (69 838 invasive cases), and published UK Biobank data (n=91 105-377 234). Genetic instruments were single nucleotide polymorphisms (SNPs) associated in UK Biobank with wrist-worn accelerometer-measured overall physical activity (n_snps=5) or sedentary time (n_snps=6), or accelerometer-measured (n_snps=1) or self-reported (n_snps=5) vigorous physical activity. Results Greater genetically-predicted overall activity was associated with lower overall breast cancer risk (OR=0.59; 95% confidence interval (CI) 0.42 to 0.83 per standard deviation (SD; ~8 milligravities acceleration)) and for most case-groups. Genetically-predicted vigorous activity was associated with lower risk of pre/perimenopausal breast cancer (OR=0.62; 95% CI 0.45 to 0.87, ≥3 vs. 0 self-reported days/week), with consistent estimates for most case-groups. Greater genetically-predicted sedentary time was associated with higher hormone-receptor-negative tumour risk (OR=1.77; 95% CI 1.07 to 2.92 per SD (~7% time spent sedentary)), with elevated estimates for most case-groups. 
Results were robust to sensitivity analyses examining pleiotropy (including weighted-median-MR, MR-Egger). Conclusion Our study provides strong evidence that greater overall physical activity, greater vigorous activity, and lower sedentary time are likely to reduce breast cancer risk. More widespread adoption of active lifestyles may reduce the burden from the most common cancer in women.
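The inverse-variance-weighted (IVW) estimate named in the Methods above can be illustrated with a minimal sketch. It assumes summary-level per-SNP effects and a fixed-effect model, whereas the study itself used individual-level case-control data; the function names and all numbers are hypothetical and are not taken from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate: weighted regression through the origin of
    SNP-outcome effects on SNP-exposure effects, with weights 1 / se_outcome^2."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    beta = num / den              # log odds ratio per unit of exposure
    se = math.sqrt(1.0 / den)     # standard error of the IVW estimate
    return beta, se

def to_odds_ratio(beta, se, z=1.96):
    """Convert a log-odds estimate into an odds ratio with a 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

if __name__ == "__main__":
    # Hypothetical per-SNP effects (illustrative only, not study data).
    bx = [0.10, 0.20, 0.15]       # SNP effect on the exposure (SD units)
    by = [-0.05, -0.09, -0.08]    # SNP effect on the outcome (log odds)
    se = [0.02, 0.03, 0.025]      # standard error of each outcome effect
    beta, beta_se = ivw_estimate(bx, by, se)
    print(to_odds_ratio(beta, beta_se))
```

An odds ratio below 1 from such a calculation would correspond to lower genetically predicted risk per unit of exposure; sensitivity methods such as weighted-median MR or MR-Egger are not covered by this sketch.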
Germline copy number variants (CNVs) are pervasive in the human genome but potential disease associations with rare CNVs have not been comprehensively assessed in large datasets. We analysed rare CNVs in genes and non-coding regions for 86,788 breast cancer cases and 76,122 controls of European ancestry with genome-wide array data. Gene burden tests detected the strongest association for deletions in BRCA1 (P = 3.7 × 10^-18). Nine other genes were associated with a p-value < 0.01 including known susceptibility genes CHEK2 (P = 0.0008), ATM (P = 0.002) and BRCA2 (P = 0.008). Outside the known genes we detected associations with p-values < 0.001 for either overall or subtype-specific breast cancer at nine deletion regions and four duplication regions. Three of the deletion regions were in established common susceptibility loci. To the best of our knowledge, this is the first genome-wide analysis of rare CNVs in a large breast cancer case-control dataset. We detected associations with exonic deletions in established breast cancer susceptibility genes. We also detected suggestive associations with non-coding CNVs in known and novel loci with large effect sizes. Larger sample sizes will be required to reach robust levels of statistical significance. Dennis et al. investigate potential breast cancer associations with rare germline copy number variants (CNVs) by conducting a genome-wide analysis in a large breast cancer case-control dataset. The authors detected associations with exonic deletions in established breast cancer susceptibility genes and suggestive associations for a number of non-coding CNVs.
Background Genome-wide association studies (GWAS) have identified multiple common breast cancer susceptibility variants. Many of these variants have differential associations by estrogen receptor (ER) status, but how these variants relate to other tumor features and intrinsic molecular subtypes is unclear. Methods Among 106,571 invasive breast cancer cases and 95,762 controls of European ancestry with data on 173 breast cancer variants identified in previous GWAS, we used novel two-stage polytomous logistic regression models to evaluate variants in relation to multiple tumor features (ER, progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2) and grade) adjusting for each other, and to intrinsic-like subtypes. Results Eighty-five of 173 variants were associated with at least one tumor feature (false discovery rate < 5%), most commonly ER and grade, followed by PR and HER2. Models for intrinsic-like subtypes found nearly all of these variants (83 of 85) associated at p < 0.05 with risk for at least one luminal-like subtype, and approximately half (41 of 85) of the variants were associated with risk of at least one non-luminal subtype, including 32 variants associated with triple-negative (TN) disease. Ten variants were associated with risk of all subtypes at different magnitudes. Five variants were associated with risk of luminal A-like and TN subtypes in opposite directions. Conclusion This report demonstrates a high level of complexity in the etiologic heterogeneity of breast cancer susceptibility variants and can inform investigations of subtype-specific risk prediction.
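The false discovery rate threshold mentioned in the Results above can be illustrated with a short sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg step-up adjustment below is an assumption chosen because it is a common default, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values),
    in the same order as the input p-values."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values

if __name__ == "__main__":
    # Hypothetical per-variant p-values (illustrative only, not study data).
    p = [0.01, 0.04, 0.03, 0.5]
    q = benjamini_hochberg(p)
    # Variants whose q-value falls below 0.05 pass a 5% FDR threshold.
    print([i for i, value in enumerate(q) if value < 0.05])
```

Under this procedure a variant is declared associated when its q-value is below the chosen FDR level, here 5%.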
A combination of genetic and functional approaches has identified three independent breast cancer risk loci at 2q35. A recent fine-scale mapping analysis to refine these associations resulted in 1 (signal 1), 5 (signal 2), and 42 (signal 3) credible causal variants at these loci. We used publicly available in silico DNase I and ChIP-seq data with in vitro reporter gene and CRISPR assays to annotate signals 2 and 3. We identified putative regulatory elements that enhanced cell-type-specific transcription from the IGFBP5 promoter at both signals (30- to 40-fold increased expression by the putative regulatory element at signal 2, 2- to 3-fold by the putative regulatory element at signal 3). We further identified one of the five credible causal variants at signal 2, a 1.4 kb deletion (esv3594306), as the likely causal variant; the deletion allele of this variant was associated with an average additional increase in IGFBP5 expression of 1.3-fold (MCF-7) and 2.2-fold (T-47D). We propose a model in which the deletion allele of esv3594306 juxtaposes two transcription factor binding regions (annotated by estrogen receptor alpha ChIP-seq peaks) to generate a single extended regulatory element. This regulatory element increases cell-type-specific expression of the tumor suppressor gene IGFBP5 and, thereby, reduces risk of estrogen receptor-positive breast cancer (odds ratio = 0.77, 95% CI 0.74-0.81, p = 3.1 × 10^-31).
Fine-mapping of causal variants and integration of epigenetic and chromatin conformation data identify likely target genes for 150 breast cancer risk regions. Genome-wide association studies have identified breast cancer risk variants in over 150 genomic regions, but the mechanisms underlying risk remain largely unknown. These regions were explored by combining association analysis with in silico genomic feature annotations. We defined 205 independent risk-associated signals with the set of credible causal variants in each one. In parallel, we used a Bayesian approach (PAINTOR) that combines genetic association, linkage disequilibrium and enriched genomic features to determine variants with high posterior probabilities of being causal. Potentially causal variants were significantly over-represented in active gene regulatory regions and transcription factor binding sites. We applied our INQUISIT pipeline for prioritizing genes as targets of those potentially causal variants, using gene expression (expression quantitative trait loci), chromatin interaction and functional annotations. Known cancer drivers, transcription factors and genes in the developmental, apoptosis, immune system and DNA integrity checkpoint gene ontology pathways were over-represented among the highest-confidence target genes.