Exploring the Neandertal legacy of pancreatic ductal adenocarcinoma risk in Eurasians

Background The genomes of present-day non-Africans are composed of 1–3% of Neandertal-derived DNA as a consequence of admixture events between Neandertals and anatomically modern humans about 50–60 thousand years ago. Neandertal-introgressed single nucleotide polymorphisms (aSNPs) have been associated with modern human disease-related traits, which are risk factors for pancreatic ductal adenocarcinoma (PDAC), such as obesity, type 2 diabetes, and inflammation. In this study, we aimed at investigating the role of aSNPs in PDAC in three Eurasian populations. Results The high-coverage Vindija Neandertal genome was used to select aSNPs in non-African populations from 1000 Genomes project phase 3 data. Then, the association between aSNPs and PDAC risk was tested independently in Europeans and East Asians, using existing GWAS data on more than 200 000 individuals. We did not find any significant associations between aSNPs and PDAC in samples of European descent, whereas, in East Asians, we observed that the Chr10p12.1-rs117585753-T allele (MAF = 10%) increased the risk to develop PDAC (OR = 1.35, 95%CI 1.19–1.54, P = 3.59 × 10–6), with a P-value close to a threshold that takes into account multiple testing. Conclusions Our results show only a minimal contribution of Neandertal SNPs to PDAC risk. Supplementary Information The online version contains supplementary material available at 10.1186/s40659-023-00457-y.

Considering that aSNPs are associated with several PDAC risk factors and that the genetic contribution to PDAC etiology still needs to be elucidated, we aimed at investigating the Neandertal legacy of PDAC genetic risk.We analysed PDAC GWAS cohorts from different Eurasian populations for significant associations with aSNPs to study the role of Neandertal admixture and PDAC risk in different ancestry groups.This study is the first attempt to investigate the role of archaic admixture on PDAC development.

Results
In this study, 389 144 aSNPs were identified among the non-African populations of the 1000 Genomes project [43].The association between aSNPs and the risk of developing PDAC was tested in three ancestry groups: non-Finnish Europeans, Finns, and East Asians.

Discussion
We tested the effects of Neandertal introgression on PDAC susceptibility in three ancestry groups.In non-Finnish Europeans and Finns, no novel significant associations between aSNPs and PDAC were observed.
In JaPAN, we found that the T allele of Chr10p12.1-rs117585753increased the risk to develop PDAC (P = 3.59 × 10 -6 ).This association was not statistically significant when considering multiple testing.However, it is very close to the Bonferroni corrected threshold (p j = 2.28 × 10 -6 ).The functional implications of this aSNP have not been clarified yet: according to GWAS catalog, it is not associated with any complex human trait.
Interestingly, the T allele of Chr10p12.1-rs117585753 is present in EAS (MAF = 10%), whereas it is almost absent in the other populations represented in 1000 Genomes project (e.g., MAF < 0.01 in Europeans from 1000 Genomes project).Since Chr10p12.1-rs117585753 is polymorphic only in Asians, it is possible that the role of this aSNP in complex traits has not been elucidated yet because most of the association studies have been conducted in cohorts with participants of European descent [46].The lower number of studies with Asian individuals implies that the associations between SNPs, which are rare in Europeans but common in Asians, still need further investigation to be understood entirely.
Chr10p12.1-rs117585753lies in an intron of the protein-coding PRTFDC1 gene, in which, according to the GWAS catalog, there are SNPs associated with blood cell count [47][48][49].Several white and red blood cell count parameters have been used to predict immune response and inflammation in various diseases, including PDAC [50,51].One SNP in PRTFDC1 (Chr10p12.1-rs7905553) is in weak LD (r 2 = 0.14, D' = 0.96) with Chr10p12.1-rs117585753 in EAS, and according to GWAS catalog it is associated with red blood cell distribution width (RDW) [52], which is a parameter of erythrocyte variation.RDW has been proposed as a biomarker of the inflammatory state that could predict progression/prognosis in PDAC [53], suggesting a potential contribution of the PRTFDC1 genomic region and Chr10p12.1-rs117585753 in PDAC and immunity.
Several Neandertal-derived haplotypes involved in immunity have been reported to be under selection after Neandertal-AMH introgression.In fact, the positive selection of aSNPs that lead to adaptation (adaptive introgression) has been observed to be driven by the immune response to pathogens [8,9,[54][55][56][57].
Possible limitations of our approach are represented by the fact that we could have underestimated the role Fig. 1 aSNPs filtering and analysis workflow for each ancestry group.The figure displays aSNPs analysis workflow for non-Finnish Europeans, Finns, and East Asians.The 389 144 aSNPs identified in all non-African populations from 1000 Genomes project phase 3, were filtered and analysed for each ancestry group.aSNP Neandertal introgressed SNP. 1 aSNPs that showed an association P-value < 0.05 in PanScan, PanC4 and in the two datasets combined.aSNPs with a P < 0.05 in PanScan (7850), PanC4 (8141), and the combined datasets (8718). 2 aSNPs with a P-value < 0.05 and an identical direction of the effect in all the three GWASs included in JaPAN of rare variants (MAF < 1%) because we did not have enough statistical power to detect associations between rare aSNPs and PDAC, although we used the largest PDAC datasets currently available, which included more than 200 000 individuals of three different ancestries.An additional potential limitation of this work is that 93 695 out of 389 144 aSNPs identified in Eurasian genomes could not be found in PanScan + PanC4, FinnGen, and JaPAN.Therefore, the role of these aSNPs in PDAC susceptibility was not explored.In future analyses, larger reference panels for imputation could be used to maximize the investigated Neandertalderived genetic variability.

Conclusions
In conclusion, we observed that the Neandertal introgressed DNA does not influence PDAC susceptibility in populations of European descent.Interestingly, we observed a potential association between Chr10p12.1-rs117585753-Tand an increased risk of developing PDAC in populations of Asian descent, although not formally significant after correction for multiple testing.This aSNP is polymorphic only in East Asians and is situated in a genomic region involved in immunity.Further investigations are needed to elucidate the evolutionary processes that lead to these aSNPs in the AMH gene pool and the role of aSNPs in PDAC risk, and more broadly, to explore the Neandertal legacy in the susceptibility to other cancer types.

Neandertal SNPs identification
The method to select aSNPs was previously described [12].Briefly, to define a potential introgressed allele, we used four criteria that needed to be fulfilled: (a) the allele is shared between the Vindija Neandertal [5] and at least one non-African population from 1000 Genomes project phase 3 [43]; (b) the allele is not present in Yoruba from sub-Saharan Africa; (c) the allele is carried in homozygous state by Vindija Neandertal; (d) based on the haplotype length, the allele is more likely derived from Neandertal-AMH admixture than incomplete lineage sorting (ILS).To apply the fourth criterion, an approach, that was previously described by Huerta-Sánchez et al., and Dannemann et al. was used [54,58].Briefly, it allows the identification of putative Neandertal introgressed regions in all non-African 1000 Genomes project

Table 1 Candidate aSNPs for each ancestry group
Panel A shows the summary statistics of aSNPs with the lowest association P-value, panel B shows the minor allele frequency of SNPs with the lowest P-value of associations across several datasets aSNP Neandertal introgressed SNP; M major allele; m, minor allele; MAF minor allele frequency; OR Odds Ratio; 95%C.I. 95% Confidence Interval; PRTFDC1 phosphoribosyl transferase domain containing 1; 1000G 1000 Genomes project phase 3 [43]; gnomAD Genome Aggregation Database [44]; HGDP Human Genome Diversity Project [45]  populations.Two recombination maps [59,60] were used to calculate the expected ILS segments length based on the local recombination rate.Then, the probability that a segment length was consistent with ILS was computed and the resulting P-values were corrected through Benjamini-Hochberg method.Haplotypes that showed an adjusted P-value < 0.05 were considered as introgressed from Neandertal.The aSNPs used in the following analyses lay on one of these Neandertal-derived haplotypes.
All the analyses were based on human genome assembly GRCh37, and only biallelic loci were considered, excluding indels.

Study populations
The association between aSNPs and PDAC risk was tested in three ancestry groups: non-Finnish Europeans, Finns and East Asians.A two-phase association study (discovery and replication) was performed to examine if aSNPs identified in non-Finnish Europeans affected PDAC susceptibility.On the other hand, a validation set was not available for Finns and East Asians, and the association between aSNPs and PDAC was tested by searching for aSNPs in FinnGen and JaPAN datasets, respectively (see below).
For non-Finnish European analyses, the discovery set included data of the Pancreatic Cancer Cohort Consortium (PanScan) and the Pancreatic Cancer Case-Control Consortium (PanC4).The data were downloaded from the database of Genotypes and Phenotypes (dbGaP, https:// www.ncbi.nlm.nih.gov/ gap/).The dbGaP study accession numbers were: phs000206.v5.p3 and phs000648.v1.p1.; the project reference number was #12644.Details about data collection, genotyping methods and analyses are described in the original publications [26,31,32,61].
The replication of aSNPs with a P-value of association with PDAC risk lower than the Bonferroni-adjusted threshold (see below) was attempted in the Pancreatic Disease Research (PANDoRA) consortium [64,65].
PANDoRA is a multicentric study on pancreatic cancer based mainly on European countries (Greece, Italy, Germany, Netherlands, Denmark, Czech Republic, Hungary, Poland, Ukraine, Lithuania, UK).In addition, PAN-DoRA includes a subgroup of Brazilian cases and controls that were excluded from the validation set in this study because PanScan + PanC4 (discovery set) included only Caucasian samples, while Brazilians belong to different ancestries (unlike the other PANDoRA samples).Information on sex, and age (recruitment for controls and diagnosis for the cases) was collected for each participant.The controls were enrolled among the general population, blood donors or hospitalised individuals not affected by cancer, chronic pancreatitis, or diabetes [64].For this study, 4983 individuals (1894 PDAC cases and 3089 controls) from PANDoRA were included in the analysis (Table 2).

Table 2 Study population description for each ancestry group
The table shows the number of cases and controls in PanScan + PanC4, PANDoRA, FinnGen and JaPAN.Male and female count and median age of cases and controls are displayed for each study a PanScan + PanC4 (discovery phase), PANDoRA (replication phase) b Data for male count and age are displayed as minimum-maximum values of the three GWASs included in JaPAN Data not available: "-" Non-Finnish Europeans and Finns were analysed separately because PanScan + PanC4 and PANDoRA mainly include subjects with Central European ancestry.We used the FinnGen Release 8 (R8) data that consists of GWAS summary statistics of 1249 pancreatic cancer cases and 259 583 controls with Finnish ancestry (Table 2).Subjects affected by other cancer types were excluded from the controls (https:// FinnG en.gitbo ok.io/ docum entat ion/) [66].
To examine the association between aSNPs identified in East Asians and PDAC, we downloaded JaPAN consortium dataset that consisted of summary statistics of a meta-analysis of three GWASs (JaPAN, National Cancer Center and BioBank Japan GWASs).Comprehensive information on genotyping and data analysis are given in the original publication [67].Summary statistics for the GWAS analysis are available on the JaPAN consortium website (http:// www.aichi-med-u.ac.jp/ JaPAN/ curre nt_ initi atives-e.html) and include 34 631 individuals of East Asian origin (2039 PDAC cases and 32 592 controls) (Table 2).

Data and statistical analyses
For non-Finnish Europeans, the association between aSNPs and PDAC susceptibility was tested in the PanScan + PanC4 dataset using logistic regression analysis, adjusting for age, sex and the top eight principal components (Fig. 2).To obtain a list of independent aSNPs, all aSNPs in linkage disequilibrium (LD; r 2 > 0.5) with each other were excluded, and in each LD block the aSNP with the lowest association P-value was selected.Then, all aSNPs showing an association lower than the threshold for statistical significance corrected for multiple testing in PanScan, PanC4 and in the combined datasets were selected for replication in PANDoRA.
The genomic DNA of the PANDoRA samples was extracted from circulating blood using the QIamp ® 96 DNA QIcube ® HT Kit (Qiagen, Hilden, Germany).The genotyping was done using TaqMan RealTime PCR assays in 384-well plates.Each plate included cases and controls, duplicated samples for quality controls (QCs) and negative controls.The fluorescent signal detection was detected through a QuantStudioTM 5 Real-Time PCR system (Thermofisher, USA) and genotypes were called using the QuantStudio ™ Design and Analysis Software v1.5.1.Samples with a genotyping call rate lower than 75% were excluded from the analysis.Hardy-Weinberg equilibrium test was performed with the Pearson chi-square test.To test the association between aSNPs and PDAC risk in PANDoRA, a logistic regression adjusted for age, sex, and country of origin was used.
For Finns and East Asians, the analyses were carried out in parallel, keeping separated the two ancestry groups.Considering that for FinnGen and JaPAN we used summary statistics, we looked at the P-value for association in these two datasets for the aSNPs selected for the two populations.Since JaPAN is a meta-analysis of three studies, along with P-value, the concordance of the direction of the effect between the three GWASs was considered.

Fig. 2
Fig. 2 Manhattan and Quantile-Quantile (Q-Q) plots of PanScan + PanC4 association study results.The P-values displayed in Manhattan (A) and Q-Q plots (B) are calculated combining PanScan and PanC4 datasets.The plots were done using qqman R package (https:// cran.r-project.org/ web/ packa ges/ qqman/ index.html)[68].The inflation factors (l) did not indicate systematic inflation for PanScan (l = 1.02),PanC4 (l = 1.05), and combined datasets (l = 1.05).The inflation factors were computed using simtrait R package (https:// cran.r-project.org/ web/ packa ges/ simtr ait/ index.html)[69] MAF of the SNP of interest in the analysed datasets.Since allele frequencies are not freely available in JaPAN dataset, the MAF in East Asians from 1000G is reported In HGDP, SNP frequency in Finns is not available.Frequency in all European populations is displayed EUR Europeans; FIN Finns; EAS East Asians a Gene in which aSNP lies b c PanScan + PanC4 (discovery phase), PANDoRA (replication phase) d