Breast cancer quantitative proteome and proteogenomic landscape


MS-based proteomics quantification of a breast tumor cohort

Nine patients from each of the five PAM50 subtype groups (45 tumors in total) were selected from the Oslo2 study cohort to ensure that tumor diversity was represented; this set is denoted the Oslo2 Landscape cohort (Fig. 1)13,14. LC-MS/MS-based protein quantification was performed as described in the Supplementary Methods section11,12.

In all, 13,997 protein products of 12,645 genes were identified at a 1% protein false-discovery rate (FDR) based on 248,949 identified unique peptides (Fig. 1, Supplementary Fig. 1A, B, Supplementary Data S1). Of these, 9995 proteins were quantified in all 45 tumors, with a median of 12 unique peptides and 24 PSMs per protein used for quantification. This subset, based on gene-symbol-centric quantification (denoted proteins henceforth), is used for all quantitative proteome analyses (i.e., the quantified proteome) (Supplementary Fig. 1C–H).
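The restriction to proteins quantified in every tumor amounts to a complete-case filter over a gene-by-tumor table. The following is a minimal sketch of that step, not the pipeline used in the study: the function name `quantified_in_all`, the dictionary layout, and the toy values are all hypothetical, and a missing measurement is assumed to be encoded as an absent key or `None`.

```python
def quantified_in_all(protein_table, sample_ids):
    """Keep only proteins (gene symbols) with a value in every sample.

    protein_table: {gene_symbol: {sample_id: relative_abundance or None}}
    (hypothetical layout, chosen for illustration only).
    """
    kept = {}
    for gene, values in protein_table.items():
        # A protein passes only if no sample is missing or None.
        if all(values.get(s) is not None for s in sample_ids):
            kept[gene] = values
    return kept


# Toy data: made-up relative abundances for three tumors.
samples = ["T1", "T2", "T3"]
table = {
    "ESR1": {"T1": 1.2, "T2": -0.4, "T3": 0.1},
    "ERBB2": {"T1": 0.3, "T2": None, "T3": 2.1},  # missing in T2
    "KRT5": {"T1": -1.0, "T3": 0.8},              # T2 absent
}
complete = quantified_in_all(table, samples)
print(sorted(complete))  # ['ESR1']
```

Applied to the full identification table, a filter of this kind would yield the quantified proteome; it trades coverage for a matrix without missing values.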

Robustness of protein identification/quantification was examined in two ways: by searching raw MS spectra using parallel methods (MS-GF+ with Percolator15,16 and Andromeda in MaxQuant), and by performing reverse phase protein lysate assays (RPPA) on sections of the same tumors. Both spectral search methods yield similar protein identifications (Supplementary Fig. 1I), 60% of whose quantities are positively correlated with RPPA findings (Supplementary Fig. 1J)13,17. In addition, MS-based profiles of BC hallmark proteins are consistent with well-established characteristics of tumor PAM50 classifications (Supplementary Fig. 1K).
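The MS versus RPPA comparison can be illustrated as a per-protein correlation across tumors, followed by counting the share of proteins with a positive coefficient. The sketch below is an assumption-laden illustration: the text does not state which correlation measure or significance criterion was used, so Pearson correlation and a simple "r > 0" rule are assumed here, and all names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def fraction_positive(ms, rppa):
    """Correlate each protein measured on both platforms across tumors.

    ms, rppa: {protein: [value per tumor, same tumor order]} (hypothetical).
    Returns (per-protein r, fraction of proteins with r > 0).
    """
    shared = sorted(set(ms) & set(rppa))
    r = {p: pearson(ms[p], rppa[p]) for p in shared}
    valid = [v for v in r.values() if not math.isnan(v)]
    if not valid:
        return r, float("nan")
    return r, sum(v > 0 for v in valid) / len(valid)


# Toy data: made-up values for four tumors on each platform.
ms = {"ESR1": [1, 2, 3, 4], "ERBB2": [1, 2, 3, 4], "KRT5": [4, 3, 2, 1]}
rppa = {"ESR1": [2, 4, 6, 8], "ERBB2": [1, 3, 2, 5], "KRT5": [1, 2, 3, 4]}
r, frac = fraction_positive(ms, rppa)
print(r, frac)
```

In this toy example two of the three shared proteins correlate positively. A rank-based measure such as Spearman correlation would be a reasonable alternative when the two platforms differ in dynamic range.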

Correlation analysis of tumor proteomes and metabolomes

Unsupervised hierarchical clustering of proteome profiles stratifies tumors largely in agreement with the PAM50 subtypes (Fig. 2a, Supplementary Fig. 2, Table S1). Basal-like, normal-like, and luminal A groups are distinguished; however, the luminal B and HER2 subtypes are intermixed, indicating similarities in the molecular phenotype. The validity of these mixed classifications is further supported by tumor-transcript profiles of both PAM50 subtypes correlating with either subtype centroid (Supplementary Fig. 2A) and by clinically HER2+ patients often receiving a conflicting mRNA-based classification18.
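The comparison of unsupervised clusters with PAM50 labels can be sketched as agglomerative clustering of tumor profiles followed by a tally of subtype labels per cluster. The distance and linkage used in the study are not given in this section, so the example assumes a correlation distance (1 minus Pearson r) with average linkage; the function names, tumor identifiers, profiles, and labels are hypothetical toy values.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles, k):
    """Merge the closest pair of clusters until k clusters remain.

    Distance between tumors is 1 - Pearson r (an assumption); distance
    between clusters is the mean of all pairwise tumor distances.
    """
    names = list(profiles)
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}

    def linkage(c1, c2):
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[n] for n in names]
    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy data: made-up protein profiles and subtype labels for four tumors.
profiles = {
    "T1": [2.0, 1.5, -1.0, -2.0],
    "T2": [1.8, 1.2, -0.8, -1.9],
    "T3": [-2.0, -1.0, 1.5, 2.0],
    "T4": [-1.7, -1.3, 1.1, 2.2],
}
labels = {"T1": "Basal-like", "T2": "Basal-like",
          "T3": "Luminal A", "T4": "Luminal A"}

clusters = average_linkage(profiles, k=2)
composition = [Counter(labels[t] for t in c) for c in clusters]
print(clusters, composition)
```

A cluster whose tally contains a single subtype agrees with the PAM50 call, whereas a cluster with mixed counts corresponds to the kind of intermixing described for the luminal B and HER2 groups.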

Proteome clustering, relation to PAM50 subtypes and metabolites.