Genome-wide identification and analysis of epithelial-mesenchymal transition-related RNA-binding proteins and … – Nature.com

Posted: May 23, 2024 at 7:53 am

Identification of DERBPs during EMT in breast cancer cells

By differential expression analysis of breast cancer cells at different stages of EMT, we identified a large number of DEGs among the comparison groups (Supplementary Data S1). We found that the number of DEGs decreased first and then increased during the transformation process, and there were fewer DEGs among the comparison groups in the intermediate state (Supplementary Fig. S2A).

We combined all identified DEGs, intersected them with known human RBP genes, and found that 504 RBP genes were differentially expressed during EMT in breast cancer cells (Fig. 1A) [4]. WGCNA was used to analyse the coexpression relationships among DERBPs (Supplementary Data S2). Based on their expression at different EMT stages, these RBPs could be divided into modules, and the genes within each module had similar expression patterns across stages (Supplementary Fig. S2B). Using WGCNA, we calculated the correlation between each module and the EMT state and found that the MEgreen, MEbrown and MEturquoise modules were significantly correlated with the EMT state (Fig. 1B). Genes in the MEgreen module were highly expressed in the intermediate state, genes in the MEbrown module were highly expressed in E state cells, and genes in the MEturquoise module were highly expressed in M state cells (Fig. 1C).
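The intersection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the comparison labels and the toy gene lists are hypothetical and do not reproduce the study's data or its actual pipeline.

```python
def intersect_degs_with_rbps(deg_lists, rbp_genes):
    """Pool DEGs from every comparison, then keep those that are known RBP genes."""
    all_degs = set()
    for degs in deg_lists.values():
        all_degs.update(degs)
    return all_degs & set(rbp_genes)


# Toy input only: comparison names and gene memberships are made up.
deg_lists = {
    "comparison_1": ["RBM47", "CDH1", "PCBP3"],
    "comparison_2": ["VIM", "SRP72", "RBM47"],
}
rbp_genes = ["RBM47", "PCBP3", "SRP72", "FRG1"]

derbps = intersect_degs_with_rbps(deg_lists, rbp_genes)
print(sorted(derbps))
```

Pooling before intersecting means a gene counts as a DERBP if it is differentially expressed in at least one comparison, which matches the "combined all identified DEGs" wording.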

Identification of EMT-related RBPs in a breast cancer cell line. (A) Venn diagram showing the overlap of DEGs and RBP genes. (B) The correlation between DERBPs in different modules and EMT state. Module-trait associations were computed by an LME model with all factors on the x axis used as covariates. All Pearson's correlation values and P values are displayed. (C) Heatmap of module eigengenes sorted by average linkage hierarchical clustering. FPKM values were log2-transformed and then median-centred by each gene (color figure online). (D–F) Heatmaps showing the expression profiles of DERBPs in the green, brown and turquoise modules. FPKM values were log2-transformed and then median-centred by each gene (color figure online). (G) The top 5 most enriched GO terms for DERBP genes in the three modules. The colour scale shows the row-scaled significance (−log10 corrected P value) of the terms.

RBPs in the MEgreen, MEbrown and MEturquoise modules were further extracted, and expression heatmaps were drawn. Most of the RBPs in the MEgreen module were highly expressed in the intermediate state (Fig. 1D). Some of the RBPs in the MEbrown module were highly expressed in E state cells, while others were expressed at extremely low levels in this state (Fig. 1E). Most RBPs in the MEturquoise module were highly expressed in M3 state cells, while a small portion were expressed at extremely low levels in M3 state cells (Fig. 1F). These results suggested that the expression level of RBPs might affect the EMT process in breast cancer.
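The heatmap preprocessing named in the figure legend (log2 transformation followed by median-centring per gene) can be sketched as follows. The pseudocount of 1 is an assumption added here to avoid taking the logarithm of zero; the source does not say how zero FPKM values were handled, and the gene name and values are toy placeholders.

```python
import math
from statistics import median


def log2_median_centre(fpkm_by_gene, pseudocount=1.0):
    """Log2-transform each gene's FPKM values, then subtract that gene's median."""
    centred = {}
    for gene, values in fpkm_by_gene.items():
        # pseudocount is an assumption, not taken from the source
        logged = [math.log2(v + pseudocount) for v in values]
        gene_median = median(logged)
        centred[gene] = [x - gene_median for x in logged]
    return centred


# Toy FPKM values for one hypothetical gene across three samples.
centred = log2_median_centre({"GENE_A": [0.0, 1.0, 3.0]})
print(centred["GENE_A"])
```

Centring each gene on its own median puts genes with very different absolute abundance on a common scale, so the heatmap colours reflect relative changes across EMT stages rather than overall expression level.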

The genes of the MEgreen, MEbrown and MEturquoise modules were extracted for GO pathway analysis. The pathways enriched in MEbrown genes mainly included the innate immune response, immune system processes and mRNA processing (Supplementary Fig. S2C). The pathways enriched in MEgreen genes mainly included spermatogenesis, cell differentiation, RNA splicing and mRNA processing (Supplementary Fig. S2D). The pathways enriched in MEturquoise genes included mRNA processing and RNA splicing (Supplementary Fig. S2E). We further extracted the GO functional pathways commonly enriched in MEgreen, MEbrown and MEturquoise genes. The MEgreen genes had the highest degree of enrichment in RNA splicing and spermatogenesis pathways, the MEturquoise genes had the highest degree of enrichment in mRNA processing, and the MEbrown genes had the highest degree of enrichment in innate immune pathways (Fig. 1G).
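The text does not state which statistic was used for GO enrichment. As an illustration only, the sketch below uses a one-sided hypergeometric test, a common choice for over-representation analysis; this is an assumption, and the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the module, k: module genes annotated to the term.
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Toy counts only; not taken from the study.
p_value = hypergeom_enrichment_p(N=1000, K=50, n=20, k=5)
print(p_value)
```

In practice the resulting P values would still need multiple-testing correction across all GO terms, which is what the "corrected P value" in the figure legend refers to.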

According to the above results, RBPs highly expressed in E state breast cancer cells might regulate the expression of immune-related genes in cancer cells to achieve immune escape. Breast cancer cells in the intermediate state overexpressed RBPs related to splicing regulation and promoted EM progression. Breast cancer cells in the M state highly expressed RBPs related to mRNA processing and realised the transformation to the M phenotype.

Based on the transcriptome data of 18 breast cancer cell samples at different EMT stages, AS events were analysed according to splice site usage using the SUVA pipeline. Five types of AS events, including alternative 5' splice site events, were identified (Supplementary Fig. S3A).

The SUVA pipeline was used to compare the pSAR of the same splicing event between two groups of samples. We identified a large number of AS events, such as those involving alternative 5' splice sites and alternative 3' splice sites, between different comparison groups (Fig. 2A). In addition, by matching the splicing events detected by SUVA to classical splicing events, 10 kinds of splicing events, including a large number of events involving alternative 3' splice sites, were found (Supplementary Fig. S3B). The median pSAR value of each differential AS event was then calculated. We found that most of the differential AS events had pSAR values greater than 50% (Fig. 2B). Principal component analysis was performed based on the pSAR values of differential splicing events with pSAR ≥ 50% in each sample. The results showed that samples from the same stage (E, EM1, EM2, EM3, M1 and M2) clustered together. This suggested that these differential AS events can be used to distinguish breast cancer cells at different EMT stages (Fig. 2C).
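The median-and-filter step can be sketched as below. This is a hedged illustration, not SUVA's own code: it assumes the median is taken across samples for each event, treats the 50% cutoff as inclusive, and uses hypothetical event identifiers and toy pSAR values (in percent).

```python
from statistics import median


def high_psar_events(psar_by_event, cutoff=50.0):
    """Return {event: median pSAR} for events whose median pSAR reaches the cutoff."""
    medians = {event: median(values) for event, values in psar_by_event.items()}
    return {event: m for event, m in medians.items() if m >= cutoff}


# Toy pSAR values (percent) per event across three samples.
psar_by_event = {
    "event_1": [62.0, 71.5, 55.0],
    "event_2": [12.0, 48.0, 30.0],
    "event_3": [50.0, 50.0, 90.0],
}

selected = high_psar_events(psar_by_event)
print(selected)
```

Only the events retained by a filter of this kind would be passed on to the principal component analysis and the heatmap.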

Identification of EMT-related AS in a breast cancer cell line. (A) Bar plot showing the number of RAS detected by SUVA in each group. (B) Bar plot showing RAS with different pSAR. RAS with pSAR ≥ 50% were labeled. (C) Principal component analysis based on RAS with pSAR ≥ 50%. The ellipse for each group is the confidence ellipse. (D) Heatmap showing the splicing ratio of RAS (pSAR ≥ 50%). Splicing ratios were log2-transformed and then median-centred by each gene (color figure online). (E) Bar plot showing the most enriched GO biological process results of the RAS with pSAR ≥ 50%.

A heatmap was drawn with the pSAR values of differential splicing events with pSAR ≥ 50%. The pSAR values of some splicing events in breast cancer cells at the E, EM and M2 stages were higher than those at other stages (Fig. 2D).

To identify the potential functions of these differential AS events, we extracted the genes in which these events occurred and performed GO and KEGG analyses. GO analysis showed that these genes were enriched in pathways including cellular response to DNA damage stimulus, cell division, cell cycle, positive regulation of GTPase activity, protein transport and tRNA methylation (Fig. 2E). KEGG analysis showed that these genes were enriched in pathways including adherens junction, Epstein-Barr virus infection, fatty acid biosynthesis, ferroptosis and Yersinia infection (Supplementary Fig. S3C).

Given that RBPs can regulate the AS of some genes during EMT in breast cancer, we extracted differential splicing events with pSAR ≥ 50% and the EMT-associated RBPs in the MEgreen, MEbrown and MEturquoise modules. By establishing coexpression relationships between the expression levels of these RBPs and the pSAR of differential AS events, we obtained the AS events potentially regulated by EMT-related RBPs. The genes involved in these differential splicing events were extracted for GO function analysis. We found that these genes were mainly enriched in cell adhesion, the integrin-mediated signalling pathway, lipid transport and positive regulation of GTPase activity (Fig. 3A).
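A coexpression screen of this kind can be sketched with a plain Pearson correlation between each RBP's expression and each event's pSAR across samples. The sketch is an assumption-laden illustration: it applies only a correlation cutoff of 0.9 in absolute value, omits the P value filter, and all identifiers and numbers are toy placeholders.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coexpressed_pairs(rbp_expr, event_psar, r_cutoff=0.9):
    """List (rbp, event, r) for pairs with |r| >= r_cutoff; P value filter not applied."""
    pairs = []
    for rbp, expr in rbp_expr.items():
        for event, psar in event_psar.items():
            r = pearson(expr, psar)
            if abs(r) >= r_cutoff:
                pairs.append((rbp, event, r))
    return pairs


# Toy data: six samples, hypothetical RBPs and splicing events.
rbp_expr = {"RBP_A": [1, 2, 3, 4, 5, 6], "RBP_B": [3, 1, 4, 1, 5, 9]}
event_psar = {"event_1": [10, 20, 30, 40, 50, 60], "event_2": [60, 50, 40, 30, 20, 10]}

pairs = coexpressed_pairs(rbp_expr, event_psar)
print([(rbp, event) for rbp, event, _ in pairs])
```

Keeping both strongly positive and strongly negative correlations reflects that an RBP could plausibly either promote or repress a given splicing event; correlation alone does not establish regulation.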

DERBPs potentially regulated AS associated with cell adhesion in a breast cancer cell line. (A) The most enriched GO biological process results of the coexpressed RAS (pSAR ≥ 50%) potentially regulated by DERBPs. Cutoffs of P value ≤ 0.01 and Pearson coefficient ≥ 0.9 or ≤ −0.9 were applied to identify the coexpression pairs. (B) Heatmap showing the splicing ratio of RAS in the cell adhesion pathway. Splicing ratios were log2-transformed and then median-centred by each gene (color figure online). (C) Regulatory networks for differential AS events and coexpressed RBPs on genes in the cell adhesion pathway. (D) The read distribution and splicing ratio of clualt3p26826 ITGA6. The expression levels of PCBP3 in breast cancer cells at different EMT stages are shown in the right part.

In view of the important role of cell adhesion in EMT and cancer metastasis [29], we further extracted differential AS events corresponding to genes enriched in cell adhesion pathways and drew a heatmap of their pSAR values. The pSAR of some differential splicing events was higher in the E and EM stages, while the pSAR of other differential splicing events was higher in the M stage (Fig. 3B).

We constructed regulatory networks for differential AS events and coexpressed RBPs on genes in the cell adhesion pathway and found that 88 RBPs may regulate 37 differential splicing events on 19 cell adhesion pathway genes (Fig. 3C). RBM47, PCBP3, FRG1, SRP72 and other RBPs might regulate the AS of ITGA6, ADGRE5, TNC and other genes and affect the EMT process of breast cancer cells (Fig. 3D and Supplementary Fig. S4).

We further downloaded the sequencing data of breast cancer patients and related clinical information from the TCGA database and extracted the expression levels of the above 88 RBPs with regulatory effects. In the RBP-related analysis, 1216 breast cancer patients were screened (Supplementary Table S1). The median follow-up was 905 days (interquartile range 462–1694 days), with 200 deaths. We constructed a risk model based on the expression levels of these RBPs and found that ADAT2, C2orf15, SRP72, PAICS, RBMS3, APOBEC3G, NOA1 and ACO1 could be used for risk assessment in terms of breast cancer prognosis (Fig. 4A–C). Patients predicted to be at high risk using this model had a significantly worse prognosis (Fig. 4D). We found significant differences in the expression levels of all 8 RBPs in breast cancer tissues without metastasis compared with normal breast tissues. Perhaps due to the small number of metastatic samples, the expression levels of the 8 RBPs in breast cancer tissues with vs. without metastasis were not significantly different (Fig. 4E). Further analysis showed that the expression levels of the 8 RBPs in breast cancer tissues were significantly correlated with the prognosis of patients (Fig. 4F).
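Risk models built from LASSO-selected genes are commonly scored as a weighted sum of expression values, with patients split into high- and low-risk groups at the median score. The source does not give the exact formula, the coefficients or the cutoff, so everything in the sketch below (gene labels, coefficients, expression values, the median split) is a hypothetical placeholder illustrating that common convention.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Risk score per patient: sum over genes of coefficient * expression."""
    return {
        patient: sum(coefficients[g] * expr[g] for g in coefficients)
        for patient, expr in expression.items()
    }


def split_by_median(scores):
    """Label patients 'high' if their score exceeds the median score, else 'low'."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Placeholder coefficients and expression values; not taken from the study.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "P4": {"GENE_A": 8.0, "GENE_B": 0.0},
}

scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
print(scores, groups)
```

The two groups produced this way are what a survival comparison such as a Kaplan-Meier analysis would then contrast.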

EMT-related RBPs were significantly correlated with the prognosis of breast cancer patients. (A) The result of LASSO regression analysis. (B) LASSO coefficient profiles of the candidate RBPs by tenfold cross-validation. (C) Prognostic value of the candidate RBPs in breast cancer. The HR and P values were calculated using univariate Cox regression analysis. (D) Comparison of overall survival according to the risk score calculated from candidate RBPs. (E) Boxplot showing the FPKM of candidate RBPs in Tumour, Metastatic and Normal samples. *P < 0.05; **P < 0.01; ***P < 0.001. (F) Relationship between the expression level of candidate RBPs and prognosis of breast cancer.

In the AS-related analysis, 90 breast cancer patients were screened (Supplementary Table S2). The median follow-up was 1268 days (interquartile range 774–2129 days), with 26 deaths. We used SUVA to identify differential AS events between breast cancer tissue and normal tissue in TCGA and obtained pSAR values of the 37 differential splicing events related to 19 cell adhesion pathway genes. Risk analysis based on the pSAR values of splicing events showed that splicing events occurring on TNC and COL6A3 could be used to evaluate breast cancer prognosis (Fig. 5A–C). Patients with high-risk differential splicing events had a poor prognosis (Fig. 5D). There were significant differences in the pSAR values of these two splicing events in breast cancer tissue without metastasis compared with normal breast tissue (Fig. 5G). Further analysis showed that the pSAR values of these two differential splicing events in breast cancer tissues were significantly correlated with the prognosis of patients (Fig. 5E,F).

EMT-related AS were significantly correlated with the prognosis of breast cancer patients. (A) The result of LASSO regression analysis. (B) LASSO coefficient profiles of the candidate AS by tenfold cross-validation. (C) Prognostic value of the candidate AS in breast cancer. The HR and P values were calculated using univariate Cox regression analysis. (D) Comparison of overall survival according to the risk score calculated from candidate AS. (E,F) Relationship between the pSAR of candidate AS and prognosis of breast cancer. (G) Boxplot showing the splicing ratio of clualt5p25729 COL6A3 and clualt3p46274 TNC in Tumour and Normal samples. *P < 0.05; **P < 0.01; ***P < 0.001.
