Genome-wide promoter responses to CRISPR perturbations of … – Nature.com

Posted: September 17, 2023 at 11:45 am

PPTP-seq development and validation

PPTP-seq uses a plasmid to integrate each CRISPRi-based TF perturbation and each promoter activity reporter into one construct. Each plasmid contains a CRISPRi cassette that constitutively expresses a single guide RNA (sgRNA) to repress a specific TF in the genome19 and a promoter-reporter cassette to measure the activity of a specific promoter under the TF-repressed condition (Fig.1a, b). A self-cleaving ribozyme, RiboJ, was inserted between the promoter and the gfp reporter gene to produce invariant mRNA sequences, thus eliminating the interference of different promoter sequences with gfp mRNA stability20.

a Schematic of a regulatory network. Perturbing regulators and the recorded responses of genes are used to infer regulatory interactions. b Reporter plasmids used to quantify promoter activity under CRISPRi-based regulator perturbation. A native promoter was cloned upstream of the gfp gene, and a sgRNA was inserted downstream of a constitutive promoter. c Massively parallel promoter activity measurements for a combinatory library. A combinatory library of more than 2.5 × 10^5 sgRNA-promoter pairs was sorted into 16 bins according to their GFP expression levels. The sgRNA and promoter regions in each bin were sequenced to estimate perturbed promoter activity for each sgRNA-promoter pair. d Sorted promoter activities of all promoters. The gray and red dots respectively represent promoter activities in strains with TF-targeting sgRNAs and negative control sgRNAs. The black line represents sorted median promoter activities across all TFKD conditions. The blue lines indicate 2-fold changes from the median activities. a.u. arbitrary units. Source data are provided as a Source Data file.

To profile genome-wide transcriptional responses for all TFs in E. coli, we constructed a combinatorial plasmid library consisting of both a sgRNA library and a promoter library (Fig.1c). The sgRNA library contains 183 TF-targeting sgRNAs that repress every single known TF gene in the E. coli genome (Supplementary Data1), and contains five non-targeting sgRNAs as negative controls. The promoter library contains 1372 native promoters that cover more than 50% of all operons in E. coli21 (Supplementary Data2). The combinatorial plasmid library was transformed into E. coli strain FR-E01, which carries a dCas9 gene in its chromosome. Transformed cells were first grown in minimal glucose medium to a steady state and sorted into 16 bins based on their fluorescence intensity (Supplementary Fig.1a). More than 20 million cells (including all 16 bins) were sorted in each replicate (Supplementary Fig.1b and Supplementary Data3), and their plasmids were sequenced using the NovaSeq S4 XP Platform, generating an average of 420 million reads from each replicate (Supplementary Fig.1c and Supplementary Data3). To estimate promoter activities under each perturbed TF condition, sequencing read counts across the bins were first converted to a cell count distribution for each individual variant, which was then fitted to a log-normal distribution by maximum-likelihood estimation22,23,24 (Supplementary Fig.2 and Methods).
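The read-count-to-activity step can be illustrated with a minimal sketch. It is not the authors' implementation (see their Methods for that); it assumes that reads in each bin are rescaled by the number of cells sorted into that bin, that bin boundaries are known in log10 fluorescence units, and that a simple grid refinement is an acceptable stand-in for a proper optimizer. The names `reads_to_cells`, `fit_lognormal`, and the bin edges are hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution; works for x = +/- infinity."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def reads_to_cells(variant_reads, bin_total_reads, bin_sorted_cells):
    """Assumed conversion: scale a variant's read share in each bin by the cells sorted there."""
    return [c * r / t if t else 0.0
            for r, t, c in zip(variant_reads, bin_total_reads, bin_sorted_cells)]

def binned_log_likelihood(counts, edges, mu, sigma):
    """Multinomial log-likelihood of per-bin counts for normally distributed log10 fluorescence."""
    total = 0.0
    for i, n in enumerate(counts):
        if n == 0:
            continue
        p = normal_cdf(edges[i + 1], mu, sigma) - normal_cdf(edges[i], mu, sigma)
        total += n * math.log(max(p, 1e-300))  # guard against log(0)
    return total

def fit_lognormal(counts, edges, rounds=8, grid=21):
    """Maximize the binned likelihood over (mu, sigma) by repeated grid refinement."""
    finite = [e for e in edges if math.isfinite(e)]
    mu_lo, mu_hi = finite[0], finite[-1]
    sd_lo, sd_hi = 0.01, finite[-1] - finite[0]
    best = (mu_lo, sd_lo)
    for _ in range(rounds):
        mu_step = (mu_hi - mu_lo) / (grid - 1)
        sd_step = (sd_hi - sd_lo) / (grid - 1)
        candidates = [(mu_lo + i * mu_step, sd_lo + j * sd_step)
                      for i in range(grid) for j in range(grid)]
        best = max(candidates,
                   key=lambda p: binned_log_likelihood(counts, edges, p[0], p[1]))
        # Zoom in on the neighbourhood of the current best grid point.
        mu_lo, mu_hi = best[0] - mu_step, best[0] + mu_step
        sd_lo, sd_hi = max(1e-3, best[1] - sd_step), best[1] + sd_step
    return best

# Hypothetical layout: 16 bins, 15 finite boundaries in log10(a.u.), open outer bins.
edges = [-math.inf] + [0.25 * k for k in range(1, 16)] + [math.inf]
```

Under these assumptions, the fitted `mu` is the mean of log10 fluorescence, so `10 ** mu` would serve as the activity estimate (the median of the fitted log-normal distribution).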

Measured promoter activities were highly consistent between independent biological replicates performed in different weeks, with replicate correlation ranging between 0.90 and 0.95 (Supplementary Fig.3a). Across three independent replicates, the promoter activities of 201,433 library members (i.e., 201,433 different TF-promoter pairs, 81% of the entire library) passed our quality filters (Supplementary Fig.3b, Methods). For most promoters, the median activity of a promoter across all TFKD conditions was consistent with its activity in negative controls (Fig.1d and Supplementary Fig.4). We found that more than 98% of TF-promoter pairs fell within the two-fold-change boundaries of the median activity, indicating robust promoter activities in most TFKD conditions18,25.

CRISPRi can impair cell growth if essential genes are targeted. Seven TF-targeting sgRNAs (alaS, bluR, dicA, dnaA, iscR, mraZ, and nrdR) had substantially reduced reads (fewer than 10,000 reads per sgRNA compared to an average of 4.8 million reads per sgRNA). Among them, alaS, dicA, and dnaA are essential genes whose deletion led to cell death26,27. CRISPRi polarity28,29 can also lead to the repression of essential genes that are located downstream of a targeted TF within the same operon. This explains the substantially reduced reads for iscR, mraZ, and nrdR.

We further evaluated the CRISPRi repression efficiency using both the TFs' promoter activity measured from PPTP-seq (Supplementary Fig.5a) and transcript level measured from RT-qPCR (Supplementary Fig.5b). The two methods respectively found 95% and 86% of tested TFs showed significant repression (Student's t-test P-value < 0.05) compared to their corresponding controls containing non-targeting sgRNAs (Supplementary Note1). We further found a clear negative correlation between the degree of CRISPRi repression and TF expression level measured from the TFs' promoter activity (Supplementary Fig.5c, d). This explains the lack of repression for the small fraction of TFs (e.g., qseB and ttdR).

To further validate the promoter activities measured by PPTP-seq, we randomly selected five promoters, which involve a diverse range of gene functions. We then individually measured their activities in response to CRISPRi repression of nine representative TFs (and one non-targeting sgRNA as a negative control), using a plate-reader-based whole-cell fluorescence assay (Supplementary Fig.6a). Of these 50 sgRNA-promoter pairs, 45 were quantified by PPTP-seq and were highly consistent with individual whole-cell fluorescence measurements (Supplementary Fig.6b, Pearson's r = 0.95), confirming the high quality of our pooled measurements. The other five combinations were missing in all three replicates due to their low read counts. This small dataset also contained the regulatory effects of five known direct interactions and one indirect interaction in RegulonDB1 (Supplementary Fig.6c).

We also compared our promoter activity measurements to previously published datasets from other independent experiments. Promoter activities measured from PPTP-seq (using the negative control strains) correlated with transcript levels measured from RNA-seq30 and promoter activities individually measured using flow cytometry31 (Supplementary Fig.7a-c, Pearson's r = 0.68 and 0.74, respectively). Additionally, fold change in promoter activity upon TFKD measured from PPTP-seq is also qualitatively consistent with that measured from EcoMAC microarray32 for a few known regulatory interactions in RegulonDB1 (Pearson's r = 0.51, Supplementary Fig.7d).

We quantified promoter activity changes by TFKD relative to negative controls (Supplementary Fig.4) and modeled the replicated data as log-normal distributed to determine statistical significance. From the 201,433 measured promoter activities, single TFKDs led to upregulation in 3720 TF-promoter pairs and downregulation in 338 pairs (>1.7-fold in promoter activity, q<0.01; Fig.2a) in minimal glucose medium. Most TFs regulate fewer than ten promoters, while a few TFs affect more than 100 promoters (Fig.2b). We also found promoters that are regulated by multiple activators (leading to downregulation by TFKD in Fig.2c) are much less abundant than those regulated by multiple repressors (leading to upregulation in Fig.2c). The most common regulatory effect on a regulated promoter observed in PPTP-seq was single regulation by a single activator or a single repressor (30%, Fig.2c and Supplementary Fig.4), which was consistent with previous datasets measured using other methods1,14.
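The two cutoffs quoted above (more than 1.7-fold change and q < 0.01) can be applied as in the sketch below. The source does not say here how q-values were obtained from the log-normal model, so the Benjamini-Hochberg adjustment is an assumption, as are the function names and the convention that fold change is TFKD activity divided by negative-control activity.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues

def call_responses(fold_changes, pvalues, fc_cutoff=1.7, q_cutoff=0.01):
    """Label each TF-promoter pair 'up', 'down', or 'none' upon TF knockdown."""
    calls = []
    for fc, q in zip(fold_changes, benjamini_hochberg(pvalues)):
        if q < q_cutoff and fc > fc_cutoff:
            calls.append("up")      # promoter more active when the TF is repressed
        elif q < q_cutoff and fc < 1.0 / fc_cutoff:
            calls.append("down")    # promoter less active when the TF is repressed
        else:
            calls.append("none")
    return calls
```

In this convention an "up" call points to a repressing TF and a "down" call to an activating TF, matching the interpretation used in the text.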

a Promoter activity changes by TFKD. Dashed lines indicate cutoffs for statistically significant (q<0.01) and substantial (>1.7-fold change) effects. Each dot represents a TF-promoter pair. Upregulation and downregulation by TFKD are shown in red and blue, respectively. A few known interacting TF-promoter pairs are labeled. b Histogram of the number of regulated promoters per TF. Inset in (b) shows histograms over a smaller range. c Histogram of the number of regulating TFs per promoter. d Fractions of constant promoters and variable promoters in each COG category. All COG categories of genes in an operon controlled by a promoter are assigned to the promoter. The dashed line indicates the average fraction of constant promoters over all COG categories. Statistical significance is determined by one-sided Fisher's exact test. **P<0.01. Source data are provided as a Source Data file.

Collectively, we identified 936 (71% of 1323 measured promoters) variable promoters with significant activity change under at least one TFKD condition (Supplementary Note2), and the other 29% of the promoters were considered constant promoters. Clusters of Orthologous Genes (COG) analysis33 of all downstream genes of these promoters indicated that genes expressed by variable promoters are enriched in the COG class of Carbohydrate transport and metabolism (P = 4.4 × 10^-3) (Fig.2d), specifically KEGG pathways in galactose metabolism (eco00052), pentose and glucuronate interconversions (eco00040), starch and sucrose metabolism (eco00500), and amino sugar and nucleotide sugar metabolism (eco00520). Variable promoters also control flagellar and pilus genes (Supplementary Data4). The results suggested that these functions or activities are more readily subject to regulation when conditions change. Genes expressed by constant promoters are enriched in inorganic ion transport and metabolism (P = 2.6 × 10^-3), specifically sulfur metabolism (eco00920), ion transport (GO:0006811), and iron ion homeostasis (GO:0055072) (Supplementary Data4), suggesting that these genes play housekeeping roles (Fig.2d).
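For readers unfamiliar with the enrichment test named in the Fig.2d caption, a one-sided Fisher's exact test on a 2 × 2 table can be computed directly from the hypergeometric distribution. The sketch below is a generic illustration, not the authors' analysis code; the function name and the table layout (category membership in rows, variable versus constant promoters in columns) are assumptions.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """Upper-tail P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins.

    a: promoters in the category and variable
    b: promoters in the category and constant
    c: promoters outside the category and variable
    d: promoters outside the category and constant
    """
    row1 = a + b            # promoters in the category
    col1 = a + c            # variable promoters
    n = a + b + c + d       # all promoters
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1))
    return tail / comb(n, col1)
```

A small p-value here means the category holds more variable promoters than expected by chance; swapping the two columns tests enrichment among constant promoters instead.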

We systematically investigated whether a TF's promoter can be affected by itself or other TFs. A perturbation-response network between TFs was constructed, where activation and repression represent down- and upregulation by CRISPRi knockdown of an upstream TF, respectively (Fig.3a). In minimal glucose medium, a total of 26 activations and 339 repressions were observed between 126 TFs (Supplementary Data5). Within this dataset, no mutual regulation or repressilators of three or more TFs were observed, likely due to low expression or missing allosteric regulation for some TFs when cells are growing in minimal glucose medium (Supplementary Note3).
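A search for mutual regulation and three-member loops in such a network can be sketched as follows. This is an illustrative check only: it assumes the network is given as (regulator, target) pairs, ignores the sign of each edge (a true repressilator would need every edge to be a repression), and looks only at cycles of length two and three rather than longer ones. The function name is hypothetical.

```python
def find_short_cycles(edges):
    """Return (mutual pairs, three-member cycles) in a directed TF-TF network.

    edges: iterable of (regulator, target) pairs; autoregulation is ignored.
    """
    graph = {}
    for src, dst in edges:
        if src != dst:
            graph.setdefault(src, set()).add(dst)

    # Two TFs that each affect the other's promoter.
    mutual = {frozenset((a, b))
              for a, targets in graph.items()
              for b in targets
              if a in graph.get(b, ())}

    # Directed loops a -> b -> c -> a over three distinct TFs.
    triples = set()
    for a, targets in graph.items():
        for b in targets:
            for c in graph.get(b, ()):
                if c != a and a in graph.get(c, ()):
                    triples.add(frozenset((a, b, c)))
    return mutual, triples
```

Both returned sets being empty would correspond to the observation reported above for minimal glucose medium.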

a Perturbation-response network of TFs constructed using PPTP-seq data in minimal glucose medium. b Autoregulation of TFs identified by PPTP-seq in minimal glucose medium. Promoter activity fold changes upon the knockdown of TF controlled by the promoter. TF gene names marked in red were selected for validation. Source data are provided in Supplementary Data5.

We then examined TF autoregulatory responses, which have been challenging to study using other methods due to the coupling between perturbation and readout. We identified 12 autoregulated TFs with strong perturbation effects (>1.7-fold in promoter activity, q<0.01) in minimal glucose medium, including two autoregulatory interactions, PgrR and ComR, not present in RegulonDB (Fig.3b). Meanwhile, several previously identified autoregulated TFs (e.g., PhoB, Fur, and LldR) showed only weak perturbation effects (i.e., less than 30% promoter activity change) under our growth conditions in minimal glucose medium. To further validate these findings, we selected seven TF genes and measured their promoter activities across a wide range of TF concentrations using a tunable E. coli TF library34, in which each endogenous TF is replaced by an inducible TF-mCherry fusion (Supplementary Fig.8). Both pgrR and comR promoters showed higher activity at lower TF levels, confirming their negative autoregulation. PgrR autoregulation is consistent with the identified PgrR binding site on its promoter region35. Except for ZraR, four out of five previously identified autoregulated TFs displayed negligible promoter activity changes over a wide TF level range. Thus, the results from the tunable TF library were mostly consistent with PPTP-seq. Our results also suggest that some previously identified autoregulatory responses are absent when cells are growing in minimal glucose medium and may occur only under other growth conditions36,37,38,39, so the interpretation of TF regulation should consider the condition dependency.

PPTP-seq data also allow us to systematically examine gene regulation in complex metabolic pathways. As an example, we selected one-carbon metabolism (OCM), whose transcriptional regulation is not well characterized in bacteria. OCM is tightly associated with the synthesis of nucleotides, amino acids, and two essential cofactors, tetrahydrofolate (THF) and S-adenosylmethionine (SAM), and it plays important roles in cell survival and growth. However, due to the presence of multiple metabolic cycles and interconnected pathway structures, dissecting the regulatory function of OCM remains challenging.

We identified 28 TF genes that can affect at least one promoter in OCM (Supplementary Fig.9). A few genes in methionine and SAM biosynthesis, such as metA, metE, and metK, were observed to be upregulated by metJ knockdown, recapitulating the known feedback control of SAM biosynthesis via MetJ5,40 (Fig.4a). Additionally, we found that metA, metE, and metK were also regulated by other TFs, but in distinct patterns (Fig.4b). For example, metE was found to be activated only by metJ knockdown, while metK was upregulated by knockdown of ten different TFs. This finding is intuitively surprising because MetE and MetK catalyze two consecutive reactions in the methionine cycle, and enzymes from the same pathway are often co-regulated41. The different regulations on metE and metK thus indicate that enzymes catalyzing consecutive steps can have distinct cellular functions: MetE synthesizes methionine for protein synthesis, and MetK produces SAM as a cofactor for metabolic reactions (Fig.4a).

a Promoter activity changes in response to metR and metJ knockdown by CRISPRi. Hcy and SAM control the activity of MetR and MetJ, respectively. NA not applicable, KD knockdown, GTP guanosine-5'-triphosphate, DHPPP 6-hydroxymethyl-7,8-dihydropterin pyrophosphate, PABA para-aminobenzoic acid, DHP dihydropteroate, DHF dihydrofolate, THF tetrahydrofolate, dUMP deoxyuridine monophosphate, dTMP deoxythymidine monophosphate, Met L-methionine, fMet N-formylmethionine, Hcy L-homocysteine, SAM S-adenosylmethionine, SAH S-adenosylhomocysteine, Rib-Hcy S-ribosyl-L-homocysteine. b TF-dependent promoter activity changes for metA, metE, and metK. Each row represents a promoter, and each column stands for a TFKD condition. c Validation of MetR targets. Promoter activities were measured in a metR knockdown strain and, as a control, in a wild-type E. coli strain. Data are presented as means ± SD of three replicates from different days. a.u. arbitrary units. Source data are provided as a Source Data file.

The PPTP-seq dataset also revealed the regulatory functions of MetR, previously known only as a regulator of methionine biosynthesis. We found that metR knockdown affected multiple genes in the folate cycle and folate biosynthesis (e.g., metF, thyA, and folE; Fig.4a), not present in RegulonDB1. Previous DAP-seq binding analysis using purified TFs and genomic DNA fragments identified MetR binding sites at metF and folE promoters42, but the in vivo regulatory responses have never been tested. We further verified these regulatory responses using a MetR knockdown strain from the tunable TF library34 (Fig.4c). These findings allow us to discover metabolic feedback control mechanisms in E. coli OCM under homocysteine-starved conditions because MetR binding to DNA requires homocysteine activation43. When homocysteine is limited, cells cannot produce sufficient methionine for translation initiation and elongation. To quickly rescue the cells from their methionine-limited state, MetR-repression of metF must be alleviated, increasing the amount of 5-methyl-THF and preparing for rapid methionine synthesis when the homocysteine level is sufficiently restored. Meanwhile, upregulated metF and thyA by MetR also increase 5,10-methylene THF consumption, which simultaneously reduces 10-formyl-THF due to reversible reactions between these THF species (Fig.4a). Low 10-formyl-THF and methionine can further result in the insufficient formation of initiator tRNA to slow down translation. Additionally, we found that MetR activates folE, whose enzyme product catalyzes the first step in folate biosynthesis (Fig.4a). Thus, homocysteine limitation can also repress folE, thereby decreasing folate biosynthesis. Taken together, these phenomena suggest that MetR helps to block protein translation initiation and folate synthesis in response to low homocysteine and accumulates 5-methyl THF to prepare for rapid methionine biosynthesis once homocysteine is available.

Our genome-wide promoter activity measurements from perturbed TF levels can provide information that complements TF-promoter binding datasets from ChIP-seq, ChIP-exo, DAP-seq, gSELEX, and curated TF binding sites (TFBSs) in RegulonDB1,42,44,45, yielding knowledge about direct and functional TF-promoter interactions. In total, out of the 4058 regulatory responses identified by PPTP-seq in minimal glucose medium, 225 have binding evidence from DAP-seq, and an additional 256 have binding evidence from other binding datasets, altogether representing 12% (481/4058) of the PPTP-seq identified responses (Fig.5a, b, Supplementary Data6). For 127 TFs with binding site information, on average, 23% of regulated promoters per TF were presumably direct targets (Fig.5c). For the remaining 56 TFs, their TFBSs were either not in our promoter library or not identified yet. Among the 481 regulatory responses with binding evidence, only 78 were found in the TF-operon network in RegulonDB, and the remaining 403 TF-promoter responses may represent regulatory interactions in minimal glucose medium that are not present in RegulonDB (Supplementary Table1).

a Comparison of TF perturbation-response results from PPTP-seq and TF binding results. b Fraction of TF-promoter pairs that have binding evidence. c Distribution of fraction of regulated promoters with corresponding TFBS for each TF. d-i Factors that may affect whether a potentially bound TF on a promoter affects the promoter activity. For each TF-promoter binding interaction, the binding site location in DAP-seq (d), TF concentration measured by Ribo-seq (e), TF concentration measured by mass spectrometry (f), relative binding strength per TF measured by DAP-seq (g), relative binding strength per TF measured by gSELEX (h), and relative binding strength per promoter measured by DAP-seq (i) were considered. The violin plot shows the distribution of data, the central dot in the box represents the median, the box bounds represent the 25th and 75th percentiles, and whiskers represent the minima to maxima values. The number of TFBSs is indicated below. Benjamini-Hochberg adjusted P-values were calculated by the Wilcoxon rank sum test. Source data are provided in Supplementary Data6.

In general, PPTP-seq results and the binding datasets have a small overlap in TF-promoter interaction pairs (Fig.5a), which is consistent with the low overlaps between similar comparisons on specific TFs (GadX, GadW, Fur, and SoxS) in E. coli36,46,47 and between eukaryotic transcriptional response and TF binding datasets3,48. This can be caused by low TF expression levels, low TF activity (affected by other molecules), and/or complex regulatory patterns. We individually examined two promoters that have multiple different TF binding sites (Supplementary Note4 and Supplementary Fig.10). We found that the lack of response can be explained by context-dependent transcriptional regulation49, in which the regulatory function of one TF is affected by other TFs bound on the same promoter. Further, we found that deactivating the regulating TF can lead the promoter to respond to previously non-regulatory TFs (Supplementary Note4 and Supplementary Fig.10h, i). These observations indicate that TF-promoter binding is not sufficient for response, and E. coli uses layered control to achieve complex logic for gene expression. In RegulonDB, 48% of regulated promoters have more than one functional TF binding site (Supplementary Fig.11), suggesting that such context-dependent transcriptional regulation can be ubiquitous in E. coli.

We sought to explore what general features determine whether a potentially bound TF can regulate promoter activity under our experimental condition (i.e., growing in minimal glucose medium). For each TF binding site, we focused on the binding location, TF concentration, and binding strength. We found that binding sites from both regulating and non-regulating TFs were centered around the transcription start site (TSS) of a promoter50 (Fig.5d) and that regulating TFs had a significantly higher concentration in cells over non-regulating TFs (Fig.5e, f). Additionally, previous biophysical models indicate that TF-DNA binding energy can predict fold changes in promoter response16,51,52,53. We first hypothesized that when a TF has binding sites at multiple promoters, it tends to regulate its targets with the strongest binding strength. To test this hypothesis, we normalized the binding strength of each TF-promoter pair to the maximum binding strength for that TF (called relative binding strength per TF). On average, the relative binding strength per TF was slightly weaker for regulatory TF-promoter pairs than for non-regulatory TF-promoter pairs (Fig.5g, h). This unexpected result suggests that TFs do not necessarily regulate their most tightly associated promoters. We then considered the affinity of all TFs binding to the same promoter and normalized the binding strength of each TF-promoter pair to the maximal strength of the most tightly associated TF for each promoter (called relative binding strength per promoter) (Fig.5i). Results indicate that for each promoter, TFs with stronger binding are more likely to cause promoter activity change. Taking these findings together, the relative binding strengths of TFs on a promoter are a major determinant of promoter response.
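The two normalizations described above differ only in which maximum each binding strength is divided by. A minimal sketch, assuming binding strengths are positive numbers keyed by (TF, promoter) pairs; the function name and data layout are hypothetical and not taken from the source:

```python
from collections import defaultdict

def relative_strengths(binding):
    """Compute relative binding strength per TF and per promoter.

    binding: dict mapping (tf, promoter) -> positive binding strength.
    Returns two dicts with the same keys:
      per_tf[pair]       = strength / strongest site of that TF on any promoter
      per_promoter[pair] = strength / strongest site of any TF on that promoter
    """
    max_per_tf = defaultdict(float)
    max_per_promoter = defaultdict(float)
    for (tf, promoter), strength in binding.items():
        max_per_tf[tf] = max(max_per_tf[tf], strength)
        max_per_promoter[promoter] = max(max_per_promoter[promoter], strength)

    per_tf = {pair: s / max_per_tf[pair[0]] for pair, s in binding.items()}
    per_promoter = {pair: s / max_per_promoter[pair[1]] for pair, s in binding.items()}
    return per_tf, per_promoter
```

For example, a TF whose site on a promoter is its own weakest site can still have a per-promoter value of 1.0 if no other TF binds that promoter more strongly, which is the distinction the comparison in Fig.5g-i relies on.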

To explore genome-scale regulatory networks at conditions other than minimal glucose medium, we further performed PPTP-seq experiments for cells grown in LB and minimal glycerol media. A total of 5279 and 3810 TF-promoter responses were identified in LB and minimal glycerol media, respectively (Supplementary Fig.12). The larger number of responses seen in LB was partially caused by high TF activity of a few TFs that have specific effectors in rich media (Supplementary Table2). Comparing these datasets with that collected from minimal glucose medium, 867 TF-promoter pairs appeared in all three conditions, with 1901, 2274, and 3495 pairs appearing only in one condition, suggesting TF-promoter responses are highly condition-specific (Fig.6a). The upregulated TF-promoter pairs by TFKD (TF repression) have more overlaps among these three conditions than downregulated pairs (TF activation, Fig.6a), suggesting that TF activation is more sensitive to growth conditions (e.g., affected by allosteric regulation) than TF repression. We examined a few individual TFs with known targets (Supplementary Data7) that have distinct regulatory responses in different conditions (Fig.6b). For example, repression of lacZ promoter by CRP was not detected in minimal glucose medium due to low cAMP concentration54, but was observed in LB medium. Similarly, activation of the maltose transporter malK by MalT was observed in LB medium but not in the minimal glucose medium, because expression of malT requires CRP activation55. On the other hand, activation of metE by MetR was observed in minimal glucose and glycerol media but not in LB medium. This is likely caused by repression of metE by MetJ at high SAM concentration56. Our data show that many regulatory responses are condition-dependent (Fig.6b) and highlight that growth condition needs to be specified when describing the regulatory network.
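The cross-condition comparison amounts to set arithmetic on the significant TF-promoter pairs from each medium. The sketch below assumes each response is stored as a hashable tuple such as (TF, promoter, direction); the function name and the toy entries in the usage example are hypothetical.

```python
def condition_overlap(responses):
    """Split responses into those shared by all conditions and those unique to one.

    responses: dict mapping condition name -> set of response tuples.
    Returns (shared, unique) where unique maps each condition to the
    responses found in that condition and in no other.
    """
    names = list(responses)
    shared = set.intersection(*(responses[name] for name in names))
    unique = {}
    for name in names:
        others = set().union(*(responses[other] for other in names if other != name))
        unique[name] = responses[name] - others
    return shared, unique

# Toy input for illustration only; these are not measured results.
example = {
    "glucose": {("tfA", "p1", "up"), ("tfB", "p2", "up")},
    "glycerol": {("tfA", "p1", "up"), ("tfC", "p3", "down")},
    "LB": {("tfA", "p1", "up"), ("tfB", "p2", "up"), ("tfD", "p4", "up")},
}
shared, unique = condition_overlap(example)
```

Running the same split separately on "up" and "down" responses would give the direction-specific overlaps compared in Fig.6a.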

a Comparison of TF perturbation-response results from PPTP-seq at different growth conditions. b Known TF-promoter interactions from RegulonDB showed different regulation under different growth media. Source data are provided as a Source Data file.
