RNA-binding proteins (RBPs) mediate myriad layers of post-transcriptional gene regulation, including alternative pre-mRNA splicing (AS)1. Despite the widespread importance of RBPs for cellular function, most of the more than 2,000 human proteins predicted or shown to bind RNA do not have an assigned molecular function1,2. AS is a prevalent and critical RNA processing step, as up to 95% of human multi-exon genes exhibit multiple splice isoforms3. Aberrant splicing is also widespread in disease, especially cancer4,5, driving proteomic imbalance and disruption of cellular homeostasis6,7. Among the RBPs lacking functional annotation of their RNA-binding activity are RBPs involved in AS. Systematic approaches to assign AS activity to RBPs are, thus, needed to bridge this knowledge gap.
Previous assays have employed luciferase and fluorescence-based reporter systems to identify and characterize RBPs that regulate AS. However, these assays have relied on global overexpression8 or knockdown9,10 of RBPs. Global perturbations of protein level cannot separate effects caused by direct binding of RBPs from their indirect action through splicing regulatory networks. Furthermore, none of these previous studies has investigated how binding position relative to an alternatively spliced exon can modulate the effect of the RBP, even though many splicing factors can exert different effects depending on the distance and orientation (upstream or downstream of the alternative exon) of their binding position11,12,13,14. Reporter-based assays that recruit candidate proteins to a specific position, previously applied in studies of transcriptional effectors15 and modulators of RNA stability/translation16, are a promising avenue to address these limitations17.
Complementary to the important need to understand the mechanisms driving AS is the potential utility of tools for targeted modulation of splicing events. Engineered RBPs have been generated through fusion of exon activation domains to RNA-targeting PUF domains18 and RNA-targeting CRISPR systems19,20. Such technologies are in their nascent stage, reliant on exon activation domains selected from historically well-known splicing factors. A molecular toolkit of potent and compact activation domains to be implemented in maturation of these technologies remains to be established.
In this study, we developed tethered function luciferase-based splicing reporter assays to investigate and quantify the capacity of any protein sequence to directly promote exon inclusion. We used this system to systematically assess proximity-dependent modulation of exon inclusion for 718 human RBPs at two separate tethering positions and to identify potent and compact exon inclusion activation domains. Altogether, our assays serve as both a biological discovery engine that reveals factors involved in splicing and a prototyping platform that can yield molecular parts for protein engineering applications.
We investigated the biology underlying the candidates detected from our screens. To verify that our assays robustly captured known regulators of AS, we performed Gene Ontology (GO) analysis on the full list of final hits. When compared with a background of the complete tethering library, GO analysis showed strong enrichment of RNA splicing-associated terms (Fig. 2a). As AS occurs in the nucleus, we investigated the subcellular localization of the candidates. We referenced the COMPARTMENTS subcellular localization database, which integrates evidence from text mining, high-throughput screens, literature and prediction methods, and extracted the nuclear localization confidence score for each candidate25. All candidates, save two, have a nuclear confidence score of 4/5 or greater (Supplementary Table 10). The two candidates that scored lower than 4/5 were STAU1 and EIF4B. STAU1, which scored 2.68/5, has previously been linked to splicing regulation26,27. EIF4B, which scored 3.82/5, initiates translation in the cytoplasm by binding RNA substrates and recruiting ribosomes. We hypothesize that this mechanism could drive a false positive when artificially driven to nuclear pre-mRNA in our tethering system, as the mechanism of spliceosome recruitment is similar. Nevertheless, a potentially nuclear role of EIF4B in splicing regulation merits future investigation. Altogether, the candidates determined by our screen are enriched for known regulators of mRNA splicing and are largely localized to the nucleus.
Initially, we also investigated a complementary approach to identify RBPs that induce exon skipping. We constructed a reporter using the same framework around MAP3K7 exon 12, which is primarily included in HEK293T cells (Extended Data Fig. 3a). We validated the response of the MAP3K7 reporter to HNRNPK and PCBP1, known activators of exon skipping, using the reporter readout and RNA-level validation when tethered 100 base pairs upstream of the AS exon (Extended Data Fig. 3b). Twenty-two of 44 RBPs induced exon skipping when tethered 30 base pairs downstream of the AS exon, and 154 of 194 induced exon skipping when tethered 100 base pairs upstream of the AS exon (Extended Data Fig. 3c,d and Supplementary Tables 15 and 16). The high proportion of hits suggests that recruitment of many proteins may simply act to sterically prevent spliceosome recognition; thus, we stopped the skipping screen here and constrained this study to focus on exon inclusion, a more specific molecular task.
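To make the "high proportion of hits" concrete, the hit rates implied by the counts above can be worked out directly. The following is a minimal sketch, not part of the original analysis; the dictionary layout and the function name `hit_rate` are illustrative assumptions, and only the four counts come from the text.

```python
# Hit counts reported in the text for the MAP3K7 exon-skipping screens:
# (number of RBPs that induced skipping, number of RBPs tested).
screens = {
    "30 bp downstream": (22, 44),
    "100 bp upstream": (154, 194),
}


def hit_rate(hits: int, tested: int) -> float:
    """Fraction of tethered RBPs that scored as exon-skipping hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested


rates = {name: hit_rate(h, n) for name, (h, n) in screens.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
```

Half of the RBPs tested downstream and roughly four in five of those tested upstream scored as hits, which is consistent with the steric-interference interpretation given above.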
To investigate whether these RBPs modulate AS of endogenous RNA, we performed shRNA-mediated knockdown followed by RNA sequencing (RNA-seq) analysis in HEK293T cells with shRNAs specific to these proteins. Knockdowns of all targets were successful, with knockdown of at least 50% as measured by transcripts per million (TPM) (Extended Data Fig. 4c). We examined the differential AS events after knockdown and detected differentially spliced events for all knockdowns (Fig. 3c). To simplify characterization, we performed further analysis on differentially spliced events of the skipped exon (SE) category. At least 30 differential SE events were driven by the knockdown of each of these candidates. For RTCA and TRNAU1AP, more than 500 differentially spliced events were detected. We determined the direction of splicing change for each differentially spliced SE event (Fig. 3d). As the initial screens were designed to detect RBPs with the potential to induce exon inclusion, we expected to observe splicing events with increased skipping upon knockdown. We observe this trend for TRNAU1AP, indicating that TRNAU1AP is endogenously driving exon inclusion, matching our prediction from the screens. The other candidates did not display the same trend. Nevertheless, they cannot be eliminated as direct drivers of exon inclusion at this stage, because final AS outcome also captures participation of the unexpected hits in upstream pathways and competitive effects with other splicing factors50. The data here indicate that the candidates each play roles in AS regulation of some events, with TRNAU1AP and RTCA modulating many SE events.
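The two quantities used in this paragraph, knockdown efficiency from TPM and the direction of splicing change for an SE event, can be sketched as follows. This is an illustrative sketch only: the function names, the sign convention (ψ after knockdown minus ψ in control) and all numeric values are assumptions for demonstration and are not taken from the study's pipeline or data.

```python
def knockdown_efficiency(tpm_control: float, tpm_knockdown: float) -> float:
    """Fractional reduction in expression relative to the control sample."""
    if tpm_control <= 0:
        raise ValueError("control TPM must be positive")
    return 1.0 - tpm_knockdown / tpm_control


def classify_se_event(delta_psi: float) -> str:
    """Classify a skipped-exon event by the sign of its splicing change.

    Assumed convention: delta_psi = psi(knockdown) - psi(control).
    A negative value means the exon is skipped more after knockdown,
    which is the pattern expected if the RBP normally promotes inclusion.
    """
    if delta_psi < 0:
        return "increased skipping"
    if delta_psi > 0:
        return "increased inclusion"
    return "no change"


# Hypothetical TPM values, chosen only to illustrate the 50% threshold.
efficiency = knockdown_efficiency(tpm_control=40.0, tpm_knockdown=10.0)
passes_threshold = efficiency >= 0.5

# Hypothetical delta-psi values for three SE events.
delta_psis = [-0.25, -0.10, 0.15]
labels = [classify_se_event(d) for d in delta_psis]
n_increased_skipping = labels.count("increased skipping")

print(f"knockdown efficiency: {efficiency:.0%}, passes: {passes_threshold}")
print(f"{n_increased_skipping} of {len(labels)} events show increased skipping")
```

Under this convention, a candidate whose differential SE events are mostly labelled "increased skipping" behaves like an endogenous activator of exon inclusion, as described for TRNAU1AP.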
To nominate AS exons that could be regulated by direct binding, we integrated findings from eCLIP and RNA-seq. We found that genes containing knockdown-sensitive exons are bound at a significantly higher rate than genes lacking knockdown-sensitive exons by SCAF8, RTCA and TRNAU1AP but not by STAU2 (Fig. 3e,f). Although the count of genes containing knockdown-sensitive SE events is low for STAU2 in comparison to the count of genes bound, the events in which there is overlap could be directly driven by binding; however, this appears to be a specific rather than widespread phenomenon, at least in HEK293T cells. RTCA binds to most genes containing knockdown-sensitive SE events, indicating that the binding of RTCA directly drives many splicing changes. TRNAU1AP and SCAF8 both bind a substantial portion of genes with knockdown-sensitive SE events. Splicing modulation of these events may be directly driven by this binding. Some of the non-bound differential splicing events could be driven by the roles of these RBPs in pathways upstream of splicing outcome or could be bound at levels below the detection sensitivity of eCLIP. Altogether, RTCA, SCAF8 and TRNAU1AP appear to directly regulate many SE events through binding, whereas STAU2 appears to do this in a more limited capacity.
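The comparison of binding rates between genes with and without knockdown-sensitive exons can be framed as a 2×2 contingency test. The text does not state which statistical test was used, so the sketch below assumes a one-sided Fisher's exact test purely for illustration; the function name, the table layout and all counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(sens_bound: int, sens_unbound: int,
                        insens_bound: int, insens_unbound: int) -> float:
    """One-sided Fisher's exact test p-value for enrichment of binding.

    Returns the probability of observing at least `sens_bound` bound genes
    among the knockdown-sensitive genes, given fixed table margins.
    """
    n_total = sens_bound + sens_unbound + insens_bound + insens_unbound
    n_bound = sens_bound + insens_bound
    n_sens = sens_bound + sens_unbound

    k_max = min(n_bound, n_sens)
    # Sum hypergeometric numerators as exact integers, then divide once.
    numerator = sum(
        comb(n_bound, k) * comb(n_total - n_bound, n_sens - k)
        for k in range(sens_bound, k_max + 1)
    )
    return numerator / comb(n_total, n_sens)


# Hypothetical counts of genes, for illustration only.
p_value = fisher_enrichment_p(
    sens_bound=30,      # knockdown-sensitive SE event and eCLIP-bound
    sens_unbound=10,    # knockdown-sensitive SE event, not bound
    insens_bound=200,   # no sensitive SE event, bound
    insens_unbound=760, # no sensitive SE event, not bound
)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

A small p-value here would correspond to the pattern described for SCAF8, RTCA and TRNAU1AP, whereas a table in which sensitive and insensitive genes are bound at similar rates would correspond to the STAU2 result.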
Motivated by our results showing that TRNAU1AP or its domain can be useful in artificial splicing factors, we returned to the original list of top RBPs that altered splicing of our reporter construct and tested various truncations of these proteins, with the aim of determining minimal splice-activating domains to repurpose for artificial splicing factors. LUC7L2 and SRSF8 were selected as strong hits that activated splicing both upstream and downstream of the alternative exon (Fig. 6a). SNRPB and FUBP1 were selected as strong hits that activated lucMAPT-30D only (Fig. 6b). U2AF2 and SRSF10 were selected as strong hits that primarily activated exon inclusion when tethered upstream (Fig. 6c). We designed and cloned truncations based on domain structure, assuming modularity of RBPs where effector and binding domains are separate and independent.
We constructed CRISPR-based artificial splicing factors by fusing the truncations that most successfully activated the tethering reporter to catalytically dead Cas13d. These were tested with an MS2-free luciferase splicing reporter and compared with the recently reported RBFOX1N-dCasRx-C artificial splicing factor19 (Fig. 6g). As expected, RBFOX1N-dCasRx-C activated the reporter only when targeting sites downstream of the alternatively spliced exon, with a maximal ψ of 11.87% with g1. The SRSF8-2-based artificial splicing factor activated the reporter at all positions, with a maximal ψ of 31.34% with g2. The SNRPB-1-based artificial splicing factor activated the reporter only when targeting downstream of the alternatively spliced exon, as for RBFOX1N-dCasRx-C, but with a greater maximal ψ of 19.15% with g1. Contrary to expectation, activation by the U2AF2-2-based artificial splicing factor was not restricted to upstream gRNAs, although activation was maximized with upstream guide g5 at 18.60%. Altogether, the SNRPB-1 artificial splicing factor directly outperformed RBFOX1N-dCasRx-C; the SRSF8-2 artificial splicing factor provided a stronger tool with reduced position dependence; and the U2AF2 artificial splicing factor introduced a tool with upstream position association.