BACKGROUND: RNA interference (RNAi) is an effective and important tool used to study gene function. For large-scale screens, RNAi is used to systematically down-regulate genes of interest and analyze their roles in a biological process. However, RNAi is associated with off-target effects (OTEs), including microRNA (miRNA)-like OTEs. The contribution of reagent-specific OTEs to RNAi screen data sets can be significant. In addition, the post-screen validation process is time and labor intensive. Thus, the availability of robust approaches to identify candidate off-targeted transcripts would be beneficial. RESULTS: Significant efforts have been made to eliminate false positive results attributable to sequence-specific OTEs associated with RNAi. These approaches have included improved algorithms for RNAi reagent design, incorporation of chemical modifications into siRNAs, and the use of various bioinformatics strategies to identify possible OTEs in screen results. Genome-wide Enrichment of Seed Sequence matches (GESS) was developed to identify potential off-targeted transcripts in large-scale screen data by seed-region analysis. Here, we introduce a user-friendly web application that provides researchers a relatively quick and easy way to perform GESS analysis on data from human or mouse cell-based screens using short interfering RNAs (siRNAs) or short hairpin RNAs (shRNAs), as well as for Drosophila screens using shRNAs. Online GESS relies on up-to-date transcript sequence annotations for human and mouse genes extracted from NCBI Reference Sequence (RefSeq) and Drosophila genes from FlyBase. The tool also accommodates analysis with user-provided reference sequence files. CONCLUSION: Online GESS provides a straightforward user interface for genome-wide seed region analysis for human, mouse and Drosophila RNAi screen data. With the tool, users can either use a built-in database or provide a database of transcripts for analysis. 
This makes it possible to analyze RNAi data from any organism for which the user can provide transcript sequences.
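The seed-region idea behind GESS can be illustrated with a minimal Python sketch. This is not the GESS implementation and omits its enrichment statistics. It assumes that the seed is positions 2-8 of the guide strand and that all sequences are given as DNA. The function names and toy sequences are hypothetical.

```python
# Illustrative sketch only; not the GESS algorithm or its statistics.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def seed_site(guide, start=1, end=8):
    """Return the transcript-side site complementary to the guide seed.

    Assumption: the seed is guide positions 2-8 (0-based slice 1:8).
    """
    return reverse_complement(guide[start:end])

def count_seed_matches(guides, transcripts):
    """Count, per transcript, how many guides have at least one seed site."""
    counts = {name: 0 for name in transcripts}
    for guide in guides:
        site = seed_site(guide)
        for name, seq in transcripts.items():
            if site in seq.upper():
                counts[name] += 1
    return counts

# Toy input: one guide strand and two made-up transcript fragments.
toy_guides = ["TGAGGTAGTAGGTTGTATAGTT"]
toy_transcripts = {"t1": "AAACTACCTCAAA", "t2": "AAAAAAAAAA"}
toy_counts = count_seed_matches(toy_guides, toy_transcripts)
```

A transcript matched by the seeds of many active reagents, relative to inactive ones, would be a candidate off-targeted transcript. Deciding whether that excess is significant is the part this sketch leaves out.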
The evaluation of specific endogenous transcript levels is important for understanding transcriptional regulation. More specifically, it is useful for independent confirmation of results obtained by the use of microarray analysis or RNA-seq and for evaluating RNA interference (RNAi)-mediated gene knockdown. Designing specific and effective primers for high-quality, moderate-throughput evaluation of transcript levels, i.e., quantitative, real-time PCR (qPCR), is nontrivial. To meet community needs, predefined qPCR primer pairs for mammalian genes have been designed and sequences made available, e.g., via PrimerBank. In this work, we adapted and refined the algorithms used for the mammalian PrimerBank to design 45,417 primer pairs for 13,860 Drosophila melanogaster genes, with three or more primer pairs per gene. We experimentally validated primer pairs for ~300 randomly selected genes expressed in early Drosophila embryos, using SYBR Green-based qPCR and sequence analysis of products derived from conventional PCR. All relevant information, including primer sequences, isoform specificity, spatial transcript targeting, and any available validation results and/or user feedback, is available from an online database (www.flyrnai.org/flyprimerbank). At FlyPrimerBank, researchers can retrieve primer information for fly genes either one gene at a time or in batch mode. Importantly, we included the overlap of each predicted amplified sequence with RNAi reagents from several public resources, making it possible for researchers to choose primers suitable for knockdown evaluation of RNAi reagents (i.e., to avoid amplification of the RNAi reagent itself). We demonstrate the utility of this resource for validation of RNAi reagents in vivo.
The ability to engineer genomes in a specific, systematic, and cost-effective way is critical for functional genomic studies. Recent advances using the CRISPR-associated single-guide RNA system (Cas9/sgRNA) illustrate the potential of this simple system for genome engineering in a number of organisms. Here we report an effective and inexpensive method for genome DNA editing in Drosophila melanogaster whereby plasmid DNAs encoding short sgRNAs under the control of the U6b promoter are injected into transgenic flies in which Cas9 is specifically expressed in the germ line via the nanos promoter. We evaluate the off-targets associated with the method and establish a Web-based resource, along with a searchable, genome-wide database of predicted sgRNAs appropriate for genome engineering in flies. Finally, we discuss the advantages of our method in comparison with other recently published approaches.
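As a rough illustration of how candidate sgRNA sites might be enumerated for such a database, the following Python sketch scans a sequence for 20-nt protospacers followed by an NGG PAM. It is not the authors' pipeline: it assumes the NGG PAM used by Streptococcus pyogenes Cas9, it searches the forward strand only, and it does no off-target scoring. The function name is hypothetical.

```python
import re

def find_sgrna_candidates(seq, guide_len=20):
    """Return (start, protospacer, pam) for each guide_len-mer followed by NGG.

    Assumption: forward strand only; a real search would also scan the
    reverse complement and filter candidates by predicted off-targets.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Toy sequence with exactly one protospacer + PAM.
toy_sites = find_sgrna_candidates("A" * 20 + "TGG" + "CCC")
```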
Analysis of high-throughput data increasingly relies on pathway annotation and functional information derived from Gene Ontology. This approach has limitations, in particular for the analysis of network dynamics over time or under different experimental conditions, in which modules within a network rather than complete pathways might respond and change. We report an analysis framework based on protein complexes, which are at the core of network reorganization. We generated a protein complex resource for human, Drosophila, and yeast from the literature and databases of protein-protein interaction networks, with each species having thousands of complexes. We developed COMPLEAT (http://www.flyrnai.org/compleat), a tool for data mining and visualization for complex-based analysis of high-throughput data sets, as well as analysis and integration of heterogeneous proteomics and gene expression data sets. With COMPLEAT, we identified dynamically regulated protein complexes among genome-wide RNA interference data sets that used the abundance of phosphorylated extracellular signal-regulated kinase in cells stimulated with either insulin or epidermal growth factor as the output. The analysis predicted that the Brahma complex participated in the insulin response.
RNA interference (RNAi) is a widely adopted tool for loss-of-function studies but RNAi results only have biological relevance if the reagents are appropriately mapped to genes. Several groups have designed and generated RNAi reagent libraries for studies in cells or in vivo for Drosophila and other species. At first glance, matching RNAi reagents to genes appears to be a simple problem, as each reagent is typically designed to target a single gene. In practice, however, the reagent-gene relationship is complex. Although the sequences of oligonucleotides used to generate most types of RNAi reagents are static, the reference genome and gene annotations are regularly updated. Thus, at the time a researcher chooses an RNAi reagent or analyzes RNAi data, the most current interpretation of the RNAi reagent-gene relationship, as well as related information regarding specificity (e.g., predicted off-target effects), can be different from the original interpretation. Here, we describe a set of strategies and an accompanying online tool, UP-TORR (for Updated Targets of RNAi Reagents; www.flyrnai.org/up-torr), useful for accurate and up-to-date annotation of cell-based and in vivo RNAi reagents. Importantly, UP-TORR automatically synchronizes with gene annotations daily, retrieving the most current information available, and for Drosophila, also synchronizes with the major reagent collections. Thus, UP-TORR allows users to choose the most appropriate RNAi reagents at the onset of a study, as well as to perform the most appropriate analyses of results of RNAi-based studies.
The spontaneous and reversible formation of foci and filaments that contain proteins involved in different metabolic processes is common in both the nucleus and the cytoplasm. Stress granules (SGs) and processing bodies (PBs) belong to a novel family of cellular structures collectively known as mRNA silencing foci that harbour repressed mRNAs and their associated proteins. SGs and PBs are highly dynamic: they form upon stress and dissolve, thus releasing the repressed mRNAs, according to changes in cell physiology. In addition, aggregates containing abnormal proteins are frequent in neurodegenerative disorders. In spite of the growing relevance of these supramolecular aggregates to diverse cellular functions, a reliable automated tool for their systematic analysis is lacking. Here we report a MATLAB script termed BUHO for the high-throughput image analysis of cellular foci. We used BUHO to assess the number, size and distribution of distinct objects with minimal deviation from manually obtained parameters. BUHO successfully addressed the induction of both SGs and PBs in mammalian and insect cells exposed to different stress stimuli. We also used BUHO to assess the dynamics of specific mRNA-silencing foci termed Smaug 1 foci (S-foci) in primary neurons upon synaptic stimulation. Finally, we used BUHO to analyze the role of candidate genes in SG formation in an RNAi-based experiment. We found that FAK56D, GCN2 and PP1 govern SG formation. The role of PP1 is conserved in mammalian cells, as judged by the effect of the PP1 inhibitor salubrinal, and involves dephosphorylation of the translation factor eIF2α. All these experiments were analyzed manually and by BUHO, and the results differed by less than 5% of the average value. The automated analysis by this user-friendly method will allow high-throughput image processing in short times by providing a robust, flexible and reliable alternative to the laborious and sometimes unfeasible visual scrutiny.
FlyRNAi (http://www.flyrnai.org), the database and website of the Drosophila RNAi Screening Center (DRSC) at Harvard Medical School, serves a dual role, tracking both production of reagents for RNA interference (RNAi) screening in Drosophila cells and RNAi screen results. The database and website are used as a platform for community availability of protocols, tools, and other resources useful to researchers planning, conducting, analyzing or interpreting the results of Drosophila RNAi screens. Based on our own experience and user feedback, we have made several changes. Specifically, we have restructured the database to accommodate new types of reagents; added information about new RNAi libraries and other reagents; updated the user interface and website; and added new tools of use to the Drosophila community and others. Overall, the result is a more useful, flexible and comprehensive website and database.
BACKGROUND: High-throughput screening using RNAi is a powerful gene discovery method but is often complicated by false positive and false negative results. Whereas false positive results associated with RNAi reagents have been a matter of extensive study, the issue of false negatives has received less attention. RESULTS: We performed a meta-analysis of several genome-wide, cell-based Drosophila RNAi screens, together with a more focused RNAi screen, and conclude that the rate of false negative results is at least 8%. Further, we demonstrate how knowledge of the cell transcriptome can be used to resolve ambiguous results and how the number of false negative results can be reduced by using multiple, independently tested RNAi reagents per gene. CONCLUSIONS: RNAi reagents that target the same gene do not always yield consistent results due to false positives and weak or ineffective reagents. False positive results can be partially minimized by filtering with transcriptome data. RNAi libraries with multiple reagents per gene also reduce false positive and false negative outcomes when inconsistent results are disambiguated carefully.
BACKGROUND: Mapping of orthologous genes among species serves an important role in functional genomics by allowing researchers to develop hypotheses about gene function in one species based on what is known about the functions of orthologs in other species. Several tools for predicting orthologous gene relationships are available. However, these tools can give different results and identification of predicted orthologs is not always straightforward. RESULTS: We report a simple but effective tool, the Drosophila RNAi Screening Center Integrative Ortholog Prediction Tool (DIOPT; http://www.flyrnai.org/diopt), for rapid identification of orthologs. DIOPT integrates existing approaches, facilitating rapid identification of orthologs among human, mouse, zebrafish, C. elegans, Drosophila, and S. cerevisiae. As compared to individual tools, DIOPT shows increased sensitivity with only a modest decrease in specificity. Moreover, the flexibility built into the DIOPT graphical user interface allows researchers with different goals to appropriately 'cast a wide net' or limit results to highest confidence predictions. DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. This helps users identify the most appropriate matches among multiple possible orthologs. To facilitate using model organisms for functional analysis of human disease-associated genes, we used DIOPT to predict high-confidence orthologs of disease genes in Online Mendelian Inheritance in Man (OMIM) and genes in genome-wide association study (GWAS) data sets. The results are accessible through the DIOPT diseases and traits query tool (DIOPT-DIST; http://www.flyrnai.org/diopt-dist). CONCLUSIONS: DIOPT and DIOPT-DIST are useful resources for researchers working with model organisms, especially those who are interested in exploiting model organisms such as Drosophila to study the functions of human disease genes.
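The integrative idea, combining calls from several ortholog predictors and letting the user either 'cast a wide net' or keep only high-confidence pairs, can be sketched in Python as a simple vote count. The abstract does not describe DIOPT's actual scoring, so counting supporting tools is an assumption made here for illustration. The tool names, gene names, and function names are hypothetical.

```python
from collections import defaultdict

def integrate_predictions(predictions):
    """predictions: {tool_name: set of (gene, ortholog) pairs}.

    Returns {(gene, ortholog): number of tools supporting the pair}.
    Assumption: every tool gets one equal vote per pair.
    """
    support = defaultdict(int)
    for pairs in predictions.values():
        for pair in set(pairs):
            support[pair] += 1
    return dict(support)

def filter_by_support(support, min_tools=1):
    """Keep pairs supported by at least min_tools tools, best supported first."""
    kept = [(pair, n) for pair, n in support.items() if n >= min_tools]
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Hypothetical output of three prediction tools for one query gene.
toy_predictions = {
    "toolA": {("geneX", "orth1"), ("geneX", "orth2")},
    "toolB": {("geneX", "orth1")},
    "toolC": {("geneX", "orth1"), ("geneX", "orth3")},
}
toy_support = integrate_predictions(toy_predictions)
wide_net = filter_by_support(toy_support, min_tools=1)
high_confidence = filter_by_support(toy_support, min_tools=2)
```

Lowering `min_tools` trades specificity for sensitivity, which mirrors the trade-off reported for DIOPT relative to individual tools.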
Characterizing the extent and logic of signaling networks is essential to understanding specificity in such physiological and pathophysiological contexts as cell fate decisions and mechanisms of oncogenesis and resistance to chemotherapy. Cell-based RNA interference (RNAi) screens enable the inference of large numbers of genes that regulate signaling pathways, but these screens cannot provide network structure directly. We describe an integrated network around the canonical receptor tyrosine kinase (RTK)-Ras-extracellular signal-regulated kinase (ERK) signaling pathway, generated by combining parallel genome-wide RNAi screens with protein-protein interaction (PPI) mapping by tandem affinity purification-mass spectrometry. We found that only a small fraction of the total number of PPI or RNAi screen hits was isolated under all conditions tested and that most of these represented the known canonical pathway components, suggesting that much of the core canonical ERK pathway is known. Because most of the newly identified regulators are likely cell type- and RTK-specific, our analysis provides a resource for understanding how output through this clinically relevant pathway is regulated in different contexts. We report in vivo roles for several of the previously unknown regulators, including CG10289 and PpV, the Drosophila orthologs of two components of the serine/threonine-protein phosphatase 6 complex; the Drosophila ortholog of TepIV, a glycophosphatidylinositol-linked protein mutated in human cancers; CG6453, a noncatalytic subunit of glucosidase II; and Rtf1, a histone methyltransferase.
MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate protein-coding genes posttranscriptionally. In animals, most known miRNA targeting occurs within the 3'UTR of mRNAs, but the extent of biologically relevant targeting in the ORF or 5'UTR of mRNAs remains unknown. Here, we develop an algorithm (MinoTar-miRNA ORF Targets) to identify conserved regulatory motifs within protein-coding regions and use it to estimate the number of preferentially conserved miRNA-target sites in ORFs. We show that, in Drosophila, preferentially conserved miRNA targeting in ORFs is as widespread as it is in 3'UTRs and that, while far less abundant, conserved targets in Drosophila 5'UTRs number in the hundreds. Using our algorithm, we predicted a set of high-confidence ORF targets and selected seven miRNA-target pairs from among these for experimental validation. We observed down-regulation by the miRNA in five out of seven cases, indicating our approach can recover functional sites with high confidence. Additionally, we observed additive targeting by multiple sites within a single ORF. Altogether, our results demonstrate that the scale of biologically important miRNA targeting in ORFs is extensive and that computational tools such as ours can aid in the identification of such targets. Further evidence suggests that our results extend to mammals, but that the extent of ORF and 5'UTR targeting relative to 3'UTR targeting may be greater in Drosophila.
Damage initiates a pleiotropic cellular response aimed at cellular survival when appropriate. To identify genes required for damage survival, we used a cell-based RNAi screen against the Drosophila genome and the alkylating agent methyl methanesulphonate (MMS). Similar studies performed in other model organisms report that damage response may involve pleiotropic cellular processes other than the central DNA repair components, yet an intuitive systems level view of the cellular components required for damage survival, their interrelationship, and contextual importance has been lacking. Further, by comparing data from different model organisms, identification of conserved and presumably core survival components should be forthcoming. We identified 307 genes, representing 13 signaling, metabolic, or enzymatic pathways, affecting cellular survival of MMS-induced damage. As expected, the majority of these pathways are involved in DNA repair; however, several pathways with more diverse biological functions were also identified, including the TOR pathway, transcription, translation, proteasome, glutathione synthesis, ATP synthesis, and Notch signaling, and these were equally important in damage survival. Comparison with genomic screen data from Saccharomyces cerevisiae revealed no overlap enrichment of individual genes between the species, but a conservation of the pathways. To demonstrate the functional conservation of pathways, five were tested in Drosophila and mouse cells, with each pathway responding to alkylation damage in both species. Using the protein interactome, a significant level of connectivity was observed between Drosophila MMS survival proteins, suggesting a higher order relationship. This connectivity was dramatically improved by incorporating the components of the 13 identified pathways within the network. Grouping proteins into "pathway nodes" qualitatively improved the interactome organization, revealing a highly organized "MMS survival network." 
We conclude that identification of pathways can facilitate comparative biology analysis when direct gene/orthologue comparisons fail. A biologically intuitive, highly interconnected MMS survival network was revealed after we incorporated pathway data in our interactome analysis.
Genome-wide RNA interference (RNAi) screening allows investigation of the role of individual genes in a process of choice. Most RNAi screens identify a large number of genes with a continuous gradient in the assessed phenotype. Screeners must decide whether to examine genes with the most robust phenotype or the full gradient of genes that cause an effect and how to identify candidate genes. The authors have used RNAi in Drosophila cells to examine viability in a 384-well plate format and compare 2 screens, untreated control and treatment. They compare multiple normalization methods, which take advantage of different features within the data, including quantile normalization, background subtraction, scaling, cellHTS2 (Boutros et al. 2006), and interquartile range measurement. Considering the false-positive potential that arises from RNAi technology, a robust validation method was designed for the purpose of gene selection for future investigations. In a retrospective analysis, the authors describe the use of validation data to evaluate each normalization method. Although no method worked ideally, a combination of 2 methods, background subtraction followed by quantile normalization and cellHTS2, at different thresholds, captures the most dependable and diverse candidate genes. Thresholds are suggested depending on whether a few candidate genes are desired or a more extensive systems-level analysis is sought. The normalization approaches and experimental design to perform validation experiments are likely to apply to those high-throughput screening systems attempting to identify genes for systems-level analysis.
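One of the combinations named above, background subtraction followed by quantile normalization, can be sketched in plain Python. The abstract does not specify how background wells were defined or which thresholds were applied, so the choices below (median of designated background wells, ties broken by well order) are assumptions. The function names and toy values are hypothetical.

```python
from statistics import median

def subtract_background(plate, background_wells):
    """Subtract the median of designated background wells from every well."""
    bg = median(plate[i] for i in background_wells)
    return [v - bg for v in plate]

def quantile_normalize(plates):
    """Force equal-length plates to share one distribution.

    Each value is replaced by the mean of the values that hold the same
    rank across all plates. Ties within a plate are broken by well order.
    """
    n = len(plates[0])
    if any(len(p) != n for p in plates):
        raise ValueError("all plates must have the same number of wells")
    sorted_plates = [sorted(p) for p in plates]
    rank_means = [sum(sp[r] for sp in sorted_plates) / len(plates)
                  for r in range(n)]
    normalized = []
    for p in plates:
        order = sorted(range(n), key=lambda i: p[i])  # well indices by rank
        out = [0.0] * n
        for rank, well in enumerate(order):
            out[well] = rank_means[rank]
        normalized.append(out)
    return normalized

# Toy data: one 4-well "plate" for background subtraction, two 3-well plates
# for quantile normalization.
toy_subtracted = subtract_background([10, 12, 14, 20], background_wells=[0, 1, 2])
toy_normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```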
Genome-wide, cell-based screens using high-content screening (HCS) techniques and automated fluorescence microscopy generate thousands of high-content images that contain an enormous wealth of cell biological information. Such screens are key to the analysis of basic cell biological principles, such as control of cell cycle and cell morphology. However, these screens will ultimately only shed light on human disease mechanisms and potential cures if the analysis can keep up with the generation of data. A fundamental step toward automated analysis of high-content screening is to construct a robust platform for automatic cellular phenotype identification. The authors present a framework, consisting of microscopic image segmentation and analysis components, for automatic recognition of cellular phenotypes in the context of the Rho family of small GTPases. To implicate genes involved in Rac signaling, RNA interference (RNAi) was used to perturb gene functions, and the corresponding cellular phenotypes were analyzed for changes. The data used in the experiments are high-content, 3-channel, fluorescence microscopy images of Drosophila Kc167 cultured cells stained with markers that allow visualization of DNA, polymerized actin filaments, and the constitutively activated Rho protein Rac(V12). The performance of this approach was tested using a cellular database that contained more than 1000 samples of 3 predefined cellular phenotypes, and the generalization error was estimated using a cross-validation technique. Moreover, the authors applied this approach to analyze the whole high-content fluorescence images of Drosophila cells for further HCS-based gene function analysis.
Off-target effects have been demonstrated to be a major source of false-positives in RNA interference (RNAi) high-throughput screens. In this study, we re-assess the previously published transcriptional reporter-based whole-genome RNAi screens for the Wingless and Hedgehog signaling pathways using second generation double-stranded RNA libraries. Furthermore, we investigate other factors that may influence the outcome of such screens, including cell-type specificity, robustness of reporters, and assay normalization, which determine the efficacy of RNAi-knockdown of target genes.
Although classical genetic and biochemical approaches have identified hundreds of proteins that function in the dynamic remodeling of cell shape in response to upstream signals, there is currently little systems-level understanding of the organization and composition of signaling networks that regulate cell morphology. We have developed quantitative morphological profiling methods to systematically investigate the role of individual genes in the regulation of cell morphology in a fast, robust, and cost-efficient manner. We analyzed a compendium of quantitative morphological signatures and described the existence of local signaling networks that act to regulate cell protrusion, adhesion, and tension.
To evaluate the specificity of long dsRNAs used in high-throughput RNA interference (RNAi) screens performed at the Drosophila RNAi Screening Center (DRSC), we performed a global analysis of their activity in 30 genome-wide screens completed at our facility. Notably, our analysis predicts that dsRNAs containing ≥19-nucleotide perfect matches identified in silico to unintended targets may contribute to a significant false positive error rate arising from off-target effects. We confirmed experimentally that such sequences in dsRNAs lead to false positives and to efficient knockdown of a cross-hybridizing transcript, raising a cautionary note about interpreting results based on the use of a single dsRNA per gene. Although a full appreciation of all causes of false positive errors remains to be determined, we suggest simple guidelines to help ensure high-quality information from RNAi high-throughput screens.
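An in silico check for 19-nucleotide perfect matches between a dsRNA and unintended transcripts can be sketched in Python as a shared k-mer search. This is an illustration of the criterion, not the DRSC analysis itself. It assumes DNA-alphabet sequences and checks both strands because the reagent is double-stranded. The function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def kmers(seq, k=19):
    """Set of all k-length substrings (empty if the sequence is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def predicted_off_targets(dsrna, transcripts, intended, k=19):
    """Names of unintended transcripts sharing at least one perfect k-mer."""
    probe = kmers(dsrna, k)
    probe |= {reverse_complement(x) for x in probe}  # either strand of the dsRNA
    hits = []
    for name, seq in transcripts.items():
        if name in intended:
            continue
        if probe & kmers(seq, k):
            hits.append(name)
    return sorted(hits)

# Toy example: a made-up 19-mer shared between the dsRNA and one other transcript.
shared = "ACGTTGCAAGCTTAGGCTA"
toy_dsrna = "GGGG" + shared + "CCCC"
toy_transcripts = {
    "target": toy_dsrna,
    "offtarget": "TTTT" + shared + "AAAA",
    "clean": "A" * 40,
}
toy_hits = predicted_off_targets(toy_dsrna, toy_transcripts, intended={"target"})
```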
RNA interference (RNAi) has become a powerful tool for genetic screening in Drosophila. At the Drosophila RNAi Screening Center (DRSC), we are using a library of over 21,000 double-stranded RNAs targeting known and predicted genes in Drosophila. This library is available for the use of visiting scientists wishing to perform full-genome RNAi screens. The data generated from these screens are collected in the DRSC database (http://flyRNAi.org/cgi-bin/RNAi_screens.pl) in a flexible format for the convenience of the scientist and for archiving data. The long-term goal of this database is to provide annotations for as many of the uncharacterized genes in Drosophila as possible. Data from published screens are available to the public through a highly configurable interface that allows detailed examination of the data and provides access to a number of other databases and bioinformatics tools.
This chapter describes the method used to conduct high-throughput screening (HTS) by RNA interference in Drosophila tissue culture cells. It covers four main topics: (1) a brief description of the existing platforms to conduct RNAi screens in cell-based assays; (2) a table of the Drosophila cell lines available for these screens and a brief mention of the need to establish other cell lines as well as cultures of primary cells; (3) a discussion of the considerations and protocols involved in establishing assays suitable for HTS in a 384-well format; and (4) a summary of the various ways of handling raw data from an ongoing screen, with special emphasis on how to apply normalization for experimental variation and statistical filters to sort out noise from signals.