Data mining

Stephanie E Mohr and Norbert Perrimon. 2012. “RNAi screening: new approaches, understandings, and organisms.” Wiley Interdiscip Rev RNA, 3, 2, Pp. 145-58.

RNA interference (RNAi) leads to sequence-specific knockdown of gene function. The approach can be used in large-scale screens to interrogate function in various model organisms and an increasing number of other species. Genome-scale RNAi screens are routinely performed in cultured or primary cells or in vivo in organisms such as C. elegans. High-throughput RNAi screening is benefitting from the development of sophisticated new instrumentation and software tools for collecting and analyzing data, including high-content image data. The results of large-scale RNAi screens have already proved useful, leading to new understandings of gene function relevant to topics such as infection, cancer, obesity, and aging. Nevertheless, important caveats apply and should be taken into consideration when developing or interpreting RNAi screens. Some level of false discovery is inherent to high-throughput approaches and, specific to RNAi screens, false discovery due to off-target effects (OTEs) of RNAi reagents remains a problem. The need to improve our ability to use RNAi to elucidate gene function at large scale and in additional systems continues to be addressed through improved RNAi library design, development of innovative computational and analysis tools, and other approaches.

2012_Wiley Interdis Rev_Mohr.pdf
Matthew Booker, Anastasia A Samsonova, Young Kwon, Ian Flockhart, Stephanie E Mohr, and Norbert Perrimon. 2011. “False negative rates in Drosophila cell-based RNAi screens: a case study.” BMC Genomics, 12, Pp. 50.

BACKGROUND: High-throughput screening using RNAi is a powerful gene discovery method but is often complicated by false positive and false negative results. Whereas false positive results associated with RNAi reagents have been a matter of extensive study, the issue of false negatives has received less attention. RESULTS: We performed a meta-analysis of several genome-wide, cell-based Drosophila RNAi screens, together with a more focused RNAi screen, and conclude that the rate of false negative results is at least 8%. Further, we demonstrate how knowledge of the cell transcriptome can be used to resolve ambiguous results and how the number of false negative results can be reduced by using multiple, independently-tested RNAi reagents per gene. CONCLUSIONS: RNAi reagents that target the same gene do not always yield consistent results due to false positives and weak or ineffective reagents. False positive results can be partially minimized by filtering with transcriptome data. RNAi libraries with multiple reagents per gene also reduce false positive and false negative outcomes when inconsistent results are disambiguated carefully.

2011_BMCGenomics_Booker.pdf Supplement 1.xls Supplement 2.xls
Adam A Friedman, George Tucker, Rohit Singh, Dong Yan, Arunachalam Vinayagam, Yanhui Hu, Richard Binari, Pengyu Hong, Xiaoyun Sun, Maura Porto, Svetlana Pacifico, Thilakam Murali, Russell L Finley, John M Asara, Bonnie Berger, and Norbert Perrimon. 2011. “Proteomic and functional genomic landscape of receptor tyrosine kinase and ras to extracellular signal-regulated kinase signaling.” Sci Signal, 4, 196, Pp. rs10.

Characterizing the extent and logic of signaling networks is essential to understanding specificity in such physiological and pathophysiological contexts as cell fate decisions and mechanisms of oncogenesis and resistance to chemotherapy. Cell-based RNA interference (RNAi) screens enable the inference of large numbers of genes that regulate signaling pathways, but these screens cannot provide network structure directly. We describe an integrated network around the canonical receptor tyrosine kinase (RTK)-Ras-extracellular signal-regulated kinase (ERK) signaling pathway, generated by combining parallel genome-wide RNAi screens with protein-protein interaction (PPI) mapping by tandem affinity purification-mass spectrometry. We found that only a small fraction of the total number of PPI or RNAi screen hits was isolated under all conditions tested and that most of these represented the known canonical pathway components, suggesting that much of the core canonical ERK pathway is known. Because most of the newly identified regulators are likely cell type- and RTK-specific, our analysis provides a resource for understanding how output through this clinically relevant pathway is regulated in different contexts. We report in vivo roles for several of the previously unknown regulators, including CG10289 and PpV, the Drosophila orthologs of two components of the serine/threonine-protein phosphatase 6 complex; the Drosophila ortholog of TepIV, a glycophosphatidylinositol-linked protein mutated in human cancers; CG6453, a noncatalytic subunit of glucosidase II; and Rtf1, a histone methyltransferase.

2011_Sci Sig_Friedman.pdf Supplemental
Ralph A Neumüller and Norbert Perrimon. 2011. “Where gene discovery turns into systems biology: genome-scale RNAi screens in Drosophila.” Wiley Interdiscip Rev Syst Biol Med, 3, 4, Pp. 471-8.

Systems biology aims to describe the complex interplays between cellular building blocks which, in their concurrence, give rise to the emergent properties observed in cellular behaviors and responses. This approach tries to determine the molecular players and the architectural principles of their interactions within the genetic networks that control certain biological processes. Large-scale loss-of-function screens, applicable in various model systems, have begun to systematically interrogate entire genomes to identify the genes that contribute to a certain cellular response. In particular, RNA interference (RNAi)-based high-throughput screens have been instrumental in determining the composition of regulatory systems and, paired with integrative data analyses, have begun to delineate the genetic networks that control cell biological and developmental processes. Through the creation of tools for both in vitro and in vivo genome-wide RNAi screens, Drosophila melanogaster has emerged as one of the key model organisms in systems biology research and over the last years has massively contributed to and hence shaped this discipline. WIREs Syst Biol Med 2011 3 471-478 DOI: 10.1002/wsbm.127

Amy M Wiles, Mark Doderer, Jianhua Ruan, Ting-Ting Gu, Dashnamoorthy Ravi, Barron Blackman, and Alexander JR Bishop. 2010. “Building and analyzing protein interactome networks by cross-species comparisons.” BMC Syst Biol, 4, Pp. 36.

BACKGROUND: A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps are needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate this data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. RESULTS: The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of networks produced. CONCLUSIONS: Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website provides research biologists intuitive access to this data.

2010_BMC Sys Bio_Wiles.pdf Supplemental
Franz Wendler, Alison K Gillingham, Rita Sinka, Cláudia Rosa-Ferreira, David E Gordon, Xavier Franch-Marro, Andrew A Peden, Jean-Paul Vincent, and Sean Munro. 2010. “A genome-wide RNA interference screen identifies two novel components of the metazoan secretory pathway.” EMBO J, 29, 2, Pp. 304-14.

Genetic screens in the yeast Saccharomyces cerevisiae have identified many proteins involved in the secretory pathway, most of which have orthologues in higher eukaryotes. To investigate whether there are additional proteins that are required for secretion in metazoans but are absent from yeast, we used genome-wide RNA interference (RNAi) to look for genes required for secretion of recombinant luciferase from Drosophila S2 cells. This identified two novel components of the secretory pathway that are conserved from humans to plants. Gryzun is distantly related to, but distinct from, the Trs130 subunit of the TRAPP complex but is absent from S. cerevisiae. RNAi of human Gryzun (C4orf41) blocks Golgi exit. Kish is a small membrane protein with a previously uncharacterised orthologue in yeast. The screen also identified Drosophila orthologues of almost 60% of the yeast genes essential for secretion. Given this coverage, the small number of novel components suggests that contrary to previous indications the number of essential core components of the secretory pathway is not much greater in metazoans than in yeasts.

2010_EMBO_Wendler.pdf Supplemental
Stephanie Mohr, Chris Bakal, and Norbert Perrimon. 2010. “Genomic screening with RNAi: results and challenges.” Annu Rev Biochem, 79, Pp. 37-64.

RNA interference (RNAi) is an effective tool for genome-scale, high-throughput analysis of gene function. In the past five years, a number of genome-scale RNAi high-throughput screens (HTSs) have been done in both Drosophila and mammalian cultured cells to study diverse biological processes, including signal transduction, cancer biology, and host cell responses to infection. Results from these screens have led to the identification of new components of these processes and, importantly, have also provided insights into the complexity of biological systems, forcing new and innovative approaches to understanding functional networks in cells. Here, we review the main findings that have emerged from RNAi HTS and discuss technical issues that remain to be improved, in particular the verification of RNAi results and validation of their biological relevance. Furthermore, we discuss the importance of multiplexed and integrated experimental data analysis pipelines to RNAi HTS.

2010_Annu Rev Biochem_Mohr.pdf
Philippos Mourikis, Robert J Lake, Christopher B Firnhaber, and Brian S DeDecker. 2010. “Modifiers of notch transcriptional activity identified by genome-wide RNAi.” BMC Dev Biol, 10, Pp. 107.

BACKGROUND: The Notch signaling pathway regulates a diverse array of developmental processes, and aberrant Notch signaling can lead to diseases, including cancer. To obtain a more comprehensive understanding of the genetic network that integrates into Notch signaling, we performed a genome-wide RNAi screen in Drosophila cell culture to identify genes that modify Notch-dependent transcription. RESULTS: Employing complementary data analyses, we found 399 putative modifiers: 189 promoting and 210 antagonizing Notch activated transcription. These modifiers included several known Notch interactors, validating the robustness of the assay. Many novel modifiers were also identified, covering a range of cellular localizations from the extracellular matrix to the nucleus, as well as a large number of proteins with unknown function. Chromatin-modifying proteins represent a major class of genes identified, including histone deacetylase and demethylase complex components and other chromatin modifying, remodeling and replacement factors. A protein-protein interaction map of the Notch-dependent transcription modifiers revealed that a large number of the identified proteins interact physically with these core chromatin components. CONCLUSIONS: The genome-wide RNAi screen identified many genes that can modulate Notch transcriptional output. A protein interaction map of the identified genes highlighted a network of chromatin-modifying enzymes and remodelers that regulate Notch transcription. Our results open new avenues to explore the mechanisms of Notch signal regulation and the integration of this pathway into diverse cellular processes.

2010_BMC Dev Bio_Mourikis.pdf Supplemental
Dashnamoorthy Ravi, Amy M Wiles, Selvaraj Bhavani, Jianhua Ruan, Philip Leder, and Alexander JR Bishop. 2009. “A network of conserved damage survival pathways revealed by a genomic RNAi screen.” PLoS Genet, 5, 6, Pp. e1000527.

Damage initiates a pleiotropic cellular response aimed at cellular survival when appropriate. To identify genes required for damage survival, we used a cell-based RNAi screen against the Drosophila genome and the alkylating agent methyl methanesulphonate (MMS). Similar studies performed in other model organisms report that damage response may involve pleiotropic cellular processes other than the central DNA repair components, yet an intuitive systems level view of the cellular components required for damage survival, their interrelationship, and contextual importance has been lacking. Further, by comparing data from different model organisms, identification of conserved and presumably core survival components should be forthcoming. We identified 307 genes, representing 13 signaling, metabolic, or enzymatic pathways, affecting cellular survival of MMS-induced damage. As expected, the majority of these pathways are involved in DNA repair; however, several pathways with more diverse biological functions were also identified, including the TOR pathway, transcription, translation, proteasome, glutathione synthesis, ATP synthesis, and Notch signaling, and these were equally important in damage survival. Comparison with genomic screen data from Saccharomyces cerevisiae revealed no overlap enrichment of individual genes between the species, but a conservation of the pathways. To demonstrate the functional conservation of pathways, five were tested in Drosophila and mouse cells, with each pathway responding to alkylation damage in both species. Using the protein interactome, a significant level of connectivity was observed between Drosophila MMS survival proteins, suggesting a higher order relationship. This connectivity was dramatically improved by incorporating the components of the 13 identified pathways within the network. Grouping proteins into "pathway nodes" qualitatively improved the interactome organization, revealing a highly organized "MMS survival network." 
We conclude that identification of pathways can facilitate comparative biology analysis when direct gene/orthologue comparisons fail. A biologically intuitive, highly interconnected MMS survival network was revealed after we incorporated pathway data in our interactome analysis.

2009_PLOS Gen_Dashnamoorthy.pdf Supplemental
Amy M Wiles, Dashnamoorthy Ravi, Selvaraj Bhavani, and Alexander JR Bishop. 2008. “An analysis of normalization methods for Drosophila RNAi genomic screens and development of a robust validation scheme.” J Biomol Screen, 13, 8, Pp. 777-84.

Genome-wide RNA interference (RNAi) screening allows investigation of the role of individual genes in a process of choice. Most RNAi screens identify a large number of genes with a continuous gradient in the assessed phenotype. Screeners must decide whether to examine genes with the most robust phenotype or the full gradient of genes that cause an effect and how to identify candidate genes. The authors have used RNAi in Drosophila cells to examine viability in a 384-well plate format and compare 2 screens, untreated control and treatment. They compare multiple normalization methods, which take advantage of different features within the data, including quantile normalization, background subtraction, scaling, cellHTS2 (Boutros et al. 2006), and interquartile range measurement. Considering the false-positive potential that arises from RNAi technology, a robust validation method was designed for the purpose of gene selection for future investigations. In a retrospective analysis, the authors describe the use of validation data to evaluate each normalization method. Although no method worked ideally, a combination of 2 methods, background subtraction followed by quantile normalization and cellHTS2, at different thresholds, captures the most dependable and diverse candidate genes. Thresholds are suggested depending on whether a few candidate genes are desired or a more extensive systems-level analysis is sought. The normalization approaches and experimental design to perform validation experiments are likely to apply to those high-throughput screening systems attempting to identify genes for systems-level analysis.

2008_J Biomol Screen_Wiles.pdf
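
The abstract above names background subtraction followed by quantile normalization as one of the combinations compared. The following is a minimal sketch of those two steps only, assuming equal-sized plates and using the plate median as the background estimate; the function names `subtract_background` and `quantile_normalize` are hypothetical, and this does not reproduce the authors' exact procedure or cellHTS2.

```python
from statistics import median


def subtract_background(plate):
    """Subtract the plate median from every well (assumed background estimate)."""
    background = median(plate)
    return [value - background for value in plate]


def quantile_normalize(plates):
    """Force all plates onto a shared distribution.

    Each plate is a list of well readings; all plates must be the same size.
    The value at each rank is replaced by the mean of the values holding
    that rank across plates. Ties are broken by well order (a simplification).
    """
    n_wells = len(plates[0])
    if any(len(plate) != n_wells for plate in plates):
        raise ValueError("all plates must contain the same number of wells")

    sorted_plates = [sorted(plate) for plate in plates]
    rank_means = [
        sum(sp[rank] for sp in sorted_plates) / len(plates)
        for rank in range(n_wells)
    ]

    normalized_plates = []
    for plate in plates:
        # Well indices ordered from lowest to highest reading.
        order = sorted(range(n_wells), key=lambda i: plate[i])
        normalized = [0.0] * n_wells
        for rank, well in enumerate(order):
            normalized[well] = rank_means[rank]
        normalized_plates.append(normalized)
    return normalized_plates


if __name__ == "__main__":
    # Toy readings for an untreated and a treated plate (made-up numbers).
    untreated = [5.0, 2.0, 3.0]
    treated = [4.0, 1.0, 6.0]
    corrected = [subtract_background(p) for p in (untreated, treated)]
    print(quantile_normalize(corrected))
```

After normalization every plate contains the same set of values, so a single threshold can be applied across plates; the threshold itself would still need to be chosen from validation data, as the abstract describes.
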
Chris Bakal, Rune Linding, Flora Llense, Elleard Heffern, Enrique Martin-Blanco, Tony Pawson, and Norbert Perrimon. 2008. “Phosphorylation networks regulating JNK activity in diverse genetic backgrounds.” Science, 322, 5900, Pp. 453-6.

Cellular signaling networks have evolved to enable swift and accurate responses, even in the face of genetic or environmental perturbation. Thus, genetic screens may not identify all the genes that regulate different biological processes. Moreover, although classical screening approaches have succeeded in providing parts lists of the essential components of signaling networks, they typically do not provide much insight into the hierarchical and functional relations that exist among these components. We describe a high-throughput screen in which we used RNA interference to systematically inhibit two genes simultaneously in 17,724 combinations to identify regulators of Drosophila JUN NH(2)-terminal kinase (JNK). Using both genetic and phosphoproteomics data, we then implemented an integrative network algorithm to construct a JNK phosphorylation network, which provides structural and mechanistic insights into the systems architecture of JNK signaling.

2008_Science_Bakal.pdf Supplement.pdf
Ramanuj DasGupta, Kent Nybakken, Matthew Booker, Bernard Mathey-Prevot, Foster Gonsalves, Binita Changkakoty, and Norbert Perrimon. 2007. “A case study of the reproducibility of transcriptional reporter cell-based RNAi screens in Drosophila.” Genome Biol, 8, 9, Pp. R203.

Off-target effects have been demonstrated to be a major source of false-positives in RNA interference (RNAi) high-throughput screens. In this study, we re-assess the previously published transcriptional reporter-based whole-genome RNAi screens for the Wingless and Hedgehog signaling pathways using second generation double-stranded RNA libraries. Furthermore, we investigate other factors that may influence the outcome of such screens, including cell-type specificity, robustness of reporters, and assay normalization, which determine the efficacy of RNAi-knockdown of target genes.

Meghana M Kulkarni, Matthew Booker, Serena J Silver, Adam Friedman, Pengyu Hong, Norbert Perrimon, and Bernard Mathey-Prevot. 2006. “Evidence of off-target effects associated with long dsRNAs in Drosophila melanogaster cell-based assays.” Nat Methods, 3, 10, Pp. 833-8.

To evaluate the specificity of long dsRNAs used in high-throughput RNA interference (RNAi) screens performed at the Drosophila RNAi Screening Center (DRSC), we performed a global analysis of their activity in 30 genome-wide screens completed at our facility. Notably, our analysis predicts that dsRNAs containing ≥19-nucleotide perfect matches identified in silico to unintended targets may contribute to a significant false positive error rate arising from off-target effects. We confirmed experimentally that such sequences in dsRNAs lead to false positives and to efficient knockdown of a cross-hybridizing transcript, raising a cautionary note about interpreting results based on the use of a single dsRNA per gene. Although a full appreciation of all causes of false positive errors remains to be determined, we suggest simple guidelines to help ensure high-quality information from RNAi high-throughput screens.

2006_Nat Meth_Kulkarni.pdf Supplemental
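
The in silico criterion in the abstract above (a perfect match of at least 19 nucleotides between a dsRNA and an unintended transcript) can be illustrated with a simple k-mer comparison. This is only a sketch under stated assumptions: the function names, the toy sequences, and the choice to check both strands of the dsRNA are illustrative and are not taken from the paper or from any DRSC tool.

```python
def kmers(sequence, k=19):
    """Return the set of all length-k substrings of a DNA sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def reverse_complement(sequence):
    """Reverse-complement a DNA sequence written with A, C, G, T."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def predicted_off_targets(amplicon, transcripts, intended, k=19):
    """Map each unintended transcript to its count of shared k-mers.

    amplicon    -- DNA sequence of the dsRNA amplicon
    transcripts -- dict of transcript name -> DNA sequence
    intended    -- set of transcript names the dsRNA is meant to target
    Both strands of the amplicon are checked (an assumption of this sketch).
    """
    probe = kmers(amplicon, k) | kmers(reverse_complement(amplicon), k)
    hits = {}
    for name, sequence in transcripts.items():
        if name in intended:
            continue
        shared = probe & kmers(sequence, k)
        if shared:
            hits[name] = len(shared)
    return hits


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    amplicon = "ACGTTGCAAGGCTTACCGATGGCATTACG"
    transcripts = {
        "geneA": "GG" + amplicon + "CC",             # intended target
        "geneB": "TTTTT" + amplicon[:19] + "GGGGG",  # shares one 19-mer
        "geneC": "A" * 40,                           # unrelated
    }
    print(predicted_off_targets(amplicon, transcripts, intended={"geneA"}))
```

A set-based comparison like this only finds exact matches; it says nothing about whether a flagged transcript is expressed or actually knocked down, which the abstract notes was confirmed experimentally.
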
Ian Flockhart, Matthew Booker, Amy Kiger, Michael Boutros, Susan Armknecht, Nadire Ramadan, Kris Richardson, Andrew Xu, Norbert Perrimon, and Bernard Mathey-Prevot. 2006. “FlyRNAi: the Drosophila RNAi screening center database.” Nucleic Acids Res, 34, Database issue, Pp. D489-94.

RNA interference (RNAi) has become a powerful tool for genetic screening in Drosophila. At the Drosophila RNAi Screening Center (DRSC), we are using a library of over 21,000 double-stranded RNAs targeting known and predicted genes in Drosophila. This library is available for the use of visiting scientists wishing to perform full-genome RNAi screens. The data generated from these screens are collected in the DRSC database in a flexible format for the convenience of the scientist and for archiving data. The long-term goal of this database is to provide annotations for as many of the uncharacterized genes in Drosophila as possible. Data from published screens are available to the public through a highly configurable interface that allows detailed examination of the data and provides access to a number of other databases and bioinformatics tools.

2006_Nucl Acids Res_Flockhart.pdf
Frederic Bard, Laetitia Casano, Arrate Mallabiabarrena, Erin Wallace, Kota Saito, Hitoshi Kitayama, Gianni Guizzunti, Yue Hu, Franz Wendler, Ramanuj DasGupta, Norbert Perrimon, and Vivek Malhotra. 2006. “Functional genomics reveals genes involved in protein secretion and Golgi organization.” Nature, 439, 7076, Pp. 604-7.

Yeast genetics and in vitro biochemical analysis have identified numerous genes involved in protein secretion. As compared with yeast, however, the metazoan secretory pathway is more complex and many mechanisms that regulate organization of the Golgi apparatus remain poorly characterized. We performed a genome-wide RNA-mediated interference screen in a Drosophila cell line to identify genes required for constitutive protein secretion. We then classified the genes on the basis of the effect of their depletion on organization of the Golgi membranes. Here we show that depletion of class A genes redistributes Golgi membranes into the endoplasmic reticulum, depletion of class B genes leads to Golgi fragmentation, depletion of class C genes leads to aggregation of Golgi membranes, and depletion of class D genes causes no obvious change. Of the 20 new gene products characterized so far, several localize to the Golgi membranes and the endoplasmic reticulum.

2006_Nature_Bard.pdf Supplemental
Adam Friedman and Norbert Perrimon. 2006. “A functional RNAi screen for regulators of receptor tyrosine kinase and ERK signalling.” Nature, 444, 7116, Pp. 230-4.

Receptor tyrosine kinase (RTK) signalling through extracellular-signal-regulated kinases (ERKs) has pivotal roles during metazoan development, underlying processes as diverse as fate determination, differentiation, proliferation, survival, migration and growth. Abnormal RTK/ERK signalling has been extensively documented to contribute to developmental disorders and disease, most notably in oncogenic transformation by mutant RTKs or downstream pathway components such as Ras and Raf. Although the core RTK/ERK signalling cassette has been characterized by decades of research using mammalian cell culture and forward genetic screens in model organisms, signal propagation through this pathway is probably regulated by a larger network of moderate, context-specific proteins. The genes encoding these proteins may not have been discovered through traditional screens owing, in particular, to the requirement for visible phenotypes. To obtain a global view of RTK/ERK signalling, we performed an unbiased, RNA interference (RNAi), genome-wide, high-throughput screen in Drosophila cells using a novel, quantitative, cellular assay monitoring ERK activation. Here we show that ERK pathway output integrates a wide array of conserved cellular processes. Further analysis of selected components, in multiple cell types with different RTK ligands and oncogenic stimuli, validates and classifies 331 pathway regulators. The relevance of these genes is highlighted by our isolation of a Ste20-like kinase and a PPM-family phosphatase that seem to regulate RTK/ERK signalling in vivo and in mammalian cells. Novel regulators that modulate specific pathway outputs may be selective targets for drug discovery.

2006_Nature_Friedman.pdf Supplemental
Christophe J Echeverri and Norbert Perrimon. 2006. “High-throughput RNAi screening in cultured cells: a user's guide.” Nat Rev Genet, 7, 5, Pp. 373-84.

RNA interference has re-energized the field of functional genomics by enabling genome-scale loss-of-function screens in cultured cells. Looking back on the lessons that have been learned from the first wave of technology developments and applications in this exciting field, we provide both a user's guide for newcomers to the field and a detailed examination of some more complex issues, particularly concerning optimization and quality control, for more advanced users. From a discussion of cell lines, screening paradigms, reagent types and read-out methodologies, we explore in particular the complexities of designing optimal controls and normalization strategies for these challenging but extremely powerful studies.

2006_Nat Rev Gene_Echeverri.pdf
Susan Armknecht, Michael Boutros, Amy Kiger, Kent Nybakken, Bernard Mathey-Prevot, and Norbert Perrimon. 2005. “High-throughput RNA interference screens in Drosophila tissue culture cells.” Methods Enzymol, 392, Pp. 55-73.

This chapter describes the method used to conduct high-throughput screening (HTS) by RNA interference in Drosophila tissue culture cells. It covers four main topics: (1) a brief description of the existing platforms to conduct RNAi screens in cell-based assays; (2) a table of the Drosophila cell lines available for these screens and a brief mention of the need to establish other cell lines as well as cultures of primary cells; (3) a discussion of the considerations and protocols involved in establishing assays suitable for HTS in a 384-well format; and (4) a summary of the various ways of handling raw data from an ongoing screen, with special emphasis on how to apply normalization for experimental variation and statistical filters to sort out noise from signals.

2005_Methods Enzym_Armknecht.pdf