Data mining

Susan Armknecht, Michael Boutros, Amy Kiger, Kent Nybakken, Bernard Mathey-Prevot, and Norbert Perrimon. 2005. “High-throughput RNA interference screens in Drosophila tissue culture cells.” Methods Enzymol, 392, Pp. 55-73.

This chapter describes the method used to conduct high-throughput screening (HTS) by RNA interference in Drosophila tissue culture cells. It covers four main topics: (1) a brief description of the existing platforms to conduct RNAi screens in cell-based assays; (2) a table of the Drosophila cell lines available for these screens and a brief mention of the need to establish other cell lines as well as cultures of primary cells; (3) a discussion of the considerations and protocols involved in establishing assays suitable for HTS in a 384-well format; and (4) a summary of the various ways of handling raw data from an ongoing screen, with special emphasis on how to apply normalization for experimental variation and statistical filters to sort out noise from signals.
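As an illustration of the kind of normalization and statistical filtering the chapter summarizes, the sketch below computes a plate-wise robust z-score and flags wells beyond a cutoff. It is not taken from the chapter: the function names, the choice of median/MAD scaling, and the threshold of 3 are assumptions made for this example only.

```python
from statistics import median

def robust_z_scores(plate_values):
    """Return robust z-scores for the raw readouts of one plate.

    Assumption: each plate is normalized against its own median and
    median absolute deviation (MAD), so plate-to-plate variation drops out.
    """
    med = median(plate_values)
    mad = median(abs(v - med) for v in plate_values)
    if mad == 0:
        raise ValueError("MAD is zero; the plate has no spread to normalize against")
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [(v - med) / scale for v in plate_values]

def flag_hits(plate_values, threshold=3.0):
    """Return the indices of wells whose |robust z| meets the (hypothetical) cutoff."""
    scores = robust_z_scores(plate_values)
    return [i for i, z in enumerate(scores) if abs(z) >= threshold]

# Toy plate: six unremarkable wells and one strong outlier in the last well.
plate = [100, 102, 98, 101, 99, 100, 20]
hits = flag_hits(plate)
```

Median-based scaling is used here because a handful of strong hits on a plate would otherwise inflate the mean and standard deviation and mask themselves.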

Stephanie E Mohr and Norbert Perrimon. 2012. “RNAi screening: new approaches, understandings, and organisms.” Wiley Interdiscip Rev RNA, 3, 2, Pp. 145-58.

RNA interference (RNAi) leads to sequence-specific knockdown of gene function. The approach can be used in large-scale screens to interrogate function in various model organisms and an increasing number of other species. Genome-scale RNAi screens are routinely performed in cultured or primary cells or in vivo in organisms such as C. elegans. High-throughput RNAi screening is benefitting from the development of sophisticated new instrumentation and software tools for collecting and analyzing data, including high-content image data. The results of large-scale RNAi screens have already proved useful, leading to new understandings of gene function relevant to topics such as infection, cancer, obesity, and aging. Nevertheless, important caveats apply and should be taken into consideration when developing or interpreting RNAi screens. Some level of false discovery is inherent to high-throughput approaches and, specific to RNAi screens, false discovery due to off-target effects (OTEs) of RNAi reagents remains a problem. The need to improve our ability to use RNAi to elucidate gene function at large scale and in additional systems continues to be addressed through improved RNAi library design, development of innovative computational and analysis tools, and other approaches.

Dashnamoorthy Ravi, Amy M Wiles, Selvaraj Bhavani, Jianhua Ruan, Philip Leder, and Alexander JR Bishop. 2009. “A network of conserved damage survival pathways revealed by a genomic RNAi screen.” PLoS Genet, 5, 6, Pp. e1000527.

Damage initiates a pleiotropic cellular response aimed at cellular survival when appropriate. To identify genes required for damage survival, we used a cell-based RNAi screen against the Drosophila genome and the alkylating agent methyl methanesulphonate (MMS). Similar studies performed in other model organisms report that damage response may involve pleiotropic cellular processes other than the central DNA repair components, yet an intuitive systems level view of the cellular components required for damage survival, their interrelationship, and contextual importance has been lacking. Further, by comparing data from different model organisms, identification of conserved and presumably core survival components should be forthcoming. We identified 307 genes, representing 13 signaling, metabolic, or enzymatic pathways, affecting cellular survival of MMS-induced damage. As expected, the majority of these pathways are involved in DNA repair; however, several pathways with more diverse biological functions were also identified, including the TOR pathway, transcription, translation, proteasome, glutathione synthesis, ATP synthesis, and Notch signaling, and these were equally important in damage survival. Comparison with genomic screen data from Saccharomyces cerevisiae revealed no overlap enrichment of individual genes between the species, but a conservation of the pathways. To demonstrate the functional conservation of pathways, five were tested in Drosophila and mouse cells, with each pathway responding to alkylation damage in both species. Using the protein interactome, a significant level of connectivity was observed between Drosophila MMS survival proteins, suggesting a higher order relationship. This connectivity was dramatically improved by incorporating the components of the 13 identified pathways within the network. Grouping proteins into "pathway nodes" qualitatively improved the interactome organization, revealing a highly organized "MMS survival network." 
We conclude that identification of pathways can facilitate comparative biology analysis when direct gene/orthologue comparisons fail. A biologically intuitive, highly interconnected MMS survival network was revealed after we incorporated pathway data in our interactome analysis.

Alfeu Zanotto-Filho, Ravi Dashnamoorthy, Eva Loranc, Luis HT de Souza, José CF Moreira, Uthra Suresh, Yidong Chen, and Alexander JR Bishop. 2016. “Combined Gene Expression and RNAi Screening to Identify Alkylation Damage Survival Pathways from Fly to Human.” PLoS One, 11, 4, Pp. e0153970.

Alkylating agents are a key component of cancer chemotherapy. Several cellular mechanisms are known to be important for its survival, particularly DNA repair and xenobiotic detoxification, yet genomic screens indicate that additional cellular components may be involved. Elucidating these components has value in either identifying key processes that can be modulated to improve chemotherapeutic efficacy or may be altered in some cancers to confer chemoresistance. We therefore set out to reevaluate our prior Drosophila RNAi screening data by comparison to gene expression arrays in order to determine if we could identify any novel processes in alkylation damage survival. We noted a consistent conservation of alkylation survival pathways across platforms and species when the analysis was conducted on a pathway/process level rather than at an individual gene level. Better results were obtained when combining gene lists from two datasets (RNAi screen plus microarray) prior to analysis. In addition to previously identified DNA damage responses (p53 signaling and Nucleotide Excision Repair), DNA-mRNA-protein metabolism (transcription/translation) and proteasome machinery, we also noted a highly conserved cross-species requirement for NRF2, glutathione (GSH)-mediated drug detoxification and Endoplasmic Reticulum stress (ER stress)/Unfolded Protein Responses (UPR) in cells exposed to alkylation. The requirement for GSH, NRF2 and UPR in alkylation survival was validated by metabolomics, protein studies and functional cell assays. From this we conclude that RNAi/gene expression fusion is a valid strategy to rapidly identify key processes that may be extendable to other contexts beyond damage survival.

Frederic Bard, Laetitia Casano, Arrate Mallabiabarrena, Erin Wallace, Kota Saito, Hitoshi Kitayama, Gianni Guizzunti, Yue Hu, Franz Wendler, Ramanuj DasGupta, Norbert Perrimon, and Vivek Malhotra. 2006. “Functional genomics reveals genes involved in protein secretion and Golgi organization.” Nature, 439, 7076, Pp. 604-7.

Yeast genetics and in vitro biochemical analysis have identified numerous genes involved in protein secretion. As compared with yeast, however, the metazoan secretory pathway is more complex and many mechanisms that regulate organization of the Golgi apparatus remain poorly characterized. We performed a genome-wide RNA-mediated interference screen in a Drosophila cell line to identify genes required for constitutive protein secretion. We then classified the genes on the basis of the effect of their depletion on organization of the Golgi membranes. Here we show that depletion of class A genes redistributes Golgi membranes into the endoplasmic reticulum, depletion of class B genes leads to Golgi fragmentation, depletion of class C genes leads to aggregation of Golgi membranes, and depletion of class D genes causes no obvious change. Of the 20 new gene products characterized so far, several localize to the Golgi membranes and the endoplasmic reticulum.

Mar Arias Garcia, Miguel Sanchez Alvarez, Heba Sailem, Vicky Bousgouni, Julia Sero, and Chris Bakal. 2012. “Differential RNAi screening provides insights into the rewiring of signalling networks during oxidative stress.” Mol Biosyst, 8, 10, Pp. 2605-13.

Reactive Oxygen Species (ROS) are a natural by-product of cellular growth and proliferation, and are required for fundamental processes such as protein-folding and signal transduction. However, ROS accumulation, and the onset of oxidative stress, can negatively impact cellular and genomic integrity. Signalling networks have evolved to respond to oxidative stress by engaging diverse enzymatic and non-enzymatic antioxidant mechanisms to restore redox homeostasis. The architecture of oxidative stress response networks during periods of normal growth, and how increased ROS levels dynamically reconfigure these networks are largely unknown. In order to gain insight into the structure of signalling networks that promote redox homeostasis we first performed genome-scale RNAi screens to identify novel suppressors of superoxide accumulation. We then infer relationships between redox regulators by hierarchical clustering of phenotypic signatures describing how gene inhibition affects superoxide levels, cellular viability, and morphology across different genetic backgrounds. Genes that cluster together are likely to act in the same signalling pathway/complex and thus make "functional interactions". Moreover we also calculate differential phenotypic signatures describing the difference in cellular phenotypes following RNAi between untreated cells and cells submitted to oxidative stress. Using both phenotypic signatures and differential signatures we construct a network model of functional interactions that occur between components of the redox homeostasis network, and how such interactions become rewired in the presence of oxidative stress. This network model predicts a functional interaction between the transcription factor Jun and the IRE1 kinase, which we validate in an orthogonal assay. We thus demonstrate the ability of systems-biology approaches to identify novel signalling events.

Amy M Wiles, Mark Doderer, Jianhua Ruan, Ting-Ting Gu, Dashnamoorthy Ravi, Barron Blackman, and Alexander JR Bishop. 2010. “Building and analyzing protein interactome networks by cross-species comparisons.” BMC Syst Biol, 4, Pp. 36.

BACKGROUND: A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps is needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate this data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. RESULTS: The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of networks produced. CONCLUSIONS: Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website provides research biologists intuitive access to this data.
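The orthology-based prediction described in this abstract can be sketched roughly as follows. This is a toy illustration only: it is not InterologFinder's implementation and does not compute the InteroScore, whose formula the abstract does not give. The function name, the data layout, and the gene identifiers are all hypothetical.

```python
from collections import defaultdict

def predict_interologs(known_interactions, orthologs):
    """Transfer known interactions onto a target species through orthology.

    known_interactions: {species: set of (gene_a, gene_b)} observed pairs.
    orthologs: {(species, gene): target_gene} one-to-one mapping (an assumption;
               real orthology is often many-to-many).
    Returns {(target_a, target_b): number of source species supporting the pair}.
    """
    support = defaultdict(set)
    for species, pairs in known_interactions.items():
        for a, b in pairs:
            ta = orthologs.get((species, a))
            tb = orthologs.get((species, b))
            # Keep the pair only if both partners map to distinct target genes.
            if ta and tb and ta != tb:
                support[frozenset((ta, tb))].add(species)
    return {tuple(sorted(pair)): len(species) for pair, species in support.items()}

# Hypothetical data: two species support the same predicted fly interaction.
known = {
    "yeast": {("Y1", "Y2")},
    "worm": {("W1", "W2")},
    "mouse": {("M1", "M3")},  # M3 has no fly orthologue, so this pair is dropped
}
ortho = {
    ("yeast", "Y1"): "F1", ("yeast", "Y2"): "F2",
    ("worm", "W1"): "F2", ("worm", "W2"): "F1",
    ("mouse", "M1"): "F1",
}
predicted = predict_interologs(known, ortho)
```

Counting supporting species is only a stand-in for a confidence measure; the abstract states that the published score also accounts for homology and the number of unique observations.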

Meghana M Kulkarni, Matthew Booker, Serena J Silver, Adam Friedman, Pengyu Hong, Norbert Perrimon, and Bernard Mathey-Prevot. 2006. “Evidence of off-target effects associated with long dsRNAs in Drosophila melanogaster cell-based assays.” Nat Methods, 3, 10, Pp. 833-8.

To evaluate the specificity of long dsRNAs used in high-throughput RNA interference (RNAi) screens performed at the Drosophila RNAi Screening Center (DRSC), we performed a global analysis of their activity in 30 genome-wide screens completed at our facility. Notably, our analysis predicts that dsRNAs containing ≥19-nucleotide perfect matches identified in silico to unintended targets may contribute to a significant false positive error rate arising from off-target effects. We confirmed experimentally that such sequences in dsRNAs lead to false positives and to efficient knockdown of a cross-hybridizing transcript, raising a cautionary note about interpreting results based on the use of a single dsRNA per gene. Although a full appreciation of all causes of false positive errors remains to be determined, we suggest simple guidelines to help ensure high-quality information from RNAi high-throughput screens.
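An in silico check for perfect matches of 19 nucleotides or more, of the kind this abstract refers to, might look like the sketch below. It is not the DRSC pipeline: the function names are hypothetical, sequences are assumed to be DNA-alphabet strings, and checking both strands of the dsRNA is an assumption of this example.

```python
def kmers(seq, k=19):
    """Return the set of all length-k substrings of seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def shared_perfect_matches(dsrna, transcript, k=19):
    """List the k-mers that the dsRNA (either strand) shares exactly with a transcript."""
    probes = kmers(dsrna, k) | kmers(reverse_complement(dsrna), k)
    return sorted(probes & kmers(transcript, k))

# Toy sequences: a 19-nt stretch embedded in both the dsRNA and an unintended transcript.
core = "ACGTTGCAAGCTTAGGCTA"
dsrna = "GGG" + core + "CCC"
unintended = "TTTTT" + core + "AAAAA"
matches = shared_perfect_matches(dsrna, unintended)
```

A non-empty result for a transcript other than the intended target would mark the dsRNA as a candidate source of off-target effects under the 19-nt criterion.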

Arunachalam Vinayagam, Jonathan Zirin, Charles Roesel, Yanhui Hu, Bahar Yilmazel, Anastasia A Samsonova, Ralph A Neumüller, Stephanie E Mohr, and Norbert Perrimon. 2014. “Integrating protein-protein interaction networks with phenotypes reveals signs of interactions.” Nat Methods, 11, 1, Pp. 94-9.

A major objective of systems biology is to organize molecular interactions as networks and to characterize information flow within networks. We describe a computational framework to integrate protein-protein interaction (PPI) networks and genetic screens to predict the 'signs' of interactions (i.e., activation-inhibition relationships). We constructed a Drosophila melanogaster signed PPI network consisting of 6,125 signed PPIs connecting 3,352 proteins that can be used to identify positive and negative regulators of signaling pathways and protein complexes. We identified an unexpected role for the metabolic enzymes enolase and aldo-keto reductase as positive and negative regulators of proteolysis, respectively. Characterization of the activation-inhibition relationships between physically interacting proteins within signaling pathways will affect our understanding of many biological functions, including signal transduction and mechanisms of disease.

Philippos Mourikis, Robert J Lake, Christopher B Firnhaber, and Brian S DeDecker. 2010. “Modifiers of notch transcriptional activity identified by genome-wide RNAi.” BMC Dev Biol, 10, Pp. 107.

BACKGROUND: The Notch signaling pathway regulates a diverse array of developmental processes, and aberrant Notch signaling can lead to diseases, including cancer. To obtain a more comprehensive understanding of the genetic network that integrates into Notch signaling, we performed a genome-wide RNAi screen in Drosophila cell culture to identify genes that modify Notch-dependent transcription. RESULTS: Employing complementary data analyses, we found 399 putative modifiers: 189 promoting and 210 antagonizing Notch activated transcription. These modifiers included several known Notch interactors, validating the robustness of the assay. Many novel modifiers were also identified, covering a range of cellular localizations from the extracellular matrix to the nucleus, as well as a large number of proteins with unknown function. Chromatin-modifying proteins represent a major class of genes identified, including histone deacetylase and demethylase complex components and other chromatin modifying, remodeling and replacement factors. A protein-protein interaction map of the Notch-dependent transcription modifiers revealed that a large number of the identified proteins interact physically with these core chromatin components. CONCLUSIONS: The genome-wide RNAi screen identified many genes that can modulate Notch transcriptional output. A protein interaction map of the identified genes highlighted a network of chromatin-modifying enzymes and remodelers that regulate Notch transcription. Our results open new avenues to explore the mechanisms of Notch signal regulation and the integration of this pathway into diverse cellular processes.

Ramanuj DasGupta, Kent Nybakken, Matthew Booker, Bernard Mathey-Prevot, Foster Gonsalves, Binita Changkakoty, and Norbert Perrimon. 2007. “A case study of the reproducibility of transcriptional reporter cell-based RNAi screens in Drosophila.” Genome Biol, 8, 9, Pp. R203.

Off-target effects have been demonstrated to be a major source of false-positives in RNA interference (RNAi) high-throughput screens. In this study, we re-assess the previously published transcriptional reporter-based whole-genome RNAi screens for the Wingless and Hedgehog signaling pathways using second generation double-stranded RNA libraries. Furthermore, we investigate other factors that may influence the outcome of such screens, including cell-type specificity, robustness of reporters, and assay normalization, which determine the efficacy of RNAi-knockdown of target genes.

Stephanie E Mohr, Yanhui Hu, Kevin Kim, Benjamin E Housden, and Norbert Perrimon. 2014. “Resources for functional genomics studies in Drosophila melanogaster.” Genetics, 197, 1, Pp. 1-18.

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, "meta" information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally.

Matthew Booker, Anastasia A Samsonova, Young Kwon, Ian Flockhart, Stephanie E Mohr, and Norbert Perrimon. 2011. “False negative rates in Drosophila cell-based RNAi screens: a case study.” BMC Genomics, 12, Pp. 50.

BACKGROUND: High-throughput screening using RNAi is a powerful gene discovery method but is often complicated by false positive and false negative results. Whereas false positive results associated with RNAi reagents have been a matter of extensive study, the issue of false negatives has received less attention. RESULTS: We performed a meta-analysis of several genome-wide, cell-based Drosophila RNAi screens, together with a more focused RNAi screen, and conclude that the rate of false negative results is at least 8%. Further, we demonstrate how knowledge of the cell transcriptome can be used to resolve ambiguous results and how the number of false negative results can be reduced by using multiple, independently-tested RNAi reagents per gene. CONCLUSIONS: RNAi reagents that target the same gene do not always yield consistent results due to false positives and weak or ineffective reagents. False positive results can be partially minimized by filtering with transcriptome data. RNAi libraries with multiple reagents per gene also reduce false positive and false negative outcomes when inconsistent results are disambiguated carefully.

Chris Bakal, Rune Linding, Flora Llense, Elleard Heffern, Enrique Martin-Blanco, Tony Pawson, and Norbert Perrimon. 2008. “Phosphorylation networks regulating JNK activity in diverse genetic backgrounds.” Science, 322, 5900, Pp. 453-6.

Cellular signaling networks have evolved to enable swift and accurate responses, even in the face of genetic or environmental perturbation. Thus, genetic screens may not identify all the genes that regulate different biological processes. Moreover, although classical screening approaches have succeeded in providing parts lists of the essential components of signaling networks, they typically do not provide much insight into the hierarchical and functional relations that exist among these components. We describe a high-throughput screen in which we used RNA interference to systematically inhibit two genes simultaneously in 17,724 combinations to identify regulators of Drosophila JUN NH2-terminal kinase (JNK). Using both genetic and phosphoproteomics data, we then implemented an integrative network algorithm to construct a JNK phosphorylation network, which provides structural and mechanistic insights into the systems architecture of JNK signaling.

Yanhui Hu, Aram Comjean, Lizabeth A Perkins, Norbert Perrimon, and Stephanie E Mohr. 2015. “GLAD: an Online Database of Gene List Annotation for Drosophila.” J Genomics, 3, Pp. 75-81.

We present a resource of high quality lists of functionally related Drosophila genes, e.g. based on protein domains (kinases, transcription factors, etc.) or cellular function (e.g. autophagy, signal transduction). To establish these lists, we relied on different inputs, including curation from databases or the literature and mapping from other species. Moreover, as an added curation and quality control step, we asked experts in relevant fields to review many of the lists. The resource is available online for scientists to search and view, and is editable based on community input. Annotation of gene groups is an ongoing effort and scientific need will typically drive decisions regarding which gene lists to pursue. We anticipate that the number of lists will increase over time; that the composition of some lists will grow and/or change over time as new information becomes available; and that the lists will benefit the scientific community, e.g. at experimental design and data analysis stages. Based on this, we present an easily updatable online database, available at, at which gene group lists can be viewed, searched and downloaded.