Publications and Datasets

To search or download DRSC genome-wide cell RNAi screen data sets, see the DRSC Screen Summary page.

Yanhui Hu, Arunachalam Vinayagam, Ankita Nand, Aram Comjean, Verena Chung, Tong Hao, Stephanie E Mohr, and Norbert Perrimon. 2018. “Molecular Interaction Search Tool (MIST): an integrated resource for mining gene and protein interaction data.” Nucleic Acids Res, 46, D1, Pp. D567-D574.
Model organism and human databases are rich with information about genetic and physical interactions. These data can be used to interpret and guide the analysis of results from new studies and develop new hypotheses. Here, we report the development of the Molecular Interaction Search Tool (MIST). The MIST database integrates biological interaction data from yeast, nematode, fly, zebrafish, frog, rat and mouse model systems, as well as human. For individual or short gene lists, the MIST user interface can be used to identify interacting partners based on protein-protein and genetic interaction (GI) data from the species of interest as well as inferred interactions, known as interologs, and to view a corresponding network. The data, interologs and search tools at MIST are also useful for analyzing 'omics datasets. In addition to describing the integrated database, we also demonstrate how MIST can be used to identify an appropriate cut-off value that balances false positive and negative discovery, and present use-cases for additional types of analysis. Altogether, the MIST database and search tools support visualization and navigation of existing protein and GI data, as well as comparison of new and existing data.
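The interolog idea mentioned in the abstract above can be illustrated with a minimal sketch: an interaction observed between two genes in one species is transferred to the corresponding ortholog pairs in another species. This is not the MIST implementation; the function name `infer_interologs`, the input shapes, and the gene symbols are hypothetical and chosen only for illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple


def infer_interologs(
    source_interactions: Iterable[Tuple[str, str]],
    orthologs: Dict[str, List[str]],
) -> Set[FrozenSet[str]]:
    """Map interactions from a source species onto a target species.

    source_interactions: gene pairs that interact in the source species.
    orthologs: source gene -> list of target-species orthologs (assumed input).
    Returns undirected target-species pairs, each stored as a frozenset.
    """
    inferred: Set[FrozenSet[str]] = set()
    for gene_a, gene_b in source_interactions:
        # Every combination of orthologs of the two partners is a candidate.
        for target_a in orthologs.get(gene_a, []):
            for target_b in orthologs.get(gene_b, []):
                inferred.add(frozenset((target_a, target_b)))
    return inferred


# Hypothetical example: one fly interaction, mapped to made-up human symbols.
fly_pairs = [("dFoo", "dBar")]
fly_to_human = {"dFoo": ["FOO1", "FOO2"], "dBar": ["BAR"]}
human_interologs = infer_interologs(fly_pairs, fly_to_human)
```

Genes without an entry in the ortholog map simply contribute no inferred pairs, so the result is only as complete as the ortholog mapping supplied.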
Ben Ewen-Campen, Stephanie E Mohr, Yanhui Hu, and Norbert Perrimon. 10/9/2017. “Accessing the Phenotype Gap: Enabling Systematic Investigation of Paralog Functional Complexity with CRISPR.” Dev Cell, 43, 1, Pp. 6-9.
Single-gene knockout experiments can fail to reveal function in the context of redundancy, which is frequently observed among duplicated genes (paralogs) with overlapping functions. We discuss the complexity associated with studying paralogs and outline how recent advances in CRISPR will help address the "phenotype gap" and impact biomedical research.
Benjamin E Housden, Zhongchi Li, Colleen Kelley, Yuanli Wang, Yanhui Hu, Alexander J Valvezan, Brendan D Manning, and Norbert Perrimon. 11/28/2017. “Improved detection of synthetic lethal interactions in Drosophila cells using variable dose analysis (VDA).” Proc Natl Acad Sci U S A.
Synthetic sick or synthetic lethal (SS/L) screens are a powerful way to identify candidate drug targets to specifically kill tumor cells, but this approach generally suffers from low consistency between screens. We found that many SS/L interactions involve essential genes and are therefore detectable within a limited range of knockdown efficiency. Such interactions are often missed by overly efficient RNAi reagents. We therefore developed an assay that measures viability over a range of knockdown efficiency within a cell population. This method, called Variable Dose Analysis (VDA), is highly sensitive to viability phenotypes and reproducibly detects SS/L interactions. We applied the VDA method to search for SS/L interactions with TSC1 and TSC2, the two tumor suppressors underlying tuberous sclerosis complex (TSC), and generated a SS/L network for TSC. Using this network, we identified four Food and Drug Administration-approved drugs that selectively affect viability of TSC-deficient cells, representing promising candidates for repurposing to treat TSC-related tumors.
Yanhui Hu, Aram Comjean, Stephanie E Mohr, The FlyBase Consortium, and Norbert Perrimon. 8/7/2017. “Gene2Function: An Integrated Online Resource for Gene Function Discovery.” G3 (Bethesda).
One of the most powerful ways to develop hypotheses regarding biological functions of conserved genes in a given species, such as in humans, is to first look at what is known about function in another species. Model organism databases (MODs) and other resources are rich with functional information but difficult to mine. Gene2Function (G2F) addresses a broad need by integrating information about conserved genes in a single online resource.
Julia Wang, Rami Al-Ouran, Yanhui Hu, Seon-Young Kim, Ying-Wooi Wan, Michael F Wangler, Shinya Yamamoto, Hsiao-Tuan Chao, Aram Comjean, Stephanie E Mohr, Norbert Perrimon, Zhandong Liu, and Hugo J Bellen. 6/1/2017. “MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome.” Am J Hum Genet, 100, 6, Pp. 843-853.
One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publicly available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms are arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research.
Yanhui Hu, Aram Comjean, Norbert Perrimon, and Stephanie E Mohr. 2/10/2017. “The Drosophila Gene Expression Tool (DGET) for expression analyses.” BMC Bioinformatics, 18, 1, Pp. 98.
BACKGROUND: Next-generation sequencing technologies have greatly increased our ability to identify gene expression levels, including at specific developmental stages and in specific tissues. Gene expression data can help researchers understand the diverse functions of genes and gene networks, as well as help in the design of specific and efficient functional studies, such as by helping researchers choose the most appropriate tissue for a study of a group of genes, or conversely, by limiting a long list of gene candidates to the subset that are normally expressed at a given stage or in a given tissue. RESULTS: We report DGET, a Drosophila Gene Expression Tool, which stores and facilitates search of RNA-Seq based expression profiles available from the modENCODE consortium and other public data sets. Using DGET, researchers are able to look up gene expression profiles, filter results based on threshold expression values, and compare expression data across different developmental stages, tissues and treatments. In addition, at DGET a researcher can analyze tissue or stage-specific enrichment for an inputted list of genes (e.g., 'hits' from a screen) and search for additional genes with similar expression patterns. We performed a number of analyses to demonstrate the quality and robustness of the resource. In particular, we show that evolutionary conserved genes expressed at high or moderate levels in both fly and human tend to be expressed in similar tissues. Using DGET, we compared whole tissue profile and sub-region/cell-type specific datasets and estimated a potential source of false positives in one dataset. We also demonstrated the usefulness of DGET for synexpression studies by querying genes with expression profile similar to the mesodermal master regulator Twist.
CONCLUSION: Altogether, DGET provides a flexible tool for expression data retrieval and analysis with short or long lists of Drosophila genes, which can help scientists to design stage- or tissue-specific in vivo studies and do other subsequent analyses.
Eui Jae Sung, Masasuke Ryuda, Hitoshi Matsumoto, Outa Uryu, Masanori Ochiai, Molly E Cook, Na Young Yi, Huanchen Wang, James W Putney, Gary S Bird, Stephen B Shears, and Yoichi Hayakawa. 12/11/2017. “Cytokine signaling through Drosophila Mthl10 ties lifespan to environmental stress.” Proc Natl Acad Sci U S A.
A systems-level understanding of cytokine-mediated, intertissue signaling is one of the keys to developing fundamental insight into the links between aging and inflammation. Here, we employed Drosophila, a routine model for analysis of cytokine signaling pathways in higher animals, to identify a receptor for the growth-blocking peptide (GBP) cytokine. Having previously established that the phospholipase C/Ca2+ signaling pathway mediates innate immune responses to GBP, we conducted a dsRNA library screen for genes that modulate Ca2+ mobilization in Drosophila S3 cells. A hitherto orphan G protein coupled receptor, Methuselah-like receptor-10 (Mthl10), was a significant hit. Secondary screening confirmed specific binding of fluorophore-tagged GBP to both S3 cells and recombinant Mthl10-ectodomain. We discovered that the metabolic, immunological, and stress-protecting roles of GBP all interconnect through Mthl10. This we established by Mthl10 knockdown in three fly model systems: in hemocyte-like Drosophila S2 cells, Mthl10 knockdown decreases GBP-mediated innate immune responses; in larvae, Mthl10 knockdown decreases expression of antimicrobial peptides in response to low temperature; in adult flies, Mthl10 knockdown increases mortality rate following infection with Micrococcus luteus and reduces GBP-mediated secretion of insulin-like peptides. We further report that organismal fitness pays a price for the utilization of Mthl10 to integrate all of these various homeostatic attributes of GBP: We found that elevated GBP expression reduces lifespan. Conversely, Mthl10 knockdown extended lifespan. We describe how our data offer opportunities for further molecular interrogation of yin and yang between homeostasis and longevity.
Stephanie E Mohr, Kirstin Rudd, Yanhui Hu, Wei R Song, Quentin Gilly, Michael Buckner, Benjamin E Housden, Colleen Kelley, Jonathan Zirin, Rong Tao, Gabriel Amador, Katarzyna Sierzputowska, Aram Comjean, and Norbert Perrimon. 12/9/2017. “Zinc Detoxification: A Functional Genomics and Transcriptomics Analysis in Drosophila melanogaster Cultured Cells.” G3 (Bethesda).
Cells require some metals, such as zinc and manganese, but excess levels of these metals can be toxic. As a result, cells have evolved complex mechanisms for maintaining metal homeostasis and surviving metal intoxication. Here, we present the results of a large-scale functional genomic screen in Drosophila cultured cells for modifiers of zinc chloride toxicity, together with transcriptomics data for wildtype or genetically zinc-sensitized cells challenged with mild zinc chloride supplementation. Altogether, we identified 47 genes for which knockdown conferred sensitivity or resistance to toxic zinc or manganese chloride treatment, and more than 1800 putative zinc-responsive genes. Analysis of the 'omics data points to the relevance of ion transporters, glutathione-related factors, and conserved disease-associated genes in zinc detoxification. Specific genes identified in the zinc screen include orthologs of human disease-associated genes CTNS, PTPRN (also known as IA-2), and ATP13A2 (also known as PARK9). We show that knockdown of red dog mine (rdog; CG11897), a candidate zinc detoxification gene encoding an ABCC-type transporter family protein related to yeast cadmium factor (YCF1), confers sensitivity to zinc intoxication in cultured cells and that rdog is transcriptionally up-regulated in response to zinc stress. As there are many links between the biology of zinc and other metals and human health, the 'omics datasets presented here provide a resource that will allow researchers to explore metal biology in the context of diverse health-relevant processes.
Benjamin E Housden, Matthias Muhar, Matthew Gemberling, Charles A Gersbach, Didier YR Stainier, Geraldine Seydoux, Stephanie E Mohr, Johannes Zuber, and Norbert Perrimon. 10/31/2016. “Loss-of-function genetic tools for animal models: cross-species and cross-platform differences.” Nat Rev Genet.

Our understanding of the genetic mechanisms that underlie biological processes has relied extensively on loss-of-function (LOF) analyses. LOF methods target DNA, RNA or protein to reduce or to ablate gene function. By analysing the phenotypes that are caused by these perturbations the wild-type function of genes can be elucidated. Although all LOF methods reduce gene activity, the choice of approach (for example, mutagenesis, CRISPR-based gene editing, RNA interference, morpholinos or pharmacological inhibition) can have a major effect on phenotypic outcomes. Interpretation of the LOF phenotype must take into account the biological process that is targeted by each method. The practicality and efficiency of LOF methods also vary considerably between model systems. We describe parameters for choosing the optimal combination of method and system, and for interpreting phenotypes within the constraints of each method.

Arunachalam Vinayagam, Travis E Gibson, Ho-Joon Lee, Bahar Yilmazel, Charles Roesel, Yanhui Hu, Young Kwon, Amitabh Sharma, Yang-Yu Liu, Norbert Perrimon, and Albert-László Barabási. 5/3/2016. “Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets.” Proc Natl Acad Sci U S A, 113, 18, Pp. 4976-81.

The protein-protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as "indispensable," "neutral," or "dispensable," which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network's control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.
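The node classification described in the abstract above can be sketched on a toy network. The sketch assumes the standard structural-controllability result that the minimum number of driver nodes in a directed network equals the number of nodes minus the size of a maximum matching (with a floor of one); the abstract itself does not spell out this computation. All function names are hypothetical, and the simple recursive matching routine is suitable only for small examples, not for a network of thousands of proteins.

```python
from typing import Dict, Hashable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def max_matching_size(nodes: List[Hashable], edges: List[Edge]) -> int:
    """Maximum matching between 'out' copies and 'in' copies of the nodes."""
    successors: Dict[Hashable, List[Hashable]] = {n: [] for n in nodes}
    for source, target in edges:
        successors[source].append(target)
    matched_by: Dict[Hashable, Hashable] = {}  # in-copy -> out-copy

    def augment(source: Hashable, seen: Set[Hashable]) -> bool:
        # Augmenting-path search (Kuhn's algorithm) starting from one out-copy.
        for target in successors[source]:
            if target in seen:
                continue
            seen.add(target)
            if target not in matched_by or augment(matched_by[target], seen):
                matched_by[target] = source
                return True
        return False

    return sum(1 for n in nodes if augment(n, set()))


def driver_node_count(nodes: List[Hashable], edges: List[Edge]) -> int:
    """Minimum number of driver nodes: unmatched nodes, at least one."""
    return max(len(nodes) - max_matching_size(nodes, edges), 1)


def classify_nodes(nodes: List[Hashable], edges: List[Edge]) -> Dict[Hashable, str]:
    """Label each node by how its removal changes the driver-node count."""
    baseline = driver_node_count(nodes, edges)
    labels: Dict[Hashable, str] = {}
    for node in nodes:
        remaining = [n for n in nodes if n != node]
        kept_edges = [(u, v) for u, v in edges if node not in (u, v)]
        after = driver_node_count(remaining, kept_edges)
        if after > baseline:
            labels[node] = "indispensable"
        elif after < baseline:
            labels[node] = "dispensable"
        else:
            labels[node] = "neutral"
    return labels


# Toy examples: a directed path a -> b -> c, and a hub h with three targets.
path_labels = classify_nodes(["a", "b", "c"], [("a", "b"), ("b", "c")])
star_labels = classify_nodes(
    ["h", "x", "y", "z"], [("h", "x"), ("h", "y"), ("h", "z")]
)
```

In the path, removing the middle node splits the chain and raises the driver-node count, so it is labeled indispensable; in the star, removing any leaf lowers the count, so the leaves are labeled dispensable.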

Stephanie E Mohr, Yanhui Hu, Benjamin Ewen-Campen, Benjamin E Housden, Raghuvir Viswanatha, and Norbert Perrimon. 2016. “CRISPR guide RNA design for research applications.” FEBS J.

The rapid rise of CRISPR as a technology for genome engineering and related research applications has created a need for algorithms and associated online tools that facilitate design of on-target and effective guide RNAs (gRNAs). Here, we review the state-of-the-art in CRISPR gRNA design for research applications of the CRISPR-Cas9 system, including knockout, activation and inhibition. Notably, achieving good gRNA design is not solely dependent on innovations in CRISPR technology. Good design and design tools also rely on availability of high-quality genome sequence and gene annotations, as well as on availability of accumulated data regarding off-targets and effectiveness metrics.

Iiro Taneli Helenius, Ryan J Haake, Yong-Jae Kwon, Jennifer A Hu, Thomas Krupinski, Marina S Casalino-Matsuda, Peter HS Sporn, Jacob I Sznajder, and Greg J Beitel. 2016. “Identification of Drosophila Zfh2 as a Mediator of Hypercapnic Immune Regulation by a Genome-Wide RNA Interference Screen.” J Immunol, 196, 2, Pp. 655-67.

Hypercapnia, elevated partial pressure of CO2 in blood and tissue, develops in many patients with chronic severe obstructive pulmonary disease and other advanced lung disorders. Patients with advanced disease frequently develop bacterial lung infections, and hypercapnia is a risk factor for mortality in such individuals. We previously demonstrated that hypercapnia suppresses induction of NF-κB-regulated innate immune response genes required for host defense in human, mouse, and Drosophila cells, and it increases mortality from bacterial infections in both mice and Drosophila. However, the molecular mediators of hypercapnic immune suppression are undefined. In this study, we report a genome-wide RNA interference screen in Drosophila S2* cells stimulated with bacterial peptidoglycan. The screen identified 16 genes with human orthologs whose knockdown reduced hypercapnic suppression of the gene encoding the antimicrobial peptide Diptericin (Dipt), but did not increase Dipt mRNA levels in air. In vivo tests of one of the strongest screen hits, zinc finger homeodomain 2 (Zfh2; mammalian orthologs ZFHX3/ATBF1 and ZFHX4), demonstrate that reducing zfh2 function using a mutation or RNA interference improves survival of flies exposed to elevated CO2 and infected with Staphylococcus aureus. Tissue-specific knockdown of zfh2 in the fat body, the major immune and metabolic organ of the fly, mitigates hypercapnia-induced reductions in Dipt and other antimicrobial peptides and improves resistance of CO2-exposed flies to infection. Zfh2 mutations also partially rescue hypercapnia-induced delays in egg hatching, suggesting that Zfh2's role in mediating responses to hypercapnia extends beyond the immune system. Taken together, to our knowledge, these results identify Zfh2 as the first in vivo mediator of hypercapnic immune suppression.

Huajin Wang, Michel Becuwe, Benjamin E Housden, Chandramohan Chitraju, Ashley J Porras, Morven M Graham, Xinran N Liu, Abdou Rachid Thiam, David B Savage, Anil K Agarwal, Abhimanyu Garg, Maria-Jesus Olarte, Qingqing Lin, Florian Fröhlich, Hans Kristian Hannibal-Bach, Srigokul Upadhyayula, Norbert Perrimon, Tomas Kirchhausen, Christer S Ejsing, Tobias C Walther, and Robert V Farese. 2016. “Seipin is required for converting nascent to mature lipid droplets.” Elife, 5.

How proteins control the biogenesis of cellular lipid droplets (LDs) is poorly understood. Using Drosophila and human cells, we show here that seipin, an ER protein implicated in LD biology, mediates a discrete step in LD formation: the conversion of small, nascent LDs to larger, mature LDs. Seipin forms discrete and dynamic foci in the ER that interact with nascent LDs to enable their growth. In the absence of seipin, numerous small, nascent LDs accumulate near the ER and most often fail to grow. Those that do grow prematurely acquire lipid synthesis enzymes and undergo expansion, eventually leading to the giant LDs characteristic of seipin deficiency. Our studies identify a discrete step of LD formation, namely the conversion of nascent LDs to mature LDs, and define a molecular role for seipin in this process, most likely by acting at ER-LD contact sites to enable lipid transfer to nascent LDs.

Arunachalam Vinayagam, Meghana M Kulkarni, Richelle Sopko, Xiaoyun Sun, Yanhui Hu, Ankita Nand, Christians Villalta, Ahmadali Moghimi, Xuemei Yang, Stephanie E Mohr, Pengyu Hong, John M Asara, and Norbert Perrimon. 9/13/2016. “An Integrative Analysis of the InR/PI3K/Akt Network Identifies the Dynamic Response to Insulin Signaling.” Cell Reports, 16, 11, Pp. 3062-3074.

Insulin regulates an essential conserved signaling pathway affecting growth, proliferation, and metabolism. To expand our understanding of the insulin pathway, we combine biochemical, genetic, and computational approaches to build a comprehensive Drosophila InR/PI3K/Akt network. First, we map the dynamic protein-protein interaction network surrounding the insulin core pathway using bait-prey interactions connecting 566 proteins. Combining RNAi screening and phospho-specific antibodies, we find that 47% of interacting proteins affect pathway activity, and, using quantitative phosphoproteomics, we demonstrate that ~10% of interacting proteins are regulated by insulin stimulation at the level of phosphorylation. Next, we integrate these orthogonal datasets to characterize the structure and dynamics of the insulin network at the level of protein complexes and validate our method by identifying regulatory roles for the Protein Phosphatase 2A (PP2A) and Reptin-Pontin chromatin-remodeling complexes as negative and positive regulators of ribosome biogenesis, respectively. Altogether, our study represents a comprehensive resource for the study of the evolutionary conserved insulin network.

Joel M Swenson, Serafin U Colmenares, Amy R Strom, Sylvain V Costes, and Gary H Karpen. 2016. “The composition and organization of Drosophila heterochromatin are heterogeneous and dynamic.” Elife, 5.

Heterochromatin is enriched for specific epigenetic factors including Heterochromatin Protein 1a (HP1a), and is essential for many organismal functions. To elucidate heterochromatin organization and regulation, we purified Drosophila melanogaster HP1a interactors, and performed a genome-wide RNAi screen to identify genes that impact HP1a levels or localization. The majority of the over four hundred putative HP1a interactors and regulators identified were previously unknown. We found that 13 of 16 tested candidates (83%) are required for gene silencing, providing a substantial increase in the number of identified components that impact heterochromatin properties. Surprisingly, image analysis revealed that although some HP1a interactors and regulators are broadly distributed within the heterochromatin domain, most localize to discrete subdomains that display dynamic localization patterns during the cell cycle. We conclude that heterochromatin composition and architecture is more spatially complex and dynamic than previously suggested, and propose that a network of subdomains regulates diverse heterochromatin functions.

Alfeu Zanotto-Filho, Ravi Dashnamoorthy, Eva Loranc, Luis HT de Souza, José CF Moreira, Uthra Suresh, Yidong Chen, and Alexander JR Bishop. 2016. “Combined Gene Expression and RNAi Screening to Identify Alkylation Damage Survival Pathways from Fly to Human.” PLoS One, 11, 4, Pp. e0153970.

Alkylating agents are a key component of cancer chemotherapy. Several cellular mechanisms are known to be important for its survival, particularly DNA repair and xenobiotic detoxification, yet genomic screens indicate that additional cellular components may be involved. Elucidating these components has value in either identifying key processes that can be modulated to improve chemotherapeutic efficacy or may be altered in some cancers to confer chemoresistance. We therefore set out to reevaluate our prior Drosophila RNAi screening data by comparison to gene expression arrays in order to determine if we could identify any novel processes in alkylation damage survival. We noted a consistent conservation of alkylation survival pathways across platforms and species when the analysis was conducted on a pathway/process level rather than at an individual gene level. Better results were obtained when combining gene lists from two datasets (RNAi screen plus microarray) prior to analysis. In addition to previously identified DNA damage responses (p53 signaling and Nucleotide Excision Repair), DNA-mRNA-protein metabolism (transcription/translation) and proteasome machinery, we also noted a highly conserved cross-species requirement for NRF2, glutathione (GSH)-mediated drug detoxification and Endoplasmic Reticulum stress (ER stress)/Unfolded Protein Responses (UPR) in cells exposed to alkylation. The requirement for GSH, NRF2 and UPR in alkylation survival was validated by metabolomics, protein studies and functional cell assays. From this we conclude that RNAi/gene expression fusion is a valid strategy to rapidly identify key processes that may be extendable to other contexts beyond damage survival.

Yanhui Hu, Aram Comjean, Charles Roesel, Arunachalam Vinayagam, Ian Flockhart, Jonathan Zirin, Lizabeth Perkins, Norbert Perrimon, and Stephanie E Mohr. 10/11/2016. “—the database of the Drosophila RNAi screening center and transgenic RNAi project: 2017 update.” Nucleic Acids Research.

The FlyRNAi database of the Drosophila RNAi Screening Center (DRSC) and Transgenic RNAi Project (TRiP) at Harvard Medical School and associated DRSC/TRiP Functional Genomics Resources website serve as a reagent production tracking system, screen data repository, and portal to the community. Through this portal, we make available protocols, online tools, and other resources useful to researchers at all stages of high-throughput functional genomics screening, from assay design and reagent identification to data analysis and interpretation. In this update, we describe recent changes and additions to our website, database and suite of online tools. Recent changes reflect a shift in our focus from a single technology (RNAi) and model species (Drosophila) to the application of additional technologies (e.g. CRISPR) and support of integrated, cross-species approaches to uncovering gene function using functional genomics and other approaches.

Chen X and Xu L. 2016. “Genome-Wide RNAi Screening to Dissect the TGF-β Signal Transduction Pathway.” Methods in Molecular Biology.

The transforming growth factor-β (TGF-β) family of cytokines figures prominently in regulation of embryonic development and adult tissue homeostasis from Drosophila to mammals. Genetic defects affecting TGF-β signaling underlie developmental disorders and diseases such as cancer in human. Therefore, delineating the molecular mechanism by which TGF-β regulates cell biology is critical for understanding normal biology and disease mechanisms. Forward genetic screens in model organisms and biochemical approaches in mammalian tissue culture were instrumental in initial characterization of the TGF-β signal transduction pathway. With complete sequence information of the genomes and the advent of RNA interference (RNAi) technology, genome-wide RNAi screening emerged as a powerful functional genomics approach to systematically delineate molecular components of signal transduction pathways. Here, we describe a protocol for image-based whole-genome RNAi screening aimed at identifying molecules required for TGF-β signaling into the nucleus. Using this protocol we examined >90 % of annotated Drosophila open reading frames (ORF) individually and successfully uncovered several novel factors serving critical roles in the TGF-β pathway. Thus cell-based high-throughput functional genomics can uncover new mechanistic insights on signaling pathways beyond what the classical genetics had revealed.