Publications and Datasets

To search or download DRSC genome-wide cell RNAi screen data sets, see the DRSC Screen Summary page.

Yanhui Hu, Richelle Sopko, Verena Chung, Marianna Foos, Romain A Studer, Sean D Landry, Daniel Liu, Leonard Rabinow, Florian Gnad, Pedro Beltrao, and Norbert Perrimon. 2019. “iProteinDB: An Integrative Database of Drosophila Post-translational Modifications.” G3 (Bethesda), 9, 1, Pp. 1-11.
Post-translational modification (PTM) serves as a regulatory mechanism for protein function, influencing protein stability, interactions, activity and localization, and is critical in many signaling pathways. The best characterized PTM is phosphorylation, whereby a phosphate is added to an acceptor residue, most commonly serine, threonine and tyrosine in metazoans. As proteins are often phosphorylated at multiple sites, identifying those sites that are important for function is a challenging problem. Considering that any given phosphorylation site might be non-functional, prioritizing evolutionarily conserved phosphosites provides a general strategy to identify the putative functional sites. To facilitate the identification of conserved phosphosites, we generated a large-scale phosphoproteomics dataset from embryos collected from six closely-related Drosophila species. We built iProteinDB, a resource integrating these data with other high-throughput PTM datasets, including vertebrates, and manually curated information for Drosophila. At iProteinDB, scientists can view the PTM landscape for any protein and identify predicted functional phosphosites based on a comparative analysis of data from closely-related species. Further, iProteinDB enables comparison of PTM data from Drosophila to that of orthologous proteins from other model organisms, including human, mouse, and rat.
Naoki Okamoto, Raghuvir Viswanatha, Riyan Bittar, Zhongchi Li, Sachiko Haga-Yamanaka, Norbert Perrimon, and Naoki Yamanaka. 2018. “A Membrane Transporter Is Required for Steroid Hormone Uptake in Drosophila.” Dev Cell, 47, 3, Pp. 294-305.e7.
Steroid hormones are a group of lipophilic hormones that are believed to enter cells by simple diffusion to regulate diverse physiological processes through intracellular nuclear receptors. Here, we challenge this model in Drosophila by demonstrating that Ecdysone Importer (EcI), a membrane transporter identified from two independent genetic screens, is involved in cellular uptake of the steroid hormone ecdysone. EcI encodes an organic anion transporting polypeptide of the evolutionarily conserved solute carrier organic anion superfamily. In vivo, EcI loss of function causes phenotypes indistinguishable from ecdysone- or ecdysone receptor (EcR)-deficient animals, and EcI knockdown inhibits cellular uptake of ecdysone. Furthermore, EcI regulates ecdysone signaling in a cell-autonomous manner and is both necessary and sufficient for inducing ecdysone-dependent gene expression in culture cells expressing EcR. Altogether, our results challenge the simple diffusion model for cellular uptake of ecdysone and may have wide implications for basic and medical aspects of steroid hormone studies.
Jeffrey J Hodgson, Nicolas Buchon, and Gary W Blissard. 2018. “Identification of insect genes involved in baculovirus AcMNPV entry into insect cells.” Virology, 527, Pp. 1-11.
The baculovirus Autographa californica multiple nucleopolyhedrovirus (AcMNPV) is a model enveloped DNA virus that infects and replicates in lepidopteran insect cells, and can efficiently enter a wide variety of non-host cells. Budded virions of AcMNPV enter cells by endocytosis and traffic to the nucleus where the virus initiates gene expression and genome replication. While trafficking of nucleocapsids by actin propulsion has been studied in detail, other important components of trafficking during entry remain poorly understood. We used a recombinant AcMNPV virus expressing an EGFP reporter in combination with an RNAi screen in Drosophila DL1 cells, to identify host proteins involved in AcMNPV entry. The RNAi screen targeted 86 genes involved in vesicular trafficking, including genes coding for VPS and ESCRT proteins, Rab GTPases, Exocyst proteins, and Clathrin adaptor proteins. We identified 24 genes required for efficient virus entry and reporter expression, and 4 genes that appear to restrict virus entry.
Yolande Grobler, Chi Y Yun, David J Kahler, Casey M Bergman, Hangnoh Lee, Brian Oliver, and Ruth Lehmann. 2018. “Whole genome screen reveals a novel relationship between Wolbachia levels and Drosophila host translation.” PLoS Pathog, 14, 11, Pp. e1007445.
Wolbachia is an intracellular bacterium that infects a remarkable range of insect hosts. Insects such as mosquitos act as vectors for many devastating human viruses such as Dengue, West Nile, and Zika. Remarkably, Wolbachia infection provides insect hosts with resistance to many arboviruses, thereby rendering the insects ineffective as vectors. To utilize Wolbachia effectively as a tool against vector-borne viruses, a better understanding of the host-Wolbachia relationship is needed. To investigate Wolbachia-insect interactions we used the Wolbachia/Drosophila model that provides a genetically tractable system for studying host-pathogen interactions. We coupled genome-wide RNAi screening with a novel high-throughput fluorescence in situ hybridization (FISH) assay to detect changes in Wolbachia levels in a Wolbachia-infected Drosophila cell line JW18. 1117 genes altered Wolbachia levels when knocked down by RNAi, of which 329 genes increased and 788 genes decreased the level of Wolbachia. Validation of hits included in-depth secondary screening using in vitro RNAi, Drosophila mutants, and Wolbachia detection by DNA qPCR. A diverse set of host gene networks was identified to regulate Wolbachia levels and unexpectedly revealed that perturbations of host translation components such as the ribosome and translation initiation factors result in increased Wolbachia levels both in vitro using RNAi and in vivo using mutants and a chemical-based translation inhibition assay. This work provides evidence for Wolbachia-host translation interaction and strengthens our general understanding of the Wolbachia-host intracellular relationship.
Raghuvir Viswanatha, Zhongchi Li, Yanhui Hu, and Norbert Perrimon. 7/27/2018. “Pooled genome-wide CRISPR screening for basal and context-specific fitness gene essentiality in Drosophila cells.” Elife, 7.
Genome-wide screens in Drosophila cells have offered numerous insights into gene function, yet a major limitation has been the inability to stably deliver large multiplexed DNA libraries to cultured cells allowing barcoded pooled screens. Here, we developed a site-specific integration strategy for library delivery and performed a genome-wide CRISPR knockout screen in S2R+ cells. Under basal growth conditions, 1235 genes were essential for cell fitness at a false-discovery rate of 5%, representing the highest-resolution fitness gene set yet assembled for Drosophila, including 407 genes which likely duplicated along the vertebrate lineage and whose orthologs were underrepresented in human CRISPR screens. We additionally performed context-specific fitness screens for resistance to or synergy with trametinib, a Ras/ERK/ETS inhibitor, or rapamycin, an mTOR inhibitor, and identified key regulators of each pathway. The results present a novel, scalable, and versatile platform for functional genomic screens in invertebrate cells.
Yanhui Hu, Arunachalam Vinayagam, Ankita Nand, Aram Comjean, Verena Chung, Tong Hao, Stephanie E Mohr, and Norbert Perrimon. 11/16/2017. “Molecular Interaction Search Tool (MIST): an integrated resource for mining gene and protein interaction data.” Nucleic Acids Res, 46, D1, Pp. D567-D574.
Model organism and human databases are rich with information about genetic and physical interactions. These data can be used to interpret and guide the analysis of results from new studies and develop new hypotheses. Here, we report the development of the Molecular Interaction Search Tool (MIST). The MIST database integrates biological interaction data from yeast, nematode, fly, zebrafish, frog, rat and mouse model systems, as well as human. For individual or short gene lists, the MIST user interface can be used to identify interacting partners based on protein-protein and genetic interaction (GI) data from the species of interest as well as inferred interactions, known as interologs, and to view a corresponding network. The data, interologs and search tools at MIST are also useful for analyzing 'omics datasets. In addition to describing the integrated database, we also demonstrate how MIST can be used to identify an appropriate cut-off value that balances false positive and negative discovery, and present use-cases for additional types of analysis. Altogether, the MIST database and search tools support visualization and navigation of existing protein and GI data, as well as comparison of new and existing data.
Stephanie E Mohr, Kirstin Rudd, Yanhui Hu, Wei R Song, Quentin Gilly, Michael Buckner, Benjamin E Housden, Colleen Kelley, Jonathan Zirin, Rong Tao, Gabriel Amador, Katarzyna Sierzputowska, Aram Comjean, and Norbert Perrimon. 12/9/2017. “Zinc Detoxification: A Functional Genomics and Transcriptomics Analysis in Drosophila melanogaster Cultured Cells.” G3 (Bethesda).
Cells require some metals, such as zinc and manganese, but excess levels of these metals can be toxic. As a result, cells have evolved complex mechanisms for maintaining metal homeostasis and surviving metal intoxication. Here, we present the results of a large-scale functional genomic screen in Drosophila cultured cells for modifiers of zinc chloride toxicity, together with transcriptomics data for wildtype or genetically zinc-sensitized cells challenged with mild zinc chloride supplementation. Altogether, we identified 47 genes for which knockdown conferred sensitivity or resistance to toxic zinc or manganese chloride treatment, and more than 1800 putative zinc-responsive genes. Analysis of the 'omics data points to the relevance of ion transporters, glutathione-related factors, and conserved disease-associated genes in zinc detoxification. Specific genes identified in the zinc screen include orthologs of human disease-associated genes CTNS, PTPRN (also known as IA-2), and ATP13A2 (also known as PARK9). We show that knockdown of red dog mine (rdog; CG11897), a candidate zinc detoxification gene encoding an ABCC-type transporter family protein related to yeast cadmium factor (YCF1), confers sensitivity to zinc intoxication in cultured cells and that rdog is transcriptionally up-regulated in response to zinc stress. As there are many links between the biology of zinc and other metals and human health, the 'omics datasets presented here provide a resource that will allow researchers to explore metal biology in the context of diverse health-relevant processes.
Eui Jae Sung, Masasuke Ryuda, Hitoshi Matsumoto, Outa Uryu, Masanori Ochiai, Molly E Cook, Na Young Yi, Huanchen Wang, James W Putney, Gary S Bird, Stephen B Shears, and Yoichi Hayakawa. 12/11/2017. “Cytokine signaling through Drosophila Mthl10 ties lifespan to environmental stress.” Proc Natl Acad Sci U S A.
A systems-level understanding of cytokine-mediated, intertissue signaling is one of the keys to developing fundamental insight into the links between aging and inflammation. Here, we employed Drosophila, a routine model for analysis of cytokine signaling pathways in higher animals, to identify a receptor for the growth-blocking peptide (GBP) cytokine. Having previously established that the phospholipase C/Ca2+ signaling pathway mediates innate immune responses to GBP, we conducted a dsRNA library screen for genes that modulate Ca2+ mobilization in Drosophila S3 cells. A hitherto orphan G protein coupled receptor, Methuselah-like receptor-10 (Mthl10), was a significant hit. Secondary screening confirmed specific binding of fluorophore-tagged GBP to both S3 cells and recombinant Mthl10-ectodomain. We discovered that the metabolic, immunological, and stress-protecting roles of GBP all interconnect through Mthl10. This we established by Mthl10 knockdown in three fly model systems: in hemocyte-like Drosophila S2 cells, Mthl10 knockdown decreases GBP-mediated innate immune responses; in larvae, Mthl10 knockdown decreases expression of antimicrobial peptides in response to low temperature; in adult flies, Mthl10 knockdown increases mortality rate following infection with Micrococcus luteus and reduces GBP-mediated secretion of insulin-like peptides. We further report that organismal fitness pays a price for the utilization of Mthl10 to integrate all of these various homeostatic attributes of GBP: We found that elevated GBP expression reduces lifespan. Conversely, Mthl10 knockdown extended lifespan. We describe how our data offer opportunities for further molecular interrogation of yin and yang between homeostasis and longevity.
Benjamin E Housden, Zhongchi Li, Colleen Kelley, Yuanli Wang, Yanhui Hu, Alexander J Valvezan, Brendan D Manning, and Norbert Perrimon. 11/28/2017. “Improved detection of synthetic lethal interactions in Drosophila cells using variable dose analysis (VDA).” Proc Natl Acad Sci U S A.
Synthetic sick or synthetic lethal (SS/L) screens are a powerful way to identify candidate drug targets to specifically kill tumor cells, but this approach generally suffers from low consistency between screens. We found that many SS/L interactions involve essential genes and are therefore detectable within a limited range of knockdown efficiency. Such interactions are often missed by overly efficient RNAi reagents. We therefore developed an assay that measures viability over a range of knockdown efficiency within a cell population. This method, called Variable Dose Analysis (VDA), is highly sensitive to viability phenotypes and reproducibly detects SS/L interactions. We applied the VDA method to search for SS/L interactions with TSC1 and TSC2, the two tumor suppressors underlying tuberous sclerosis complex (TSC), and generated a SS/L network for TSC. Using this network, we identified four Food and Drug Administration-approved drugs that selectively affect viability of TSC-deficient cells, representing promising candidates for repurposing to treat TSC-related tumors.
Ben Ewen-Campen, Stephanie E Mohr, Yanhui Hu, and Norbert Perrimon. 10/9/2017. “Accessing the Phenotype Gap: Enabling Systematic Investigation of Paralog Functional Complexity with CRISPR.” Dev Cell, 43, 1, Pp. 6-9.
Single-gene knockout experiments can fail to reveal function in the context of redundancy, which is frequently observed among duplicated genes (paralogs) with overlapping functions. We discuss the complexity associated with studying paralogs and outline how recent advances in CRISPR will help address the "phenotype gap" and impact biomedical research.
Julia Wang, Rami Al-Ouran, Yanhui Hu, Seon-Young Kim, Ying-Wooi Wan, Michael F Wangler, Shinya Yamamoto, Hsiao-Tuan Chao, Aram Comjean, Stephanie E Mohr, Norbert Perrimon, Zhandong Liu, and Hugo J Bellen. 6/1/2017. “MARRVEL: Integration of Human and Model Organism Genetic Resources to Facilitate Functional Annotation of the Human Genome.” Am J Hum Genet, 100, 6, Pp. 843-853.
One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publicly available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms is arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research.
Yanhui Hu, Aram Comjean, Stephanie E Mohr, The FlyBase Consortium, and Norbert Perrimon. 8/7/2017. “Gene2Function: An Integrated Online Resource for Gene Function Discovery.” G3 (Bethesda).
One of the most powerful ways to develop hypotheses regarding biological functions of conserved genes in a given species, such as in humans, is to first look at what is known about function in another species. Model organism databases (MODs) and other resources are rich with functional information but difficult to mine. Gene2Function (G2F) addresses a broad need by integrating information about conserved genes in a single online resource.
Yanhui Hu, Aram Comjean, Norbert Perrimon, and Stephanie E Mohr. 2/10/2017. “The Drosophila Gene Expression Tool (DGET) for expression analyses.” BMC Bioinformatics, 18, 1, Pp. 98.
BACKGROUND: Next-generation sequencing technologies have greatly increased our ability to identify gene expression levels, including at specific developmental stages and in specific tissues. Gene expression data can help researchers understand the diverse functions of genes and gene networks, as well as help in the design of specific and efficient functional studies, such as by helping researchers choose the most appropriate tissue for a study of a group of genes, or conversely, by limiting a long list of gene candidates to the subset that are normally expressed at a given stage or in a given tissue. RESULTS: We report DGET, a Drosophila Gene Expression Tool, which stores and facilitates search of RNA-Seq based expression profiles available from the modENCODE consortium and other public data sets. Using DGET, researchers are able to look up gene expression profiles, filter results based on threshold expression values, and compare expression data across different developmental stages, tissues and treatments. In addition, at DGET a researcher can analyze tissue or stage-specific enrichment for an inputted list of genes (e.g., 'hits' from a screen) and search for additional genes with similar expression patterns. We performed a number of analyses to demonstrate the quality and robustness of the resource. In particular, we show that evolutionarily conserved genes expressed at high or moderate levels in both fly and human tend to be expressed in similar tissues. Using DGET, we compared whole tissue profile and sub-region/cell-type specific datasets and estimated a potential source of false positives in one dataset. We also demonstrated the usefulness of DGET for synexpression studies by querying genes with expression profiles similar to that of the mesodermal master regulator Twist.
CONCLUSION: Altogether, DGET provides a flexible tool for expression data retrieval and analysis with short or long lists of Drosophila genes, which can help scientists to design stage- or tissue-specific in vivo studies and do other subsequent analyses.
Arunachalam Vinayagam, Travis E Gibson, Ho-Joon Lee, Bahar Yilmazel, Charles Roesel, Yanhui Hu, Young Kwon, Amitabh Sharma, Yang-Yu Liu, Norbert Perrimon, and Albert-László Barabási. 5/3/2016. “Controllability analysis of the directed human protein interaction network identifies disease genes and drug targets.” Proc Natl Acad Sci U S A, 113, 18, Pp. 4976-81.

The protein-protein interaction (PPI) network is crucial for cellular information processing and decision-making. With suitable inputs, PPI networks drive the cells to diverse functional outcomes such as cell proliferation or cell death. Here, we characterize the structural controllability of a large directed human PPI network comprising 6,339 proteins and 34,813 interactions. This network allows us to classify proteins as "indispensable," "neutral," or "dispensable," which correlates to increasing, no effect, or decreasing the number of driver nodes in the network upon removal of that protein. We find that 21% of the proteins in the PPI network are indispensable. Interestingly, these indispensable proteins are the primary targets of disease-causing mutations, human viruses, and drugs, suggesting that altering a network's control property is critical for the transition between healthy and disease states. Furthermore, analyzing copy number alterations data from 1,547 cancer patients reveals that 56 genes that are frequently amplified or deleted in nine different cancers are indispensable. Among the 56 genes, 46 of them have not been previously associated with cancer. This suggests that controllability analysis is very useful in identifying novel disease genes and potential drug targets.

Huajin Wang, Michel Becuwe, Benjamin E Housden, Chandramohan Chitraju, Ashley J Porras, Morven M Graham, Xinran N Liu, Abdou Rachid Thiam, David B Savage, Anil K Agarwal, Abhimanyu Garg, Maria-Jesus Olarte, Qingqing Lin, Florian Fröhlich, Hans Kristian Hannibal-Bach, Srigokul Upadhyayula, Norbert Perrimon, Tomas Kirchhausen, Christer S Ejsing, Tobias C Walther, and Robert V Farese. 2016. “Seipin is required for converting nascent to mature lipid droplets.” Elife, 5.

How proteins control the biogenesis of cellular lipid droplets (LDs) is poorly understood. Using Drosophila and human cells, we show here that seipin, an ER protein implicated in LD biology, mediates a discrete step in LD formation: the conversion of small, nascent LDs to larger, mature LDs. Seipin forms discrete and dynamic foci in the ER that interact with nascent LDs to enable their growth. In the absence of seipin, numerous small, nascent LDs accumulate near the ER and most often fail to grow. Those that do grow prematurely acquire lipid synthesis enzymes and undergo expansion, eventually leading to the giant LDs characteristic of seipin deficiency. Our studies identify a discrete step of LD formation, namely the conversion of nascent LDs to mature LDs, and define a molecular role for seipin in this process, most likely by acting at ER-LD contact sites to enable lipid transfer to nascent LDs.

Chen X and Xu L. 2016. “Genome-Wide RNAi Screening to Dissect the TGF-β Signal Transduction Pathway.” Methods in Molecular Biology.

The transforming growth factor-β (TGF-β) family of cytokines figures prominently in regulation of embryonic development and adult tissue homeostasis from Drosophila to mammals. Genetic defects affecting TGF-β signaling underlie developmental disorders and diseases such as cancer in human. Therefore, delineating the molecular mechanism by which TGF-β regulates cell biology is critical for understanding normal biology and disease mechanisms. Forward genetic screens in model organisms and biochemical approaches in mammalian tissue culture were instrumental in initial characterization of the TGF-β signal transduction pathway. With complete sequence information of the genomes and the advent of RNA interference (RNAi) technology, genome-wide RNAi screening emerged as a powerful functional genomics approach to systematically delineate molecular components of signal transduction pathways. Here, we describe a protocol for image-based whole-genome RNAi screening aimed at identifying molecules required for TGF-β signaling into the nucleus. Using this protocol we examined >90 % of annotated Drosophila open reading frames (ORF) individually and successfully uncovered several novel factors serving critical roles in the TGF-β pathway. Thus cell-based high-throughput functional genomics can uncover new mechanistic insights on signaling pathways beyond what the classical genetics had revealed.

Yanhui Hu, Aram Comjean, Charles Roesel, Arunachalam Vinayagam, Ian Flockhart, Jonathan Zirin, Lizabeth Perkins, Norbert Perrimon, and Stephanie E Mohr. 10/11/2016. “FlyRNAi.org—the database of the Drosophila RNAi screening center and transgenic RNAi project: 2017 update.” Nucleic Acids Research.

The FlyRNAi database of the Drosophila RNAi Screening Center (DRSC) and Transgenic RNAi Project (TRiP) at Harvard Medical School and associated DRSC/TRiP Functional Genomics Resources website serve as a reagent production tracking system, screen data repository, and portal to the community. Through this portal, we make available protocols, online tools, and other resources useful to researchers at all stages of high-throughput functional genomics screening, from assay design and reagent identification to data analysis and interpretation. In this update, we describe recent changes and additions to our website, database and suite of online tools. Recent changes reflect a shift in our focus from a single technology (RNAi) and model species (Drosophila) to the application of additional technologies (e.g. CRISPR) and support of integrated, cross-species approaches to uncovering gene function using functional genomics and other approaches.