Data mining

Image of green fluorescence in GFP-tagged Drosophila cultured cells

New cell lines & new understandings using cutting-edge techniques

February 2, 2024

As a facility that supports large-scale screens in Drosophila and other insect cell lines, we get excited about reports of new Drosophila cell lines and related info.

We'd like to highlight two recent papers.

One report, a collaboration between Amanda Simcox's group, the DGRC, and our group here at the DRSC, describes new cell lines made in Amanda's group and characterized in a collaboration of the three groups. Muscle cells that pulse? Yes. That and other exciting new cell lines are reported in the publication below, and the cells are available at the DGRC...

Ashley Mae Conard, Nathaniel Goodman, Yanhui Hu, Norbert Perrimon, Ritambhara Singh, Charles Lawrence, and Erica Larschan. 2021. “TIMEOR: a web-based tool to uncover temporal regulatory mechanisms from multi-omics data.” Nucleic Acids Res, 49, W1, Pp. W641-W653.

Uncovering how transcription factors regulate their targets at DNA, RNA and protein levels over time is critical to define gene regulatory networks (GRNs) and assign mechanisms in normal and diseased states. RNA-seq is a standard method measuring gene regulation using an established set of analysis stages. However, none of the currently available pipeline methods for interpreting ordered genomic data (in time or space) use time-series models to assign cause and effect relationships within GRNs, are adaptive to diverse experimental designs, or enable user interpretation through a web-based platform. Furthermore, methods integrating ordered RNA-seq data with protein-DNA binding data to distinguish direct from indirect interactions are urgently needed. We present TIMEOR (Trajectory Inference and Mechanism Exploration with Omics data in R), the first web-based and adaptive time-series multi-omics pipeline method which infers the relationship between gene regulatory events across time. TIMEOR addresses the critical need for methods to determine causal regulatory mechanism networks by leveraging time-series RNA-seq, motif analysis, protein-DNA binding data, and protein-protein interaction networks. TIMEOR's user-catered approach helps non-coders generate new hypotheses and validate known mechanisms. We used TIMEOR to identify a novel link between insulin stimulation and the circadian rhythm cycle. TIMEOR is available at https://github.com/ashleymaeconard/TIMEOR.git and http://timeor.brown.edu.

Graphical image of tissue culture, fly pushing, and computer, and the team of people who work with them

DRSC/TRiP and DRSC-BTRR Office Hours

September 13, 2021

New this fall: Online office hours!

Do you have questions about modifying Drosophila cell lines with CRISPR or performing large-scale cell screens? Questions about in vivo RNAi with TRiP fly stocks or CRISPR knockout or activation with our sgRNA fly stocks? Questions about our new protocols and resources for CRISPR mosquito cell lines? Pop into our Zoom office hours to say hello and get our expert input! Registration is required (see below).

DRSC/TRiP & DRSC-BTRR Office Hours Schedule:

Mon. Sept. 27, 2021, 12...

Yanhui Hu, Sudhir Gopal Tattikota, Yifang Liu, Aram Comjean, Yue Gao, Corey Forman, Grace Kim, Jonathan Rodiger, Irene Papatheodorou, Gilberto Dos Santos, Stephanie E Mohr, and Norbert Perrimon. 2021. “DRscDB: A single-cell RNA-seq resource for data mining and data comparison across species.” Comput Struct Biotechnol J, 19, Pp. 2018-2026.

DRscDB.pdf

With the advent of single-cell RNA sequencing (scRNA-seq) technologies, there has been a spike in studies involving scRNA-seq of several tissues across diverse species including Drosophila. Although a few databases exist for users to query genes of interest within the scRNA-seq studies, search tools that enable users to find orthologous genes and their cell type-specific expression patterns across species are limited. Here, we built a new search database, DRscDB (https://www.flyrnai.org/tools/single_cell/web/), to address this need. DRscDB serves as a comprehensive repository for published scRNA-seq datasets for Drosophila and relevant datasets from human and other model organisms. DRscDB is based on manual curation of Drosophila scRNA-seq studies of various tissue types and their corresponding analogous tissues in vertebrates including zebrafish, mouse, and human. Of note, our search database provides most of the literature-derived marker genes, thus preserving the original analysis of the published scRNA-seq datasets. Finally, DRscDB serves as a web-based user interface that allows users to mine gene expression data from scRNA-seq studies and perform cell cluster enrichment analyses pertaining to various scRNA-seq studies, both within and across species.
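As a rough illustration of what a cell cluster enrichment analysis can look like, the sketch below scores the overlap between a query gene list and each cluster's marker genes with a one-sided hypergeometric test. This is a generic textbook approach, not necessarily the statistic DRscDB uses; the function names, cluster labels, and gene symbols are hypothetical, and the query genes are assumed to be drawn from the same background gene set as the markers.

```python
from math import comb


def enrichment_p_value(overlap, query_size, marker_size, background_size):
    """Return P(X >= overlap) under a hypergeometric null model.

    Assumption: query genes and marker genes both come from one
    background set of `background_size` genes.
    """
    total = comb(background_size, query_size)
    upper = min(query_size, marker_size)
    tail = sum(
        comb(marker_size, i) * comb(background_size - marker_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def rank_clusters(query_genes, cluster_markers, background_size):
    """Rank clusters by enrichment of the query genes among their markers."""
    query = set(query_genes)
    results = []
    for cluster, markers in cluster_markers.items():
        marker_set = set(markers)
        overlap = len(query & marker_set)
        p = enrichment_p_value(overlap, len(query), len(marker_set), background_size)
        results.append((cluster, overlap, p))
    # Smallest p-value (strongest enrichment) first.
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy data: two clusters and a four-gene background.
toy_markers = {"cluster_1": ["geneA", "geneB"], "cluster_2": ["geneX", "geneY"]}
ranked = rank_clusters(["geneA", "geneB"], toy_markers, background_size=4)
```

In practice, p-values from many clusters would also need a multiple-testing correction, which this sketch leaves out.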

A.M. Conard, N. Goodman, Y. Hu, N. Perrimon, R. Singh, C. Lawrence, and E. Larschan. 9/15/2020. “TIMEOR: a web-based tool to uncover temporal regulatory mechanisms from multi-omics data [NOTE: A modified final version was published in NAR and is now available].” bioRxiv.

2020.09.14.296418v1.full_.pdf

Uncovering how transcription factors (TFs) regulate their targets at the DNA, RNA and protein levels over time is critical to define gene regulatory networks (GRNs) in normal and diseased states. RNA-seq has become a standard method to measure gene regulation using an established set of analysis steps. However, none of the currently available pipeline methods for interpreting ordered genomic data (in time or space) use time series models to assign cause and effect relationships within GRNs, are adaptive to diverse experimental designs, or enable user interpretation through a web-based platform. Furthermore, methods which integrate ordered RNA-seq data with transcription factor binding data are urgently needed. Here, we present TIMEOR (Trajectory Inference and Mechanism Exploration with Omics data in R), the first web-based and adaptive time series multi-omics pipeline method which infers the relationship between gene regulatory events across time. TIMEOR addresses the critical need for methods to predict causal regulatory mechanism networks between TFs from time series multi-omics data. We used TIMEOR to identify a new link between insulin stimulation and the circadian rhythm cycle. TIMEOR is available at https://github.com/ashleymaeconard/TIMEOR.git.

Screenshot image of our Online Tools Overview page

New publications from the DRSC bioinformatics team

October 28, 2020

The DRSC bioinformatics team, led by Dr. Claire Yanhui Hu, has recently published two new papers.

One reports development of BioLitMine, an advanced literature mining resource. The other provides an overview of our online resources, which can be grouped into reagent, gene, and data-focused resources.

As a supplement to these publications, we...

Yanhui Hu, Aram Comjean, Jonathan Rodiger, Yifang Liu, Yue Gao, Verena Chung, Jonathan Zirin, Norbert Perrimon, and Stephanie E Mohr. 2020. “FlyRNAi.org-the database of the Drosophila RNAi screening center and transgenic RNAi project: 2021 update.” Nucleic Acids Res.

gkaa936.pdf

The FlyRNAi database at the Drosophila RNAi Screening Center and Transgenic RNAi Project (DRSC/TRiP) provides a suite of online resources that facilitate functional genomics studies with a special emphasis on Drosophila melanogaster. Currently, the database provides: gene-centric resources that facilitate ortholog mapping and mining of information about orthologs in common genetic model species; reagent-centric resources that help researchers identify RNAi and CRISPR sgRNA reagents or designs; and data-centric resources that facilitate visualization and mining of transcriptomics data, protein modification data, protein interactions, and more. Here, we discuss updated and new features that help biological and biomedical researchers efficiently identify, visualize, analyze, and integrate information and data for Drosophila and other species. Together, these resources facilitate multiple steps in functional genomics workflows, from building gene and reagent lists to management, analysis, and integration of data.

Yanhui Hu, Verena Chung, Aram Comjean, Jonathan Rodiger, Fnu Nipun, Norbert Perrimon, and Stephanie E Mohr. 2020. “BioLitMine: Advanced Mining of Biomedical and Biological Literature About Human Genes and Genes from Major Model Organisms.” G3 (Bethesda).

4531.full_.pdf

The accumulation of biological and biomedical literature outpaces the ability of most researchers and clinicians to stay abreast of their own immediate fields, let alone a broader range of topics. Although available search tools support identification of relevant literature, finding relevant and key publications is not always straightforward. For example, important publications might be missed in searches with an official gene name due to gene synonyms. Moreover, ambiguity of gene names can result in retrieval of a large number of irrelevant publications. To address these issues and help researchers and physicians quickly identify relevant publications, we developed BioLitMine, an advanced literature mining tool that takes advantage of the medical subject heading (MeSH) index and gene-to-publication annotations already available for PubMed literature. Using BioLitMine, a user can identify what MeSH terms are represented in the set of publications associated with a given gene of interest, or start with a term and identify relevant publications. Users can also use the tool to find co-cited genes and build a literature co-citation network. In addition, BioLitMine can help users build a gene list relevant to a MeSH term, such as a list of genes relevant to "stem cells" or "breast neoplasms." Users can also start with a gene or pathway of interest and identify authors associated with that gene or pathway, a feature that makes it easier to identify experts who might serve as collaborators or reviewers. Altogether, BioLitMine extends the value of PubMed-indexed literature and its existing expert curation by providing a robust and gene-centric approach to retrieval of relevant information.
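To make the idea of a literature co-citation network concrete, here is a minimal sketch that starts from gene-to-publication annotations and counts how many publications each pair of genes shares. It illustrates the general technique only and is not BioLitMine's implementation; the function name, gene symbols, and publication identifiers are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_citation_counts(gene_to_pubs):
    """Count shared publications for every pair of genes.

    `gene_to_pubs` maps a gene symbol to an iterable of publication IDs
    (hypothetical input format). Returns {(gene_a, gene_b): count} with
    gene_a < gene_b alphabetically.
    """
    # Invert the annotation: publication -> set of genes it mentions.
    pub_to_genes = defaultdict(set)
    for gene, pubs in gene_to_pubs.items():
        for pub in pubs:
            pub_to_genes[pub].add(gene)

    # Every pair of genes annotated to the same publication gains one count.
    edges = defaultdict(int)
    for genes in pub_to_genes.values():
        for gene_a, gene_b in combinations(sorted(genes), 2):
            edges[(gene_a, gene_b)] += 1
    return dict(edges)


# Hypothetical toy annotations.
toy_annotations = {
    "geneA": ["pub1", "pub2", "pub3"],
    "geneB": ["pub2", "pub3"],
    "geneC": ["pub3"],
}
network = co_citation_counts(toy_annotations)
```

The resulting edge weights could then be thresholded or passed to a graph library for visualization; that step is omitted here.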

Image of an anesthetized male Drosophila fruit fly

DRSC/TRiP presentations from June 2020 Boston Area Drosophila Meeting

June 12, 2020

Did you miss the presentations from Claire Hu and Jonathan Zirin at the June 2020 Boston Area Drosophila Meeting? No problem! The slides can be accessed from this post. Click the title above to view the whole post, then scroll down to access the PDFs. These presentations describe what's new and next in bioinformatics and in vivo technologies at the DRSC/TRiP. Feel free to reach out with questions. Interested in the BAD meeting? Info about the meeting can be found here.

Wilinski and colleagues release "FlyScape" for metabolic network visualization

November 7, 2019

The DRSC congratulates Wilinski et al. at the University of Michigan for their release and publication of FlyScape, a tool for metabolic network visualization.

Rapid metabolic shifts occur during the transition between hunger and satiety in Drosophila melanogaster

Daniel Wilinski, Jasmine Winzeler, William Duren, Jenna L. Persons, Kristina J. Holme, Johan Mosquera, Morteza Khabiri, Jason M. Kinchen, Peter L. Freddolino, Alla Karnovsky & Monica Dus

...

Screenshot of a results page from iProteinDB

DRSC Bioinformatics launches iProteinDB and BioLitMine

August 29, 2018

We have two new online tools available: iProteinDB and BioLitMine.

Both continue our series of resources aimed at integrating existing information in new ways to facilitate data mining and development of new hypotheses. In addition, iProteinDB includes a new dataset related to post-translational modification (in this case,...

Missed us at ADRC 2018? View our workshop slides!

April 19, 2018

Thank you to all those who attended our workshop at last week's Annual Drosophila Research Conference in Philadelphia, PA, USA. It was great to talk fly stocks, cell screens, and bioinformatics with the community. We are here to help and look forward to continued feedback on the resources we are building to empower your research. PDFs of our workshop presentations are attached to this news item. The slides will help you learn more about our in vivo resources for CRISPR, new pooled cell-based CRISPR screen technology, and bioinformatics resources at our facility. Feel free to contact...

Cartoon of essential gene pooled screen (made using BioRender.io)

Pooled-format CRISPR screens in Drosophila cells

March 22, 2018

The DRSC/TRiP-FGR is pleased to support collaborations on pooled CRISPR screens using the method recently reported in eLife by Viswanatha et al. (PDF download file below).

From the abstract: "... Here, we developed a site-specific integration strategy for library delivery and performed a genome-wide CRISPR knockout screen in Drosophila S2R+ cells. Under basal growth conditions, 1235 genes were essential for cell fitness...

Yanhui Hu, Arunachalam Vinayagam, Ankita Nand, Aram Comjean, Verena Chung, Tong Hao, Stephanie E Mohr, and Norbert Perrimon. 11/16/2017. “Molecular Interaction Search Tool (MIST): an integrated resource for mining gene and protein interaction data.” Nucleic Acids Res, 46, D1, Pp. D567-D574.

gkx1116.pdf

Model organism and human databases are rich with information about genetic and physical interactions. These data can be used to interpret and guide the analysis of results from new studies and develop new hypotheses. Here, we report the development of the Molecular Interaction Search Tool (MIST; http://fgrtools.hms.harvard.edu/MIST/). The MIST database integrates biological interaction data from yeast, nematode, fly, zebrafish, frog, rat and mouse model systems, as well as human. For individual or short gene lists, the MIST user interface can be used to identify interacting partners based on protein-protein and genetic interaction (GI) data from the species of interest as well as inferred interactions, known as interologs, and to view a corresponding network. The data, interologs and search tools at MIST are also useful for analyzing 'omics datasets. In addition to describing the integrated database, we also demonstrate how MIST can be used to identify an appropriate cut-off value that balances false positive and negative discovery, and present use-cases for additional types of analysis. Altogether, the MIST database and search tools support visualization and navigation of existing protein and GI data, as well as comparison of new and existing data.
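The notion of an interolog, an interaction inferred in one species from a known interaction between orthologous genes in another, can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not MIST's actual pipeline: the function name, the input formats, and the gene identifiers are hypothetical, and self-interactions are dropped as a design choice.

```python
def infer_interologs(source_interactions, orthologs):
    """Map known interactions in a source species onto a target species.

    `source_interactions` is an iterable of (gene_a, gene_b) pairs in the
    source species. `orthologs` maps a source gene to a list of target
    genes (hypothetical input format). Returns a set of sorted target
    gene pairs.
    """
    inferred = set()
    for gene_a, gene_b in source_interactions:
        # Genes without a mapped ortholog contribute no inferred pairs.
        for target_a in orthologs.get(gene_a, ()):
            for target_b in orthologs.get(gene_b, ()):
                if target_a != target_b:  # skip self-interactions
                    inferred.add(tuple(sorted((target_a, target_b))))
    return inferred


# Hypothetical toy data: two source interactions and a partial ortholog map.
toy_interactions = [("h1", "h2"), ("h2", "h3")]
toy_orthologs = {"h1": ["f1"], "h2": ["f2", "f2b"]}
interologs = infer_interologs(toy_interactions, toy_orthologs)
```

A real resource would also need to weigh ortholog confidence and track the evidence behind each source interaction, which this sketch ignores.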

2018 Apr 13

DRSC & TRiP Workshop at ADRC

1:45pm to 3:45pm

Location:

Philadelphia, PA, USA

The DRSC & TRiP will be hosting a workshop at the Annual Drosophila Research Conference in Philadelphia, PA. The workshop is scheduled for Friday, April 13th from 1:45 to 3:45 PM. Come hear from DRSC & TRiP leaders Norbert Perrimon, Jonathan Zirin (organizer), Claire Yanhui Hu, and Stephanie Mohr. At the workshop, you will learn about new opportunities for community nomination and experiments using CRISPR knockout and activation, as well as learn what's new and popular among our online software and database tools. There will be something for everyone -- we will provide information...

DRSC/TRiP Functional Genomics Resources & DRSC-BTRR