One of the most powerful ways to develop hypotheses regarding biological functions of conserved genes in a given species, such as in humans, is to first look at what is known about function in another species. Model organism databases (MODs) and other resources are rich with functional information but difficult to mine. Gene2Function (G2F) addresses a broad need by integrating information about conserved genes in a single online resource.
One major challenge encountered with interpreting human genetic variants is the limited understanding of the functional impact of genetic alterations on biological processes. Furthermore, there remains an unmet demand for an efficient survey of the wealth of information on human homologs in model organisms across numerous databases. To efficiently assess the large volume of publically available information, it is important to provide a concise summary of the most relevant information in a rapid user-friendly format. To this end, we created MARRVEL (model organism aggregated resources for rare variant exploration). MARRVEL is a publicly available website that integrates information from six human genetic databases and seven model organism databases. For any given variant or gene, MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER. Importantly, it curates model organism-specific databases to concurrently display a concise summary regarding the human gene homologs in budding and fission yeast, worm, fly, fish, mouse, and rat on a single webpage. Experiment-based information on tissue expression, protein subcellular localization, biological process, and molecular function for the human gene and homologs in the seven model organisms are arranged into a concise output. Hence, rather than visiting multiple separate databases for variant and gene analysis, users can obtain important information by searching once through MARRVEL. Altogether, MARRVEL dramatically improves efficiency and accessibility to data collection and facilitates analysis of human genes and variants by cross-disciplinary integration of 18 million records available in public databases to facilitate clinical diagnosis and basic research.
Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, "meta" information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally.
BACKGROUND: Mapping of orthologous genes among species serves an important role in functional genomics by allowing researchers to develop hypotheses about gene function in one species based on what is known about the functions of orthologs in other species. Several tools for predicting orthologous gene relationships are available. However, these tools can give different results and identification of predicted orthologs is not always straightforward. RESULTS: We report a simple but effective tool, the Drosophila RNAi Screening Center Integrative Ortholog Prediction Tool (DIOPT; http://www.flyrnai.org/diopt), for rapid identification of orthologs. DIOPT integrates existing approaches, facilitating rapid identification of orthologs among human, mouse, zebrafish, C. elegans, Drosophila, and S. cerevisiae. As compared to individual tools, DIOPT shows increased sensitivity with only a modest decrease in specificity. Moreover, the flexibility built into the DIOPT graphical user interface allows researchers with different goals to appropriately 'cast a wide net' or limit results to highest confidence predictions. DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. This helps users identify the most appropriate matches among multiple possible orthologs. To facilitate using model organisms for functional analysis of human disease-associated genes, we used DIOPT to predict high-confidence orthologs of disease genes in Online Mendelian Inheritance in Man (OMIM) and genes in genome-wide association study (GWAS) data sets. The results are accessible through the DIOPT diseases and traits query tool (DIOPT-DIST; http://www.flyrnai.org/diopt-dist). CONCLUSIONS: DIOPT and DIOPT-DIST are useful resources for researchers working with model organisms, especially those who are interested in exploiting model organisms such as Drosophila to study the functions of human disease genes.
Damage initiates a pleiotropic cellular response aimed at cellular survival when appropriate. To identify genes required for damage survival, we used a cell-based RNAi screen against the Drosophila genome and the alkylating agent methyl methanesulphonate (MMS). Similar studies performed in other model organisms report that damage response may involve pleiotropic cellular processes other than the central DNA repair components, yet an intuitive systems level view of the cellular components required for damage survival, their interrelationship, and contextual importance has been lacking. Further, by comparing data from different model organisms, identification of conserved and presumably core survival components should be forthcoming. We identified 307 genes, representing 13 signaling, metabolic, or enzymatic pathways, affecting cellular survival of MMS-induced damage. As expected, the majority of these pathways are involved in DNA repair; however, several pathways with more diverse biological functions were also identified, including the TOR pathway, transcription, translation, proteasome, glutathione synthesis, ATP synthesis, and Notch signaling, and these were equally important in damage survival. Comparison with genomic screen data from Saccharomyces cerevisiae revealed no overlap enrichment of individual genes between the species, but a conservation of the pathways. To demonstrate the functional conservation of pathways, five were tested in Drosophila and mouse cells, with each pathway responding to alkylation damage in both species. Using the protein interactome, a significant level of connectivity was observed between Drosophila MMS survival proteins, suggesting a higher order relationship. This connectivity was dramatically improved by incorporating the components of the 13 identified pathways within the network. Grouping proteins into "pathway nodes" qualitatively improved the interactome organization, revealing a highly organized "MMS survival network." We conclude that identification of pathways can facilitate comparative biology analysis when direct gene/orthologue comparisons fail. A biologically intuitive, highly interconnected MMS survival network was revealed after we incorporated pathway data in our interactome analysis.