DIOPT Documentation

URL

http://www.flyrnai.org/diopt

About DIOPT

The identification of orthologs is commonly used for bioinformatics activities such as data mining and establishing models for human diseases. Moreover, our group notes that researchers analyzing the results of screens performed at the Drosophila RNAi Screening Center (DRSC) frequently wish to identify mammalian orthologs of the fly genes that were "hits" (positive results) in their screens.

In helping DRSC screeners to identify orthologs using existing tools and algorithms, we recognized a need for a user-friendly approach to viewing and comparing ortholog predictions obtained using different tools and algorithms. This was our motivation in developing DIOPT. To facilitate identification of orthologs specifically of human disease-associated genes, we further developed DIOPT-DIST. Information about our approaches to development of both tools is summarized below.

The DIOPT Approach

Many tools have emerged to meet the need to identify orthologs. However, the low coverage and heterogeneity of these tools present an obstacle to scientists who want to identify one or a few highest-confidence orthologs for a given gene of interest or, conversely, want to cast a wide net and follow up on all possible orthologs of a gene.

Our goal is to provide an easy-to-use resource that facilitates summary and comparison of, and access to, various sources of ortholog predictions. DIOPT integrates human, mouse, fly, worm, zebrafish and yeast ortholog predictions made by Ensembl Compara, HomoloGene, Inparanoid, Isobase, OMA, orthoMCL, Phylome, RoundUp, and TreeFam. DIOPT lets users find ortholog pairs for a specified gene or genes as identified by one, many or all of these published approaches. This provides a streamlined method for integrating, comparing and accessing orthology predictions originating from algorithms based on sequence homology, phylogenetic trees, and functional similarity. DIOPT calculates a simple score indicating the number of tools that support a given orthologous gene-pair relationship, as well as a weighted score based on functional assessment using high quality GO molecular function annotation of all fly-human orthologous pairs predicted by each tool. Differences in the algorithms used by each tool to predict orthologous relationships are one source of difference between the sets of predictions made by one tool versus another. However, we also note that some of these differences might be attributable to the use of different genome annotation releases by different tools, and that not all tools cover all of the species that we include in DIOPT (see Tables 1, 2 and 3).

DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. These should help you to identify the most appropriate matches among multiple possible orthologs.
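As an illustration of what a percent amino acid identity figure expresses, the following minimal Python sketch computes identity over a pairwise alignment. The function name, the gap character and the choice of denominator (all aligned columns except those where both sequences have a gap) are assumptions made for this example; the documentation does not state which definition DIOPT itself uses, and the sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    ASSUMPTION: the denominator is the number of alignment columns, skipping
    columns where both sequences have a gap. Other definitions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    # A column counts as a match only if the residues agree and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Invented toy alignment: 9 columns, 7 of them identical.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")
print(f"{identity:.1f}% identity")
```

When several candidate orthologs have similar DIOPT scores, a figure of this kind is one way to compare how closely each candidate matches the query protein.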

The following summary figures and tables help to explain our approach and summarize the tools and algorithms included in DIOPT.

Figure 1: Summary of the DIOPT approach to integration of results from multiple ortholog prediction tools and algorithms (DIOPT integration schema). Tools based on sequence alignment are shown in green, tools based on evolutionary relationships in purple, and a tool that incorporates protein-protein interaction network data into ortholog predictions in orange.

Table 1: Summary Information and Publications for the Tools Integrated in DIOPT

Prediction Method | Source | Prediction Algorithm | Coverage | DIOPT Weight* | PMID
Compara | Ensembl | Phylogenetic approach | 70 species (vs. 81) | 0.931 | 19029536
Homologene | NCBI | Combination of BBH*, tree and synteny | 21 species (vs. 68) | 1 | 11125071
Inparanoid | Stockholm University, Sweden | BBH* approach to identify orthologs and in-paralogs | 273 species (vs. 8) | 1.005 | 11743721, 25429972
Isobase | MIT | Sequence and PPI* network alignments | 5 species (vs. 2, Nov. 2014) | 0.957 | 21177658
OMA | CBRG, ETH Zurich | BBH*, global sequence alignments | 1706 species (Oct 2014) | 1.019 | 17545180
OrthoDB | University of Geneva | Phylogenetic approach | 3027 species (vs. 8) | 1.001 | 20972218, 25428351
orthoMCL | University of Pennsylvania | Markov Cluster algorithm | 150 species (vs. 5) | 0.903 | 12952885
Phylome | Centre for Genomic Regulation (CRG), Spain | Reconstruction of evolutionary histories of all genes in a genome, also known as phylome | 1059 species, 120 Phylomes (vs. 4) | 0.912 | 17962297, 24275491
RoundUp | Harvard Medical School | RSD*, modified BBH* | 2044 species (Apr 2013) | 1.003 | 16777906
TreeFam | Wellcome Trust Sanger Institute | Manually curated based on trees | 109 species (vs. 9) | 0.963 | 16381935, 24194607
Panther | University of Southern California | Phylogenetic approach | 79 species | 1.1 | 26578592
HGNC | European Bioinformatics Institute (EMBL-EBI) | Manually curated | 2 species | 1.5 |
ZFIN | Zebrafish Model Organism Database | Sequence similarity analysis and manual curation | 4 species | 1.5 |
eggNOG | EMBL, Germany | Graph-based algorithms | 2031 species (vs. 4.5) | 0.9 | 26582926

* DIOPT weights are based on the mean semantic similarity of high quality GO molecular function annotation of all fly-human orthologous pairs predicted by each tool.
   BBH, Best Blast Hits
   RSD, Reciprocal Smallest Distance
   PPI, Protein-Protein Interactions
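The two scores described above can be sketched in Python as follows. The weights are copied from the DIOPT Weight column of Table 1 (only a subset of tools is shown). The function names are hypothetical, and combining the weights by simple summation is an assumption made for illustration; the documentation says only that the weighted score is based on these weights, not how they are combined.

```python
# Weights taken from the "DIOPT Weight" column of Table 1 (subset only).
TOOL_WEIGHTS = {
    "Compara": 0.931,
    "Inparanoid": 1.005,
    "OMA": 1.019,
    "orthoMCL": 0.903,
    "HGNC": 1.5,
}


def simple_score(supporting_tools):
    """Number of distinct tools that predict the ortholog pair."""
    return len(set(supporting_tools))


def weighted_score(supporting_tools, weights=TOOL_WEIGHTS):
    """Weighted support for the ortholog pair.

    ASSUMPTION: the weighted score is the sum of the weights of the
    supporting tools. This is an illustrative choice, not a documented formula.
    """
    return sum(weights[tool] for tool in set(supporting_tools))


# Hypothetical gene pair supported by three of the tools above.
support = ["Compara", "OMA", "HGNC"]
print(simple_score(support))              # number of supporting tools
print(round(weighted_score(support), 3))  # summed weights
```

Under this assumption, a manually curated source such as HGNC (weight 1.5) contributes more to the weighted score than a tool with a weight below 1, while the simple score counts every tool equally.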

Table 2A: Genome Release Information for the Tools Integrated in DIOPT

Prediction Method | Worm | Fish | Fly | Human | Mouse | Yeast | Fission Yeast | Frog | Rat
Compara | WBcel235 | GRCz10 | BDGP6 | GRCh38.p3 | GRCm38.p4 | R64-1-1 | | JGI 4.2 | Rnor_6.0
Homologene | WS195 | Zv9 | FlyBase r5.48 | GRCh38 | GRCm38.p2 | R64-1-1 | ASM294v2 | | Rnor_5.0
OMA | Ensembl v73 WBcel235 | Ensembl v80 GRCz10 | Ensembl v84 BDGP6 | Ensembl v80 GRCh38 | Ensembl v84 GRCm38 | Ensembl v73 (EF4) | Ensembl Fungi v22 (ASM294v2) | Ensembl v73 (JGI_4.2) | Ensembl v83 (Rnor_6.0)
Inparanoid | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013
Isobase | Ensembl v59 | NA | Ensembl v59 | Ensembl v59 | Ensembl v59 | Ensembl v59 | | |
orthoMCL | WS206 | Zv8.56 | BDGP5.13.56 | GRCh37.56 | NCBI v37.56 | FungiDB | GenBank | | Ensembl v53
orthoDB | | Ensembl v75 | FlyBase r5.55 | Ensembl v75 | Ensembl v75 | UniProt Feb 2014 | UniProt Feb 2014 | Ensembl v75 | Ensembl v75
RoundUp | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 |
TreeFam | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69
Phylome | | | | | | | | |
Panther | WormBase Apr 2015 | Ensembl Apr 2015 | FlyBase Apr 2015 | Ensembl Apr 2015 | MGI Apr 2015 | SGD Apr 2015 | PomBase Apr 2015 | Gene Apr 2015 | RGD Apr 2015
HGNC | | | | HGNC Feb 2016 | HGNC Feb 2016 | | | |
ZFIN | | ZFIN May 2016 | ZFIN May 2016 | ZFIN May 2016 | ZFIN May 2016 | | | |
eggNOG | | | | | | | | |


 

Table 2B: Additional Information About Genome Releases

Other Resource | Version
WormBase | release 255
FlyBase | release 6.13
RefSeq | release 78
EntrezGene | 26-Oct-16


 

Table 3: Maximum DIOPT Score for Each Orthologous Relationship

 

Orthologous Relationship | Max score | Type | Relevant Tools
baker's yeast-baker's yeast | 9 | paralog | Inparanoid, Homologene, eggNOG, Compara, Isobase, OrthoDB, Panther, orthoMCL, RoundUp
baker's yeast-fish | 10 | ortholog | Homologene, TreeFam, RoundUp, Phylome, Panther, orthoMCL, Inparanoid, eggNOG, Compara, OMA
baker's yeast-fission yeast | 9 | ortholog | Panther, Phylome, TreeFam, OrthoDB, Inparanoid, Homologene, RoundUp, OMA, orthoMCL
baker's yeast-fly | 11 | ortholog | OMA, orthoMCL, Panther, Phylome, RoundUp, TreeFam, Inparanoid, Homologene, Isobase, Compara, eggNOG
baker's yeast-frog | 8 | ortholog | TreeFam, Phylome, RoundUp, OMA, Inparanoid, Homologene, eggNOG, Compara
baker's yeast-human | 11 | ortholog | Compara, Panther, TreeFam, Phylome, orthoMCL, OMA, Isobase, Inparanoid, Homologene, eggNOG, RoundUp
baker's yeast-mouse | 11 | ortholog | TreeFam, Compara, eggNOG, Homologene, Inparanoid, Isobase, OMA, orthoMCL, Panther, RoundUp, Phylome
baker's yeast-rat | 9 | ortholog | TreeFam, Phylome, Panther, orthoMCL, OMA, Inparanoid, Compara, Homologene, eggNOG
baker's yeast-worm | 11 | ortholog | Phylome, Panther, orthoMCL, OMA, Compara, eggNOG, Homologene, Inparanoid, Isobase, TreeFam, RoundUp
fish-fish | 8 | paralog | Inparanoid, Homologene, Compara, eggNOG, orthoMCL, Panther, RoundUp, OrthoDB
fish-fission yeast | 8 | ortholog | Homologene, Inparanoid, OMA, orthoMCL, Panther, TreeFam, RoundUp, Phylome
fish-fly | 12 | ortholog | Phylome, RoundUp, ZFIN, Panther, TreeFam, orthoMCL, Compara, eggNOG, Homologene, Inparanoid, OMA, OrthoDB
fish-frog | 9 | ortholog | RoundUp, TreeFam, OrthoDB, OMA, Phylome, Compara, eggNOG, Homologene, Inparanoid
fish-human | 12 | ortholog | Phylome, Compara, eggNOG, Homologene, Inparanoid, OMA, OrthoDB, Panther, RoundUp, TreeFam, ZFIN, orthoMCL
fish-mouse | 12 | ortholog | Compara, eggNOG, Phylome, ZFIN, RoundUp, Panther, orthoMCL, OrthoDB, OMA, Inparanoid, Homologene, TreeFam
fish-rat | 10 | ortholog | orthoMCL, OMA, eggNOG, Homologene, Inparanoid, OrthoDB, Panther, Phylome, TreeFam, Compara
fish-worm | 11 | ortholog | OMA, Inparanoid, OrthoDB, Homologene, eggNOG, TreeFam, RoundUp, Phylome, orthoMCL, Panther, Compara
fission yeast-fission yeast | 6 | paralog | RoundUp, Inparanoid, OrthoDB, Panther, orthoMCL, Homologene
fission yeast-fly | 8 | ortholog | Homologene, Inparanoid, OMA, orthoMCL, Panther, Phylome, RoundUp, TreeFam
fission yeast-frog | 6 | ortholog | Homologene, Inparanoid, OMA, Phylome, RoundUp, TreeFam
fission yeast-human | 8 | ortholog | Homologene, TreeFam, OMA, orthoMCL, Panther, Phylome, RoundUp, Inparanoid
fission yeast-mouse | 8 | ortholog | RoundUp, TreeFam, Homologene, Phylome, OMA, orthoMCL, Inparanoid, Panther
fission yeast-rat | 7 | ortholog | TreeFam, orthoMCL, Phylome, Panther, Homologene, Inparanoid, OMA
fission yeast-worm | 8 | ortholog | OMA, orthoMCL, Panther, Phylome, Homologene, RoundUp, Inparanoid, TreeFam
fly-fly | 9 | paralog | Isobase, RoundUp, Panther, OrthoDB, Inparanoid, Homologene, eggNOG, Compara, orthoMCL
fly-frog | 9 | ortholog | Inparanoid, eggNOG, OrthoDB, Compara, OMA, Phylome, Homologene, RoundUp, TreeFam
fly-human | 12 | ortholog | OMA, Isobase, Inparanoid, Homologene, eggNOG, Panther, Compara, OrthoDB, orthoMCL, TreeFam, RoundUp, Phylome
fly-mouse | 12 | ortholog | RoundUp, TreeFam, Phylome, Isobase, eggNOG, OrthoDB, Compara, Panther, Homologene, Inparanoid, OMA, orthoMCL
fly-rat | 10 | ortholog | Panther, orthoMCL, OrthoDB, OMA, Inparanoid, Homologene, eggNOG, Phylome, Compara, TreeFam
fly-worm | 12 | ortholog | Inparanoid, OrthoDB, orthoMCL, Panther, Phylome, RoundUp, TreeFam, Isobase, Homologene, eggNOG, Compara, OMA
frog-frog | 6 | paralog | Inparanoid, eggNOG, Compara, Homologene, OrthoDB, RoundUp
frog-human | 9 | ortholog | eggNOG, TreeFam, Compara, Homologene, Inparanoid, OMA, OrthoDB, Phylome, RoundUp
frog-mouse | 9 | ortholog | TreeFam, RoundUp, Compara, eggNOG, Homologene, Inparanoid, OMA, OrthoDB, Phylome
frog-rat | 8 | ortholog | OrthoDB, TreeFam, Phylome, OMA, Inparanoid, eggNOG, Compara, Homologene
frog-worm | 9 | ortholog | OMA, TreeFam, Inparanoid, Homologene, eggNOG, Compara, RoundUp, Phylome, OrthoDB
human-human | 9 | paralog | Compara, RoundUp, eggNOG, Inparanoid, OrthoDB, orthoMCL, Panther, Homologene, Isobase
human-mouse | 13 | ortholog | Inparanoid, TreeFam, RoundUp, Phylome, Panther, orthoMCL, OrthoDB, OMA, Isobase, Homologene, Compara, eggNOG, HGNC
human-rat | 11 | ortholog | TreeFam, Phylome, Panther, HGNC, Inparanoid, orthoMCL, Compara, eggNOG, Homologene, OMA, OrthoDB
human-worm | 12 | ortholog | Inparanoid, OrthoDB, Homologene, eggNOG, Compara, Isobase, TreeFam, RoundUp, Phylome, Panther, orthoMCL, OMA
mouse-mouse | 9 | paralog | Panther, eggNOG, RoundUp, Inparanoid, orthoMCL, OrthoDB, Isobase, Homologene, Compara
mouse-rat | 10 | ortholog | Phylome, TreeFam, Panther, orthoMCL, OrthoDB, OMA, Inparanoid, Homologene, eggNOG, Compara
mouse-worm | 12 | ortholog | OMA, OrthoDB, orthoMCL, Panther, Phylome, TreeFam, eggNOG, RoundUp, Inparanoid, Homologene, Compara, Isobase
rat-rat | 7 | paralog | orthoMCL, OrthoDB, Inparanoid, Homologene, eggNOG, Compara, Panther
rat-worm | 10 | ortholog | TreeFam, Phylome, Panther, orthoMCL, OrthoDB, OMA, Inparanoid, Homologene, eggNOG, Compara
worm-worm | 9 | paralog | Compara, Homologene, RoundUp, Panther, orthoMCL, OrthoDB, Isobase, Inparanoid, eggNOG
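Because the maximum score differs between species pairs, the same DIOPT score can represent different levels of agreement. The Python sketch below expresses a score as a fraction of the Table 3 maximum for that pair. The three maxima are copied from Table 3; the function name and the idea of normalizing in this way are assumptions made for illustration, not a feature of DIOPT described in this document.

```python
# Maximum scores copied from Table 3 (three rows only).
# frozenset keys make the lookup independent of species order.
MAX_SCORE = {
    frozenset(["fly", "human"]): 12,
    frozenset(["human", "mouse"]): 13,
    frozenset(["fission yeast", "frog"]): 6,
}


def fraction_of_max(species_a, species_b, score):
    """Return score divided by the maximum possible score for the pair."""
    max_score = MAX_SCORE[frozenset([species_a, species_b])]
    if not 0 <= score <= max_score:
        raise ValueError(f"score must be between 0 and {max_score}")
    return score / max_score


# A score of 6 is half of the tools for fly-human,
# but all of the tools for fission yeast-frog.
print(fraction_of_max("human", "fly", 6))
print(fraction_of_max("frog", "fission yeast", 6))
```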

 

 

Version information

 

6.0 - Dec 2016

  • Updated data sources
  • Added eggNOG as a source
  • Added paralogs

5.5 - Oct 2016 - Added multi-sequence alignment from target "All" heatmap

5.4 - Sept 2016 - Added target species "All" and new filter

 

5.3 - May 2016 - Added more prediction tools (Panther, HGNC and ZFIN)

5.2.1 - April 2016 - Added orthologous rank

  • High: best score both ways AND DIOPT score >=2
  • Moderate: (best score forward or reverse) AND DIOPT score >=2, or DIOPT score >=4
  • Low: all others

5.2 - April 2016 - Added new species (Rattus norvegicus)

5.1.1 - December 2015 - Added Best forward and reverse columns

5.1 - November 2015 - Upgraded gene matching algorithm

5.0 - November 2015 - Upgraded data sources to version 5
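
The rank rules introduced in version 5.2.1 can be sketched in Python as follows. The function and parameter names are hypothetical. Two readings are assumed: that "best score both ways" means the pair has the best score in both the forward and the reverse search, and that the two Moderate conditions are alternatives (either one is sufficient).

```python
def diopt_rank(score, best_forward, best_reverse):
    """Classify an ortholog pair as High, Moderate or Low.

    score        -- DIOPT score (number of supporting tools)
    best_forward -- True if the pair has the best score in the forward search
    best_reverse -- True if the pair has the best score in the reverse search
    """
    # High: best score both ways AND DIOPT score >= 2
    if best_forward and best_reverse and score >= 2:
        return "High"
    # Moderate: best score one way AND score >= 2, or score >= 4.
    # ASSUMPTION: the two Moderate conditions are joined by OR.
    if ((best_forward or best_reverse) and score >= 2) or score >= 4:
        return "Moderate"
    # Low: all others
    return "Low"


print(diopt_rank(3, True, True))    # best both ways, score >= 2
print(diopt_rank(5, False, False))  # not best either way, but score >= 4
print(diopt_rank(1, True, True))    # best both ways, but score below 2
```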