DIOPT Documentation

URL

http://www.flyrnai.org/diopt

About DIOPT

The identification of orthologs is commonly used for bioinformatics activities such as data mining and establishing models for human diseases. Moreover, our group notes that researchers analyzing the results of screens performed at the Drosophila RNAi Screening Center (DRSC) frequently wish to identify mammalian orthologs of the fly genes that were "hits" (positive results) in their screens.

In helping DRSC screeners to identify orthologs using existing tools and algorithms, we recognized a need for a user-friendly approach to viewing and comparing ortholog predictions obtained using different tools and algorithms. This was our motivation in developing DIOPT. To facilitate identification of orthologs specifically of human disease-associated genes, we further developed DIOPT-DIST. Information about our approaches to development of both tools is summarized below.

The DIOPT Approach

Many tools have emerged to meet the need to identify orthologs. However, the low coverage and heterogeneity of these tools present an obstacle to scientists who want to identify one or a few highest-confidence orthologs for a given gene of interest or, conversely, want to cast a wide net and follow up on all possible orthologs of a gene.

Our goal is to provide an easy-to-use resource that facilitates summary and comparison of, and access to, various sources of ortholog predictions. DIOPT integrates human, mouse, fly, worm, zebrafish and yeast ortholog predictions made by Ensembl Compara, HomoloGene, Inparanoid, Isobase, OMA, orthoMCL, Phylome, RoundUp, and TreeFam. DIOPT lets users find ortholog pairs for a specified gene or genes identified by one, many or all of these published approaches. This provides a streamlined method for integration, comparison and access to orthology predictions originating from algorithms based on sequence homology, phylogenetic trees, and functional similarity.

DIOPT calculates a simple score indicating the number of tools that support a given orthologous gene-pair relationship, as well as a weighted score based on functional assessment using high-quality GO molecular function annotation of all fly-human orthologous pairs predicted by each tool. Differences in the algorithms used by each tool to predict orthologous relationships are one source of difference between the sets of predictions made by different tools. However, some of these differences might also be attributable to the use of different genome annotation releases by different tools. In addition, not all tools cover all of the species included in DIOPT (see Tables 1, 2 and 3).
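The two scores described above can be illustrated with a minimal Python sketch. The weights are a subset copied from the "DIOPT Weight" column of Table 1. The function names are hypothetical, and summing the per-tool weights is an assumption made for illustration, since this page does not state the exact formula DIOPT uses for the weighted score.

```python
# Subset of the per-tool weights listed in Table 1 ("DIOPT Weight" column).
TOOL_WEIGHTS = {
    "Compara": 0.93,
    "Homologene": 1.0,
    "Inparanoid": 1.05,
    "OMA": 1.01,
    "orthoMCL": 0.9,
    "Panther": 1.1,
    "TreeFam": 0.96,
}


def simple_score(supporting_tools):
    """Count the distinct tools that predict a given gene pair."""
    return len(set(supporting_tools))


def weighted_score(supporting_tools, weights=TOOL_WEIGHTS):
    """Sum the weights of the supporting tools.

    ASSUMPTION: a plain sum of per-tool weights; the real DIOPT
    formula is not given in this documentation.
    """
    return sum(weights[tool] for tool in set(supporting_tools))


# Hypothetical gene pair supported by three tools.
tools = ["Compara", "Inparanoid", "Panther"]
print(simple_score(tools))
print(round(weighted_score(tools), 2))
```

Under this assumption, a pair supported by tools with above-average weights scores slightly higher than the raw tool count, and a pair supported by lower-weight tools scores slightly lower.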

DIOPT also displays protein and domain alignments, including percent amino acid identity, for predicted ortholog pairs. These should help you to identify the most appropriate matches among multiple possible orthologs.
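As a rough illustration of what a percent amino acid identity value means, the following Python sketch computes it from two already-aligned sequences. The function name and the choice of denominator (all aligned columns, excluding columns where both sequences have a gap) are assumptions; several definitions are in common use and this page does not say which one DIOPT applies.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns with identical residues.

    ASSUMPTION: the denominator is the number of alignment columns,
    ignoring columns that are gaps in both sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gap/gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0
    # A column counts as a match only if both residues are equal and not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment: 5 identical residues over 7 aligned columns.
print(round(percent_identity("MKT-LLV", "MKSALLV"), 2))
```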

The following summary figures and tables help to explain our approach and summarize the tools and algorithms included in DIOPT.

Figure 1: Summary of the DIOPT approach to integration of results from multiple ortholog prediction tools and algorithms (DIOPT integration schema). In green, tools based on sequence alignment. In purple, tools based on evolutionary relationships. In orange, a tool that incorporates protein-protein interaction network data into ortholog predictions.

Table 1: Summary Information and Publications for the Tools Integrated in DIOPT

Prediction Method | Source | Prediction Algorithm | Coverage | DIOPT Weight* | PMID
Compara | Ensembl | Phylogenetic approach | 200 species (version 96) | 0.93 | 19029536
Homologene | NCBI | Combination of BBH*, tree and synteny | 21 species (version 68) | 1 | 11125071
Inparanoid | Stockholm University, Sweden | BBH* approach to identify orthologs and in-paralogs | 273 species (version 8) | 1.05 | 11743721; 25429972
Isobase | MIT | Sequence and PPI* network alignments | 5 species (version 2, Nov. 2014) | 0.95 | 21177658
OMA | CBRG, ETH Zurich | BBH*, global sequence alignments | 2198 species (Jun 2018 release) | 1.01 | 17545180
OrthoDB | University of Geneva | Phylogenetic approach | >5000 species (version 10) | 1.01 | 20972218; 25428351
orthoMCL | University of Pennsylvania | Markov Cluster algorithm | 150 species (version 5) | 0.9 | 12952885
Phylome | Centre for Genomic Regulation (CRG), Spain | Reconstruction of evolutionary histories of all genes in a genome, also known as phylome. | 1059 species, 120 Phylomes (version 4) QfO | 0.91 | 17962297; 24275491
RoundUp | Harvard Medical School | RSD*, modified BBH* | 2044 species (Apr 2013) | 1.03 | 16777906
TreeFam | Wellcome Trust Sanger Institute | Manually curated based on trees | 109 species (version 9) | 0.96 | 16381935; 24194607
Panther | University of Southern California | Phylogenetic approach | 132 species (version 14.1) | 1.1 | 26578592
HGNC | European Bioinformatics Institute (EMBL-EBI) | Manually curated | 3 species (July 2019) | 1.5 |
ZFIN | Zebrafish Model Organism Database | Sequence similarity analysis and manual curation | 4 species (July 2019) | 1.5 |
eggNOG | EMBL, Germany | Graph-based algorithms | 5090 species (version 5.0) | 0.9 | 26582926
OrthoFinder | University of Oxford | Graph-based algorithms | Run using protein sequence of RefSeq version 94 | 1 | 26578592
OrthoInspector | Institut de Génétique et de Biologie Moléculaire et Cellulaire | A novel orthogroup inference algorithm that solves a previously undetected gene length bias in orthogroup inference, resulting in significant improvements in accuracy. | QFO 2018 | 1 | 21219603
Hieranoid | Stockholm Bioinformatics Center | The OrthoInspector algorithm is divided into three main steps. First, the results of a Blast all-versus-all (proteomes are blasted against each other) is provided by the user and is parsed to find all the Blast best hits for each protein and to create the groups of inparalogs. Second, the inparalog groups for each organism are compared in a pairwise fashion to define potential orthologs and/or in-paralogs. Third, best hits that contradict the potential orthology between entities are detected. | QFO 2018 | 1 | 27742821

* DIOPT weights are based on the mean semantic similarity of high quality GO molecular function annotation of all fly-human orthologous pairs predicted by each tool.
   BBH, Best Blast Hits
   RSD, Reciprocal Smallest Distance
   PPI, Protein-Protein Interactions

Table 2A: Genome Release Information for the Tools Integrated in DIOPT

 

                     
Prediction Method | Worm | Fish | Fly | Human | Mouse | Yeast | Fission Yeast | Frog | Rat | Thale cress
Compara | WBcel235 | GRCz11 | BDGP6.22 | GRCh38 | GRCm38 | R64-1-1 | NA | JGI 4.2 | Rnor_6.0 |
Homologene | WS195 | Zv9 | FlyBase r5.48 | GRCh38 | GRCm38.p2 | R64-1-1 | ASM294v2 | Xtropicalis_v7 | Rnor_5.0 | TAIR10
OMA | Ensembl 86; WBcel235; 14-SEP-2016 | Ensembl 90; GRCz10 | Ensembl 90; BDGP6 | Ensembl 86; GRCh38; 13-SEP-2016 | Ensembl 86; GRCm38; 13-SEP-2016 | Ensembl 73; EF4; 23-AUG-2013 | Ensembl Fungi 22; ASM294v2; 17-MAR-2014 | Ensembl 73; JGI_4.2; 23-AUG-2013 | Ensembl 83; Rnor_6.0; 28-NOV-2015 | Ensembl Plants 20; TAIR10; 2-SEP-2013
Inparanoid | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013 | UniProt Nov 2013
Isobase | Ensembl v59 | NA | Ensembl v59 | Ensembl v59 | Ensembl v59 | Ensembl v59 | NA | NA | NA | NA
orthoMCL | WS206 | Zv8.56 | BDGP5.13.56 | GRCh37.56 | NCBI v37.56 | FungiDB | GenBank | NA | Ensembl v53 | GenBank
OrthoDB | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq | RefSeq
RoundUp | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | UniProt Apr 2013 | NA | NA
TreeFam | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | Ensembl v69 | NA
Panther | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018
Phylome | UniProt | UniProt | UniProt | UniProt | UniProt | UniProt | UniProt | UniProt | UniProt | UniProt
HGNC | NA | NA | NA | HGNC July 2019 | HGNC July 2019 | NA | NA | NA | HGNC July 2019 | NA
ZFIN | NA | ZFIN July 2019 | ZFIN July 2019 | ZFIN July 2019 | ZFIN July 2019 | NA | NA | NA | NA | NA
eggNOG | Ensembl/RefSeq | Ensembl/RefSeq | Ensembl/RefSeq | Ensembl/RefSeq | Ensembl/RefSeq | Ensembl/RefSeq | NA | Ensembl/RefSeq | Ensembl/RefSeq | NA
OrthoFinder | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94 | RefSeq94
OrthoInspector | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018
Hieranoid | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018 | UniProt_QFO2018

 

 

Table 2B: Additional Information About Genome Releases

Other Resource | Version
WormBase | release 272
FlyBase | release 6.28
RefSeq | release 94
EntrezGene | 11-July-2019


 

Table 3. Maximum DIOPT score for each orthologous relationship

 

Orthologous Relationship Max score Type Relevant Tools
baker's yeast-fish 13 ortholog Inparanoid;TreeFam;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;OMA;Homologene;Hieranoid;eggNOG;Compara;OrthoFinder;
baker's yeast-fission yeast 11 ortholog OrthoInspector;RoundUp;Phylome;TreeFam;orthoMCL;OMA;Inparanoid;Homologene;Hieranoid;Panther;OrthoFinder;
baker's yeast-fly 14 ortholog Phylome;RoundUp;Panther;orthoMCL;OrthoInspector;OrthoFinder;Isobase;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OMA;TreeFam;
baker's yeast-frog 12 ortholog Inparanoid;RoundUp;Phylome;Panther;OrthoInspector;TreeFam;Homologene;Hieranoid;eggNOG;Compara;OrthoFinder;OMA;
baker's yeast-human 14 ortholog OMA;RoundUp;Phylome;Panther;orthoMCL;OrthoFinder;TreeFam;Isobase;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OrthoInspector;
baker's yeast-mouse 14 ortholog OrthoFinder;RoundUp;TreeFam;Phylome;Panther;orthoMCL;OrthoInspector;OMA;Isobase;Inparanoid;Homologene;Hieranoid;Compara;eggNOG;
baker's yeast-rat 12 ortholog Inparanoid;TreeFam;Phylome;Panther;orthoMCL;OrthoInspector;OMA;Homologene;Hieranoid;eggNOG;Compara;OrthoFinder;
baker's yeast-Thale cress 8 ortholog Inparanoid;Phylome;Panther;orthoMCL;OMA;Homologene;Hieranoid;OrthoInspector;
baker's yeast-worm 14 ortholog OrthoFinder;RoundUp;orthoMCL;Panther;TreeFam;OrthoInspector;Isobase;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OMA;Phylome;
fish-fission yeast 11 ortholog orthoMCL;Homologene;RoundUp;Phylome;TreeFam;Panther;OrthoInspector;OrthoFinder;Inparanoid;Hieranoid;OMA;
fish-fly 15 ortholog OrthoInspector;Compara;TreeFam;RoundUp;Phylome;Panther;orthoMCL;ZFIN;OrthoFinder;OrthoDB;OMA;Inparanoid;Homologene;eggNOG;Hieranoid;
fish-frog 13 ortholog OrthoFinder;TreeFam;RoundUp;Phylome;OrthoInspector;OrthoDB;OMA;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;Panther;
fish-human 15 ortholog eggNOG;OrthoInspector;TreeFam;RoundUp;ZFIN;Phylome;Panther;orthoMCL;OrthoFinder;OrthoDB;OMA;Inparanoid;Hieranoid;Compara;Homologene;
fish-mouse 15 ortholog OrthoInspector;Homologene;ZFIN;TreeFam;RoundUp;Phylome;Panther;orthoMCL;OrthoDB;Compara;Inparanoid;OMA;Hieranoid;eggNOG;OrthoFinder;
fish-rat 13 ortholog OrthoInspector;TreeFam;Phylome;orthoMCL;OrthoFinder;OrthoDB;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OMA;Panther;
fish-Thale cress 9 ortholog OMA;Phylome;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;Hieranoid;OrthoInspector;
fish-worm 14 ortholog OrthoInspector;TreeFam;RoundUp;Phylome;orthoMCL;OrthoFinder;eggNOG;Panther;Compara;OrthoDB;Hieranoid;Homologene;Inparanoid;OMA;
fission yeast-fly 11 ortholog OMA;RoundUp;Phylome;Panther;TreeFam;orthoMCL;Inparanoid;Homologene;Hieranoid;OrthoInspector;OrthoFinder;
fission yeast-frog 10 ortholog OrthoInspector;Phylome;RoundUp;Panther;OMA;Inparanoid;OrthoFinder;Homologene;Hieranoid;TreeFam;
fission yeast-human 11 ortholog RoundUp;TreeFam;Phylome;Panther;orthoMCL;OrthoFinder;OMA;Inparanoid;Hieranoid;Homologene;OrthoInspector;
fission yeast-mouse 11 ortholog OMA;TreeFam;RoundUp;Phylome;Panther;orthoMCL;Hieranoid;OrthoFinder;Inparanoid;Homologene;OrthoInspector;
fission yeast-rat 10 ortholog OrthoInspector;Panther;Phylome;orthoMCL;OMA;Inparanoid;Hieranoid;Homologene;TreeFam;OrthoFinder;
fission yeast-Thale cress 8 ortholog orthoMCL;Homologene;Phylome;Panther;OrthoInspector;Inparanoid;Hieranoid;OMA;
fission yeast-worm 11 ortholog OrthoFinder;TreeFam;RoundUp;Phylome;Panther;Hieranoid;OrthoInspector;OMA;Inparanoid;Homologene;orthoMCL;
fly-frog 13 ortholog OrthoFinder;RoundUp;TreeFam;Phylome;Panther;OrthoInspector;OrthoDB;OMA;Inparanoid;Homologene;Hieranoid;Compara;eggNOG;
fly-human 16 ortholog OrthoInspector;Compara;TreeFam;User_Submission;RoundUp;Phylome;Panther;orthoMCL;OrthoFinder;Hieranoid;eggNOG;Homologene;Inparanoid;Isobase;OMA;OrthoDB;
fly-mouse 15 ortholog Isobase;OrthoFinder;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;TreeFam;OMA;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OrthoDB;
fly-rat 13 ortholog OrthoDB;orthoMCL;Panther;TreeFam;OrthoInspector;OrthoFinder;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;Phylome;OMA;
fly-Thale cress 9 ortholog OMA;Phylome;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;Hieranoid;OrthoInspector;
fly-worm 15 ortholog OrthoFinder;TreeFam;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;OrthoDB;OMA;Isobase;Inparanoid;Homologene;Hieranoid;Compara;eggNOG;
frog-human 13 ortholog OrthoDB;RoundUp;Phylome;Panther;OrthoFinder;OMA;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;TreeFam;OrthoInspector;
frog-mouse 13 ortholog OrthoFinder;TreeFam;RoundUp;Phylome;Panther;OrthoInspector;OMA;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OrthoDB;
frog-rat 12 ortholog Panther;Phylome;OrthoInspector;OrthoFinder;OrthoDB;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;TreeFam;OMA;
frog-Thale cress 8 ortholog Homologene;Phylome;Panther;OrthoInspector;OrthoDB;Inparanoid;Hieranoid;OMA;
frog-worm 13 ortholog OrthoFinder;RoundUp;TreeFam;Phylome;Panther;OrthoInspector;OrthoDB;OMA;Inparanoid;Homologene;Hieranoid;Compara;eggNOG;
human-mouse 16 ortholog eggNOG;OrthoFinder;TreeFam;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;OrthoDB;OMA;Isobase;Inparanoid;Homologene;HGNC;Compara;Hieranoid;
human-rat 14 ortholog OrthoFinder;HGNC;TreeFam;Phylome;Panther;orthoMCL;OrthoInspector;OMA;Inparanoid;Hieranoid;Homologene;eggNOG;Compara;OrthoDB;
human-Thale cress 9 ortholog OMA;Phylome;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;Hieranoid;OrthoInspector;
human-worm 15 ortholog OrthoFinder;eggNOG;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;TreeFam;OrthoDB;OMA;Isobase;Inparanoid;Hieranoid;Compara;Homologene;
mouse-rat 13 ortholog Inparanoid;TreeFam;Phylome;Panther;orthoMCL;OrthoInspector;OrthoFinder;OMA;Homologene;Hieranoid;eggNOG;Compara;OrthoDB;
mouse-Thale cress 9 ortholog OMA;Phylome;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;Hieranoid;OrthoInspector;
mouse-worm 15 ortholog OMA;TreeFam;RoundUp;Phylome;Panther;orthoMCL;OrthoInspector;OrthoDB;Isobase;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;OrthoFinder;
rat-Thale cress 9 ortholog OrthoInspector;Hieranoid;Phylome;Panther;orthoMCL;OMA;Inparanoid;Homologene;OrthoDB;
rat-worm 13 ortholog OrthoDB;orthoMCL;Panther;TreeFam;OrthoInspector;OrthoFinder;Inparanoid;Homologene;Hieranoid;eggNOG;Compara;Phylome;OMA;
Thale cress-worm 9 ortholog OMA;Phylome;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;Hieranoid;OrthoInspector;
baker's yeast-baker's yeast 9 paralog Inparanoid;RoundUp;Panther;orthoMCL;Isobase;Homologene;eggNOG;Compara;OrthoFinder;
fish-fish 9 paralog OrthoDB;OrthoFinder;RoundUp;orthoMCL;Compara;Homologene;Panther;eggNOG;Inparanoid;
fission yeast-fission yeast 6 paralog Homologene;Inparanoid;OrthoFinder;orthoMCL;Panther;RoundUp;
fly-fly 10 paralog OrthoFinder;OrthoDB;RoundUp;Panther;Compara;Inparanoid;orthoMCL;Homologene;eggNOG;Isobase;
frog-frog 7 paralog OrthoFinder;Homologene;RoundUp;Compara;eggNOG;OrthoDB;Inparanoid;
human-human 11 paralog OMA;RoundUp;Panther;orthoMCL;OrthoDB;Isobase;Inparanoid;Homologene;eggNOG;Compara;OrthoFinder;
mouse-mouse 12 paralog Panther;Phylome;orthoMCL;OrthoFinder;OrthoDB;Isobase;Inparanoid;Homologene;Compara;eggNOG;OMA;RoundUp;
rat-rat 8 paralog Panther;Compara;eggNOG;Homologene;Inparanoid;OrthoDB;OrthoFinder;orthoMCL;
Thale cress-Thale cress 5 paralog Inparanoid;OrthoDB;orthoMCL;Panther;Homologene;
worm-worm 10 paralog Isobase;RoundUp;Panther;orthoMCL;OrthoDB;Inparanoid;Homologene;eggNOG;Compara;OrthoFinder;
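In the rows of Table 3, the maximum score equals the number of tools listed for that species pair. The following Python sketch parses one row in the flat layout used above and checks this relationship. The regular expression and the function name are illustrative assumptions, not part of DIOPT.

```python
import re

# One Table 3 row: "<species pair> <max score> <type> <tool;tool;...;>"
# The species pair may itself contain spaces (e.g. "Thale cress").
ROW_PATTERN = re.compile(
    r"^(?P<pair>.+?) (?P<max_score>\d+) (?P<kind>ortholog|paralog) (?P<tools>\S+)$"
)


def parse_table3_row(line):
    """Split a flat Table 3 row into its four fields."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    # Rows end with a trailing ';', so drop the empty final item.
    tools = [tool for tool in match.group("tools").split(";") if tool]
    return {
        "pair": match.group("pair"),
        "max_score": int(match.group("max_score")),
        "kind": match.group("kind"),
        "tools": tools,
    }


row = parse_table3_row(
    "baker's yeast-Thale cress 8 ortholog "
    "Inparanoid;Phylome;Panther;orthoMCL;OMA;Homologene;Hieranoid;OrthoInspector;"
)
print(row["pair"], row["max_score"], len(row["tools"]))
```

A mismatch between `max_score` and `len(row["tools"])` for any row would indicate a transcription problem in the table.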

SCORE DISTRIBUTIONS

 

Ortholog Score Distribution

 

[Figure: DIOPT ortholog score distribution graph (diopt_data_ortholog_2019.png)]

Paralog Score Distribution

 

[Figure: DIOPT paralog score distribution graph (diopt_data_paralog_2019.png)]

Version information

 

8.0 - Aug 2019

7.1 - Mar 2018

  • Allow users to submit missing relationships
  • Allow users to add feedback

7.0 - Jan 2018

  • Updated data sources
  • Added 3 new algorithms as sources: OrthoFinder, OrthoInspector and Hieranoid
  • Added new species: Arabidopsis (Thale cress)

6.0 - Dec 2016

  • Updated data sources
  • Added eggNOG as a source
  • Added paralogs

5.5 - Oct 2016 - Added multi-sequence alignment from target "All" heatmap

5.4 - Sept 2016 - Added target species "All" and new filter

5.3 - May 2016 - Added more prediction tools (Panther, HGNC and ZFIN)

5.2.1 - April 2016 - Added orthologous rank

  • High: best score both ways AND DIOPT score >=2
  • Moderate: either (best score forward or reverse) AND DIOPT score >=2, or DIOPT score >=4
  • Low: all others

5.2 - April 2016 - Added new species (Rattus norvegicus)

5.1.1 - December 2015 - Added Best forward and reverse columns

5.1 - November 2015 - Upgraded gene matching algorithm

5.0 - November 2015 - Upgraded data sources to version 5
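The orthologous rank rules introduced in version 5.2.1 can be sketched in Python as follows. The function and parameter names are hypothetical. The sketch also assumes that the two conditions listed under "Moderate" are alternatives (either one is sufficient) and that the rules are evaluated in the order High, Moderate, Low.

```python
def diopt_rank(score, best_forward, best_reverse):
    """Classify an ortholog pair as "High", "Moderate" or "Low".

    score         -- DIOPT score (number of supporting tools)
    best_forward  -- True if the pair has the best score in the forward search
    best_reverse  -- True if the pair has the best score in the reverse search

    ASSUMPTION: the two "Moderate" conditions are combined with OR.
    """
    # High: best score both ways AND DIOPT score >= 2
    if best_forward and best_reverse and score >= 2:
        return "High"
    # Moderate: best score one way AND score >= 2, or score >= 4
    if ((best_forward or best_reverse) and score >= 2) or score >= 4:
        return "Moderate"
    # Low: all others
    return "Low"


print(diopt_rank(3, True, True))    # best both ways, score 3
print(diopt_rank(5, False, False))  # not best either way, score 5
print(diopt_rank(1, True, True))    # best both ways, but score below 2
```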