#  COMPLEAT Documentation 

 



## URL

<http://www.flyrnai.org/compleat/>

### Data upload page

**Species:** The user may choose human, fly or yeast.

**Data files:** The user may upload multiple datasets in one file or across multiple files. Tab-separated text files and Excel files (.xls, .xlsx, .xlsm) are accepted.

### Table 3. Format of data file(s) for COMPLEAT

| Format | Example of file 1 | Example of file 2 |
| --- | --- | --- |
| Tab-separated values, single file with one or more data columns | [FlyRNAi\_data\_baseline\_vs\_EGF.tsv](http://www.flyrnai.org/compleat/Download?filename=FlyRNAi_data_baseline_vs_EGF.tsv&type=text/plain) | |
| Tab-separated values, multiple files with one or more data columns per file. (Use shift-click or ctrl-click in file browser to select multiple files.) | [FlyRNAi\_data\_baseline.tsv](http://www.flyrnai.org/compleat/Download?filename=FlyRNAi_data_baseline.tsv&type=text/plain) | [FlyRNAi\_data\_EGF.tsv](http://www.flyrnai.org/compleat/Download?filename=FlyRNAi_data_EGF.tsv&type=text/plain) |
| Excel file, single sheet with multiple data columns | [FlyRNAi\_data\_baseline\_vs\_EGF.xls](http://www.flyrnai.org/compleat/Download?filename=FlyRNAi_data_baseline_vs_EGF.xls&type=application/vnd.ms-excel) | |
| Excel file, multiple sheets with single data columns | [FlyRNAi\_data\_baseline\_vs\_EGF\_2\_Sheets.xls](http://www.flyrnai.org/compleat/Download?filename=FlyRNAi_data_baseline_vs_EGF_2_Sheets.xls&type=application/vnd.ms-excel) | |
| Tab-separated, one data column | [HumanRNAiDNARepairScreen.tsv](http://www.flyrnai.org/compleat/Download?filename=HumanRNAiDNARepairScreen.tsv&type=text/plain) | |
| Tab-separated, one data column | [HumanRNAiStemCellDeterminantsScreen.tsv](http://www.flyrnai.org/compleat/Download?filename=HumanRNAiStemCellDeterminantsScreen.tsv&type=text/plain) | |
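
As a rough illustration of the single-file, multi-column format, the sketch below serializes a small table as tab-separated text. It assumes the first column holds gene identifiers and each following column holds one dataset; the column names, identifiers and values are made up, and the exact header layout COMPLEAT expects should be checked against the example files above.

```python
import csv
import io

def to_tsv(rows, header):
    """Serialize rows as tab-separated text: one identifier column, then data columns."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical identifiers and scores, for illustration only
header = ["gene", "baseline", "EGF"]
rows = [
    ("FBgn0000001", 0.42, -1.30),
    ("FBgn0000002", -0.15, 2.05),
]

tsv_text = to_tsv(rows, header)
print(tsv_text)
```

Writing the text to a file ending in `.tsv` gives one file carrying two datasets; splitting the data columns into separate files corresponds to the multiple-file format.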

 







**Gene/protein identifiers:** COMPLEAT incorporates an ID mapping table so that the user can upload data keyed by Entrez GeneID, gene symbol or UniProt accession number, as well as FlyBase gene ID (for fly data) or locus\_tag (ORF name, for yeast data).

### Table 4. Gene/protein identifiers for COMPLEAT

| Source | Human | Fly | Yeast |
| --- | --- | --- | --- |
| NCBI EntrezGene | GeneID or symbol | GeneID or symbol | GeneID, Locus-tag or symbol |
| FlyBase | | FBgn, CG or symbol | |
| UniProt | Accession number | Accession number | Accession number |
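
Before uploading, it can help to check which identifier type a column contains. The sketch below is a heuristic guess based on identifier shape only. It is not COMPLEAT's mapping logic; the patterns and the function name `guess_id_type` are assumptions for illustration. The UniProt pattern follows the accession format published by UniProt.

```python
import re

# Heuristic patterns (assumptions for illustration, not COMPLEAT's own rules)
PATTERNS = [
    ("FlyBase FBgn", re.compile(r"FBgn\d{7}")),
    ("FlyBase CG", re.compile(r"CG\d+")),
    ("Entrez GeneID", re.compile(r"\d+")),
    ("UniProt accession", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}")),
]

def guess_id_type(identifier):
    """Return a best-guess label for an identifier based on its shape alone."""
    value = identifier.strip()
    for label, pattern in PATTERNS:
        if pattern.fullmatch(value):
            return label
    # Anything else is treated as a gene symbol or a yeast locus tag
    return "symbol or locus tag"

for example in ["FBgn0004106", "CG10498", "983", "P06493", "cdc2c"]:
    print(example, "->", guess_id_type(example))
```

A shape-based check cannot tell a gene symbol from a locus tag, and a symbol that happens to look like an accession would be mislabeled, so the result is only a sanity check.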





**Advanced options:** COMPLEAT returns detailed information only for complexes below a p-value cutoff (default 0.1). This detailed information is required for network visualization and further data mining; the restriction was introduced to optimize the performance of the tool. The user may specify a cutoff more stringent than the default. In addition, the user may choose the background from which random complexes are built. The default background (auto option) is selected based on the coverage of the input dataset. For genome-scale or near-genome-scale datasets (i.e. input data larger than the complex resource; see Table 1 on the “About” page), the user input data are used as the background. If the user input is smaller than the complex resource, the complex data are used as the background. Users can override this default by specifying their preferred background data.
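
The auto background rule and the p-value cutoff can be sketched as follows. This is a minimal illustration of the behavior described above, not COMPLEAT's code: the function names, the background labels and the example sizes are hypothetical, and treating an input exactly equal in size to the complex resource as "smaller" is an assumption, since the text does not cover that case.

```python
DEFAULT_P_CUTOFF = 0.1  # default stated in the documentation

def choose_background(n_input_genes, n_complex_resource_genes, preference="auto"):
    """Pick the background used to build random complexes."""
    if preference != "auto":
        return preference  # the user overrides the default
    if n_input_genes > n_complex_resource_genes:
        return "input"  # genome-scale input: use the input data
    return "complexes"  # smaller input: use the complex resource

def complexes_with_details(p_values, cutoff=DEFAULT_P_CUTOFF):
    """Return the complexes whose p-value falls below the cutoff."""
    if cutoff > DEFAULT_P_CUTOFF:
        raise ValueError("cutoff may only be more stringent than the default")
    return [name for name, p in p_values.items() if p < cutoff]

# Hypothetical sizes and p-values, for illustration only
print(choose_background(12000, 5000))
print(choose_background(800, 5000))
print(complexes_with_details({"complex A": 0.003, "complex B": 0.2}, cutoff=0.05))
```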

### Result page – global view

**Scatter plot:** COMPLEAT displays the enrichment results in an interactive scatter plot, where each point corresponds to a single complex. The position of a point corresponds to the complex score (IQM score), its size reflects the relative complex size, and its color corresponds to the p-value. For a single dataset, the y-axis shows the complex score (IQM score) and the x-axis shows the complexes ranked by that score. For multiple datasets, the x-axis shows the complex score from one dataset and the y-axis shows the complex score from another dataset. If the input contains more than two datasets (up to four are allowed), two datasets can be compared at a time, with the option to change which datasets are displayed on the x- and y-axes. The complexes are color-coded to distinguish significant, insignificant and (for multiple dataset inputs) dataset-specific complexes (Table 5). The user may change the p-value threshold using the p-value adjustment sliders. The user can mine additional data by entering keywords to select subsets of complexes associated with the keyword (e.g. enter “kinases” to view complexes that contain proteins annotated as kinases). The logical operators “AND” and “OR” can also be used in the search box to search with multiple keywords.

### Table 5. Color-coding of scatter plot.

| Color | Category |
| --- | --- |
| magenta | enriched in both datasets but in opposite directions |
| black | enriched in both datasets and in the same direction |
| cyan | enriched only in the dataset shown on the x-axis |
| blue | enriched only in the dataset shown on the y-axis |
| grey | not enriched in any dataset |
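
The categories in Table 5 amount to a small decision rule. The sketch below is one possible reading of it, under two assumptions that the documentation does not spell out: a complex counts as "enriched" in a dataset when its p-value is below the current threshold, and the "direction" is the sign of the complex score. The function name and arguments are hypothetical.

```python
def color_category(score_x, p_x, score_y, p_y, threshold=0.1):
    """Map a complex's scores and p-values in two datasets to a Table 5 color."""
    enriched_x = p_x < threshold
    enriched_y = p_y < threshold
    if enriched_x and enriched_y:
        # Assumption: direction of enrichment is the sign of the score
        same_direction = (score_x > 0) == (score_y > 0)
        return "black" if same_direction else "magenta"
    if enriched_x:
        return "cyan"
    if enriched_y:
        return "blue"
    return "grey"

# Hypothetical scores and p-values, for illustration only
print(color_category(1.8, 0.01, -2.1, 0.02))
print(color_category(1.8, 0.01, 0.3, 0.60))
```

Moving the p-value sliders corresponds to changing `threshold`, which can move a complex from one category to another.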





### Search Options

COMPLEAT supports simple searches that apply across all indexed fields, as well as field-specific searches and Boolean operators. Parentheses specify the order in which the search is evaluated. The search is not case-sensitive, except for the operator keywords AND, OR and NOT.  
**Basic search:**  
`cyclin`  
**Advanced search:**  
`(gene:cdc2c OR gene:rpl27) AND source:literature NOT database:GO`

#### Field Codes

| Field Code | Values | Example |
| --- | --- | --- |
| gene | Gene ID, symbol, accession number, FBgn, CG, or locus tag (varies by organism, see Table 4) | gene:cdc2c |
| source | Literature or Predicted | source:literature |
| database | CORUM, PINdb, CYC2008, Gene Ontology (GO), DPiM, KEGG, SignaLink | database:corum |
| name | words within complex names | name:cyclin |
| species | Fly, Yeast, Human (original species from which ortholog was obtained) | species:human |
| method | Prediction methods | method:coimmunoprecipitation |
| reference | PubMed ID | reference:8560263 |
| citation | PubMed ID | citation:8560263 |
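
Field-specific queries can also be assembled programmatically before being pasted into the search box. The helper functions below are hypothetical and only build the query string; they use the field codes listed above and reproduce the advanced search example.

```python
FIELD_CODES = {
    "gene", "source", "database", "name",
    "species", "method", "reference", "citation",
}

def term(field, value):
    """Build a single field:value search term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}:{value}"

def any_of(*terms):
    """Group terms with OR inside parentheses so they are evaluated together."""
    return "(" + " OR ".join(terms) + ")"

# Operators are written in uppercase, as in the documented examples
query = " ".join([
    any_of(term("gene", "cdc2c"), term("gene", "rpl27")),
    "AND", term("source", "literature"),
    "NOT", term("database", "GO"),
])
print(query)
```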





**Network visualization of the complex:** Users can click on or select sets of complexes, shown as dots in the interactive scatter plot, for network visualization. When one or more complexes are selected, a network representation of each selected complex is displayed in the Cytoscape Web panel (right panel of the same page). In the network visualization, the node colors correspond to the user input values and range from green (lowest value) to red (highest value). Gray nodes represent missing values (i.e. a protein that is present in the complex but missing from the user input data). Solid edges correspond to known PPIs, and broken edges are interologs (i.e. interactions inferred because the orthologous proteins in another species are known to physically interact). The network visualization and its interactivity are provided by Cytoscape Web, including the ability to move nodes and zoom into specific parts of the network.
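
The node coloring can be pictured as an interpolation between green and red. The sketch below assumes a linear scale and a mid-gray for missing values; the actual color scale and shades used by COMPLEAT are not stated here, so these choices and the function name `node_color` are illustrative only.

```python
def node_color(value, lowest, highest):
    """Return an (R, G, B) tuple: green at the lowest value, red at the highest.

    A value of None stands for a protein missing from the input and maps to gray.
    """
    if value is None:
        return (128, 128, 128)
    if highest == lowest:
        fraction = 0.5  # degenerate range: use the midpoint
    else:
        fraction = (value - lowest) / (highest - lowest)
    fraction = min(1.0, max(0.0, fraction))  # clamp out-of-range values
    red = round(255 * fraction)
    green = round(255 * (1 - fraction))
    return (red, green, 0)

# Hypothetical input values, for illustration only
print(node_color(-2.0, -2.0, 3.0))
print(node_color(3.0, -2.0, 3.0))
print(node_color(None, -2.0, 3.0))
```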

**Table view:** The user may view enriched complexes in a sortable table that includes the complex name, input scores and COMPLEAT-computed p-values.

**Detailed complex information:** From the network visualization or table views, the user may select complexes to view additional details, including complex name, purification method and reference citation (for literature-based complexes), prediction algorithm (for predicted complexes), and co-citation and co-localization information. Cytoscape images of selected complexes are displayed in pairs, showing the data from the original files as well as binary interactions.

**Data download:** Both the scatter plot and the Cytoscape images of selected complexes can be saved as PNG or JPG files. The table can be saved as a tab-separated text file.

| Question | Answer |
| --- | --- |
| Are the files I upload saved on the server? | Uploaded files are not saved. Calculations are performed in memory, and the results are returned to your browser. |
