URL
http://www.flyrnai.org/compleat/
Data upload page
Species: The user may choose human, fly or yeast
Data files: The user may upload multiple datasets in a single file or across multiple files. Tab-separated text files and Excel files (.xls, .xlsx, .xlsm) are accepted.
Table 3. Format of data file(s) for COMPLEAT
Format | Example of file 1 | Example of file 2 |
---|---|---|
Tab-separated values, single file with one or more data columns | FlyRNAi_data_baseline_vs_EGF.tsv | |
Tab-separated values, multiple files with one or more data columns per file (use shift-click or ctrl-click in the file browser to select multiple files) | FlyRNAi_data_baseline.tsv | FlyRNAi_data_EGF.tsv |
Excel file, single sheet with multiple data columns | FlyRNAi_data_baseline_vs_EGF.xls | |
Excel file, multiple sheets with single data columns | FlyRNAi_data_baseline_vs_EGF_2_Sheets.xls | |
Tab-separated values, one data column | HumanRNAiDNARepairScreen.tsv | |
Tab-separated values, one data column | HumanRNAiStemCellDeterminantsScreen.tsv | |
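As a rough illustration of the tab-separated format listed above, the following Python sketch serializes a small table with one identifier column and two data columns. The column layout, the header names and the identifiers `geneA` and `geneB` are assumptions made for this example only; the values are placeholders, not screen data, and the example files listed in Table 3 remain the reference for the exact layout.

```python
import csv
import io


def to_tsv(header, rows):
    """Serialize a header row and data rows as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical layout: one identifier column followed by one column per dataset.
header = ["gene", "baseline", "EGF"]
# Placeholder identifiers and values, for illustration only.
rows = [("geneA", 0.42, 1.87), ("geneB", -1.10, -0.95)]

tsv_text = to_tsv(header, rows)
```

The resulting text can be written to a `.tsv` file and selected in the file browser on the upload page.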
Gene/protein identifiers: COMPLEAT incorporates an ID mapping table to allow the user to upload data with Entrez GeneID, gene symbol, UniProt accession number, FlyBase GeneID for fly data or locus_tag (ORF name) for yeast data.
Table 4. Gene/protein identifiers for COMPLEAT
Source | Human | Fly | Yeast |
---|---|---|---|
NCBI EntrezGene | GeneID or symbol | GeneID or symbol | GeneID, Locus-tag or symbol |
FlyBase | | FBgn, CG or symbol | |
UniProt | Accession number | Accession number | Accession number |
Advanced options: COMPLEAT returns detailed information only for complexes below a p-value cutoff (default 0.1). This detailed information is required for network visualization and further data mining; the restriction was introduced to optimize the performance of the tool. The user may specify a more stringent p-value cutoff than the default. In addition, the user may choose the background from which random complexes are built. The default background (auto option) is selected based on the coverage of the input dataset. For genome-scale or near-genome-scale datasets (i.e. input data larger than the complex resource; see Table 1 on the “About” page), the user input data are used as the background. If the user input is smaller than the complex resource, the complex data are used as the background. Users can override this default by specifying their preferred background data.
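The auto background rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not COMPLEAT's actual code: the function name, the parameter names and the labels `"input"` and `"complex"` are hypothetical, and the behavior when the two sizes are exactly equal is not stated in the text, so the sketch arbitrarily falls back to the complex data in that case.

```python
def choose_background(input_size, complex_resource_size, preference="auto"):
    """Pick the background set used to build random complexes.

    input_size: number of genes/proteins in the uploaded data.
    complex_resource_size: size of the complex resource for the species.
    preference: "auto", or an explicit user choice that overrides the rule.
    """
    if preference != "auto":
        # The user has specified a preferred background explicitly.
        return preference
    if input_size > complex_resource_size:
        # Genome-scale input: the input data serve as the background.
        return "input"
    # Smaller input (and, by assumption, equal size): use the complex data.
    return "complex"
```

For example, `choose_background(12000, 5000)` selects the input data, while `choose_background(300, 5000)` selects the complex data; the sizes here are made-up numbers.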
Result page – global view
Scatter plot: COMPLEAT displays the enrichment results in an interactive scatter plot in which each point corresponds to a single complex. The position of a point corresponds to the complex score (IQM score), its size reflects the relative complex size, and its color corresponds to the p-value. For a single dataset, the y-axis shows the complex score (IQM score) and the x-axis shows the complexes ranked by that score. For multiple datasets, the x-axis shows the complex score from one dataset and the y-axis shows the complex score from another. If more than two datasets are uploaded (up to four are allowed), two datasets can be compared at a time, with the option to change which datasets are displayed on the x- and y-axes. The complexes are color-coded to distinguish significant, insignificant and (for multiple-dataset inputs) dataset-specific complexes (Table 5). The user may change the p-value threshold using the p-value adjustment sliders. The user can mine the data further by entering keywords to select subsets of complexes associated with a keyword (e.g. enter “kinases” to view complexes that contain proteins annotated as kinases). The logical operators “AND” and “OR” can also be used in the search box to search with multiple keywords.
Table 5. Color-coding of scatter plot.
Color | Category |
---|---|
magenta | enriched in both datasets but in opposite directions |
black | enriched in both datasets and in the same direction |
cyan | enriched only in the dataset shown on the x-axis |
blue | enriched only in the dataset shown on the y-axis |
grey | not enriched in any dataset |
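The color categories in Table 5 amount to a small decision rule, sketched below in Python for a two-dataset comparison. Two assumptions are made that the text does not spell out: a complex counts as enriched when its p-value is below the threshold set with the sliders, and the direction of enrichment is taken to be the sign of the complex score. The function and parameter names are hypothetical.

```python
def scatter_color(score_x, p_x, score_y, p_y, cutoff):
    """Return the Table 5 color for one complex in a two-dataset plot.

    score_x, p_x: complex score and p-value in the x-axis dataset.
    score_y, p_y: complex score and p-value in the y-axis dataset.
    cutoff: p-value threshold (assumed to mean "enriched if p < cutoff").
    """
    enriched_x = p_x < cutoff
    enriched_y = p_y < cutoff
    if enriched_x and enriched_y:
        # Assumption: direction of enrichment is the sign of the score.
        same_direction = (score_x > 0) == (score_y > 0)
        return "black" if same_direction else "magenta"
    if enriched_x:
        return "cyan"
    if enriched_y:
        return "blue"
    return "grey"
```

Moving the p-value sliders corresponds to calling the function again with a different `cutoff`, which can move a complex between categories.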
Search Options
COMPLEAT supports simple searches that apply across all indexed fields, as well as field-specific searches and boolean operators. Parentheses specify the order in which the parts of a search are evaluated. Searches are not case-sensitive, with the exception of the keywords AND, OR and NOT.
Basic search:
cyclin
Advanced search:
(gene:cdc2c OR gene:rpl27) AND source:literature NOT database:GO
Field Codes
Field Code | Values | Example |
---|---|---|
gene | Gene ID, symbol, accession number, FBgn, CG, or locus tag (varies by organism, see Table 4) | gene:cdc2c |
source | Literature or Predicted | source:literature |
database | CORUM, PINdb, CYC2008, Gene Ontology (GO), DPiM, KEGG, SignaLink | database:corum |
name | words within complex names | name:cyclin |
species | Fly, Yeast, Human (original species from which ortholog was obtained) | species:human |
method | Prediction methods | method:coimmunoprecipitation |
reference | PubMed ID | reference:8560263 |
citation | PubMed ID | citation:8560263 |
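For users who assemble search strings programmatically before pasting them into the search box, a minimal Python sketch is shown below. It only concatenates text using the field codes from the table above; the helper names `term` and `any_of` are hypothetical, and nothing here talks to COMPLEAT itself.

```python
# Field codes taken from the "Field Codes" table.
FIELD_CODES = {
    "gene", "source", "database", "name",
    "species", "method", "reference", "citation",
}


def term(field, value):
    """Build a field-specific search term such as gene:cdc2c."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field}")
    return f"{field}:{value}"


def any_of(*terms):
    """Group terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Rebuild the advanced search example; operators are kept in uppercase.
query = " ".join([
    any_of(term("gene", "cdc2c"), term("gene", "rpl27")),
    "AND", term("source", "literature"),
    "NOT", term("database", "GO"),
])
```

Here `query` holds the same string as the advanced search example, so the helpers can be checked against it before building other searches.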
Network visualization of the complex: Users can click on a single complex, or select a set of complexes, shown as dots in the interactive scatter plot. When one or more complexes are selected, a network representation of each selected complex is displayed in the Cytoscape Web panel (right panel of the same page). Node colors correspond to the user input values and range from green (lowest value) to red (highest value). Gray nodes represent missing values (i.e. a protein that is present in the complex but missing from the user input data). Solid edges correspond to known PPIs, and broken edges correspond to interologs (i.e. protein pairs whose orthologous proteins in another species are known to physically interact). Network visualization and interactivity are provided by Cytoscape Web, including the ability to move nodes and to zoom into specific parts of the network.
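The node coloring described above can be approximated with a short Python sketch. The text only states that colors run from green at the lowest value to red at the highest, with gray for missing values; the linear RGB interpolation, the specific gray tone and the function name below are assumptions for illustration and may differ from the gradient the tool actually draws.

```python
GRAY = (128, 128, 128)  # assumed tone for proteins missing from the input


def node_color(value, lowest, highest):
    """Map an input value to an (R, G, B) color between green and red.

    value: the user input value for the protein, or None if it is missing.
    lowest, highest: the minimum and maximum input values.
    """
    if value is None:
        return GRAY
    if highest == lowest:
        # Degenerate range: treat every value as the lowest (green).
        fraction = 0.0
    else:
        fraction = (value - lowest) / (highest - lowest)
    red = round(255 * fraction)
    green = round(255 * (1 - fraction))
    return (red, green, 0)
```

With this mapping, the lowest value is pure green `(0, 255, 0)`, the highest is pure red `(255, 0, 0)`, and a protein absent from the uploaded data is drawn in gray.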
Table view: The user may view enriched complexes in a sortable table that includes the complex name, input scores and COMPLEAT-computed p-values.
Detailed complex information: From the network visualization or table views, the user may select complexes to view additional details, including complex name, purification method and reference citation (for literature-based complexes), prediction algorithm (for predicted complexes), co-citation and co-localization information. Cytoscape images of selected complexes will be displayed in pairs showing the data from original files as well as binary interactions.
Data download: Both the scatter plot and the Cytoscape images of selected complexes can be saved as png or jpg files. The table can be saved as a tab-separated text file.
Question | Answer |
---|---|
Are the files I upload saved on the server? | Uploaded files are not saved. Calculations are performed in memory, and the results are returned to your browser. |