Help


FIDEA is designed to help interpret the results of a differential expression (DE) analysis by assessing which functional categories are significantly enriched among the differentially expressed genes.

Input data

As a first step, the user has to select the species under examination. Currently the choice is among five species: human, mouse, zebrafish, fruit fly and yeast.

Next, the user has to upload the DE analysis results. It is possible to upload the cuffdiff results (gene_exp.diff) directly or, alternatively, a DE analysis result in another format. In the second case the user has to indicate the field separator (tab, semicolon or comma), whether the file includes a header and an experiment name, and which columns store the experiment name, the gene name, the fold change (as log2 or real values) and the corrected p-value. Although the tool was designed for the functional characterization of DE analysis results, it can also perform a functional enrichment analysis on a single gene list. Both input files (DE analysis results and single gene list) can use one of the following gene ID formats: Gene symbol, Entrez gene ID, Ensembl gene ID, UCSC gene ID, Refseq ID, ZFIN Gene ID, FlyBase ID, FlyBase annotation symbol, primary SGD ID, SGD systematic name.
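As an illustration of the column settings described above, the following minimal sketch reads a delimited DE result file. It is not FIDEA's own code: the function name read_de_table, its parameters and the sample data are hypothetical and only mirror the options listed in the text.

```python
import csv
import io
import math


def read_de_table(handle, delimiter="\t", has_header=True,
                  gene_col=0, fc_col=1, pval_col=2, fc_is_log2=True):
    """Return a list of (gene, log2_fold_change, corrected_p) tuples.

    All parameter names are hypothetical; they stand in for the choices
    the user makes on the submission form.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the header row
    rows = []
    for fields in reader:
        if not fields:
            continue  # ignore empty lines
        fold_change = float(fields[fc_col])
        if not fc_is_log2:
            # A real-valued fold change must be positive to be converted.
            fold_change = math.log2(fold_change)
        rows.append((fields[gene_col], fold_change, float(fields[pval_col])))
    return rows


# Made-up semicolon-separated input with real-valued fold changes.
example = io.StringIO("gene;fold_change;q_value\nTP53;4.0;0.01\nMYC;0.5;0.20\n")
rows = read_de_table(example, delimiter=";", fc_is_log2=False)
```

Converting real fold changes to log2 on input keeps every later step (filters, up/down classification) working on a single scale.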

The functional characterization is done on differentially expressed genes. By default, FIDEA considers as differentially expressed those genes with a corrected p-value lower than 0.05. However, the user can change the p-value threshold. The maximum corrected p-value that can be used as threshold is 0.1.
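The threshold rule can be sketched as follows. The names select_de_genes, DEFAULT_THRESHOLD and MAX_THRESHOLD are hypothetical; only the values 0.05 and 0.1 come from the text above.

```python
DEFAULT_THRESHOLD = 0.05  # default corrected p-value cut-off (from the text)
MAX_THRESHOLD = 0.1       # largest cut-off the user may choose (from the text)


def select_de_genes(rows, threshold=DEFAULT_THRESHOLD):
    """Keep the (gene, log2_fc, corrected_p) rows below the threshold."""
    if not 0 < threshold <= MAX_THRESHOLD:
        raise ValueError("threshold must be in (0, %s]" % MAX_THRESHOLD)
    return [row for row in rows if row[2] < threshold]


# Made-up rows for illustration.
rows = [("A", 2.1, 0.001), ("B", -1.3, 0.07), ("C", 0.4, 0.5)]
default_de = select_de_genes(rows)       # only gene A passes 0.05
relaxed_de = select_de_genes(rows, 0.1)  # genes A and B pass 0.1
```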

The user can also set the background for the over-representation analysis. This can either be made up of all the genes annotated in our database or, alternatively, be a customized list of genes. In the latter case, the accepted gene IDs are: Gene symbol, Entrez gene ID, Ensembl gene ID, UCSC gene ID, Refseq ID, ZFIN Gene ID, FlyBase ID, FlyBase annotation symbol, primary SGD ID, SGD systematic name.
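To show why the background matters, the sketch below computes an over-representation p-value with a one-sided hypergeometric test. This help page does not state which statistical test FIDEA uses, so the test itself is an assumption, and the function names and gene lists are hypothetical.

```python
from math import comb


def category_counts(de_genes, category_genes, background):
    """Restrict both gene sets to the background and return the four counts."""
    background = set(background)
    de = set(de_genes) & background
    category = set(category_genes) & background
    return len(background), len(category), len(de), len(de & category)


def enrichment_p_value(n_background, n_category, n_de, n_overlap):
    """P(X >= n_overlap) for a hypergeometric X (assumed test, see lead-in)."""
    total = comb(n_background, n_de)
    upper = min(n_category, n_de)
    tail = sum(
        comb(n_category, k) * comb(n_background - n_category, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up example: 10 background genes, 3 in the category, 2 DE genes.
background = ["g%d" % i for i in range(10)]
counts = category_counts(["g0", "g5"], ["g0", "g1", "g2"], background)
p_value = enrichment_p_value(*counts)
```

With a smaller customized background the same overlap becomes less surprising, so the p-value grows; this is why the choice of background changes which categories appear enriched.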

Preliminary statistics

The tool processes the input file, retaining for the analysis only the differentially expressed genes (those with a corrected p-value lower than the p-value threshold). It immediately reports the total number of genes in the input file, the number of differentially expressed genes and the number of those present in our database. If only a small fraction of the gene IDs is found in the database, the cause may be unsupported gene IDs in the input file or an incorrect selection of the organism under examination.

The list of the genes not recognized by FIDEA can be viewed by clicking on the "(see gene list)" link.

If more than one entry for the same gene is present in the input, the user is warned and information about the annotations of these entries is displayed. Only the entry with the lowest p-value is used in all subsequent analyses.
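The handling of repeated entries can be sketched as below. The function name keep_lowest_p and the sample rows are hypothetical; the rule (keep the entry with the lowest p-value) is the one stated above.

```python
def keep_lowest_p(rows):
    """Collapse repeated genes, keeping the row with the lowest p-value.

    Returns the de-duplicated rows (in order of first appearance) and the
    sorted list of genes that appeared more than once.
    """
    best = {}
    duplicated = set()
    for gene, log2_fc, corrected_p in rows:
        if gene in best:
            duplicated.add(gene)
            if corrected_p < best[gene][2]:
                best[gene] = (gene, log2_fc, corrected_p)
        else:
            best[gene] = (gene, log2_fc, corrected_p)
    return list(best.values()), sorted(duplicated)


# Made-up rows: gene A appears twice.
rows = [("A", 1.0, 0.04), ("B", -2.0, 0.01), ("A", 1.8, 0.002)]
unique_rows, repeated = keep_lowest_p(rows)
```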

A graph reporting the distribution of the absolute values of the log2 fold changes for each experiment is shown. The user can customize the analysis by restricting it to the genes with an absolute log2 fold change above a certain value. The effects of this filter are immediately visualized.

A graph reporting the distribution of the p-values for each experiment is shown. The user can customize the analysis by restricting it to the genes with a p-value below a certain value. The effects of this filter are immediately visualized.

A graph reporting the number of differentially expressed genes for each experiment is redrawn each time the fold change and/or the p-value filters are applied. Once the user clicks the “Start Analysis” button, the tool performs the functional interpretation of the differentially expressed genes.
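The combined effect of the two filters on the per-experiment counts can be sketched as follows. The function name count_de_genes, the row layout and the data are hypothetical; treating the fold change cut-off as inclusive is also an assumption.

```python
from collections import Counter


def count_de_genes(rows, p_threshold=0.05, min_abs_log2_fc=0.0):
    """Count, per experiment, the genes that pass both filters.

    rows holds (experiment, gene, log2_fc, corrected_p) tuples.
    """
    counts = Counter()
    for experiment, gene, log2_fc, corrected_p in rows:
        # Assumption: the fold change cut-off is inclusive (>=).
        if corrected_p < p_threshold and abs(log2_fc) >= min_abs_log2_fc:
            counts[experiment] += 1
    return dict(counts)


# Made-up rows for two experiments.
rows = [
    ("exp1", "A", 2.5, 0.001),
    ("exp1", "B", -0.4, 0.01),
    ("exp1", "C", 1.2, 0.2),
    ("exp2", "A", -3.0, 0.04),
]
unfiltered = count_de_genes(rows)
filtered = count_de_genes(rows, min_abs_log2_fc=1.0)
```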

Compare different experiments

If a file with multiple experiments is given as input, the user can intersect the DE genes (upregulated or downregulated) derived from the different experiments. A Venn diagram and a list of the genes in the intersection are shown.

The user can subsequently perform a functional characterization of this list of genes.
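The intersection step amounts to a set intersection per direction of change. In the minimal sketch below, the function name intersect_de_genes, the input layout and the gene names are hypothetical.

```python
def intersect_de_genes(experiments, direction="up"):
    """Return the genes changed in the same direction in every experiment.

    experiments maps an experiment name to a list of (gene, log2_fc) pairs
    that are already restricted to DE genes.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    gene_sets = []
    for rows in experiments.values():
        if direction == "up":
            gene_sets.append({gene for gene, fc in rows if fc > 0})
        else:
            gene_sets.append({gene for gene, fc in rows if fc < 0})
    return set.intersection(*gene_sets) if gene_sets else set()


# Made-up DE genes for two experiments.
experiments = {
    "exp1": [("A", 2.0), ("B", -1.5), ("C", 1.1)],
    "exp2": [("A", 1.2), ("B", -0.8), ("C", -2.0)],
}
shared_up = intersect_de_genes(experiments, "up")
shared_down = intersect_de_genes(experiments, "down")
```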

Functional characterization results

The functional classes considered are: KEGG, InterPro, Gene Ontology Molecular Function, Gene Ontology Biological Process, Gene Ontology Cellular Component and GoSlim. For each of them, two types of functional characterization are reported: one considering upregulated genes and downregulated genes as independent gene sets, "Upregulated and Downregulated as separate groups", and one considering all DE genes as a single gene set, "Upregulated and Downregulated as a single group".
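Because many functional categories are tested at once, the p-values reported in the results are corrected for multiple testing. This help page does not say which correction FIDEA applies; the sketch below uses the Benjamini-Hochberg procedure purely as an assumed example, with a hypothetical function name and made-up p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: this correction is shown only as a common example.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# One raw p-value per functional category (made-up numbers).
raw = [0.01, 0.04, 0.03, 0.005]
corrected = benjamini_hochberg(raw)
```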

The "Upregulated and Downregulated as separate groups" analysis produces an interactive heatmap showing the differences between the corrected p-values of upregulated genes and those of downregulated genes.

The functional categories significantly enriched (corrected p-value <0.05) in at least one of the two groups are reported in the graph. The color code reflects the absolute log10 of the corrected p-value. A compact view of the whole graph is shown; the user can zoom in and out of the graph by using the mouse scroll wheel. Once the best view is chosen, it is possible to scroll through all the categories in the graph by using the up and down arrow keys.
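The row selection and color scale of the heatmap can be sketched as follows. The function name heatmap_rows and the category data are hypothetical; the 0.05 cut-off and the absolute log10 transformation are taken from the text.

```python
import math

SIGNIFICANCE = 0.05  # corrected p-value cut-off stated in the text


def heatmap_rows(categories):
    """Select the categories to draw and compute their color values.

    categories maps a category name to (corrected_p_up, corrected_p_down);
    both p-values are assumed to be greater than zero.
    """
    rows = {}
    for name, (p_up, p_down) in categories.items():
        # Keep a category if it is significant in at least one group.
        if p_up < SIGNIFICANCE or p_down < SIGNIFICANCE:
            rows[name] = (abs(math.log10(p_up)), abs(math.log10(p_down)))
    return rows


# Made-up corrected p-values.
categories = {
    "apoptosis": (0.001, 0.5),
    "cell cycle": (0.2, 0.3),
}
rows = heatmap_rows(categories)
```

A smaller corrected p-value gives a larger absolute log10 and therefore a more intense color in the corresponding cell.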

By clicking on a cell, the user can see the DE genes annotated in that specific functional category. Interactive graphs are made through the CanvasXpress JavaScript library. By using the menu buttons at the top right, it is possible to order the rows of the heatmap in ascending or descending order according to the p-value of one of the two sets. Moreover, the user can filter, resize and save the heatmap as an image.

FIDEA also provides a publication-ready heatmap, which shows at most 60 categories, those with the smallest corrected p-values. The results are also presented as tables. The user can visualize both an interactive table with the significantly enriched categories (corrected p-value <0.05) and a textual table reporting all the functional categories (including those that are not significantly enriched).

In the "Upregulated and Downregulated as a single group" analysis, DE genes are considered as a single class, without distinguishing the sign of their fold change. The interactive graph reports the functional categories significantly enriched (corrected p-value <0.05), in ascending order of corrected p-value.

A compact view of the whole graph is shown; the user can zoom in and out of the graph by using the mouse scroll wheel. Once the best view is chosen, it is possible to scroll through all the categories in the graph by using the up and down arrow keys. The colored bar reports the fraction of upregulated and downregulated genes present in that functional category. By clicking on the bar, the user can see the DE genes identified in the selected functional category.

It is also possible to filter, resize and save the graph as an image by using the menu buttons at the top right.

A word cloud reporting the results of the enrichment analysis is also generated: the font size is proportional to the significance of the corrected p-values (the larger the letters, the smaller the p-values), and the color reflects the fraction of upregulated and downregulated genes.
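One way to obtain such a mapping is sketched below. The exact scaling used by FIDEA is not documented here, so the linear mapping of -log10(p) to a font size, the size limits and the function names are all assumptions.

```python
import math


def font_size(corrected_p, min_size=10.0, max_size=40.0, max_score=10.0):
    """Map a corrected p-value to a font size (hypothetical linear scale).

    The score is -log10(p), clipped to the range [0, max_score].
    """
    score = -math.log10(corrected_p)
    score = max(0.0, min(score, max_score))
    return min_size + (max_size - min_size) * score / max_score


def up_fraction(n_up, n_down):
    """Fraction of upregulated genes in a category, used to pick the color."""
    total = n_up + n_down
    return n_up / total if total else 0.0


# Made-up values for illustration.
largest = font_size(1e-20)   # score is clipped, so the word gets max_size
smallest = font_size(1.0)    # not significant at all, so min_size
fraction = up_fraction(3, 1)
```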

The results are also presented as tables. The user can visualize both an interactive table with the significantly enriched categories (corrected p-value <0.05) and a static table reporting all the functional categories (including those that are not significantly enriched).

Examples

The user can explore the functionality of FIDEA by using three different examples, which can be loaded by clicking on the corresponding link at the top of the “Submission Data” page.

Example 1 (Ex_1) is a cuffdiff result, directly derived from NCBI GEO (Series GSE37703).

Example 2 (Ex_2) contains two different experiments, both derived from the same NCBI GEO Series (GSE38377) and merged into a single file.

Example 3 (Ex_3) is a functional characterization of a single gene list, derived from the supplementary materials of the article by Zhang et al. [PMID: 23295773].