Gene Set Enrichment Analysis (GSEA) is a method that tries to identify groups of genes that are regulated together. It is implemented in the orngGsea module, which is part of the Orange for Functional Genomics package. To use orngGsea, you need to install Orange for Functional Genomics.
GSEA takes gene expression data for multiple samples with their phenotypes and computes gene set enrichment for the given gene sets. To use it, call runGSEA with the following arguments:
Arguments
An ExampleTable with gene expression data. Each example should correspond to a sample with its phenotype (class value). Attributes represent individual genes, and their names should be meaningful gene aliases.
... classVar attribute descriptor are used.
Organism code: hsa for human, mmu for mouse. Default: hsa.
Permutation type. With "class", class values (phenotypes) are permuted. This is the default. However, if the number of samples is small (fewer than 10), it is advisable to use "gene" permutations, even though they ignore gene-gene interactions.
runGSEA returns a dictionary where each key is a gene set label and its value is a list of:
A note on gene name matching. Gene names are matched with the help of the KEGG database. Each gene from a gene set is compared against the genes in the data set. If an alias of a gene from the gene set is identical to an alias of a gene in the data set, the two are matched. Otherwise, both aliases are looked up in the KEGG database for the given organism; if they are aliases of the same gene, we have a match.
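The two-step matching rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the orngGsea implementation: the function name match_genes is hypothetical, and the kegg_aliases dictionary is a hand-written stand-in for a real KEGG lookup for one organism.

```python
def match_genes(gene_set, data_genes, kegg_aliases):
    """Map each gene-set alias to a data-set attribute name, if one matches.

    kegg_aliases: dict from a gene id to the set of its aliases. It is a
    hypothetical stand-in for a KEGG query, used for illustration only.
    """
    # Reverse index: alias -> ids of the genes that carry this alias.
    alias_to_ids = {}
    for gene_id, aliases in kegg_aliases.items():
        for alias in aliases:
            alias_to_ids.setdefault(alias, set()).add(gene_id)

    data_names = set(data_genes)
    matches = {}
    for alias in gene_set:
        # Step 1: the same alias appears in the data set.
        if alias in data_names:
            matches[alias] = alias
            continue
        # Step 2: both aliases belong to the same gene in the alias table.
        ids = alias_to_ids.get(alias, set())
        for candidate in data_genes:
            if ids & alias_to_ids.get(candidate, set()):
                matches[alias] = candidate
                break
    return matches


# Illustrative alias table; a real one would come from KEGG.
kegg_aliases = {
    "hsa:7157": {"TP53", "P53", "LFS1"},
    "hsa:672": {"BRCA1", "RNF53"},
}
data_genes = ["P53", "BRCA1", "MYC"]
gene_set = ["TP53", "BRCA1", "FOO"]

matched = match_genes(gene_set, data_genes, kegg_aliases)
```

Here "BRCA1" matches directly, "TP53" matches "P53" through the alias table, and "FOO" stays unmatched, so it would simply not be used.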
We present a simple usage example. The data used here are not gene expression data, so for the method to work we had to specify our own sets of attributes that seem to "belong together".
Corresponding output:
We can see that a "gene" labelled "petal color" was not used, because it couldn't be matched to any attribute in the data set.
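To make the difference between "class" and "gene" permutations described among the arguments more concrete, here is a minimal sketch. It is not how orngGsea implements permutations; the helper names and the toy labels and scores are assumptions made for illustration. It shows one common reading of the two schemes: shuffling phenotype labels across samples versus shuffling per-gene scores among gene names.

```python
import random


def class_permutation(labels, rng):
    # Shuffle phenotype labels across samples. The expression values are
    # untouched, so correlations between genes are preserved.
    permuted = list(labels)
    rng.shuffle(permuted)
    return permuted


def gene_permutation(gene_scores, rng):
    # Shuffle per-gene scores among gene names. Every gene is treated as
    # interchangeable, so gene-gene interactions are ignored.
    names = list(gene_scores)
    scores = [gene_scores[name] for name in names]
    rng.shuffle(scores)
    return dict(zip(names, scores))


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy inputs.
labels = ["tumor", "tumor", "normal", "normal"]
gene_scores = {"geneA": 2.1, "geneB": -0.4, "geneC": 0.9}

perm_labels = class_permutation(labels, rng)
perm_scores = gene_permutation(gene_scores, rng)
```

With four samples split two and two, there are only six distinct label assignments, which is one likely reason why class permutations become unreliable when the number of samples is small.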
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. PNAS, 2005.