orngMeSH

The orngMeSH module calculates MeSH term enrichment and annotates examples supported by the MeSH ontology (chemical compounds identified by CID, medical articles identified by PMID).

orngMeSH

orngMeSH is the main class for all tasks related to the MeSH ontology.

Methods

orngMeSH()
The constructor takes no arguments.
setDataDir(path)
Sets the path to the directory containing the MeSH ontology and the CID and PMID annotations.
getDataDir()
Returns the current data directory path.
downloadOntology(callback=None)
Downloads a new MeSH ontology from the internet. Because the MeSH ontology is quite large (about 60 MB), this may take a while.
findFrequentTerms(data, minSizeInTerm, callback=None)
Returns a dictionary where keys are MeSH term ids and values are integers giving the number of examples annotated with the corresponding MeSH term. The argument data has to be an instance of ExampleTable. With the argument minSizeInTerm you can keep only MeSH terms that have at least minSizeInTerm annotated examples.
findEnrichedTerms(reference, cluster, pThreshold=0.05, callback=None)
Returns a dictionary where keys are MeSH terms and values are lists of four values (number of annotated reference examples, number of annotated cluster examples, MeSH term enrichment, fold enrichment). With the argument pThreshold you can limit the returned dictionary to MeSH terms whose enrichment is less than or equal to the given threshold. Both data sets (reference and cluster) have to be instances of ExampleTable.
printMeSH(data, selection = ["term","r","c", "p"])
Pretty-prints a dictionary returned by findFrequentTerms or findEnrichedTerms. When printing a dictionary of enriched MeSH terms (returned by findEnrichedTerms), you can also specify which properties to print and in what order. At the moment you can choose among "term" (MeSH term name), "desc" (MeSH term description), "r" (number of examples from reference), "c" (number of examples from cluster), "p" (MeSH term enrichment) and "fold" (fold enrichment).
findTerms(ids, idType="cid")
Returns a dictionary where keys are the members of the list ids and values are lists of MeSH terms that apply to each key. With idType you can choose the annotation: "cid" or "pmid" (support for "pmid" is still marked as TODO).
parsePubMed(filename, attributes = ["pmid", "title","abstract","mesh"], skipExamplesWithout = ["mesh"])
Parses a PubMed XML file (search results saved in XML format) into an Orange ExampleTable. You can restrict the result to selected attributes. At the moment the supported attributes are "pmid" (PubMed ID), "title" (article title), "abstract" (article abstract), "mesh" (MeSH terms) and "affilation". Judging by its name, the argument skipExamplesWithout lists the attributes an article must have to be included.
findSubset(examples, meshTerms, callback = None)
Returns a new data set (a subset of examples) containing the examples that are annotated with one or more MeSH terms from the list meshTerms. The argument examples has to be an instance of ExampleTable.
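The behaviour described for findTerms and findSubset can be sketched in plain Python, without Orange. This is only an illustration under stated assumptions: the names find_terms and find_subset, the dictionary-based annotation and the example records are hypothetical stand-ins, not the module's implementation, and it is an assumption that ids without any annotation are simply left out of the result.

```python
def find_terms(ids, annotation):
    """Return {id: list of terms} for every id found in the annotation.

    `annotation` is a hypothetical stand-in for a CID or PMID annotation:
    a plain dict mapping an id to a list of MeSH term names.
    """
    return {i: list(annotation[i]) for i in ids if i in annotation}


def find_subset(examples, mesh_terms):
    """Keep the examples annotated with at least one term from mesh_terms.

    Each example is a dict with a "mesh" entry (a list of term names);
    this replaces the ExampleTable used by the real module.
    """
    wanted = set(mesh_terms)
    return [ex for ex in examples if wanted & set(ex["mesh"])]


# Made-up annotation and examples, for illustration only.
annotation = {11: ["term alpha"], 12: ["term alpha", "term beta"]}
examples = [
    {"id": 11, "mesh": ["term alpha"]},
    {"id": 12, "mesh": ["term alpha", "term beta"]},
    {"id": 13, "mesh": ["term gamma"]},
]

terms = find_terms([11, 12, 99], annotation)
subset = find_subset(examples, ["term beta", "term gamma"])
```

Here terms contains entries for ids 11 and 12 only, and subset keeps the examples with ids 12 and 13.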

Attributes

toID
Dictionary toID provides a mapping from a MeSH term to its MeSH term ids. Please note that some MeSH terms have more than one MeSH term id (a one-to-many relation).
toName
Dictionary toName provides a mapping from a MeSH term id to its MeSH term.
toDesc
Dictionary toDesc provides a mapping from a MeSH term to its MeSH term description.
fromCID
Dictionary fromCID provides a mapping from a CID (compound id) to a list of MeSH terms.
fromPMID
Dictionary fromPMID provides a mapping from a PMID (PubMed id) to a list of MeSH terms.
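The one-to-many relation between terms and ids can be illustrated with ordinary dictionaries. The sketch below does not use orngMeSH; the ids, term names and compound ids are invented for illustration, and the variables to_name, to_id and from_cid only mimic the shape of the attributes described above.

```python
from collections import defaultdict

# Hypothetical (term id, term name) pairs; "term beta" has two ids.
id_name_pairs = [
    ("X01", "term alpha"),
    ("X02.5", "term beta"),
    ("Y07", "term beta"),
]

# id -> term name (one name per id), in the spirit of toName.
to_name = dict(id_name_pairs)

# term name -> list of ids (one-to-many), in the spirit of toID.
to_id = defaultdict(list)
for term_id, name in id_name_pairs:
    to_id[name].append(term_id)
to_id = dict(to_id)

# compound id -> list of term names, in the spirit of fromCID.
from_cid = {101: ["term alpha", "term beta"], 102: ["term beta"]}

# All term ids that apply to compound 101.
ids_for_101 = sorted(tid for term in from_cid[101] for tid in to_id[term])
```

Because a term can map to several ids, a lookup that goes from a compound to term names and then to ids may return more ids than there are terms.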

Examples

Basic operations on MeSH ontology

In our first example, we show how to work with the MeSH ontology. Let's start with a simple mapping between MeSH terms and their ids. This is done by the following code:

part of mesh1.py

import orange
import orngMeSH

Calculating MeSH term frequency and MeSH term enrichment

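The idea behind term frequency and term enrichment can be sketched without Orange. The code below is a minimal illustration, not the module's implementation: it assumes that enrichment is a one-sided hypergeometric p-value and that the cluster is a subset of the reference, neither of which is stated in this documentation. All function names and the data are hypothetical.

```python
from math import comb


def count_terms(examples):
    """Count how many examples are annotated with each term."""
    counts = {}
    for terms in examples.values():
        for term in set(terms):
            counts[term] = counts.get(term, 0) + 1
    return counts


def find_frequent_terms(examples, min_size):
    """Keep only the terms annotated on at least min_size examples."""
    return {t: n for t, n in count_terms(examples).items() if n >= min_size}


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when drawing n of N examples, K of which carry the term."""
    hits = range(k, min(K, n) + 1)
    return sum(comb(K, i) * comb(N - K, n - i) for i in hits) / comb(N, n)


def find_enriched_terms(reference, cluster, p_threshold=0.05):
    """Return {term: [r, c, p, fold]} for terms with p <= p_threshold."""
    N, n = len(reference), len(cluster)
    ref_counts = count_terms(reference)
    enriched = {}
    for term, c in count_terms(cluster).items():
        r = ref_counts.get(term, 0)
        if r == 0:  # term absent from the reference: skipped in this sketch
            continue
        p = hypergeom_tail(N, r, n, c)
        fold = (c / n) / (r / N)
        if p <= p_threshold:
            enriched[term] = [r, c, p, fold]
    return enriched


# Made-up data: 10 reference examples, "A" on the first 4, "B" on all.
reference = {i: ({"A", "B"} if i < 4 else {"B"}) for i in range(10)}
cluster = {i: reference[i] for i in range(3)}

frequent = find_frequent_terms(reference, min_size=5)
enriched = find_enriched_terms(reference, cluster)
```

With this data only "B" is frequent (it annotates all 10 reference examples), while only "A" is enriched in the cluster: all 3 cluster examples carry it, against 4 of the 10 reference examples.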

Parsing PubMed XML data

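What parsing a PubMed XML export involves can be sketched with the Python standard library alone. This is an illustration under stated assumptions rather than the parsePubMed implementation: the function parse_pubmed is hypothetical, it returns a list of dictionaries instead of an ExampleTable, and the two records in the sample XML are made up. The element names follow the PubMed XML export format.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the PubMed XML layout; the second record has no MeSH terms.
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>First made-up title</ArticleTitle>
        <Abstract><AbstractText>Made-up abstract.</AbstractText></Abstract>
      </Article>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Term One</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Term Two</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article><ArticleTitle>No MeSH here</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def parse_pubmed(xml_text, skip_without=("mesh",)):
    """Return a list of dicts with pmid, title, abstract and mesh entries.

    Records in which any attribute named in skip_without is empty are dropped.
    """
    records = []
    root = ET.fromstring(xml_text)
    for article in root.iter("PubmedArticle"):
        citation = article.find("MedlineCitation")
        if citation is None:
            continue
        abstract_parts = citation.findall("Article/Abstract/AbstractText")
        mesh_nodes = citation.findall("MeshHeadingList/MeshHeading/DescriptorName")
        record = {
            "pmid": citation.findtext("PMID", default=""),
            "title": citation.findtext("Article/ArticleTitle", default=""),
            "abstract": " ".join(part.text or "" for part in abstract_parts),
            "mesh": [node.text for node in mesh_nodes if node.text],
        }
        if any(not record[key] for key in skip_without):
            continue
        records.append(record)
    return records


records = parse_pubmed(SAMPLE_XML)
```

Only the first sample record survives, because the second one has no MeSH headings and "mesh" is listed in skip_without.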

Advanced: using MeSH terms relationship data


mds3.py (uses reference.tab and cluster.tab)

import orange
import orngMeSH

i = 0
while 100 > i:
    i += 1