orngChem implements the following classes:
- FragmentMiner : The main class that does the search
- Fragment : Representation of the fragment
- Fragmenter : A class that is used to fragment an ExampleTable
- FragmentBasedLearner : A learner wrapper class that first runs the molecular fragmentation on the data
FragmentMiner
A class for finding frequent molecular fragments
Attributes
- active
- list of SMILES codes of active molecules
- inactive
- list of SMILES codes of inactive molecules
- minSupport
- minimum frequency in the active set of the fragments to search for
- maxSupport
- maximum frequency in the inactive set of the fragments to search for
- addWholeRings
- if True, rings will be added as a whole rather than atom by atom
- canonicalPruning
- if True, a cache of the canonical codes of all fragments will be kept to avoid redundant search
- findClosed
- finds only fragments that are not sub-structures of any other fragment with the same support (default: True)
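The effect of findClosed can be illustrated with a small plain-Python sketch. This is not orngChem's implementation: it assumes that support is given as a simple count and that the sub-structure relation is already known and supplied as a mapping. The names closed_fragments, supports and substructure_of are hypothetical and exist only for this illustration.

```python
def closed_fragments(supports, substructure_of):
    """Keep fragments that have no super-structure with the same support.

    supports        -- dict: fragment -> support count (assumed given)
    substructure_of -- dict: fragment -> list of fragments that contain it
                       (assumed precomputed; no chemistry is done here)
    """
    closed = []
    for frag in sorted(supports):
        has_equal_super = any(
            supports[sup] == supports[frag]
            for sup in substructure_of.get(frag, ())
        )
        if not has_equal_super:
            closed.append(frag)
    return closed


# Toy data: "CC" is part of "CCN" and "CCNC"; "CCN" is part of "CCNC".
supports = {"CC": 3, "CCN": 3, "CCNC": 1}
substructure_of = {"CC": ["CCN", "CCNC"], "CCN": ["CCNC"]}

closed = closed_fragments(supports, substructure_of)
print(closed)  # prints ['CCN', 'CCNC']
```

In this toy data "CC" is dropped because the larger fragment "CCN" occurs equally often, so reporting "CC" separately would add no information.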
Methods
- Search()
- Runs the fragment search algorithm and returns a list of found fragments
Example
from orngChem import FragmentMiner
# Three amino acids as the active set; no inactive molecules
miner = FragmentMiner(active=["NC(C)C(=O)O", "NC(CS)C(=O)O", "NC(CO)C(=O)O"], inactive=[], minSupport=0.6)
for fragment in miner.Search():
    print("%s Support: %.3f" % (fragment.ToSmiles(), fragment.Support()))
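How minSupport and maxSupport interact can be sketched in plain Python. The snippet below is only an illustration, not the search algorithm used by FragmentMiner. It assumes that support is the fraction of molecules in a set that contain the fragment, that both thresholds are inclusive, and that an empty set gives a support of 0. The helper names and the precomputed occurrences table are hypothetical.

```python
def support(containing, molecules):
    """Fraction of `molecules` that contain the fragment (0.0 for an empty set)."""
    if not molecules:
        return 0.0
    return len(containing & set(molecules)) / float(len(molecules))


def select_fragments(occurrences, active, inactive, min_support, max_support):
    """Keep fragments frequent in the active set and rare in the inactive set.

    occurrences -- dict: fragment -> set of molecule ids containing it
                   (assumed precomputed; no substructure matching is done here)
    """
    selected = []
    for fragment, containing in sorted(occurrences.items()):
        if (support(containing, active) >= min_support
                and support(containing, inactive) <= max_support):
            selected.append(fragment)
    return selected


active = ["a1", "a2", "a3"]
inactive = ["i1", "i2"]
occurrences = {
    "frag_A": {"a1", "a2", "a3"},        # in every active, in no inactive
    "frag_B": {"a1", "a2", "i1", "i2"},  # frequent in both sets
    "frag_C": {"a1"},                    # too rare in the active set
}

selected = select_fragments(occurrences, active, inactive, 0.6, 0.5)
print(selected)  # prints ['frag_A']
```

Here frag_B passes the minSupport test (2 of 3 actives) but is rejected because it occurs in every inactive molecule, and frag_C fails the minSupport test.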
Fragment
A class representing a molecular fragment
Methods
- ToOBMol()
- Returns an openbabel.OBMol object representation
- ToSmiles()
- Returns a SMILES code representation
- ToCanonicalSmiles()
- Returns a canonical SMILES code representation
- Support()
- Returns the support of the fragment in the active set
- OcurrencesIn(smiles)
- Returns the number of times the fragment is contained in the molecule represented by the smiles code argument
- ContainedIn(smiles)
- Returns True if the fragment is present in the molecule represented by the smiles code argument
Fragmenter
An object that is used to fragment an ExampleTable
Attributes
- minSupport
- minimum frequency in the active set of the fragments to search for (default: 0.2)
- maxSupport
- maximum frequency in the inactive set of the fragments to search for (default: 0.2)
- findClosed
- finds only fragments that are not sub-structures of any other fragment with the same support (default: True)
Methods
- __call__(data, smilesAttr, activeFunc)
- Takes a data-set, and runs the FragmentMiner on it. Returns a new data-set and the fragments.
The new data-set contains new attributes that represent the presence of a fragment that was found.
Arguments
- data
- the dataset
- smilesAttr
- the attribute in the data that contains the SMILES codes (if none is provided it will try to make a smart guess)
- activeFunc
- a function that takes an example from the data-set and returns True if the example should be
considered as active (if none is provided all examples are considered active)
Example
fragmenter = Fragmenter(minSupport=0.1, maxSupport=0.05)
data, fragments = fragmenter(data, "SMILES")
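The shape of the data-set returned by the Fragmenter can be pictured with a plain-Python sketch. It does not use ExampleTable or Open Babel; rows are ordinary dicts, and substructure matching is replaced by a hand-written lookup table. The function add_fragment_attributes, the contains predicate and the known table are hypothetical and serve only to show one presence attribute being added per fragment.

```python
def add_fragment_attributes(rows, fragments, contains, smiles_attr="SMILES"):
    """Return copies of `rows` with one boolean attribute per fragment.

    contains -- callable (fragment, smiles) -> bool; a stand-in for real
                substructure matching, supplied by the caller
    """
    new_rows = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for fragment in fragments:
            new_row[fragment] = contains(fragment, row[smiles_attr])
        new_rows.append(new_row)
    return new_rows


ALANINE = "NC(C)C(=O)O"
CYSTEINE = "NC(CS)C(=O)O"

# Hand-written containment facts used instead of substructure matching.
known = {("C(=O)O", ALANINE), ("C(=O)O", CYSTEINE), ("CS", CYSTEINE)}


def contains(fragment, smiles):
    return (fragment, smiles) in known


rows = [{"SMILES": ALANINE}, {"SMILES": CYSTEINE}]
fragmented = add_fragment_attributes(rows, ["C(=O)O", "CS"], contains)

for row in fragmented:
    print(row["SMILES"], row["C(=O)O"], row["CS"])
```

Each output row keeps its original attributes and gains one True/False value per fragment, which is the kind of table a learner can then be trained on.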
FragmentBasedLearner
A learner wrapper class that first runs the molecular fragmentation on the data.
Attributes
- smilesAttr
- attribute in the data that contains the SMILES codes (if none is provided it will try to make a smart guess)
- learner
- learner that will be used to actually learn on the fragmented data (default: orngSVM.SVMLearner)
- minSupport
- minimum frequency in the active set of the fragments to search for
- maxSupport
- maximum frequency in the inactive set of the fragments to search for
- activeFunc
- a function that takes an example from the learning data-set and returns True if the example should be
considered as active (if none is provided all examples are considered active)
- findClosed
- finds only fragments that are not sub-structures of any other fragment with the same support (default: True)
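The wrapper idea, fragment first and then hand the result to an ordinary learner, can be sketched in plain Python. This is a minimal illustration of the pattern, not the FragmentBasedLearner code: the class FragmentThenLearn and both toy functions are hypothetical, the toy fragmenter uses a crude substring test instead of substructure matching, and how the real class treats new examples at prediction time is not covered here.

```python
class FragmentThenLearn(object):
    """Run a fragmenter on the data, then train a learner on the result."""

    def __init__(self, fragmenter, learner):
        self.fragmenter = fragmenter  # callable: data -> (new_data, fragments)
        self.learner = learner        # callable: new_data -> model

    def __call__(self, data):
        fragmented, fragments = self.fragmenter(data)
        model = self.learner(fragmented)
        return model, fragments


def toy_fragmenter(data):
    # Stand-in: flags rows whose SMILES string contains the letter "S".
    fragments = ["CS"]
    fragmented = [dict(row, CS=("S" in row["SMILES"])) for row in data]
    return fragmented, fragments


def toy_learner(rows):
    # Stand-in: always predicts the most common training label.
    labels = [row["active"] for row in rows]
    majority = max(set(labels), key=labels.count)
    return lambda row: majority


data = [
    {"SMILES": "NC(C)C(=O)O", "active": True},
    {"SMILES": "NC(CS)C(=O)O", "active": True},
    {"SMILES": "NC(CO)C(=O)O", "active": False},
]

learn = FragmentThenLearn(toy_fragmenter, toy_learner)
model, fragments = learn(data)
print(fragments)                  # prints ['CS']
print(model({"SMILES": "CCO"}))   # prints True
```

Keeping the fragmenter and the learner as separate callables mirrors the learner attribute described above: any learner that accepts the fragmented data can be plugged in.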