Domain depots store the descriptors of domains loaded from files so that they can be reused when possible. They are used internally by all file formats built into Orange (except for the basket format); you can use them if you write functions for reading other data formats. More generally, you can use depots whenever you want to store and reuse domain descriptors, or simply as a convenient way of constructing domains without any intention of storing them.
Methods
Attribute names must be prefixed in a way similar to Orange's .txt file format. First comes an optional character that marks the attribute as a meta attribute ('m') or the class attribute ('c'). Only one attribute can be marked as the class attribute. Then follows the obligatory type character: 'D', 'C' or 'S' for discrete, continuous or string attributes. The next character must be '#', and what remains is the actual attribute name. For instance, if attribute names are given as ['mS#name', 'C#age', 'D#gender', 'D#race', 'cC#total', 'mS#SSN'], the constructed domain will have three attributes (continuous "age" followed by discrete "gender" and "race") and a continuous class attribute "total"; in addition, there will be two meta attributes, strings named "name" and "SSN".
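The prefix convention described above can be sketched in plain Python. This is only an illustration of the naming rules, not Orange's own parser; the helper names parse_name and split_names are hypothetical, and the sketch does not construct real Orange descriptors.

```python
# Hypothetical helpers illustrating the prefix rules; not part of Orange.
TYPES = {"D": "discrete", "C": "continuous", "S": "string"}


def parse_name(spec):
    """Split a prefixed name such as 'mS#name' into (role, type, name)."""
    prefix, sep, name = spec.partition("#")
    if not sep or not name:
        raise ValueError("missing '#' or attribute name in %r" % spec)
    if len(prefix) == 1:
        role, type_char = "attribute", prefix
    elif len(prefix) == 2 and prefix[0] in "mc":
        role = "meta" if prefix[0] == "m" else "class"
        type_char = prefix[1]
    else:
        raise ValueError("bad prefix in %r" % spec)
    if type_char not in TYPES:
        raise ValueError("unknown type character in %r" % spec)
    return role, TYPES[type_char], name


def split_names(specs):
    """Group prefixed names into ordinary attributes, class and metas."""
    attributes, metas, class_var = [], [], None
    for spec in specs:
        role, kind, name = parse_name(spec)
        if role == "class":
            if class_var is not None:
                raise ValueError("only one attribute can be marked as class")
            class_var = (name, kind)
        elif role == "meta":
            metas.append((name, kind))
        else:
            attributes.append((name, kind))
    return attributes, class_var, metas


names = ['mS#name', 'C#age', 'D#gender', 'D#race', 'cC#total', 'mS#SSN']
attributes, class_var, metas = split_names(names)
print(attributes, class_var, metas)
```

For the example list, the sketch yields three ordinary attributes, the class "total" and the two string metas, in line with the description above.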
knownAttributes is an optional argument, a list of attributes that can be reused in case the domain is not found among the stored domains. Similarly, knownMetaAttributes provides a dictionary of known meta attributes, with IDs as keys and Variables as values.
dontStore and dontCheckStored are flags that prevent the function from storing the new domain and from searching among the stored domains, respectively.
The function returns a tuple of three elements: the constructed domain, the list of IDs assigned to meta attributes (in the same order as they appear in the list of attribute names), and a flag telling whether the domain was constructed anew or retrieved from the stored domains.
prepareDomain first calls checkDomain for each stored domain and returns the first one for which the comparison succeeds (if none is found, a new domain is constructed). In your programs, you might want to call checkDomain yourself when the user proposes a domain to be reused.
The function returns a tuple. The first element tells whether the domain matches. The second element is a list which, if the domain matches, contains the meta attribute IDs, just like the one returned by prepareDomain.
Note that although this is a method of class DomainDepot, it does not use any of its data. It is there only for convenience in the C++ code (where it is declared as a static member).
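To make the comparison concrete, here is a minimal sketch of such a check in plain Python. It does not call Orange: the toy domain is a hypothetical dictionary holding the ordinary (non-meta) prefixed names in order and a mapping from meta IDs to prefixed names, and check_domain is an illustrative stand-in for checkDomain. Returning an empty list on a mismatch is an assumption, since the text only specifies the second element for matching domains.

```python
def check_domain(domain, names):
    """Return (matches, meta_ids) for a toy domain and prefixed names.

    Simplification: ordinary names (including the class) must appear in
    the same order as in the domain; meta names may come in any order.
    """
    ordinary = [n for n in names if not n.startswith("m")]
    metas = [n for n in names if n.startswith("m")]
    if ordinary != domain["ordinary"]:
        return False, []
    id_by_spec = {spec: meta_id for meta_id, spec in domain["metas"].items()}
    if len(metas) != len(id_by_spec) or set(metas) != set(id_by_spec):
        return False, []
    # IDs are reported in the order in which the metas appear in `names`.
    return True, [id_by_spec[m] for m in metas]


# Hypothetical stand-in for a stored domain descriptor.
domain = {
    "ordinary": ['C#age', 'D#gender', 'D#race', 'cC#total'],
    "metas": {-2: 'mS#name', -3: 'mS#SSN'},
}

same = check_domain(
    domain, ['mS#name', 'C#age', 'D#gender', 'D#race', 'cC#total', 'mS#SSN'])
swapped = check_domain(
    domain, ['mS#SSN', 'C#age', 'D#gender', 'D#race', 'cC#total', 'mS#name'])
retyped = check_domain(
    domain, ['mS#name', 'C#age', 'D#gender', 'C#race', 'cC#total', 'mS#SSN'])
print(same, swapped, retyped)
```

Like the method described above, the function takes no state from a depot, which is why it is written here as a free function.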
Depots are generally used for constructing domains, like this.
part of domainDepot.py
This will print
The domain is as expected. In the list of IDs of meta attributes, each element corresponds to a meta attribute in the same order as they are given in the list of names. Here, -2 corresponds to the first ('mS#name') and -3 to the second ('mS#SSN').
If we call the function again, but with the order of the meta attributes changed,
the domain is reused (the order of meta attributes is irrelevant), so isNew is false; metaIDs now equals [-3, -2], since the first meta attribute in the list has the ID -3 and the second -2. If you don't find this useful, wait till you program your own routines for reading data from files.
On the other hand, if you change the order, type or name of any of the attributes, an entirely new domain is constructed and new IDs are assigned to its meta attributes.
With the first two optional arguments, we can request reuse of attributes even when a new domain is constructed.
part of domainDepot.py
Here we simply told prepareDomain to use whatever it finds useful among the domain's attributes and meta attributes. The printout reveals that although the domain descriptor is new, the attribute descriptors for 'SSN' and 'gender' are reused, while 'race' is not, since its type changed to continuous, and 'total' is not, since we only passed domain.attributes, which does not include the class attribute (we could have used domain.variables or domain.attributes + [domain.classVar] instead). The ID for the meta attribute 'SSN' is also the same as before.
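The reuse rule described here (a known descriptor is taken over only when both its name and its type match) can be sketched as follows. The Var class and pick_variables are hypothetical stand-ins written for this illustration; they are not Orange's Variable class or the depot's actual lookup code.

```python
class Var:
    """Hypothetical stand-in for an attribute descriptor."""

    def __init__(self, name, kind):
        self.name, self.kind = name, kind


def pick_variables(wanted, known):
    """Reuse a known descriptor when name and type match, else make a new one."""
    by_key = {(v.name, v.kind): v for v in known}
    result = []
    for name, kind in wanted:
        var = by_key.get((name, kind))
        result.append(var if var is not None else Var(name, kind))
    return result


# Known descriptors: ordinary attributes only, without the class 'total'.
known = [Var("age", "C"), Var("gender", "D"), Var("race", "D")]

# 'race' is now requested as continuous; 'total' was never among the known.
wanted = [("gender", "D"), ("race", "C"), ("total", "C")]
gender, race, total = pick_variables(wanted, known)

print(gender is known[1])   # same object: descriptor reused
print(race is known[2])     # different object: type changed
```

Comparing with `is` rather than `==` is deliberate: what matters for reuse is that the very same descriptor object ends up in the new domain.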
Finally, we can disable storing the domains and/or looking them up among the stored ones by adding the two flags. Here's a little game.
part of domainDepot.py
The domain is retrieved only in the last call. Two domains are stored (the second and the third), which are essentially equal and both appropriate for the fourth call. Due to the order of storing, the third (the most recent) is reused.
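The effect of the two flags can be reproduced with a toy depot in plain Python. ToyDepot and its prepare method are hypothetical and only mimic the behaviour described above (most recently stored domain is checked first; meta IDs are left out). The sequence of flags below is one combination consistent with that description, not necessarily the one used in domainDepot.py.

```python
class ToyDepot:
    """Hypothetical miniature depot; not Orange's DomainDepot."""

    def __init__(self):
        self._stored = []  # most recently stored domain comes first

    def prepare(self, names, dont_store=False, dont_check_stored=False):
        """Return (domain, is_new) for a list of prefixed names."""
        # Ordinary names are compared in order, meta names as a set.
        key = (tuple(n for n in names if not n.startswith("m")),
               frozenset(n for n in names if n.startswith("m")))
        if not dont_check_stored:
            for domain in self._stored:
                if domain["key"] == key:
                    return domain, False
        domain = {"key": key}
        if not dont_store:
            self._stored.insert(0, domain)
        return domain, True


names = ['mS#name', 'C#age', 'D#gender', 'D#race', 'cC#total', 'mS#SSN']
depot = ToyDepot()

d1, new1 = depot.prepare(names, dont_store=True)         # new, not stored
d2, new2 = depot.prepare(names)                          # new, stored
d3, new3 = depot.prepare(names, dont_check_stored=True)  # new, stored
d4, new4 = depot.prepare(names)                          # retrieved

print(new1, new2, new3, new4)
print(d4 is d3)
```

Only the fourth call retrieves a stored domain, and it gets the third one because that was stored last.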