channelpack API Reference

The functions txtpack(), dbfpack() and sheetpack(), returning loaded instances of the ChannelPack class, are made available by an import of channelpack. The class ChannelPack is also made available. Those objects are what channelpack mainly means to deliver, and they live in the module pack. So, most of the time it should be enough to import the channelpack namespace:

>>> import channelpack as cp
>>> type(cp.txtpack)
<type 'function'>
>>> type(cp.dbfpack)
<type 'function'>
>>> type(cp.sheetpack)
<type 'function'>
>>> type(cp.ChannelPack)
<type 'classobj'>

The intention is to make channelpack self-documenting. Try introspecting the objects.

pack module

Provide ChannelPack. Provide lazy functions to get loaded instances of ChannelPack.

ChannelPack is a class holding data read from some data file. Its __init__ takes a function as its only argument. The function is responsible for returning a dict with numpy 1d arrays corresponding to the “channels” in the data file. Keys are integers corresponding to the “columns” used, 0-based. The load function is invoked by the ChannelPack instance when load() is called.

Making a pack

There are functions in this module for easy pack creation: txtpack(), dbfpack(), sheetpack(). Using one of those, a call to load is not necessary:

>>> import channelpack as cp
>>> tp = cp.txtpack('testdata/sampledat2.txt')
>>> for k in sorted(tp.chnames):
...     print tp.name(k)
...
RPT
B_CACT
P_CACT
VG_STOP
AR_BST
PLRT_1
TOQ_BUM

# Arrays are callable by name or column number
>>> tp('RPT') is tp(0)
True

Setting the mask

The ChannelPack holds a dict with numpy arrays and provides ways to get at them by familiar names or column numbers, as just shown. The pack also holds a Boolean array, initially all True. channelpack calls this array the mask, and it is of the same length as the channels in the pack:

>>> import numpy as np
>>> np.all(tp.mask)
True
>>> tp(0).size == tp.mask.size
True

The mask is used to retrieve specific parts from the channels or to filter the returned data:

>>> sp = cp.sheetpack('testdata/sampledat3.xls')
>>> for k in sorted(sp.chnames):
...     print k, sp.name(k)
...
0 txtdata
1 nums
2 floats

>>> sp('txtdata')
array([u'A', u'A', u'C', u'D', u'D'],
      dtype='<U1')

>>> sp.mask = (sp('txtdata') == 'A') | (sp('txtdata') == 'D')
>>> sp.mask
array([ True,  True, False,  True,  True], dtype=bool)
>>> sp('txtdata', 0)
array([u'A', u'A'],
      dtype='<U1')
>>> sp('txtdata', 1)
array([u'D', u'D'],
      dtype='<U1')
>>> sp('txtdata', 2)
Traceback (most recent call last):
    ...
IndexError: list index out of range

The above example tries to say that parts are the chunks of channel elements that have corresponding True elements in the mask. They are retrieved by adding an enumeration of the part to the call for the channel, see __call__().
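The part-enumeration idea can be sketched with plain numpy. This is illustrative code, not channelpack internals; true_parts is a made-up name:

```python
import numpy as np

def true_parts(arr, mask):
    """Split arr into the chunks aligned with each consecutive run
    of True in mask. The chunks are the 'parts', indexed 0, 1, ...
    in order of appearance."""
    # Find rising and falling edges of the boolean mask.
    padded = np.concatenate(([False], mask, [False])).astype(int)
    edges = np.flatnonzero(np.diff(padded))
    starts, stops = edges[::2], edges[1::2]
    return [arr[a:b] for a, b in zip(starts, stops)]

txtdata = np.array(['A', 'A', 'C', 'D', 'D'])
mask = (txtdata == 'A') | (txtdata == 'D')
parts = true_parts(txtdata, mask)
# parts[0] -> array(['A', 'A']), parts[1] -> array(['D', 'D'])
```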

For filtering, an attribute nof is set to the string ‘filter’:

>>> sp.nof = 'filter'
>>> sp('txtdata')
array([u'A', u'A', u'D', u'D'],
      dtype='<U1')

The attribute nof can have the values ‘filter’, None or ‘nan’. None means that the attribute has no effect. The effect of ‘nan’ is that elements not corresponding to a True element in the mask are replaced with numpy.nan or None in calls:

>>> sp.nof = 'nan'
>>> sp('txtdata')
array([u'A', u'A', None, u'D', u'D'], dtype=object)
>>> sp('nums')
array([   0.,   30.,   nan,   90.,  120.])

Calls for a specific part are not affected by the attribute nof:

>>> sp('txtdata', 1)
array([u'D', u'D'],
      dtype='<U1')

Calling load on a running instance

If the pack is to be loaded with a new data set that is to be subjected to the same conditions, do it like this:

>>> sp.add_condition('cond', "(%('txtdata') == 'A') | (%('txtdata') == 'D')")
>>> sp.pprint_conditions()
cond1: (%('txtdata') == 'A') | (%('txtdata') == 'D')
...

Note that the string for the condition is the same as in the above assignment (sp.mask = (sp('txtdata') == 'A') | (sp('txtdata') == 'D')) with the identifier for the pack replaced by %. Now a new file with the same data layout can be loaded and receive the same state:

>>> sp.load('testdata/sampledat4.xls', stopcell='c6')
>>> sp('txtdata')
array([u'A', None, None, None, u'D'], dtype=object)
>>> sp.nof = None
>>> sp('txtdata')
array([u'A', u'C', u'C', u'C', u'D'],
      dtype='<U1')
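The % placeholder mechanism can be sketched like this: the stored condition string is turned back into an expression on the pack and evaluated. This is illustrative only (apply_condition is a made-up name, and the stand-in pack is just a callable over a dict):

```python
import numpy as np

def apply_condition(pack, cond):
    """Substitute the % placeholder with the pack identifier and
    evaluate the resulting expression to get a mask array."""
    expr = cond.replace('%', 'pack')
    return eval(expr, {'pack': pack, 'np': np})

# A stand-in callable playing the role of a loaded pack.
data = {'txtdata': np.array(['A', 'A', 'C', 'D', 'D'])}
pack = lambda name: data[name]

cond = "(%('txtdata') == 'A') | (%('txtdata') == 'D')"
mask = apply_condition(pack, cond)
# mask -> array([ True,  True, False,  True,  True])
```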

Functions to get a pack

channelpack.pack.dbfpack(fn, usecols=None)

Return a ChannelPack instance loaded with dbf data file fn.

This is a lazy function to get a loaded instance, using pulldbf module.

channelpack.pack.txtpack(fn, **kwargs)

Return a ChannelPack instance loaded with text data file fn.

Attempt to read out custom channel names from the file and call instance.set_channel_names(). Then return the pack.

This is a lazy function to get a loaded instance, using the cleverness provided by the pulltxt module. No delimiter or rows-to-skip and such need to be provided. However, if necessary, **kwargs can be used to override the clevered items provided to numpy's loadtxt. usecols might be such an item, for example. Also, the cleverness is only clever if all data is numerical.

Note that the call signature is the same as numpy's loadtxt, which looks like this:

np.loadtxt(fname, dtype=<type 'float'>, comments='#',
delimiter=None, converters=None, skiprows=0, usecols=None,
unpack=False, ndmin=0)

But, when using this function as a wrapper, the only meaningful argument to override should be usecols.

channelpack.pack.sheetpack(fn, sheet=0, header=True, startcell=None, stopcell=None, usecols=None)

Return a ChannelPack instance loaded with data from the spread sheet file fn, (xls, xlsx).

fn: str
The file to read from.
sheet: int or str
If int, it is the index for the sheet 0-based. Else the sheet name.
header: bool or str
True if the defined data range includes a header with field names. Else False - the whole range is data. If a string, it is a spread sheet style notation of the startcell for the header (“F9”). The “width” of this record is the same as for the data.
startcell: str or None
If given, a spread sheet style notation of the cell where reading start, (“F9”).
stopcell: str or None
A spread sheet style notation of the cell where data end, (“F9”).
usecols: str or sequence of ints
The columns to use, 0-based. 0 is the spread sheet column “A”. Can be given as a string also - ‘C:E, H’ for columns C, D, E and H.

Might not be a favorite, but the header row can be offset from the data range. The meaning of usecols is then applied to both the data range and the header row. However, usecols is always specified with regard to the data range.
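The ‘C:E, H’ notation for usecols can be parsed as sketched below. This follows the documented behaviour; parse_usecols is a made-up name, not channelpack API:

```python
def parse_usecols(spec):
    """Parse a spread sheet style usecols string like 'C:E, H'
    into 0-based column indexes."""
    def col(letters):
        # Bijective base-26: 'A' -> 0, 'Z' -> 25, 'AA' -> 26 (0-based).
        n = 0
        for ch in letters.strip().upper():
            n = n * 26 + (ord(ch) - ord('A') + 1)
        return n - 1

    cols = []
    for token in spec.split(','):
        token = token.strip()
        if ':' in token:
            start, stop = token.split(':')
            cols.extend(range(col(start), col(stop) + 1))
        else:
            cols.append(col(token))
    return cols

parse_usecols('C:E, H')  # -> [2, 3, 4, 7]
```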

ChannelPack object

class channelpack.ChannelPack(loadfunc=None)

Pack of data. Holds a dict with channel index numbers as keys (column numbers). This object is callable by channel name or index.

__call__(key, part=None)

Make possible to retrieve channels by key.

key: string or integer.
The channel index number or channel name.
part: int or None
The 0-based enumeration of a True part to return. This has an effect whether or not the mask or filter is turned on. Raise IndexError if the part does not exist.

__init__(loadfunc=None)

Return a pack

loadfunc is a function that returns a dict holding numpy arrays, being the channels. Keys are the index integer numbers, (column numbers). Each array is of np.shape(N,).

See method load().

add_condition(conkey, cond)

Add a condition, one of the addable ones.

conkey: str

One of ‘cond’, ‘startcond’ or ‘stopcond’. ‘start’ or ‘stop’ are accepted as shorthand for ‘startcond’ or ‘stopcond’. If the conkey is given with an explicit number (like ‘stopcond3’) and already exists, it will be overwritten, else created.

When the trailing number is implicit, the first condition with a value of None is taken. If no None value is found, a new condition is added.

cond: str
The condition string. See ...

Note

Updates the mask if not no_auto.

append_load(*args, **kwargs)

Append data using loadfunc.

args, kwargs:
forward to the loadfunc. args[0] must be the filename, so loadfunc must take the filename as its first argument.

If self is not already a loaded instance, call load and return.

Raise an error if there is a mismatch in channel indexes or channel count.

Append the data to self's existing data. Set filename to the new file.

Create a new attribute, ‘metamulti’ - a dict with meta-data on all files loaded.

Note

Updates the mask if not no_auto.

clear_conditions(*conkeys, **noclear)

Clear conditions.

Clear only the conditions conkeys if specified. Clear only the conditions not specified by conkeys if noclear is True (False default).

Note

Updates the mask if not no_auto.

counter(ch, part=None)

Return a counter on the channel ch.

ch: string or integer.
The channel index number or channel name.
part: int or None
The 0-based enumeration of a True part to return. This has an effect whether or not the mask or filter is turned on. Raise IndexError if the part does not exist.

See Counter for the counter object returned.

eat_config(conf_file=None)

Read the conf_file and update this instance accordingly.

conf_file: str or Falseish
If conf_file is Falseish, look in the directory where self.filename sits, if self is not already associated with a conf_file. If associated, and the conf_file arg is Falseish, read self.conf_file. If the conf_file arg is a file name, read from that file, but do not update self.conf_file accordingly. An implicit IOError is raised if no conf_file is found.

See spit_config for documentation on the file layout.

Note

Updates the mask if not no_auto.

Note

If the config_file exists because of an earlier spit, and custom channel names were not available, channels are listed under the fallback names in the file. Then after this eat, self.chnames will be set to the list in the conf_file section ‘channels’. The result can be that self.chnames and self.chnames_0 are equal.

The message then is that, if channel names are updated, you should spit before you eat.

load(*args, **kwargs)

Load data using loadfunc.

args, kwargs:
forward to the loadfunc. args[0] must be the filename, so loadfunc must take the filename as its first argument.

Set the filename attribute.

Note

Updates the mask if not no_auto.

ChannelPack is assuming a need for loading data from disc. If there is a desire to load some made-up data, a filename pointing to some actual file is nevertheless required. Here is a suggestion:

>>> import channelpack as cp
>>> import numpy as np
>>> import tempfile

>>> tf = tempfile.NamedTemporaryFile()

>>> d = {2: np.arange(5), 5: np.arange(10, 15)}
>>> def lf(fn):
...     return d
...

>>> pack = cp.ChannelPack(lf)
>>> pack.load(tf.name)
>>> pack.filename is not None
True
>>> pack.chnames_0
{2: 'ch2', 5: 'ch5'}

make_mask(clean=True, dry=False)

Set the attribute self.mask to a mask based on the conditions.

clean: bool
If not True, let the current mask be a condition as well. If True, the mask is set solely from the pack’s current conditions.
dry: bool
If True, only try to make a mask, but don’t touch self.mask.

This method is called automatically unless no_auto is set to True, whenever conditions are updated.

name(ch, firstwordonly=False)

Return channel name for ch. ch is the channel name or the index number for the channel name, 0-based.

ch: str or int.
The channel name or indexed number.
firstwordonly: bool or “pattern”.
If True, return only the first non-spaced word in the name. If a string, use it as a re-pattern with re.findall and return the first element found. An error is raised if there is no match. r’\w+’ is a good pattern for excluding leading and trailing obscure characters.

Returned channel name is the fallback string if “custom” names are not available.

parts()

Return the enumeration of the True parts.

The list is always consecutive or empty.

See also

slicelist()

pprint_conditions()

Pretty print conditions.

This is the easiest (only exposed) way to view all conditions interactively.

See also

spit_config()

query_names(pat)

pat is a shell pattern. See fnmatch.fnmatchcase. Print the results to stdout.

rebase(key, start=None, decimals=5)

Rebase a channel (key) on start.

The step (between elements) needs to be constant all through, else ValueError is raised. The exception to this is the border step between data loaded from two different files.

key: int or str
The key for the channel to rebase.
start: int or float or None
If specified - replace the first element in the first loaded data channel with start.
decimals: int
Diffs are rounded to this number of decimals before the steps through the arrays are checked. Otherwise the diffs are likely never all equal.

Typically this would be used to make a time channel continuous, so that it does not start over from 0 when data is appended from multiple files. Or simply to rebase a channel on start.

If start is None, and the instance is loaded from one file only, this method has no effect.

Note

The instance channel is modified on success.
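The rebase idea can be sketched in numpy: verify a constant step, then rebuild the channel as start + step * arange. This is an illustrative sketch, not the channelpack implementation, and it ignores the multi-file border-step exception:

```python
import numpy as np

def rebase(arr, start=None, decimals=5):
    """Return arr rebuilt on a constant step, starting at `start`
    (or at arr[0] if start is None). Raise ValueError if the step
    between elements is not constant after rounding."""
    diffs = np.round(np.diff(arr), decimals)
    if diffs.size and not np.all(diffs == diffs[0]):
        raise ValueError('step between elements is not constant')
    step = diffs[0] if diffs.size else 0
    base = arr[0] if start is None else start
    return base + step * np.arange(arr.size)

t = np.array([3.0, 3.1, 3.2, 3.3])
rebase(t, start=0.0)  # -> array([0. , 0.1, 0.2, 0.3])
```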

records(part=None, fallback=True)

Return an iterator over the records in the pack.

Each record is supplied as a namedtuple with the channel names as field names. This is useful if each record makes a meaningful data set on its own.

part: int or None
Same meaning as in __call__().
fallback: boolean
The named tuple requires Python-valid naming. If fallback is False, there will be an error if self.chnames does not hold valid names and is not None. If True, fall back to self.chnames_0 on error.

Note

The error produced on invalid names when fallback is False is not raised until iteration starts. See Stack Overflow question 231767 for a good post on the subject.
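The records mechanism can be sketched with collections.namedtuple over a dict of column arrays. Illustrative only; the function and sample data are made up:

```python
import numpy as np
from collections import namedtuple

def records(chnames, data):
    """Yield one namedtuple per row, with channel names as field
    names. chnames maps column index -> name, data maps column
    index -> numpy array."""
    Record = namedtuple('Record', [chnames[k] for k in sorted(chnames)])
    columns = [data[k] for k in sorted(data)]
    for row in zip(*columns):
        yield Record(*row)

chnames = {0: 'txtdata', 1: 'nums'}
data = {0: np.array(['A', 'A', 'C']), 1: np.array([0.0, 30.0, 60.0])}
first = next(records(chnames, data))
# first.txtdata -> 'A', first.nums -> 0.0
```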

set_basefilemtime()

Set attributes mtimestamp and mtimefs. If the global list ORIGINEXTENSIONS includes any items, try to look for files (in the directory where self.filename sits) with the same base name as the loaded file, but with an extension specified in ORIGINEXTENSIONS.

mtimestamp is a timestamp and mtimefs is the file (name) with that timestamp.

ORIGINEXTENSIONS is empty on delivery, which means that the attributes discussed will be based on the file that was loaded (unless ORIGINEXTENSIONS is populated before this call).

This is supposed to be a convenience in cases the data file loaded is some sort of “exported” file format, and the original file creation time is of interest.

Note

If the provided functions in this module are used to get a pack, this method does not have to be called. It is called by those functions.

set_channel_names(names)

Set self.chnames. Custom channel names that can be used in calls on this object and in condition strings.

names: list or None
It is the caller’s responsibility to make sure the list is in column order. self.chnames will be a dict with channel integer indexes as keys. If names is None, self.chnames will be None.

set_duration(rule)

Set the duration according to rule.

rule: str
The rule operating on the variable dur.

rule is an expression like:

>>> rule = 'dur == 150 or dur > 822'

setting a duration rule assuming a pack sp:

>>> sp.set_duration(rule)

The identifier dur must be present or the rule will fail.

Note

The logical or and and operators must be used. dur is a scalar, not an array.

Note

Updates the mask if not no_auto.

set_samplerate(rate)

Set sample rate to rate.

rate: int or float

rate is given as samples / timeunit. If sample rate is set, it will have an impact on the duration rule conditions. If duration is set to 2.5 and samplerate is 100, a duration of 250 records is required for the logical conditions to be true.

Note

Updates the mask if not no_auto.
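The interplay between sample rate and duration rules described above can be sketched as arithmetic on a part's record count. duration_ok is a made-up name illustrating the idea, not channelpack API:

```python
def duration_ok(n_records, rule, samplerate=None):
    """Evaluate a duration rule for a True part of n_records
    elements. With a sample rate set, dur is expressed in time
    units (records / samplerate); dur is a scalar, so the logical
    'and'/'or' operators apply."""
    dur = n_records / samplerate if samplerate else n_records
    return eval(rule, {'dur': dur})

# With samplerate 100, a part of 250 records has a duration of 2.5:
duration_ok(250, 'dur == 2.5 or dur > 8.0', samplerate=100)  # -> True
```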

set_stopextend(n)

Extend the True elements by n when setting the conditions based on a ‘stopcond’ condition.

n is an integer >= 0.

Note

Updates the mask if not no_auto.

slicelist()

Return a slicelist based on self.mask.

This is used internally and might not be very useful from outside. It is exposed anyway in case it is of interest to quickly see where the parts are along the arrays.

It is a list of python slice objects corresponding to the True sections in self.mask. If no conditions are set, there shall be one slice in the list with start == 0 and stop == self.rec_cnt (the mask is all True). The len of this list corresponds to the number of True sections in self.mask, so it gives a hint on the result of the conditions.

See also

parts()
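The slicelist idea, a list of slices covering the True sections of a mask, can be sketched as follows. A sketch in the spirit of the method, not the library code:

```python
import numpy as np

def slicelist(mask):
    """Return a list of slice objects, one per consecutive run of
    True in the boolean array mask."""
    padded = np.concatenate(([False], mask, [False])).astype(int)
    edges = np.flatnonzero(np.diff(padded))
    return [slice(a, b) for a, b in zip(edges[::2], edges[1::2])]

mask = np.array([True, True, False, True, True])
slicelist(mask)  # -> [slice(0, 2, None), slice(3, 5, None)]
```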

spit_config(conf_file=None, firstwordonly=False)

Write a config_file based on this instance.

conf_file: str (or Falseish)
If conf_file is Falseish, write the file to the directory where self.filename sits, if self is not already associated with such a file. If associated, and conf_file is Falseish, use self.conf_file. If conf_file is a file name, write to that file and set self.conf_file to conf_file.
firstwordonly: bool or “pattern”
Same meaning as in the name method, and applies to the channel names spitted. There is no effect on the instance channel names until eat_config is called.

Sections in the ini/cfg kind of file can be:

[channels] A mapping of self.D integer keys to channel names. Options are numbers corresponding to the keys. Values are the channel names, being the fallback names if custom names are not available (self.chnames). (When spitting that is).

[conditions] Options correspond to the keys in self.conditions, values correspond to the values in the same.

Data

channelpack.pack.ORIGINEXTENSIONS = []

A list of file extensions excluding the dot. See set_basefilemtime() for a description.

channelpack.pack.CHANNELPACK_RC_FILE = '.channelpackrc'

The humble rc file of channelpack. It can exist and have a section [channelpack]. In this section, an option originextensions with a comma-separated list of extensions as value will be loaded into the ORIGINEXTENSIONS list on import of channelpack. Use os.path.expanduser(‘~’) to see where channelpack looks for this file, and then place it there.

pulltxt module - automated study of text data files

The study of a numerical data file works by extracting matches of a digit pattern on each row, up to a count of rows. When the count of matches starts to be constant, it is assumed the data rows have started. Comma or point is accepted as decimal delimiter, so there are two patterns for digits. Both are tried, and the wrong one normally gives a higher count of matches on each row, because matches are then found around the correct decimal delimiter. It is then simply assumed that the pattern with the lower match count is the correct one.

When the decimal delimiter and start row is determined, the delimiter for data is determined. This is done by doing a re match with the digits found on the first row of data. Each delimiter is extracted with re groups.

This is not too bad, because no assumption is made on what the data delimiter is. The code here could possibly be more elegant by re-splitting on a number of optional data delimiters, but then that range of delimiters is assumed. Let’s see.

A weakness: if, say, there is only one column of data, and the data starts with only zeros for a number of rows exceeding EQUAL_CNT_REQ, there will be no difference between the match count with decimal comma and decimal point. decdel will then be assumed to be point. That can be wrong.
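The match-counting idea can be illustrated with two simple regex patterns. These patterns and the sample rows are illustrative, not the actual pulltxt internals:

```python
import re

# One pattern per decimal delimiter candidate. With the wrong
# pattern, each number like '0,0' falls apart into two matches,
# so the wrong pattern yields the higher per-row count.
POINT = re.compile(r'\d+\.\d+|\d+')
COMMA = re.compile(r'\d+,\d+|\d+')

rows = ['Time;Speed', '0,0;12,5', '0,1;13,0', '0,2;13,5']
counts_p = [len(POINT.findall(r)) for r in rows]
counts_c = [len(COMMA.findall(r)) for r in rows]
# counts_p -> [0, 4, 4, 4]   (each '0,0' gives two point-matches)
# counts_c -> [0, 2, 2, 2]   (comma is the decimal delimiter here)
```

The counts stabilize from the second row on, so that is where the data rows start; the comma pattern has the lower stable count and wins.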

PatternPull object

class channelpack.pulltxt.PatternPull(fn)

Build useful attributes for determining decimal delimiter and data delimiter, and hopefully some channel names.

Note: This class is a helper utility.

channel_names(usecols=None)

Attempt to extract the channel names from the data file. Return a list with names. Return None on failed attempt.

usecols: A list with columns to use. If present, the returned list will include only names for the columns requested. It will align with the columns returned by numpy's loadtxt by using the same keyword (usecols).

count_matches()

Set the matches_p, matches_c and rows attributes.

loadtxtargs()

Return a dict (kwargs) to provide to numpy's loadtxt, based on the resulting attributes.

The usecols attribute is set to be all columns in the file. This is done because some data file exporters put an (extra) data delimiter just after the last data on each row. This is not expected by numpy's loadtxt, but it is not a problem if the usecols item is set.

rows2skip(decdel)

Return the number of rows to skip based on the decimal delimiter decdel.

When each record starts to have the same number of matches, that is where the data starts. This is the idea. The number of consecutive records required to have the same number of matches is EQUAL_CNT_REQ.

set_decdel_rts()

Figure out the decimal separator and the rows to skip, and set corresponding attributes.

study_datdel()

Figure out the data delimiter.

Functions

channelpack.pulltxt.loadtxt(fn, **kwargs)

Study the text data file fn. Call numpy's loadtxt with keyword arguments based on the study.

Return the data returned from numpy's loadtxt.

kwargs: keyword arguments accepted by numpy's loadtxt. Any keyword arguments provided will take precedence over the ones resulting from the study.

Set the module attribute PP to the instance of PatternPull used.

channelpack.pulltxt.loadtxt_asdict(fn, **kwargs)

Return what is returned from loadtxt as a dict.

The ‘unpack’ keyword is enforced to True. The keys in the dict are the column numbers loaded: the integers 0...N-1 for N loaded columns, or the numbers in usecols.

pulldbf module

Functions

dbfreader is a recipe created by Raymond Hettinger on Tue, 11 Jan 2005 (PSF), http://code.activestate.com/recipes/362715/, with minor edits.

channelpack.pulldbf.dbfreader(f)

Return an iterator over the records in an Xbase DBF file.

The first row returned contains the field names. The second row contains field specs: (type, size, decimal places). Subsequent rows contain the data records. If a record is marked as deleted, it is skipped.

File should be opened for binary reads.

channelpack.pulldbf.dbf_asdict(fn, usecols=None, keystyle='ints')

Return data from dbf file fn as a dict.

fn: str
The filename string.
usecols: sequence
The columns to use, 0-based.
keystyle: str
‘ints’ or ‘names’ accepted. Should be ‘ints’ (default) when this function is given to a ChannelPack as loadfunc. If ‘names’ is used, keys will be the field names from the dbf file.

channelpack.pulldbf.channel_names(fn, usecols=None)

Return the field names (channel names) from dbf file fn. With usecols, return only names corresponding to the integers in usecols.

pullxl module

Helper module for reading tabular data from spread sheets.

Spread sheet reading principles:

  1. Data is assumed to be arranged column-wise.
  2. Default is to read a whole sheet. nrows and ncols are assumed (attributes of xlrd's Sheet objects). The top row defaults to be a header row with field names, (header=True).
  3. A startcell and stopcell can be given. It is then given in spread sheet notation (“B15”). header option can be True or False or a cell specification of where the header start (“B15”).
  4. The interpretation of startcell and stopcell in combination with header is as follows:
    • If nothing specified, see 2.
    • If startcell is given (say ‘C3’) and header is True, header row is 3 with spread sheet enumeration. Data start at row 4.
    • If startcell is given, say ‘C3’, and header is ‘C3’, header row is 3 with spread sheet enumeration. Data start at row 4.
    • If startcell is given (say ‘C3’) and header is False, data start at row 3.
  5. Type detection is done by checking the Cell object’s ctype attribute for each field’s data range. If the ctype is all the same, the type is given. If there are two types, and one of them is ‘XL_CELL_EMPTY’, the type is assumed to be the other. Then the empty cell’s values will be replaced by numpy nan if the type is float, else None. If there are more than two ctypes in the data range, the type will be object, and empty cells replaced by None. Dates will be python datetime objects.
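The type-resolution rule in point 5 can be sketched as a small function over a column's ctype codes (the codes follow xlrd: 0 empty, 1 text, 2 number, 3 date; resolve_ctype is a made-up name):

```python
XL_CELL_EMPTY = 0

def resolve_ctype(ctypes):
    """Resolve a column's type from its cells' ctype codes.

    One distinct ctype: that is the type. Two distinct ctypes,
    one of them empty: the non-empty type wins (empty cells are
    later filled with nan or None). More than two: object array,
    empty cells filled with None."""
    distinct = set(ctypes)
    if len(distinct) == 1:
        return distinct.pop()
    if len(distinct) == 2 and XL_CELL_EMPTY in distinct:
        distinct.discard(XL_CELL_EMPTY)
        return distinct.pop()
    return None  # mixed types -> object array

resolve_ctype([2, 2, 0, 2])  # -> 2 (a float column with empty cells)
```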

class channelpack.pullxl.StartStop(row, col)

Zero-based integers for row and column, xlrd style. Meaning, the stop values are non-inclusive. This object is used for either start or stop.

channelpack.pullxl.fromxldate(xldate, datemode=1)

Return a python datetime object.

xldate: float
The xl number.
datemode: int
0: 1900-based, 1: 1904-based. See xlrd documentation.
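The conversion can be sketched as days since a datemode-dependent epoch (the xlrd convention: 1899-12-30 for 1900-based dates, 1904-01-01 for 1904-based). A sketch that ignores edge cases xlrd.xldate handles:

```python
import datetime

def fromxldate(xldate, datemode=1):
    """Convert an xl-date number (days since the epoch) to a
    python datetime. datemode 0: 1900-based, 1: 1904-based."""
    epoch = (datetime.datetime(1899, 12, 30) if datemode == 0
             else datetime.datetime(1904, 1, 1))
    return epoch + datetime.timedelta(days=xldate)

fromxldate(0.0, datemode=1)  # -> datetime.datetime(1904, 1, 1, 0, 0)
```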

channelpack.pullxl.letter2num(letters, zbase=False)

A = 1, C = 3 and so on. Convert spreadsheet style column enumeration to a number.

Answers: A = 1, Z = 26, AA = 27, AZ = 52, ZZ = 702, AMJ = 1024

>>> from channelpack.pullxl import letter2num
>>> letter2num('A') == 1
True
>>> letter2num('Z') == 26
True
>>> letter2num('AZ') == 52
True
>>> letter2num('ZZ') == 702
True
>>> letter2num('AMJ') == 1024
True
>>> letter2num('AMJ', zbase=True) == 1023
True
>>> letter2num('A', zbase=True) == 0
True
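The behaviour shown in the doctests above is bijective base-26 conversion, which can be sketched as:

```python
def letter2num(letters, zbase=False):
    """Convert spreadsheet column letters to a number: A = 1,
    Z = 26, AA = 27 (bijective base-26). With zbase, shift to
    0-based. A sketch mirroring the documented behaviour."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord('A') + 1)
    return n - 1 if zbase else n
```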

channelpack.pullxl.prepread(sheet, header=True, startcell=None, stopcell=None)

Return four StartStop objects, defining the outer bounds of the header row and the data range, respectively. If header is False, the first two items will be None.

--> [headstart, headstop, datstart, datstop]

sheet: xlrd.sheet.Sheet instance
Ready for use.
header: bool or str
True if the defined data range includes a header with field names. Else False - the whole range is data. If a string, it is spread sheet style notation of the startcell for the header (“F9”). The “width” of this record is the same as for the data.
startcell: str or None
If given, a spread sheet style notation of the cell where reading start, (“F9”).
stopcell: str or None
A spread sheet style notation of the cell where data end, (“F9”).

startcell and stopcell can both be None, either one specified or both specified.

Note to self: consider making it possible to specify headers in a column.

channelpack.pullxl.sheet_asdict(fn, sheet=0, header=True, startcell=None, stopcell=None, usecols=None, chnames_out=None)

Read data from a spread sheet. Return the data in a dict with column numbers as keys.

fn: str
The file to read from.
sheet: int or str
If int, it is the index for the sheet 0-based. Else the sheet name.
header: bool or str
True if the defined data range includes a header with field names. Else False - the whole range is data. If a string, it is a spread sheet style notation of the startcell for the header (“F9”). The “width” of this record is the same as for the data.
startcell: str or None
If given, a spread sheet style notation of the cell where reading start, (“F9”).
stopcell: str or None
A spread sheet style notation of the cell where data end, (“F9”).
usecols: str or sequence of ints or None
The columns to use, 0-based. 0 is the spread sheet column “A”. Can be given as a string also - ‘C:E, H’ for columns C, D, E and H.
chnames_out: list or None
If a list, it will be populated with the channel names. The size of the list will equal the number of channel names extracted. Whatever is in the supplied list is first removed.

Values in the returned dict are numpy arrays. Types are set based on the types in the spread sheet.

channelpack.pullxl.sheetheader(sheet, startstops, usecols=None)

Return the channel names in a list suitable as an argument to ChannelPack’s set_channel_names method. Return None if first two StartStops are None.

This function is slightly confusing, because it shall be called with the same parameters as sheet_asdict. But knowing that, it should be convenient.

sheet: xlrd.sheet.Sheet instance
Ready for use.
startstops: list
Four StartStop objects defining the data to read. See prepread(), returning such a list.
usecols: str or sequence of ints or None
The columns to use, 0-based. 0 is the spread sheet column “A”. Can be given as a string also - ‘C:E, H’ for columns C, D, E and H.

channelpack.pullxl.toxldate(datetime, datemode=1)

Return an xl-date number from the datetime object datetime.

datetime: datetime.datetime
The python datetime object
datemode: int
0: 1900-based, 1: 1904-based. See xlrd documentation.