Excel helper

A library to load data from excel files. It supports sampling from random distributions and extrapolating growth over time.

Example

Given an excel file with rows similar to the below

variable	scenario	module	distribution	param 1	param 2	param 3	unit	start date	end date	CAGR	ref date	label
a		numpy.random	choice	1			kg	01/01/2009	01/04/2009	0.10	01/01/2009	test var 1
b		numpy.random	uniform	2	4		-					label
c		numpy.random	triangular	3	6	10	-					label
d		bottom_up_comparision.sampling_core_routers	Distribution	core_routers.csv			J/Gb					label

a	s1	numpy.random	choice	2								test var 1
multiple choice		numpy.random	choice	1,2,3			kg	01/01/2007	01/01/2009			test var 1

You can run python/ numpy code that references these variables and generates random distributions.

For example, the following will initialise a variable c with a vector of size 2 with random values from a triangular distribution.

    np.random.seed(123)
    
    data = ParameterLoader.from_excel('test.xlsx', size=2, sheet_index=0)
    c = data['c']
>>> [ 7.08471918  5.45131111]

Other types of distributions include choice and normal. However you can specify any distribution from numpy that takes up to three parameters to init.

You can also specify a .csv file with samples and an empiricial distribution function is generated and variable values will be sampled from that.

Scenarios

It is possible to define scenarios and have paramter values for a variable change with each scenario.

    data = ParameterLoader.from_excel('test.xlsx', size=1, sheet_index=0)
    res = data['a'][0]
    
    assert res == 1.
    
    data.select_scenario('s1')
    res = data['a'][0]
    
    assert res == 2.

use data.unselect_scenario() to return to the default value.

Pandas Dataframes

It is possible to define a time frame for distributions and have sample values change over time.

    # the time axis of our dataset
    times = pd.date_range('2009-01-01', '2009-04-01', freq='MS')
    # the sample axis our dataset
    samples = 2

    dfl = DataSeriesLoader.from_excel('test.xlsx', times, size=samples, sheet_index=0)
    res = dfl['a']

    assert res.loc[[datetime(2009, 1, 1)]][0] == 1
    assert np.abs(res.loc[[datetime(2009, 4, 1)]][0] - pow(1.1, 3. / 12)) < 0.00001

Reload

Reloading data sources is useful when underlying excel files change.

        times = pd.date_range('2009-01-01', '2009-04-01', freq='MS')        
        samples = 2

        data = MultiSourceLoader()
        data.add_source(ExcelSeriesLoaderDataSource('test.xlsx', times, size=samples, sheet_index=0))

        res = data['a'][0]
        assert res == 1.

        wb = load_workbook(filename='test.xlsx')
        ws = wb.worksheets[0]
        ws['E2'] = 4.
        wb.save(filename='test.xlsx')

        data.reload_sources()

        res = data['a'][0]
        assert res == 4.

        wb = load_workbook(filename='test.xlsx')
        ws = wb.worksheets[0]
        ws['E2'] = 1.
        wb.save(filename='test.xlsx')

        data.reload_sources()

        data.set_scenario('s1')
        res = data['a'][0]

        assert res == 2.

        data.reset_scenario()
        res = data['a'][0]

        assert res == 1.

Metadata

The contents of the rows is also contained in the metadata

    # the time axis of our dataset
    times = pd.date_range('2009-01-01', '2009-04-01', freq='MS')
    # the sample axis our dataset
    samples = 3
    
    dfl = DataSeriesLoader.from_excel('test.xlsx', times, size=samples, sheet_index=0)
    res = dfl['a']
    
    print(res._metadata)

Excel syntax

Version 2

column name	meaning
variable
scenario
type
ref value
mean growth
variability
variability growth
ref date
unit
label
source
comment
control

Versioning

in a sheet 'metadata' create a pair of cells, one called version and the other with the version identifier (e.g 2)

Name		Name	Last commit message	Last commit date
Latest commit History 92 Commits
excel_helper		excel_helper
tests		tests
.gitignore		.gitignore
AUTHORS		AUTHORS
ChangeLog		ChangeLog
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
Makefile		Makefile
README.md		README.md
README.rst		README.rst
build_instructions.md		build_instructions.md
conf.py		conf.py
index.rst		index.rst
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Excel helper

Example

Scenarios

Pandas Dataframes

Reload

Metadata

Excel syntax

Version 2

Versioning

About

Releases

Packages

Languages

License

dschien/PyExcelModelingHelper

Folders and files

Latest commit

History

Repository files navigation

Excel helper

Example

Scenarios

Pandas Dataframes

Reload

Metadata

Excel syntax

Version 2

Versioning

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages