Optuna Examples for TSUBAME3.0

This is an unofficial tutorial on using Optuna on the TSUBAME3.0 infrastructure.

This tutorial describes:

Minimum setup to run Optuna.
How to launch Optuna storage on an interactive node.
How to parallelize single node ML training.
How to parallelize multi-node, MPI-based ML training.

Minimum Setup of Optuna in TSUBAME

The following example provides a quickstart for Optuna.

Points

Optuna can be installed easily with pip.
sqlite:///example.db is the RDB URL that specifies where the optimization results are stored. In this case, it points to an SQLite file.
You can use PostgreSQL or in-memory storage instead of SQLite.

$ qrsh -l s_core=1 -l h_rt=00:10:00
$ module load python/3.6.5
$ pip install --user optuna
$ python tsubame-optuna-example/quadratic.py quickstart sqlite:///example.db
$ python tsubame-optuna-example/print_study_history.py quickstart sqlite:///example.db
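
The contents of quadratic.py are not reproduced in this tutorial. As a rough sketch of what such a script could look like, the following assumes an Optuna release that provides trial.suggest_float. The search range, the minimum at x = 2, the trial count, and the function names are illustrative assumptions, not taken from the repository.

```python
def objective(trial):
    # Hypothetical search space: one float parameter x in [-10, 10].
    x = trial.suggest_float("x", -10.0, 10.0)
    # Quadratic function whose minimum is at x = 2 (illustrative choice).
    return (x - 2.0) ** 2


def main(study_name, storage_url, n_trials=100):
    # Imported here so that objective() can be exercised without Optuna installed.
    import optuna

    # load_if_exists=True lets repeated runs and parallel workers reuse one study.
    study = optuna.create_study(
        study_name=study_name, storage=storage_url, load_if_exists=True
    )
    study.optimize(objective, n_trials=n_trials)
    print("best value:", study.best_value, "best params:", study.best_params)
```

Calling main("quickstart", "sqlite:///example.db") would mirror the command line above: the study name and the RDB URL are the only two inputs.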

Launch PostgreSQL in TSUBAME

RDB servers can be used for parallel optimization. In this tutorial, we use PostgreSQL.

$ GROUP=<YOUR_GROUP>

$ qrsh -g $GROUP -l s_core=1 -l h_rt=12:00:00
$ module load singularity/2.6.1
$ singularity build postgres.img docker://postgres

$ mkdir postgres_data
$ singularity run -B postgres_data:/var/lib/postgresql/data postgres.img /docker-entrypoint.sh postgres

The RDB URL is as follows:

$ STORAGE_HOST=<HOST_WHERE_POSTGRES_IS_RUNNING>  # e.g., STORAGE_HOST=r7i7n7-cnode00
$ STORAGE_URL=postgresql://postgres@$STORAGE_HOST:5432/  # SQLAlchemy 1.4+ accepts only the "postgresql" scheme
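
Before submitting workers, it can help to confirm that the PostgreSQL host is reachable from another node. The following is a minimal sketch using only the Python standard library. The function name is_port_open is hypothetical, and the check only verifies that a TCP connection to port 5432 succeeds, not that PostgreSQL accepts the login.

```python
import socket


def is_port_open(host, port=5432, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, is_port_open("r7i7n7-cnode00") should return True once the server started above is listening on that host.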

Distributed Optimization for Single Node Learning

Let's parallelize a simple Optuna script that optimizes a quadratic function.

Set up the RDB URL and create a study identifier:

$ STORAGE_HOST=<HOST_WHERE_POSTGRES_IS_RUNNING>
$ STORAGE_URL=postgresql://postgres@$STORAGE_HOST:5432/

$ module load python/3.6.5
$ pip install --user psycopg2-binary
$ STUDY_NAME=`~/.local/bin/optuna create-study --storage $STORAGE_URL`

Create a shell script for the qsub command, e.g.:

$ echo "module load python/3.6.5" > run_quadratic.sh
$ echo "python tsubame-optuna-example/quadratic.py $STUDY_NAME $STORAGE_URL" >> run_quadratic.sh

You can parallelize the optimization just by submitting multiple jobs. For example, the following commands simultaneously run three workers in a study.

$ GROUP=<YOUR_GROUP>

$ qsub -g $GROUP -l s_core=1 run_quadratic.sh
$ qsub -g $GROUP -l s_core=1 run_quadratic.sh
$ qsub -g $GROUP -l s_core=1 run_quadratic.sh
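
When more than a few workers are needed, the repeated qsub calls can be generated in a loop. The following sketch wraps the same command line in Python. The helper names build_qsub_command and submit_workers and the dry_run flag are hypothetical. Only the qsub options shown above (-g and -l s_core=1) are used.

```python
import subprocess


def build_qsub_command(group, script, resource="s_core=1"):
    # Same arguments as the shell example: qsub -g <group> -l s_core=1 <script>
    return ["qsub", "-g", group, "-l", resource, script]


def submit_workers(group, script, n_workers, dry_run=True):
    """Build one qsub command per worker and optionally submit them."""
    commands = [build_qsub_command(group, script) for _ in range(n_workers)]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if qsub exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run=True the function only returns the command lists, which is a convenient way to inspect them before actually submitting jobs with dry_run=False.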

You can list the history of optimization as follows.

$ python tsubame-optuna-example/print_study_history.py $STUDY_NAME $STORAGE_URL
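
The contents of print_study_history.py are likewise not shown here. A script of this kind could be sketched as follows, assuming Optuna's load_study function and the number, state, value, and params attributes of its trial objects. The tab-separated output format and the function names are illustrative assumptions.

```python
def format_history(trials):
    """Render one tab-separated line per trial: number, state, value, params."""
    lines = []
    for t in trials:
        lines.append("{}\t{}\t{}\t{}".format(t.number, t.state, t.value, t.params))
    return "\n".join(lines)


def print_study_history(study_name, storage_url):
    # Imported here so that format_history() can be used without Optuna installed.
    import optuna

    # load_study attaches to an existing study in the shared storage.
    study = optuna.load_study(study_name=study_name, storage=storage_url)
    print(format_history(study.trials))
```

Because every worker writes to the same storage, the listing includes the trials from all jobs submitted for the study.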

Optimize ChainerMN

See README.chainermn.md.

Optimize TensorFlow + Horovod

See README.tensorflow.md.
