# Run mlrMBO-based hyperparameter optimization on CANDLE Benchmarks
mlrMBO is an iterative, model-based optimizer written in R. Given a set of parameters, it searches for the best hyperparameter values for the CANDLE Benchmarks, available here: `git@github.com:ECP-CANDLE/Benchmarks.git`.
## Running
1. `cd` into the `~/Supervisor/workflows/mlrMBO/test` directory.
2. Specify the `MODEL_NAME` in the `test-<model>.sh` file, and the hyperparameters in `cfg-prm-1.txt`.
3. Specify the number of processes, queue, etc., in the `cfg-sys-1.sh` file.
4. Launch the test by invoking `./test-1.sh benchmark machine`, where the machine can be `cori`, `theta`, `titan`, etc. (a combined sketch follows this list).
5. The benchmark will be run for the number of processors specified.
6. The final objective function value will be available in the experiments directory and is also printed.
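A minimal end-to-end sketch of these steps, assuming the NT3 benchmark and the `cori` machine (substitute your own benchmark and site):

```bash
cd ~/Supervisor/workflows/mlrMBO/test

# Edit cfg-prm-1.txt to set the hyperparameters,
# and cfg-sys-1.sh to set the process count, queue, etc.

# Launch: first argument is the benchmark, second is the machine
./test-1.sh nt3 cori
```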
## User requirements
What you need to install to run the workflow (a setup sketch follows this list):

- This workflow: `git@github.com:ECP-CANDLE/Supervisor.git`. Clone and `cd` to `workflows/nt3_mlrMBO` (the directory containing this README).
- The NT3 benchmark: `git@github.com:ECP-CANDLE/Benchmarks.git`. Clone and switch to the `frameworks` branch.
- Benchmark data: see the individual benchmark's README for obtaining the initial data.
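A minimal setup sketch under those requirements (clone locations are illustrative):

```bash
# Clone this workflow and the benchmarks
git clone git@github.com:ECP-CANDLE/Supervisor.git
git clone git@github.com:ECP-CANDLE/Benchmarks.git

# Switch the benchmarks to the frameworks branch
cd Benchmarks
git checkout frameworks
cd ..

# Enter the workflow directory
cd Supervisor/workflows/nt3_mlrMBO
```

Then obtain the initial data as described in the benchmark's README.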
## Calling sequence
- Script call stack:

```
user shell ->
  test-1.sh ->
    swift/workflow.sh -> (submits to compute nodes)
      swift/workflow.swift ->
        common/swift/obj_app.swift ->
          common/sh/model.sh ->
            common/python/model_runner.py ->
              the benchmark/model
```

- Environment settings (illustrated below):

```
upf-1.sh ->
  cfg-sys-1.sh ->
    common/sh/ env and langs .sh files
```
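These settings files mostly export environment variables that `workflow.sh` and `workflow.swift` read. A hedged excerpt in the style of a `cfg-sys-*.sh` file (the variable names below are assumptions; confirm them against `cfg-sys-1.sh`):

```bash
# Illustrative cfg-sys-style settings (check cfg-sys-1.sh for the real names)
export PROCS=4            # total number of processes
export PPN=1              # processes per node
export WALLTIME=00:30:00  # scheduler walltime
export QUEUE=debug        # scheduler queue on the target machine
```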
## Making Changes
### Structure
The point of the script structure is that it is easy to copy and modify the `test-*.sh` and `cfg-*.sh` scripts. These can be checked back into the repo for use by others. The `test-*.sh` and `cfg-*.sh` scripts should simply contain environment variables that control how `workflow.sh` and `workflow.swift` operate.

`test-1` and `cfg-{sys,prm}-1` should be left unmodified for simple testing; a copy-and-modify sketch follows.
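A sketch of that pattern (the `-2` file names are hypothetical):

```bash
cd ~/Supervisor/workflows/mlrMBO/test

# Copy the launch and config scripts under new names,
# leaving test-1 and cfg-{sys,prm}-1 untouched for simple testing
cp test-1.sh test-2.sh
cp cfg-sys-1.sh cfg-sys-2.sh
cp cfg-prm-1.txt cfg-prm-2.txt

# Edit test-2.sh to reference the new cfg files, adjust the settings,
# then check the new scripts back into the repo for others to use
```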
### Calling a different objective function
To call a different objective function:

1. Copy `common/swift/obj_app.swift` to a new directory and/or file name.
2. Edit the `app` function body to run your code and return the result.
3. Edit a `test-*.sh` script to set environment variables:
   - `OBJ_DIR`: Set this to the new directory, if changed. (Otherwise, `OBJ_DIR` defaults to the absolute path to `common/swift`.)
   - `OBJ_MODULE`: Set this to the Swift file name without its suffix, if changed. (Otherwise, `OBJ_MODULE` defaults to `obj_app`.)
Run it!

A simple test for changing the objective function:
```
$ cd mlrMBO/  # This directory
$ export OBJ_DIR=$PWD/test
$ export OBJ_MODULE=test_obj_fail  # Cf. test/test_obj_fail.swift
$ test/test-1.sh ___ dunedin  # Dummy argument for MODEL_NAME (unused)
...
Swift: Assertion failed!: test-obj-fail.swift was successfully invoked!
...
```
This indicates that the code in `test_obj_fail.swift` was executed instead of `obj_app.swift`.
## Where to check for output
This includes error output.

When you run the test script, you will get a message about `TURBINE_OUTPUT`. This will be the main output directory for your run.
On a local system, stdout/stderr for the workflow will go to your terminal.

On a scheduled system, stdout/stderr for the workflow will go to `TURBINE_OUTPUT/output.txt`.

The stdout/stderr of the individual objective function (model) runs goes into files of the form (see the example below):

`TURBINE_OUTPUT/EXPID/run/RUNID/model.log`

where `EXPID` is the user-provided experiment ID, and `RUNID` identifies the model runs generated by mlrMBO, one per parameter set, of the form `R_I_J`, where `R` is the restart number, `I` is the iteration number, and `J` is the sample within the iteration.
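For example, to inspect these outputs after a run (the experiment ID `X001` and run ID `1_3_2` are hypothetical):

```bash
# Workflow-level stdout/stderr on a scheduled system
cat $TURBINE_OUTPUT/output.txt

# Output of one model run: restart 1, iteration 3, sample 2
cat $TURBINE_OUTPUT/X001/run/1_3_2/model.log
```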