Run mlrMBO-based hyperparameter optimization on CANDLE Benchmarks
mlrMBO is an iterative, model-based optimizer written in R. Given a set of
hyperparameters, it searches for the best values of those hyperparameters for
the CANDLE Benchmarks available here: git@github.com:ECP-CANDLE/Benchmarks.git
Running
1. `cd` into the `~/Supervisor/workflows/mlrMBO/test` directory.
2. Specify the `MODEL_NAME` in the `test-<model>.sh` file and the hyperparameters in `cfg-prm-1.txt`.
3. Specify the number of processes, queue, etc., in the `cfg-sys-1.sh` file.
4. Launch the test by invoking `./test-1.sh benchmark machine`, where machine can be `cori`, `theta`, `titan`, etc. The benchmark will run on the number of processors specified.
The final objective function value is written to the experiments directory and is also printed.
User requirements
What you need to install to run the workflow:
- This workflow: `git@github.com:ECP-CANDLE/Supervisor.git`. Clone and `cd` to `workflows/nt3_mlrMBO` (the directory containing this README).
- NT3 benchmark: `git@github.com:ECP-CANDLE/Benchmarks.git`. Clone and switch to the `frameworks` branch.
- Benchmark data: see the individual benchmark's README for obtaining the initial data.
Calling sequence
- Script call stack
user shell ->
test-1.sh ->
swift/workflow.sh -> (submits to compute nodes)
swift/workflow.swift ->
common/swift/obj_app.swift ->
common/sh/model.sh ->
common/python/model_runner.py ->
the benchmark/model
- Environment settings
upf-1.sh ->
cfg-sys-1.sh ->
common/sh/
env, langs .sh files
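The environment-settings layering above can be sketched as follows. The cfg file, its variable names, and its values here are mock examples created on the fly for illustration; they are not the repo's actual `cfg-sys-1.sh`:

```shell
# A test-*.sh script sources its cfg-*.sh files, which export environment
# variables that workflow.sh later reads. Mock cfg file for demonstration:
cat > cfg-sys-demo.sh <<'EOF'
export PROCS=4
export QUEUE=debug
export WALLTIME=00:30:00
EOF

# Source it the way a test script would, then show the resulting settings:
. ./cfg-sys-demo.sh
echo "PROCS=$PROCS QUEUE=$QUEUE WALLTIME=$WALLTIME"

rm cfg-sys-demo.sh
```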
Making Changes
Structure
The point of the script structure is that it is easy to copy and modify
the test-*.sh and cfg-*.sh scripts. These can be checked back
into the repo for use by others. The test-*.sh and cfg-*.sh
scripts should contain only environment variables that control how
workflow.sh and workflow.swift operate.
test-1 and cfg-{sys,prm}-1 should be left unmodified for simple
testing.
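As a minimal illustration of this copy-and-modify pattern, a copied script still consists only of environment-variable settings. The file names and values below are stand-ins created here for demonstration; they are not the repo's real test-1.sh:

```shell
# Create a mock "test script" that, like the real ones, holds only
# environment variable exports (MODEL_NAME/PROCS values are examples):
printf 'export MODEL_NAME=nt3\nexport PROCS=4\n' > test-demo.sh

# Copy it to make a customized variant, then source the copy:
cp test-demo.sh test-mymodel-demo.sh
. ./test-mymodel-demo.sh
echo "MODEL_NAME=$MODEL_NAME PROCS=$PROCS"

rm test-demo.sh test-mymodel-demo.sh
```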
Calling a different objective function
To call a different objective function:
1. Copy `common/swift/obj_app.swift` to a new directory and/or file name.
2. Edit the `app` function body to run your code and return the result.
3. Edit a `test-*.sh` script to set environment variables:
   - `OBJ_DIR`: set this to the new directory, if changed. Otherwise, `OBJ_DIR` defaults to the absolute path to `common/swift`.
   - `OBJ_MODULE`: set this to the Swift file name without the suffix, if changed. Otherwise, `OBJ_MODULE` defaults to `obj_app`.
4. Run it!
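The defaulting behavior described above can be sketched as follows. This is an assumption based on this README's description, not the actual workflow.sh source:

```shell
# Assumed defaulting logic: OBJ_DIR falls back to .../common/swift and
# OBJ_MODULE to obj_app when the user has not set them.
unset OBJ_DIR OBJ_MODULE
OBJ_DIR=${OBJ_DIR:-$PWD/common/swift}
OBJ_MODULE=${OBJ_MODULE:-obj_app}
echo "OBJ_DIR=$OBJ_DIR"
echo "OBJ_MODULE=$OBJ_MODULE"
```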
Simple test for changing objective function:
$ cd mlrMBO/ # This directory
$ export OBJ_DIR=$PWD/test
$ export OBJ_MODULE=test_obj_fail # Cf. test/test_obj_fail.swift
$ test/test-1.sh ___ dunedin # Dummy argument for MODEL_NAME (unused)
...
Swift: Assertion failed!: test-obj-fail.swift was successfully invoked!
...
This indicates that the code in `test_obj_fail.swift` was executed
instead of `obj_app.swift`.
Where to check for output
This includes error output.
When you run the test script, you will get a message about
`TURBINE_OUTPUT`. This is the main output directory for your run.
On a local system, stdout/stderr for the workflow will go to your terminal.
On a scheduled system, stdout/stderr for the workflow will go to
`TURBINE_OUTPUT/output.txt`
Stdout/stderr for the individual objective function (model) runs goes to files of the form:
`TURBINE_OUTPUT/EXPID/run/RUNID/model.log`
where `EXPID` is the user-provided experiment ID, and `RUNID` identifies
one of the model runs generated by mlrMBO, one per parameter set. A
`RUNID` has the form `R_I_J`, where `R` is the restart number, `I` is the
iteration number, and `J` is the sample within the iteration.
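A `RUNID` can be decomposed with plain shell parameter expansion; the value below is a hypothetical example:

```shell
# Split a RUNID of the form R_I_J into restart, iteration, and sample.
RUNID=2_5_13
R=${RUNID%%_*}      # restart number -> 2
rest=${RUNID#*_}    # remaining "I_J" part
I=${rest%%_*}       # iteration number -> 5
J=${rest#*_}        # sample within the iteration -> 13
echo "restart=$R iteration=$I sample=$J"
```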