Bee supports the use of variables in its workflow description file.
Variables are written with the following syntax: ${variable_type.variable_name}. The table below lists the supported variables and whether each one is resolved in the workflow template.
Variable | Resolved in template |
---|---|
${config.extras_dir} | ✔ |
${task.log_stdout} | ✘ |
${task.log_stderr} | ✘ |
${bee_indexer.start_index} | ✘ |
${bee_indexer.end_index} | ✘ |
${module.version} | ✔ |
${api.xxxx} | ✔ |
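The table can be read as a simple rule: variables of type config, module and api are substituted in the template, while task and bee_indexer variables are left for each task. A minimal Python sketch of that rule follows; the regular expression and the names `VARIABLE`, `TEMPLATE_RESOLVED_TYPES` and `find_variables` are illustrative assumptions, not part of Bee.

```python
import re

# Assumed shape of a variable reference: ${type.name}
VARIABLE = re.compile(r"\$\{([a-z_]+)\.([A-Za-z0-9_]+)\}")

# Types marked as resolved in the template in the table above
TEMPLATE_RESOLVED_TYPES = {"config", "module", "api"}


def find_variables(text):
    """Return (type, name, resolved_in_template) for each variable in text."""
    return [
        (m.group(1), m.group(2), m.group(1) in TEMPLATE_RESOLVED_TYPES)
        for m in VARIABLE.finditer(text)
    ]


snippet = '<arg value="${bee_indexer.start_index}" /> <path>${config.extras_dir}/x</path>'
for var_type, name, resolved in find_variables(snippet):
    print(var_type, name, "resolved in template" if resolved else "resolved per task")
```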
The currently supported variable of the config type is:
${config.extras_dir}
The value corresponds to the property extras.dir defined in the system.config file.
When the workflow template is resolved, this variable is replaced with the value actually used, which provides traceability.
Example:
The template snippet:
<path>${config.extras_dir}/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path>
would be resolved into:
<path>/import/bc2/home/resit/mx_nas/stage/bee/extras/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path>
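The substitution above can be sketched in Python as follows. This is only an illustration: it assumes that system.config uses a simple `key = value` format, and the helper names `load_properties` and `resolve_config` are hypothetical.

```python
def load_properties(text):
    """Parse 'key = value' lines (assumed format of system.config)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def resolve_config(template, props):
    """Replace ${config.extras_dir} with the extras.dir property."""
    return template.replace("${config.extras_dir}", props["extras.dir"])


config_text = "extras.dir = /import/bc2/home/resit/mx_nas/stage/bee/extras"
template = (
    "<path>${config.extras_dir}/ShadingCorrectionAverageImage/"
    "ComputeShadingCorrectionAvgImg.command</path>"
)
print(resolve_config(template, load_properties(config_text)))
```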
The currently supported variables of the task type are:
${task.log_stdout}
${task.log_stderr}
These variables correspond to the files to which the standard output and standard error streams of the cluster job (or task) are directed.
These files have the extensions .stdout and .stderr.
These variables can be used in the task validations to specify the files to validate.
These variables will not be resolved in the workflow template since in the case of parallel running modules, one set of such files is produced per task.
If these files need to be examined or kept, they can be sent to storage by locating them in the module's work directory by their file extension (.stdout, .stderr).
Example:
<output>
  <datasets>
    <dataset type="CLUSTER_JOB_LOGS" store="true">
      <files in_dir="" regex=".*\.stderr" />
      <files in_dir="" regex=".*\.stdout" />
    </dataset>
  </datasets>
  <validations level="task">
    <validation mode="count" sub_dir="" regex="${task.log_stdout}"
                comparator="equal" target_value="1"
                fail_status="validation_error"
                fail_message="Missing Stdout Log File" />
    <validation mode="content" sub_dir="" regex="${task.log_stdout}"
                content_regex="finished successfully"
                comparator="equal" target_value="1"
                fail_status="validation_error"
                fail_message="Missing Success Message in Stdout Log File" />
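As a rough illustration of the count validation in the example, the Python sketch below counts the file names that match a regex and compares the count with the target value. The way Bee actually evaluates validations is not described here, so the function `validate_count`, the use of a full match, and the file names are all assumptions made for this sketch.

```python
import re


def validate_count(file_names, name_regex, target_value):
    """Return True if exactly target_value file names match name_regex.

    Mirrors mode="count" with comparator="equal" (assumed semantics).
    """
    matches = [name for name in file_names if re.fullmatch(name_regex, name)]
    return len(matches) == target_value


# Hypothetical content of one task's work directory
task_files = ["task_0001.stdout", "task_0001.stderr"]

if not validate_count(task_files, r".*\.stdout", 1):
    print("validation_error: Missing Stdout Log File")
```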
The currently supported variables of the bee_indexer type are:
${bee_indexer.start_index}
${bee_indexer.end_index}
These variables correspond, in parallel running modules, to the start and end indexes of the objects to analyze in one job (task).
They can be used as arguments to be added to the executable in each of the parallel calls.
Their values are calculated in each task from the values specified as indexbuilder_dataset, indexbuilder_regex, indexes_per_job and indexes_start in the module parameters.
These variables will not be resolved in the workflow template because they are used for parallel running modules, and therefore, one set of such values is produced per task.
Example:
<module name="CPv1CPCluster" version="1.*.*"
        class="ch.systemsx.bee.workflowmanager.module.ClusterModule" >
  <params>
    <param name="indexbuilder_dataset" value="hcs_plate" />
    <param name="indexbuilder_regex" value=".*_cDAPI.*\.(TIF|JP2)$" />
    <param name="indexes_per_job" value="400" />
    <param name="indexes_start" value="2" />
  </params>
  <executable>
    <path>${config.extras_dir}/CellProfiler1/CellProfiler1_rev004094_R2012b_12001/CPCluster.command</path>
    <args>
      <arg type="path" value="dataset:CpClusterProfiling" />
      <arg type="path" value="dataset:Cpv1BatchFile" selector="Batch_data.mat" />
      <arg type="string" value="${bee_indexer.start_index}" />
      <arg type="string" value="${bee_indexer.end_index}" />
      <arg type="path" value="dataset:CpClusterResults" />
      <arg type="string" value="Batch_" />
      <arg type="string" value="yes" />
      <arg type="string" value="date" />
    </args>
  </executable>
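The exact rule Bee uses to derive the per-task indexes is not spelled out here. The Python sketch below shows one plausible reading: the files of the dataset that match indexbuilder_regex are counted, the objects are assumed to be numbered from 1 to that count, and consecutive ranges of indexes_per_job indexes are built starting at indexes_start. The function names and file names are hypothetical.

```python
import re


def count_matching(file_names, regex):
    """Count the file names matched by the index builder regex."""
    return sum(1 for name in file_names if re.search(regex, name))


def index_ranges(total, per_job, start):
    """Split indexes start..total into (start_index, end_index) pairs.

    Assumption: objects are numbered 1..total and the last job may be shorter.
    """
    ranges = []
    first = start
    while first <= total:
        last = min(first + per_job - 1, total)
        ranges.append((first, last))
        first = last + 1
    return ranges


names = ["A01_cDAPI_s1.TIF", "A01_cGFP_s1.TIF", "A02_cDAPI_s1.JP2"]
print(count_matching(names, r".*_cDAPI.*\.(TIF|JP2)$"))

# With 1000 objects, indexes_per_job=400 and indexes_start=2:
for start_index, end_index in index_ranges(1000, 400, 2):
    print(start_index, end_index)
```

Under these assumptions each pair would supply the values of ${bee_indexer.start_index} and ${bee_indexer.end_index} for one task.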
The currently supported variable of the module type is:
${module.version}
This variable stands for the version of the module, which is computed from the regex given in the module's version attribute (e.g. version="1.*.*").
The module version attribute is a regex that should match the highest 3-digit version of a module. See Jira issue BEE-114 for a detailed description of such a match.
When the workflow template is resolved, this variable is replaced with the value actually used, which provides traceability.
Example:
The template snippet:
<module name="CPv1CreateBatchFile" version="1.*.*"
        class="ch.systemsx.bee.workflowmanager.module.ClusterModule">
  <executable>
    <path>/modules/CellProfiler1/CellProfiler1_ver${module.version}/CPCluster.command</path>
would be resolved into:
<module name="CPv1CreateBatchFile" version="1.*.*"
        class="ch.systemsx.bee.workflowmanager.module.ClusterModule">
  <executable>
    <path>/modules/CellProfiler1/CellProfiler1_ver001.000.000/CPCluster.command</path>
The resolved workflow should be stored together with the stored results (as it is in the current iBrain2).
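The matching rule itself is described in BEE-114 and is not reproduced here. As a rough Python sketch, the code below treats each `*` in the version attribute as a wildcard for one version component, picks the highest matching version, and formats it with three zero-padded digits per component as in the resolved example. Both the wildcard interpretation and the function `resolve_module_version` are assumptions.

```python
def resolve_module_version(pattern, available):
    """Pick the highest available version matching a pattern like '1.*.*'.

    Assumption: '*' matches any value of a single version component.
    Returns the version zero-padded to three digits per component.
    """
    wanted = pattern.split(".")
    candidates = []
    for version in available:
        parts = version.split(".")
        if len(parts) != len(wanted):
            continue
        if all(w == "*" or int(w) == int(p) for w, p in zip(wanted, parts)):
            candidates.append(tuple(int(p) for p in parts))
    if not candidates:
        raise ValueError(f"no version matches {pattern!r}")
    return ".".join(f"{n:03d}" for n in max(candidates))


version = resolve_module_version("1.*.*", ["1.0.0"])
path = "/modules/CellProfiler1/CellProfiler1_ver${module.version}/CPCluster.command"
print(path.replace("${module.version}", version))
```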
The api-type variables are defined by the user. They have the following structure:
${api.some_variable_name}
These variables can be used when submitting a workflow through the REST interface. They are substituted in the workflow.xml with the values provided in the REST call.
Example:
Issuing the following REST call:
curl -X POST \
  --data-urlencode "workflow.id=20130912110409253-42113" \
  --data-urlencode "api.hcs_bee_cppipeline=20130912105648991-42110" \
  --data-urlencode "api.hcs_plate=20130215175757007-40162" \
  http://localhost:12345/apiv1/processes
would substitute the submitted api-type variables (api.hcs_bee_cppipeline and api.hcs_plate) with the provided values (20130912105648991-42110 and 20130215175757007-40162) in the workflow XML description as follows:
1. Submitted workflow XML template:
<input>
  <datasets>
    <dataset name="hcs_plate" id="${api.hcs_plate}" type="HCS_IMAGE_RAW" stage="true" />
    <dataset name="hcs_bee_cppipeline" id="${api.hcs_bee_cppipeline}" type="HCS_BEE_CPPIPELINE" stage="true" />
  </datasets>
</input>
2. Resolved workflow XML:
<input>
  <datasets>
    <dataset name="hcs_plate" id="20130215175757007-40162" type="HCS_IMAGE_RAW" stage="true" />
    <dataset name="hcs_bee_cppipeline" id="20130912105648991-42110" type="HCS_BEE_CPPIPELINE" stage="true" />
  </datasets>
</input>
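The substitution of api-type variables can be sketched in Python as shown below. The sketch assumes that the form parameters of the REST call arrive as a dictionary and that unknown variables are left untouched; the function `resolve_api_variables` is hypothetical and does not describe how the Bee server is implemented.

```python
import re

# Matches ${api.some_variable_name}
API_VARIABLE = re.compile(r"\$\{(api\.[A-Za-z0-9_]+)\}")


def resolve_api_variables(template, form_params):
    """Replace ${api.*} variables with the values submitted in the call."""
    values = {k: v for k, v in form_params.items() if k.startswith("api.")}

    def replace(match):
        # Leave the variable as it is when no value was submitted (assumption)
        return values.get(match.group(1), match.group(0))

    return API_VARIABLE.sub(replace, template)


form_params = {
    "workflow.id": "20130912110409253-42113",
    "api.hcs_bee_cppipeline": "20130912105648991-42110",
    "api.hcs_plate": "20130215175757007-40162",
}
template = '<dataset name="hcs_plate" id="${api.hcs_plate}" type="HCS_IMAGE_RAW" stage="true" />'
print(resolve_api_variables(template, form_params))
```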