beewm:devel:workflow_specification_syntax

Differences

This shows you the differences between two versions of the page.

beewm:devel:workflow_specification_syntax [2013/09/11 15:44]
epujadas [Summary of Supported Variables]
beewm:devel:workflow_specification_syntax [2016/05/17 16:17]
127.0.0.1 external edit
Line 2: Line 2:
  
 <code xml> <code xml>
-<workflow name="String (required)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">+<workflow name="String (optional)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">
     <hosts>     <hosts>
         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>
Line 13: Line 13:
     </input>     </input>
     <modules>     <modules>
-        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" >+        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" cpus_per_job="Integer (optional)">
             <params (optional)>             <params (optional)>
                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />
Line 27: Line 27:
             <output>             <output>
                 <datasets (optional)>                 <datasets (optional)>
-                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)">+                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)" dropbox="String: the name of the dropbox that will be used for storing the dataset. Optional, if it's not specified the type will be used as dropbox name.">
                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />
                     </dataset>                     </dataset>
Line 141: Line 141:
 ==== <workflow name="..." ... > ==== ==== <workflow name="..." ... > ====
 Name of the workflow. Name of the workflow.
 +Workflow
 ==== <workflow author="..." ... > ==== ==== <workflow author="..." ... > ====
 Author of the workflow. Author of the workflow.
Line 169: Line 170:
  
 :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed is still to be discussed, since the directory of a dataset could be specified as metadata or as a variable. :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed is still to be discussed, since the directory of a dataset could be specified as metadata or as a variable.
 +
 +
 +==== <files in_dir="..." regex="..." /> ====
 +This element selects the files and/or directories to stage. All the ''"files"'' elements defined in an output dataset are evaluated. A file or directory is selected if it meets both of the following conditions:
 +  * it is located in ''"in_dir"'', a subdirectory of the module's work directory, or directly in the module's work directory if the ''"in_dir"'' attribute is empty;
 +  * it matches the regular expression specified in the ''"regex"'' attribute.
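
As an illustration, here is a sketch of an output dataset that stages files with two ''"files"'' elements. The dataset type ''EXAMPLE_RESULTS'', the ''results'' subdirectory and both patterns are hypothetical placeholders. The patterns start with ''.*'' on the assumption that the regex has to match the whole file name.

<code xml>
<output>
  <datasets>
    <!-- Hypothetical dataset: type, subdirectory and patterns are placeholders -->
    <dataset type="EXAMPLE_RESULTS" store="TRUE">
      <!-- in_dir set: selects .csv files in the "results" subdirectory of the module's work directory -->
      <files in_dir="results" regex=".*\.csv" />
      <!-- in_dir empty: selects .log files located directly in the module's work directory -->
      <files in_dir="" regex=".*\.log" />
    </dataset>
  </datasets>
</output>
</code>

Since every ''"files"'' element of the dataset is evaluated, both the ''.csv'' files and the ''.log'' files would be staged for this dataset.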
  
 ===== <module name="..."  ... > ===== ===== <module name="..."  ... > =====
Line 246: Line 253:
  
  
-Bee supports the use of variables in its workflow description file. +Bee supports the use of variables in its workflow description file.\\ 
-Variables are specified with the following terminology:  **''${variable_type.variable_name}''**+Variables are specified with the following terminology:  **''${variable_type.variable_name}''**.\\ 
- +Please check **[[:beewm:devel:resolving_workflow_templates|Resolving Workflow Templates]]** for detailed information.
-===== Summary of Supported Variables ===== +
- +
-^ Variable                   ^ Resolved in template ^  +
-| ${config.extras_dir}       |         ✔            | +
-| ${task.log_stdout}         |         ✘            | +
-| ${task.log_stderr}         |         ✘            | +
-| ${bee_indexer.start_index} |         ✘            | +
-| ${bee_indexer.end_index}   |         ✘            | +
-| ${module.version}          |         ✔            | +
-| ${api.xxxx}                |         ✔            | +
-===== Variable types ===== +
-==== config ==== +
-The currently supported variable of this type is: +
-  *  **''${config.extras_dir}''** +
-The value corresponds to the property ''extras.dir'' defined in the ''system.config'' file.\\ +
-The workflow template is resolved by substituting this variable with the value actually used, which provides traceability.\\ +
-__Example__:\\ +
-The template snippet:  +
-<code xml> +
-<path>${config.extras_dir}/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
-would be resolved into:  +
-<code xml> +
-<path>/import/bc2/home/resit/mx_nas/stage/bee/extras/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
- +
-==== task ==== +
-The currently supported variables of this type are: +
-  *  **''${task.log_stdout}''** +
-  *  **''${task.log_stderr}''** +
-These variables correspond to the files to which the standard output and standard error streams of the cluster job (or task) are directed.\\ +
-These files have the extensions **''.stdout''** and **''.stderr''**.\\ +
-These variables can be used in the task validations to specify the files to validate.\\ +
-These variables are not resolved in the workflow template because, for modules that run in parallel, one set of such files is produced per task.\\ +
-To examine or keep those files, send them to storage by selecting them in the module's work directory by their file extension (''.stdout'', ''.stderr'').\\ +
-__Example__:\\ +
-<code xml> +
-<output> +
-  <datasets> +
-    <dataset type="CLUSTER_JOB_LOGS" store="true"> +
-      <files in_dir="" regex=".*\.stderr" /> +
-      <files in_dir="" regex=".*\.stdout" /> +
-    </dataset> +
-  </datasets> +
-  <validations level="task"> +
-    <validation mode="count" sub_dir="" regex="${task.log_stdout}" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Stdout Log File" /> +
-    <validation mode="content" sub_dir="" regex="${task.log_stdout}" content_regex="finished successfully" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Success Message in Stdout Log File" /> +
-</code> +
- +
- +
-==== bee_indexer ==== +
-The currently supported variables of this type are: +
-  *  **''${bee_indexer.start_index}''** +
-  *  **''${bee_indexer.end_index}''** +
-In modules that run in parallel, these variables hold the start and end indexes of the objects to analyze in one job (task).\\ +
-They can be used as arguments to be added to the executable in each of the parallel calls.\\ +
-Their values are calculated for each task from the values specified as ''indexbuilder_dataset'', ''indexbuilder_regex'', ''indexes_per_job'' and ''indexes_start'' in the module parameters.\\ +
-These variables are not resolved in the workflow template because they are used by modules that run in parallel, so one set of values is produced per task.\\ +
-__Example__:\\ +
-<code xml> +
-<module name="CPv1CPCluster" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <params> +
-    <param name="indexbuilder_dataset" value="hcs_plate" /> +
-    <param name="indexbuilder_regex" value=".*_cDAPI.*\.(TIF|JP2)$" /> +
-    <param name="indexes_per_job" value="400" /> +
-    <param name="indexes_start" value="2" /> +
-  </params> +
-  <executable> +
-    <path>${config.extras_dir}/CellProfiler1/CellProfiler1_rev004094_R2012b_12001/CPCluster.command</path> +
-    <args> +
-      <arg type="path" value="dataset:CpClusterProfiling" /> +
-      <arg type="path" value="dataset:Cpv1BatchFile" selector="Batch_data.mat" /> +
-      <arg type="string" value="${bee_indexer.start_index}" /> +
-      <arg type="string" value="${bee_indexer.end_index}" /> +
-      <arg type="path" value="dataset:CpClusterResults" /> +
-      <arg type="string" value="Batch_" /> +
-      <arg type="string" value="yes" /> +
-      <arg type="string" value="date" /> +
-    </args> +
-  </executable> +
-</code> +
- +
-==== module ==== +
-The currently supported variable of this type is: +
-  *  **''${module.version}''** +
-This variable stands for the module version, which is computed from the regex given in the module's ''version'' attribute (e.g.: ''version="1.*.*"'').\\ +
-The module ''version'' attribute is a regex that should match the highest 3-digit version of a module. See Jira issue [[https://jira.biozentrum.unibas.ch/browse/BEE-114 | BEE-114]] for a detailed description of such a match.\\ +
-The workflow template is resolved by substituting this variable with the value actually used, which provides traceability.\\ +
-__Example__:\\ +
-The template snippet:  +
-<code xml> +
-<module name="CPv1CreateBatchFile" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <executable> +
-    <path>/modules/CellProfiler1/CellProfiler1_ver${module.version}/CPCluster.command</path> +
-</code> +
-would be resolved into:  +
-<code xml> +
-<module name="CPv1CreateBatchFile" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <executable> +
-    <path>/modules/CellProfiler1/CellProfiler1_ver001.000.000/CPCluster.command</path> +
-</code> +
- +
-The resolved workflow should be stored together with the stored results (as it is in the current iBrain2).\\ +
  
  
beewm/devel/workflow_specification_syntax.txt · Last modified: 2016/05/20 12:11 by admin