beewm:devel:workflow_specification_syntax

Differences

This shows you the differences between two versions of the page.

beewm:devel:workflow_specification_syntax [2013/09/11 15:44]
epujadas [Summary of Supported Variables]
beewm:devel:workflow_specification_syntax [2016/05/20 12:11] (current)
admin
Line 2: Line 2:
  
 <code xml> <code xml>
-<workflow name="String (required)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">+<workflow name="String (optional)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">
     <hosts>     <hosts>
         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>
Line 8: Line 8:
     <input>     <input>
         <datasets>         <datasets>
-            <dataset name="String (required)" id="String (storage ID, required)" type="String (storage dataset type, required)" stage="TRUE|FALSE (optional, defaults to TRUE)" />+            <dataset name="String (required)" id="String (storage ID, required)" type="String (storage dataset type, optional)" stage="TRUE|FALSE (optional, defaults to TRUE)" />
             ....             ....
         </datasets>         </datasets>
     </input>     </input>
     <modules>     <modules>
-        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" >+        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" cpus_per_job="Integer (optional)">
             <params (optional)>             <params (optional)>
                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />
Line 27: Line 27:
             <output>             <output>
                 <datasets (optional)>                 <datasets (optional)>
-                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)">+                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)" dropbox="String: the name of the dropbox that will be used for storing the dataset. Optional, if it's not specified the type will be used as dropbox name.">
                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />
                     </dataset>                     </dataset>
Line 50: Line 50:
         <datasets>         <datasets>
             <dataset name="RawImages" id="0bCDME-BE01" type="HCS_IMAGE_RAW" stage="true" />             <dataset name="RawImages" id="0bCDME-BE01" type="HCS_IMAGE_RAW" stage="true" />
-            <dataset name="ComputeShadingCorrectionAverageImageSettings" id="12345" type="HCS_ARGUMENTS" stage="true" />+            <dataset name="ComputeShadingCorrectionAverageImageSettings" id="123459876" stage="true" />
             <dataset name="MergeShadingCorrectionAverageImageSettings" id="1234567890" type="HCS_ARGUMENTS" stage="true" />             <dataset name="MergeShadingCorrectionAverageImageSettings" id="1234567890" type="HCS_ARGUMENTS" stage="true" />
         </datasets>         </datasets>
Line 141: Line 141:
 ==== <workflow name="..." ... > ==== ==== <workflow name="..." ... > ====
 Name of the workflow. Name of the workflow.
 +Workflow
 ==== <workflow author="..." ... > ==== ==== <workflow author="..." ... > ====
 Author of the workflow. Author of the workflow.
Line 151: Line 152:
 When a process is started by submitting a particular workflow together with one or several input datasets, it will be checked that: When a process is started by submitting a particular workflow together with one or several input datasets, it will be checked that:
   * the input dataset(s) identified by their storage ID exist in storage   * the input dataset(s) identified by their storage ID exist in storage
-  * the input dataset(s) in storage have the same type as specified in the ''<input><datasets><dataset ... />'' element+  * the input dataset(s) in storage have the same type as specified in the ''<input><datasets><dataset ... />'' element, if given
  
  
Line 161: Line 162:
  
 ==== <dataset type="..." ... > ==== ==== <dataset type="..." ... > ====
-Required. Corresponds to the dataset type provided by the storage. It is used for validation purposes.+Optional. Corresponds to the dataset type provided by the storage. It is used for validation purposes.
  
 ==== <dataset stage="..." ... > ==== ==== <dataset stage="..." ... > ====
Line 169: Line 170:
  
 :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed remains to be discussed, since the directory of a dataset could be specified as metadata or as a variable. :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed remains to be discussed, since the directory of a dataset could be specified as metadata or as a variable.
 +
 +
 +==== <files in_dir="..." regex="..." /> ====
 +This element selects the files and/or directories to stage. All the ''"files"'' elements defined in an output dataset will be evaluated. A file or directory is selected if it is:
 +  * located in the ''"in_dir"'', as a subdirectory of the module's work directory; or located directly in the module's work directory if the ''"in_dir"'' attribute is empty.
 +  * matching the regular expression specified in the ''"regex"'' attribute.
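As an illustration of these selection rules, the following sketch shows an output dataset with two ''"files"'' elements. The dataset type ''ANALYSIS_RESULTS'', the ''results'' subdirectory and both patterns are hypothetical and chosen only for this example. The assumption is that the first element picks up the ''.csv'' files located in ''results'', and the second the ''.log'' files located directly in the module's work directory.

<code xml>
<!-- Hypothetical example: type, subdirectory and patterns are not taken from a real workflow -->
<dataset type="ANALYSIS_RESULTS" store="true">
  <!-- files in the "results" subdirectory of the module's work directory -->
  <files in_dir="results" regex=".*\.csv" />
  <!-- empty in_dir: files located directly in the module's work directory -->
  <files in_dir="" regex=".*\.log" />
</dataset>
</code>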
  
 ===== <module name="..."  ... > ===== ===== <module name="..."  ... > =====
Line 246: Line 253:
  
  
-Bee supports the use of variables in its workflow description file. +Bee supports the use of variables in its workflow description file.\\ 
-Variables are specified with the following syntax:  **''${variable_type.variable_name}''**+Variables are specified with the following syntax:  **''${variable_type.variable_name}''**.\\ 
- +Please check **[[:beewm:devel:resolving_workflow_templates|Resolving Workflow Templates]]** for detailed information.
-===== Summary of Supported Variables ===== +
- +
-^ Variable                   ^ Resolved in template ^ +
-| ${config.extras_dir}       |         ✔            | +
-| ${task.log_stdout}         |         ✘            | +
-| ${task.log_stderr}         |         ✘            | +
-| ${bee_indexer.start_index} |         ✘            | +
-| ${bee_indexer.end_index}   |         ✘            | +
-| ${module.version}          |         ✔            | +
-| ${api.xxxx}                |         ✔            | +
-===== Variable types ===== +
-==== config ==== +
-The currently supported variable of this type is: +
-  *  **''${config.extras_dir}''** +
-The value corresponds to the property ''extras.dir'' defined in the ''system.config'' file.\\ +
-The workflow template will be resolved by substituting this variable with the value actually used, to provide traceability.\\ +
-__Example__:\\ +
-The template snippet:  +
-<code xml> +
-<path>${config.extras_dir}/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
-would be resolved into:  +
-<code xml> +
-<path>/import/bc2/home/resit/mx_nas/stage/bee/extras/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
- +
-==== task ==== +
-The currently supported variables of this type are: +
-  *  **''${task.log_stdout}''** +
-  *  **''${task.log_stderr}''** +
-These variables correspond to the files to which the standard output and standard error streams of the cluster job (or task) are directed.\\ +
-The extensions of these files are **''.stdout''** and **''.stderr''**.\\ +
-These variables can be used in the task validations to specify the files to validate.\\ +
-These variables will not be resolved in the workflow template since in the case of parallel running modules, one set of such files is produced per task.\\ +
-If these files need to be examined or kept, they can be sent to storage by selecting them in the module's work directory by their file extension (''.stdout'', ''.stderr'').\\ +
-__Example__:\\ +
-<code xml> +
-<output> +
-  <datasets> +
-    <dataset type="CLUSTER_JOB_LOGS" store="true"> +
-      <files in_dir="" regex=".*\.stderr" /> +
-      <files in_dir="" regex=".*\.stdout" /> +
-    </dataset> +
-  </datasets> +
-  <validations level="task"> +
-    <validation mode="count" sub_dir="" regex="${task.log_stdout}" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Stdout Log File" /> +
-    <validation mode="content" sub_dir="" regex="${task.log_stdout}" content_regex="finished successfully" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Success Message in Stdout Log File" /> +
-</code> +
- +
- +
-==== bee_indexer ==== +
-The currently supported variables of this type are: +
-  *  **''${bee_indexer.start_index}''** +
-  *  **''${bee_indexer.end_index}''** +
-These variables correspond, in parallel running modules, to the start and end indexes of the objects to analyze in one job (task).\\ +
-They can be used as arguments to be added to the executable in each of the parallel calls.\\ +
-Their values are calculated in each task, using the values specified as ''indexbuilder_dataset'', ''indexbuilder_regex'', ''indexes_per_job'' and ''indexes_start'' in the module parameters.\\ +
-These variables will not be resolved in the workflow template because they are used for parallel running modules, and therefore, one set of such values is produced per task.\\ +
-__Example__:\\ +
-<code xml> +
-<module name="CPv1CPCluster" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <params> +
-    <param name="indexbuilder_dataset" value="hcs_plate" /> +
-    <param name="indexbuilder_regex" value=".*_cDAPI.*\.(TIF|JP2)$" /> +
-    <param name="indexes_per_job" value="400" /> +
-    <param name="indexes_start" value="2" /> +
-  </params> +
-  <executable> +
-    <path>${config.extras_dir}/CellProfiler1/CellProfiler1_rev004094_R2012b_12001/CPCluster.command</path> +
-    <args> +
-      <arg type="path" value="dataset:CpClusterProfiling" /> +
-      <arg type="path" value="dataset:Cpv1BatchFile" selector="Batch_data.mat" /> +
-      <arg type="string" value="${bee_indexer.start_index}" /> +
-      <arg type="string" value="${bee_indexer.end_index}" /> +
-      <arg type="path" value="dataset:CpClusterResults" /> +
-      <arg type="string" value="Batch_" /> +
-      <arg type="string" value="yes" /> +
-      <arg type="string" value="date" /> +
-    </args> +
-  </executable> +
-</code> +
- +
-==== module ==== +
-The currently supported variable of this type is: +
-  *  **''${module.version}''** +
-This variable can be used to specify the version of the module, which is computed from the regex given in the module ''version'' attribute (e.g.: ''version="1.*.*"'').\\ +
-The module ''version'' attribute is a regex which should match the highest 3-digit version of a module. See Jira issue [[https://jira.biozentrum.unibas.ch/browse/BEE-114 | BEE-114]] for a detailed description of such a match.\\ +
-The workflow template will be resolved by substituting this variable with the value actually used, to provide traceability.\\ +
-__Example__:\\ +
-The template snippet:  +
-<code xml> +
-<module name="CPv1CreateBatchFile" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <executable> +
-    <path>/modules/CellProfiler1/CellProfiler1_ver${module.version}/CPCluster.command</path> +
-</code> +
-would be resolved into:  +
-<code xml> +
-<module name="CPv1CreateBatchFile" version="1.*.*" class="ch.systemsx.bee.workflowmanager.module.ClusterModule"> +
-  <executable> +
-    <path>/modules/CellProfiler1/CellProfiler1_ver001.000.000/CPCluster.command</path> +
-</code> +
- +
-The resolved workflow should be stored together with the stored results (as it is in the current iBrain2).\\ +
  
  
beewm/devel/workflow_specification_syntax.1378907088.txt.gz · Last modified: 2013/09/11 15:44 by epujadas