beewm:devel:workflow_specification_syntax

Differences

This shows you the differences between two versions of the page.

beewm:devel:workflow_specification_syntax [2013/09/11 14:28]
epujadas [Example]
beewm:devel:workflow_specification_syntax [2016/05/20 12:11] (current)
admin
Line 2: Line 2:
  
 <code xml> <code xml>
-<workflow name="String (required)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">+<workflow name="String (optional)" author="String (required)" cleanup="TRUE|FALSE (optional, defaults to TRUE)">
     <hosts>     <hosts>
         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>         <run_on>CLUSTER_HOST|LOCAL_HOST (optional, defaults to CLUSTER_HOST)</run_on>
Line 8: Line 8:
     <input>     <input>
         <datasets>         <datasets>
-            <dataset name="String (required)" id="String (storage ID, required)" type="String (storage dataset type, required)" stage="TRUE|FALSE (optional, defaults to TRUE)" />+            <dataset name="String (required)" id="String (storage ID, required)" type="String (storage dataset type, optional)" stage="TRUE|FALSE (optional, defaults to TRUE)" />
             ....             ....
         </datasets>         </datasets>
     </input>     </input>
     <modules>     <modules>
-        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" >+        <module name="String (required)" version="Regex pattern (required)" class="String (optional, defaults to ch.systemsx.bee.workflowmanager.module.MockModule)" required_runtime_minutes="Integer (optional)" required_memory_mb="Integer (optional)" cpus_per_job="Integer (optional)">
             <params (optional)>             <params (optional)>
                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />                 <param name="indexbuilder_dataset|indexbuilder_regex|indexes_per_job|indexes_start (required)" value="String (required)" />
Line 27: Line 27:
             <output>             <output>
                 <datasets (optional)>                 <datasets (optional)>
-                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)">+                    <dataset name="String (required for datasets not to store, no default)" type="String (required for datasets to store, no default)" store="TRUE|FALSE (optional, defaults to FALSE)" relevant="TRUE|FALSE (optional, defaults to TRUE)" dropbox="String: the name of the dropbox that will be used for storing the dataset. Optional, if it's not specified the type will be used as dropbox name.">
                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />                         <files (considered and required only for datasets to store) in_dir="String (optional, defaults to the root of the module's work directory)" regex="Regex pattern (required)" />
                     </dataset>                     </dataset>
Line 50: Line 50:
         <datasets>         <datasets>
             <dataset name="RawImages" id="0bCDME-BE01" type="HCS_IMAGE_RAW" stage="true" />             <dataset name="RawImages" id="0bCDME-BE01" type="HCS_IMAGE_RAW" stage="true" />
-            <dataset name="ComputeShadingCorrectionAverageImageSettings" id="12345" type="HCS_ARGUMENTS" stage="true" />+            <dataset name="ComputeShadingCorrectionAverageImageSettings" id="123459876" stage="true" />
             <dataset name="MergeShadingCorrectionAverageImageSettings" id="1234567890" type="HCS_ARGUMENTS" stage="true" />             <dataset name="MergeShadingCorrectionAverageImageSettings" id="1234567890" type="HCS_ARGUMENTS" stage="true" />
         </datasets>         </datasets>
Line 141: Line 141:
 ==== <workflow name="..." ... > ==== ==== <workflow name="..." ... > ====
 Name of the workflow. Name of the workflow.
 +Workflow
 ==== <workflow author="..." ... > ==== ==== <workflow author="..." ... > ====
 Author of the workflow. Author of the workflow.
Line 151: Line 152:
 When a process is started by submitting a particular workflow together with one or several input datasets, it will be checked that: When a process is started by submitting a particular workflow together with one or several input datasets, it will be checked that:
   * the input dataset(s) identified by their storage IDs exist in storage   * the input dataset(s) identified by their storage IDs exist in storage
-  * the input dataset(s) in storage have the same type as specified in the ''<input><datasets><dataset ... />'' element+  * the input dataset(s) in storage have the same type as specified in the ''<input><datasets><dataset ... />'' element, if given
  
  
Line 161: Line 162:
  
 ==== <dataset type="..." ... > ==== ==== <dataset type="..." ... > ====
-Required. Corresponds to the dataset type provided by the storage. It is used for validation purposes.+Optional. Corresponds to the dataset type provided by the storage. It is used for validation purposes.
  
 ==== <dataset stage="..." ... > ==== ==== <dataset stage="..." ... > ====
Line 169: Line 170:
  
 :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed remains to be discussed, since the directory of a dataset could be specified as metadata or as a variable. :!: This feature is not implemented. The application behaves as if this value were set to true. Whether it is needed remains to be discussed, since the directory of a dataset could be specified as metadata or as a variable.
 +
 +
 +==== <files in_dir="..." regex="..." /> ====
 +This element selects the files and/or directories to stage. All ''"files"'' elements defined in an output dataset are evaluated. A file or directory is selected when it meets both of the following conditions:
 +  * it is located in the ''"in_dir"'' subdirectory of the module's work directory, or directly in the module's work directory if the ''"in_dir"'' attribute is empty;
 +  * it matches the regular expression specified in the ''"regex"'' attribute.
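
As a minimal sketch of the selection rules above, the following output dataset declares two ''files'' elements. The dataset type ''HCS_ANALYSIS_RESULTS'', the subdirectory ''results'' and both patterns are hypothetical values chosen for illustration; only the element and attribute names are taken from the syntax described on this page.

<code xml>
<dataset type="HCS_ANALYSIS_RESULTS" store="true">
    <!-- Hypothetical: selects CSV files located in the "results" subdirectory of the module's work directory -->
    <files in_dir="results" regex=".*\.csv" />
    <!-- Empty in_dir: selects matching entries located directly in the module's work directory -->
    <files in_dir="" regex="summary_.*\.txt" />
</dataset>
</code>

Since all ''files'' elements of a dataset are evaluated, the stored dataset would be expected to contain the entries selected by either element.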
  
 ===== <module name="..."  ... > ===== ===== <module name="..."  ... > =====
Line 246: Line 253:
  
  
-Bee supports the use of variables in its workflow description file. +Bee supports the use of variables in its workflow description file.\\ 
-Variables are specified with the following terminology:  **''${variable_type.variable_name}''**+Variables are specified with the following terminology:  **''${variable_type.variable_name}''**.\\ 
-===== Variable types ===== +Please check **[[:beewm:devel:resolving_workflow_templates|Resolving Workflow Templates]]** for detailed information.
-==== config ==== +
-The currently supported variable of this type is: +
-  *  **''${config.extras_dir}''** +
-The value corresponds to the property ''extras.dir'' defined in the ''system.config'' file.\\ +
-The workflow template will be resolved by substituting this variable with the effectively used value to provide traceability.\\ +
-__Example__:\\ +
-The template snippet:  +
-<code xml> +
-<path>${config.extras_dir}/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
-would be resolved into:  +
-<code xml> +
-<path>/import/bc2/home/resit/mx_nas/stage/bee/extras/ShadingCorrectionAverageImage/ComputeShadingCorrectionAvgImg.command</path> +
-</code> +
- +
-==== task ==== +
-The currently supported variables of this type are: +
-  *  **''${task.log_stdout}''** +
-  *  **''${task.log_stderr}''** +
-These variables correspond to the files to which the standard output and standard error streams of the cluster job (or task) are directed.\\ +
-The extensions of these files are defined to be **''.stdout''** and **''.stderr''**.\\ +
-These variables can be used in the task validations to specify the files to validate.\\ +
-These variables will not be resolved in the workflow template because, for modules that run in parallel, one set of such files is produced per task.\\ +
-If there is interest in examining or keeping those files, they can be sent to storage by locating them in the module's work directory using their file extensions (''.stdout'', ''.stderr'').\\ +
-__Example__:\\ +
-<code xml> +
-<output> +
-  <datasets> +
-    <dataset type="CLUSTER_JOB_LOGS" store="true"> +
-      <files in_dir="" regex=".*\.stderr" /> +
-      <files in_dir="" regex=".*\.stdout" /> +
-    </dataset> +
-  </datasets> +
-  <validations level="task"> +
-    <validation mode="count" sub_dir="" regex="${task.log_stdout}" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Stdout Log File" /> +
-    <validation mode="content" sub_dir="" regex="${task.log_stdout}" content_regex="finished successfully" comparator="equal" target_value="1" fail_status="validation_error" fail_message="Missing Success Message in Stdout Log File" /> +
-</code> +
- +
- +
-The resolved workflow should be stored together with the stored results (as it is in the current iBrain2).\\ +
  
  
beewm/devel/workflow_specification_syntax.1378902530.txt.gz · Last modified: 2013/09/11 14:28 by epujadas