Monitoring and debugging workflows

After submitting a workflow, you can go for a coffee and wait for the email notification sent by bee. If all went well, you will receive a success notification and find your results in the storage system. In a complex environment, however, things can and sometimes will go wrong. The rest of this section gives some hints on how to troubleshoot your workflow.

Checking the status of a workflow

If you are using openBIS as the storage provider, you can use the web client to check the status of your processing. Otherwise, you can use the RESTful interface of the workflow manager directly.

To get the status of your workflow through the RESTful interface, open the following address in your favorite browser, replacing localhost:9999 (the default) with the address of your workflow manager:

http://localhost:9999/apiv1/processes/PROCESS_ID

where PROCESS_ID is the ID you received after submission.

This returns a JSON object with the status of the workflow and its modules. If you are not familiar with JSON, you can paste the whole output into a JSON viewer such as http://jsonviewer.stack.hu/ to see it in a structured form.
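If you prefer a script to a browser, the same request can be sketched in Python. This is a minimal sketch that assumes only the address format given above; the function names are made up for this example, and because the fields of the returned JSON object are not described here, the script just pretty-prints whatever the workflow manager returns.

```python
import json
import sys
from urllib.request import urlopen

# Default workflow manager address as given in the text; change it if yours differs.
DEFAULT_BASE = "http://localhost:9999"


def status_url(process_id, base=DEFAULT_BASE):
    """Build the status address for a process ID."""
    return f"{base.rstrip('/')}/apiv1/processes/{process_id}"


def fetch_status(process_id, base=DEFAULT_BASE, timeout=10):
    """Request the status and parse the JSON reply into Python objects."""
    with urlopen(status_url(process_id, base), timeout=timeout) as response:
        return json.load(response)


def format_status(status):
    """Render a parsed JSON object with indentation for easier reading."""
    return json.dumps(status, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Usage (hypothetical file name): python check_status.py PROCESS_ID
    print(format_status(fetch_status(sys.argv[1])))
```

Saved under a name of your choice and called with the process ID as its only argument, the script prints the same JSON you would see in the browser, already indented.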

Debugging

Several things can go wrong when you perform a large-scale analysis on a cluster: infrastructure problems (failing nodes, filesystems), data problems, or wrong parameters. The following hints can help you find the root cause of a problem:

  • Check the log files under the log directory:
    • Are there exceptions in the log files?
    • Are these exceptions related to a specific resource, such as a filesystem or a cluster resource?
  • Check the working directory of the module where the problem occurs. Each workflow has its own working directory, named after its process ID, in the cluster scratch directory (which is set in the configuration). This directory contains a subdirectory for each module. Each module directory holds the standard output and standard error logs created by the queuing system and all files generated by the submitted executable. (These files are either in the root of the module directory or, if validation of the task failed, in the .tasks subdirectory.) NOTE: If a module ran successfully, its results were stored successfully, and no other module needs them anymore, bee deletes them automatically.
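Both checks can be partly automated. The following is a minimal sketch, not part of bee: the function names are made up for this example, the search words (Exception, Traceback, ERROR) are an assumption about what a failure looks like in the logs, and the directory paths must be supplied by you from your own configuration.

```python
import os
import re
from pathlib import Path

# Assumption: lines reporting a failure contain one of these words.
PATTERN = re.compile(r"Exception|Traceback|ERROR")


def find_suspicious_lines(log_dir, pattern=PATTERN):
    """Return (file, line number, text) for each matching line under log_dir."""
    hits = []
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            path = Path(root) / name
            # errors="replace" keeps the scan going on undecodable bytes.
            with path.open(errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((str(path), number, line.rstrip()))
    return hits


def list_module_files(workflow_dir):
    """Map each module subdirectory to its files, including those in .tasks."""
    listing = {}
    for entry in sorted(os.listdir(workflow_dir)):
        module_dir = os.path.join(workflow_dir, entry)
        if not os.path.isdir(module_dir):
            continue
        found = []
        for root, _dirs, files in os.walk(module_dir):
            for name in files:
                full = os.path.join(root, name)
                found.append(os.path.relpath(full, module_dir))
        listing[entry] = sorted(found)
    return listing
```

Call find_suspicious_lines with your log directory and list_module_files with the working directory of the workflow. A module whose files appear under .tasks is a good place to start looking, since that is where files end up when task validation failed.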
beewm/user/monitoring_and_debugging_workflows.txt · Last modified: 2014/05/07 09:11 by 127.0.0.1