To reduce unnecessary processing and save resources and time, we would like to execute only those steps that have not already produced a compatible result (see Dataset Equivalence Classes).
With the expected results declared in the workflow definition, this approach is straightforward to implement: we simply work backwards from the expected results:
checkList = { expected results }
modulesToExecute = { }
while checkList is not empty do
    dataSet = checkList.popHead()
    if dataSet is an input dataset
        continue
    if modulesToExecute contains a module which produces dataSet
        continue
    if there is a compatible dataset in storage
        continue
    module = the module from the workflow which produces dataSet
    modulesToExecute.add( module )
    checkList.pushBack( module.getInputDatasets() )
done
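The backward selection above can be sketched in Java. This is a minimal illustration, not the actual implementation: it assumes each module produces exactly one dataset, so a module can be identified by the name of its output; selectModules and all data structures here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BackwardSelection {

    // producerInputs: hypothetical workflow description, mapping each output
    // dataset name to the input dataset names of the module that produces it.
    static Set<String> selectModules(Map<String, List<String>> producerInputs,
                                     Set<String> inputDatasets,
                                     Set<String> compatibleInStorage,
                                     List<String> expectedResults) {
        Deque<String> checkList = new ArrayDeque<>(expectedResults);
        Set<String> modulesToExecute = new LinkedHashSet<>(); // module == its output dataset
        while (!checkList.isEmpty()) {
            String dataSet = checkList.removeFirst();
            if (inputDatasets.contains(dataSet)) continue;       // provided as workflow input
            if (modulesToExecute.contains(dataSet)) continue;    // producer already scheduled
            if (compatibleInStorage.contains(dataSet)) continue; // equivalent result cached
            modulesToExecute.add(dataSet);                       // schedule the producer
            checkList.addAll(producerInputs.get(dataSet));       // and check its inputs too
        }
        return modulesToExecute;
    }

    public static void main(String[] args) {
        // Workflow a -> b -> c, plus an unused branch a -> x.
        Map<String, List<String>> producers = Map.of(
                "b", List.of("a"),
                "c", List.of("b"),
                "x", List.of("a"));
        Set<String> result = selectModules(producers,
                Set.of("a"),    // input datasets
                Set.of("b"),    // "b" already has a compatible copy in storage
                List.of("c"));  // expected result
        System.out.println(result); // only the producer of "c" must run
    }
}
```

Note how the unused branch producing "x" is never visited, which is exactly the behaviour described in the note below.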
Note: This algorithm selects only the modules that are needed to compute the requested results. So if the workflow description contains branches/modules that produce datasets neither listed in the results section nor needed by another module, those modules will never be executed.
In the top-down approach, by contrast, we start from the modules that depend only on the input datasets, check whether their results must be recomputed, and then continue with the modules that consume these datasets.
dataSetsReady = { input dataSets }
modules = { all the modules from the workflow }
modulesToExecute = { }
while modules is not empty do
    module = a module from modules which depends only on datasets from dataSetsReady
    modules.remove( module )
    dataSetsReady.add( module.getOutputs() )
    if there is no compatible dataset in the storage for the outputs of module
        modulesToExecute.add( module )
done
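The top-down selection can be sketched the same way. Again this is an illustrative, simplified model: each module is assumed to produce a single dataset named after the module itself, and all names here are hypothetical, not the WorkflowStarter API.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class TopDownSelection {

    // moduleInputs: hypothetical workflow description, mapping each module name
    // to its input dataset names; a module's output dataset shares its name.
    static Set<String> selectModules(Map<String, List<String>> moduleInputs,
                                     Set<String> inputDatasets,
                                     Set<String> compatibleInStorage) {
        Set<String> dataSetsReady = new LinkedHashSet<>(inputDatasets);
        List<String> modules = new ArrayList<>(moduleInputs.keySet());
        Set<String> modulesToExecute = new LinkedHashSet<>();
        while (!modules.isEmpty()) {
            // Pick a module whose inputs are all ready (fails on a cyclic workflow).
            String module = modules.stream()
                    .filter(m -> dataSetsReady.containsAll(moduleInputs.get(m)))
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException("cycle in workflow"));
            modules.remove(module);
            dataSetsReady.add(module);                   // its output is now available
            if (!compatibleInStorage.contains(module)) { // no equivalent cached result
                modulesToExecute.add(module);
            }
        }
        return modulesToExecute;
    }

    public static void main(String[] args) {
        // Workflow a -> b -> c, plus a branch a -> x.
        Map<String, List<String>> modules = Map.of(
                "b", List.of("a"),
                "c", List.of("b"),
                "x", List.of("a"));
        Set<String> result = selectModules(modules, Set.of("a"), Set.of("b"));
        System.out.println(result); // "c" and "x" must run; "b" is cached
    }
}
```

Unlike the backward algorithm, this variant visits every module of the workflow, so the unused branch producing "x" is also scheduled.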
The top-down approach is the one implemented. See the following two methods in the WorkflowStarter class:
setOutputEquivalences(Set<Module> modules, WorkflowConfig workflowConfig, DatasetEquivalenceChecker datasetEquivalenceChecker)
setParentsComplete(List<Module> moduleGraph)