Prerequisite Tasks and Subtasks
Xylar Asay-Davis
date: 2017/06/12
Summary
Currently, no tasks depend on other tasks to run. However, in order to allow multiple plots to be generated simultaneously, it is desirable to break tasks into multiple subtasks, and some of these subtasks will need to rely on data from other subtasks. It is also conceivable that multiple tasks could rely on the same data (e.g. a common climatology dataset). The proposed solution to this problem is to allow “prerequisite tasks” for a given analysis task. The task will only run after the prerequisite task(s) have completed. Prerequisite tasks could be used to build up a sequence of analysis tasks in several steps. Some of these steps could be shared between analysis tasks (e.g. computing a single data set and then plotting it in various ways). Implementation of this design will be considered a success if dependent tasks only run once their prerequisite tasks have completed successfully.
Requirements
Requirement: Define prerequisite tasks
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
A simple mechanism (such as a list of task names) exists to define prerequisite tasks of each analysis task.
Requirement: Add prerequisites to task list
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
Given a task that we want to run, a mechanism must exist for adding its prerequisites (if any) to the list of tasks to be run.
Requirement: Holding dependent tasks
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
Dependent tasks (those with prerequisites) must be prevented from running until their prerequisites have successfully finished.
Requirement: Cancel dependents of failed prerequisites
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
If a prerequisite of a dependent task has failed, the dependent task should not be run.
Algorithmic Formulations
Design solution: Define prerequisite tasks
Date last modified: 2017/09/19
Contributors: Xylar Asay-Davis
Each task will be constructed with a list of the names of prerequisite tasks. If a task has no prerequisites (the default), the list is empty.
Design solution: Add prerequisites to task list
Date last modified: 2017/10/11
Contributors: Xylar Asay-Davis
A recursive function will be used to add a given task (assuming its check_generate method returns True, meaning that task should be generated) and its dependencies to a list of analyses to run. The code (with a few error messages removed for brevity) is as follows:
def add_task_and_subtasks(analysisTask, analysesToGenerate,
                          callCheckGenerate=True):
    if analysisTask in analysesToGenerate:
        return

    if callCheckGenerate and not analysisTask.check_generate():
        # we don't need to add this task -- it wasn't requested
        return

    # first, we should try to add the prerequisites of this task and its
    # subtasks (if they aren't also subtasks for this task); work on a copy
    # so that analysisTask.runAfterTasks itself is not modified
    prereqs = list(analysisTask.runAfterTasks)
    for subtask in analysisTask.subtasks:
        for prereq in subtask.runAfterTasks:
            if prereq not in analysisTask.subtasks:
                prereqs.append(prereq)

    for prereq in prereqs:
        add_task_and_subtasks(prereq, analysesToGenerate,
                              callCheckGenerate=False)
        if prereq._setupStatus != 'success':
            # this task should also not run
            analysisTask._setupStatus = 'fail'
            return

    # make sure all prereqs have been set up successfully before trying to
    # set up this task -- this task's setup may depend on setup in the prereqs
    try:
        analysisTask.setup_and_check()
    except BaseException:
        analysisTask._setupStatus = 'fail'
        return

    # next, we should try to add the subtasks. This is done after the current
    # analysis task has been set up in case subtasks depend on information
    # from the parent task
    for subtask in analysisTask.subtasks:
        add_task_and_subtasks(subtask, analysesToGenerate,
                              callCheckGenerate=False)
        if subtask._setupStatus != 'success':
            analysisTask._setupStatus = 'fail'
            return

    analysesToGenerate.append(analysisTask)
    analysisTask._setupStatus = 'success'


analysesToGenerate = []
# check which analysis we actually want to generate and only keep those
for analysisTask in analyses:
    # update the list with this task and perhaps its subtasks
    add_task_and_subtasks(analysisTask, analysesToGenerate)
Design solution: Holding dependent tasks
Date last modified: 2017/10/11
Contributors: Xylar Asay-Davis
Each task is given a _runStatus attribute, which is a multiprocessing.Value object that can be shared and changed across processes. A set of constant possible values for this attribute, READY, BLOCKED, RUNNING, SUCCESS and FAIL, is defined in AnalysisTask. If a task has no prerequisites, initially _runStatus = READY; otherwise _runStatus = BLOCKED. Any READY task can be run (_runStatus = RUNNING). Any task that finishes is given _runStatus = SUCCESS or _runStatus = FAIL (I know, not grammatically consistent but compact…).
When a new parallel slot becomes available, all BLOCKED tasks are checked to see if any prerequisites have failed (in which case the task also fails) or if all prerequisites have succeeded, in which case the task is now READY. After that, the next READY task is run.
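The status check described above could be sketched roughly as follows. This is an illustration only, not the code in run_mpas_analysis: the Task class, the integer codes chosen for the constants, and the helpers update_blocked_tasks and next_ready_task are hypothetical. Only _runStatus, runAfterTasks and the five status names come from the design.

```python
import multiprocessing

# Hypothetical integer codes; the design only names the constants.
READY, BLOCKED, RUNNING, SUCCESS, FAIL = range(5)


class Task:
    """Hypothetical stand-in for AnalysisTask with only the fields used here."""

    def __init__(self, name, runAfterTasks=()):
        self.name = name
        self.runAfterTasks = list(runAfterTasks)
        initial = BLOCKED if self.runAfterTasks else READY
        # 'i' is a signed int that can be shared between processes
        self._runStatus = multiprocessing.Value('i', initial)


def update_blocked_tasks(tasks):
    """Fail or release each BLOCKED task based on its prerequisites."""
    for task in tasks:
        if task._runStatus.value != BLOCKED:
            continue
        statuses = [prereq._runStatus.value for prereq in task.runAfterTasks]
        if FAIL in statuses:
            # a failed prerequisite means this task fails too
            task._runStatus.value = FAIL
        elif all(status == SUCCESS for status in statuses):
            task._runStatus.value = READY


def next_ready_task(tasks):
    """Return the first READY task, or None if nothing can run yet."""
    update_blocked_tasks(tasks)
    for task in tasks:
        if task._runStatus.value == READY:
            return task
    return None
```

Because the check walks the list in order, a list that places prerequisites before their dependents (the order in which add_task_and_subtasks appends them) lets a failure propagate down a chain of dependents in a single pass.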
Design solution: Cancel dependents of failed prerequisites
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
Same as above: When a new parallel slot becomes available, all BLOCKED
tasks are checked to see if any prerequisites have failed (in which case the
task also fails).
Design and Implementation
The design has been implemented in the branch xylar/add_mpas_climatology_task.
Implementation: Define prerequisite tasks
Date last modified: 2017/10/11
Contributors: Xylar Asay-Davis
AnalysisTask now has an attribute runAfterTasks, which defaults to an empty list. Prerequisite tasks can be added by calling run_after(self, task) with the task that this task should follow.
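As a rough illustration of this bookkeeping, the sketch below shows a stripped-down class with only the prerequisite list. It is not the actual AnalysisTask: the taskName argument and the duplicate check in run_after are assumptions made for the example, and the list holds task objects (as the add_task_and_subtasks listing implies) rather than names.

```python
class AnalysisTask:
    """Minimal stand-in showing only the prerequisite bookkeeping."""

    def __init__(self, taskName):
        # taskName is a hypothetical argument used only for this sketch
        self.taskName = taskName
        # no prerequisites by default
        self.runAfterTasks = []

    def run_after(self, task):
        """Require that ``task`` finish before this task runs."""
        # assumption: adding the same prerequisite twice is harmless
        if task not in self.runAfterTasks:
            self.runAfterTasks.append(task)


# hypothetical task names, for illustration
climatology = AnalysisTask('computeClimatology')
plot = AnalysisTask('plotClimatology')
plot.run_after(climatology)
```

Here plot will not be eligible to run until climatology has finished, while climatology itself has no prerequisites.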
Implementation: Add prerequisites to task list
Date last modified: 2017/10/11
Contributors: Xylar Asay-Davis
build_analysis_list in run_mpas_analysis has been modified to call a recursive function add_task_and_subtasks that adds a task, its prerequisites (if they have not already been added) and its subtasks to the list of tasks to run.
Implementation: Holding dependent tasks
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
The run_analysis function in run_mpas_analysis has been updated to be aware of the status of each task, as described in the algorithms section.
Implementation: Cancel dependents of failed prerequisites
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
Again, the run_analysis function in run_mpas_analysis has been updated to be aware of the status of each task, as described in the algorithms section.
Testing and Validation
Date last modified: 2017/06/12
Contributors: Xylar Asay-Davis
All plots will be tested to ensure they are bit-for-bit identical to those produced by develop for all tests defined in the configs/edison and configs/lanl directories. Tasks will be run in parallel and I will verify that no dependent tasks run before prerequisite tasks have completed.
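One way such an ordering check might be automated is sketched below. The design does not say how the verification is performed, so everything here is an assumption: the find_ordering_violations helper, the dictionary of prerequisite names, and the list of (task name, 'start' or 'finish') events are all hypothetical and would have to be extracted from whatever the run actually logs.

```python
def find_ordering_violations(prerequisites, events):
    """
    Report tasks that started before one of their prerequisites finished.

    prerequisites : dict mapping a task name to a list of prerequisite names
    events : list of (task name, 'start' or 'finish') tuples in the order
        in which they occurred

    Returns a list of (task name, prerequisite name) pairs.
    """
    finished = set()
    violations = []
    for name, kind in events:
        if kind == 'finish':
            finished.add(name)
        elif kind == 'start':
            for prereq in prerequisites.get(name, []):
                if prereq not in finished:
                    violations.append((name, prereq))
    return violations


# hypothetical task names and event log, for illustration
prerequisites = {'plotClimatology': ['computeClimatology']}
goodLog = [('computeClimatology', 'start'),
           ('computeClimatology', 'finish'),
           ('plotClimatology', 'start'),
           ('plotClimatology', 'finish')]
badLog = [('computeClimatology', 'start'),
          ('plotClimatology', 'start'),
          ('computeClimatology', 'finish'),
          ('plotClimatology', 'finish')]
```

An empty result for a run's event log would indicate that every dependent task started only after all of its prerequisites had finished.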