Developers: Understanding an analysis task
This tutorial walks a new developer through an existing analysis task to provide a more in-depth understanding of the code. It is meant as the starting point for the Developers: Adding a new analysis task tutorial. A common practice is to find an existing analysis task that is as close as possible to the new analysis and to copy it as a template for the new task. This tutorial describes such an existing task, and Developers: Adding a new analysis task uses it as the starting point for developing a new one.
1. The big picture
MPAS-Analysis is meant to provide a first look at E3SM simulation output from the MPAS-Ocean and MPAS-Seaice components. The analysis is intended to be robust and automated. However, there is currently little effort to ensure that the time periods covered by the observations and the model output are the same. In other words, we often compare pre-industrial simulation results with present-day observations. The justification for this is twofold. First, we typically have few if any observations covering the pre-industrial period. Second, we may be attempting to reduce biases that we assess to be much larger than the expected differences between pre-industrial and present-day climate conditions. Under these conditions, MPAS-Analysis provides us with a useful first impression of how our simulation is doing.
1.1 MPAS output
The primary output from MPAS-Ocean and MPAS-Seaice consists of monthly and daily averages of a large number of data fields.
The components also produce a smaller amount of more specialized output, such as monthly maximum/minimum values.
MPAS data is provided on unstructured meshes, meaning that it isn’t particularly amenable to analysis with standard tools such as ESMValTool. Additionally, E3SM’s science campaigns require unique, sometimes regionally focused analysis not available in existing tools.
1.2 Analysis tasks
MPAS-Analysis is designed to run a series of interdependent analysis tasks in parallel with one another. It builds up a dependency graph between the tasks, allowing independent tasks to run at the same time while putting dependent tasks on hold until the tasks they depend on are completed. Additionally, MPAS-Analysis has some rudimentary tools for keeping track of the resources that some computationally intensive tasks require to prevent the tool from running out of memory.
Currently, nearly all operations in MPAS-Analysis must run on a single HPC node. (The exception is ncclimo, which is used to generate climatologies, and which can run in parallel across up to 12 nodes if desired.) We hope to support broader task parallelism in the not-too-distant future using the parsl python package.
Each analysis task is a class that descends from the AnalysisTask base class. Tasks can also have “subtasks” that do part of the work needed for the final analysis. A subtask might perform a computation on a specific region, period of time, or season. It might combine data from other subtasks into a single dataset. Or it might plot the data computed by a previous task. The advantages of dividing up the work are 1) that each subtask can potentially run in parallel with other subtasks and 2) that the same subtask can be reused across multiple analysis tasks.
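To make the structure concrete, here is a minimal sketch of a task (the class name, task name and tag are made up for illustration; real tasks, explored below, do considerably more in their constructors):

from mpas_analysis.shared import AnalysisTask


class ExampleTask(AnalysisTask):
    """A hypothetical task used only to illustrate the structure."""

    def __init__(self, config):
        # the base class stores the config options along with the task name,
        # component and tags used to decide whether to generate this analysis
        super().__init__(config=config, taskName='exampleTask',
                         componentName='ocean', tags=['example'])
        # real tasks typically construct subtasks here and register them:
        # self.add_subtask(some_subtask)

    def run_task(self):
        # the actual work happens here (or in subtasks)
        self.logger.info('running the example analysis')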
2. Tour of an analysis task (ClimatologyMapOHCAnomaly)
Aside from some code that takes care of managing analysis tasks and generating web pages, MPAS-Analysis is made up almost entirely of analysis tasks and the shared functions they can call. Since adding new analysis nearly always means creating a new class for the task, we start with a tour of an existing analysis task as well as the AnalysisTask base class that it descends from.
We will use ClimatologyMapOHCAnomaly as an example analysis task for this tour because it will turn out to be a useful starting point for the analysis we want to add in Developers: Adding a new analysis task. You can read more about climatologyMapOHCAnomaly in the User's Guide.
It will be useful to open the following link in your browser to have a look at the code directly: ClimatologyMapOHCAnomaly
If you want to be a little more adventurous, you can also pull up the code for the base class: AnalysisTask
2.1 Attributes
Classes can contain pieces of data called attributes. In MPAS-Analysis, the objects representing tasks share several attributes that they inherit from the AnalysisTask class. A few of the most important attributes of an analysis task are:
config - an object for getting the values of config options
namelist - an object for getting namelist options from the E3SM simulation
runStreams - an object for finding MPAS output files in the run directory. In practice, this is always a restart file used to get the MPAS mesh and, for MPAS-Ocean, the vertical coordinate.
historyStreams - an object for finding MPAS history streams (often timeSeriesStatsMonthlyOutput).
calendar - the name of the calendar that was used in the MPAS run (in practice always 'noleap' or, until recently, 'gregorian_noleap').
xmlFileNames - a list of XML files associated with plots produced by this analysis task. As we will discuss, these are used to help populate the web page showing the analysis.
logger - an object that takes care of sending output to log files (rather than the terminal) when the analysis is running. During the run_task() phase of the analysis, when tasks are running in parallel with each other, make sure to use logger.info() instead of print() to send output to the log file.
Within the methods of an analysis task class, these attributes can be accessed using the self object, e.g. self.config. It is often helpful to make a local reference to the object to make the code more compact, e.g.:
config = self.config
seasons = config.getexpression('climatologyMapOHCAnomaly', 'seasons')
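Similarly, during the run_task() phase, output should go to the task's logger rather than print(); for example (the message text is just illustrative):

self.logger.info('starting the OHC anomaly analysis')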
The analysis task we're looking at, ClimatologyMapOHCAnomaly, has some attributes of its own:
mpas_climatology_task - the task that produced the climatology to be remapped and plotted
ref_year_climatology_task - the task that produced the climatology from the first year to be remapped and then subtracted from the main climatology (since we want to plot an anomaly from the beginning of the simulation)
2.2 Constructor
Almost all classes have “constructors”, which are methods for making a new object of that class. In python, the constructor is called __init__(). In general, the __ (double underscore) is used in python to indicate a function or method with special meaning.
The constructor of a subclass (such as ClimatologyMapOHCAnomaly) always calls the constructor of the superclass (AnalysisTask in this case). So we'll talk about the constructor for ClimatologyMapOHCAnomaly first and then get to AnalysisTask.
The constructor for ClimatologyMapOHCAnomaly starts off like this:
def __init__(self, config, mpas_climatology_task,
ref_year_climatology_task, control_config=None):
As with all methods, it takes the self object as the first argument. Next, it takes a config object, which is true of all analysis tasks. Then, it has some other arguments that are more specific to the analysis being performed. Here, we have two other analysis tasks as arguments: mpas_climatology_task and ref_year_climatology_task. As described in the previous section, these are tasks for computing climatologies that will later be remapped to a comparison grid for plotting. A little later in the constructor, we store references to these tasks as attributes:
self.mpas_climatology_task = mpas_climatology_task
self.ref_year_climatology_task = ref_year_climatology_task
Returning to the constructor above, the first thing we do is to call the super class's __init__() method:
def __init__(self, config, mpas_climatology_task,
ref_year_climatology_task, control_config=None):
"""
Construct the analysis task.
Parameters
----------
config : mpas_tools.config.MpasConfigParser
Configuration options
mpas_climatology_task : mpas_analysis.shared.climatology.MpasClimatologyTask
The task that produced the climatology to be remapped and plotted
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
control_config : mpas_tools.config.MpasConfigParser, optional
Configuration options for a control run (if any)
"""
field_name = 'deltaOHC'
# call the constructor from the base class (AnalysisTask)
super().__init__(config=config, taskName='climatologyMapOHCAnomaly',
componentName='ocean',
tags=['climatology', 'horizontalMap', field_name,
'publicObs', 'anomaly'])
We're passing along the config options to the base class so it can store them. Then, we're giving the task a unique taskName (the same as the class name except that it starts with a lowercase letter). We're saying that the MPAS componentName is the ocean. Then, we're giving the task a number of tags that can be helpful in determining whether or not to generate this particular analysis based on the Generate Option, as illustrated below. The tags are used to describe various aspects of the analysis. Here, we will produce plots of a climatology (as opposed to a time series). The plot will be a horizontalMap. It will involve the variable deltaOHC. This analysis doesn't involve any observations, but we include a publicObs tag to indicate that it doesn't require any proprietary observational data sets that we do not have the rights to make public. (Currently, we have a few such data sets for things like Antarctic melt rates.) Finally, the analysis involves an anomaly computed relative to the beginning of the simulation.
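For example, the generate config option (see the Generate Option section of the User's Guide for the exact section name and syntax) selects analysis by task name and by tag. A purely illustrative value, not a default, might look like the following, which would generate everything tagged climatology while skipping anything tagged anomaly:

[output]
generate = ['all_climatology', 'no_anomaly']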
From there, we get the values of some config options, raising errors if we find something unexpected:
section_name = self.taskName
# read in what seasons we want to plot
seasons = config.getexpression(section_name, 'seasons')
if len(seasons) == 0:
raise ValueError(f'config section {section_name} does not contain '
f'valid list of seasons')
comparison_grid_names = config.getexpression(section_name,
'comparisonGrids')
if len(comparison_grid_names) == 0:
raise ValueError(f'config section {section_name} does not contain '
f'valid list of comparison grids')
depth_ranges = config.getexpression('climatologyMapOHCAnomaly',
'depthRanges',
use_numpyfunc=True)
By default, these config options look like this:
[climatologyMapOHCAnomaly]
## options related to plotting horizontally remapped climatologies of
## ocean heat content (OHC) against control model results (if available)
...
# Months or seasons to plot (Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct,
# Nov, Dec, JFM, AMJ, JAS, OND, ANN)
seasons = ['ANN']
# comparison grid(s) ('latlon', 'antarctic') on which to plot analysis
comparisonGrids = ['latlon']
# A list of pairs of minimum and maximum depths (positive up, in meters) to
# include in the vertical sums. The default values are the equivalents of the
# default ranges of the timeSeriesOHCAnomaly task, with a value of -10,000 m
# intended to be well below the bottom of the ocean for all existing MPAS-O
# meshes.
depthRanges = [(0.0, -10000.0), (0.0, -700.0), (-700.0, -2000.0), (-2000.0, -10000.0)]
We plot only the annual mean OHC anomaly and we plot it only on a global latitude-longitude grid. The depth ranges are:
the full ocean column
sea surface to 700 m depth
700 m to 2000 m depth
2000 m to the seafloor
A user would be free to change any of these config options, and the analysis should run correctly. They could choose to plot on a different comparison grid, add new seasons, or change the depth range. As long as they ran the analysis in a fresh directory (or purged output from a previous analysis run), this should work correctly.
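For example, a user config file could override the defaults with something along these lines (purely illustrative values, using seasons and grids listed in the comments above):

[climatologyMapOHCAnomaly]
# also plot winter and summer climatologies
seasons = ['ANN', 'JFM', 'JAS']
# plot on the Antarctic stereographic grid as well as the lat-lon grid
comparisonGrids = ['latlon', 'antarctic']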
Next, we store some values that will be useful later:
mpas_field_name = 'deltaOHC'
variable_list = ['timeMonthly_avg_activeTracers_temperature',
'timeMonthly_avg_layerThickness']
This particular analysis involves 4 different depth ranges over which we compute the ocean heat content. The remainder of the analysis is performed separately for each of these depth ranges in subtasks. We loop over the depth ranges and add a subtask that will first compute the ocean heat content (OHC) and then remap it to the comparison grids (RemapMpasOHCClimatology):
for min_depth, max_depth in depth_ranges:
depth_range_string = \
f'{np.abs(min_depth):g}-{np.abs(max_depth):g}m'
remap_climatology_subtask = RemapMpasOHCClimatology(
mpas_climatology_task=mpas_climatology_task,
ref_year_climatology_task=ref_year_climatology_task,
parent_task=self,
climatology_name=f'{field_name}_{depth_range_string}',
variable_list=variable_list,
comparison_grid_names=comparison_grid_names,
seasons=seasons,
min_depth=min_depth,
max_depth=max_depth)
self.add_subtask(remap_climatology_subtask)
...
We will explore the RemapMpasOHCClimatology subtask later in the tutorial so we will not discuss it further here. Still within the loop over depth ranges, we then add a subtask (PlotClimatologyMapSubtask) for each plot we want to create, one for each comparison grid and season. (By default, there is only one comparison grid and one “season”: the full year, ANN.)
for min_depth, max_depth in depth_ranges:
...
out_file_label = f'deltaOHC_{depth_range_string}'
remap_observations_subtask = None
if control_config is None:
ref_title_label = None
ref_field_name = None
diff_title_label = 'Model - Observations'
else:
control_run_name = control_config.get('runs', 'mainRunName')
ref_title_label = f'Control: {control_run_name}'
ref_field_name = mpas_field_name
diff_title_label = 'Main - Control'
for comparison_grid_name in comparison_grid_names:
for season in seasons:
# make a new subtask for this season and comparison grid
subtask_name = f'plot{season}_{comparison_grid_name}_{depth_range_string}'
subtask = PlotClimatologyMapSubtask(
self, season, comparison_grid_name,
remap_climatology_subtask, remap_observations_subtask,
controlConfig=control_config, subtaskName=subtask_name)
subtask.set_plot_info(
outFileLabel=out_file_label,
fieldNameInTitle=f'$\\Delta$OHC over {depth_range_string}',
mpasFieldName=mpas_field_name,
refFieldName=ref_field_name,
refTitleLabel=ref_title_label,
diffTitleLabel=diff_title_label,
unitsLabel=r'GJ m$^{-2}$',
imageCaption=f'Anomaly in Ocean Heat Content over {depth_range_string}',
galleryGroup='OHC Anomaly',
groupSubtitle=None,
groupLink='ohc_anom',
galleryName=None)
self.add_subtask(subtask)
First, we make sure the subtask has a unique name. If two tasks or subtasks have the same taskName and subtaskName, MPAS-Analysis will only run the last one and the task manager may become confused.
Then, we create a subtask object that is an instance of the PlotClimatologyMapSubtask class. This class is shared between several ocean analysis tasks for plotting climatologies as horizontal maps. It can plot just MPAS output, remapped to one or more comparison grids and averaged over one or more seasons. It can also plot that data against an observational field that has been remapped to the same comparison grid and averaged over the same seasons. In this case, there are no observations available for comparison (remap_observations_subtask = None). A user may have provided a “control” run of MPAS-Analysis to compare with this analysis run (a so-called “model vs. model” comparison). If so, control_config will have config options describing the other analysis run. If not, control_config is None.
Next, we call the set_plot_info() method of PlotClimatologyMapSubtask to provide things like the title and units for the plot and the field to plot. We also provide information needed for the final analysis web page, such as the name of the gallery group. (We do not provide a gallery name within the gallery group because there will be no other galleries within this group.) All the plots for a given comparison grid will end up in the same gallery, with different depths and seasons one after the other.
Finally, we call add_subtask() to add the subtask to this task.
2.3 setup_and_check() method
The setup_and_check() method of an analysis task is called when it is clear that this particular analysis has been requested (but before the analysis is actually ready to run). This is in contrast to the constructor, which is run for every analysis task every time MPAS-Analysis runs, because we need information from the analysis task (its name, component and tags) in order to determine whether it should run or not.
In this method, we would typically perform checks to make sure the simulation has been configured properly to run the analysis, for example, whether the necessary analysis member is enabled.
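As an illustration (not part of this particular task, which leaves such checks to the climatology tasks it depends on), many tasks call a helper from the base class along these lines to verify that the relevant analysis member was turned on in the simulation:

# raise an error if monthly-mean output was not enabled in the simulation
self.check_analysis_enabled('config_am_timeseriesstatsmonthly_enable')

With that in mind, here is the method for ClimatologyMapOHCAnomaly: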
def setup_and_check(self):
"""
Checks whether analysis is being performed only on the reference year,
in which case the analysis will not be meaningful.
Raises
------
ValueError: if attempting to analyze only the reference year
"""
# first, call setup_and_check from the base class (AnalysisTask),
# which will perform some common setup, including storing:
# self.runDirectory , self.historyDirectory, self.plotsDirectory,
# self.namelist, self.runStreams, self.historyStreams,
# self.calendar
super().setup_and_check()
start_year, end_year = self.mpas_climatology_task.get_start_and_end()
ref_start_year, ref_end_year = \
self.ref_year_climatology_task.get_start_and_end()
if (start_year == ref_start_year) and (end_year == ref_end_year):
raise ValueError('OHC Anomaly is not meaningful and will not work '
'when climatology and ref year are the same.')
In this particular case, we first call the super class's version of the setup_and_check() method, which takes care of some important setup. Then, we use this method to check whether the user has specified meaningful values for the climatology start and end years and the reference year. If the two periods happen to be the same, it doesn't really make sense to run the analysis, so we raise an error and the analysis gets skipped.
The ClimatologyMapOHCAnomaly task delegates all its work to its subtasks, so it doesn't define a run_task() method. Tasks or subtasks that actually do the work typically need to define this method, as we will explore below.
3. Tour of a subtask (RemapMpasOHCClimatology)
The class RemapMpasOHCClimatology is, in some ways, more complicated than its “parent” task ClimatologyMapOHCAnomaly. It descends not from the AnalysisTask base class but from another subtask class, RemapMpasClimatologySubtask. This tutorial won't attempt to cover RemapMpasClimatologySubtask in all its detail. The basics are that the class starts with MPAS climatology data over one or more seasons that has previously been computed by an MpasClimatologyTask task. It remaps that data from the MPAS mesh to one or more comparison grids (e.g. global latitude-longitude or Antarctic stereographic) where it can be plotted and compared with observations or another MPAS-Analysis run.
Here, we are not using RemapMpasClimatologySubtask directly because we need to add to its functionality. We need to compute the OHC, which is not available straight from MPAS-Ocean output, from the monthly-mean temperature and layer thickness.
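Conceptually, the OHC per unit area over a depth range is the vertical sum of rho * c_p * T * h over the layers in that range (where h is the layer thickness), scaled here to GJ m^-2. A one-line sketch of the xarray computation that the subtask performs (the full implementation appears in section 3.5):

# schematic only: rho and cp come from the namelist; temperature and
# layer_thickness are masked to the requested depth range
ohc = 1e-9 * rho * cp * (temperature * layer_thickness).sum(dim='nVertLevels')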
3.1 Attributes
The docstring indicates the attributes that RemapMpasOHCClimatology includes. (It also has all the attributes of its super class, RemapMpasClimatologySubtask, and that class's super class, AnalysisTask, but we don't redundantly document these in the docstring, in part because that would be a maintenance nightmare.)
class RemapMpasOHCClimatology(RemapMpasClimatologySubtask):
"""
A subtask for computing climatologies of ocean heat content from
climatologies of temperature
Attributes
----------
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
min_depth, max_depth : float
The minimum and maximum depths for integration
"""
The attributes are a task for computing the climatology over the reference year (usually the start of the simulation), ref_year_climatology_task, and the minimum and maximum depth over which the ocean heat content will be integrated.
3.2 Constructor
def __init__(self, mpas_climatology_task, ref_year_climatology_task,
parent_task, climatology_name, variable_list, seasons,
comparison_grid_names, min_depth, max_depth):
"""
Construct the analysis task and adds it as a subtask of the
``parent_task``.
Parameters
----------
mpas_climatology_task : mpas_analysis.shared.climatology.MpasClimatologyTask
The task that produced the climatology to be remapped
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
parent_task : mpas_analysis.shared.AnalysisTask
The parent task, used to get the ``taskName``, ``config`` and
``componentName``
climatology_name : str
A name that describes the climatology (e.g. a short version of
the important field(s) in the climatology) used to name the
subdirectories for each stage of the climatology
variable_list : list of str
A list of variable names in ``timeSeriesStatsMonthly`` to be
included in the climatologies
seasons : list of str, optional
A list of seasons (keys in ``shared.constants.monthDictionary``)
to be computed or ['none'] (not ``None``) if only monthly
climatologies are needed.
comparison_grid_names : list of {'latlon', 'antarctic'}
The name(s) of the comparison grid to use for remapping.
min_depth, max_depth : float
The minimum and maximum depths for integration
"""
depth_range_string = f'{np.abs(min_depth):g}-{np.abs(max_depth):g}m'
subtask_name = f'remapMpasClimatology_{depth_range_string}'
# call the constructor from the base class
# (RemapMpasClimatologySubtask)
super().__init__(
mpas_climatology_task, parent_task, climatology_name,
variable_list, seasons, comparison_grid_names,
subtaskName=subtask_name)
self.ref_year_climatology_task = ref_year_climatology_task
self.run_after(ref_year_climatology_task)
self.min_depth = min_depth
self.max_depth = max_depth
Most of the arguments to the constructor are passed along to the constructor of RemapMpasClimatologySubtask. These include a reference to the task for computing MPAS climatologies (used to find the input files and to make sure this subtask waits until that task is finished), a reference to the “parent” ClimatologyMapOHCAnomaly task for some of its attributes, the name of the climatology supplied by the parent (something like deltaOHC_0-700m, depending on the depth range), a list of the variables that go into computing the OHC, the season(s) over which the climatology was requested, the comparison grid(s) to plot on and a unique name for this subtask.
The ref_year_climatology_task for computing the climatology over the reference year is retained as an attribute of the class along with the depth range. These attributes will all be needed later when we compute the OHC. We indicate that this task must wait for the reference climatology to be available by calling run_after().
The super class will do the same for the mpas_climatology_task task. It will also add this task as a subtask of the parent task.
3.3 setup_and_check() method
As in the parent task, we need to define the setup_and_check() method.
def setup_and_check(self):
"""
Perform steps to set up the analysis and check for errors in the setup.
"""
# first, call setup_and_check from the base class
# (RemapMpasClimatologySubtask), which will set up remappers and add
# variables to mpas_climatology_task
super().setup_and_check()
# don't add the variables and seasons to mpas_climatology_task until
# we're sure this subtask is supposed to run
self.ref_year_climatology_task.add_variables(self.variableList,
self.seasons)
In this particular case, we first call the super class's version of the setup_and_check() method. This takes care of some important setup, including adding the variables and season(s) we need to the mpas_climatology_task.
Then, we use this method to add the variables we need and the requested season(s) to the task for computing the climatology over the reference year (ref_year_climatology_task). We don't do this in the constructor because, if we did, we would always be asking for the variables needed to compute the OHC even if we don't actually end up computing it. This could be a big waste of time and disk space. The super class RemapMpasClimatologySubtask can't take care of this for us because it isn't designed for computing anomalies, just “normal” climatologies over a range of years.
3.4 run_task() method
Normally, the main work of a task happens in the run_task() method. The RemapMpasOHCClimatology class doesn't define this method because it is happy to inherit the run_task() method from its super class, RemapMpasClimatologySubtask.
An abbreviated version of that method looks like this:
def run_task(self):
"""
Compute the requested climatologies
"""
...
for season in self.seasons:
self._mask_climatologies(season, dsMask)
...
It calls a private helper method:
def _mask_climatologies(self, season, dsMask):
"""
For each season, creates a masked version of the climatology
"""
...
if not os.path.exists(maskedClimatologyFileName):
...
# customize (if this function has been overridden)
climatology = self.customize_masked_climatology(climatology,
season)
write_netcdf(climatology, maskedClimatologyFileName)
This private method (the leading underscore indicates that it is private), in turn, calls the customize_masked_climatology() method, which is our chance to make changes to the climatology before it gets remapped. That's where we will actually compute the OHC from variables available from MPAS output.
3.5 customize_masked_climatology() method
Here is how we compute the OHC itself:
def customize_masked_climatology(self, climatology, season):
"""
Compute the ocean heat content (OHC) anomaly from the temperature
and layer thickness fields.
Parameters
----------
climatology : xarray.Dataset
the climatology data set
season : str
The name of the season to be masked
Returns
-------
climatology : xarray.Dataset
the modified climatology data set
"""
ohc = self._compute_ohc(climatology)
...
We call a private helper method to do the actual work, so let's take a look at that before we continue with customize_masked_climatology().
def _compute_ohc(self, climatology):
"""
Compute the OHC from the temperature and layer thicknesses in a given
climatology data sets.
"""
ds_restart = xr.open_dataset(self.restartFileName)
ds_restart = ds_restart.isel(Time=0)
# specific heat [J/(kg*degC)]
cp = self.namelist.getfloat('config_specific_heat_sea_water')
# [kg/m3]
rho = self.namelist.getfloat('config_density0')
units_scale_factor = 1e-9
n_vert_levels = ds_restart.sizes['nVertLevels']
z_mid = compute_zmid(ds_restart.bottomDepth, ds_restart.maxLevelCell-1,
ds_restart.layerThickness)
vert_index = xr.DataArray.from_dict(
{'dims': ('nVertLevels',), 'data': np.arange(n_vert_levels)})
temperature = climatology['timeMonthly_avg_activeTracers_temperature']
layer_thickness = climatology['timeMonthly_avg_layerThickness']
masks = [vert_index < ds_restart.maxLevelCell,
z_mid <= self.min_depth,
z_mid >= self.max_depth]
for mask in masks:
temperature = temperature.where(mask)
layer_thickness = layer_thickness.where(mask)
ohc = units_scale_factor * rho * cp * layer_thickness * temperature
ohc = ohc.sum(dim='nVertLevels')
return ohc
This function uses a combination of mesh information taken from an MPAS restart file (available from the self.restartFileName attribute inherited from RemapMpasClimatologySubtask), namelist options available from the self.namelist reader (inherited from AnalysisTask), and temperature and layer_thickness from the climatology dataset itself. As the docstring for customize_masked_climatology() states, climatology is an xarray.Dataset. We know it has the variables timeMonthly_avg_activeTracers_temperature and timeMonthly_avg_layerThickness because we requested them back in the constructor of ClimatologyMapOHCAnomaly. We compute the ohc as an xarray.DataArray that we return from this helper method.
Back to customize_masked_climatology(), we have:
def customize_masked_climatology(self, climatology, season):
...
ohc = self._compute_ohc(climatology)
ref_file_name = self.ref_year_climatology_task.get_file_name(season)
ref_year_climo = xr.open_dataset(ref_file_name)
if 'Time' in ref_year_climo.dims:
ref_year_climo = ref_year_climo.isel(Time=0)
ref_ohc = self._compute_ohc(ref_year_climo)
climatology['deltaOHC'] = ohc - ref_ohc
climatology.deltaOHC.attrs['units'] = 'GJ m^-2'
start_year = self.ref_year_climatology_task.startYear
climatology.deltaOHC.attrs['description'] = \
f'Anomaly from year {start_year} in ocean heat content'
climatology = climatology.drop_vars(self.variableList)
return climatology
We use the same helper function to compute the ref_ohc using the climatology for the reference year. Then, we compute the anomaly (the difference between these two, deltaOHC) and we add some attributes, units and description, to make the NetCDF output that will go into the analysis output directory a little more useful.
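As a quick sanity check, these attributes can be inspected in the resulting NetCDF file, for example (the file name here is hypothetical; the actual files live in the climatology subdirectories of the analysis output directory):

import xarray as xr

ds = xr.open_dataset('mpaso_masked_ANN_climo.nc')  # hypothetical file name
print(ds['deltaOHC'].attrs['units'])         # GJ m^-2
print(ds['deltaOHC'].attrs['description'])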
4. The full code for posterity
Since the ClimatologyMapOHCAnomaly analysis task may evolve in the future, here is the full analysis task as described in this tutorial:
# This software is open source software available under the BSD-3 license.
#
# Copyright (c) 2022 Triad National Security, LLC. All rights reserved.
# Copyright (c) 2022 Lawrence Livermore National Security, LLC. All rights
# reserved.
# Copyright (c) 2022 UT-Battelle, LLC. All rights reserved.
#
# Additional copyright and license information can be found in the LICENSE file
# distributed with this code, or at
# https://raw.githubusercontent.com/MPAS-Dev/MPAS-Analysis/main/LICENSE
import xarray as xr
import numpy as np
from mpas_analysis.shared import AnalysisTask
from mpas_analysis.shared.climatology import RemapMpasClimatologySubtask
from mpas_analysis.ocean.plot_climatology_map_subtask import \
PlotClimatologyMapSubtask
from mpas_analysis.ocean.utility import compute_zmid
class ClimatologyMapOHCAnomaly(AnalysisTask):
"""
An analysis task for comparison of the anomaly from a reference year
(typically the start of the simulation) of ocean heat content (OHC)
Attributes
----------
mpas_climatology_task : mpas_analysis.shared.climatology.MpasClimatologyTask
The task that produced the climatology to be remapped and plotted
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
"""
def __init__(self, config, mpas_climatology_task,
ref_year_climatology_task, control_config=None):
"""
Construct the analysis task.
Parameters
----------
config : mpas_tools.config.MpasConfigParser
Configuration options
mpas_climatology_task : mpas_analysis.shared.climatology.MpasClimatologyTask
The task that produced the climatology to be remapped and plotted
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
control_config : mpas_tools.config.MpasConfigParser, optional
Configuration options for a control run (if any)
"""
field_name = 'deltaOHC'
# call the constructor from the base class (AnalysisTask)
super().__init__(config=config, taskName='climatologyMapOHCAnomaly',
componentName='ocean',
tags=['climatology', 'horizontalMap', field_name,
'publicObs', 'anomaly'])
self.mpas_climatology_task = mpas_climatology_task
self.ref_year_climatology_task = ref_year_climatology_task
section_name = self.taskName
# read in what seasons we want to plot
seasons = config.getexpression(section_name, 'seasons')
if len(seasons) == 0:
raise ValueError(f'config section {section_name} does not contain '
f'valid list of seasons')
comparison_grid_names = config.getexpression(section_name,
'comparisonGrids')
if len(comparison_grid_names) == 0:
raise ValueError(f'config section {section_name} does not contain '
f'valid list of comparison grids')
depth_ranges = config.getexpression('climatologyMapOHCAnomaly',
'depthRanges',
use_numpyfunc=True)
mpas_field_name = 'deltaOHC'
variable_list = ['timeMonthly_avg_activeTracers_temperature',
'timeMonthly_avg_layerThickness']
for min_depth, max_depth in depth_ranges:
depth_range_string = \
f'{np.abs(min_depth):g}-{np.abs(max_depth):g}m'
remap_climatology_subtask = RemapMpasOHCClimatology(
mpas_climatology_task=mpas_climatology_task,
ref_year_climatology_task=ref_year_climatology_task,
parent_task=self,
climatology_name=f'{field_name}_{depth_range_string}',
variable_list=variable_list,
comparison_grid_names=comparison_grid_names,
seasons=seasons,
min_depth=min_depth,
max_depth=max_depth)
self.add_subtask(remap_climatology_subtask)
out_file_label = f'deltaOHC_{depth_range_string}'
remap_observations_subtask = None
if control_config is None:
ref_title_label = None
ref_field_name = None
diff_title_label = 'Model - Observations'
else:
control_run_name = control_config.get('runs', 'mainRunName')
ref_title_label = f'Control: {control_run_name}'
ref_field_name = mpas_field_name
diff_title_label = 'Main - Control'
for comparison_grid_name in comparison_grid_names:
for season in seasons:
# make a new subtask for this season and comparison grid
subtask_name = f'plot{season}_{comparison_grid_name}_{depth_range_string}'
subtask = PlotClimatologyMapSubtask(
self, season, comparison_grid_name,
remap_climatology_subtask, remap_observations_subtask,
controlConfig=control_config, subtaskName=subtask_name)
subtask.set_plot_info(
outFileLabel=out_file_label,
fieldNameInTitle=f'$\\Delta$OHC over {depth_range_string}',
mpasFieldName=mpas_field_name,
refFieldName=ref_field_name,
refTitleLabel=ref_title_label,
diffTitleLabel=diff_title_label,
unitsLabel=r'GJ m$^{-2}$',
imageCaption=f'Anomaly in Ocean Heat Content over {depth_range_string}',
galleryGroup='OHC Anomaly',
groupSubtitle=None,
groupLink='ohc_anom',
galleryName=None)
self.add_subtask(subtask)
def setup_and_check(self):
"""
Checks whether analysis is being performed only on the reference year,
in which case the analysis will not be meaningful.
Raises
------
ValueError: if attempting to analyze only the reference year
"""
# first, call setup_and_check from the base class (AnalysisTask),
# which will perform some common setup, including storing:
# self.runDirectory , self.historyDirectory, self.plotsDirectory,
# self.namelist, self.runStreams, self.historyStreams,
# self.calendar
super().setup_and_check()
start_year, end_year = self.mpas_climatology_task.get_start_and_end()
ref_start_year, ref_end_year = \
self.ref_year_climatology_task.get_start_and_end()
if (start_year == ref_start_year) and (end_year == ref_end_year):
raise ValueError('OHC Anomaly is not meaningful and will not work '
'when climatology and ref year are the same.')
class RemapMpasOHCClimatology(RemapMpasClimatologySubtask):
"""
A subtask for computing climatologies of ocean heat content from
climatologies of temperature
Attributes
----------
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
min_depth, max_depth : float
The minimum and maximum depths for integration
"""
def __init__(self, mpas_climatology_task, ref_year_climatology_task,
parent_task, climatology_name, variable_list, seasons,
comparison_grid_names, min_depth, max_depth):
"""
Construct the analysis task and adds it as a subtask of the
``parent_task``.
Parameters
----------
mpas_climatology_task : mpas_analysis.shared.climatology.MpasClimatologyTask
The task that produced the climatology to be remapped
ref_year_climatology_task : mpas_analysis.shared.climatology.RefYearMpasClimatologyTask
The task that produced the climatology from the first year to be
remapped and then subtracted from the main climatology
parent_task : mpas_analysis.shared.AnalysisTask
The parent task, used to get the ``taskName``, ``config`` and
``componentName``
climatology_name : str
A name that describes the climatology (e.g. a short version of
the important field(s) in the climatology) used to name the
subdirectories for each stage of the climatology
variable_list : list of str
A list of variable names in ``timeSeriesStatsMonthly`` to be
included in the climatologies
seasons : list of str, optional
A list of seasons (keys in ``shared.constants.monthDictionary``)
to be computed or ['none'] (not ``None``) if only monthly
climatologies are needed.
comparison_grid_names : list of {'latlon', 'antarctic'}
The name(s) of the comparison grid to use for remapping.
min_depth, max_depth : float
The minimum and maximum depths for integration
"""
depth_range_string = f'{np.abs(min_depth):g}-{np.abs(max_depth):g}m'
subtask_name = f'remapMpasClimatology_{depth_range_string}'
# call the constructor from the base class
# (RemapMpasClimatologySubtask)
super().__init__(
mpas_climatology_task, parent_task, climatology_name,
variable_list, seasons, comparison_grid_names,
subtaskName=subtask_name)
self.ref_year_climatology_task = ref_year_climatology_task
self.run_after(ref_year_climatology_task)
self.min_depth = min_depth
self.max_depth = max_depth
def setup_and_check(self):
"""
Perform steps to set up the analysis and check for errors in the setup.
"""
# first, call setup_and_check from the base class
# (RemapMpasClimatologySubtask), which will set up remappers and add
# variables to mpas_climatology_task
super().setup_and_check()
# don't add the variables and seasons to mpas_climatology_task until
# we're sure this subtask is supposed to run
self.ref_year_climatology_task.add_variables(self.variableList,
self.seasons)
def customize_masked_climatology(self, climatology, season):
"""
Compute the ocean heat content (OHC) anomaly from the temperature
and layer thickness fields.
Parameters
----------
climatology : xarray.Dataset
the climatology data set
season : str
The name of the season to be masked
Returns
-------
climatology : xarray.Dataset
the modified climatology data set
"""
ohc = self._compute_ohc(climatology)
ref_file_name = self.ref_year_climatology_task.get_file_name(season)
ref_year_climo = xr.open_dataset(ref_file_name)
if 'Time' in ref_year_climo.dims:
ref_year_climo = ref_year_climo.isel(Time=0)
ref_ohc = self._compute_ohc(ref_year_climo)
climatology['deltaOHC'] = ohc - ref_ohc
climatology.deltaOHC.attrs['units'] = 'GJ m^-2'
start_year = self.ref_year_climatology_task.startYear
climatology.deltaOHC.attrs['description'] = \
f'Anomaly from year {start_year} in ocean heat content'
climatology = climatology.drop_vars(self.variableList)
return climatology
def _compute_ohc(self, climatology):
"""
Compute the OHC from the temperature and layer thicknesses in a given
climatology data sets.
"""
ds_restart = xr.open_dataset(self.restartFileName)
ds_restart = ds_restart.isel(Time=0)
# specific heat [J/(kg*degC)]
cp = self.namelist.getfloat('config_specific_heat_sea_water')
# [kg/m3]
rho = self.namelist.getfloat('config_density0')
units_scale_factor = 1e-9
n_vert_levels = ds_restart.sizes['nVertLevels']
z_mid = compute_zmid(ds_restart.bottomDepth, ds_restart.maxLevelCell-1,
ds_restart.layerThickness)
vert_index = xr.DataArray.from_dict(
{'dims': ('nVertLevels',), 'data': np.arange(n_vert_levels)})
temperature = climatology['timeMonthly_avg_activeTracers_temperature']
layer_thickness = climatology['timeMonthly_avg_layerThickness']
masks = [vert_index < ds_restart.maxLevelCell,
z_mid <= self.min_depth,
z_mid >= self.max_depth]
for mask in masks:
temperature = temperature.where(mask)
layer_thickness = layer_thickness.where(mask)
ohc = units_scale_factor * rho * cp * layer_thickness * temperature
ohc = ohc.sum(dim='nVertLevels')
return ohc