Quick Start Guide¶
Analysis for simulations produced with Model for Prediction Across Scales (MPAS) components and the Energy Exascale Earth System Model (E3SM), which uses those components.
Installation for users¶
MPAS-Analysis is available as an anaconda package via the conda-forge channel:
conda config --add channels conda-forge
conda create -n mpas-analysis mpas-analysis
conda activate mpas-analysis
Installation for developers¶
- To use the latest version for developers, get the code from:
Then, you will need to set up a conda environment from the MPAS-Analysis repo.
This environment will include the dependencies required for development, as
listed in dev-spec.txt, and will install the
mpas_analysis package into
the conda environment in a way that points directly to the local branch (so
changes you make to the code directly affect
mpas_analysis in the conda environment):
conda config --add channels conda-forge
conda config --set channel_priority strict
conda create -y -n mpas_dev --file dev-spec.txt
conda activate mpas_dev
python -m pip install -e .
If you are developing another conda package at the same time (this is common
for MPAS-Tools or geometric_features), you should first comment out the other
package in dev-spec.txt. Then, you can install both packages in the same
development environment, e.g.:
conda create -y -n mpas_dev --file tools/MPAS-Tools/conda_package/dev-spec.txt \
    --file analysis/MPAS-Analysis/dev-spec.txt
conda activate mpas_dev
cd tools/MPAS-Tools/conda_package
python -m pip install -e .
cd ../../../analysis/MPAS-Analysis
python -m pip install -e .
The paths to the repos may be different in your local clones. With the
mpas_dev environment as defined above, you can make changes to both the
MPAS-Tools and mpas-analysis packages in their respective branches, and
these changes will be reflected when you refer to the packages or call their
respective entry points (command-line tools).
Download analysis input data¶
If you installed the
mpas-analysis package, download the data
needed by MPAS-Analysis by running:
download_analysis_data -o /path/to/mpas_analysis/diagnostics
/path/to/mpas_analysis/diagnostics is the main folder that will contain two subdirectories:
- mpas_analysis, which includes mapping and region mask files for standard-resolution MPAS meshes
- observations, which includes the pre-processed observations listed in the Observations table and used to evaluate the model results
Once you have downloaded the analysis data, you will point to its location
(your equivalent of
/path/to/mpas_analysis/diagnostics above) in the config option
baseDirectory in the
If you installed the
mpas-analysis package, list the available analysis tasks
This lists all tasks and their tags. These can be used in the
--generate command-line option or the generate config option. See the comments in
mpas_analysis/default.cfg for more details.
Running the analysis¶
Create an empty config file (say
example.cfg), or copy one of the example files in the
configs directory (if using a git repo), or download one from the example configs directory.
Either modify config options in your new file or copy and modify config options from
mpas_analysis/default.cfg (in a git repo) or directly from GitHub: default.cfg.
If you installed the mpas-analysis package, run
mpas_analysis myrun.cfg. This will read the configuration first from
mpas_analysis/default.cfg and then override that configuration with any changes from myrun.cfg.
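The defaults-then-overrides layering can be illustrated with Python's standard configparser module. This is only a sketch of the idea, not the loader that MPAS-Analysis itself uses; the two config strings stand in for mpas_analysis/default.cfg and myrun.cfg, and every value in them is made up for illustration.

```python
import configparser

# Stand-in for mpas_analysis/default.cfg (values are hypothetical)
DEFAULT_CFG = """
[output]
baseDirectory = /dir/for/analysis/output

[timeSeries]
endYear = 10
"""

# Stand-in for a user's myrun.cfg: only the options that differ
MYRUN_CFG = """
[output]
baseDirectory = /scratch/myrun/analysis
"""

config = configparser.ConfigParser()
config.optionxform = str          # keep camelCase option names as written
config.read_string(DEFAULT_CFG)   # defaults are read first
config.read_string(MYRUN_CFG)     # user settings then replace matching options

print(config["output"]["baseDirectory"])   # value from the custom file
print(config["timeSeries"]["endYear"])     # untouched default
```

Options that appear only in the defaults remain available, which is why a custom file needs to contain only the settings that change.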
If you want to run a subset of the analysis, you can either set the
generate option under [output] in your config file or use the
--generate flag on the command line. See the comments in
mpas_analysis/default.cfg for more details on this option.
Requirements for custom config files:
- At minimum you should set baseDirectory under [output] to the folder where output is stored. NOTE: this value should be a unique directory for each run being analyzed. If multiple runs are analyzed in the same directory, cached results from a previous analysis will not be updated correctly.
- Any options you copy into the config file must include the appropriate section header (e.g. ‘[run]’ or ‘[output]’).
- You do not need to copy all options from mpas_analysis/default.cfg. This file will automatically be used for any options you do not include in your custom config file.
- You should not modify mpas_analysis/default.cfg itself.
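The section-header requirement can be checked before a run with a few lines of Python. The sketch below uses the standard configparser module and assumes the output folder option is called baseDirectory under [output]; it illustrates the rule and is not part of MPAS-Analysis, and parse_custom_config is a hypothetical helper name.

```python
import configparser


def parse_custom_config(text):
    """Parse config text; configparser raises if an option has no section header."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(text)
    return parser


# A valid custom config: the option sits under its section header
good = """
[output]
baseDirectory = /scratch/myrun/analysis
"""

# An invalid custom config: the same option copied without its header
bad = "baseDirectory = /scratch/myrun/analysis\n"

parser = parse_custom_config(good)
has_output_dir = parser.has_option("output", "baseDirectory")

try:
    parse_custom_config(bad)
    header_error = None
except configparser.MissingSectionHeaderError as err:
    header_error = err

print("output folder set:", has_output_dir)
print("header problem detected:", header_error is not None)
```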
List of MPAS output files that are needed by MPAS-Analysis:¶
mpaso.hist.am.timeSeriesStatsMonthly.*.nc (Note: since ocean heat content (OHC) anomalies are computed with respect to the first year of the simulation, if OHC diagnostics are activated, the analysis will need the first full year of
mpaso.hist.am.timeSeriesStatsMonthly.*.nc files, no matter what
[timeSeries]/startYear and [timeSeries]/endYear are. This is especially important to know if short-term archiving is used in the run being analyzed: in that case, set
[input]/seaIceHistorySubdirectory to the appropriate run and archive directories and choose
[timeSeries]/endYear to include only data that have been short-term archived).
mpaso.rst.0002-01-01_00000.nc (or any other mpas-o restart file)
mpasseaice.rst.0002-01-01_00000.nc (or any other mpas-seaice restart file)
Note: for older runs, mpas-seaice files will be named:
mpas-cice_in
Also, for older runs
mpaso_in will be named:
Purge Old Analysis¶
To purge old analysis (delete the whole output directory) before running
the analysis, add the
--purge flag. If you installed
the mpas-analysis package, run:
mpas_analysis --purge <config.file>
All of the subdirectories listed in
output will be deleted along with the
climatology subdirectories in
It is a good policy to use the purge flag for most changes to the config file, for example, updating the start and/or end years of climatologies (and sometimes time series), changing the resolution of a comparison grid, renaming the run, or changing the seasons over which climatologies are computed for a given task. Purging is also advisable after updating the code to the latest version.
Cases where it is reasonable not to purge would be, for example, changing
options that only affect plotting (color map, ticks, ranges, font sizes, etc.),
rerunning with a different set of tasks specified by the generate option
(though this will often cause climatologies to be re-computed with new
variables and may not save time compared with purging), generating only the
final website with
--html_only, and re-running after the simulation has
progressed to extend time series (however, this is not recommended for changing the
bounds on climatologies, see above).
Running in parallel via a queueing system¶
- If you are running from a git repo, copy the appropriate job script file from configs/<machine_name> to the root directory (or another directory if preferred). The default script, configs/job_script.default.bash, is appropriate for a laptop or desktop computer with multiple cores.
- If using the mpas-analysis conda package, download the job script and/or sample config file from the example configs directory.
- Modify the number of parallel tasks, the run name, the output directory and the path to the config file for the run.
  Note: the number of parallel tasks can be anything between 1 and the number of analysis tasks to be performed. If there are more tasks than parallel tasks, later tasks will simply wait until earlier tasks have finished.
- Submit the job using the modified job script.
If a job script for your machine is not available, try modifying the default
job script in
configs/job_script.default.bash or one of the job scripts for
another machine to fit your needs.
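The waiting behavior when there are more analysis tasks than parallel tasks resembles a fixed-size worker pool. The following is a generic sketch using Python's concurrent.futures, not the scheduler that MPAS-Analysis uses; the task names and the PARALLEL_TASKS value are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

PARALLEL_TASKS = 2  # hypothetical setting, analogous to the job script's task count
task_names = ["taskA", "taskB", "taskC", "taskD", "taskE"]  # placeholder names

active = 0
peak_active = 0
lock = threading.Lock()


def run_task(name):
    """Pretend to run one analysis task while tracking how many run at once."""
    global active, peak_active
    with lock:
        active += 1
        peak_active = max(peak_active, active)
    # ... the real work of the task would happen here ...
    with lock:
        active -= 1
    return name


# Five tasks share two workers; tasks beyond the first two wait for a free worker
with ThreadPoolExecutor(max_workers=PARALLEL_TASKS) as pool:
    finished = list(pool.map(run_task, task_names))

print(finished)
print("peak concurrency:", peak_active)
```

The peak concurrency never exceeds PARALLEL_TASKS, so requesting more parallel tasks than there are analysis tasks brings no benefit.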
Customizing plots or creating new ones¶
There are three main ways to either customize the plots that MPAS-Analysis already makes or create new ones:
- Customize the config file. Some features, such as colormaps and colorbar limits for color-shaded plots or depth ranges for ocean region time series, can be customized: look at mpas_analysis/default.cfg for the customizations available for each analysis task.
- Read the analysis data computed by MPAS-Analysis into custom scripts. When running MPAS-Analysis with the purpose of generating both climatologies and time series, the following data sets are generated:
  - [baseDirectory]/clim/mpas/avg/unmasked_[mpasMeshName]: MPAS-Ocean and MPAS-seaice climatologies on the native grid.
  - [baseDirectory]/clim/mpas/avg/remapped: remapped climatologies for each chosen task (climatology files are stored in different subdirectories according to the task name).
  - [baseDirectory]/clim/obs: observational climatologies.
  - [baseDirectory]/timeseries: various time series data.
  Custom scripts can then utilize these datasets to generate custom plots.
- Add a new analysis task to MPAS-Analysis (see below).
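A custom script might start by locating the data set directories listed above. The sketch below only builds the paths from a base directory and mesh name and counts NetCDF files if the directories exist; the helper name analysis_data_dirs, the example base directory, the mesh name, and the assumption that the files end in .nc are all illustrative.

```python
from pathlib import Path


def analysis_data_dirs(base_directory, mpas_mesh_name):
    """Return the data set directories described above as a dict of Paths."""
    base = Path(base_directory)
    avg = base / "clim" / "mpas" / "avg"
    return {
        "native_climatologies": avg / f"unmasked_{mpas_mesh_name}",
        "remapped_climatologies": avg / "remapped",
        "observational_climatologies": base / "clim" / "obs",
        "time_series": base / "timeseries",
    }


# Hypothetical output base directory and mesh name
dirs = analysis_data_dirs("/scratch/myrun/analysis", "exampleMesh")

for label, path in dirs.items():
    # Assumes outputs are NetCDF files with a .nc suffix
    netcdf_files = sorted(path.glob("**/*.nc")) if path.is_dir() else []
    print(f"{label}: {path} ({len(netcdf_files)} NetCDF files found)")
```

From here, the files could be opened with whichever NetCDF reader the script already uses.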
Instructions for creating a new analysis task¶
Analysis tasks can be found in a directory corresponding to each component,
e.g. mpas_analysis/ocean for MPAS-Ocean. Shared functionality is contained in
mpas_analysis/shared.
create a new task by
copying mpas_analysis/analysis_task_template.py to the appropriate folder (
ocean, sea_ice, etc.) and modifying it as described in the template. Take a look at
mpas_analysis/shared/analysis_task.py for additional guidance.
note, no changes need to be made to
mpas_analysis/default.cfg (and possibly any machine-specific config files in
import new analysis task in
add new analysis task to
build_analysis_list, see below.
A new analysis task can be added with:
This will add a new object of the
MyTask class to the list of analysis tasks in
build_analysis_list. Later on, in
run_analysis, the code will first
go through the list to check whether each task needs to be generated (by calling
check_generate), then call
setup_and_check on each task (to make sure the appropriate analysis member (AM) is
on and files are present), and finally call
run on each task that is
to be generated and is set up properly.
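The three-stage flow (check_generate, then setup_and_check, then run) can be sketched in a few lines of Python. The class and function below are hypothetical stand-ins written for illustration: only the three method names come from the text above, while the signatures, the name-or-tag selection logic, and the error handling are assumptions and do not reproduce the real MPAS-Analysis classes.

```python
class SketchTask:
    """Hypothetical stand-in for an analysis task (not the real base class)."""

    def __init__(self, name, tags):
        self.name = name
        self.tags = tags
        self.log = []

    def check_generate(self, requested):
        # Assumed rule: generate if the task name or one of its tags was requested
        return self.name in requested or any(t in requested for t in self.tags)

    def setup_and_check(self):
        # A real task would verify analysis members and input files here
        self.log.append("setup_and_check")

    def run(self):
        self.log.append("run")


def run_analysis_sketch(tasks, requested):
    """Filter tasks, set them up, then run those that were set up properly."""
    to_run = []
    for task in tasks:
        if not task.check_generate(requested):
            continue
        try:
            task.setup_and_check()
        except Exception as err:
            print(f"skipping {task.name}: {err}")
            continue
        to_run.append(task)
    for task in to_run:
        task.run()
    return [task.name for task in to_run]


tasks = [SketchTask("myTask", ["ocean"]), SketchTask("otherTask", ["seaIce"])]
ran = run_analysis_sketch(tasks, requested=["ocean"])
print(ran)
```

Separating setup from running means a task with missing inputs is dropped before any task starts its work.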
Create a development environment as described above in “Installation for
developers”. Then run:
To generate the
Sphinx documentation, run:
cd docs
make clean
make html
The results can be viewed in your web browser by opening: