Overview

compass is a python package. All of the code in the package can be accessed in one of two ways. The first is the command-line interface, with commands like compass list and compass setup. The second is through import statements like:

from compass.io import symlink


symlink('../initial_condition/initial_condition.nc', 'init.nc')

Before we dig into the details of how to develop new test cases and other infrastructure for compass, we first give a little bit of background on the design philosophy behind the package.

Code Style

All code is required to adhere fairly strictly to the PEP8 style guide. A bot will flag any PEP8 violations as part of each pull request to https://github.com/MPAS-Dev/compass. Please consider using an editor that automatically flags PEP8 violations during code development, such as PyCharm or Spyder, or a linter, such as flake8 or pycodestyle (formerly named pep8). We discourage you from automatically reformatting your code (e.g. with autopep8) because this can often produce undesirable and confusing results.

The flake8 utility for linting python files to the PEP8 standard is included in the COMPASS conda environment. To use flake8, just run flake8 from any directory and it will return lint results for all files, recursing through all subdirectories. You can also run it on a single file or with wildcards (e.g., flake8 *.py). There is also a vim plugin that runs the flake8 linter from within vim. If you are not using an IDE that lints automatically, we recommend that you run flake8 from the command line or through the vim plugin before committing your code changes.

Packages and Modules

Why a python package? That sounds complicated.

Some of the main advantages of compass being a package instead of a group of scripts (as was the case for Legacy COMPASS) are that:

  1. it is a lot easier to share code between test cases;

  2. there is no need to create symlinks to individual scripts or use subprocess calls to run one python script from within another;

  3. functions within compass modules and subpackages have relatively simple interfaces that are easier to document and understand than the arguments passed into a script; and

  4. releases of the compass package would make it easy for developers of other python packages and scripts to use our code (though there are not yet any “downstream” packages that use compass).

This documentation won’t try to provide a whole tutorial on python packages, modules and classes, but we know that many developers may not be familiar with these concepts, so here’s a short intro.

Packages

A python package is a directory that has a file called __init__.py. That file can be empty or it can have code in it. If it has functions or classes inside of it, they act like they’re directly in the package. As an example, the compass file compass/ocean/__init__.py has a class compass.ocean.Ocean() that looks like this (with the docstrings stripped out):

class Ocean(MpasCore):
    def __init__(self):
        super().__init__(name='ocean')

        self.add_test_group(BaroclinicChannel(mpas_core=self))
        self.add_test_group(GlobalOcean(mpas_core=self))
        self.add_test_group(IceShelf2d(mpas_core=self))
        self.add_test_group(Ziso(mpas_core=self))

This class contains all of the ocean test groups, which contain all the ocean test cases and their steps. The details aren’t important. The point is that the class can be imported like so:

from compass.ocean import Ocean


ocean = Ocean()

So you don’t ever refer to __init__.py directly. It’s like a hidden shortcut, so that its contents can be referenced with just the subdirectory (package) name.

A package can contain other packages and modules (we’ll discuss these in just a second). For example, the ocean package mentioned above is inside the compass package. The sequence of dots in the import is how you find your way from the root (compass for this package) into subpackages and modules. It’s similar to the / characters in a unix directory.
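To make the role of __init__.py concrete, here is a minimal sketch that builds a throwaway package in a temporary directory and imports from it. The names demo_pkg and its Ocean class are hypothetical stand-ins for illustration only; they are not part of compass.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: demo_pkg/ocean/__init__.py
# (demo_pkg and everything in it is hypothetical, not part of compass).
root = Path(tempfile.mkdtemp())
ocean_dir = root / 'demo_pkg' / 'ocean'
ocean_dir.mkdir(parents=True)

# An empty __init__.py is enough to make demo_pkg a package.
(root / 'demo_pkg' / '__init__.py').write_text('')

# Code placed in ocean/__init__.py behaves as if it lives directly in
# the subpackage demo_pkg.ocean.
(ocean_dir / '__init__.py').write_text(
    "class Ocean:\n"
    "    name = 'ocean'\n"
)

# Make the temporary directory importable, then import by dotted path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_pkg.ocean import Ocean

ocean = Ocean()
print(ocean.name)
# The dotted path of the class never mentions __init__.
print(Ocean.__module__)
```

The dots in demo_pkg.ocean follow the directory layout demo_pkg/ocean/, in the same way that compass.ocean follows compass/ocean/.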

Modules

Modules are just python files that aren’t scripts. Since you can often treat scripts like modules, even that distinction isn’t exact. But for the purposes of the compass package, every file ending in .py in the compass package is a module. (An __init__.py file is a slightly special case: its contents are what you get when you import the package itself.)

As an example, the compass package contains a module list.py. There’s a function compass.list.list_machines() in that module:

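import os
# assumed source of contents(); the excerpt does not show its imports
from importlib.resources import contents


def list_machines():
    # every config file in the compass.machines package describes a machine
    machine_configs = contents('compass.machines')
    print('Machines:')
    for config in machine_configs:
        if config.endswith('.cfg'):
            print('   {}'.format(os.path.splitext(config)[0]))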

It lists the supported machines. You would import this function just like in the package example above:

from compass.list import list_machines


list_machines()

So a module named foo.py and a package in a directory named foo with an __init__.py file look exactly the same when you import them.

So why choose one over the other?

The main reason to go with a package over a module is if you need to include other files (such as other modules and packages, but also other things like Config Files, namelists and streams files). It’s always pretty easy to make a module into a package (by making a directory with the name of the package, moving the module in, and renaming it __init__.py) or vice versa (by renaming __init__.py to the module name, moving it up a directory, and deleting the subdirectory).
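The conversion from module to package can be tried out in a temporary directory. The sketch below is only an illustration: the module name foo, its greet() function and the file defaults.cfg are all hypothetical. It imports foo as a module, turns it into a package using the steps described above, and imports it again.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
sys.path.insert(0, str(root))

# Start with a plain module, foo.py (a hypothetical name).
(root / 'foo.py').write_text(
    "def greet():\n"
    "    return 'hello from foo'\n"
)
importlib.invalidate_caches()
foo_module = importlib.import_module('foo')
as_module = foo_module.greet()

# Convert it to a package: make a directory named foo, move the module
# in, and rename it __init__.py.
(root / 'foo').mkdir()
shutil.move(str(root / 'foo.py'), str(root / 'foo' / '__init__.py'))

# Now that foo is a directory, it can hold other files too.
(root / 'foo' / 'defaults.cfg').write_text('[foo]\n')

# Forget the cached module so that the import sees the new layout.
del sys.modules['foo']
importlib.invalidate_caches()
foo_package = importlib.import_module('foo')
as_package = foo_package.greet()

# The calling code is the same in both cases.
print(as_module == as_package)
```

Code that imports foo does not need to change when the layout on disk changes, which is why the choice between a module and a package can be revisited later.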

Classes

In the process of developing MPAS-Analysis, we found that many of our developers were not very comfortable with classes, methods, inheritance and other concepts related to object-oriented programming. In MPAS-Analysis, tasks are implemented as classes to make it easier to use python’s multiprocessing capability. In practice, this led to code that was complex enough that only a handful of developers felt comfortable contributing directly to the code.

Based on this experience, we were hesitant to use classes in compass and tried an implementation without them. This led to a clumsy set of functions and python dictionaries that was equally complex but harder to understand and document than classes.

The outcome of this experience is that we have used classes to define MPAS cores, test groups, test cases and steps. Each MPAS core “descends” from the compass.MpasCore base class; each test group descends from compass.TestGroup; each test case descends from compass.TestCase; and each step descends from compass.Step. These base classes contain functionality that can be shared with the “child” classes that descend from them and also define a few “methods” (functions that belong to a class) that the child class is meant to “override” (replace with its own version of the function, or augment by replacing the function and then calling the base class’s version of the same function).
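The difference between replacing and augmenting a method is easier to see in code. The following is a minimal sketch with a hypothetical Step base class; it is not the real compass.Step, and the method names setup() and run() are assumptions made for this illustration.

```python
class Step:
    """Stand-in base class (hypothetical; not the real compass.Step)."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def setup(self):
        # Shared behavior that every child class gets for free.
        self.log.append(f'{self.name}: base setup')

    def run(self):
        # Meant to be overridden by child classes.
        raise NotImplementedError


class InitialState(Step):
    def __init__(self):
        super().__init__(name='initial_state')

    def run(self):
        # Override: replace the base-class version entirely.
        self.log.append(f'{self.name}: creating the initial state')


class Forward(Step):
    def __init__(self):
        super().__init__(name='forward')

    def setup(self):
        # Augment: call the base-class version, then add to it.
        super().setup()
        self.log.append(f'{self.name}: extra setup')

    def run(self):
        self.log.append(f'{self.name}: running the model')


steps = [InitialState(), Forward()]
for step in steps:
    step.setup()
    step.run()
    print(step.log)
```

InitialState inherits setup() unchanged, while Forward augments it. Both children must supply run(), because the base-class version only raises an error.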

We will provide a tutorial on how to add new MPAS cores, test groups, test cases and steps in the near future that will explain the main features of classes that developers need to know about. Until that is available, we hope that the examples currently in the package can provide a starting point.

Code sharing

Very nearly all of the code in Legacy COMPASS was in the form of python scripts. A significant amount of external code was also in this form. A test case was composed of XML files, and python scripts parsed these XML files to produce other python scripts to run the test case. These scripts were dense. The XML files had a unique syntax that made the learning curve for Legacy COMPASS pretty high. Errors in syntax were often hard to understand because the script-generating scripts were difficult to read and understand.

The compass package is also dense and will have a learning curve. We hope the python package approach is worth it because the skills learned to work with it will be more broadly applicable than those required for Legacy COMPASS. In developing compass we endeavor to increase code readability and code sharing in a number of ways.

In compass framework

The compass framework (modules and packages not in the MPAS-core packages) has a lot of code that is shared across existing test cases and could be very useful for future ones.

Most of the framework currently has roughly the same functionality as Legacy COMPASS, but it has been broken into more modules that make it clear what functionality each contains, e.g. compass.namelists and compass.streams are for manipulating namelist and streams files, respectively; compass.io has functionality for downloading files from the LCRC server and creating symlinks; and compass.validation can be used to ensure that variables are bit-for-bit identical between steps or when compared with a baseline, and to compare timers with a baseline. This functionality was all included in 4 very long scripts in Legacy COMPASS.
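As a rough illustration of what “bit-for-bit identical” means, the sketch below compares the raw 64-bit representations of two sequences of floats. The helper bit_for_bit() is hypothetical and uses only the standard library; it is not how compass.validation is implemented.

```python
import struct


def bit_for_bit(a, b):
    """Return True if two float sequences have identical 64-bit patterns.

    Hypothetical helper for illustration; not part of compass.
    """
    if len(a) != len(b):
        return False
    # Pack each sequence as raw doubles and compare the bytes, so that
    # differences in even the last bit are detected.
    return struct.pack(f'{len(a)}d', *a) == struct.pack(f'{len(b)}d', *b)


baseline = [1.0, 2.5, 0.3]
same_run = [1.0, 2.5, 0.3]
# 0.1 + 0.2 differs from 0.3 in the last bits of its representation.
other_run = [1.0, 2.5, 0.1 + 0.2]

print(bit_for_bit(baseline, same_run))
print(bit_for_bit(baseline, other_run))
```

A comparison with a tolerance would accept other_run, which is exactly the kind of small difference a bit-for-bit check is meant to catch.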

One example that doesn’t have a clear analog in Legacy COMPASS is the compass.parallel module. It contains a function compass.parallel.get_available_cores_and_nodes() that can find out the total number of cores and nodes available for running steps.
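For a sense of what such a query involves on a single machine, here is a minimal sketch using only the standard library. The function count_local_cores_and_nodes() is hypothetical, and it makes no attempt to reproduce how compass.parallel handles a cluster with a job scheduler.

```python
import os


def count_local_cores_and_nodes():
    """Count cores on the local machine (hypothetical helper).

    Assumes a single machine with no job scheduler, so the number of
    nodes is always 1.
    """
    # os.cpu_count() can return None if the count cannot be determined.
    cores = os.cpu_count() or 1
    nodes = 1
    return cores, nodes


cores, nodes = count_local_cores_and_nodes()
print(f'{cores} cores on {nodes} node')
```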

Within an MPAS core

Legacy COMPASS shared functionality within an MPAS core by having scripts at the core level that were linked within test cases and that took command-line arguments, which functioned roughly the same way as function arguments. But these scripts were not able to share any code between them unless it came from mpas_tools or another external python package.

An MPAS core in compass could, theoretically, build out functionality as complex as in the MPAS components themselves. This has already been accomplished for several of the idealized test cases included in compass.

The shared functionality in the Ocean core is described in Ocean framework.

Within a test group

So far, the most common type of shared code within a test group is modules defining steps that are used in multiple test cases. For example, the baroclinic_channel configuration uses shared modules to define the initial_state and forward steps of each test case. Configurations also often include namelist and streams files with replacements to use across test cases.

In addition to shared steps, the global_ocean configuration includes some additional shared framework described in Framework.

The shared code in global_ocean has made it easy to define 138 different test cases using the QU240 (or QUwISC240) mesh. This is possible because the same conceptual test (e.g. restart) can be defined:

  • with or without ice-shelf cavities

  • with the WOA23, PHC or EN4 1900 initial conditions

  • with the RK4 or split-explicit time integrators
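The way these options multiply can be sketched with itertools.product. The option labels below are taken from the list above, but the path layout and the name restart_test are hypothetical; this only enumerates the variants of one conceptual test and is not how global_ocean actually builds its test cases.

```python
from itertools import product

# The three independent choices listed above.
with_cavities = [False, True]
initial_conditions = ['WOA23', 'PHC', 'EN4_1900']
time_integrators = ['RK4', 'split_explicit']

variants = []
for cavities, init, integrator in product(with_cavities,
                                          initial_conditions,
                                          time_integrators):
    # The mesh name depends on whether ice-shelf cavities are included.
    mesh = 'QUwISC240' if cavities else 'QU240'
    # Hypothetical path layout, for illustration only.
    variants.append(f'{mesh}/{init}/{integrator}/restart_test')

for variant in variants:
    print(variant)
print(len(variants))
```

Each new option multiplies the number of variants of a single conceptual test, which is why shared code matters so much for keeping this many test cases manageable.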

Within a test case

The main way code is currently reused within a test case is when the same module for a step gets used multiple times within that test case. For example, the rpe_test test case uses the same forward run with 5 different values of the viscosity.
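The pattern looks roughly like the sketch below, where one step class is instantiated once per viscosity. The Forward class here is a hypothetical stand-in, and the viscosity values, step names and the 'viscosity' option key are illustrative assumptions rather than the actual rpe_test settings.

```python
class Forward:
    """Hypothetical stand-in for a shared forward step."""

    def __init__(self, name, nu):
        self.name = name
        self.nu = nu

    def options(self):
        # Options this step would pass on to the model (illustrative key).
        return {'viscosity': str(self.nu)}


# Illustrative values only; five of them, as in the text above.
viscosities = [1, 5, 10, 100, 200]

steps = {}
for index, nu in enumerate(viscosities, start=1):
    # The same class is reused; only the name and viscosity differ.
    name = f'rpe_test_{index}_nu_{nu}'
    steps[name] = Forward(name=name, nu=nu)

for name, step in steps.items():
    print(name, step.options())
```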