Containers are isolated software environments containing all dependencies and tools required to run a particular piece of software. `Fre/2025.01` offers tools for creating containers and for compiling and running models within them. In addition, users can use a separate container to post-process model output with `Fre/2025.01`, and MED Division provides a Dockerfile that allows users to conduct containerized runs outside of the `Fre` workflow altogether. Note that MED does not yet support full retrospective runs outside of the `Fre` workflow, but future updates to this repo will add helper scripts to support this option. (WIP: push the Dockerfile and related tools to GitHub.)
<!-- TODO: update the above to reflect the current status of non-fre runs -->
# NWA Retrospective Run and Post Processing with Containers in `Fre`
The latest version of `Fre` contains tools for building and running models in containers. Note that at the time of writing, `Fre/2025.01` is under active development and has not yet been officially released, so the instructions below are subject to change.
## Building and Running the Model
The following sections assume that:

1. You have access to Gaea or another system with access to the `Fre` workflow.
2. You have access to both `podman` and `apptainer/singularity` on your system. On Gaea, you must open a help desk ticket and be added to the list of users who can use `podman` before the `podman` command becomes available to you.

If you are running this workflow outside of Gaea, note that the container images, `.tar` files, and `.sif` files produced by `podman` and `apptainer` can be quite large: approximately 25 GB each.
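The assumptions above can be checked before starting. The following is a minimal sketch and not part of `Fre`: the helper names `have_tool` and `free_gb` are hypothetical, and the 50 GB threshold is only an illustrative margin for one `.tar` file plus one `.sif` file at roughly 25 GB each.

```shell
#!/bin/sh
# Pre-flight sketch: helper names and the 50 GB margin are illustrative assumptions.

# Succeeds if the named command can be found in the current environment.
have_tool() {
    command -v "$1" > /dev/null 2>&1
}

# Prints the free space, in whole GB, of the filesystem holding the given directory.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Report which container tools are visible; apptainer and singularity are alternatives.
for tool in podman apptainer singularity; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# One .tar plus one .sif at roughly 25 GB each.
needed_gb=50
available_gb=$(free_gb .)
if [ "$available_gb" -lt "$needed_gb" ]; then
    echo "warning: only ${available_gb} GB free here; about ${needed_gb} GB may be needed"
fi
```

Run it from the directory where you plan to build the container, since the free-space check applies to the current directory.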
### Setup Environment
First, clone this repository to get access to the prewritten YAML and XML files:

```
git clone https://github.com/NOAA-GFDL/CEFI-regional-MOM6.git
```
Once the repo is cloned, navigate to the yamls directory:

```
cd CEFI-regional-MOM6/yamls/NWA12
```
Load the relevant modules to set up your environment correctly:

```
module unuse /ncrc/home2/fms/local/modulefiles
module use /ncrc/home2/fms/local/modulefiles_test
module load fre/2025.01
```

You should now have access to the `fre` command, as well as the `fre make` subcommand. If both commands work, you can begin compiling the model.
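One way to confirm that both commands work is sketched below. It assumes that `fre` and `fre make` accept a `--help` flag and exit successfully when given it; the function name `check_fre` is hypothetical and not part of `Fre`.

```shell
#!/bin/sh
# Sketch only: assumes `fre --help` and `fre make --help` exit with status 0.

# Returns 0 only if `fre` can be found and both help invocations succeed.
check_fre() {
    command -v fre > /dev/null 2>&1 || {
        echo "fre not found; re-check the module commands" >&2
        return 1
    }
    fre --help > /dev/null 2>&1 || {
        echo "fre --help failed" >&2
        return 1
    }
    fre make --help > /dev/null 2>&1 || {
        echo "fre make --help failed" >&2
        return 1
    }
    echo "fre and fre make are available"
}

check_fre || echo "environment is not ready yet"
```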
### Creating Container and Compiling Model
Begin by running the following command to create a checkout script. This is the script that will `git clone` the model components into the container so they can be compiled:

```
fre make create-checkout -y CEFI_NWA12_cobalt.yaml -p hpcme.2023 -t prod -npc
```
If this step completes successfully, you should see a new folder named `/tmp/hpcme.2023` containing the checkout script. Next, you will need to create a `Makefile` to compile the model inside the container, as well as a `Dockerfile` to create the container itself:

```
fre make create-makefile -y CEFI_NWA12_cobalt.yaml -p hpcme.2023 -t prod
fre make create-dockerfile -y CEFI_NWA12_cobalt.yaml -p hpcme.2023 -t prod
```
If these steps are successful, you should now see a `Makefile` and an `execrunscript.sh` in the `/tmp/hpcme.2023` folder, as well as a `Dockerfile` and a `createContainer.sh` script in your current directory. The `Dockerfile` contains instructions for creating the container and then compiling the model within it, while the `createContainer.sh` script contains commands that automate building the container and saving it to both a `tar` file and a Singularity image file (`.sif`) that can be used to run the container and model. Simply run the script to begin this process:

```
./createContainer.sh
```

Note that you can have the script run automatically by adding the `--execute` flag to the `create-dockerfile` command above.
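Before starting a long container build, it can help to confirm that the files described above are actually in place. The sketch below is not part of `Fre`: the helper name `report_missing` is hypothetical, and the locations are taken from the description above, so adjust `build_dir` if your files land elsewhere.

```shell
#!/bin/sh
# Sketch only: checks for the files the previous steps are described as producing.

# Prints each expected file that is absent from the given directory.
# Returns 0 when nothing is missing, 1 otherwise.
report_missing() {
    dir=$1
    shift
    rc=0
    for name in "$@"; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name"
            rc=1
        fi
    done
    return $rc
}

# Locations as described in the text; adjust if your files land elsewhere.
build_dir=/tmp/hpcme.2023

report_missing "$build_dir" Makefile execrunscript.sh \
    || echo "some build files are missing; re-check the previous steps"
report_missing . Dockerfile createContainer.sh \
    || echo "some container files are missing; re-check the previous steps"
```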
### Running the Model
Once the container has been built (this may take some time, depending on the availability of resources), you should have `.tar` and `.sif` files named `mom6_sis2_generic_4p_compile_symm_yaml-prod`. The `.sif` file contains the `execrunscript.sh` created in the previous step, meaning it can now be used just like a model executable. To conduct the full 27-year retrospective run in the NWA domain using `fre`, first run:

```
module unload fre/2025.01
module load fre/bronx-23
```
This replaces the latest version of `fre` with a version that can perform container runs. Next, navigate to the xmls directory:

```
cd ../../xmls/NWA12
```
To run the containerized model using `fre`, make the following changes to `CEFI_NWA12_cobalt.xml`:

1. Change the `fre_version` variable from `bronx-22` to `bronx-23`.
2. Right below the line defining the `CEFI_NWA12_COBALT_V1` experiment, add a line pointing to the container, using the path to your own `.sif` file:

```
<experiment name="CEFI_NWA12_COBALT_V1" inherit="MOM6_SIS2_GENERIC_4P_compile_symm">
  <container file="/gpfs/f5/cefi/scratch/Utheri.Wagura/CEFI-regional-MOM6/yamls/NWA12/mom6_sis2_generic_4p_compile_symm_yaml-prod.sif"/>
  <description>
```
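The two XML changes above can be prepared from the command line. This is a minimal sketch under stated assumptions: the helper names `container_tag` and `bump_fre_version` are hypothetical, the relative path assumes the container was built in `yamls/NWA12` and that you are now in `xmls/NWA12`, and the version helper assumes that every literal `bronx-22` in the file should become `bronx-23`.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical and not part of fre.

# Prints the <container> line for the given .sif file, using its absolute path.
container_tag() {
    sif_dir=$(cd "$(dirname "$1")" && pwd) || return 1
    printf '<container file="%s/%s"/>\n' "$sif_dir" "$(basename "$1")"
}

# Prints the contents of a file with every literal bronx-22 replaced by bronx-23.
# The original file is left untouched so the result can be reviewed first.
bump_fre_version() {
    sed 's/bronx-22/bronx-23/g' "$1"
}

# Assumed location of the container relative to the xmls/NWA12 directory.
sif=../../yamls/NWA12/mom6_sis2_generic_4p_compile_symm_yaml-prod.sif
if [ -f "$sif" ]; then
    container_tag "$sif"
else
    echo "no container found at $sif"
fi
```

For example, `bump_fre_version CEFI_NWA12_cobalt.xml | diff CEFI_NWA12_cobalt.xml -` shows which lines would change before you edit the file by hand.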
Finally, run `frerun` with the `--container` flag to run the experiment:

```
frerun -x CEFI_NWA12_cobalt.xml -p ncrc6.intel23 -t prod CEFI_NWA12_COBALT_V1 --container
```

If you are running on c5, be sure to change the platform to `ncrc5.intel23`.
## Postprocessing Model Output

# NWA Retrospective Run Outside of `Fre`