Articles by Luojie D

Here's how to derive new experiment group(s) based on a complete control group experiment in MPAS.

  1. Even before the control group experiments, prepare a directory with a skeleton model. This would typically include:

    • The model
    • WRF physics
    • Grid file and static file.
    • Possibly scripts for conveniently submitting as a job
    • You may want to delete the namelists, but keep the streams files and the stream_list files.
  2. Copy the skeleton directory over. It’s probably a good idea to use cp -L (dereference the links), so the actual model binaries get copied rather than linked; this should be faster if you’re running multiple experiments at the same time. (*not benchmarked)
  3. Copy over the namelist of the control group. Make appropriate edits.
  4. Link/Copy over the init of the control group.
  5. Link/Copy over the sfc_update of the control group
  6. Link/Copy over the LBCs of the control group.
Basically, steps 4–6 are what you would generate with init_atmosphere modes 7, 8 and 9. You don't need to run WPS on the downloaded data again.
  7. Copy any other input streams (e.g. FDDA).
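The copy/link steps above can be sketched roughly as follows. This is only an illustration with made-up placeholder paths and files (skeleton, exp_control, exp_new, atmosphere_model, init.nc are all hypothetical); adapt it to your own layout.

```shell
# Sketch only: every path and filename here is a hypothetical placeholder.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# A skeleton with a symlinked model binary, plus a finished control run.
mkdir -p skeleton exp_control
echo "binary" > skeleton/atmosphere_model.real
ln -s atmosphere_model.real skeleton/atmosphere_model
echo "init" > exp_control/init.nc

# Step 2: cp -L dereferences symlinks, so the new experiment
# gets its own private copy of the model binary.
cp -rL skeleton exp_new

# Steps 4-6: link the big input files from the control run.
ln -s "$workdir/exp_control/init.nc" exp_new/init.nc

# The copied model is now a regular file, not a symlink.
[ -f exp_new/atmosphere_model ] && [ ! -L exp_new/atmosphere_model ] && echo "dereferenced"
```

The same `ln -s` pattern applies to the sfc_update and LBC files; only the namelist is a true copy, since it will be edited per experiment.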

The basic idea for calculating hourly total precipitation from MPAS is: compute the differences of the accumulated rainc and rainnc fields between consecutive outputs, sum them up, and interpolate the result onto a lon-lat grid. (Assuming the diag files are output hourly.)
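As a toy illustration of that idea (with made-up numbers): rainc and rainnc are accumulated since model start, so differencing consecutive outputs and summing the two fields yields the hourly total.

```python
# Toy example: rainc/rainnc are cumulative totals, so the hourly
# precipitation is the difference between consecutive outputs.
rainc = [0.0, 1.2, 1.5, 4.0]   # made-up cumulative convective rain (mm)
rainnc = [0.0, 0.3, 0.9, 1.1]  # made-up cumulative grid-scale rain (mm)

hourly_tp = [
    round((rainc[i + 1] - rainc[i]) + (rainnc[i + 1] - rainnc[i]), 6)
    for i in range(len(rainc) - 1)
]
print(hourly_tp)  # [1.5, 0.9, 2.7]
```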

3C's: Calculate diffs, Concatenate and Convert

1. First, we do the diffs using cdo.

Parallel running is supported using Python's subprocess together with concurrent.futures.

In <project>/public/precip/call_precip.py:

#!/usr/bin/env python3

import sys
from datetime import datetime, timedelta
import subprocess
from concurrent.futures import ProcessPoolExecutor, as_completed
import os

def run_cdo_task(t1, t2):
    file1 = f"diag.{t1.strftime('%Y-%m-%d_%H')}.00.00.nc"
    file2 = f"diag.{t2.strftime('%Y-%m-%d_%H')}.00.00.nc"
    outfile = f"diff_{t2.strftime('%Y-%m-%d_%H')}.nc"

    print(f"[{os.getpid()}] Processing {file1} -> {file2} => {outfile}")
    cmd = [
        "cdo", "-b", "F32", "-selname,rainc,rainnc",
        "-sub", file2, file1, outfile
    ]

    try:
        subprocess.run(cmd, check=True)
        return f"✓ {outfile} done"
    except subprocess.CalledProcessError as e:
        return f"✗ Failed: {file1} / {file2} ({e})"

def generate_task_times(start_str, end_str):
    start = datetime.strptime(start_str, "%Y-%m-%d_%H")
    end = datetime.strptime(end_str, "%Y-%m-%d_%H")

    tasks = []
    current = start
    while current + timedelta(hours=1) <= end:
        tasks.append((current, current + timedelta(hours=1)))
        current += timedelta(hours=1)
    return tasks

def main(start_str, end_str, max_workers=None):
    tasks = generate_task_times(start_str, end_str)

    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        future_to_time = {executor.submit(run_cdo_task, t1, t2): (t1, t2) for t1, t2 in tasks}
        for future in as_completed(future_to_time):
            print(future.result())

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: ./call_precip.py YYYY-MM-DD_HH YYYY-MM-DD_HH [max_workers]")
        sys.exit(1)

    start_time = sys.argv[1]
    end_time = sys.argv[2]
    max_workers = int(sys.argv[3]) if len(sys.argv) > 3 else None

    main(start_time, end_time, max_workers)
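As a sanity check on the windowing: generate_task_times pairs each hour with the next one, so a three-hour range yields three (t1, t2) windows. A minimal re-sketch of that logic:

```python
from datetime import datetime, timedelta

def hourly_windows(start_str, end_str, fmt="%Y-%m-%d_%H"):
    """Re-sketch of generate_task_times: consecutive (t, t+1h) pairs."""
    start = datetime.strptime(start_str, fmt)
    end = datetime.strptime(end_str, fmt)
    out, cur = [], start
    while cur + timedelta(hours=1) <= end:
        out.append((cur, cur + timedelta(hours=1)))
        cur += timedelta(hours=1)
    return out

pairs = hourly_windows("2000-01-01_00", "2000-01-01_03")
print(len(pairs))  # 3
# Each window's end time names the diff file, e.g. diff_2000-01-01_01.nc
print(pairs[0][1].strftime("%Y-%m-%d_%H"))  # 2000-01-01_01
```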

After this, we also need to

2. Concatenate them using the ncrcat tool from NCO.

I've prepared a script that finds all the diff files, checks for missing hours, and concatenates them.

In <project>/public/precip/concat_precip_v2.py:

#!/usr/bin/env python3
import os
import re
import argparse
from datetime import datetime, timedelta
import subprocess

parser = argparse.ArgumentParser(description="Concatenate NetCDF files with 1-hour intervals using ncrcat.")
parser.add_argument("output", help="Name of the output NetCDF file")
args = parser.parse_args()
output_file = args.output

pattern = re.compile(r"diff_(\d{4}-\d{2}-\d{2})_(\d{2})\.nc") # matches files like diff_2000-01-01_00.nc

files = [f for f in os.listdir('.') if pattern.match(f)]

file_datetimes = []
file_map = {}
for f in files:
    match = pattern.match(f)
    if match:
        date_str, hour_str = match.groups()
        dt = datetime.strptime(f"{date_str} {hour_str}", "%Y-%m-%d %H")
        file_datetimes.append(dt)
        file_map[dt] = f

if not file_datetimes:
    print("❌ No matching files found.")
    raise SystemExit(1)

file_datetimes.sort()
missing = []
for i in range(1, len(file_datetimes)):
    expected = file_datetimes[i - 1] + timedelta(hours=1)
    if file_datetimes[i] != expected:
        dt = expected
        while dt < file_datetimes[i]:
            missing.append(dt)
            dt += timedelta(hours=1)

if missing:
    print("⚠️ Warning: Missing files for the following datetimes:")
    for m in missing:
        print("  ", m.strftime("%Y-%m-%d %H:%M"))
else:
    print("✅ All hours are continuous.")

input_files = [file_map[dt] for dt in file_datetimes]
try:
    subprocess.run(["ncrcat", "-O"] + input_files + [output_file], check=True)
except subprocess.CalledProcessError:
    print("❌ ncrcat failed. Make sure NCO is installed and files are valid.")
    raise SystemExit(2)

# Report
print(f"\nStart: {file_datetimes[0]}")
print(f"End:   {file_datetimes[-1]}")
print(f"Output file written: {output_file}")
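The gap check in the script boils down to walking the sorted timestamps and collecting every skipped hour between neighbours. A standalone sketch with a deliberate two-hour hole:

```python
from datetime import datetime, timedelta

# Made-up timestamps with a deliberate hole after 01:00.
have = [datetime(2000, 1, 1, h) for h in (0, 1, 4, 5)]

missing = []
for prev, cur in zip(have, have[1:]):
    t = prev + timedelta(hours=1)
    while t < cur:  # every skipped hour between neighbours
        missing.append(t)
        t += timedelta(hours=1)

print([m.hour for m in missing])  # [2, 3]
```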
The job scripts for steps 1 and 2, <project>/all_calc_precip.sh and <project>/all_concat_precip.sh, are generated using <notebooks>/.../figs/precip/Batch calc precip.ipynb. They are trivial and contain experiment information, so they are not disclosed.

3. convert_mpas

After Calculating the diffs and Concatenating them, step 3 is to Convert onto a lon-lat grid. Use the script <project>/precip/convert-latlon.sh:

#!/bin/bash
# Note: not to be submitted. Load the modules first and run it interactively.

for file in *precip.nc; do
    # Skip if no .nc file exists
    [ -e "$file" ] || continue

    echo "Processing $file..."

    convert_mpas ../EXAMPLE.static.nc  "$file"

    # Check if latlon.nc was created
    if [ -f latlon.nc ]; then
        # Extract base name without extension
        base="${file%.nc}"
        # Rename latlon.nc to something like originalfilename_latlon.nc
        mv latlon.nc "${base}_latlon.nc"
        echo "Output saved as ${base}_latlon.nc"
    else
        echo "latlon.nc not found for $file"
    fi
done
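The rename relies on Bash suffix stripping, `${file%.nc}`, which removes the shortest trailing match of `.nc`. A quick illustration:

```shell
# Bash suffix stripping, as used for the rename above.
file="exp1_precip.nc"
base="${file%.nc}"          # removes the trailing ".nc"
echo "${base}_latlon.nc"    # exp1_precip_latlon.nc
```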

Since convert_mpas is called, don't forget to put copies of include_fields and target_domain in the same directory if you need them.

Now you should have the hourly total precipitation (tp) data on a lon-lat grid.