The basic idea for deriving hourly total precipitation from MPAS output is simple: take the difference of the accumulated rainc and rainnc fields between consecutive output times, sum the two components, and interpolate the result onto a lon-lat grid. (This assumes the diag files are written hourly.)
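
Concretely, for each output hour t the hourly total precipitation is just the change in the accumulated fields over the preceding hour:

tp(t) = [rainc(t) + rainnc(t)] - [rainc(t-1) + rainnc(t-1)]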

3C's: Calculate diffs, Concatenate and Convert

1. First, we do the diffs using cdo.

The script runs the cdo calls in parallel, using Python's subprocess together with concurrent.futures.

In <project>/public/precip/call_precip.py:

#!/usr/bin/env python3

import sys
from datetime import datetime, timedelta
import subprocess
from concurrent.futures import ProcessPoolExecutor, as_completed
import os

def run_cdo_task(t1, t2):
    """Difference two consecutive hourly diag files for the accumulated rain fields."""
    file1 = f"diag.{t1.strftime('%Y-%m-%d_%H')}.00.00.nc"
    file2 = f"diag.{t2.strftime('%Y-%m-%d_%H')}.00.00.nc"
    outfile = f"diff_{t2.strftime('%Y-%m-%d_%H')}.nc"

    print(f"[{os.getpid()}] Processing {file1} -> {file2} => {outfile}")
    # file2 - file1, keeping only rainc and rainnc, written as 32-bit floats
    cmd = [
        "cdo", "-b", "F32", "-selname,rainc,rainnc",
        "-sub", file2, file1, outfile
    ]

    try:
        subprocess.run(cmd, check=True)
        return f"✓ {outfile} done"
    except subprocess.CalledProcessError as e:
        return f"✗ Failed: {file1} / {file2} ({e})"

def generate_task_times(start_str, end_str):
    start = datetime.strptime(start_str, "%Y-%m-%d_%H")
    end = datetime.strptime(end_str, "%Y-%m-%d_%H")

    tasks = []
    current = start
    while current + timedelta(hours=1) <= end:
        tasks.append((current, current + timedelta(hours=1)))
        current += timedelta(hours=1)
    return tasks

def main(start_str, end_str, max_workers=None):
    tasks = generate_task_times(start_str, end_str)

    with ProcessPoolExecutor(max_workers=max_workers) as executor:
        future_to_time = {executor.submit(run_cdo_task, t1, t2): (t1, t2) for t1, t2 in tasks}
        for future in as_completed(future_to_time):
            print(future.result())

if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: ./calc_precip.py YYYY-MM-DD_HH YYYY-MM-DD_HH [max_workers]")
        sys.exit(1)

    start_time = sys.argv[1]
    end_time = sys.argv[2]
    max_workers = int(sys.argv[3]) if len(sys.argv) > 3 else None

    main(start_time, end_time, max_workers)
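
For example, to process one day of hourly diag files with eight workers (the dates and worker count here are only illustrative):

./call_precip.py 2000-01-01_00 2000-01-02_00 8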

After this, we also need to

2. Concatenate the diff files using the ncrcat tool from NCO.
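
Since the diff file names are zero-padded by date and hour, the manual equivalent is a single ncrcat call over the shell-sorted glob (the output name is just a placeholder):

ncrcat -O diff_*.nc EXAMPLE_precip.nc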

I've prepared a script that finds all the diff files, warns about any missing hours, and concatenates them.

In <project>/public/precip/concat_precip_v2.py:

#!/usr/bin/env python3
import os
import re
import sys
import argparse
from datetime import datetime, timedelta
import subprocess

parser = argparse.ArgumentParser(description="Concatenate NetCDF files with 1-hour intervals using ncrcat.")
parser.add_argument("output", help="Name of the output NetCDF file")
args = parser.parse_args()
output_file = args.output

pattern = re.compile(r"diff_(\d{4}-\d{2}-\d{2})_(\d{2})\.nc") # matches files like diff_2000-01-01_00.nc

files = [f for f in os.listdir('.') if pattern.match(f)]

file_datetimes = []
file_map = {}
for f in files:
    match = pattern.match(f)
    if match:
        date_str, hour_str = match.groups()
        dt = datetime.strptime(f"{date_str} {hour_str}", "%Y-%m-%d %H")
        file_datetimes.append(dt)
        file_map[dt] = f

if not file_datetimes:
    print("❌ No matching files found.")
    sys.exit(1)

file_datetimes.sort()
missing = []
for i in range(1, len(file_datetimes)):
    expected = file_datetimes[i - 1] + timedelta(hours=1)
    if file_datetimes[i] != expected:
        dt = expected
        while dt < file_datetimes[i]:
            missing.append(dt)
            dt += timedelta(hours=1)

if missing:
    print("⚠️ Warning: Missing files for the following datetimes:")
    for m in missing:
        print("  ", m.strftime("%Y-%m-%d %H:%M"))
else:
    print("✅ All hours are continuous.")

input_files = [file_map[dt] for dt in file_datetimes]
try:
    subprocess.run(["ncrcat", "-O"] + input_files + [output_file], check=True)
except subprocess.CalledProcessError:
    print("❌ ncrcat failed. Make sure NCO is installed and files are valid.")
    sys.exit(2)

# Report
print(f"\nStart: {file_datetimes[0]}")
print(f"End:   {file_datetimes[-1]}")
print(f"Output file written: {output_file}")

The job scripts for steps 1 and 2, <project>/all_calc_precip.sh and <project>/all_concat_precip.sh, are generated with <notebooks>/.../figs/precip/Batch calc precip.ipynb. They are trivial and contain experiment-specific information, so they are not included here.

3. convert_mpas

After Calculating the diffs and Concatenating the results, step 3 is to Convert them onto a lon-lat grid. Use the script <project>/precip/convert-latlon.sh:

#!/bin/bash
# Note: not to be submitted. Load the modules first and run it interactively.

for file in *precip.nc; do
    # Skip if the glob matched nothing (i.e. "$file" is the literal pattern)
    [ -e "$file" ] || continue

    echo "Processing $file..."

    convert_mpas ../EXAMPLE.static.nc  "$file"

    # Check if latlon.nc was created
    if [ -f latlon.nc ]; then
        # Extract base name without extension
        base="${file%.nc}"
        # Rename latlon.nc to something like originalfilename_latlon.nc
        mv latlon.nc "${base}_latlon.nc"
        echo "Output saved as ${base}_latlon.nc"
    else
        echo "latlon.nc not found for $file"
    fi
done

Since convert_mpas is called here, don't forget to put copies of include_fields and target_domain in the same directory if you need them; a minimal sketch is given below.
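
For this workflow the only fields that matter are the two rain components, so include_fields can be as short as this (one field name per line):

rainc
rainnc

and target_domain just describes the output grid, e.g. (resolution purely illustrative; check the convert_mpas README for the exact set of supported keys):

nlat = 360
nlon = 720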

Now you should have hourly lat-lon files containing the rainc and rainnc increments; their sum is the hourly total precipitation (tp).
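
If a single tp variable is more convenient, the two components can be summed in one extra pass, for example with cdo's expr operator (file names are placeholders):

cdo expr,'tp=rainc+rainnc' EXAMPLE_precip_latlon.nc EXAMPLE_tp_latlon.nc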

Tag(s): MPAS, Precipitation

Derive hourly total precipitation from MPAS output © 2025 by Luojie's Stack Trace Notes is licensed under CC BY-NC-ND 4.0. This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
