An ungrib.exe segmentation fault fix: WPS for MPAS IC
Problem
Previously, to use WPS on NSCC to prepare the initial conditions (IC) for MPAS, I would run:
module load wps
module load wrf
cp -r /app/apps/wps <RELEVANT_PATH>
Today, on a new account on the NSCC system, using the commands above, WPS's ungrib.exe didn’t work on the first try and returned an error:
Name of source model =>ECMWF
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
ungrib.exe 000000000047B4CA for__signal_handl Unknown Unknown
libpthread-2.28.s 00007F0BFCDC8B20 Unknown Unknown Unknown
ungrib.exe 0000000000420BA8 Unknown Unknown Unknown
ungrib.exe 0000000000413103 Unknown Unknown Unknown
ungrib.exe 000000000040BEA2 Unknown Unknown Unknown
libc-2.28.so 00007F0BFC692493 __libc_start_main Unknown Unknown
ungrib.exe 000000000040BDAE Unknown Unknown Unknown
Checking the directory showed that all the PFILEs had been generated, so ungrib had gone through the GRIB files at least once. The failure therefore seems to be in the later 'integration' part.
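A quick way to see how far ungrib got is to count its output files in the run directory. The sketch below is only illustrative: the helper name count_prefix is hypothetical, and it assumes the default output prefix FILE in namelist.wps, with the first-pass files named PFILE:<date> and the final intermediate files named FILE:<date>.

```shell
# count_prefix DIR PREFIX
# Print how many files in DIR have names starting with "PREFIX:".
# (Hypothetical helper; assumes ungrib's default "FILE" prefix.)
count_prefix() {
    dir=$1
    prefix=$2
    n=0
    for f in "$dir"/"$prefix":*; do
        # With no match the glob stays literal, so check the file exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example usage from the WPS run directory:
#   count_prefix . PFILE   # first-pass files
#   count_prefix . FILE    # final intermediate files
```

If the PFILE count matches the number of expected time steps but there are few or no FILE outputs, that would be consistent with a crash after the first pass over the GRIB data.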
Solution
The solution is to:
- Load wps only (version 3.9.1-b2), and also load mkl/2024.0.
The system used to prompt the user to load mkl, but it no longer does, so I tried loading it manually.
module load wps
# module load wrf
module load mkl/2024.0
- Add some common memory hacks (for the new environment)
Because the new account had never run MPAS/WRF before, I also tried adding these lines to .bashrc:
ulimit -s unlimited # help prevent stack overflow
export MALLOC_CHECK_=0
After applying the above changes, ungrib.exe ran successfully.
It is still unclear which modification was strictly necessary; further testing would be required to isolate the exact cause.
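One way to narrow this down would be to rerun ungrib.exe with each memory setting toggled on its own, inside a subshell so that nothing leaks into the login shell. This is a minimal sketch, not something I have run on NSCC: the helper name run_case and the run_<label>.log file names are hypothetical, and the module choice (with or without mkl/2024.0) would still have to be varied separately before each batch.

```shell
# run_case LABEL STACK MALLOC_CHECK CMD [ARGS...]
#   STACK:        "unlimited" to raise the stack limit, "" to keep the current one
#   MALLOC_CHECK: value for MALLOC_CHECK_, "" to leave it unset
# (Hypothetical helper; writes the command's output to run_LABEL.log.)
run_case() {
    label=$1
    stack=$2
    malloc_check=$3
    shift 3
    status=0
    (
        # The subshell keeps these settings from leaking into the caller.
        if [ "$stack" = "unlimited" ]; then
            ulimit -s unlimited || echo "warning: could not raise stack limit" >&2
        fi
        if [ -n "$malloc_check" ]; then
            export MALLOC_CHECK_="$malloc_check"
        else
            unset MALLOC_CHECK_
        fi
        "$@" > "run_${label}.log" 2>&1
    ) || status=$?
    echo "$label exit=$status"
}

# Example usage from the WPS run directory:
#   run_case baseline    ""        "" ./ungrib.exe
#   run_case stack_only  unlimited "" ./ungrib.exe
#   run_case malloc_only ""        0  ./ungrib.exe
#   run_case both        unlimited 0  ./ungrib.exe
```

Comparing the exit codes and logs across the four cases, once with mkl loaded and once without, should show which change ungrib.exe actually depends on.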
Tags
WPS, ungrib, SIGSEGV, segmentation fault, MPAS IC