[AMD Official Use Only]
Dear Dr. Wallcraft,
I am looking into benchmarking HYCOM on our latest EPYC CPUs and refreshing our white paper.
We last benchmarked HYCOM in mid-2019. Is HYCOM still under active development?
Looking at the GitHub repository, it seems the latest version dates back at least a year.
Is the HYCOM GitHub page still the best place to find the latest version (HYCOM.org · GitHub)?
Sincerely,
Alvaro Fernández, Ph.D.
Senior Member of Technical Staff | AMD
HPC Applications Engineering
 
Hi Alan,
Thanks for sending me the data; I've finally gotten back to this.
I'm starting off by re-running GLBT0.08 before moving to your new dataset, both as a sanity check and as a way to measure generational uplift.
I’m getting the error below when running with 123 MPI ranks, no OpenMP:
…
input: nreg = 3
timer statistics, processor 1 out of 123
-----------------------------------------------
xc**** calls = 1 time = 0.00000 time/call = 0.00000000
total calls = 1 time = 0.01427 time/call = 0.01427158
processor 1: memory (words) now,high = 0 0
processor 1: memory (GB) now,high = 0.000 0.000
processor 1: eq. 3-D arrays now,high = 0.000 0.000
**************************************************
xcspmd: patch.input for wrong nreg
**************************************************
mpi_finalize called on processor 1
mpi_finalize called on processor 49
I am running 123 ranks and the patch file appears to be the right one.
npes npe mpe idm jdm ibig jbig nreg minsea maxsea avesea
123 12 12 4500 3298 375 275 3 0 103125 72798
I don’t recall having to worry about the nreg parameter at all. Any ideas what this might be about?
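For reference, here is how I'm reading nreg out of the patch file. A quick sketch, and an assumption on my part: that the value row sits on the second line of patch.input, directly under the header shown above, with nreg as the eighth column.

```shell
# Hypothetical check: print nreg from patch.input, assuming the values
# are on line 2 under the "npes npe mpe ... nreg ..." header (8th field).
awk 'NR==2 {print "nreg in patch file:", $8}' patch.input
```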
Hi Alan,
It’s possible the data tarball was corrupted when I downloaded it.
I've successfully compiled HYCOM with the instructions below and the AMD compiler, but the GLBT04 tarball fails halfway through decompression.
Are there checksums you can share for these tarballs?
The download link has of course expired, so if the tarball is corrupt I may have to pester you for another link – sorry…
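In the meantime, this is roughly how I'd verify the next download; a sketch only, since I'm assuming the archive name (GLBT04.tar.gz) and a gzip-compressed tarball:

```shell
# Print the checksum to compare against whatever value you can publish
# (GLBT04.tar.gz is an assumed filename):
sha256sum GLBT04.tar.gz
# Quick local integrity check of the gzip stream, independent of a checksum:
gzip -t GLBT04.tar.gz && echo "gzip stream OK" || echo "tarball corrupt"
```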
Good morning Alan,
Single-node runs are in progress on two separate nodes, currently at 1060 and 1200 timesteps.
Can you confirm the nominal memory footprint for HYCOM running this workload, as well as the nominal I/O expected?
We appear to be using ~1 TB of RAM on each node, and writing ~3.54 GiB every iteration. See below for details.
Details
[alvaro@pluto31 ~]$ ps aux | head -1; ps aux | sort -rnk 4 | head -127
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
alvaro 73955 99.7 0.8 149599944 9047248 ? Rl Nov10 1616:58 ./hycom
alvaro 73954 99.6 0.8 149507356 9239548 ? Rl Nov10 1616:12 ./hycom
alvaro 73953 99.6 0.8 149167828 9051004 ? Rl Nov10 1616:13 ./hycom
…
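The ~1 TB figure comes from summing resident memory across the ranks; a rough sketch, assuming RSS is column 6 of `ps aux` (in kB):

```shell
# Sum RSS (kB) across all hycom processes and report GiB.
# The [h]ycom pattern keeps the pipeline from matching itself.
ps aux | awk '/[h]ycom/ {sum += $6} END {printf "total RSS: %.1f GiB\n", sum/1048576}'
```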
I/O is directed to local NVMe drives in this cluster, not to a parallel filesystem.
Comparing iostat output between two iterations, the kB written is the largest change.
Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1p1 4.01 344.17 3618.79 59091541 621320860
Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1p1 4.04 343.55 3633.82 59091661 625030952
Subtracting the kB written from one iteration to the next:
625030952 kB wrtn
-621320860 kB wrtn
===========
3,710,092 kB ≈ 3.54 GiB written every time step
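The subtraction and unit conversion above, spelled out as a one-liner for reproducibility (kB_wrtn counters, second sample minus first):

```shell
# Per-timestep write volume from the two iostat kB_wrtn samples,
# converted kB -> GiB (divide by 1024^2):
awk 'BEGIN {
  kb = 625030952 - 621320860
  printf "%d kB ~ %.2f GiB per timestep\n", kb, kb / 1048576
}'
```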