2012-2022 time series


Jonathan Sherman - NOAA Affiliate

Jan 19, 2024, 6:02:27 PM
to HYCOM.org Forum
Hi all,
I wanted to ask this question as a "sanity check" before I download 10 years of data, particularly since HYCOM is not my area of expertise.
I am working on evaluating net primary productivity models from remote sensing on a global scale. Some of the models use the mixed layer depth (MLD) as an input. So far I have been using the temperature and salinity data from GLBu0.08 expt 91.1 (data from 2015) to calculate the MLD using various temperature/density thresholds, and then regridding to the same grid as the satellite data (9 km).
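
For readers unfamiliar with the threshold approach, here is a minimal sketch for a single profile. It assumes a temperature criterion of 0.2 degC below the value at a 10 m reference depth, which is a common choice and not necessarily the one used in this evaluation. The function name and defaults are illustrative.

```python
import numpy as np

def mld_temperature_threshold(depth, temp, ref_depth=10.0, delta_t=0.2):
    """Depth (m) where temp first falls delta_t below its value at ref_depth.

    depth: 1-D positive depths in metres, increasing downward.
    temp:  1-D temperatures (degC) at those depths.
    Returns np.nan if the threshold is never crossed.
    """
    depth = np.asarray(depth, dtype=float)
    temp = np.asarray(temp, dtype=float)

    # Reference temperature, linearly interpolated to ref_depth.
    t_ref = np.interp(ref_depth, depth, temp)
    target = t_ref - delta_t

    # First level below the reference depth that is at or under the target.
    below = np.where((depth > ref_depth) & (temp <= target))[0]
    if below.size == 0:
        return np.nan
    i = below[0]

    # Interpolate linearly between the level above and the crossing level.
    d0, t0 = depth[i - 1], temp[i - 1]
    if d0 < ref_depth:
        d0, t0 = ref_depth, t_ref
    d1, t1 = depth[i], temp[i]
    return d0 + (t0 - target) * (d1 - d0) / (t0 - t1)
```

A density-threshold version would follow the same pattern, with potential density computed from temperature and salinity in place of `temp` and the inequality reversed.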

I am now working on extending my evaluation over the 2012-2022 period and I wanted to make sure I am tracking down the correct files over that period.
So far, I have a URL list from 2012 to 11/20/2018 from GLBu0.08 (expts 19.1, 90.9, 91.0, 91.1, 91.2, 93.0) and a list from GLBy0.08 for the remainder of the period (late 2018-2022), all hindcasts.

There is a grid change between GLBu and GLBy as far as I understand, and I was wondering if there is a better way to get the data for the whole period as consistently as possible. I see I can get GLBy from mid-2014 onward, but I could only find expt 93.0 and no earlier experiment (https://data.hycom.org/datasets/GLBy0.08).

I'd be happy to hear if anyone has had experience with a similar question.
For reference, I am adding below the Python script I use to search for all the file URLs.
Thank you,
Jonathan Sherman

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

def get_file_list(url, end_str):
    response = requests.get(url)
   
    if response.status_code == 200:
        soup = BeautifulSoup(response.text, 'html.parser')
        file_links = set()  # Using a set to automatically eliminate duplicates

        for link in soup.find_all('a', href=True):
            absolute_url = urljoin(url, link['href'])
            # Keep only links whose path ends with end_str (e.g. "ts3z.nc")
            if urlparse(absolute_url).path.endswith(end_str):
                file_links.add(absolute_url)

        return sorted(list(file_links))
    else:
        print(f"Failed to retrieve content from {url}. Status code: {response.status_code}")
        return None


url_2012_191 = "https://data.hycom.org/datasets/GLBu0.08/expt_19.1/data/2012/" #expt_19.1 is from 8/1/1995-12/31/2012
files_2012 = get_file_list(url_2012_191, 't000.nc')
#################################################################################################################################
url_2013_909 = "https://data.hycom.org/datasets/GLBu0.08/expt_90.9/data/hindcasts/2013/" # expt_90.9 is from 1/3/2011 - 8/20/2013
files_2013_909 = get_file_list(url_2013_909, 'ts3z.nc')

url_2013_910 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.0/data/hindcasts/2013/" # expt 91.0 is from 8/21/2013 - 4/4/2014
files_2013_910 = get_file_list(url_2013_910, 'ts3z.nc')
files_2013 = files_2013_909[:-4] + files_2013_910
#################################################################################################################################
url_2014_910 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.0/data/hindcasts/2014/" # expt 91.0 is from 8/21/2013 - 4/4/2014
files_2014_910 = get_file_list(url_2014_910, 'ts3z.nc')


url_2014_911 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.1/data/hindcasts/2014/" # expt 91.1 is from 4/4/2014 - 4/18/2016
files_2014_911 = get_file_list(url_2014_911, 'ts3z.nc')
files_2014 = files_2014_910[:-2] + files_2014_911

#################################################################################################################################
url_2015 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.1/hindcasts/2015/" # expt 91.1 is from 4/4/2014 - 4/18/2016
files_2015 = get_file_list(url_2015, 'ts3z.nc')

#################################################################################################################################
url_2016_911 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.1/hindcasts/2016/" # expt 91.1 is from 4/4/2014 - 4/18/2016
files_2016_911 = get_file_list(url_2016_911, 'ts3z.nc')

url_2016_912 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.2/data/hindcasts/2016/" # expt 91.2 is from 4/18/2016 - 11/20/2018
files_2016_912 = get_file_list(url_2016_912, 'ts3z.nc')
files_2016 = files_2016_911[:-1] + files_2016_912

#################################################################################################################################
url_2017 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.2/data/hindcasts/2017/" # expt 91.2 is from 4/18/2016 - 11/20/2018
files_2017 = get_file_list(url_2017, 'ts3z.nc')

#################################################################################################################################
url_2018_912 = "https://data.hycom.org/datasets/GLBu0.08/expt_91.2/data/hindcasts/2018/" # expt 91.2 is from 4/18/2016 - 11/20/2018
files_2018_912 = get_file_list(url_2018_912, 'ts3z.nc')

url_2018_930 = "https://data.hycom.org/datasets/GLBu0.08/expt_93.0/data/hindcasts/2018/" # expt 93.0 on the GLBu grid; used here after expt 91.2 (4/18/2016 - 11/20/2018)
files_2018_930 = get_file_list(url_2018_930, 't000_ts3z.nc')

files_2018_930_GLBy = get_file_list("https://data.hycom.org/datasets/GLBy0.08/expt_93.0/data/hindcasts/2018/", 't000_ts3z.nc') # GLBy expt_93.0 is from July 2014 to present, but has a different grid than GLBu

files_2018 = files_2018_912[:256] + files_2018_930[:-5] + files_2018_930_GLBy
# See https://www.hycom.org/faqs/474-glby-glbv-glbu-grids for the GLBy/GLBv/GLBu grids

#################################################################################################################################
files_2019 = get_file_list("https://data.hycom.org/datasets/GLBy0.08/expt_93.0/data/hindcasts/2019/", 't000_ts3z.nc')

#################################################################################################################################
files_2020 = get_file_list("https://data.hycom.org/datasets/GLBy0.08/expt_93.0/data/hindcasts/2020/", 't000_ts3z.nc')

#################################################################################################################################
files_2021 = get_file_list("https://data.hycom.org/datasets/GLBy0.08/expt_93.0/data/hindcasts/2021/", 't000_ts3z.nc')

#################################################################################################################################
files_2022 = get_file_list("https://data.hycom.org/datasets/GLBy0.08/expt_93.0/data/hindcasts/2022/", 't000_ts3z.nc')
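
The index slices above (for example `[:-4]` and `[:256]`) depend on exactly how many files each directory lists. A possible alternative is to cut each list by the date stamp in the file name. The sketch below assumes every file name contains a 10-digit `YYYYMMDDHH` stamp between underscores; that pattern and the helper names are assumptions to check against the actual listings.

```python
import re
from datetime import datetime

# Assumed file-name pattern: ..._YYYYMMDDHH_... (verify against real listings).
_STAMP = re.compile(r"_(\d{10})_")

def url_datetime(url):
    """Parse the date stamp from the file name, or return None if absent."""
    match = _STAMP.search(url.rsplit("/", 1)[-1])
    return datetime.strptime(match.group(1), "%Y%m%d%H") if match else None

def select_by_date(urls, start, end):
    """Keep URLs whose stamp falls in the half-open interval [start, end)."""
    kept = []
    for url in urls:
        stamp = url_datetime(url)
        if stamp is not None and start <= stamp < end:
            kept.append(url)
    return kept
```

With these helpers, the 2013 list could be built as `select_by_date(files_2013_909, datetime(2013, 1, 1), datetime(2013, 8, 21)) + select_by_date(files_2013_910, datetime(2013, 8, 21), datetime(2014, 1, 1))`, using the experiment dates given in the comments above.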



Alan Wallcraft

Jan 20, 2024, 12:07:01 PM
to HYCOM.org Forum, Jonathan Sherman - NOAA Affiliate
I suggest using the GOFS 3.1 Reanalysis through 2015 and then the GOFS 3.1 Analysis.  This is the best and most consistent global time series, with the primary difference between the expts being the atmospheric forcing.

The GLBv0.08 grid is a subset of the GLBy0.08 grid, and both are subsets of the GLBu0.08 grid (80S-80N only).  The native model grid is 0.08 degree Mercator (square cells in meters), so its longitudinal resolution is 0.08 degrees but its latitudinal resolution gets finer from the equator to the pole.  Hence we distribute on a 0.04 degree latitude grid to get close to the high-latitude native resolution.  However, for many purposes a uniform 0.08 degree grid (GLBu0.08) is good enough.
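
The latitude-spacing argument above can be checked with a short calculation. This is an idealized sketch, assuming a pure Mercator grid in which the spacing in latitude equals the spacing in longitude times cos(latitude); the function name is illustrative.

```python
import math

def native_lat_spacing(lat_deg, dlon_deg=0.08):
    """Latitude spacing (degrees) of square Mercator cells at a given latitude."""
    return dlon_deg * math.cos(math.radians(lat_deg))

for lat in (0, 30, 60, 75):
    print(f"{lat:2d} deg: {native_lat_spacing(lat):.3f} deg spacing in latitude")
```

Under this assumption the spacing is 0.08 degrees at the equator and about 0.04 degrees at 60 degrees latitude, which matches the 0.04 degree latitude grid mentioned above.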

Alan.

Jonathan Sherman - NOAA Affiliate

Jan 22, 2024, 12:39:11 PM
to HYCOM.org Forum, Alan Wallcraft, Jonathan Sherman - NOAA Affiliate
Thank you Alan for your response.
I believe I found the needed URLs, and I was wondering if there is a recommended way to batch download each year's data ("*t000.nc"). I understand I can't use wildcards in the wget call or have more than one download at a time (I saw that in an answer on the forum). As each file is ~4.5 GB and looping over each URL is slow, I was hoping there is a more efficient way, perhaps one intended for global data access.
Best,
Jonathan

Alan Wallcraft

Jan 22, 2024, 1:47:59 PM
to HYCOM.org Forum, Jonathan Sherman - NOAA Affiliate, Alan Wallcraft

The Maximum number of simultaneous connections per IP address is 8. Abuse of this service will result in the auto block of your IP address!

Actually using 8 connections might overload your local machine, so you should experiment to find the optimal number.

Note that there is a lower limit when you are subsetting fields, because that requires work on our servers, but you are not doing this.

Alan.
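
A minimal sketch of downloading a URL list with a capped number of simultaneous connections, using only the Python standard library. The function names, the default of 4 workers, and the `.part` temporary suffix are illustrative assumptions; the cap of 8 comes from the limit quoted above.

```python
import os
import shutil
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def download(url, out_dir="."):
    """Download one file into out_dir, skipping it if already present."""
    path = os.path.join(out_dir, url.rsplit("/", 1)[-1])
    if os.path.exists(path):
        return path
    tmp = path + ".part"  # write to a temporary name, rename when complete
    with urllib.request.urlopen(url, timeout=60) as resp, open(tmp, "wb") as f:
        shutil.copyfileobj(resp, f)
    os.replace(tmp, path)
    return path

def download_all(urls, out_dir=".", max_workers=4, fetch=download):
    """Fetch all urls using at most max_workers connections at once."""
    if not 1 <= max_workers <= 8:
        raise ValueError("max_workers must be between 1 and 8")
    os.makedirs(out_dir, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(lambda u: fetch(u, out_dir), urls))
```

For example, `download_all(files_2019, out_dir="hycom_2019", max_workers=4)` would fetch one year of files. Timing a few files with different `max_workers` values is one way to find the optimal number for a given machine and network.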