Hi Fabian,
OK, the 'to_dataframe' function being missing for GHCN vs GSOD is just an artifact of when we wrote the download scripts. Some ulmo modules have it and others don't. That should be fixed in the current rewrite of ulmo.
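In the meantime, for a module without it, the conversion can be done by hand with pandas. This is only a rough sketch: it assumes the downloaded values for one station are a list of per-day dicts with a "date" key, and the helper name `records_to_dataframe` and the sample field names are made up for illustration, not part of ulmo. Check what the module actually returns before relying on it.

```python
import pandas as pd


def records_to_dataframe(records, date_key="date"):
    """Build a date-indexed DataFrame from a list of per-day dicts.

    Hypothetical helper, not a ulmo function. Assumes every record
    carries a parseable date under `date_key`.
    """
    df = pd.DataFrame.from_records(records)
    df[date_key] = pd.to_datetime(df[date_key])
    # Index by date and sort so time-based slicing behaves as expected.
    return df.set_index(date_key).sort_index()


# Made-up sample records, only to show the assumed shape.
sample = [
    {"date": "2014-01-02", "mean_temp": 41.0, "precip": 0.0},
    {"date": "2014-01-01", "mean_temp": 38.5, "precip": 0.12},
]
df = records_to_dataframe(sample)
```

If the values come back keyed by station, the same helper can be applied to each station's list in turn.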
My understanding is that GHCN is roughly a subset of GSOD that has undergone some extra quality control to allow better long-term historical comparisons of climate.
From the NOAA website:
GHCN -> "The Global Historical Climatology Network (GHCN) is an integrated database of climate summaries from land surface stations across the globe that have been subjected to a common suite of quality assurance reviews. The data are obtained from more than 20 sources. Some data are more than 175 years old while others are less than an hour old. GHCN is the official archived dataset, and it serves as a replacement product for older NCEI-maintained datasets that are designated for daily temporal resolution (i.e., DSI 3200, DSI 3201, DSI 3202, DSI 3205, DSI 3206, DSI 3208, DSI 3210, etc.)."
GSOD -> "Global Surface Summary of the Day is a product produced by the National Climatic Data Center (NCDC), and is derived from the synoptic/hourly observations contained in the Integrated Surface Hourly (ISH) dataset (DSI-3505). The latest daily summary data are normally available 1-2 days after the date-time of the observations used in the daily summaries, and over 9000 worldwide stations' data are available. Daily elements (as available) include mean values of temperature, dew point, sea level and station pressures, visibility, and wind speed plus maximum sustained wind speed and/or wind gusts, maximum and minimum temperature, precipitation amounts, snow depth, and indicators for occurrences of various weather elements. Historical data are generally available for 1929 to the present, with data from 1973 to the present being the most complete. Daily extremes and totals--maximum wind gust, precipitation amount, and snow depth-- only appear if the station reports the data sufficiently to provide a valid value. Therefore, these three elements appear less frequently than other values. Since these elements are derived from the original synoptic/hourly data as are reported and based on Greenwich Mean Time (GMT, 0000Z-2359Z), they often comprise a 24-hour period which includes a portion of the previous day (i.e., offset from local standard times)."
- dharhas