If a ship makes multiple cruises, what is a "trajectory"?

bobsimons2.00

Jun 2, 2023, 3:16:34 PM

to ERDDAP

A user asked: if a ship makes multiple cruises, what is one "trajectory" (as defined by CF and by ERDDAP's cdm_trajectory_variables)?

It could be either:

a) the ship's entire trajectory (with many cruises as subsections), or

b) just one of the cruises.

In either case, shouldn't a .nc file with this data have one dimension for the ship, one dimension for cruise, and one dimension for observation?

My answer is:

The CF group made these definitions in a chapter called Discrete Sampling Geometries. "Geometry" is the key word: these definitions are closer to math than to oceanography. CF and ERDDAP call each geometry type a "featureType"; ERDDAP also calls it a "cdm_data_type", which is the older, broader name for this concept. The types are Point (observations from a collection of unrelated points), TimeSeries (all observations at one location), Trajectory, Profile, TimeSeriesProfile, and TrajectoryProfile. See

https://cfconventions.org/Data/cf-conventions/cf-conventions-1.10/cf-conventions.html#discrete-sampling-geometries

ERDDAP just tries to support CF's ideas.

So, a featureType/cdm_data_type, e.g., trajectory, is defined by its geometry. And the CF definition is that a file (or an ERDDAP dataset) of a given type can include a collection of that type (e.g., a collection of trajectories). CF defines several file representations for each featureType. For the CF multidimensional file representation of Trajectory, CF only supports files with 2 dimensions (trajectory and observation) for that collection.
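To make the 2-dimension idea concrete, here is a minimal sketch (plain Python, no netCDF library) of how flat observations might be arranged into a (trajectory, observation) layout, with shorter trajectories padded by a fill value. The record layout, the names (FLAT, to_multidimensional, sst), and the values are all made up for illustration; this is not code from CF or ERDDAP.

```python
import math

# Hypothetical flat records: (trajectory_id, time, sst). Names and values are invented.
FLAT = [
    ("C1", 0, 14.1),
    ("C1", 1, 14.3),
    ("C2", 5, 15.0),
    ("C3", 2, 12.7),
    ("C3", 3, 12.9),
]

def to_multidimensional(records, fill=math.nan):
    """Group flat records by trajectory and pad each row to a common length."""
    per_trajectory = {}  # dicts keep insertion order, so trajectories stay in first-seen order
    for trajectory_id, time, sst in records:
        per_trajectory.setdefault(trajectory_id, []).append((time, sst))

    # The observation dimension is as long as the longest trajectory.
    n_obs = max(len(obs) for obs in per_trajectory.values())
    ids = list(per_trajectory)

    time_2d, sst_2d = [], []
    for trajectory_id in ids:
        obs = per_trajectory[trajectory_id]
        padding = [fill] * (n_obs - len(obs))
        time_2d.append([t for t, _ in obs] + padding)
        sst_2d.append([s for _, s in obs] + padding)
    return ids, time_2d, sst_2d

ids, time_2d, sst_2d = to_multidimensional(FLAT)
```

Whatever you choose to call a "trajectory" (ship or cruise) becomes the first dimension; there is no third level in this layout.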

I understand your reasoning for wanting separate levels or dimensions in ERDDAP for ship and cruise. CF doesn't support that idea. "Trajectory" is a 2-level concept.

Note that ERDDAP doesn't care whether you give it files with 3 dimensions (ship, cruise, observation), files with 2 dimensions (ship+cruise, observation), or a flat file with 1 dimension (with columns for ship, cruise, and each of the observation variables).

What really matters to ERDDAP is cdm_trajectory_variables, because that tells ERDDAP what you consider to be a "trajectory".

My understanding of CF's definitions is that you can set it up either way and still be CF compliant:

a) A trajectory includes all the cruises of a given ship. Then, cdm_trajectory_variables (and presumably subsetVariables) includes only the variables that are unchanging for a given ship. And cruise_id would instead be one of the observation variables. A cruise is then essentially a subset of a trajectory.

b) A trajectory is the equivalent of one cruise. Then, cdm_trajectory_variables (and presumably subsetVariables) includes only the variables that are unchanging for a given cruise (including cruise_id).
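One rough way to see which variables would qualify for cdm_trajectory_variables under each option is to check which columns are unchanging within every trajectory. The sketch below does that for a tiny made-up table; the column names (ship_id, ship_name, cruise_id, time, sst) and the helper constant_variables are hypothetical, and it assumes cruise_id values are unique across ships.

```python
from collections import defaultdict

# Hypothetical flat table: one dict per observation. All names and values are invented.
ROWS = [
    {"ship_id": "S1", "ship_name": "Alpha", "cruise_id": "C1", "time": 0, "sst": 14.1},
    {"ship_id": "S1", "ship_name": "Alpha", "cruise_id": "C1", "time": 1, "sst": 14.3},
    {"ship_id": "S1", "ship_name": "Alpha", "cruise_id": "C2", "time": 5, "sst": 15.0},
    {"ship_id": "S2", "ship_name": "Beta", "cruise_id": "C3", "time": 2, "sst": 12.7},
    {"ship_id": "S2", "ship_name": "Beta", "cruise_id": "C3", "time": 3, "sst": 12.9},
]

def constant_variables(rows, key):
    """Return sorted names of variables that have one value within every group defined by `key`."""
    # seen[group][variable] is the set of distinct values observed in that group.
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for name, value in row.items():
            seen[row[key]][name].add(value)
    names = rows[0].keys()
    return sorted(
        name for name in names
        if all(len(group[name]) == 1 for group in seen.values())
    )

# Option (a): one trajectory per ship.
option_a = constant_variables(ROWS, "ship_id")
# Option (b): one trajectory per cruise.
option_b = constant_variables(ROWS, "cruise_id")
```

With this sample data, option (a) yields only the ship-level variables, while option (b) also yields cruise_id, which matches the two descriptions above.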

From a geometric (math) standpoint, both are valid.

Note that in ERDDAP, for either option, a user can request all data from a given ship, or a given cruise.
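For example, a tabledap request can constrain on either variable regardless of which one defines the trajectory. The sketch below builds such request URLs, assuming the usual tabledap form (a comma-separated variable list followed by &-separated constraints, with string values in percent-encoded double quotes). The server URL, the dataset ID shipData, the variable names, and the helper tabledap_url are all hypothetical.

```python
from urllib.parse import quote

def tabledap_url(base, dataset_id, variables, constraints, file_type=".csv"):
    """Build a tabledap request URL with string equality constraints."""
    query = ",".join(variables)
    for name, value in constraints.items():
        # String constraint values go in double quotes, percent-encoded as %22.
        query += "&" + name + "=" + quote('"' + value + '"', safe="")
    return f"{base}/tabledap/{dataset_id}{file_type}?{query}"

BASE = "https://example.org/erddap"        # hypothetical server
VARIABLES = ["time", "latitude", "longitude"]

# All data from one ship (hypothetical ship_id value).
by_ship = tabledap_url(BASE, "shipData", VARIABLES, {"ship_id": "S1"})
# All data from one cruise (hypothetical cruise_id value).
by_cruise = tabledap_url(BASE, "shipData", VARIABLES, {"cruise_id": "C1"})
```

The same two requests work whether cruise_id is listed in cdm_trajectory_variables (option b) or is just an observation variable (option a).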

I slightly recommend (b), because (a) seems to lose the idea that a cruise can be considered a trajectory. Also, equipment such as sensors may be recalibrated or replaced between cruises, so with (b) everything within a given trajectory is consistent. But it is your decision. I don't think there is a right or wrong answer. (In any case, that would be up to CF.)

I hope that helps. The ERDDAP documentation for cdm_data_type may also help: https://coastwatch.pfeg.noaa.gov/erddap/download/setupDatasetsXml.html#cdm_data_type