Returning rows with absent data value in filter

34 views
Skip to first unread message

Rob Thomas

unread,
Jul 11, 2025, 8:29:38 AMJul 11
to ERDDAP
Hi,

We are preparing to publish a tabular biological abundance dataset on our Erddap. Within the dataset there is an individualCount column which is null when a taxon has been observed but not enumerated (standard practice for biological abundance reporting e.g. Darwin Core). The individualCount variable has been set up with -9 as the _FillValue attribute. 

We can do a successful data search that excludes rows without any presence data, where the filter is set to !=-9, but we are struggling to get the opposite filter (=-9) to return rows for taxa that are present but have not been counted.

Has anyone any advice or had this issue previously?

Cheers
Rob

Mathew Biddle - NOAA Federal

unread,
Jul 11, 2025, 8:53:03 AMJul 11
to Rob Thomas, ERDDAP
Hi Rob,

Can you provide a link to the dataset of question?

Thanks,

Matt

--
You received this message because you are subscribed to the Google Groups "ERDDAP" group.
To unsubscribe from this group and stop receiving emails from it, send an email to erddap+un...@googlegroups.com.
To view this discussion, visit https://groups.google.com/d/msgid/erddap/1dd5d45e-e67f-41cb-b994-4346e7e07658n%40googlegroups.com.


--
Mathew Biddle, Physical Scientist
NOAA/NOS
US Integrated Ocean Observing System Office
1315 East-West Highway
Silver Spring MD 20910

Mathew Biddle - NOAA Federal

unread,
Jul 11, 2025, 9:12:19 AMJul 11
to Rob Thomas, ERDDAP
Rob,

Here is an example of the query you're looking to perform:

1. The entire dataset without filtering: https://erddap.ioos.us/erddap/tabledap/by_the_numbers.htmlTable

A number of things could be happening to cause you to get an unexpected response (is it a number or string? did the configuration get built correctly? have you tried <=-8 ? etc.). Including a link to the dataset will help folks understand the problem.

Thanks,

Matt
Message has been deleted

Rob Thomas

unread,
Jul 11, 2025, 10:20:56 AMJul 11
to Mathew Biddle - NOAA Federal, ERDDAP
Hi Matt,

The dataset is internal at present but I hope to publish externally next week. I hope some screenshots will help in the interim.

First three screenshots are results from the same sample but differing filters using the absent data value.

Results for a sample with filter =-9 for individualCount
filter-9.PNG

Results for a sample with filter !=-9 for individualCount
filternot-9.PNG


Results for a sample without a filter on individualCount
nofilter.PNG
Confirmation of the variable attribute set up
variable_attributes.PNG

Roy Mendelssohn - NOAA Federal

unread,
Jul 11, 2025, 10:22:57 AMJul 11
to Rob Thomas, erDDAP Bob Simons via, Mathew Biddle - NOAA Federal
Thanks Matt!!

You beat me to this - I was just going through my email.. Note If for any ERDDAP, you click on the link for “tabledap documentation” and search for "The valid operators are” there is more discussion on the operators.

-Roy
> To view this discussion, visit https://groups.google.com/d/msgid/erddap/CA%2BRW_ctf0Tyx9V5WM-KvryfM66%3DWx74vo1XWyPwqMOoAetHLgw%40mail.gmail.com.


Roy Mendelssohn - NOAA Federal

unread,
Jul 11, 2025, 10:28:16 AMJul 11
to Rob Thomas, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Hi Rob:

To help my bad eyes can you capture the URLs as text so I can read them. Also a lot will depend on knowing exactly how your dataset and datasets.xml are setup. Can you capture the DAS for that dataset.

Thanks,

-Roy

> On Jul 11, 2025, at 7:20 AM, Rob Thomas <rob.th...@gmail.com> wrote:
>
> Hi Matt,
>
> The dataset is internal at present but I hope to publish externally next week. I hope some screenshots will help in the interim.
>
> First three screenshots are results from the same sample but differing filters using the absent data value.
>
> Results for a sample with filter =-9 for individualCount<filter-9.PNG>
> Results for a sample with filter !=-9 for individualCount<filternot-9.PNG>
>
> Results for a sample without a filter on individualCount<nofilter.PNG>Confirmation of the variable attribute set up<variable_attributes.PNG>
> To view this discussion, visit https://groups.google.com/d/msgid/erddap/CAD14vn35j9CeB64pRYsm7SCaQkcNTQx_pZ3nP8OPig%3D%2Bm74Bgg%40mail.gmail.com.


Rob Thomas

unread,
Jul 11, 2025, 10:49:50 AMJul 11
to Roy Mendelssohn - NOAA Federal, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
benthic.xml

Roy Mendelssohn - NOAA Federal

unread,
Jul 11, 2025, 10:55:46 AMJul 11
to Rob Thomas, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Thanks. Can you capture the DAS that is shown at the bottom of the page.

-Roy
> <benthic.xml>

Rob Thomas

unread,
Jul 11, 2025, 11:11:10 AMJul 11
to Roy Mendelssohn - NOAA Federal, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Here you go.

On Fri, Jul 11, 2025 at 3:55 PM Roy Mendelssohn - NOAA Federal
benthicDAS.txt

Roy Mendelssohn - NOAA Federal

unread,
Jul 11, 2025, 11:42:37 AMJul 11
to Rob Thomas, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Thanks. One more thing. Can you select only individualCount as well as “distinct” so I can see what ERDDAP sees as the distinct values.

Sorry to do this so piecemeal - coffee hasn’t kicked in yet.

-Roy
> <benthicDAS.txt>

Rob Thomas

unread,
Jul 11, 2025, 11:53:23 AMJul 11
to Roy Mendelssohn - NOAA Federal, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Hi Roy,

Appreciate the time given it is not public yet for you to be able to
access and interrogate yourself.

It is showing an empty cell at the bottom of the list when the
distinct results are ordered. There is no -9 value. The list starts at
1.0

Cheers
Rob

On Fri, Jul 11, 2025 at 4:42 PM Roy Mendelssohn - NOAA Federal
image.png
image.png

Roy Mendelssohn - NOAA Federal

unread,
Jul 11, 2025, 6:13:31 PMJul 11
to Rob Thomas, Mathew Biddle - NOAA Federal, erDDAP Bob Simons via
Hi Rob:

The blank entry says that ERDDAP is seeing missing values, so it is correctly interpreting what you put in the datasets.xml.  From the tabledap docs:

  • NaN (Not-a-Number) (external link).png - Many numeric variables have an attribute which identifies a number (e.g., -99) as a missing_value or a _FillValue. When ERDDAP tests constraints, it always treats these values as NaN's. So:
    Don't create constraints like temperature!=-99 .
    Do     create constraints like temperature!=NaN .
    • For numeric variables, tests of variable=NaN (Not-a-Number) and variable!=NaN will usually work as expected.
      WARNING: numeric queries with =NaN and !=NaN may not work as desired since many data sources don't offer native support for these queries and ERDDAP can't always work around this problem. For some datasets, queries with =NaN or !=NaN may fail with an error message (insufficient memory or timeout) or erroneously report Your query produced no matching results.
    • For numeric variables, tests of variable<NaN (Not-a-Number) (or <=, >, >=) will return false for any value of the variable, even NaN. NaN isn't a number so these tests are nonsensical. Similarly, tests of variable<aNonNaNValue (or <=, >, >=) will return false whenever the variable's value is NaN.

See if this solves your problem,  Let me know.

-Roy


<image.png><image.png>

Message has been deleted

Rob Thomas

unread,
Jul 15, 2025, 9:07:02 AMJul 15
to ERDDAP
Hi Roy,

Thanks for further looking into this and pointing to the additional documentation. I suspect this is related to the numeric queries with NaN not working as expected in both text in the excerpt above. Replacing -9 with NaN give the following error regardless of != or = operator:
Error { code=500; message="Internal Server Error: ERROR from data source: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 3 (\"\"): The supplied value is not a valid instance of data type real. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision."; }

I'll try explicitly adding -9 as an absent data value in the underlying table and logging this as the missing data value in the variable attributes.

Cheers
Rob

Rob Thomas

unread,
Jul 15, 2025, 9:08:28 AMJul 15
to ERDDAP
Hi Matt,

Currently the dataset is internal only. I hope to have it published externally next week. I hope some screenshots will help in the interim:

Variable attributes confirming FIllValue set to -9.0

variable_attributes.PNG

Results from a sample that contains both enumerated and unenumerated taxa
nofilter.PNG

Results from same sample but excluding rows using the absent data value to filter
filternot-9.PNG

Results from same sample but including rows using the absent data value to filter. Using <1 as the filter also returns the "nRows = 0" 404 message.

filter-9.PNG

Roy Mendelssohn - NOAA Federal

unread,
Jul 15, 2025, 10:18:02 AMJul 15
to Rob Thomas, erDDAP Bob Simons via
HI Rob:

Putting NaN in the file won’t work because you are mixing datatypes. Putting -9 in the file, setting that as the _FillValue in the datasets.xml, and then using NaN in the constraint expression to look for missing values is what should work.

HTH,

-Roy


> On Jul 14, 2025, at 6:42 AM, Rob Thomas <rob.th...@gmail.com> wrote:
>
> Hi Roy,
>
> Thanks for further looking into this and pointing to the additional documentation. I suspect this is related to the numeric queries with NaN not working as expected in both text in the excerpt above. Replacing -9 with NaN give the following error regardless of != or = operator:
> Error { code=500; message="Internal Server Error: ERROR from data source: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 3 (\"\"): The supplied value is not a valid instance of data type real. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision."; }
>
> I'll try explicitly adding -9 as an absent data value in the underlying table and logging this as the missing data value in the variable attributes.
>
> Cheers
> Rob
> On Friday, July 11, 2025 at 11:13:31 PM UTC+1 roy.men...@noaa.gov wrote:
> Hi Rob:
>
> The blank entry says that ERDDAP is seeing missing values, so it is correctly interpreting what you put in the datasets.xml. From the tabledap docs:
>
> • NaN (Not-a-Number) - Many numeric variables have an attribute which identifies a number (e.g., -99) as a missing_value or a _FillValue. When ERDDAP tests constraints, it always treats these values as NaN's. So:
> To view this discussion, visit https://groups.google.com/d/msgid/erddap/86f8ab3b-7e8e-495a-a591-dfc02cc6ce6cn%40googlegroups.com.

Rob Thomas

unread,
Jul 17, 2025, 10:05:45 AMJul 17
to Roy Mendelssohn - NOAA Federal, erDDAP Bob Simons via
Hi Roy,

We have the dataset available from a publicly available Erddap now, so
these URLs should return viewable results. Absent data value of -9 is
now in the underlying dataset and attributes of "missing_value" and
"_FillValue" for individualCount are set to -9. Still seeing the same
behaviour as previously.

https://erddap3.marine.ie/erddap/tabledap/wfd_benthic.htmlTable?&station_number=%22MIBE14-021%22
- returns 4 records
https://erddap3.marine.ie/erddap/tabledap/wfd_benthic.htmlTable?&station_number=%22MIBE14-021%22&individualCount=-9
- returns 0 records but would expect to return 3 records
https://erddap3.marine.ie/erddap/tabledap/wfd_benthic.htmlTable?&station_number=%22MIBE14-021%22&individualCount!=-9
- returns 1 record

Using NaN in the query as shown in the URL still gives an error (see
below). This is not the behaviour that would be expected?

https://erddap3.marine.ie/erddap/tabledap/wfd_benthic.htmlTable?&station_number=%22MIBE14-021%22&individualCount=NaN

Error {
code=500;
message="Internal Server Error: ERROR from data source:
com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular
data stream (TDS) remote procedure call (RPC) protocol stream is
incorrect. Parameter 4 (\"\"): The supplied value is not a valid
instance of data type real. Check the source data for invalid values.
An example of an invalid value is data of numeric type with scale
greater than precision.";
}


https://erddap3.marine.ie/erddap/tabledap/wfd_benthic.htmlTable?&station_number=%22MIBE14-021%22&individualCount!=NaN

Error {
code=500;
message="Internal Server Error: ERROR from data source:
com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular
data stream (TDS) remote procedure call (RPC) protocol stream is
incorrect. Parameter 4 (\"\"): The supplied value is not a valid
instance of data type real. Check the source data for invalid values.
An example of an invalid value is data of numeric type with scale
greater than precision.";
}

Cheers
Rob

On Tue, Jul 15, 2025 at 3:18 PM Roy Mendelssohn - NOAA Federal

Chris John - NOAA Affiliate

unread,
Jul 17, 2025, 12:43:53 PMJul 17
to Rob Thomas, Roy Mendelssohn - NOAA Federal, erDDAP Bob Simons via
So there is a mention of this edgecase in the EDDTableFromDatabase documentation: https://erddap.github.io/docs/server-admin/datasets#integer-nulls-1

I believe to support a NaN query we'd need to make a change to the code here: https://github.com/ERDDAP/erddap/blob/main/WEB-INF/classes/gov/noaa/pfel/erddap/dataset/EDDTableFromDatabase.java#L1049

A hacky workaround for now could be to use -9 in the database for unset, but not tell ERDDAP™ that -9 is the fill/missing value. This should allow you to query for -9.

Christopher John (he/him)
NOAA Appointed Technical Director of ERDDAP™
Computer and Information Systems Manager, TSPi




Roy Mendelssohn - NOAA Federal

unread,
Jul 17, 2025, 12:46:38 PMJul 17
to erDDAP Bob Simons via, Rob Thomas, Chris John - NOAA Affiliate
Thanks Chris. What I didn’t understand until I saw the actual error message is that it was connected to a Microsoft SQLServer database. I was assuming it was a file.

-Roy
Reply all
Reply to author
Forward
0 new messages