Search API error on deaccessioned dataset

Dario Basset

Dec 14, 2022, 10:46:51 AM
to Dataverse Users Community
We run Dataverse 4.20 at the University of Milan, Italy.

We would like to use the Search API to get a list of our datasets. The following call, which pages through the datasets:

https://dataverse.unimi.it/api/search?q=*&type=dataset&per_page=50&start=300&key=xxxxxxxxxxxxxxxxxxxxxxxx

returns Internal Server Error (500).


We discovered that the problem is caused by a single deaccessioned dataset in our production environment. That dataset still responds correctly to a direct API call:

https://dataverse.unimi.it/api/datasets/12923 (you can try it yourself)

So we cannot understand why the paginated Search API call above fails while the direct API call works. What do you suggest?

Philip Durbin

Dec 19, 2022, 2:47:14 PM
to dataverse...@googlegroups.com
Hi Dario,

Two things. First, I'd try reindexing that dataset if you think it's causing problems: https://guides.dataverse.org/en/5.12.1/admin/solr-search-index.html#reindexing-datasets

Second, if you're still getting the 500 error, can you please email your server.log to sup...@dataverse.org? There should be a stack trace that gives more details about what's wrong.
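
For readers following along, reindexing a single dataset can be sketched in Python as below. This is only a sketch under assumptions: it relies on the admin endpoint described on the guide page linked above (`/api/admin/index/datasets/{id}`), it assumes the admin API is reachable at `http://localhost:8080` on the server itself, and the helper names `reindex_url` and `reindex_dataset` are made up for illustration. The database ID 12923 is the one from the direct API URL in the first message.

```python
from urllib.request import urlopen


def reindex_url(dataset_id: int, base_url: str = "http://localhost:8080") -> str:
    """Build the admin URL that asks Dataverse to reindex one dataset by database ID.

    The base URL is an assumption: the admin API is usually only reachable
    from the server host itself.
    """
    return f"{base_url.rstrip('/')}/api/admin/index/datasets/{int(dataset_id)}"


def reindex_dataset(dataset_id: int, base_url: str = "http://localhost:8080",
                    timeout: float = 60.0) -> str:
    """Send the reindex request and return the raw response body as text."""
    with urlopen(reindex_url(dataset_id, base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Run on the Dataverse host, `print(reindex_dataset(12923))` would send the request and print whatever the server answers.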

Thanks,

Phil



Alfredo Cosco

Jan 20, 2023, 8:38:52 AM
to Dataverse Users Community
Hi all,
we have made several attempts, but none has been successful.

The Dataverse version in use is 4.20.

To recap the problem: we have a set of 339 datasets. When I query them one at a time:
curl -s "https://dataverse.unimi.it/api/search?q=*&type=dataset&per_page=1&start=317&key=XYZ0-MYKEY-1234"

I expect JSON like this:
{
  "status": "OK",
  "data": {
    "q": "*",
    "total_count": 339,
    "start": 310,
    "spelling_alternatives": {},
    "items": [
      {
        "name": "DV-AI_Dataverse Guidelines",
        "type": "dataset",
        "url": "https://doi.org/10.ABCDE/XY_UNINW/ABCDE",
        "global_id": "doi:10.ABCDE/XY_UNINW/ABCDE",
        "description": "Lorem ipsum dolor sit amet",
        "published_at": "2021-10-04T10:20:11Z",
        "publisher": "DV-AI H2020 ABCD research project",
        "citationHtml": "John, Doe, 2021, \"DV-AI_Dataverse Guidelines\", <a href=\"https://doi.org/10.ABCDE/XY_UNINW/ABCDE\" target=\"_blank\">https://doi.org/10.ABCDE/XY_UNINW/ABCDE</a>, UNINW Dataverse, V1",
        "identifier_of_dataverse": "dv-AI",
        "name_of_dataverse": "DV-AI H2020 ABCD research project",
        "citation": "John, Doe, 2021, \"DV-AI_Dataverse Guidelines\", https://doi.org/10.ABCDE/XY_UNINW/ABCDE, UNINW Dataverse, V1",
        "storageIdentifier": "s3://10.ABCDE/XY_UNINW/ABCDE",
        "subjects": [
          "Other"
        ],
        "fileCount": 1,
        "versionId": 164,
        "versionState": "RELEASED",
        "majorVersion": 1,
        "minorVersion": 0,
        "createdAt": "2021-10-04T10:20:04Z",
        "updatedAt": "2021-10-04T10:40:18Z",
        "contacts": [
          {
            "name": "John, Doe",
            "affiliation": "University of Nowhere"
          }
        ],
        "authors": [
          "John, Doe"
        ]
      }
    ],
    "count_in_response": 1
  }
}

But among the 339 datasets (start offsets 0-338) there is always one that responds like this:
{
  "status": "ERROR",
  "code": 500,
  "message": "Internal server error. More details available at the server logs.",
  "incidentId": "14aec796-7f7c-4442-bc22-a7c35270a95e"
}
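
The one-record-at-a-time check described above can be automated. The following is a minimal sketch, not a tested tool: it walks every `start` offset with `per_page=1` and records the offsets that do not return HTTP 200, together with the `incidentId` from the error body. The function names are hypothetical, and sending the API token in the `X-Dataverse-key` header instead of the `key` query parameter is a choice made here to keep the token out of URLs.

```python
import json
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def search_url(base_url, start, per_page=1):
    """Build a Search API URL for one page of dataset results."""
    query = urlencode({"q": "*", "type": "dataset",
                       "per_page": per_page, "start": start})
    return f"{base_url.rstrip('/')}/api/search?{query}"


def fetch_status(url, api_key=None, timeout=30):
    """Return (HTTP status, parsed JSON body) for a URL, including error responses."""
    headers = {"X-Dataverse-key": api_key} if api_key else {}
    try:
        with urlopen(Request(url, headers=headers), timeout=timeout) as resp:
            return resp.status, json.load(resp)
    except HTTPError as err:
        body = err.read().decode("utf-8", errors="replace")
        try:
            return err.code, json.loads(body)
        except ValueError:
            # Error body was not JSON; keep it as raw text.
            return err.code, {"raw": body}


def find_failing_offsets(total, base_url, fetch=fetch_status, api_key=None):
    """Query offsets 0..total-1 one record at a time and collect the failures."""
    failures = []
    for start in range(total):
        status, payload = fetch(search_url(base_url, start), api_key)
        if status != 200:
            incident = payload.get("incidentId") if isinstance(payload, dict) else None
            failures.append((start, status, incident))
    return failures
```

Calling `find_failing_offsets(339, "https://dataverse.unimi.it", api_key="XYZ0-MYKEY-1234")` would return a list of `(offset, status, incidentId)` tuples; the incident ID can then be looked up in server.log.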

I got some info from this issue:
https://github.com/IQSS/dataverse/issues/5613

But even if I go back to the dataverse that contains the dataset throwing the error and, as suggested in the issue, use the UI to check/uncheck "inherit metadata blocks" and save again, the bad record does not disappear; it just changes ID.

I still have one record (with a random ID in the range 0-338) that responds with a 500 error.

I also did a Solr clear and reindex ( https://guides.dataverse.org/en/4.20/admin/solr-search-index.html#clear-and-reindex ). Everything went fine, but it did not solve the problem.

I set the log level to FINE ( https://guides.dataverse.org/en/4.20/developers/debugging.html ).
When I curl the bad record, the server log ( /usr/local/glassfish4/glassfish/domains/domain1/logs/server.log ) only contains:

[2023-01-20T10:06:30.957+0000] [glassfish 4.1] [SEVERE] [] [edu.harvard.iq.dataverse.api.errorhandlers.ServeletExceptionHandler] [tid: _ThreadID=247940 _ThreadName=jk-connector(1)] [timeMillis: 1674209190957] [levelValue: 1000] [[
  API internal error 14aec796-7f7c-4442-bc22-a7c35270a95e: Null Pointer
java.lang.NullPointerException
]]
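
To pull every log record that mentions a given incident ID out of a large server.log, something like the following sketch may help. It assumes records look like the excerpt above, that is, they start with a bracketed ISO timestamp at the beginning of a line and end with `]]` at the end of a line; the function name `records_for_incident` is hypothetical. It only locates what was logged and cannot add detail that the server did not write.

```python
import re

# One log record: from a line starting with "[YYYY-MM-DDT..." up to the
# first "]]" that ends a line (format assumed from the excerpt above).
RECORD_RE = re.compile(r"^\[\d{4}-\d{2}-\d{2}T.*?\]\]$", re.DOTALL | re.MULTILINE)


def records_for_incident(log_text, incident_id):
    """Return all log records whose text contains the given incident ID."""
    return [m.group(0) for m in RECORD_RE.finditer(log_text)
            if incident_id in m.group(0)]
```

For example, reading the file with `open(path, encoding="utf-8", errors="replace").read()` and passing the text together with `"14aec796-7f7c-4442-bc22-a7c35270a95e"` would return the matching records as a list of strings.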


Any ideas?
How can I get a more detailed log?

Thanks,
Alfredo

Philip Durbin

Mar 23, 2023, 4:47:58 PM
to dataverse...@googlegroups.com
Hi Alfredo,

Just checking in. I know it's been a while. Any news on this issue?

Thanks,

Phil


Alfredo Cosco

May 22, 2023, 3:31:56 AM
to Dataverse Users Community
Hi Philip,
sorry for the late reply. We upgraded Dataverse from 4.20 to 5.13 and the issue disappeared.
Regards,
Alfredo