Getting reproducible results for resuming search across longer periods of time?


Stephan

Jan 22, 2026, 7:18:02 AM
to Europe PMC Developer Forum
Hi all, and thanks for providing the REST service for EuropePMC!

I have some questions around the cursorMark value in GET /search results.

I've done some tests with different query parameters. It seems that even if the data changes (e.g., with rising hit counts), the nextCursorMark values stay stable for the same query run at different times when sorting by, e.g., "sort=P_DATE_D asc" (see below).

I'm looking to run search queries for my research, and would like to gauge their reproducibility, ideally for a longer period of time. E.g., the same query using a nextCursorMark with sorting by publication date (for past years) produces the same results next year.

I realize that queries with publication dates in the past and current year or the future may be "unstable", as publications are being added to the database.

  • Q1: Is this true also for publication dates further in the past than, say, last year? I.e., can results for, e.g., 2014 change?

Given that cursor marks replaced "offset" (CIT-2221), I'm assuming that they are stable.

  • Q2: Can you confirm this?

  • Q3: Additionally, how likely is it, and under which conditions, that specific nextCursorMark values will change? E.g., with the next major release of epmc-rest? Changes in the data? Restarts of the service?

Many thanks and kind regards
Stephan

---

Overview of naive testing results

Testing with query="cancer", resultType="idlist", pageSize="1", sort="P_DATE_D asc" at different times today (3 iterations with different hitCounts) always gives me these results for the first ten pages:

  • p. 1: cursorMark in query: "*" -> "id": "29139677", "nextCursorMark": "AoIH///6lIpLjAAoMzc0NTM0OTc="
  • p. 2: cursorMark in query: "AoIH///6lIpLjAAoMzc0NTM0OTc=" -> "id": "29139674", "nextCursorMark": "AoIH///6lIpLjAAoMzc0NTM0OTQ="
  • p. 3: cursorMark in query: "AoIH///6lIpLjAAoMzc0NTM0OTQ=" -> "id": "PMC5534590", "nextCursorMark": "AoIH///6lIpLjAAoMzc0NTA2MjQ="
  • p. 4: cursorMark in query: "AoIH///6lIpLjAAoMzc0NTA2MjQ=" -> "id": "PMC5534606", "nextCursorMark": "AoIH///6lSnwsAAoMzc0NTA2Mzk="
  • p. 5: cursorMark in query: "AoIH///6lSnwsAAoMzc0NTA2Mzk=" -> "id": "PMC5545433", "nextCursorMark": "AoIH///6mhKAYAAoMzc0NTA5NTQ="
  • p. 6: cursorMark in query: "AoIH///6mhKAYAAoMzc0NTA5NTQ=" -> "id": "PMC5550175", "nextCursorMark": "AoIH///6mrIlhAAoMzc0NTEyMTU="
  • p. 7: cursorMark in query: "AoIH///6mrIlhAAoMzc0NTEyMTU=" -> "id": "PMC5550171", "nextCursorMark": "AoIH///6mrIlhAAoMzc0NTEyMTE="
  • p. 8: cursorMark in query: "AoIH///6mrIlhAAoMzc0NTEyMTE=" -> "id": "PMC5550125", "nextCursorMark": "AoIH///6mrIlhAAoMzc0NTExNjU="
  • p. 9: cursorMark in query: "AoIH///6mrIlhAAoMzc0NTExNjU=" -> "id": "PMC5550113", "nextCursorMark": "AoIH///6mrIlhAAoMzc0NTExNTM="
  • p. 10: cursorMark in query: "AoIH///6mrIlhAAoMzc0NTExNTM=" -> "id": "PMC5545520", "nextCursorMark": "AoIH///6ognWsAAoMzc0NTEwMzk="
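A test like the one above could be scripted roughly as follows. This is a minimal sketch, not the script used for the results in this post: the endpoint URL, the `format=json` parameter, and the response layout (`resultList.result[].id`, `nextCursorMark`) are assumptions about the Europe PMC REST service, and `fetch_page` and `walk_cursor` are hypothetical helper names. The `sort` value is passed through exactly as given.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Europe PMC search module; adjust if it differs.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def fetch_page(params):
    """Fetch one page of results as parsed JSON (performs a network request)."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def walk_cursor(query, sort, pages, fetch=fetch_page):
    """Follow nextCursorMark for up to `pages` pages of size 1.

    Returns a list of (cursor_used, ids_on_page, next_cursor) tuples so that
    two runs at different times can be compared row by row.
    """
    cursor = "*"  # "*" requests the first page
    rows = []
    for _ in range(pages):
        data = fetch({
            "query": query,
            "resultType": "idlist",
            "pageSize": 1,
            "sort": sort,
            "cursorMark": cursor,
            "format": "json",
        })
        # Assumed response layout: {"resultList": {"result": [{"id": ...}]}}
        ids = [r["id"] for r in data.get("resultList", {}).get("result", [])]
        next_cursor = data.get("nextCursorMark")
        rows.append((cursor, ids, next_cursor))
        # Stop when the service returns no new cursor (end of results).
        if not next_cursor or next_cursor == cursor:
            break
        cursor = next_cursor
    return rows
```

Calling `walk_cursor("cancer", "P_DATE_D asc", 10)` at two different times and comparing the returned lists would show whether the IDs and cursor marks of the first ten pages still match. The `fetch` argument can be replaced by a stub for offline testing.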

Mohamed Selim

Jan 23, 2026, 6:16:07 AM
to Europe PMC Developer Forum, Stephan
Hi Stephan,
Thank you for reaching out. I think all your questions come down to the fact that our index is updated in near real time. On average, the dataset receives around 800K updates, which include all kinds of changes: insertions, updates, and deletions.
That is why the next cursor mark is not reliable when the index changes. The same applies to the publication date, since we may receive new articles whose publication date is already some time in the past; this is not the usual case, but it happens.
My recommendation is to break your query down into time intervals and, each time, page through the results to the end without saving the cursor mark.
The most reliable field is first_idate, short for first indexing date. This field marks when the article was first indexed in our system.
example:

The results should always be the same. The only exception is deletions, but I believe that is acceptable.
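
The time-interval approach recommended above could be sketched as follows. This is an illustration under stated assumptions, not code from this message: the field spelling `FIRST_IDATE` and the range syntax `[YYYY-MM-DD TO YYYY-MM-DD]` are assumptions about the Europe PMC query language, monthly windows are an arbitrary choice, and `month_windows` and `interval_queries` are hypothetical helper names.

```python
from datetime import date, timedelta


def month_windows(start, end):
    """Yield (first_day, last_day) pairs covering start..end inclusive,
    split at calendar-month boundaries."""
    cur = start
    while cur <= end:
        # Jump past the end of the current month, then snap to day 1.
        next_month = (cur.replace(day=1) + timedelta(days=32)).replace(day=1)
        yield cur, min(end, next_month - timedelta(days=1))
        cur = next_month


def interval_queries(base_query, start, end, field="FIRST_IDATE"):
    """Build one query string per window, restricted by first indexing date.

    The field name and range syntax are assumptions; check them against the
    service's search syntax reference before relying on them.
    """
    return [
        f"({base_query}) AND {field}:[{a.isoformat()} TO {b.isoformat()}]"
        for a, b in month_windows(start, end)
    ]
```

Each generated query would then be paged from cursorMark "*" to the end in a single run, without storing the intermediate cursor marks between runs.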
Please let me know if this answers your questions and if I can help with anything else.
Kind Regards,
Mohamed

Stephan

Jan 26, 2026, 9:37:21 AM
to Europe PMC Developer Forum, mse...@ebi.ac.uk
Thanks, Mohamed, for your answer.

This makes it sufficiently clear to me what I can expect.

All the best
Stephan