num-found seems to be floating

Paolo Simeone

Nov 24, 2020, 5:55:08 AM
to ORCID API Users
Dear all,

I am seeing confusing results from my requests to the v3.0 API: the num-found value changes between otherwise identical requests.

This is live from my (Python) debugger. Is this expected, or am I getting something wrong in my request?

(Pdb) url_search = 'https://pub.orcid.org/v3.0/search/?q=*:*+AND+profile-last-modified-date:[2020-11-16T00:00:00Z TO 2020-11-23T00:00:00Z]&start=0&rows=100'
(Pdb) res = requests.get(url_search, data=payload, headers=headers)
(Pdb) ET.fromstring(res.text.encode('utf-8')).get('num-found')
'483763'
(Pdb) res = requests.get(url_search, data=payload, headers=headers)
(Pdb) ET.fromstring(res.text.encode('utf-8')).get('num-found')
'483730'
(Pdb) url_search = 'https://pub.orcid.org/v3.0/search/?q=*:*+AND+profile-last-modified-date:[2020-11-16T00:00:00Z TO 2020-11-23T00:00:00Z]&start=100&rows=100'
(Pdb) res = requests.get(url_search, data=payload, headers=headers)
(Pdb) ET.fromstring(res.text.encode('utf-8')).get('num-found')
'483730'
(Pdb) url_search = 'https://pub.orcid.org/v3.0/search/?q=*:*+AND+profile-last-modified-date:[2020-11-16T00:00:00Z TO 2020-11-23T00:00:00Z]&start=483700&rows=100'
(Pdb) res = requests.get(url_search, data=payload, headers=headers)
(Pdb) ET.fromstring(res.text.encode('utf-8')).get('num-found')
'483730'
(Pdb) url_search = 'https://pub.orcid.org/v3.0/search/?q=*:*+AND+profile-last-modified-date:[2020-11-16T00:00:00Z TO 2020-11-23T00:00:00Z]&start=483700&rows=100'
(Pdb) res = requests.get(url_search, data=payload, headers=headers)
(Pdb) ET.fromstring(res.text.encode('utf-8')).get('num-found')
'483682'

Pedro Costa

Nov 26, 2020, 11:13:46 AM
to ORCID API Users
Hi Paolo,

We keep live data in Solr, so query results are always dynamic. We store only the latest last-modified-date for each record, not a history of last-modified-dates, so the results of a query such as the one in your post are expected to change. For example, a given record's last-modified-date might fall within the timeframe of your query today, but if I update that record today and you run the query again, its last-modified-date no longer falls within your timeframe, so the record drops out of the results and num-found goes down.
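
As a rough illustration of that effect, here is a minimal sketch in plain Python. It does not call the ORCID API or Solr; the `index` dictionary, the record names and the timestamps are all made up for the example. It only shows how a count over a fixed date window shrinks when the single stored timestamp of a record is overwritten.

```python
from datetime import datetime, timezone

UTC = timezone.utc


def count_in_window(last_modified, start, end):
    """Count records whose single stored timestamp lies in [start, end]."""
    return sum(1 for ts in last_modified.values() if start <= ts <= end)


# Same window as in the query above.
window_start = datetime(2020, 11, 16, tzinfo=UTC)
window_end = datetime(2020, 11, 23, tzinfo=UTC)

# Hypothetical index: one timestamp per record, no history kept.
index = {
    "record-a": datetime(2020, 11, 17, 9, 0, tzinfo=UTC),
    "record-b": datetime(2020, 11, 20, 14, 30, tzinfo=UTC),
    "record-c": datetime(2020, 11, 10, 8, 0, tzinfo=UTC),
}

before = count_in_window(index, window_start, window_end)

# record-a is edited after the window has closed: its only timestamp is
# overwritten, so it no longer matches the window.
index["record-a"] = datetime(2020, 11, 24, 5, 55, tzinfo=UTC)

after = count_in_window(index, window_start, window_end)

print(before, after)
```

With hundreds of thousands of records being edited continuously, the same thing happens between two identical requests, which would explain the small downward drift in num-found.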

We have a lambda file generated daily which contains the last-modified-date for all records. You could save that file every day and then get the data from it. This could be a solution to what you're trying to do. Please note this is available only to ORCID members.
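
To give an idea of how the saved daily files could be used, here is a small sketch that compares two snapshots and lists the records whose last-modified-date changed between them. The file layout is an assumption made for the example (a CSV with `orcid` and `last_modified` columns), the helper names `load_snapshot` and `changed_between` are hypothetical, and the iDs and timestamps are placeholder data, so please adapt it to the actual file you download.

```python
import csv
import io


def load_snapshot(fileobj):
    """Read an assumed 'orcid,last_modified' CSV into a dict keyed by iD."""
    reader = csv.DictReader(fileobj)
    return {row["orcid"]: row["last_modified"] for row in reader}


def changed_between(old, new):
    """Return iDs that are new or whose timestamp differs from the old snapshot."""
    return sorted(oid for oid, ts in new.items() if old.get(oid) != ts)


# Placeholder data standing in for two files saved on consecutive days.
day1 = io.StringIO(
    "orcid,last_modified\n"
    "0000-0000-0000-0001,2020-11-16T08:00:00Z\n"
    "0000-0000-0000-0002,2020-11-18T12:00:00Z\n"
)
day2 = io.StringIO(
    "orcid,last_modified\n"
    "0000-0000-0000-0001,2020-11-16T08:00:00Z\n"
    "0000-0000-0000-0002,2020-11-24T05:55:00Z\n"
    "0000-0000-0000-0003,2020-11-24T10:00:00Z\n"
)

changed = changed_between(load_snapshot(day1), load_snapshot(day2))
print(changed)
```

Because each saved file is a fixed snapshot, the list of changed records for a given pair of days stays the same no matter when you compute it, unlike a live search query.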

Kind regards,
Pedro Costa
QA Lead