Dear all,
I am rather new to OpenAlex, so any advice is highly appreciated! I want to download a specific subset of works. My CLI command is as follows:
openalex download --api-key j2jWXXXXXXXAE --filter "cited_by_count:>1,publication_year:1991,type:article|book|book-chapter" --output "/XXXXX/1991" --workers 100
When I enter these same filters in the web interface, I get a total of 774,800 hits. After the download with this exact command had finished, I found only 699,416 files in the output folder. I then ran the command again, which gave me this output in the terminal:
ℹ Resuming from checkpoint: 774800 downloaded, 0 failed
ℹ Starting download with filter:
cited_by_count:>1,publication_year:1991,type:article|book|book-chapter
ℹ Content: metadata only, Workers: 100
⚠ Downloading 10,200+ files to flat directory. Consider using --nested for
better filesystem performance.
This seems odd, as the checkpoint apparently counts all 774,800 works as downloaded. Why are there fewer files when I look in the output folder?
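For anyone who wants to reproduce the count, here is a minimal Python sketch of how the files in the output folder can be counted and compared with the checkpoint number. It assumes one file per downloaded work, which I have not verified, and it would also count any extra files the CLI keeps in that folder. The function names and the script name are only illustrative.

```python
import os
import sys


def count_files(root):
    """Count files under root, including any subdirectories."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def report(root, expected):
    """Compare the number of files on disk with the expected number of works."""
    found = count_files(root)
    # Assumption: each downloaded work is stored as exactly one file.
    return {"found": found, "expected": expected, "missing": expected - found}


if __name__ == "__main__":
    # Usage: python count_files.py <output_dir> <expected_count>
    if len(sys.argv) == 3:
        print(report(sys.argv[1], int(sys.argv[2])))
```

Called with the output folder and 774800 as arguments, it prints the number of files found and the difference from the checkpoint figure.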
(By the way, next time I will try the --nested structure as suggested.)
Thanks!