I'm a fourth-year journalist at Ryerson University. I'm trying to organize the response to an access to information request, which came back as more than 7,000 pages of PDF files. When I try to upload with the OCR option, I keep getting "Failed with error 'do-convert-single-file exited with status code 1'".
I've tried a few different things, like splitting the file into 1,000-page chunks, and then 500-page chunks, before uploading, but OCR still failed. I was able to upload a 500-page file, but only without OCR. OCR did work on a 105-page file I cut out of the main file, with either the "One file is one document" or the "one page is one document" setting.
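For reference, the kind of splitting described above can be sketched as a list of page ranges. This is only an illustration, not the tool I actually used: `chunk_ranges` is a hypothetical helper, 7,000 stands in for the real page count (which is somewhat higher), and the 100-page chunk size is an assumption based on the 105-page file that did get OCR.

```python
def chunk_ranges(total_pages, chunk_size):
    """Return 1-based, inclusive (first, last) page ranges that cover
    total_pages in chunks of at most chunk_size pages."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


if __name__ == "__main__":
    # Placeholder numbers: 7,000 pages, 100 pages per chunk (both assumed).
    ranges = chunk_ranges(7000, 100)
    print(len(ranges), "chunks; first:", ranges[0], "last:", ranges[-1])
```

Each range could then be handed to whatever PDF splitter is available, so that every uploaded piece stays near the size that has worked with OCR so far.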
Do you have any suggestions? First, I want to separate the documents into relevant groupings. Second, I want to see if there are duplicates. Third, I want to actually read them.