Hi Parthasarathi!
About the issue of one subject URI for one Title/Abstract document: My gut feeling is that this is not a problem for training a project, if you are able to get a sufficient number of training documents. For a new text Annif will anyway give multiple subject suggestions, just like you mentioned a human indexer would give.
However, I think training on documents with a single subject means that you need more of them to get a project of the same quality as one trained on documents with multiple subjects. But maybe you will have quite a lot of documents, as you mentioned that you are collecting them from multiple databases. In any case, it is beneficial to have multiple documents for each subject, if possible.
Another concern could arise if you used such single-subject documents for evaluating a project. Annif gives 10 subject suggestions for a text by default, so for such documents at least 9 of the suggestions would always be considered "wrong" (false positives), and most of the metrics given by Annif's eval command would not be meaningful. Maybe you have a separate set of documents, with multiple subjects per document, that you can use for evaluating your final project. Having some test set is highly desirable, so that you know how well a project is performing and whether a change to the project makes it better or worse.
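To make the evaluation concern concrete, here is a small sketch of the arithmetic. It is a plain set-based calculation of precision, recall and F1, not Annif's own implementation, and the example.org URIs are made up for illustration.

```python
def precision_recall_f1(suggested, gold):
    """Set-based precision, recall and F1 for one document."""
    suggested, gold = set(suggested), set(gold)
    hits = len(suggested & gold)
    precision = hits / len(suggested) if suggested else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical case: 10 suggestions, and the single gold subject is among them.
suggested = [f"http://example.org/subj{i}" for i in range(10)]
gold = ["http://example.org/subj0"]

p, r, f1 = precision_recall_f1(suggested, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Even in this best case the precision cannot exceed 0.1, because only one of the ten suggestions can ever match, so the scores say little about the real quality of the suggestions.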
Finally, I wonder if you could somehow merge back the "instances" of the same document that were separated by subject in the collecting process. I assume the documents in the online databases usually have multiple subjects. You could match the instances using the DOI, or just the whole title+abstract content. This would at least reduce the disk size of the corpus, if nothing else.
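A minimal sketch of such a merge, assuming the collected data is available as (text, subject URI) pairs and that instances of the same document have the same title+abstract text apart from whitespace and letter case. The function names and example.org URIs are hypothetical. The output line layout (text, a tab, then the URIs in angle brackets separated by spaces) is my assumption of the short-text corpus format, so please check it against the Annif corpus format documentation.

```python
def merge_instances(rows):
    """Group (text, subject_uri) pairs that share the same text.

    Returns a list of (text, [uris]) in first-seen order.
    """
    merged = {}
    for text, uri in rows:
        normalized = " ".join(text.split())  # collapse tabs, newlines, double spaces
        key = normalized.lower()             # case-insensitive matching key
        entry = merged.setdefault(key, (normalized, []))
        if uri not in entry[1]:
            entry[1].append(uri)
    return list(merged.values())


def to_tsv_line(text, uris):
    """Format one document: text first, then a tab, then the subject URIs (assumed layout)."""
    return text + "\t" + " ".join(f"<{uri}>" for uri in uris)


rows = [
    ("Title A. Abstract a.", "http://example.org/s1"),
    ("Title A.  Abstract a.", "http://example.org/s2"),  # same document, extra space
    ("Title B. Abstract b.", "http://example.org/s3"),
]

merged = merge_instances(rows)
for text, uris in merged:
    print(to_tsv_line(text, uris))
```

If the DOI is available it would be a more reliable key than the text; in that case the DOI could be used as the dictionary key instead of the normalized text.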
One point to consider is whether using the preflabels for fetching articles could lead to inconsistencies (e.g. the preflabel "rock" could result in articles about both the music genre and stones). For this reason querying the articles by URI would be optimal, but that may not be possible.
Also, I noticed that the columns in the example document for the preflabel "Gendercide" are the wrong way around: in the short-text document corpus format the document text is in the first column, and the subject URIs are in the second column, after the tab.
I think you have already seen the Jupyter notebook in the Annif tutorial for creating a custom corpus, but for reference I link it here:
https://github.com/NatLibFi/Annif-tutorial/blob/master/data-sets/arxiv/create-arxiv-corpus.ipynb
-Juho