Hi Parthasarathi,
That's a good question. I'll try to give some information and background on options a) and b).
a) Deleting the old vocabulary and loading the new
Note that this can now (since Annif 0.57) be done in one command by using the "--force" option of the loadvoc command, for example:
annif loadvoc project-id path/to/vocabulary.ttl --force
If a project that uses the vocabulary in question has already been trained, it needs to be retrained afterwards. Otherwise the suggestions that the project gives could be wrong, because the project itself does not know that the vocabulary has changed.
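As a rough sketch, the two steps for option a could be wrapped in a small shell function like the one below. The function name, the argument order and the paths are just placeholders of mine; only the loadvoc and train commands themselves come from Annif.

```shell
# Hypothetical helper (not part of Annif): replace the vocabulary, then retrain.
# Arguments: $1 = project id, $2 = vocabulary file, $3 = training corpus path.
reload_and_retrain() {
    # Delete the old vocabulary and load the new one in one step (Annif 0.57+).
    annif loadvoc "$1" "$2" --force || return 1
    # Retrain so that the project's model matches the new vocabulary.
    annif train "$1" "$3"
}

# Example call with placeholder paths:
# reload_and_retrain project-id path/to/vocabulary.ttl path/to/training-corpus
```

The function stops before training if loading the vocabulary fails, so the project is never retrained against a half-loaded vocabulary.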
b) Update the vocabulary with the new one
This means running loadvoc without "--force". Quoting the Annif wiki (https://github.com/NatLibFi/Annif/wiki/Commands#load-vocabulary):
"If a vocabulary has already been loaded, reinvoking loadvoc with a new subject file will update the Annif's internal vocabulary: label names are updated and any subject not appearing in the new subject file is removed. Note that new subjects will not be suggested before the project is retrained with the updated vocabulary."
So this option also requires retraining the project if one wants the newly added subjects to appear in the Annif suggestions.
The bottom line is: option b is usually better, because it does not necessitate retraining the project (if one is happy with just the updated label names and removed subjects).
However, in your case, as you have not yet trained the Annif project, I would go with option a (it keeps the internal vocabulary slightly simpler/shorter and gives you a "clean start").
By the way: as you are preparing the training dataset, it is best to use the new vocabulary there too, so that the dataset contains some example documents that have the newly added subjects assigned to them.
-Juho