Dear Robina,
1.) Using a corpus in XML format is not possible. You need to convert the documents to a TSV file (short-text corpus) or to TXT and TSV files in a directory (full-text corpus): see this Annif wiki page.
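As a rough illustration of such a conversion, here is a minimal sketch. The XML layout (`<document>`, `<text>` and `<subject>` elements) is purely hypothetical, since I don't know what your files look like, and the output follows my understanding of the short-text TSV format (document text, a tab, then subject URIs in angle brackets); please check the wiki page for the exact rules.

```python
import xml.etree.ElementTree as ET


def xml_to_tsv_lines(xml_text):
    """Yield one TSV line per <document>: text, a tab, then subject URIs."""
    root = ET.fromstring(xml_text)
    # "document", "text" and "subject" are assumed element names
    for doc in root.iter("document"):
        # Collapse tabs and newlines so each document stays on one line
        text = " ".join((doc.findtext("text") or "").split())
        uris = " ".join(
            f"<{subj.text.strip()}>" for subj in doc.iter("subject") if subj.text
        )
        yield f"{text}\t{uris}"


# Made-up sample input, only for demonstration
SAMPLE = """<corpus>
  <document>
    <text>Cats and dogs
    as pets</text>
    <subject>http://example.org/cats</subject>
    <subject>http://example.org/dogs</subject>
  </document>
</corpus>"""

if __name__ == "__main__":
    for line in xml_to_tsv_lines(SAMPLE):
        print(line)
```

Writing the yielded lines to a file with a `.tsv` extension should then give something you can try as a short-text corpus.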
2.) How to organize experimentation is a good question. It can have a very big impact on how easy it is to run and track experiments, but I don't have a good answer apart from "try things out and do what best suits your workflow".
I would probably set up individual project configurations for e.g. different analyzers (for which there is a limited number of choices), but not for corpus sizes (which can take arbitrarily many values). Separate project configurations also allow running experiments in parallel more confidently. (The configuration is read when an operation starts, so in principle it is possible to change parameters once one operation has started and then start another one, but I don't think that is a good idea in practice.)
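If you end up with several near-identical projects, you could generate the configuration instead of editing it by hand. This is only a sketch: the project ids, the vocabulary id `my-vocab`, and the choice of the `tfidf` backend with two analyzers are placeholders I made up, so adjust them to your own setup.

```python
import configparser
import io

# Hypothetical analyzer variants to compare: label -> analyzer setting
ANALYZERS = {"snowball": "snowball(english)", "simple": "simple"}


def build_projects_config(analyzers):
    """Create one project section per analyzer variant."""
    config = configparser.ConfigParser(interpolation=None)
    for label, analyzer in analyzers.items():
        # The section name acts as the project id
        config[f"tfidf-{label}"] = {
            "name": f"TF-IDF with {label} analyzer",
            "language": "en",
            "backend": "tfidf",
            "analyzer": analyzer,
            "vocab": "my-vocab",  # placeholder vocabulary id
            "limit": "100",
        }
    return config


if __name__ == "__main__":
    buffer = io.StringIO()
    build_projects_config(ANALYZERS).write(buffer)
    print(buffer.getvalue())
```

The printed text can be saved as a projects configuration file, giving one project per analyzer that can be trained and evaluated independently.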
Note that Annif commands have the --backend-param/-b option to override (most) values set up in a configuration file, which can be helpful.
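For example, a small helper can assemble the command line for an evaluation run with overridden parameters. This is a sketch under assumptions: the project id, corpus path and the `tfidf.limit` override are made-up examples, and I am assuming the `<backend>.<parameter>=<value>` syntax for the option value, so check `annif eval --help` before relying on it.

```python
import shlex


def build_eval_command(project_id, corpus_path, overrides):
    """Return the argument list for an `annif eval` run with overrides."""
    command = ["annif", "eval", project_id]
    for key, value in overrides.items():
        # Assumed syntax: --backend-param <backend>.<parameter>=<value>
        command += ["--backend-param", f"{key}={value}"]
    command.append(corpus_path)
    return command


if __name__ == "__main__":
    # Hypothetical project id, corpus path and override
    cmd = build_eval_command("tfidf-snowball", "corpus.tsv", {"tfidf.limit": 50})
    print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` to sweep over parameter values without touching the configuration file.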
For keeping track of evaluation scores etc. we use online spreadsheets.
However, if you are setting up a larger set of projects that will be used for a longer time, I recommend taking a look at DVC pipelines. While it adds complexity to the initial setup, in the long run it helps in many ways.
-Juho