Dear Annif team,
during our last series of tests with the SVC backend, I made a typo in projects.cfg and would like to ask whether our observation is correct.
If a parameter name is misspelled (in our case min_df), is that line ignored and the default value, if there is one, used instead?
If so, is there an option to stop the process when projects.cfg contains an invalid line? And is there a way to display all the parameters with which a model was actually trained?
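As a stopgap on our side, a pre-check along the following lines could catch misspelled keys before training starts. This is only a minimal sketch: the allow-list KNOWN_KEYS, the function names and the sample project section are our own assumptions for illustration, not Annif's authoritative list of options or part of its API.

```python
import configparser

# Hypothetical allow-list of parameter names we expect in our own projects.cfg.
# This is an assumption for illustration, not Annif's official set of options.
KNOWN_KEYS = {"name", "language", "backend", "analyzer", "vocab", "limit", "min_df", "ngram"}

# Hypothetical project section containing a misspelled key ("mindf").
SAMPLE_CFG = """
[svc-test]
name=SVC test
language=de
backend=svc
analyzer=simplemma
vocab=test-vocab
mindf=2
"""


def find_unknown_keys(cfg_text, known_keys=KNOWN_KEYS):
    """Return (section, key) pairs whose key is not in the allow-list."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    unknown = []
    for section in parser.sections():
        for key in parser[section]:
            if key not in known_keys:
                unknown.append((section, key))
    return unknown


def check_config(cfg_text, known_keys=KNOWN_KEYS):
    """Raise ValueError on unknown keys so a wrapper script can abort early."""
    unknown = find_unknown_keys(cfg_text, known_keys)
    if unknown:
        details = ", ".join(f"[{section}] {key}" for section, key in unknown)
        raise ValueError(f"Unknown parameter(s) in projects.cfg: {details}")


if __name__ == "__main__":
    print(find_unknown_keys(SAMPLE_CFG))
```

Calling check_config on the file contents before starting the training would stop a wrapper script at the misspelled line instead of silently continuing with a default.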
The typo described above led us to another problem. When using the SVC backend together with the simplemma analyzer, we observed an error that we cannot quite explain.
In our test case we train on about 100 classes. As training material we use about 624,000 digital tables of contents and about 298,000 full texts, truncated to 30,000 characters each.
With min_df set to 2, everything is fine and the training completes successfully. With the default value of 1, however, the training stops after a few hours and we get this output:
Backend svc: creating vectorizer
Backend svc: creating classifier
Command terminated by signal 11
Do you have any idea what could be the reason for the termination?
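For context, this is how we understand the effect of min_df. It is a minimal sketch in plain Python, assuming the parameter behaves like the integer min_df of scikit-learn's vectorizers (a term is kept only if it occurs in at least min_df documents). We have not verified how Annif applies it internally, and the function name and toy documents are made up for illustration.

```python
from collections import Counter


def vocabulary(docs, min_df=1):
    """Return the terms that occur in at least min_df documents."""
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document (document frequency, not term frequency).
        doc_freq.update(set(doc.lower().split()))
    return sorted(term for term, n in doc_freq.items() if n >= min_df)


# Toy documents; "xqzt17" stands in for a one-off token such as an OCR error.
docs = [
    "inhaltsverzeichnis kapitel einleitung",
    "kapitel methoden xqzt17",
    "einleitung kapitel ergebnisse",
]

if __name__ == "__main__":
    print(len(vocabulary(docs, min_df=1)))  # all 6 distinct terms are kept
    print(vocabulary(docs, min_df=2))       # only terms seen in 2 or more documents
```

With roughly 922,000 training documents, the number of terms that occur in only one document could be very large, so min_df=1 may produce a much bigger feature space than min_df=2. Whether that is related to the signal 11 termination we cannot tell.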
Best regards from the German National Library
Frank
Hi Frank,
Best regards,
Annif-team