I read a little about Luigi in its documentation, and it makes a lot more sense now. Here is some feedback on the Qanta documentation, as well as a question:
1) I know that Qanta is meant to be run on AWS, but a quick start guide for non-AWS users would be helpful as well. This could be added later. Most of the information is already in the current readme, but some of it is easy to overlook, such as the crucial “Qanta on Path” section.
2) The “running on batch mode” instructions should be updated. In step 2, the “CreateGuesses” task is missing from the pipeline itself, and in step 3, the “AllSummaries” task needs to be updated according to a comment in the file itself. I’m not sure whether the “AllSummaries” task works as expected or whether that comment should be deleted.
3) This is pretty minor, but it might be helpful to mention adding “--local-scheduler” to the Luigi targets for those running Qanta without Spark.
4) Could I get some documentation for the Luigi parameters in the TrainGuesser and GenerateGuesses tasks? It seems that GenerateGuesses specifies TrainGuesser as a dependency, so is there any way to save the model and reuse it for GenerateGuesses?