Genie monitors a local process: the one launched by your ‘command’.
If your command submits a job and then immediately terminates (while the actual job is still running on the cluster), then yes, the Genie status won’t be very useful.
Monitoring streaming jobs in Hadoop and Spark is possible.
But the process you launch just needs to have the same lifecycle as the job running on the cluster:
- Execute as long as the job is running on the cluster
- Exit with code 0 if the job succeeded
- Exit with a non-zero code if the job failed
An example is the Hadoop executor.
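For a launcher that already blocks until the job completes, the contract is trivial to satisfy. Here is a minimal Python sketch; the launcher command and its arguments are placeholders, not anything Genie provides:

```python
import subprocess
import sys

# Launch the (placeholder) job command and block until it exits,
# then propagate its exit code so Genie sees success or failure.
result = subprocess.run(["my-blocking-job-launcher", "--arg", "value"])
sys.exit(result.returncode)
```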
spark-submit may not behave this way natively (unless there is some kind of `--wait` option).
So you may need to adapt it by wrapping it in a script that keeps running as long as the job does, and exits with the job’s final status.
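One possible shape for such a wrapper is sketched below. It assumes a YARN cluster with the `yarn` CLI on the PATH, that spark-submit prints the YARN application id in its output, and that `yarn application -status` reports lines like `State : FINISHED` and `Final-State : SUCCEEDED`; the spark-submit arguments are placeholders, and the parsing will likely need adjusting for your environment:

```python
import re
import subprocess
import sys
import time

# Hypothetical wrapper: submit the job, then poll YARN until it finishes.
# All spark-submit arguments below are placeholders for your own job.
submit = subprocess.run(
    ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster",
     "--conf", "spark.yarn.submit.waitAppCompletion=false",  # return right after submission
     "my_job.py"],
    capture_output=True, text=True,
)

# Pull the YARN application id out of spark-submit's output; this assumes
# the usual "application_<cluster>_<seq>" id appears in the logs.
match = re.search(r"application_\d+_\d+", submit.stdout + submit.stderr)
if match is None:
    sys.exit(1)  # submission failed, or the log format changed
app_id = match.group(0)

# Stay alive for as long as the job runs, so Genie tracks the real lifecycle.
while True:
    report = subprocess.run(
        ["yarn", "application", "-status", app_id],
        capture_output=True, text=True,
    ).stdout
    if "State : FINISHED" in report:
        # Exit 0 only if YARN reports the job actually succeeded.
        sys.exit(0 if "Final-State : SUCCEEDED" in report else 1)
    if "State : FAILED" in report or "State : KILLED" in report:
        sys.exit(1)
    time.sleep(30)  # poll interval; tune as needed
```

The key property is simply that the wrapper process lives and dies with the cluster job, which is all Genie needs to report status correctly.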