Hello Matteo,
Thanks for checking in on this. I'm going to give some background followed by the next steps.
Background
It looks like we got to the bottom of this. Not all operations support partial failure, and batch jobs only accept operations that do, because a batch job retries operations for you when they initially fail. In some of the cases where a job takes a long time, we're seeing operations that don't support partial failure. For example, this is an error we see in our stack traces:
This operation cannot be used with "partial_failure"., at mutate_operations[2].conversion_goal_campaign_config_operation.
MutateConversionGoalCampaignConfigsRequest (https://developers.google.com/google-ads/api/reference/rpc/v10/MutateConversionGoalCampaignConfigsRequest) does not have partial_failure as an option. This means that this operation is not supported in batch jobs.
Next Steps
On our side, we'll work towards failing faster when we see operations in a job that do not support partial failure. On your side, you can double-check that your jobs don't contain any operations that lack partial failure support; that should reduce the number of batch jobs you're seeing that run for long periods of time.
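As a rough illustration of that check on your side, here is a minimal Python sketch. It assumes each operation can be represented as a plain dict keyed by the MutateOperation field that is set. The function name find_unsupported and the denylist are hypothetical and not part of any client library; the only entry taken from the error above is conversion_goal_campaign_config_operation, so you would need to extend the list by checking the reference page of each request type for partial_failure.

```python
# Hypothetical denylist of MutateOperation fields that cannot be used with
# partial_failure. Only this one entry is confirmed by the error above;
# extend it after checking the reference docs for each request type.
UNSUPPORTED_OPERATION_FIELDS = {
    "conversion_goal_campaign_config_operation",
}


def find_unsupported(operations):
    """Return (index, field_name) pairs for operations that set a denylisted field.

    `operations` is assumed to be a list of dicts, each keyed by the name of
    the MutateOperation field it sets.
    """
    problems = []
    for index, operation in enumerate(operations):
        for field_name in operation:
            if field_name in UNSUPPORTED_OPERATION_FIELDS:
                problems.append((index, field_name))
    return problems


# Placeholder payloads; a real job would carry full operation bodies.
operations = [
    {"campaign_budget_operation": {"create": {}}},
    {"campaign_operation": {"create": {}}},
    {"conversion_goal_campaign_config_operation": {"update": {}}},
]

for index, field_name in find_unsupported(operations):
    print(f"mutate_operations[{index}] uses {field_name}")
```

Running a check like this before adding operations to a batch job would let you send the flagged ones through their regular mutate request instead.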
Regards,