Tasks scheduled but not queued


tomas....@unacast.com

Sep 11, 2018, 7:42:21 AM
to cloud-composer-discuss
For the second day in a row, we are seeing many of our DAGs end up with tasks that are "scheduled" but never move forward to "queued".

We are a little bit stuck on how we can debug this.

tomas....@unacast.com

Sep 12, 2018, 8:06:50 AM
to cloud-composer-discuss
Yeah, so this happened again today. We have no clue what is going on here and would appreciate any help in debugging the issue.

Conrad Lee

Oct 3, 2018, 7:57:11 AM
to tomas....@unacast.com, cloud-compo...@googlegroups.com
I'm constantly running into this issue. Sometimes it helps if I go into the UI, click on the scheduled task, and click on the 'clear' action. Other times it seems the scheduler is frozen and not queueing up any tasks from any DAGs. Sometimes, after an hour or so, the scheduler seems to come back to life and starts queueing and executing tasks again.

This flaky task execution is my main complaint against Cloud Composer so far.

--
You received this message because you are subscribed to the Google Groups "cloud-composer-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cloud-composer-di...@googlegroups.com.
To post to this group, send email to cloud-compo...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloud-composer-discuss/304177be-2044-489c-8a48-689d505fcd79%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tomas Jansson

Oct 3, 2018, 7:58:28 AM
to Conrad Lee, cloud-compo...@googlegroups.com
I’ve upgraded to the latest version and haven’t experienced it in the last 1.5 weeks.
--

 

Tomas Jansson

Sr. Director of Software Engineering

+47 91862293 | @tomasjansson | skype:mastoj

Karl Johans gate 21, 0159 Oslo, Norway


   


Feng Lu

Oct 3, 2018, 9:55:56 AM
to tomas....@unacast.com, con...@parsely.com, cloud-composer-discuss
It was a known issue for beta Composer environments; GA (>= composer-1.0.0) environments should be a lot more stable, as Tomas said.
Since Composer 1.1.0 we have also added an additional error-checking mechanism to fix the rare frozen-scheduler issue in Airflow.
In the upcoming Composer 1.2.0 release (not fully rolled out yet), we have switched the scheduler restart behavior from run-based to time-based (the release notes will be updated to reflect this once it is 100% rolled out). That should fix the issue when you have a straggler DAG (e.g., a DAG that takes significantly longer to process) in the environment.

Please PM me if you need to upgrade your environment manually to the latest release (1.2.0).  
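
To illustrate why a straggler DAG matters under a run-based restart policy, here is a toy model in Python. It is only a sketch under stated assumptions, not Composer's or Airflow's actual implementation: the function names, the loop durations, and the thresholds (600 runs, 600 seconds) are all hypothetical. It assumes the scheduler is restarted either after a fixed number of scheduling loops or after a fixed amount of elapsed time, and compares how long each policy waits before restarting.

```python
def seconds_until_restart_run_based(loop_durations, max_runs):
    """Elapsed time before a restart that fires after `max_runs` scheduler loops."""
    return sum(loop_durations[:max_runs])


def seconds_until_restart_time_based(loop_durations, max_seconds):
    """Elapsed time before a restart that fires once `max_seconds` have passed.

    The check happens between loops, so the restart fires at the end of the
    first loop that brings the elapsed time to or past `max_seconds`.
    """
    elapsed = 0.0
    for duration in loop_durations:
        elapsed += duration
        if elapsed >= max_seconds:
            break
    return elapsed


# Hypothetical numbers: each scheduler loop normally takes 1 second.
healthy = [1.0] * 1000
# Same environment, but a straggler DAG makes every loop take 30 seconds.
straggler = [30.0] * 1000

run_healthy = seconds_until_restart_run_based(healthy, max_runs=600)
run_straggler = seconds_until_restart_run_based(straggler, max_runs=600)
time_healthy = seconds_until_restart_time_based(healthy, max_seconds=600)
time_straggler = seconds_until_restart_time_based(straggler, max_seconds=600)

print(run_healthy, run_straggler)    # run-based: restart interval grows with loop time
print(time_healthy, time_straggler)  # time-based: restart interval stays bounded
```

In this model the run-based interval scales with the per-loop processing time, so a slow DAG postpones the restart for every DAG in the environment, while the time-based interval stays near the configured limit.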

Logu Venkatachalam

Jun 20, 2019, 2:51:33 AM
to cloud-composer-discuss
Same problem. After I triggered the DAG the state says Scheduled, but it never runs. It takes a very long time.

Here is my DAG:

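from datetime import datetime, timedelta

from airflow import DAG

# Defaults applied to every task in the DAG
default_args = {
   'owner': 'airflow',
   'depends_on_past': False,
   'start_date': datetime(2019, 6, 1),
   'email_on_failure': False,
   'email_on_retry': False,
   'retries': 0,
   'retry_delay': timedelta(minutes=30),
}

# schedule_interval=None: the DAG only runs when triggered manually
dag = DAG('dag-name', schedule_interval=None, default_args=default_args, dagrun_timeout=timedelta(minutes=10))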

What is going on?



Rick Otten

Jul 15, 2019, 2:28:13 PM
to cloud-composer-discuss

On Thursday, June 20, 2019 at 2:51:33 AM UTC-4, Logu Venkatachalam wrote:
Same problem. After I triggered the DAG the state says Scheduled, but it never runs. It takes a very long time.


We are seeing this now too for some of our tasks after upgrading to composer-1.7.2-airflow-1.10.2 :-(

I've tried the "clear" trick mentioned earlier in this thread. It didn't help. I also tried bouncing the scheduler:
kubectl get deployment airflow-scheduler --output yaml --namespace=composer-1-7-2-airflow-1-10-2-xxxxxx | kubectl replace --force -f -

I tried deleting the DAGs and then adding them back with new names.  That seemed to help for a while. I'm not convinced that is sufficient since they worked for a while after the upgrade before suddenly getting stuck overnight.



Rick Otten

Jul 17, 2019, 6:44:53 PM
to cloud-composer-discuss

I figured out the root cause of our problem here. It was totally not obvious for the longest time. We had a task pool configured in some of our DAGs.
However, we had accidentally dropped the task pool from the Airflow configuration (Admin/Pools).
So, apparently the default is 0 tasks when you refer to a pool that doesn't exist. Tasks get scheduled, but they never get queued.

There is no warning, no messages, no alerts, nor any other indication in the UI that the scheduled jobs are waiting for an available slot in the pool to open up.

That took me close to 3 days to figure out, and 10 seconds to solve.  gah.
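
Since nothing in the UI flags this, a small check along the following lines might catch it earlier. This is a minimal sketch in plain Python, not an Airflow API: `find_missing_pools` and the sample task and pool names are hypothetical, and in a real environment the inputs would have to be collected from the DAG definitions and from the pool list under Admin/Pools.

```python
def find_missing_pools(task_pools, defined_pools):
    """Return {pool_name: [task_ids]} for pools that tasks reference but that do not exist.

    task_pools: dict mapping task_id -> pool name (or None when no pool is set).
    defined_pools: iterable of pool names that exist under Admin/Pools.
    """
    defined = set(defined_pools)
    missing = {}
    for task_id, pool in sorted(task_pools.items()):
        if pool is not None and pool not in defined:
            missing.setdefault(pool, []).append(task_id)
    return missing


# Hypothetical data for illustration only.
task_pools = {
    "extract": "etl_pool",
    "transform": "etl_pool",
    "report": None,
    "load": "bq_pool",
}
defined_pools = ["bq_pool"]

missing = find_missing_pools(task_pools, defined_pools)
for pool, tasks in missing.items():
    print(f"pool {pool!r} is not defined; tasks that may stay scheduled: {', '.join(tasks)}")
```

Any task listed by such a check is a candidate for the "scheduled but never queued" symptom described above; recreating the pool under Admin/Pools is the fix that worked here.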