Airflow metadata database size


Antonio Benjumea

Feb 15, 2024, 7:13:43 AM
to cloud-composer-discuss
The metadata database size of the composer instance is growing significantly, though not yet problematic as it is well under 16 GB. In response, we implemented a DAG as suggested by GCP Documentation (Clean up the Airflow database). The DAG was executed yesterday, but we haven't observed any reduction in the size of the Airflow metadata database, as indicated in the monitoring section of the console.

We are unsure whether the lack of reduction is due to the fact that we haven't executed the command `VACUUM FULL`. Is this command executed during maintenance on a scheduled basis, or do we need to run it manually?

Thanks

Rafal Biegacz

Feb 15, 2024, 10:40:36 AM
to Antonio Benjumea, cloud-composer-discuss
Hi Antonio,

The Airflow DB cleanup DAG doesn't clean everything.

Things to take a look at:
a) the table for DAGs
b) the table for Airflow variables

If you are dynamically creating DAGs and/or variables, these tables will grow over time. The DB cleanup DAG cannot delete their contents because it doesn't know which data is still in use and which is no longer needed.
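To see which tables actually account for the space, something like the following may help. This is only a rough sketch: it assumes the metadata database is PostgreSQL and that you can run SQL against it (for example from an Airflow task). `TABLE_SIZE_SQL`, `human_size` and `summarize` are names made up for this example, not part of Airflow or Composer.

```python
from typing import Iterable, List, Tuple

# PostgreSQL statistics view: total on-disk size (table + indexes + TOAST)
# for every user table, largest first.
TABLE_SIZE_SQL = """
SELECT relname, pg_total_relation_size(relid) AS total_bytes
FROM pg_catalog.pg_stat_user_tables
ORDER BY total_bytes DESC
"""


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, capped at GiB."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} GiB"


def summarize(rows: Iterable[Tuple[str, int]], top: int = 5) -> List[str]:
    """Return the `top` largest tables as 'name: size' strings."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)[:top]
    return [f"{name}: {human_size(size)}" for name, size in ranked]


# Inside an Airflow task the rows could be fetched roughly like this
# (assumption: same session pattern as the cleanup DAG):
#   from airflow import settings
#   from sqlalchemy import text
#   rows = settings.Session().execute(text(TABLE_SIZE_SQL)).fetchall()
```

If the DAG and variable tables turn out to be small, the growth is coming from somewhere else and the cleanup DAG's retention settings are the next thing to check.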

Regards, Rafal.

--
You received this message because you are subscribed to the Google Groups "cloud-composer-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cloud-composer-di...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/cloud-composer-discuss/c49fd6b2-6b9d-4bc3-8008-4f31f52245c4n%40googlegroups.com.

Antonio Benjumea

Feb 16, 2024, 4:28:39 AM
to cloud-composer-discuss
Hi Rafal,

Thanks for your answer. 

We are not dynamically creating DAGs and/or variables, so this should not be an issue for us. Also, those tables are small, on the order of kilobytes.

Thanks,
Antonio

sgus...@butterflynetinc.com

Mar 4, 2024, 1:33:05 PM
to cloud-composer-discuss
I had the same experience, so I changed analyze_db() to the following. Note that it will block the database for the duration of the run, so be prepared to handle any task failures that result:

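    from airflow import settings
    from sqlalchemy import text

    def analyze_db():
        session = settings.Session()
        # VACUUM cannot run inside a transaction block, so use autocommit.
        session.connection(execution_options={"isolation_level": "AUTOCOMMIT"})
        session.execute(text("VACUUM FULL ANALYZE"), execution_options={"autocommit": True})
        session.commit()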

Rafal Biegacz

Mar 4, 2024, 3:04:57 PM
to sgus...@butterflynetinc.com, cloud-composer-discuss
BTW - what version of Composer do you use?

At a certain point (at the end of March 2023) we published an updated version of Composer that has a more accurate DB size calculation: https://cloud.google.com/composer/docs/release-notes#March_18_2023

Regards, Rafal


sgus...@butterflynetinc.com

Mar 6, 2024, 2:51:42 PM
to cloud-composer-discuss
I was on composer-2.0.30-airflow-2.3.3.

Rafal Biegacz

Mar 6, 2024, 3:18:17 PM
to sgus...@butterflynetinc.com, cloud-composer-discuss
If you upgrade to a newer version of Composer, the DB size calculation will be more accurate.

In general, the `VACUUM FULL` operation is not performed by Composer itself.

a) a user can run `VACUUM FULL` on their own

b) during an upgrade operation, deleted rows are not transferred to the new DB
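Before running `VACUUM FULL` by hand, it may be worth checking how many dead rows each table is carrying. Below is a minimal sketch, again assuming a PostgreSQL metadata database; `DEAD_TUPLE_SQL`, `dead_ratio` and `vacuum_candidates` are hypothetical names for this example, and the 20% threshold is arbitrary. Treat the result as a rough hint only: once autovacuum has processed a table, its dead-row count drops even though the file on disk has not shrunk.

```python
from typing import Iterable, List, Tuple

# PostgreSQL statistics view: estimated live and dead row counts per user table.
DEAD_TUPLE_SQL = """
SELECT relname, n_live_tup, n_dead_tup
FROM pg_catalog.pg_stat_user_tables
"""


def dead_ratio(live: int, dead: int) -> float:
    """Fraction of rows that are dead; 0.0 for an empty table."""
    total = live + dead
    return dead / total if total else 0.0


def vacuum_candidates(
    rows: Iterable[Tuple[str, int, int]], min_ratio: float = 0.2
) -> List[Tuple[str, float]]:
    """Tables whose dead-row fraction is at least `min_ratio`, worst first."""
    flagged = [(name, dead_ratio(live, dead)) for name, live, dead in rows]
    keep = [item for item in flagged if item[1] >= min_ratio]
    return sorted(keep, key=lambda item: item[1], reverse=True)
```

The rows would come from executing `DEAD_TUPLE_SQL` against the metadata database; tables that show up here right after the cleanup DAG has run are the ones where deleted rows have not yet been vacuumed.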

Regards, rafal.

sgus...@butterflynetinc.com

Mar 19, 2024, 3:56:17 PM
to cloud-composer-discuss
I needed to run `VACUUM` specifically for the upgrade; otherwise it was blocked.

Sergei
