
Bug#1015301: gitlab: Upgrade to 15.0.4 fails because BackgroundMigrations could not finalize


Maximilian Stein

Jul 19, 2022, 6:00:03 AM
Package: gitlab
Version: 15.0.4+ds1-1~fto11+2
Severity: important

Dear Maintainer,

I just tried to upgrade to GitLab 15.0.4 on my two instances. The
smaller instance upgraded without problems, but the bigger one ran
into two issues.

First, postinst failed with the message

/etc/systemd/system/gitaly.service.d/override.conf already exist

I moved the file away and the upgrade could continue. At the end of
the upgrade, gitaly did not start anymore, so I restored the file (it
actually contains the setting of the user running gitaly).

Then, however, I stumbled upon a much more serious issue: The database
migration could not finish as there were pending background jobs that
failed to finalize:

gitlab_production database is not empty, skipping gitlab setup
Attention: used pure ruby version of MurmurHash3
/usr/share/gitlab/lib/gitlab.rb:47: warning: already initialized constant Gitlab::APP_DIRS_PATTERN
/usr/share/gitlab/lib/gitlab.rb:47: warning: previous definition of APP_DIRS_PATTERN was here
/usr/share/gitlab/lib/gitlab.rb:48: warning: already initialized constant Gitlab::VERSION
/usr/share/gitlab/lib/gitlab.rb:48: warning: previous definition of VERSION was here
/usr/share/gitlab/lib/gitlab.rb:49: warning: already initialized constant Gitlab::INSTALLATION_TYPE
/usr/share/gitlab/lib/gitlab.rb:49: warning: previous definition of INSTALLATION_TYPE was here
/usr/share/gitlab/lib/gitlab.rb:50: warning: already initialized constant Gitlab::HTTP_PROXY_ENV_VARS
/usr/share/gitlab/lib/gitlab.rb:50: warning: previous definition of HTTP_PROXY_ENV_VARS was here
== 20220213103859 RemoveIntegrationsType: migrating ===========================
rake aborted!
StandardError: An error has occurred, all later migrations canceled:

Gitlab::Database::BackgroundMigration::BatchedMigrationRunner::FailedToFinalize

I did some research, and found somebody else having a similar issue
[1]. Unfortunately, manually running the background jobs [2] did not
work either as they continued to fail. In the database I identified
the following stuck background migrations [3]:


gitlab_production=# select id,status,job_class_name, table_name, column_name, job_arguments from batched_background_migrations where status <> 3;
 id | status |                  job_class_name                  |  table_name  | column_name | job_arguments
----+--------+--------------------------------------------------+--------------+-------------+---------------
 17 |      4 | BackfillNamespaceIdForNamespaceRoute             | routes       | id          | []
 19 |      4 | BackfillMemberNamespaceForGroupMembers           | members      | id          | []
 20 |      4 | MigratePersonalNamespaceProjectMaintainerToOwner | members      | id          | []
 23 |      4 | BackfillGroupFeatures                            | namespaces   | id          | [10000]
 15 |      4 | BackfillIntegrationsTypeNew                      | integrations | id          | []
 16 |      4 | BackfillUserNamespace                            | namespaces   | id          | []
 18 |      4 | BackfillIssueSearchData                          | issues       | id          | []


I then proceeded to simply change the status of the jobs to 3 in the
database as proposed in the issue mentioned above [1]. I could then
finish the upgrade normally with `apt upgrade`.

Afterwards, I simply undid the database change (i.e., reverted the
status of the failed background migrations to 4) and then restarted
the migrations in the web UI.
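
For anyone retracing this, the flip-and-revert sequence can be sketched
roughly as below. This is only an illustration against an in-memory
SQLite stand-in, not the real PostgreSQL database: the helper names are
made up, the row with id 1 is hypothetical, and I am assuming that
status 3 means "finished" and 4 means "failed", as the query output
above suggests. The point is to remember which ids were flipped, so
that only those rows are reverted afterwards.

```python
import sqlite3

FINISHED, FAILED = 3, 4  # assumed meaning of the status values seen above


def statuses(conn):
    """Return {id: status} for all rows, for inspection."""
    return dict(conn.execute(
        "SELECT id, status FROM batched_background_migrations"))


def mark_failed_as_finished(conn):
    """Flip failed migrations to finished and return their ids."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM batched_background_migrations "
        "WHERE status = ? ORDER BY id", (FAILED,))]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "UPDATE batched_background_migrations SET status = ? WHERE id = ?",
            [(FINISHED, i) for i in ids])
    return ids


def restore_failed(conn, ids):
    """Set exactly the previously flipped rows back to failed."""
    with conn:
        conn.executemany(
            "UPDATE batched_background_migrations SET status = ? WHERE id = ?",
            [(FAILED, i) for i in ids])


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE batched_background_migrations "
        "(id INTEGER PRIMARY KEY, status INTEGER, job_class_name TEXT)")
    conn.executemany(
        "INSERT INTO batched_background_migrations VALUES (?, ?, ?)",
        [(1, FINISHED, "HypotheticalFinishedMigration"),  # made-up row
         (15, FAILED, "BackfillIntegrationsTypeNew"),
         (18, FAILED, "BackfillIssueSearchData")])
    flipped = mark_failed_as_finished(conn)
    during = statuses(conn)   # state while the package upgrade would run
    restore_failed(conn, flipped)
    after = statuses(conn)    # state before retrying in the web UI
    return flipped, during, after
```

Against the real database the same two UPDATE statements would be run
in psql; rows that were already finished before the flip stay untouched
by the revert.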

As far as I can tell, everything seems normal now. The seven
background migration jobs are still pending, but I can continue to use
Gitlab normally.

Do you need any more information on the matter?

Thanks for your investigation into the issue!

Best,
Maximilian


[1]: https://gitlab.com/gitlab-org/gitlab/-/issues/340193
[2]: https://docs.gitlab.com/ee/user/admin_area/monitoring/background_migrations.html#manually-finishing-a-batched-background-migration
[3]: https://docs.gitlab.com/ee/update/index.html#batched-background-migrations

Maximilian Stein

Jul 19, 2022, 7:40:04 AM

Update on the background migrations:

All but one of the background migrations have now finished successfully; only "BackfillIntegrationsTypeNew: integrations" remains stuck at failed. This is exactly the migration mentioned in [1], so I guess it might cause trouble in general.

I then tried to migrate manually with `gitlab-rake db:migrate:redo VERSION=20210727113447`. This actually worked; however, the background migration job is still listed on the admin page. So I guess I will simply set the job's state to successful, as I don't experience any issues.

Antoine Le Gonidec

Jul 19, 2022, 1:40:03 PM
I could work around the failure to execute the last remaining migration using the attached patch.

This patch is provided only to help diagnose the underlying issue; I would not suggest applying it as-is.
0001-Work-around-background-migration-failure.patch

Maximilian Stein

Jul 20, 2022, 2:30:03 AM
Thanks for the patch!

Unfortunately, the job still fails for me. I actually get the message:

    PG::UndefinedColumn: ERROR: Column integrations.type does not exist
    LINE 47: AND integrations.type = mapping.old_type ^

So I guess my database is now too new for this migration and I can only
manually set it to "successful"…


Best,

Maximilian