
Planned Taskcluster Migration Upcoming


Dustin Mitchell

Aug 6, 2019, 2:48:36 PM
to dev-platform, tools-taskcluster
*tl;dr:* Taskcluster, the platform supporting Firefox CI, will be moving to
a new hosting environment during the tree-closing window at the end of
September. This is a major change and upgrade to the platform, but may
cause some bumps along the way.

Taskcluster is in the midst of a few migrations [1], and one of those is
moving to a new hosting environment. This has a few advantages:
* Firefox CI will be handled by dedicated teams, separate from the TC team:
  * Service operations will be managed by cloudops
  * Administration will be managed by releng
  * Workers will be managed by relops
* Hosting will be in a more trusted environment (GCP instead of Heroku)
* The services will have a new, faster UI
* Worker provisioning will be faster and more reliable, and cover more
  cloud environments
* ...lots of other achievements unlocked by shedding 5+ years of legacy

The downside is that URLs will be changing, likely leaving some dangling
pointers. We've anticipated some of the most common (such as
https://queue.taskcluster.net/ URLs) and are planning workarounds, but
nonetheless anticipate some dead links for a while following the switch.

This work is tracked in bug 1546801 [2]. If you have questions or
concerns, please get in touch with me or anyone on the team, or file a bug
in Taskcluster :: General and we will triage it appropriately.

Dustin

[1]
https://mana.mozilla.org/wiki/pages/viewpage.action?spaceKey=TAS&title=Migrations
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1546801

Chris Cooper

Nov 4, 2019, 5:00:17 PM
tl;dr:

Taskcluster, the platform supporting Firefox CI, will be moving to a new hosting environment during the tree closing window (TCW) this coming Saturday, Nov 9. Trees will be closed from 14:00 UTC to 23:00 UTC. CI services will be available as soon as possible thereafter, pending verification of the new setup.

More verbose:

While originally scheduled for September, the Taskcluster hosting migration was pushed back to the upcoming TCW in November, both to allow more bake time for the new worker provisioning system and to give community projects more time to prepare for migration.

The primary driver for this migration was a separation of concerns between Firefox CI and all the other various Mozilla projects that also require CI. After this migration, there will be two production Taskcluster deployments where right now there is one:

1. Firefox CI - This cluster needs to be secure and optimized for the release process of Firefox and associated products (mobile browsers, etc). We have multiple teams responsible for shepherding Firefox code through the release process, and each team will be responsible for a different aspect of the Firefox CI deployment:

Deploying and running Taskcluster services - CloudOps
Worker management and imaging - RelOps
Cluster admin (granting scopes, roles, etc) - RelEng
Software issues with Taskcluster itself - Taskcluster team

2. Community projects - These projects might be small one-offs or might be speculative endeavors that may eventually become products of their own. Generally these projects are running on github. There is a greater capacity and tolerance for experimentation in this deployment, and projects are able to self-administer. The breakdown of responsibilities for the Community cluster is as follows:

Deploying and running Taskcluster services - CloudOps
Worker management and imaging - Taskcluster team, or individual projects
Cluster admin (granting scopes, roles, etc) - Taskcluster team, or individual projects
Software issues with Taskcluster itself - Taskcluster team

As a result of this migration, root URLs will be different for each cluster:

Firefox CI: https://firefox-ci-tc.services.mozilla.com
Community: https://community-tc.services.mozilla.com

Tools like treeherder and in-tree references will be updated to use these new URLs automatically as part of the downtime. Existing URLs, e.g. links to artifacts, will continue to work once the current monolithic deployment is put into read-only mode on Monday, Nov 11.
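
For scripts that hard-code the old per-service hostnames, a rough sketch of rewriting a legacy URL onto one of the new root URLs might look like the following. The root URLs are the ones listed above; the `/api/<service>/...` path layout, the deployment keys, and the helper name `rewrite_legacy_url` are assumptions for illustration and are not stated in this announcement, so check the deployment's own documentation before relying on them.

```python
from urllib.parse import urlsplit

# Root URLs as listed in the announcement. The dictionary keys are
# hypothetical labels chosen for this sketch.
ROOT_URLS = {
    "firefox-ci": "https://firefox-ci-tc.services.mozilla.com",
    "community": "https://community-tc.services.mozilla.com",
}

LEGACY_DOMAIN = "taskcluster.net"


def rewrite_legacy_url(url: str, deployment: str) -> str:
    """Map https://<service>.taskcluster.net/<path> onto a new root URL.

    ASSUMPTION: on the new deployments each service is reachable under
    <rootUrl>/api/<service>/<path>. URLs that do not match the legacy
    per-service hostname pattern are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    suffix = "." + LEGACY_DOMAIN
    if not host.endswith(suffix):
        return url  # not a legacy service URL; leave it alone
    service = host[: -len(suffix)]  # e.g. "queue"
    new_url = f"{ROOT_URLS[deployment]}/api/{service}{parts.path}"
    if parts.query:
        new_url += "?" + parts.query
    return new_url
```

For example, `rewrite_legacy_url("https://queue.taskcluster.net/v1/task/abc", "firefox-ci")` would yield a URL under `https://firefox-ci-tc.services.mozilla.com/api/queue/`, while a URL on any other host passes through untouched.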

If you notice anything broken after the weekend, please file a bug. Here is a list of the best Product::Components to use in Bugzilla, depending on what is broken:

Deploying and running Taskcluster services - Cloud Services::Operations: Taskcluster
Worker management and imaging - Taskcluster::Workers
Cluster admin - Release Engineering::Firefox-CI Administration
Software issues with Taskcluster itself - Taskcluster::Services
Anything else - Taskcluster::General

Please be sure to indicate whether you're on the Firefox CI or Community deployment.

The bug tracking this work is here: https://bugzilla.mozilla.org/show_bug.cgi?id=1546801

You can find the Taskcluster team on Slack or IRC in #taskcluster if you have concerns or questions.

cheers,
--
coop

Chris Cooper

Nov 4, 2019, 5:08:38 PM
to tools-ta...@lists.mozilla.org, dev-pl...@lists.mozilla.org, dev-b...@lists.mozilla.org

Simon Sapin

Nov 4, 2019, 5:29:05 PM
to dev-pl...@lists.mozilla.org
On 04/11/2019 23:00, Chris Cooper wrote:
> Existing URLs, e.g. links to artifacts, will continue to work once
> the current monolithic deployment is put into read-only mode on
> Monday, Nov 11.
How long are these read-only URLs expected to be maintained after that?

--
Simon Sapin

James Graham

Nov 8, 2019, 11:47:24 AM
to dev-pl...@lists.mozilla.org, tools-ta...@lists.mozilla.org
On 04/11/2019 22:00, Chris Cooper wrote:
> tl;dr:
>
> Taskcluster, the platform supporting Firefox CI, will be moving to a new hosting environment during the tree closing window (TCW) this coming Saturday, Nov 9. Trees will be closed from 14:00 UTC to 23:00 UTC. CI services will be available as soon as possible thereafter, pending verification of the new setup.

I'd just like to call out how great the TaskCluster team have been at
helping projects move to the community instance.

We have just finished moving the upstream wpt testing to the new setup,
and the TaskCluster team proactively made sure we were aware the change
was happening, tried to identify likely issues in our setup, filed PRs
for the most obvious changes, and made sure we were on track to finish
by the deadline. They also helped us work through teething issues once
we were running on the new instance.

Overall the experience has been as good as you could wish for with a
large infrastructure change. Thanks to everyone involved; I hope the
Gecko migration goes just as smoothly!