Moving some old builds from Mozilla's FTP server to an offline archive

Ben Hearsum

Jun 17, 2009, 3:17:10 PM
(Cross posting widely, please reply to mozilla.dev.builds)

Hi Everyone,

Over the course of the past few years we've greatly increased the number
of builds and other files we push to Mozilla's FTP server. Despite
having a vast amount of space available we've nearly filled it up. We'd
like to recover quite a bit of space in one shot by archiving all
Firefox, Thunderbird, and SeaMonkey builds from 2006 and earlier to an
offline backup.

For Firefox, this means:
* up to 1.5.0.10pre nightlies on the 1.8.0 branch
* up to 2.0.0.2pre nightlies on the Mozilla 1.8 branch
* up to 3.0a2pre on the 1.9 branch.

Both 1.5.0.x and 2.0.0.x are de-supported at this point, so I don't
expect anybody needs them. I also don't expect anyone needs 3.0a2pre builds.

For Thunderbird, this is:
* up to 1.5.0.10pre nightlies on the 1.8.0 branch
* up to 2.0b1pre nightlies on the 1.8 branch
* up to 3.0a1pre nightlies on the 1.9 branch.

1.5.0.x is de-supported at this point for Thunderbird, and 2.0b1pre and
3.0a1pre are pretty ancient and I suspect no one looks at them regularly.

For SeaMonkey, this is:
* 1.0.7pre nightlies on the 1.8.0 branch
* 1.1pre nightlies on the 1.8 branch
* 1.5a1pre nightlies on the 1.9.0 branch

1.0.x is unsupported at this point for SeaMonkey, and both 1.1pre and
1.5a1pre are ancient; as with Thunderbird, I suspect they aren't often
looked at.

Note that this is only nightly builds we're talking about - releases
(including alphas and betas) will remain on FTP.

Does anyone have a practical use for these builds on a regular basis? Are
there other strong objections to this?

Note that if we do need to retrieve older builds for some reason they
can be put online again (temporarily) by filing an IT bug.

I'll give folks a day or two to respond to this before we go ahead with it.

- Ben

Samuel Sidler

Jun 17, 2009, 3:58:22 PM
> Does anyone have a practical use for builds on a regular basis? Are
> there other strong objections to this?

Not on my end. All of the Firefox and Thunderbird builds you mentioned
seem fine to clear out (can't speak to Thunderbird 3, but everything
else...).

-Sam

Phil Ringnalda

Jun 17, 2009, 4:26:26 PM
On 6/17/2009 12:17 PM, Ben Hearsum wrote:

> Over the course of the past few years we've greatly increased the number
> of builds and other files we push to Mozilla's FTP server. Despite
> having a vast amount of space available we've nearly filled it up. We'd
> like to recover quite a bit of space in one shot by archiving all
> Firefox, Thunderbird, and SeaMonkey builds from 2006 and earlier to an
> offline backup.
>
> For Firefox, this means:
> * up to 1.5.0.10pre nightlies on the 1.8.0 branch
> * up to 2.0.0.2pre nightlies on the Mozilla 1.8 branch
> * up to 3.0a2pre on the 1.9 branch.
>
> Both 1.5.0.x and 2.0.0.x are de-supported at this point, so I don't
> expect anybody needs them. I also don't expect anyone needs 3.0a2pre
> builds.


Do you mean "all builds" or "all branch builds"?

I don't think I'd miss branch builds on unsupported branches - it can be
handy to have a separate branch regression window when it's not clear
which checkin in a trunk window is at fault, but having that work out is
fairly rare.

But if you mean all builds, including trunk, I'd really want to know
just how much money we're talking about saving by throwing away the
opportunity that we currently have, for anyone at all who cares about a
regression to vastly, massively increase the odds of that regression
being fixed by getting a one day regression window. It's nice that Ria
apparently has a private mirror of ftp.mo and can pinpoint any Firefox
regression in minutes, but that's only Firefox and Core, and she could
get tired of us at any time.

Ben Hearsum

Jun 17, 2009, 5:01:28 PM
to Simon Paquet
On 17/06/09 4:01 PM, Simon Paquet wrote:

> Ben Hearsum wrote:
>
>> (Cross posting widely, please reply to mozilla.dev.builds)
>>
>> Hi Everyone,
>>
>> Over the course of the past few years we've greatly increased the number
>> of builds and other files we push to Mozilla's FTP server. Despite
>> having a vast amount of space available we've nearly filled it up. We'd
>> like to recover quite a bit of space in one shot by archiving all
>> Firefox, Thunderbird, and SeaMonkey builds from 2006 and earlier to an
>> offline backup.
>
> Ben,
>
> if you want to, you can archive old Sunbird and Lightning nightly builds
> from 2006 as well. I seriously doubt that anyone still needs those
> builds. They are currently all located in
>
> http://ftp.mozilla.org/pub/mozilla.org/calendar/sunbird/nightly/2006/
> http://ftp.mozilla.org/pub/mozilla.org/calendar/lightning/nightly/2006/
>
> That should free up additional 40GB to 60GB (rough estimate from me) on
> the FTP server.
>
> Let me know what you think.
>
> Simon Paquet

Sounds good to me. I'll add it to the list.
https://bugzilla.mozilla.org/show_bug.cgi?id=496876, by the way.

Ben Hearsum

Jun 17, 2009, 5:03:08 PM
to Phil Ringnalda
On 17/06/09 4:26 PM, Phil Ringnalda wrote:
> But if you mean all builds, including trunk, I'd really want to know
> just how much money we're talking about saving by throwing away the
> opportunity that we currently have, for anyone at all who cares about a
> regression to vastly, massively increase the odds of that regression
> being fixed by getting a one day regression window. It's nice that Ria
> apparently has a private mirror of ftp.mo and can pinpoint any Firefox
> regression in minutes, but that's only Firefox and Core, and she could
> get tired of us at any time.

I'm not involved in regression hunting, so forgive the uninformed
question, but how often do we have to go as far back as 3.0a2pre to find
a regression range?

Note that neither mozilla-central nor mozilla-1.9.1 existed in 2006, so
we're not archiving any of those builds.

Phil Ringnalda

Jun 17, 2009, 6:00:18 PM
On 6/17/2009 2:03 PM, Ben Hearsum wrote:

> I'm not involved in regression hunting, so forgive the uninformed
> question, but how often do we have to go as far back as 3.0a2pre to find
> a regression range?


All the time. It's hardly a perfect example since nobody bothered to
look for a range, but bug 321783, which I patched this morning, would
have been pretty obvious if someone had found the window in June 2004
(three days wide, since a couple of nightlies are missing) at any time
between when it was reported in 2005 and when it finally got found the
other way: someone running in a debugger and then noticing that,
unlikely though it seems, it had to do with stylesheet loading and
something that was in Thunderbird but not in SeaMonkey.

Before 3.0a2pre? That would include nsIThreadManager, which has only had
one regression added so far this year, but had 14 added last year, two
years after it landed, and refactor-reflow, which depending on how you
count has picked up between four and six regressions so far this year.

Unfortunately, I can't find a useful way to query bugzilla for either
things in a date range that caused regressions, or for regressions
caused by old things, so I'm left with nothing but anecdotes like bug
427060 or bug 491180 (a nice substitute query is "commenter is ria and
comment includes 2006") and the fact that even ignoring the hundreds of
older regression bugs that still lack a window, nobody would be the
least bit surprised to hear that something newly reported regressed
before 2007.

Boris Zbarsky

Jun 17, 2009, 6:53:41 PM
Ben Hearsum wrote:
> Does anyone have a practical use for builds on a regular basis?

Sure, for finding regression ranges. I tend to use my local archive,
but others actually use the FTP server for this.

Can we just move these builds to a different server or something?

-Boris

Hartmut Figge

Jun 17, 2009, 7:03:48 PM
Boris Zbarsky:
>Ben Hearsum wrote:

>> Does anyone have a practical use for builds on a regular basis?
>
>Sure, for finding regression ranges.

Yes.

>I tend to use my local archive, [...]

So do I. My private builds of nightlies are in directories named like
this one, 0602010118. Enough information, but after finding the range I
have to transform the date to PDT. *g*
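
A minimal sketch of that conversion, under loud assumptions: that the directory name encodes YYMMDDHHMM, that the stamp is in UTC+1, and that the target can be approximated by a fixed UTC-8 offset (Pacific standard time, which is what applies in February). None of this is stated above, and `dir_to_pacific` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Assumptions: the name is YYMMDDHHMM, stamped in UTC+1, and the target is a
# fixed UTC-8 offset. Daylight saving time is ignored on both sides.
LOCAL = timezone(timedelta(hours=1))
PACIFIC = timezone(timedelta(hours=-8))

def dir_to_pacific(name):
    """Parse a build directory name and return the time in the Pacific offset."""
    stamp = datetime.strptime(name, "%y%m%d%H%M").replace(tzinfo=LOCAL)
    return stamp.astimezone(PACIFIC)

converted = dir_to_pacific("0602010118")
print(converted.strftime("%Y-%m-%d %H:%M"))
```

With fixed offsets the result is only right while both sides are on standard time; a real tool would need proper time zone data.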

Hartmut

Boris Zbarsky

Jun 17, 2009, 7:08:47 PM
Ben Hearsum wrote:
> I'm not involved in regression hunting, so forgive the uninformed
> question, but how often do we have to go as far back as 3.0a2pre to find
> a regression range?

Very very commonly. Anything that regressed between Firefox 2 and
Firefox 3, say.

About one time in ten when I do regression hunting I end up having to go
earlier than Firefox 2 ship.

-Boris

Robert Kaiser

Jun 18, 2009, 7:54:15 AM
Phil Ringnalda wrote:
> It's nice that Ria
> apparently has a private mirror of ftp.mo and can pinpoint any Firefox
> regression in minutes, but that's only Firefox and Core, and she could
> get tired of us at any time.

In earlier times when we had run into such a problem once, we had
archive.mozilla.org point to a large array that included all the old
builds (but was probably just a single machine) while ftp.m.o had them
removed.
It might be interesting to think about a similar solution again...

Robert Kaiser

Serge Gautherie

Jun 18, 2009, 2:01:42 PM
Ben Hearsum wrote:

> I'm not involved in regression hunting, so forgive the uninformed
> question, but how often do we have to go as far back as 3.0a2pre to find
> a regression range?

I haven't done it for a while (as I currently work on other things),
but there was a time when I did use (archive) ftp to find out such
regression timeframes on a +/- regular basis.

> Note that neither mozilla-central nor mozilla-1.9.1 existed in 2006, so
> we're not archiving any of those builds.

Right, but mxr is still available for cvs roots,
and comm-central applications (at least) haven't released from m-1.9.1
(nor cvs/1.9.0 !) yet.

And, in the particular case of SeaMonkey, these builds might be used to
deal with (some of) all the bugs we inherited from MozillaAS (and
previous) days.

(I think keeping everything online, here or elsewhere, would be better,
if possible.)

Ben Hearsum

Jun 18, 2009, 2:07:17 PM
to Serge Gautherie
On 18/06/09 2:01 PM, Serge Gautherie wrote:
> Ben Hearsum wrote:
>
>> I'm not involved in regression hunting, so forgive the uninformed
>> question, but how often do we have to go as far back as 3.0a2pre to
>> find a regression range?
>
> I haven't done it for a while (as I currently work on other things),
> but there was a time when I did use (archive) ftp to find out such
> regression timeframes on a +/- regular basis.
>
>> Note that neither mozilla-central nor mozilla-1.9.1 existed in 2006,
>> so we're not archiving any of those builds.
>
> Right, but mxr is still available for cvs roots,
> and comm-central applications (at least) haven't released from m-1.9.1
> (nor cvs/1.9.0 !) yet.
>

Yes, but the amount of space required to store old code is a tiny tiny
fraction of the space needed to store old builds.

> (I think keeping everything online, here or elsewhere, would be better,
> if possible.)
>

Agreed. I'm working with IT to see what we can do here.

Nick Thomas

Jun 18, 2009, 6:32:42 PM
Robert Kaiser wrote:
> In earlier times when we had run into such a problem once, we had
> archive.mozilla.org point to a large array that included all the old
> builds (but was probably just a single machine) while ftp.m.o had them
> removed.
> It might be interesting to think about a similar solution again...

The disk behind archive.m.o is the same disk that we're wanting to clean
up here. There's no distinction between archive.m.o and ftp.m.o any more
because they use the same backend storage.

Robert Kaiser

Jun 18, 2009, 7:49:51 PM

Yes, nowadays. I talked about "in earlier times", did you see that? ;-)

Robert Kaiser

Ben Hearsum

Jun 19, 2009, 1:30:33 PM
(Crossposting again, please reply to mozilla.dev.builds)

Aravind kindly granted me access to the FTP logs so we can get some
better data. Here's what I've come up with:

BUILD YEAR  PRODUCT      DOWNLOADS IN 2009
2004        firefox      310
2005        firefox      594
2006        firefox      921
2004        thunderbird  9
2005        thunderbird  178
2006        thunderbird  245
2005        seamonkey    6
2006        seamonkey    19
2006        calendar     107

So for example, in this current calendar year there have been 310
downloads of a Firefox build (defined as .zip, .exe, .tar.gz, or .dmg)
from 2004.

I can go back in time further if people like, too, but I think 6 months
of history is probably a large enough data set.

--

Only Firefox builds from 2005 and 2006 are averaging more than 2 hits
per day. Given that, I'd like to humbly suggest that we archive the
following:
* Firefox nightly builds from 2004
* Thunderbird nightly builds from 2004, 2005, and 2006
* SeaMonkey nightly builds from 2005 and 2006
* Calendar nightly builds from 2005 and 2006
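
As a rough check on the 2-hits-per-day threshold, here is a minimal sketch. It assumes the log data covers January 1 through June 19, 2009 (170 days); the exact window is not stated above.

```python
from datetime import date

# Download counts for 2009, copied from the table above.
downloads = {
    ("firefox", 2004): 310,
    ("firefox", 2005): 594,
    ("firefox", 2006): 921,
    ("thunderbird", 2004): 9,
    ("thunderbird", 2005): 178,
    ("thunderbird", 2006): 245,
    ("seamonkey", 2005): 6,
    ("seamonkey", 2006): 19,
    ("calendar", 2006): 107,
}

# Assumption: the logs run from January 1 to the posting date, inclusive.
days_covered = (date(2009, 6, 19) - date(2009, 1, 1)).days + 1

def hits_per_day(count, days=days_covered):
    """Average downloads per day over the assumed log window."""
    return count / days

# Product/year pairs that clear the 2-hits-per-day bar.
busy = sorted(key for key, count in downloads.items() if hits_per_day(count) > 2)

for (product, year), count in sorted(downloads.items()):
    print(f"{product:12} {year}  {hits_per_day(count):5.2f} hits/day")
```

Under that assumption only the 2005 and 2006 Firefox builds end up in `busy`.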

This would let us keep *all* Firefox nightlies from 1.5.0.x and 2.0.0.x
- bz/phil/others, does that make things better for you in terms of
regression hunting?

Does this sound fairer to everyone?

As an aside, IT is currently looking into pricing and the possibility of
adding more space to this storage array. More on that when I hear back.

- Ben

Ben Hearsum

Jun 19, 2009, 1:32:42 PM

Mark Hansen

Jun 19, 2009, 1:51:20 PM
On 06/19/09 10:32, Ben Hearsum wrote:
> (Crossposting again, please reply to mozilla.dev.builds)

If you want everyone to reply only to mozilla.dev.builds, why are
you not setting a Followup-To: header?

Peter Weilbacher

Jun 19, 2009, 2:20:45 PM
On 19.06.2009 19:32, Ben Hearsum wrote:

> Aravind kindly granted me access to the FTP logs so we can get some
> better data.

I am curious: what amounts of data are we talking about here? (How
large is the current storage, how much did you intend to move, etc.)
Peter.

P.S.: I actually wanted to cite something else from your post but
the "-- " in the middle causes the rest to be cut off.

Ben Hearsum

Jun 19, 2009, 2:59:25 PM
On 19/06/09 2:20 PM, Peter Weilbacher wrote:
> On 19.06.2009 19:32, Ben Hearsum wrote:
>
>> Aravind kindly granted me access to the FTP logs so we can get some
>> better data.
>
> I am curious: what amounts of data are we talking about here? (How
> large is the current storage, how much did you intend to move, etc.)
> Peter.
>

We've got a 3TB array for the FTP server. Currently there's 120G free
and we use a few GB per day (with spikes on days when we build
releases). Here's the usage per product per year:

Firefox 2004:       16G
Firefox 2005:       53G
Firefox 2006:      191G
Thunderbird 2004:   17G
Thunderbird 2005:   47G
Thunderbird 2006:   87G
SeaMonkey 2005:     24G
SeaMonkey 2006:     89G
Calendar 2006:      35G

So everything except Firefox 2005/2006 totals 315G. There's also
roughly 700GB of data in a private area on that array I plan to archive.
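
A quick sketch to double-check that total, using only the per-year figures listed above (the dictionary layout is just for illustration):

```python
# Disk usage in GB per product and year, as listed above.
usage_gb = {
    "Firefox 2004": 16,
    "Firefox 2005": 53,
    "Firefox 2006": 191,
    "Thunderbird 2004": 17,
    "Thunderbird 2005": 47,
    "Thunderbird 2006": 87,
    "SeaMonkey 2005": 24,
    "SeaMonkey 2006": 89,
    "Calendar 2006": 35,
}

# Builds that would stay online under the revised proposal.
kept = {"Firefox 2005", "Firefox 2006"}

archivable_gb = sum(size for name, size in usage_gb.items() if name not in kept)
print(f"Archivable: {archivable_gb}G of {sum(usage_gb.values())}G listed")
```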

- Ben

Ben Hearsum

Jun 19, 2009, 3:39:51 PM
On 19/06/09 2:59 PM, Ben Hearsum wrote:
> SeaMonkey 2005: 24G
> SeaMonkey 2006: 89G

One more update:
I just found nightly/contrib/200[56]. Adding those to the above gives
new totals of:
SeaMonkey 2005: 37G
SeaMonkey 2006: 128G

I also found 115G worth of old Mozilla-suite nightlies,
http://ftp.mozilla.org/pub/mozilla.org/mozilla (nightly and l10n dirs).
I highly doubt anyone uses these nightlies for regression hunting, and
we have all the old releases here:
http://ftp.mozilla.org/pub/mozilla.org/mozilla/releases so I'd like to
archive the aforementioned directories too.

Boris Zbarsky

Jun 19, 2009, 3:59:44 PM
Ben Hearsum wrote:
> Firefox 2004: 16G

That seems low enough to be worth keeping as a first cut, to me...

-Boris

Paul White

Jun 19, 2009, 4:23:08 PM
to Ben Hearsum
Ben Hearsum wrote:
> I also found 115G worth of old Mozilla-suite nightlies,
> http://ftp.mozilla.org/pub/mozilla.org/mozilla (nightly and l10n dirs).
> I highly doubt anyone uses these nightlies for regression hunting, and
> we have all the old releases here:
> http://ftp.mozilla.org/pub/mozilla.org/mozilla/releases so I'd like to
> archive the aforementioned directories too.

Speaking of which, I noticed that there are still nightlies from when
Firefox was known as Firebird:
http://ftp.mozilla.org/pub/mozilla.org/firebird/nightly/

Should those be archived as well?

Paul

Ben Hearsum

Jun 19, 2009, 4:35:43 PM
to Paul White

Yeah, let's add these to the list too.

Ben Hearsum

Jun 19, 2009, 4:37:20 PM

Given the large volume of other things I found to archive that seems
reasonable, consider it done.

Boris Zbarsky

Jun 19, 2009, 5:11:58 PM
Ben Hearsum wrote:
> Given the large volume of other things I found to archive that seems
> reasonable, consider it done.

In that case, sounds ok to me. Builds from longer ago than about 2003
tend to not run on modern OSes anyway, sadly... And if someone needs
the really old Linux builds, they can always come talk to me. ;)

-Boris

Phil Ringnalda

Jun 19, 2009, 8:36:39 PM
On 6/19/2009 11:59 AM, Ben Hearsum wrote:

> Thunderbird 2004: 17G
> Thunderbird 2005: 47G
> Thunderbird 2006: 87G


Finally realized that I'm doing the classic expecting-a-free-lunch.

If it's at all possible to hold out that long, could you pretty-please
arrange with gozer to transfer that 151GB over to MoMo storage before
you delete it? davida says it's fine with him, we've got terabytes to
spare.

Don't know how hard it is to delete individual things per day, but if it
helps, I can keep looking for things that nobody will miss, to make room
in the meantime, things like the Firefox -fs builds from 2005-04-07
through 2006-03-13, and the -aviary1.0.1 builds from 2005-02 through
2006-06 (branches are nice, but vastly less useful than trunk, and that
branch is less useful than most), and the incomprehensible
-firefox1.5.0.4 and -firefox1.5.0.5 directories from 2006-05 through
2006-07 that sometimes include en-US builds and .mars that are probably
just a few hours off the -1.8.0 builds from the same day, and other
times include 650MB of Windows installers and zips and mars for every
locale, and other times include around 1GB of Mac dmgs and mars for
every locale, and the similar but more clearly labelled
-firefox2.0b1-l10n and b2-l10n directories that run on to random days in
2006-08, and the similar -thunderbird1.5.0.5 dirs, and
firebird/nightly/contrib and firebird/nightly/experimental, and
thunderbird/m-builds/ that nobody knows what they are, and
mozilla/tinderbox-builds/, and OJI/, and profiles/ and data/, and msgsdk
(lolwut? it's a Netscape IMAP/POP3/SMTP library in C and Java from
1998?), and diskimages/ (I bet we can do without ready access to a
631,208KB iso of Thunderbird 1.0 and Suite 1.7.5), and and and.

Phil Ringnalda

Jun 20, 2009, 5:26:17 PM
On 6/19/2009 11:59 AM, Ben Hearsum wrote:

> So, for everything except Firefox 2005/2006 totals to 315G.


If my calculator work was right and we're at around 1.7GB per day right
now, that would buy us six months before we need to do this again?

(Well, unless we're going to start saving nightly builds for Fx l10n,
since with just 22 locales and two builds, mobile-trunk-l10n is adding
close to half a GB a day, and with 70-some locales and a minimum of 3
builds, mozilla-central-l10n would be more like an added 2.4GB/day.)
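
One way to reproduce that estimate, as a sketch: assume each locale build costs the same on desktop as on mobile, which is a simplification and not something measured here.

```python
# Figures from the paragraph above; the per-build size is derived, not measured.
mobile_locales, mobile_builds = 22, 2
mobile_gb_per_day = 0.5  # "close to half a GB a day"

# Assumed cost of one locale build, in GB.
gb_per_locale_build = mobile_gb_per_day / (mobile_locales * mobile_builds)

# "70-some locales" and "a minimum of 3 builds" for mozilla-central-l10n.
desktop_locales, desktop_builds = 70, 3
desktop_gb_per_day = gb_per_locale_build * desktop_locales * desktop_builds

print(f"about {desktop_gb_per_day:.1f} GB/day")
```

With 70 locales this lands at roughly 2.4GB/day; more locales or more builds per day would push it higher.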


> roughly 700GB of data in a private area on that array I plan to archive.


Ah, so more like a year and a half? Is killing Tb+SM+Calendar for 2007
in January 2011 going to be enough to then get us through 2011? The way
I figured it, current firefox/ is adding more per day than the
combination of thunderbird/+seamonkey/+calendar/+camino/. So we should
probably be planning for the next round to either need to kick
everything other than Firefox off, or to start throwing away old Firefox
nightlies, too (or, both and in Q1 2010, with mozilla-central-l10n).
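
For reference, a sketch of the runway arithmetic behind "six months" and "a year and a half". It assumes constant growth of 1.7GB/day and ignores both the space that is already free and release spikes.

```python
def runway_days(freed_gb, growth_gb_per_day):
    """Days until the freed space is consumed at a constant growth rate."""
    return freed_gb / growth_gb_per_day

GROWTH_GB_PER_DAY = 1.7  # the estimate used above

public_only = runway_days(315, GROWTH_GB_PER_DAY)
with_private = runway_days(315 + 700, GROWTH_GB_PER_DAY)

print(f"315G alone:        {public_only:.0f} days ({public_only / 30:.1f} months)")
print(f"315G plus 700G:    {with_private:.0f} days ({with_private / 365:.1f} years)")
```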

Wayne Mery

Jun 21, 2009, 7:44:05 AM

Agree. IMO 2006 Thunderbird nightlies should not be archived, and not be
moved to some other store unless they can be referenced directly from
the normal ftp hierarchy. In roughly the past year I've downloaded ~40
2006 Thunderbird nightly builds of v2 and trunk - not having those would
have been unhandy.

2005? I have 2006 builds that were downloaded prior to 2007 but I can't
tell whether I've used them in the past year, so I have no datapoints on
usage (my guess is I have used some in the past year). But given that
Thunderbird v3's origins are in 2006, after v3 goes live I'd say
regressions will be found that originate in 2005, and we would want
those builds readily available.

In the past year I dl'd a half dozen each of Minefield and SM 2006
builds. None from 2005.

Ben Hearsum

Jun 22, 2009, 5:03:53 PM
On 20/06/09 5:26 PM, Phil Ringnalda wrote:
> On 6/19/2009 11:59 AM, Ben Hearsum wrote:
>
>> So, for everything except Firefox 2005/2006 totals to 315G.
>
>
> If my calculator work was right and we're at around 1.7GB per day right
> now, that would buy us six months before we need to do this again?
>
> (Well, unless we're going to start saving nightly builds for Fx l10n,
> since with just 22 locales and two builds, mobile-trunk-l10n is adding
> close to half a GB a day, and with 70-some locales and a minimum of 3
> builds, mozilla-central-l10n would be more like an added 2.4GB/day.)
>

We're not going to be archiving l10n nightlies AFAIK.

Average usage is around 2.4GB/day, but every time we spin a release build
we use up additional space (6.5GB for 3.0.x, 8GB for 3.5.x).

>
>> roughly 700GB of data in a private area on that array I plan to archive.
>
>
> Ah, so more like a year and a half? Is killing Tb+SM+Calendar for 2007
> in January 2011 going to be enough to then get us through 2011?

Sorry, I'm not following this, can you rephrase?

> The way
> I figured it, current firefox/ is adding more per day than the
> combination of thunderbird/+seamonkey/+calendar/+camino/. So we should
> probably be planning for the next round to either need to kick
> everything other than Firefox off, or to start throwing away old Firefox
> nightlies, too (or, both and in Q1 2010, with mozilla-central-l10n).

Firefox builds will need to be archived at some point. The fact is
though that there is significantly more usage of them for a longer
period of time than other products.

When I was looking at the number of Tb build downloads I didn't think
there would be this much pushback when it came to archiving them. You've
spoken up strongly about it, as has Wayne Mery. I've talked with IT and
they're open to adding space to the storage array. Given that, I think
we can keep the Thunderbird builds around a while longer.

I do want to say that we have to have a cut-off point somewhere. We
cannot keep these things online forever. I guess we don't have to cross
that bridge again for a while, though.

So, this means the final list is as follows:
/pub/mozilla.org/seamonkey/nightly/200[56]
/pub/mozilla.org/seamonkey/nightly/contrib/200[56]
/pub/mozilla.org/calendar/sunbird/nightly/2006
/pub/mozilla.org/calendar/lightning/nightly/2006
/pub/mozilla.org/mozilla/nightly
/pub/mozilla.org/mozilla/l10n
/pub/mozilla.org/firebird/nightly
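
For anyone unsure what the `200[56]` shorthand covers, a small sketch using Python's `fnmatch` with the same shell-style pattern; the list of year directories is made up for illustration.

```python
from fnmatch import fnmatch

# Shell-style pattern used in the list above: "200" followed by 5 or 6.
pattern = "200[56]"

# Hypothetical year directories under a nightly/ tree.
years = ["2004", "2005", "2006", "2007"]

matched = [year for year in years if fnmatch(year, pattern)]
print(matched)
```

So only the 2005 and 2006 directories are covered; 2004 and 2007 are left alone.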

Phil Ringnalda

Jun 23, 2009, 12:02:41 AM
On 6/22/09 2:03 PM, Ben Hearsum wrote:
> We're not going to be archiving l10n nightlies AFAIK.

If the Fennec ones are accidental, it might be worth reversing that
accident: they took a while to build up, but with the number being
produced now, they alone would eat up the space you would free by
killing all of firefox/2004/ in about six weeks.

> Average usage is around 2.4GB/day, but everytime we spin a release build
> we use up an additional space (6.5GB for 3.0.x, 8GB for 3.5.x).
>
>>
>>> roughly 700GB of data in a private area on that array I plan to archive.
>>
>>
>> Ah, so more like a year and a half? Is killing Tb+SM+Calendar for 2007
>> in January 2011 going to be enough to then get us through 2011?
>
> Sorry, I'm not following this, can you rephrase?

Trying to do (bad, apparently) math to figure out what the next round of
cutting will look like and when it will happen.

If you're determined that we will not have a regular increase in the
amount of storage, then there will be another round, and unless there
are more 700GB private areas, they'll be more painful, and I'll want to
avoid that pain by spending the intervening time campaigning to have it
be our policy that we *will* keep all builds after some point available
by budgeting for regular additions of storage, or failing that, that the
number of years we'll keep available will be clearly more than we need,
rather than very, very clearly less than we need.

2.4GB/day * 365, that's 876GB/year, plus releases, plus things we'll add
like Places nightlies, or Electrolysis nightlies, or something else.
Call it 1TB/year, to make the math easy enough for me, and if we didn't
add any storage then after just a few years we'd be down to just three
years. "Enough" is really hard to quantify, but that ain't it: three
years right now would cut off all the things that landed early in the
1.9 cycle because they were certain to cause regressions, regressions
that we haven't finished finding yet.
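
The arithmetic above as a sketch. The 3TB array size comes from earlier in this thread; the 1TB/year figure is the deliberate round-up, so the result is a rough bound and not a forecast.

```python
DAILY_GB = 2.4   # average daily growth quoted above
ARRAY_TB = 3     # array size, from earlier in the thread

# Nightlies only, before releases and any new kinds of builds.
yearly_gb = DAILY_GB * 365

# Round up to 1TB/year to leave room for releases and new build types.
rounded_tb_per_year = 1
years_online = ARRAY_TB / rounded_tb_per_year

print(f"{yearly_gb:.0f} GB/year, so about {years_online:.0f} years fit on the array")
```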

Apparently I'm naive about how much serious storage costs, but say the
number I got of $600/TB is right. That's *cheap*. It's not unreasonable
to guess that we fix a couple of regressions every year that only take a
couple of hours because they only call for looking for the cause in a
single patch, instead of spending a week in a debugger in unfamiliar
code with no idea where the problem really is. How much do we pay for a
week of programmer? How much do we "pay" for having those regressions go
unfixed because nobody wants to spend a week in a debugger in unfamiliar
code?

> I do want to say that we have to have a cut-off point somewhere. We
> cannot keep these things online forever. I guess we don't have to cross
> that bridge again for awhile, though.

Why? You keep saying that like it's an absolute truth, and I don't know
why. A product that's truly dead (not the App Suite, since that's just
SeaMonkey with a different name, but something like Grendel), sure, but
for a live project? Why, and when? Certainly when not one line of code
that was added in year n is still around (which so far means
"considerably more than 11 years"), certainly when nobody is still
running a system that will run the builds, probably for branches that
are no longer supported, even though they can still be handy for
triangulation, but in general, just absolutely "we will get rid of
builds, whether or not we can afford to keep them, whether or not they
might still be useful"? I just can't see that as a defensible position.
If we choose to do so, why isn't keeping the measly 16GB of
firefox/2004/ around forever, rather than removing it to replace it with
6.66 days of current builds, a reasonable and acceptable thing to do?
Why isn't keeping tomorrow and the next day's Firefox builds around for
ten years in case it takes that long for someone to notice the screwup
I'm going to push tomorrow was wrong a reasonable thing to do? Why three
years, or ten, but not fifteen, twenty or fifty, if it takes that long
until they're no longer runnable? The nine downloads of
thunderbird/2004, that you see as so few that it's worthless? With just
a little bit of luck, nine is plenty for a binary search to get down to
a one day window for a regression that's been bothering someone for five
years, and thus to be able to point at probably one or two patches that
caused it. How much is that worth?
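
To make the "nine is plenty" point concrete, a minimal bisection sketch over a year of nightlies. The regression date and the `is_broken` check are hypothetical; in practice each check means downloading that day's nightly and testing it by hand.

```python
from datetime import date, timedelta

def bisect_regression(good, bad, is_broken):
    """Narrow [good, bad] to adjacent days.

    Returns (last_good, first_bad, builds_tested).
    """
    tested = 0
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        tested += 1
        if is_broken(mid):
            bad = mid
        else:
            good = mid
    return good, bad, tested

# Hypothetical regression that landed on 2004-06-15.
landed = date(2004, 6, 15)
good, bad, tested = bisect_regression(
    date(2004, 1, 1), date(2004, 12, 31), lambda day: day >= landed
)
print(f"window: {good} to {bad}, after {tested} builds")
```

A year-long range halves down to a one-day window in at most nine steps, provided no nightlies are missing from the range.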

Ben Hearsum

Jun 23, 2009, 9:36:30 AM
to Phil Ringnalda
On 23/06/09 12:02 AM, Phil Ringnalda wrote:
> On 6/22/09 2:03 PM, Ben Hearsum wrote:
>> We're not going to be archiving l10n nightlies AFAIK.
>
> If the Fennec ones are accidental, it might be worth reversing that
> accident: they took a while to build up, but with the number being
> produced now, they alone would eat up the space you would free by
> killing all of firefox/2004/ in about six weeks.
>

Thanks for pointing that out. I just filed bug 499919 on this.

>>> Ah, so more like a year and a half? Is killing Tb+SM+Calendar for 2007
>>> in January 2011 going to be enough to then get us through 2011?
>>
>> Sorry, I'm not following this, can you rephrase?
>
> Trying to do (bad, apparently) math to figure out what the next round of
> cutting will look like and when it will happen.
>
> If you're determined that we will not have a regular increase in the
> amount of storage, then there will be another round, and unless there
> are more 700GB private areas, they'll be more painful

Slightly offtopic: It turns out this private area is on a different
storage array, so it's not going to help this situation as I'd
originally thought.

> and I'll want to
> avoid that pain by spending the intervening time campaigning to have it
> be our policy that we *will* keep all builds after some point available
> by budgeting for regular additions of storage, or failing that, that the
> number of years we'll keep available will be clearly more than we need,
> rather than very, very clearly less than we need.

I think it's fair to remove based on usage. It's clear by the response
and logs that people are still using old Firefox and Thunderbird builds
- so let's keep them. But when the usage for a particular group of
builds drops off we should archive.

> 2.4GB/day * 365, that's 876GB/year, plus releases, plus things we'll add
> like Places nightlies, or Electrolysis nightlies, or something else.
> Call it 1TB/year, to make the math easy enough for me, and if we didn't
> add any storage then after just a few years we'd be down to just three
> years. "Enough" is really hard to quantify, but that ain't it: three
> years right now would cut off all the things that landed early in the
> 1.9 cycle because they were certain to cause regressions, regressions
> that we haven't finished finding yet.

I think it's perfectly legitimate to get more space when our daily disk
usage goes up. It needs to be give and take, though. We can't keep
builds around just because maybe some developer will need them sometime.
When usage drops off, the cost/benefit of keeping them sways heavily
towards archiving.

> Apparently I'm naive about how much serious storage costs, but say the
> number I got of $600/TB is right. That's *cheap*.

Enterprise storage is not that cheap to buy, set up, run, or maintain. I
do understand your point about storage cost vs. cost of developer time,
though.


Another idea a few people have thrown out is to delete every other
nightly build past a certain cut-off. I've been told we've done this in
the past to save space. By doing this we'd get to keep older builds, but
the regression ranges would be a bit larger. Is that any better?
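
A sketch of what that thinning could look like, assuming one build per day and a hypothetical cut-off of 2007-01-01. Builds that are dropped here would go to the offline archive, and `thin_nightlies` is an illustrative helper, not an existing tool.

```python
from datetime import date, timedelta

def thin_nightlies(build_dates, cutoff):
    """Keep every build on or after cutoff and every other build before it."""
    old = sorted(d for d in build_dates if d < cutoff)
    recent = sorted(d for d in build_dates if d >= cutoff)
    return old[::2] + recent

# Eight consecutive nightlies, 2006-12-28 through 2007-01-04.
builds = [date(2006, 12, 28) + timedelta(days=i) for i in range(8)]
kept = thin_nightlies(builds, cutoff=date(2007, 1, 1))

for day in kept:
    print(day)
```

Before the cut-off the smallest possible regression window grows from one day to two, which halves the space used by those older builds.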

Serge Gautherie

Jun 23, 2009, 6:02:47 PM
Ben Hearsum wrote:

> Another idea a few people have thrown out is to delete every other

Delete, as in Archive!?

> nightly build past a certain cut-off. I've been told we've done this in
> the past to save space. By doing this we'd get to keep older builds, but
> the regression ranges would be a bit larger. Is that any better?

Better, sure!

Mike Shaver

Jun 23, 2009, 6:15:08 PM
to dev-b...@lists.mozilla.org
On Tue, Jun 23, 2009 at 3:36 PM, Ben Hearsum<bhea...@mozilla.com> wrote:
> Enterprise storage is not that cheap to buy, set up, run, or maintain. I do
> understand your point about storage cost vs. cost of developer time, though.

Why do we need to use enterprise storage? We'll have a copy in the
archival place ("archive" means that we can get it off tape if we need
it, right?) in case of something truly catastrophic, but otherwise --
especially given the presumption that these are infrequently used, and
therefore the duty cycle is pretty mild -- we should be able to get by
with pairs of commodity SATA drives and nagios yelling when SMART gets
uppity.

But even if it does cost $2000/TB for Enterprise Grade Storage, I
would happily incur a cost of $6K/year in extra storage to save this
developer time -- and let us avoid having to rebuild them all to
recalibrate talos, or do mining via other automation, etc.
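
The arithmetic behind those figures can be sketched as follows. The $600/TB and $2000/TB prices and the $6K/year budget are the numbers quoted in this thread; the yearly growth is only inferred from them, and the function names are hypothetical.

```python
def implied_growth_tb(yearly_budget_dollars, dollars_per_tb):
    """Storage growth (TB/year) that a yearly budget buys at a given price."""
    return yearly_budget_dollars / dollars_per_tb


def yearly_cost(growth_tb_per_year, dollars_per_tb):
    """Yearly spend needed to absorb a given storage growth."""
    return growth_tb_per_year * dollars_per_tb


# $6K/year at $2000/TB corresponds to 3 TB/year of new storage.
growth = implied_growth_tb(6000, 2000)

# The same growth at the $600/TB figure quoted earlier in the thread.
commodity_cost = yearly_cost(growth, 600)
```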

There are lots of things that need the build team's time, and most of
them we can't defer for another year by spending a couple of thousand
dollars, so I'd really rather do that now than try and scrape a few
more months of build storage out of the current system.

Mike

Ben Hearsum

Jun 24, 2009, 8:18:53 AM6/24/09
to
On 23/06/09 6:02 PM, Serge Gautherie wrote:
> Ben Hearsum wrote:
>
>> Another idea a few people have thrown out is to delete every other
>
> Delete, as in Archive!?
>

Good catch. Yes, I meant archive.

Phil Ringnalda

Jun 24, 2009, 3:42:11 PM6/24/09
to
On 6/23/2009 6:36 AM, Ben Hearsum wrote:

> I think it's fair to remove based on usage. It's clear by the response
> and logs that people are still using old Firefox and Thunderbird builds
> - so let's keep them. But when the usage for a particular group of
> builds drops off we should archive.


Arguing "existence value" always sounds like you're just making excuses,
whether in the original context of wilderness ("it's valuable just to
know that it exists, and you could go, whether or not you ever do") or
here, but even if nobody actually does download any of them, there is
value in being able to say in a bug where people keep screaming about
how long the bug has been open that there is something they can do to
move it forward. Possibly not enough to be worth it, but even builds
that don't get downloaded at all in some period of time do still have value.


> Another idea a few people have thrown out is to delete every other
> nightly build past a certain cut-off. I've been told we've done this in
> the past to save space. By doing this we'd get to keep older builds, but
> the regression ranges would be a bit larger. Is that any better?


Yep, if push comes to shove two-day windows are certainly better than
two-month or twelve-month windows between releases. At least for my
uses, I'd put the best order to kill things at

firefox/nightly/*/*-fs (we didn't ever do Tb-fs, did we?) which are now
a positive nuisance since they only serve to confuse people who never
knew about them or have forgotten what they were

*/nightly/*/(firefox|thunderbird)1.5.0* and
firefox/nightly/*/firefox2.0* (look like release candidates, often with
l10n and partial and complete mars, so they are huge, but nobody will
know how they relate to what was in the releases so they aren't useful)

*/nightly/*/*aviary(-?)1.0* - nice to have if you are triangulating
something that landed on the aviary branch at a different time than on
trunk, but that's probably rare by now and they are mightily confusing:
what's the actual difference between
ftp://ftp.mozilla.org/pub/thunderbird/nightly/2006/05/2006-05-08-08-aviary1.0/
and
ftp://ftp.mozilla.org/pub/thunderbird/nightly/2006/05/2006-05-08-10-aviary1.0.1/?

*/nightly/*/*mozilla1.8.0 (more so for Firefox, where 1.5 was forever
ago, but when we have to drop them for Thunderbird too, for me at least
it's a better bet to keep one-day windows on the trunk longer, rather
than to possibly get triangulation from a 1.8.0 landing)

firefox/nightly/*/*mozilla1.8 (already out of support)
thunderbird/nightly/*/*mozilla1.8 (ouch, it's what we're still shipping,
but, there's little evidence we even look at patches we land on it in
2009, much less patches we landed on it in 2006)

as a desperate last resort, every other build, from the oldest year
forward and from non-Fx to Fx within that year, as we have to (not that
I'm saying that Core things visible in Fx are most likely to have old
unseen regressions, but, well, they are, plus they're most likely to get
narrowed and most likely to get fixed)
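
The ordering above can be sketched as a ranked list of path patterns. This is only an illustration: the alternation and optional-hyphen shorthand in the patterns is expanded by hand into plain globs, `*` is allowed to span the year/month directories, and `ARCHIVE_PRIORITY` and `archive_rank` are hypothetical names. The every-other-build last resort is not a path pattern, so it is left out.

```python
from fnmatch import fnmatchcase

# Rank 1 is archived first. Each entry is (label, glob patterns).
ARCHIVE_PRIORITY = [
    ("-fs builds", ["firefox/nightly/*/*-fs"]),
    ("1.5.0 / 2.0 candidate-style dirs", [
        "*/nightly/*/firefox1.5.0*",
        "*/nightly/*/thunderbird1.5.0*",
        "firefox/nightly/*/firefox2.0*",
    ]),
    ("aviary 1.0", [
        "*/nightly/*/*aviary1.0*",
        "*/nightly/*/*aviary-1.0*",
    ]),
    ("mozilla1.8.0", ["*/nightly/*/*mozilla1.8.0"]),
    ("firefox mozilla1.8", ["firefox/nightly/*/*mozilla1.8"]),
    ("thunderbird mozilla1.8", ["thunderbird/nightly/*/*mozilla1.8"]),
]


def archive_rank(path):
    """Return (rank, label) for the first matching group, or None."""
    path = path.rstrip("/")
    for rank, (label, patterns) in enumerate(ARCHIVE_PRIORITY, start=1):
        # fnmatchcase's "*" also matches "/", so it covers nested dirs.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return rank, label
    return None
```

For example, `archive_rank("thunderbird/nightly/2006/05/2006-05-08-08-aviary1.0/")` falls in the third group, while a directory matching none of the patterns returns `None` and stays online.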

And to add to my list of random things that should get the axe (and I
did mention grendel, didn't I? rip its arm off, cut off its head, slay
the beast!), since firefox/releases/partners/ is just 2.0.0.6 and a
scattering of 1.5.0.x, I'd think it could go to wherever all the other
partner builds are living, whether that's another disk or just out of my
sight.

m...@mozilla.com

Jun 29, 2009, 1:17:58 AM6/29/09
to
On Jun 17, 12:17 pm, Ben Hearsum <bhear...@mozilla.com> wrote:
> (Cross posting widely, please reply to mozilla.dev.builds)
>
> Hi Everyone,
>
> Over the course of the past few years we've greatly increased the number
> of builds and other files we push to Mozilla's FTP server. Despite
> having a vast amount of space available we've nearly filled it up. We'd
> like to recover quite a bit of space in one shot by archiving all
> Firefox, Thunderbird, and SeaMonkey builds from 2006 and earlier to an
> offline backup.

I've taken the action to expand on the analysis Ben did with the help
of our Metrics group to get a better understanding of access
patterns. Should have something to report later this week or next.

We're not in any crisis mode - we have enough storage right now and
have some time to figure out what the right storage solution ought to
be.

- mz
