There has been enough discussion about keeping development downstream
and how painful that is. Even more painful, I think we all agree, is
maintaining downstream releases while tracking upstream releases, trunk
and other branches (e.g. Android).
I have proposed this in passing a few times already, but now I'm going
to put it to a wider audience:
Shall we sync our upstream release with the bulk of other downstream
ones as well as OS distributions?
This work involves a *lot* of premises that are not encoded yet, so
we'll need a lot of work from all of us. But judging from the recent
problems with GCC's abi_tag and the arduous job downstream release
managers have in knowing which patches to pick, I think there has been
a lot of wasted effort by everyone, and that generates stress,
conflicts, etc.
I'd also like to reinforce the basic idea of software engineering: to
use the same pattern for the same problem. So, if we have one way to
link sequences of patches and merge them upstream, we ought to use the
same pattern (preferably the same scripts) downstream, too. Of course
there will be differences, but we should treat them as the exception,
not the rule.
So, here is a list of things that will need to be solved to get to a
low-waste release process:
1. Timing
Many downstream release managers, as well as distro maintainers, have
complained about the timing of our releases, how unreliable they are,
and how that makes it hard for them to plan their own branches,
cherry-picks and merges. If we release too early, they miss out on
important optimisations; if we release too late, they have to branch
"just before" and risk having to back-port late fixes to their own
modified trees.
Products that rely on LLVM already have their own life cycles and we
can't change that. Nor can we make all downstream products align to
our quasi-chaotic release process. However, the importance of the
upstream release for upstream developers is *a lot* lower than for the
downstream maintainers, so unless the latter group puts their weight
(and effort) into the upstream process, little is going to happen to
help them.
A few (random) ideas:
* Average over all product cycles and pick the least costly time to
release. This would marginalise those beyond the first sigma, making
their lives much harder than for those within it.
* Do the same average on the projects that are willing to lend a
serious hand to the upstream release process. This has the same
problem, but it's based on actual effort. It does concentrate bias on
the better funded projects, but it's also easier for low key projects
to change their release schedules.
* Try to release more often. The current cost of a release is high,
but if we managed to lower it (by having more people, more automation,
shared efforts), then it could be feasible and much fairer than
weighted averages.
2. Process
Our release process is *very* lean, and that's what makes it
quasi-chaotic. In the beginning, not many people / companies wanted to
help or cared about the releases, so the process was simply whatever
the person doing it happened to do. The major release process is now
better defined, but the minor releases still follow that same ad-hoc
pattern.
For example, we have no defined date to start, or to end. We have no
assigned people to do the official releases, or test the supported
targets. We still rely on voluntary work from all parties. That's ok
when the release is just "a point in time", but if downstream releases
and OS distributions start relying on our releases, we really should
get a bit more professional.
A few (random) ideas:
* We should have predictable release times, both for starting it and
finishing it. There will be complications, but we should treat them as
the exception, not the rule.
* We should have appointed members of the community that would be
responsible for those releases, in the same way we have code owners
(volunteers, but no less responsible), so that we can guarantee a
consistent validation across all relevant targets. This goes beyond
x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
etc.
* The upstream release should be, as much as possible, independent of
which OS they run on. OS specific releases should be done in the
distributions themselves, and people interested should raise the
concern in their own communities.
* Downstream managers should be an integral part of the upstream
release process. Whenever the release manager sends the email, they
should test on their end and reply with GREEN/RED flags.
* Downstream managers should also propose back-ports that are
important to them in the upstream release. It's likely that a fix is
important to a number of downstream releases but not to many people
upstream (since we're all using trunk). So, unless they tell us, we
won't know.
* OS distribution managers should test on their builds, too. I know
FreeBSD and Mandriva build by default with Clang. I know that Debian
has an experimental build. I know that RedHat and Ubuntu have LLVM
packages that they do care about. All that has to be tested *at least*
every major release, but hopefully on all releases. (Those who already
do that, thank you!)
* A number of upstream groups, or downstream releases that don't
track upstream releases, should *also* test them on their own
workloads. Doing so will get the upstream release to a much better
quality level and, in turn, allow those projects to use it in their
own internal releases.
* Every *new* bug found in any of those downstream tests should be
reported in Bugzilla with the appropriate category (critical / major /
minor). All major bugs have to be closed for the release to be out,
etc. (the specific process will have to be agreed and documented).
3. Automation
As noted in the timing and process sections, automation is key to
reducing costs for all parties. We should compare the encoded process
we have upstream with the processes projects have downstream, and
move upstream everything that we can and that is relevant.
For example, finding which patches revert / fix another one that was
already cherry-picked is a common task that should be identical for
everyone. A script that would sweep the commit logs, looking for
clues, would be useful to everyone.
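To make that concrete, here's a rough sketch of what such a sweeper
could look like (the ref names and the message conventions it looks
for are assumptions, not something we've agreed on):

#!/usr/bin/env python
# Rough sketch: scan trunk commits for clues ("PRnnnnn", "fix",
# "revert", review links) that suggest a commit is a back-port
# candidate for a release branch. Ref names and conventions here are
# assumptions.
import re
import subprocess

def commits_between(lower, upper):
    # Yields (sha, full message) for every commit in lower..upper.
    out = subprocess.check_output(
        ["git", "log", "--format=%H%x00%B%x01", "%s..%s" % (lower, upper)])
    for entry in out.decode("utf-8", "replace").split("\x01"):
        if entry.strip():
            sha, _, message = entry.lstrip("\n").partition("\x00")
            yield sha, message

CLUES = [
    re.compile(r"\bPR\d{4,6}\b"),            # bugzilla reference
    re.compile(r"\b[Ff]ix(es|ed)?\b"),       # likely bug fix
    re.compile(r"\b[Rr]evert(s|ed)?\b"),     # reverts an earlier commit
    re.compile(r"reviews\.llvm\.org/D\d+"),  # has a code review attached
]

def backport_candidates(release="origin/release_38", trunk="origin/master"):
    for sha, message in commits_between(release, trunk):
        hits = [c.pattern for c in CLUES if c.search(message)]
        if hits:
            yield sha, message.splitlines()[0], hits

if __name__ == "__main__":
    for sha, subject, hits in backport_candidates():
        print("%s %s [clues: %s]" % (sha[:8], subject, ", ".join(hits)))

A human still has to review the output, of course; the point is only
to narrow down what they need to look at.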
A few (random) ideas:
* We should discuss the process, express the consensus on the
official documentation, and encode it in a set of scripts. It's a lot
easier to tell a developer "please do X because it helps our script
back-port your patch" than to say "please do X because it's nice" or
"do X because it's in the 'guideline'".
* There's no way to force (via git-hook) developers to add a bugzilla
ID or a review number to the commit message (not all commits are
equal), so the scripts that scan commits will have to be smart enough,
but that'll create false positives, and they won't be able to
cherry-pick without human intervention. Showing why a commit wasn't
picked up by the script, or was erroneously picked up, is a good way
to educate people.
* We could have a somewhat-common interface with downstream releases,
so that some of the scripts they use could be upstreamed, if many of
them used the same entry points for testing, validating, building and
packaging (a rough sketch of what such an entry point could look like
follows this list).
* We could have the scripts that distros use for building their own
packages in our tree, so they could maintain them locally and we'd
know which changes are happening and would be much easier to warn the
others, common up the interface, etc.
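Going back to the "common entry point" idea above, a very rough sketch
of what such a driver could look like (the sub-command names and the
hooks/ layout are invented here purely for illustration, not an agreed
interface):

#!/usr/bin/env python
# Illustrative sketch of a thin, common driver that each downstream
# project could back with its own scripts. Sub-command names and the
# hooks/ layout are invented for illustration only.
import argparse
import subprocess
import sys

STAGES = ["test", "validate", "build", "package"]

def run_stage(stage, hook_dir, release):
    # Each downstream project provides its own executable hooks/<stage>.
    hook = "%s/%s" % (hook_dir, stage)
    print("== running %s for %s ==" % (hook, release))
    return subprocess.call([hook, release])

def main():
    parser = argparse.ArgumentParser(
        description="common downstream release driver (sketch)")
    parser.add_argument("stage", choices=STAGES + ["all"])
    parser.add_argument("--release", required=True, help="e.g. 3.8.1-rc1")
    parser.add_argument("--hook-dir", default="hooks")
    args = parser.parse_args()

    for stage in (STAGES if args.stage == "all" else [args.stage]):
        rc = run_stage(stage, args.hook_dir, args.release)
        if rc != 0:
            print("stage %s FAILED (exit %d)" % (stage, rc))
            return rc
    return 0

if __name__ == "__main__":
    sys.exit(main())

The value isn't in the driver itself; it's that everyone calling the
same entry points makes it possible to share the automation around
them.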
In the end, we're a bunch of people from *very* different communities
doing similar work. In the spirit of open source, I'd like to propose
that we share the work and the responsibility of producing high
quality software with minimal waste.
I don't think anyone receiving this email disagrees with the statement
that we can't just take and not give back, and that being part of this
community means we may have to work harder than our employers would
think brings direct profit, so that they can profit even more
indirectly later, and with that, everyone that uses or depends on our
software.
My personal and very humble opinion is that coalescing the release
process will, in the long term, actually *save* us a lot of work, and
the quality will be increased. Even if we don't reach perfection, and
by no means I think we will, at least we'll have something slightly
better. If anything, at least we tried.
I'd hate to continue doing an inefficient process without even trying
an alternative...
Comments?
cheers,
--renato
This is a long email :-) I've made some comments inline, but I'll
summarize my thoughts here:
- I like to think that the major releases have been shipped on a
pretty reliable six-month schedule lately. So we have that going for
us :-)
- It seems hard to align our upstream schedule to various downstream
preferences. One way would be to release much more often, but I don't
know if that's really desirable.
- I would absolutely like to see more involvement in the upstream
release processes from downstream folks and distros.
- I think we should use the bug tracker to capture issues that affect
releases. It would be cool if a commit hook could update bugzilla
entries that refer to it.
Cheers,
Hans
My random thoughts:
At least for the major releases, I think we're doing pretty well on
timing in terms of predictability: since 3.6, we have released every
six months: first week of March and first week of September (+- a few
days). Branching has been similarly predictable: mid-January and
mid-July.
If there are many downstream releases for which shifting this schedule
would be useful, I suppose we could do that, but it seems unlikely
that there would be agreement on this, and changing the schedule is
disruptive for those who depend on it.
The only reasonable way I see of aligning upstream releases with
downstream schedules would be to release much more often. This works
well in Chromium where there's a 6-week staged release schedule. This
would mean there's always a branch going for the next release, and
important bug fixes would get merged to that. In Chromium we drive
this from the bug tracker -- it would be very hard to scan each commit
for things to cherry-pick. This kind of process has a high cost
though, there has to be good infrastructure for it (buildbots on the
branch for all targets, for example), developers have to be aware, and
even then it's a lot of work for those doing the releases. I'm not
sure we'd want to take this on. I'm also not sure it would be suitable
for a compiler, where we want the releases to have long life-time.
> 2. Process
>
> Our release process is *very* lean, and that's what makes it
> quasi-chaotic. In the beginning, not many people / companies wanted to
> help or cared about the releases, so the process was what whomever was
> doing, did. The major release process is now better defined, but the
> same happened to the minor releases.
>
> For example, we have no defined date to start, or to end.
For the major releases, I've tried to do this. We could certainly
formalize it by posting it on the web page though.
> We have no
> assigned people to do the official releases, or test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.
Most importantly, those folks should get involved :-)
>
> A few (random) ideas:
>
> * We should have predictable release times, both for starting it and
> finishing it. There will be complications, but we should treat them as
> the exception, not the rule.
SGTM, we pretty much already have this for major releases.
> * We should have appointed members of the community that would be
> responsible for those releases, in the same way we have code owners
> (volunteers, but no less responsible), so that we can guarantee a
> consistent validation across all relevant targets. This goes beyond
> x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
> etc.
In practice, we kind of have this for at least some of the targets.
Maybe we should write this down somewhere instead of me asking for
(the same) volunteers each time the release process starts?
Thanks Hans!
I'll respond to them inline, below.
> - I think we should use the bug tracker to capture issues that affect
> releases. It would be cool if a commit hook could update bugzilla
> entries that refer to it.
That seems like a simple hook.
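Something along these lines, perhaps (this assumes a git mirror
checkout and Bugzilla's XML-RPC interface; the URL, the PRnnnnn
convention and the missing authentication are all placeholders):

#!/usr/bin/env python
# Minimal sketch of a post-commit hook that comments on bugzilla
# entries referenced in the commit message. URL, credentials handling
# and the "PRnnnnn" convention are assumptions for illustration.
import re
import subprocess
import xmlrpc.client

BUGZILLA_URL = "https://llvm.org/bugs/xmlrpc.cgi"  # placeholder

def last_commit():
    out = subprocess.check_output(["git", "log", "-1", "--format=%H%n%B"])
    sha, _, message = out.decode("utf-8", "replace").partition("\n")
    return sha, message

def main():
    sha, message = last_commit()
    bug_ids = sorted(set(int(m) for m in re.findall(r"\bPR(\d{4,6})\b", message)))
    proxy = xmlrpc.client.ServerProxy(BUGZILLA_URL)
    for bug_id in bug_ids:
        # Bugzilla's XML-RPC WebService exposes Bug.add_comment;
        # authentication (API key / login) is omitted here for brevity.
        proxy.Bug.add_comment({"id": bug_id,
                               "comment": "Referenced by commit %s" % sha})
        print("commented on PR%d" % bug_id)

if __name__ == "__main__":
    main()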
> At least for the major releases, I think we're doing pretty well on
> timing in terms of predictability: since 3.6, we have release every
> six months: first week of March and first week of September (+- a few
> days). Branching has been similarly predictive: mid-January and
> mid-July.
Indeed, we got a lot better recently (in the last two years), and
mostly thanks to you. :)
We used to vary by +- 3 months, and now we're down to a few days.
Whatever we decide, I think we should make it official by putting it
out somewhere, so people can rely on it.
Right now, even if you're extra awesome, there's nothing telling the
distros and LLVM-based products that it will stay that way if someone
else takes over the responsibility, so they can't adapt.
That's what I meant by "quasi-chaotic".
> If there are many downstream releases for which shifting this schedule
> would be useful, I suppose we could do that, but it seems unlikely
> that there would be agreement on this, and changing the schedule is
> disruptive for those who depend on it.
That's the catch. If we want them to participate, the process has to
have some meaning to them. The fact that not many people do
participate makes it pretty clear to me what that means.
We also need to know better how many other releases already depend on
the upstream process (not just Chromium, for obvious reasons), to be
able to make an informed choice of dates and frequency.
The more well positioned and frequent we are, the more people will
help, but there's a point where the curve bends down, and the cost is
just too great. We need to find the inflection point, and that will
require some initial investigations and guesses, and a lot of fine
tuning later. But if we're all on the same page, I think we can do
that, even if it takes time.
I'm particularly concerned with Android, because they not only have
their own tree with heavily modified LLVM components (e.g.
Compiler-RT), but they also build differently, so their processes are
completely alien to ours. Some of the key reasons why this happened
are:
* They couldn't rely on our releases, as fixing bugs and back-porting
wasn't a thing back then
* They already had their own release schedule, so aligning with ours
brought no extra benefit
* We always expected people to work off trunk, and everyone had to
create their own process
I don't want to change how people work, just to add one more valid way
of working, one that is based on stable upstream releases. :)
> The only reasonable way I see of aligning upstream releases with
> downstream schedules would be to release much more often. This works
> well in Chromium where there's a 6-week staged release schedule. This
> would mean there's always a branch going for the next release, and
> important bug fixes would get merged to that.
Full validation every 6 weeks is just not possible. But a multiple of
that, say every 3~4 months, could be much easier to work around.
> In Chromium we drive
> this from the bug tracker -- it would be very hard to scan each commit
> for things to cherry-pick. This kind of process has a high cost
> though, there has to be good infrastructure for it (buildbots on the
> branch for all targets, for example), developers have to be aware, and
> even then it's a lot of work for those doing the releases. I'm not
> sure we'd want to take this on. I'm also not sure it would be suitable
> for a compiler, where we want the releases to have long life-time.
This works because you have a closed system. As you say, Chromium is
mostly a final product, not a tool to develop other products, and the
validation is a lot simpler.
With Clang, we'd want to involve external releases in it, and it
simply wouldn't scale.
> For the major releases, I've tried to do this. We could certainly
> formalize it by posting it on the web page though.
I think that'd be the first step, yes. But I wanted to start with a
good number. Twice a year? Would three times improve things that much
for the outsiders? Or would just moving the dates be enough for most
people?
That's why I copied so many outsiders, so they can chime in and let us
know what would be good for *them*.
> Most importantly, those folks should get involved :-)
Indeed!
> In practice, we kind of have this for at least some of the targets.
> Maybe we should write this down somewhere instead of me asking for
> (the same) volunteers each time the release process starts?
I give consent to mark me as the ARM/AArch64 release tester for the
foreseeable future. :)
I can also help Sylvestre, Doko, Ed, Jeff, Bero etc. to test on their
systems running on ARM/AArch64 hardware.
Just a small note: Chromium doesn't use the releases, but instead
picks a good revision of the trunk every other week or so.
On 11 May 2016 at 23:47, Bernhard Rosenkränzer
<bernhard.r...@linaro.org> wrote:
> In the OpenMandriva world, we usually try to have clang (our primary
> compiler) as close as possible to the latest upstream stable release.
It would be good to know what's stopping you from using a fully
upstream process (like patch trunk, back-port, pull).
I'm sure you're not the only one having the same problems, and even if
the only takeaway of this thread is that we streamline our back-port
process, all downstream releases will already benefit.
> We're currently following the release_38 branch, and expect to jump on trunk
> as soon as our distro release has happened (because we expect 3.9 to be
> ready before we'll make our subsequent release - better to get on the branch
> we'll be using for the next release early than to suddenly face problems
> when updating to the next release).
That's a very good strategy, indeed. And somewhat independent of how
good the release is.
Early help can also increase downstream participation pre-release, and
will eventually reduce the need for back-ports considerably.
> In the AOSP world, we obviously have to (somewhat) follow what Google does,
> which is typically pick a trunk snapshot and work from there - but we have
> some work underway to extract their patches so we can apply them on top of a
> release or snapshot of our choice (current thought is mostly nightly builds
> for testing).
This sounds awful, and is the very thing I'm trying to minimise. I can
certainly understand the need for local patches, but using different
trunks makes the whole thing very messy.
If the releases are good and timely, and if the back-porting process
is efficient, I hope we'll never have the need to do that.
> From both perspectives, it would be great to have a common set of "known
> good" and relevant patches like gcc abi_tag, or fixes for bugs commonly
> encountered.
> Ideally, I'd like to see those patches just backported on the release_38
> branch to keep the number of external patches low.
Indeed, my point exactly. Downstream releases and distros can easily
share sets of known "good" patches for this or that, but I'd very much
prefer to have as much as possible upstream.
> gcc abi_tag is a bit of a headache in the OpenMandriva world, while we build
> just about everything with clang these days, of course it would be good to
> restore binary compatibility with the bigger distributions (almost all of
> which are using current gcc with the new abi enabled).
Ubuntu LTS has just released with LLVM 3.8, which doesn't have the
abi_tag fixes.
If we don't back-port them to 3.8.1 or 3.8.2, Ubuntu will have to do
that on their own, and that's exactly what I'm trying to solve here.
I also feel like they're not the only one with that problem...
> The timing has been quite predictable lately -- but of course the website
> still says "TBD" for both 3.8.1 and 3.9.0, maybe communicating the (likely)
> plan could use some improvement.
Right, Hans was saying how we should improve in that area, too.
I think that's a very easy consensus to reach, but we still need all
the required people to commit to a more rigorous schedule.
> What can we (this time being OpenMandriva) do? We don't have any great
> compiler engineers, but we're heavy users - would it help to run a mass
> build of all packages for all supported architectures (at this time: i586,
> x86_64, armv7hnl, aarch64) to detect errors on a prerelease builds?
YES PLEASE!! :)
> We have
> the infrastructure in place, even showing a fairly nice list of failed
> builds along with build logs. (But of course we there will be false
> positives caused by e.g. a library update that happened around the same time
> as the compiler update.)
Can you compare the failures of two different builds?
We don't want to fix *all* the problems during the releases, we just
want to know if the new release breaks more stuff than the previous
one.
How we deal with the remaining bugs is irrelevant to this discussion...
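Even something as simple as the sketch below would do, assuming each
mass rebuild can dump a plain list of failed package names (the
one-name-per-line format is an assumption):

#!/usr/bin/env python
# Sketch: compare the failed-package lists of two mass rebuilds, so we
# only look at regressions introduced by the new compiler instead of
# pre-existing failures. Assumes one failed package name per line.
import sys

def read_failures(path):
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())

def main(old_list, new_list):
    old, new = read_failures(old_list), read_failures(new_list)
    regressions, fixed = sorted(new - old), sorted(old - new)
    print("%d new failures, %d fixed" % (len(regressions), len(fixed)))
    for pkg in regressions:
        print("REGRESSED: " + pkg)

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])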
> That would be a good idea IMO, we've run into "current trunk is much better
> than the last stable release anyway" situations more than once (in both
> projects).
I expect that, if distros chime in during the release process, this
will be a lot less of a problem.
It also seems that the timing is not that bad, so maybe the best
course of action now is to streamline the process and, only if the
pressure is still too great, change the timings.
> Backporting some more fixes to the stable branches would be great too (but
> of course I realize that's a daunting and not very interesting task).
I think that's crucial to keeping the releases relevant.
> Sounds good to me, volunteering to participate in both.
Thanks Bero! If you haven't yet, please subscribe to the
release...@lists.llvm.org mailing list.
> While interesting from an upstream perspective, I doubt that will happen
> reliably -- there's too many people working on the build scripts who would
> not automatically have write access to the tree etc. and most distro build
> farms rely on having the build scripts in a common place, so duplication
> would be unavoidable.
I was sceptical about the shared scripts, too. It was an idea that
came to my mind at the last minute, and I'm not sure how much it
would help anyway.
Indeed, I think we're all in agreement there.
Not only having people committed to validating the release, or having
a "known" ballpark time, but a public commitment: web page updates,
emails sent to the appropriate lists, deadlines exposed, etc.
For all those interested, release...@lists.llvm.org is the list where
we discuss the validation and plans. Regardless, the web pages should
also be changed as part of the planning and release processes.
> Data point: At Sony we ship our stuff every 6 months and that isn't
> going to change. Our policy has been to base our releases on upstream
> releases, and given our current lead times, the current upstream release
> schedule is actually not bad, especially if it has stopped drifting.
> Having four releases pretty consistently follow the current schedule is
> extremely positive, thanks!
Excellent, thanks! It seems that the timing is (so far) less of a
problem than I anticipated. This is good news.
> It would help our internal planning to publish the schedule for future
> releases. You never know, we might even be able to fork an internal
> branch in time to help with the release testing, although that is not
> in any way a lightweight process (so no promises).
I'm betting on the fact that this will get easier the more often we do
it, and the more of us that do it.
Filtering false positives is a local cost that cannot be avoided and
can only be automated up to a point, but filtering thousands of real
bugs and reporting them all should in no way be the responsibility of
any single downstream release.
The more downstream releases and OS distributions we have doing the
testing (like Bero said), the less each one of us will have to track.
And the more of them that say they're affected by a bug, the higher
the priority it should have for the upstream release.
We've done something similar with the GCC abi_tag, but that was a
separate email thread, started by Jeff, and external to our release
process. The number of people involved is staggering, and yet the
patches have been sitting there waiting for review for a very long
time.
I think this expresses my point really well. Downstream releases and
OS distros can help us *a lot* with validation, not necessarily
implementing core functionality or even fixing those bugs themselves.
But if we don't fix those bugs or implement the features they need,
they won't have *any* incentive to spend a lot of time and resources
validating the upstream release, and then all of them will spend
*more* time validating their own.
Increasing the importance of stable releases might get us more work to
do for other people with no real benefit to us, yes. But it'll also
bring us a massive validation network and transform Clang/LLVM into a
production compiler from the start (upstream), and that will benefit
everyone, including all users of all downstream and upstream tools.
> There has been some talk about moving toward releasing from upstream
> trunk, or more precisely to streamlining our internal testing process
> in order to allow us to seriously contemplate releasing from upstream
> trunk. I can't argue with the streamlining part, for sure. Whether
> the rest of it works out must remain to be seen.
That's what Chromium does, and it's mainly because of the problems
Bero exposed (trunk is often much better than any release).
But they are also using the tool to compile a very small subset of
programs (mainly Chromium/Chrome), so it's *a lot* easier to validate
that.
When you're actually shipping a toolchain, you have to worry not only
about the programs you have, but also about customers' programs that
you don't (and can't) have access to.
If the release process (including minor releases) ends up as frequent
as possible, wouldn't that be similar to merging to trunk every other
month?
In that case, the validation process will be minimal (almost the same
code), but you'd have to spend a bit more time sifting through patches
(which I want to automate) instead.
Hope that makes some sense.
Hi Kristof,
Indeed, that is true.
But there is only so much that additional (internal) buildbots can do.
If you have any additional validation steps that you only run when you
pick a candidate for release, and there are failures in that set, you
won't see them during the trunk monitoring.
Also, distributions don't have as many compiler developers as most
groups releasing toolchains, and they have to rely on the public
buildbots (which are by no means comprehensive).
This is not about downstream *development*, but the downstream
*release* process, which in most situations requires additional
validation.
Following trunk for your development process is cheaper, as you said,
and most people agree. But if you release your product a few weeks
after the release is out, it may be better to get the release itself,
since it has been validated by many other groups (assuming everyone
joins in), than trunk.
> In my opinion, it would be better overall for the LLVM project if top-of-trunk is
> tested as much as possible, if testing resources are so scarce that a choice
> has to be made between testing top-of-trunk or testing a release branch.
That's the balance I'm trying to get right. :)
There's also the other topic about the community.
LLVM is mostly a toolkit for compilers, yes, and most LLVM developers
are using it as such. But more and more projects are depending on the
actual compiler solution (Clang+LLVM+RT), and judging by the reactions
of most distros I've spoken to, we could improve our OSS community
interactions quite a lot.
I can certainly understand why most companies are worried about their
own processes, but that's undermining the ability of the OSS release
to achieve its goal, which is to be used to build all kinds of
software in the wild.
We have FreeBSD, Mandriva and Android using it by default, which is a
*big* win. We have Debian, RedHat and Canonical worried about the
integration of LLVM in their packages; that's awesome.
I just want us to look at that picture and see if we can do anything
better for them that will, in turn, make our processes slightly
cheaper because of the synergy it will create.
FWIW, for our ARM Compiler product, we follow top-of-trunk, not the
releases. Next to picking up new functionality quicker, it also allows
us to detect regressions in LLVM against our in-house testing quickly,
not 6 months later. We find that when we find a regression within 24 to
48 hours of the commit introducing it, it's much cheaper to get it
fixed.
In my opinion, it would be better overall for the LLVM project if
top-of-trunk is tested as much as possible, if testing resources are so
scarce that a choice has to be made between testing top-of-trunk or
testing a release branch.
> * They couldn't rely on our releases, as fixing bugs and back-porting
> wasn't a thing back then
> * They already had their own release schedule, so aligning with ours
> brought no extra benefit
+1 to this. As a developer of llvmlite and numba, it would be
counter-productive for me to try to follow LLVM ToT, as opposed to
migrating after an X.Y.1 release as we currently do.
Regards
Antoine.
Hi Steve,
I think we're all in agreement on ToT development and testing. This is
more about releases and upstream users, including OS distributions.
> I am not sure why you think Android's compiler-rt is an example of a
> "heavily modified" component. As I see it, our compiler-rt matches upstream
> almost exactly (with one minor mistake from a duplicate merge that results
> in extra copies of some static functions that we don't even build). We do
> have 3 cherry-picks for some MIPS ASan patches, but all of those come
> directly from TOT master.
Sorry, that was a bit heavy-handed... I meant that it's hard to change
Android's copy of RT because of how it's built.
This patch is an example: https://android-review.googlesource.com/#/c/125910/1
It introduces a set of nice changes that cannot go in as-is into LLVM's
compiler-rt because of how RT is built in LLVM.
This is not Android's fault per se, but it's an example of how
proliferation of patches can happen if one downstream repo depends on
another, as is the case for AOSP+Android+(anyone else that develops
Android).
That change should have been developed on upstream RT to begin with,
but the merge would be hard to control on the third-generation copy.
That's the reason why I want to bring all downstream repos to only
depend on upstream LLVM.
> The real problem is retaining history. Release branches don't make this very
> nice, and necessitate that we swap out an entire chunk of history for a
> different chunk of history every time we change releases.
That's interesting... I haven't heard that before, so I don't know
exactly what you mean. :)
Can you give an example?
> It is not 100% clear that Android will want to be dependent on LLVM's
> release schedule. I think that there are definitely benefits to having
> everyone do extra validation, but I am unconvinced that it is the "same"
> validation for everyone that is valuable, hence this might not make things
> go much smoother/faster for those groups.
That's a very good point, and one that I was considering while
thinking about this.
Will Sony's validation be at all relevant to Chromium builds? Probably
not. But ARM's extra validation will be relevant to anyone using ARM
targets, and that in turn will bring more adoption to ARM. Sony's
validation will be interesting for the CPUs and GPUs they validate on,
so that's a bonus to anyone who uses the same hardware.
I can't put a concrete value on any of this, but having people chiming
in is the best way to know what kind of things people do, and how
valuable they are to each other.
There's also the question of how many API versions we create packages
for. If your project uses 3.5, some others use 3.6, etc., then distros
need to keep packages for *all* of them and *also* have Clang
packages, maybe more than one. This can get nasty...
I'm not sure how to solve that... :(
--renato
On 12 May 2016 at 16:57, Daniel Berlin <dbe...@dberlin.org> wrote:
> Errr, Stephen has spoken up here, but my folks are in contact with android
> folks pretty much every week, and I don't think what you are stating is
> correct on a lot of fronts.
I obviously don't speak for Android and have already apologised to
Steve about my choice of words.
> So if android is your particular concern here, i can pretty much state that
> android LLVM is on a release process close to the rest of Google, which is
> 'follow TOT very closely'.
Isn't this what I said?
Following ToT very closely is only good for groups that have high
involvement in LLVM, like Google and Android.
And for that reason (and others), Android doesn't use the upstream
releases. I was wondering if we could do anything so they would.
The major benefit wouldn't be, as I explained, specifically for
Google/Android, but for Android users, Linux users, Linux distros,
LLVM library users (including Renderscript), etc.
--renato
Then I apologise again! :)
My point was that following ToT is perfect for developer teams working
*on* LLVM. Everyone should be doing that, and most people are. Check.
But for some people, including library users, LTS distributions and
some downstream releases (citation needed), having an up-to-date and
stable release *may* (citation needed) be the only stable way to
progress into newer LLVM technology.
> IE make ToT more appealing to follow, have folks follow that.
> Maybe that's true, maybe it's not, but it needs a lot more evidence :)
There were responses on this thread saying it's possible and desirable
to test ToT better rather than only validating releases, and I think
this is great. Mostly because this will ultimately benefit the
releases anyway.
Maybe the solution to the always-too-old-release problem is to get a
better trunk and give up on releases entirely, like Arch Linux's
rolling releases (which I use), so I'm ok with that, too.
As long as we make it a clear and simple process, so upstream users
can benefit too, whatever works. :)
cheers,
On 12 May 2016 at 17:07, Daniel Berlin via llvm-dev
<llvm...@lists.llvm.org> wrote:
> In my talks with a number of these projects, they pretty much don't care
> what anyone else does, and plan to stick to their own import/etc schedules
> no matter what LLVM does with stable releases :)
Is there anything we can do to make them care?
What I heard from them is that the upstream process wasn't clear
enough with regards to fixes, API stability and process (which were
pretty much echoed in this thread).
Maybe, if we fix most of those problems, they would care more?
> (For reference, Google *ships* the equivalent of about 13-16 linux
> distributions in products, uses about 5-6x that internally, and we have a
> single monolithic source repository for the most part. I have the joy of
> owning the third party software policies/etc for it, and so end up
> responsible for trying to deal with maintaining single versions of llvm for
> tens to hundreds of packages).
You sound like the perfect guy to describe a better upstream policy to
please more users.
That's what I did. :)
I copied all the people who have expressed concerns about our release
or back-porting process in some way on this email.
I'm glad Stephen, Bero, Paul, Kristof, Antoine and David replied, as
well as you and Hans, so that we could have a better picture of who's
interested in participating more in the release process, and also how
much our process really hurts them.
It seems I was wrong about many things, but not all of them. For me,
being shown that I *don't* have a problem is a big win, so thanks
everyone! :)
> Maybe. In any case, LLVM (as a community) has to define who the customers
> are that it wants to prioritize, and know what they care about, before you
> can start solving their problems. :)
I'm advocating for them to help us solve their own problems. Which
nicely solves the "who do we care about more" problem. :)
> We already do this a little bit in the community, telling people they need
> to update tests for what their patches break, etc.
Yup. And I'm glad folks have now explicitly said they could help the
releases with some extra testing (building packages with the
pre-release). That, for me, is already a major win.
If that leads to them helping more later, or to us being more
pro-active with their requests than we have been in some cases
(specifically the abi_tag case), that's a bonus (and slightly
selfish).
> This is not unlike that, just at a larger scale. So, for example, saying who
> bears the cost of API compatibility, and to what degree.
The API is slightly harder to solve that way. Most people that use our
APIs are not big companies or projects, and we want to be nice to
them, too, even if they can't help as much as Google.
Same for some distros, where the packagers are responsible for *a lot*
of packages and can't spend all month on a single one.
I don't have a solution to that, and this email was a request to solve
that problem (as well as the distros' one). I also don't know how to
reach them in any way other than this email.
If anyone has better ideas, please feel free to do what you can.
cheers,
The downstream consumers are all going to have their own needs &
schedules and as a result I doubt you'll find any upstream release
schedule & cadence that is acceptable to enough downstream consumers to
make this viable.
With that in mind, I'm a proponent of timing based schedules with as
much predictability as can be made. That allows the downstream
consumers to make informed decisions based on likely release dates.
>
>
> This work involves a *lot* of premises that are not encoded yet, so
> we'll need a lot of work from all of us. But from the recent problems
> with GCC abi_tag and the arduous job of downstream release managers to
> know which patches to pick, I think there has been a lot of wasted
> effort by everyone, and that generates stress, conflicts, etc.
Ideally as we continue to open more lines of communication we won't run
into anything as problematical as the abi_tag stuff. While it's
important, I wouldn't make it the primary driver for where you're trying
to go.
WRT downstream consumers. The more the downstream consumer is wired
into the development community, the more risk (via patches) the
downstream consumer can reasonably take. A downstream consumer without
intimate knowledge of the issues probably shouldn't be taking
incomplete/unapproved patches and applying them to their tree.
>
> 1. Timing
>
> Many downstream release managers, as well as distro maintainers have
> complained about the timing of our releases, and how unreliable they
> are, and how that makes it hard for them to plan their own branches,
> cherry-picks and merges. If we release too early, they miss out
> important optimisations, if we do too late, they'll have to branch
> "just before" and risk having to back-port late fixes to their own
> modified trees.
And this just gets bigger as the project gets more downstream consumers.
Thus I think you pick a time-based release schedule, whatever it may
be, and the downstream consumers can then adjust.
Note that this can have the effect of encouraging them to engage more
upstream to ensure issues of concern to them are addressed in a timely
manner.
>
> 2. Process
>
> Our release process is *very* lean, and that's what makes it
> quasi-chaotic. In the beginning, not many people / companies wanted to
> help or cared about the releases, so the process was what whomever was
> doing, did. The major release process is now better defined, but the
> same happened to the minor releases.
>
> For example, we have no defined date to start, or to end. We have no
> assigned people to do the official releases, or test the supported
> targets. We still rely on voluntary work from all parties. That's ok
> when the release is just "a point in time", but if downstream releases
> and OS distributions start relying on our releases, we really should
> get a bit more professional.
Can't argue with getting a bit more structured, but watch out for going
too far. I'd really like to squish down the release phase on the GCC
side, but it's damn hard at this point.
>
> A few (random) ideas:
>
> * We should have predictable release times, both for starting it and
> finishing it. There will be complications, but we should treat them as
> the exception, not the rule.
Yes.
> * We should have appointed members of the community that would be
> responsible for those releases, in the same way we have code owners
> (volunteers, but no less responsible), so that we can guarantee a
> consistent validation across all relevant targets. This goes beyond
> x86/ARM/MIPS/PPC and includes the other targets like AMD, NVidia, BPF,
> etc.
Good luck ;-) Don't take this wrong, but wrangling volunteers into
release work is hard. It's sometimes hard to find a way to motivate
them to focus on issues important for the release when there's new
development work they want to be doing.
Mark Mitchell found one good tool for that in his years as the GCC
release manager -- namely tightening what was allowed on the trunk as
the desired release date got closer. ie, there's a free-for-all period,
then just bugfixes, then just regression fixes, then just doc fixes.
Developers then had a clear vested interest in moving the release
forward -- they couldn't commit their new development work until the
release manager opened the trunk for new development.
> * OS distribution managers should test on their builds, too. I know
> FreeBSD and Mandriva build by default with Clang. I know that Debian
> has an experimental build. I know that RedHat and Ubuntu have LLVM
> packages that they do care. All that has to be tested *at least* every
> major release, but hopefully on all releases. (those who already do
> that, thank you!)
LLVM's usage on the Fedora side is still small, and smaller still within
Red Hat. But we do have an interest in this stuff "just working".
Given current staffing levels I would expect Fedora to follow the
upstream releases closely with minimal changes.
> * Every *new* bug found in any of those downstream tests should be
> reported in Bugzilla with the appropriate category (critical / major /
> minor). All major bugs have to be closed for the release to be out,
> etc. (the specific process will have to be agreed and documented).
Yes, GCC has a similar policy and it has worked reasonably well. In
fact it's a good lever for the release manager if you've got a locked
trunk. If the developers don't address the issues, then the release
doesn't branch and the trunk doesn't open for development. It aligns
the release manager and a good chunk of the development team's goals.
jeff
First, thanks to everyone who replied; it made many things much
clearer to many people (especially me!).
But it would be good to do a recap, since some discussions ended up
list-only, and too many people said too many things to keep track of.
I read the whole thread again, and this is my summary of what people
do/want the most.
TL;DR version:
* Upstream needs to be more effective at communicating and
formalising the process.
* Downstream / distros need to be more vocal and more involved,
respectively, about testing ToT and the releases.
* Upstream may need to change some processes (bugzilla meta, feature
freeze, more back-ports, git branches) to facilitate downstream help.
* Downstream needs to be more pushy with their back-port suggestions,
so we do them upstream more often.
* We all need to come up with a better way of tracking patches for
back-port (separate thread ongoing)
Now, the (slightly) longer version:
By far, the most important things are cadence and transparency. We
already do a good job at the former, not so much at the latter.
The proposals were:
- Formalise the dates/duration on a webpage, clarify volunteered
roles, channels (release list, etc).
- Formalise how we deal with proposals (llvm-commits to release list
/ owner); some suggested using git branches (I like this, but we're
on svn).
- Formalise how we deal with bugs (bugzilla meta, make it green),
this could also be used as back-port proposal.
- Formalise how we deal with back-ports, how long we do, overlapping
minor releases, etc.
The other important factor is top-of-tree validation as a way to get
more stable releases. We have lots of buildbots and the downstream
release folks are already validating ToT enough. We need the distros
in as well.
Same goes for releases, but that process isn't clear, it needs to be.
The proposals were:
- Have distros build all packages with ToT as often as possible, report bugs.
- Do the same process on stable branches, at least once (RC1).
- Downstream releases also need to acknowledge when a validation came
back green (instead of just reporting bugs).
- All parties need to coordinate the process in the same place, for
example, a meta bug in bugzilla, and give their *ack* when things are
good for them.
The third big point was following changes on long running downstream
stable branches. Many releases/distros keep stable local branches for
many years, and have to back-port on their own or keep local patches
for that long.
The proposals were:
- Better tracking of upstream patch streams, fixes for old stable
release bugs. Making a bug depend on an old release meta may help.
- Distros and releases report keeping lots of local patches (some
already on trunk). Back-porting them to minor releases would help
everyone, keeping old releases for longer, too.
- Create patch bundles and publish them somewhere (changelog?
bugzilla?) so that downstream knows all the patches to back-port
without duplicating effort (a rough sketch of how that could work
follows this list).
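For that last point, a rough illustration of what generating such a
bundle could look like (the ref names and output directory are
assumptions, not a convention anyone has agreed on):

#!/usr/bin/env python
# Rough illustration: export everything merged to a release branch
# since the last point release as a set of patches downstream can pick
# up. Ref names and the output directory are assumptions.
import subprocess
import sys

def make_bundle(since="release_3.8.0", branch="origin/release_38",
                outdir="backports"):
    # One .patch file per commit in since..branch, numbered in order.
    subprocess.check_call(
        ["git", "format-patch", "-o", outdir, "%s..%s" % (since, branch)])
    # Also print a short changelog that could go on a web page or bug.
    listing = subprocess.check_output(
        ["git", "log", "--oneline", "%s..%s" % (since, branch)])
    sys.stdout.write(listing.decode("utf-8", "replace"))

if __name__ == "__main__":
    make_bundle()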
Other interesting points were:
- Moving versions for small projects / API users is not an easy task.
Having release numbers that make sense, and multiple releases at a
time, may help avoid introducing API changes into old releases.
- The volume of changes is too great. Having a slow-down (staged
freeze) may help developers invest more in stability just before the
branch. We sort of have that; it needs formalisation.
- If we have major bugs after the branch, we stop the whole process,
which makes it slower and more unpredictable. Feature freeze may help
with that, but can affect the branch date.
- Volunteering is still required, but we could document who does what,
and change the docs as that changes.
- Upstreaming package maintenance scripts got mixed views, but I
still encourage those who want to, to try.
cheers,
--renato
[I'm the FreeBSD LLVM package maintainer]
I'm not really worried about patches. Where I need them, I've got
infrastructure in place to fetch them and add them to the build system.
I tend to track minor releases and GCing patches during that process
isn't much work. If there was a more regular stream of changes, it
would be easy enough to follow. Mostly I apply patches when someone
actually hits a regression so people don't have to rebuild/reinstall for
patches that often don't apply to our users. This is particularly
important to 3.8+ where shared builds have been broken for quite some
time so the full package takes up >1GB of disk space.
For llvm-devel I have a script to grab the current git checksums from
the github API. It works well and is easy to use. For actual releases,
I consider the tarballs to be the source of truth.
-- Brooks
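As an illustration of the kind of script Brooks describes (not his
actual script; the repository list and the "master" branch are
assumptions), this just asks the GitHub API for the latest commit of
each llvm-mirror repository:

#!/usr/bin/env python
# Minimal sketch: fetch the current commit checksum of each llvm-mirror
# repository from the GitHub API, e.g. to pin an llvm-devel port to a
# known revision. Repository list and branch name are assumptions.
import json
import urllib.request

REPOS = ["llvm", "clang", "compiler-rt"]

def head_sha(repo, branch="master"):
    url = "https://api.github.com/repos/llvm-mirror/%s/commits/%s" % (repo, branch)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))["sha"]

if __name__ == "__main__":
    for repo in REPOS:
        print("%-12s %s" % (repo, head_sha(repo)))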