[LLVMdev] Proposal to add Bitcode version field to bitcode file wrapper


Yung, Douglas

Sep 26, 2014, 7:31:49 PM
to llv...@cs.uiuc.edu, cfe...@cs.uiuc.edu

Hi,

 

We would like to add a version number to the bitcode wrapper. This feature would allow easier identification of what compiler produced the bitcode so that known incompatibilities with LTO compilation could be detected. Roughly speaking, this version number would consist of the major, minor and optionally the patch version of the compiler used to produce the bitcode. The version information would be encoded in 4 bytes, with the first byte representing the major version number, the second byte the minor version number, and the third and fourth bytes optionally encoding the patch version or other information. As to where to place this information, we are considering two different possibilities for updating the bitcode wrapper specification.
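To make the 4-byte layout concrete, here is a minimal sketch in Python. The function names are hypothetical, and the byte order of the two-byte patch value (little-endian here) is an assumption, since the proposal does not specify it.

```python
import struct

def pack_version(major: int, minor: int, patch: int = 0) -> bytes:
    """Encode a compiler version into the proposed 4-byte layout."""
    # "B" is one unsigned byte (major, then minor); "H" is an unsigned
    # 16-bit value for the optional patch version.
    # "<" selects little-endian with no padding (assumed, not specified).
    return struct.pack("<BBH", major, minor, patch)

def unpack_version(raw: bytes) -> tuple:
    """Decode the 4-byte field back into (major, minor, patch)."""
    return struct.unpack("<BBH", raw)

encoded = pack_version(3, 5, 2)
decoded = unpack_version(encoded)
```

With one byte each for major and minor, values above 255 cannot be represented; `struct.pack` raises `struct.error` in that case.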

 

The first is to simply add a single 32-bit field at the end of the existing bitcode wrapper format. This would result in the new structure looking like this:
 
[Magic_{32}, Version_{32}, Offset_{32}, Size_{32}, CPUType_{32}, BitcodeVersion_{32}]
 
All of the existing fields would keep their current meanings, and the new field BitcodeVersion is simply appended with the format described in the first paragraph.
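A rough sketch of how a reader might handle both the existing five-field wrapper and the proposed six-field one, written in Python for brevity. The wrapper magic `0x0B17C0DE` and the little-endian 32-bit fields come from the LLVM bitcode documentation; the function name and the use of `Offset` to detect the extra field are assumptions made for illustration only, since the thread does not say how old and new wrappers would be told apart.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE              # wrapper magic from the LLVM bitcode docs
LEGACY_HEADER = struct.Struct("<5I")    # Magic, Version, Offset, Size, CPUType
EXTENDED_HEADER = struct.Struct("<6I")  # ... plus the proposed BitcodeVersion

def parse_wrapper(data: bytes) -> dict:
    """Parse a bitcode wrapper header, with or without BitcodeVersion."""
    magic, version, offset, size, cpu_type = LEGACY_HEADER.unpack_from(data, 0)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a bitcode wrapper")
    bitcode_version = None
    # Assumed heuristic: if the bitcode starts at or after byte 24, treat
    # the sixth 32-bit word as the appended BitcodeVersion field.
    if offset >= EXTENDED_HEADER.size:
        bitcode_version = EXTENDED_HEADER.unpack_from(data, 0)[5]
    return {
        "version": version,
        "offset": offset,
        "size": size,
        "cpu_type": cpu_type,
        "bitcode_version": bitcode_version,
    }
```

An old-style header with `Offset` equal to 20 yields `bitcode_version` of `None`, so existing files would still parse.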
 
A second idea was to use the existing Version field in the bitcode wrapper format to store the bitcode version information. According to the documentation (http://llvm.org/docs/BitCodeFormat.html#bitcode-wrapper-format) this field is currently always set to 0. This would allow us to make use of what is (presumably) an unused field.

 

As this is a feature that we feel would be beneficial to the community, we wanted to get feedback on the design for our upcoming patches. Any thoughts or opinions on this would be greatly appreciated.

 

Thanks!

 

Douglas Yung

Bob Wilson

Sep 26, 2014, 7:41:11 PM
to Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
Bitcode backward compatibility, at least for the current major version, is supposed to make this unnecessary. Can you provide more information about what “known incompatibilities” you’re seeing?

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Yung, Douglas

Sep 26, 2014, 8:20:47 PM
to Bob Wilson, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu

Sorry if I was unclear. There are currently no “known incompatibilities” that I am aware of, although I fully admit to not being an expert on the topic. The idea is that we add versioning information to the bitcode so that if an issue were discovered, it could be easily detected and dealt with.

 

Douglas Yung

Duncan Exon Smith

Sep 26, 2014, 9:12:41 PM
to Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
If we discover an issue, shouldn't we just fix it?

-- dpnes

Sean Silva

Sep 26, 2014, 9:30:23 PM
to Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On Fri, Sep 26, 2014 at 5:18 PM, Yung, Douglas <dougla...@playstation.sony.com> wrote:

Sorry if I was unclear. There are currently no “known incompatibilities” that I am aware of, although I fully admit to not being an expert on the topic. The idea is that we add versioning information to the bitcode so that if an issue were discovered, it could be easily detected and dealt with.


It sounds like time would be better invested in improving the testing of our bitcode compatibility promise.

-- Sean Silva

Greg Bedwell

Sep 27, 2014, 6:25:51 AM
to Sean Silva, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
As I understand it, the bitcode compatibility promise doesn't extend as far as debug info metadata (happy to be wrong here!).  I think we have a use case where we need to guarantee that debug information from any two arbitrary bitcode files going into an LTO link will result in the expected/correct debug information going into the resulting ELF file. Unless we can be sure that this will always work between bitcode files generated by different versions, we'd need some way of flagging an incompatibility and giving the user useful information about the reason.

--Greg Bedwell
SN Systems - Sony Computer Entertainment Group

_______________________________________________
cfe-dev mailing list
cfe...@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev


David Blaikie

Sep 27, 2014, 11:54:29 AM
to Greg Bedwell, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
In general, the story these days is that we won't /crash/ on old debug info metadata formats; we'll just drop the old debug info metadata. So you won't get debug info, but you can still link/use your old IR libraries with new IR/compiler.

Alex Rosenberg

Sep 28, 2014, 2:44:50 AM
to Greg Bedwell, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
How is this use case different from the LTO-supported toolchains shipped by other vendors such as Apple? Do they have this theoretical problem too?

If the issue is solely constrained to debug info metadata, then why not use metadata to describe the format/version of the debug info?

Alex

David Blaikie

Sep 28, 2014, 4:09:49 AM
to Alex Rosenberg, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On Sat, Sep 27, 2014 at 11:35 PM, Alex Rosenberg <al...@leftfield.org> wrote:
How is this use case different from the LTO-supported toolchains shipped by other vendors such as Apple? Do they have this theoretical problem too?

If the issue is solely constrained to debug info metadata, then why not use metadata to describe the format/version of the debug info?

FWIW (I haven't followed the rest of this thread) - that's what we/Apple have done. There's a module flag metadata that specifies the debug info metadata schema version, then the verifier (or some other pass, I forget how it's phrased) can check if the version matches the current LLVM's debug info metadata version, and if it doesn't match, it strips out all the debug info metadata.
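The strip-on-mismatch behaviour described above can be sketched roughly as follows. This is a Python model, not LLVM code: the dictionary-based module, the constant value and the function name are hypothetical stand-ins, and the flag key simply mirrors the module flag mentioned in the message.

```python
# Placeholder value; the real constant lives in LLVM and changes over time.
CURRENT_DEBUG_INFO_VERSION = 3

def strip_stale_debug_info(module: dict,
                           current_version: int = CURRENT_DEBUG_INFO_VERSION) -> bool:
    """Drop all debug info if the module's recorded version does not match.

    Returns True if any debug info was actually removed.
    """
    recorded = module.get("flags", {}).get("Debug Info Version")
    if recorded == current_version:
        return False  # schema matches, keep the debug info
    # Missing or mismatched version: the debug info cannot be trusted,
    # so drop it all while leaving the rest of the module usable.
    had_debug_info = bool(module.get("debug_info"))
    module["debug_info"] = []
    return had_debug_info
```

The rest of the module is left untouched, which matches the point made earlier in the thread: old IR still links, it just loses its debug info.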

Robinson, Paul

Sep 28, 2014, 6:15:59 PM
to Greg Bedwell, Sean Silva, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu

| Bitcode backward compatibility, at least for the current major version, is supposed to make this unnecessary.

 

I think the "at least for the current major version" part is one thing that concerns us.  LLVM 4.0 will promise to read LLVM 3.4 bitcode, but LLVM 4.1 will not, according to my understanding of the current promise.  Smoothly identifying that point and being able to provide an intelligent diagnostic seems like goodness. Hard to distinguish "old" bitcode from "broken" bitcode without recording version info of some kind, and the sooner we start recording the version number the more completely we're able to diagnose the situation properly when the time comes.
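As an illustration of the kind of diagnostic a recorded version would make possible, here is a small Python sketch. The compatibility window it encodes (an X.0 reader accepts major version X-1, later X.y readers do not) is only the understanding stated in this message, not a confirmed policy, and all names are hypothetical.

```python
def oldest_readable_major(reader_major: int, reader_minor: int) -> int:
    # Assumed policy, as understood in the message above: an X.0 reader
    # still accepts bitcode from major version X-1; X.1 and later do not.
    return reader_major - 1 if reader_minor == 0 else reader_major

def diagnose(recorded, reader) -> str:
    """Classify bitcode given its recorded (major, minor) and the reader's."""
    if recorded is None:
        # Without version info, "old" and "broken" look the same.
        return "malformed or unversioned bitcode (cannot tell which)"
    if recorded[0] < oldest_readable_major(*reader):
        return "bitcode from LLVM %d.%d is too old for LLVM %d.%d" % (recorded + reader)
    if recorded > reader:
        return "bitcode from LLVM %d.%d is newer than this LLVM %d.%d" % (recorded + reader)
    return "ok"
```

The `None` branch is the situation the message is worried about: with no recorded version, the reader has nothing better to say than "something is wrong".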

--paulr

Renato Golin

Sep 29, 2014, 4:31:30 AM
to Robinson, Paul, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On 28 September 2014 23:10, Robinson, Paul
<Paul_R...@playstation.sony.com> wrote:
> | Bitcode backward compatibility, at least for the current major version, is
> supposed to make this unnecessary.

It didn't use to work that way, and I'm not sure we want it at all.


> I think the "at least for the current major version" part is one thing that
> concerns us. LLVM 4.0 will promise to read LLVM 3.4 bitcode, but LLVM 4.1
> will not, according to my understanding of the current promise.

I've never heard such promises, but even if you're right (that there
is a promise), we cannot enforce it, since we have no tests to make
sure we do.

Right now, the only guarantee I know exists (and it's a new one, from
3.4 onwards) is that minor releases won't break ABI or API
compatibility, which includes IR logic. So 3.4.2 is guaranteed to
parse 3.4 IR but not 3.3 or 3.5.x.


> Smoothly identifying that point and being able to provide an intelligent diagnostic
> seems like goodness. Hard to distinguish "old" bitcode from "broken" bitcode
> without recording version info of some kind, and the sooner we start
> recording the version number the more completely we're able to diagnose the
> situation properly when the time comes.

There are two problems with this:

1. Due to the nature of our development strategy, IR compatibility can
be broken between two releases, which means any two commits within the
same revision can fail to parse (or parse incorrectly) IR from each
other. Do we care about between-release compatibility?

Some people get specific commits, rather than releases, for timing
reasons, for their products, and in doing so, you could get a commit
that is actually IR incompatible with the next major release. If you
care about compatibility, you should increment the IR version every
time something radical changes, which can be multiple times between
the same two releases, or span multiple releases.

IR versioning should be completely independent of major / minor
release cycles. The hard part is to truly detect, and validate, IR
compatibility changes.

2. IR incompatibility is different from metadata incompatibility. If
the IR is incompatible (say we drop or add a new type, or we change
how exceptions are propagated), the new parser will not understand the
old and vice-versa. But if metadata changes, it can still be parsed,
and as David said, if we can't understand it, we just drop it.

If you want your parser to break the least, you'll have to have at
least two versions: IR and Debug. Other metadata versioning can be done
individually (since they change at different rates). You may want to
warn on stale metadata status (since it's not an error), but you
should stop on stale IR.


Finally, both problems end up in the same place: how do you validate
this? We'd have to add a new class of tests, and for every new change
in IR/metadata, we'd increase the version number and create a test
that checks old parser+new syntax and old syntax+new parser and makes
sure they fail/warn.

You'd also need to have a table of major releases vs. IR versions, so
that in the error/warning message you tell: please use LLVM M.N
instead.
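The scheme outlined above (separate IR and debug versions, a hard stop on stale IR, a warning on stale metadata, and a table mapping IR versions to releases) could look roughly like this in Python. Every number and release name in the table is a made-up placeholder, and the names are hypothetical; no such table exists in LLVM.

```python
# Hypothetical table: IR version -> LLVM release that can read it.
# The entries are placeholders for illustration, not real data.
IR_VERSION_TO_RELEASE = {1: "3.4", 2: "3.5"}
CURRENT_IR_VERSION = 2
CURRENT_DEBUG_VERSION = 1

class StaleIRError(Exception):
    """Raised when the IR version itself cannot be parsed safely."""

def check_versions(ir_version: int, debug_version: int) -> list:
    """Stop on stale IR; return warnings for stale debug metadata."""
    if ir_version != CURRENT_IR_VERSION:
        release = IR_VERSION_TO_RELEASE.get(ir_version, "an unknown release")
        raise StaleIRError(
            "IR version %d is not supported; please use LLVM %s instead"
            % (ir_version, release)
        )
    warnings = []
    if debug_version != CURRENT_DEBUG_VERSION:
        # Stale metadata is not an error: warn, then drop it.
        warnings.append(
            "debug info version %d is stale; dropping debug info" % debug_version
        )
    return warnings
```

Keeping the two checks separate reflects the distinction drawn in point 2: the IR check is fatal, while the metadata check only degrades the output.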

cheers,
--renato

Robinson, Paul

Sep 29, 2014, 10:28:20 AM
to Renato Golin, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu


> -----Original Message-----
> From: Renato Golin [mailto:renato...@linaro.org]
> Sent: Monday, September 29, 2014 1:27 AM
> To: Robinson, Paul
> Cc: Greg Bedwell; Sean Silva; Yung, Douglas; cfe...@cs.uiuc.edu;
> llv...@cs.uiuc.edu
> Subject: Re: [LLVMdev] [cfe-dev] Proposal to add Bitcode version field to
> bitcode file wrapper
>
> On 28 September 2014 23:10, Robinson, Paul
> <Paul_R...@playstation.sony.com> wrote:
> > | Bitcode backward compatibility, at least for the current major
> version, is
> > supposed to make this unnecessary.
>
> It didn't use to work that way, and I'm not sure we want it at all.
>
>
> > I think the "at least for the current major version" part is one thing
> that
> > concerns us. LLVM 4.0 will promise to read LLVM 3.4 bitcode, but LLVM
> 4.1
> > will not, according to my understanding of the current promise.
>
> I've never heard such promises, but even if you're right (that there
> is a promise), we cannot enforce it, since we have no tests to make
> sure we do.

That promise is what I understood from a discussion within the past month,
e.g. http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076815.html
If I misunderstood, clarification on the clarification would be helpful. ;-)

And recently tests have appeared that I thought were intended to validate
this sort of thing, e.g. r218297.
--paulr

Renato Golin

Sep 29, 2014, 10:36:47 AM
to Robinson, Paul, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On 29 September 2014 15:16, Robinson, Paul
<Paul_R...@playstation.sony.com> wrote:
> That promise is what I understood from a discussion within the past month,
> e.g. http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076815.html
> If I misunderstood, clarification on the clarification would be helpful. ;-)

I see, I missed that one. My concerns seem to be similar to Bob's,
though in the past, when we discussed the same topic, there was one
major hurdle to implement that: we'd have to know what features we
removed / stopped supporting and warn on what version brackets
supported that feature. This can only grow as the compiler ages.

Enforcing backwards compatibility with only the major version created
another hurdle: we'd only be able to deprecate bad/temporary features
every few years, creating another bag of legacy. Warnings can be made,
and deprecation of whole sets of features will happen at major
version, which will stress the release validation and increase the
influx of bugs on all major releases.

Whenever I think of any of that, I remember Chris' words: "LLVM IR is
a compiler IR. Nothing more, nothing less". I don't think we should
try to standardise that too much.

My tuppence.

Bob Wilson

Sep 29, 2014, 12:02:57 PM
to Renato Golin, llv...@cs.uiuc.edu, Yung, Douglas, cfe...@cs.uiuc.edu

> On Sep 29, 2014, at 7:29 AM, Renato Golin <renato...@linaro.org> wrote:
>
> On 29 September 2014 15:16, Robinson, Paul
> <Paul_R...@playstation.sony.com> wrote:
>> That promise is what I understood from a discussion within the past month,
>> e.g. http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076815.html
>> If I misunderstood, clarification on the clarification would be helpful. ;-)
>
> I see, I missed that one. My concerns seem to be similar to Bob's,
> though in the past, when we discussed the same topic, there was one
> major hurdle to implement that: we'd have to know what features we
> removed / stopped supporting and warn on what version brackets
> supported that feature. This can only grow as the compiler ages.
>
> Enforcing backwards compatibility with only the major version created
> another hurdle: we'd only be able to deprecate bad/temporary features
> every few years, creating another bag of legacy. Warnings can be made,
> and deprecation of whole sets of features will happen at major
> version, which will stress the release validation and increase the
> influx of bugs on all major releases.
>
> Whenever I think of any of that, I remember Chris' words: "LLVM IR is
> a compiler IR. Nothing more, nothing less". I don't think we should
> try to standardise that too much.

It is a compiler IR but with LTO we need backward compatibility for IR in object files. This is not a new requirement. We have required auto-upgrade support for old bitcode files for as long as I have worked on LLVM for exactly this reason. The testing to enforce that has been pretty minimal, but we’re now making an effort to be more systematic about it.

Renato Golin

Sep 29, 2014, 12:31:59 PM
to Bob Wilson, llv...@cs.uiuc.edu, Yung, Douglas, cfe...@cs.uiuc.edu
On 29 September 2014 16:58, Bob Wilson <bob.w...@apple.com> wrote:
> It is a compiler IR but with LTO we need backward compatibility for IR in object files. This is not a new requirement. We have required auto-upgrade support for old bitcode files for as long as I have worked on LLVM for exactly this reason. The testing to enforce that has been pretty minimal, but we’re now making an effort to be more systematic about it.

By all means, I think we're on the same page as far as testing and the
burden of keeping the compatibility, I just didn't remember it being
such a big requirement.

Maybe I'm making a bigger issue than it needs to be... which is
actually quite likely. :)

cheers,
--renato

Eric Christopher

Sep 29, 2014, 2:37:50 PM
to Bob Wilson, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu

Except, backward compatibility breaks are allowed at major versions.
See, for example, the type system rewrite.

-eric

Renato Golin

Sep 29, 2014, 2:58:00 PM
to Eric Christopher, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On 29 September 2014 19:26, Eric Christopher <echr...@gmail.com> wrote:
> Except, backward compatibility breaks are allowed at major versions.
> See, for example, the type system rewrite.

Yes, that was stated in the original email. I think we all agree that
this is an important part of the policy.

cheers,
--renato

jahanian

Sep 29, 2014, 7:45:22 PM
to llvmdev
Hi All,
We have an internal request to allow ‘_’ in addition to ‘.’ as a version tuple separator.
So, in addition to ‘major[.minor[.subminor]]’, the proposal is to allow ‘major[_minor[_subminor]]’
as well. Is there a reason we shouldn’t do this?
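For reference, the two grammars can be captured in a single pattern. The Python sketch below is illustrative only (the function name is hypothetical, and this is not how Clang parses version tuples); it assumes that a tuple must use one separator consistently, which is how the two bracketed forms above read.

```python
import re

# major, then optionally (separator, minor), then optionally the SAME
# separator again followed by subminor. The backreference \2 rejects
# mixed forms such as "10.4_1".
VERSION_TUPLE = re.compile(r"(\d+)(?:([._])(\d+)(?:\2(\d+))?)?")

def parse_version_tuple(text: str) -> tuple:
    """Return (major, minor, subminor); missing parts are None."""
    match = VERSION_TUPLE.fullmatch(text)
    if match is None:
        raise ValueError("invalid version tuple: %r" % text)
    major, _sep, minor, subminor = match.groups()
    return tuple(int(part) if part is not None else None
                 for part in (major, minor, subminor))
```

Whether mixed separators should be accepted is a design choice the message leaves open; dropping the backreference in favour of a second `[._]` would permit them.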

- Thanks, Fariborz

Yung, Douglas

Oct 5, 2014, 11:18:59 PM
to David Blaikie, Alex Rosenberg, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu

Hi –

 

I realize the thread has drifted a little, but I wanted to get back to my original proposal. I would like to make a change to the bitcode file wrapper to include the version of llvm that produced the bitcode. I would like to write this version into the unused version field that currently exists. Would there be any objections to this change?

 

Since the original wrapper is only emitted for Darwin targets, I ran an experiment where I took bitcode files produced by the official Apple tools and then modified the version field to be non-zero. From my simple tests, there seemed to be no problems with existing tools when the version field is non-zero.
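An experiment of this kind can be reproduced with a few lines of Python. The sketch below is not the script used for the tests described above; it is an assumed reconstruction that overwrites the 32-bit `Version` field (bytes 4 to 7 of the wrapper, little-endian per the bitcode documentation) and leaves everything else untouched. The function name is hypothetical.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE  # wrapper magic from the LLVM bitcode docs

def set_wrapper_version(data: bytes, new_version: int) -> bytes:
    """Return a copy of a wrapped bitcode file with the Version field replaced."""
    (magic,) = struct.unpack_from("<I", data, 0)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a bitcode wrapper")
    patched = bytearray(data)
    # Version is the second 32-bit field, so it starts at byte offset 4.
    struct.pack_into("<I", patched, 4, new_version)
    return bytes(patched)
```

Feeding the patched file to the existing tools then shows whether anything depends on the field being zero.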

 

Douglas Yung

Sean Silva

Oct 6, 2014, 3:32:51 PM
to Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On Sun, Oct 5, 2014 at 8:10 PM, Yung, Douglas <dougla...@playstation.sony.com> wrote:

Hi –

 

I realize the thread has drifted a little, but I wanted to get back to my original proposal. I would like to make a change to the bitcode file wrapper to include the version of llvm that produced the bitcode. I would like to write this version into the unused version field that currently exists. Would there be any objections to this change?


If the version field is currently unused, I don't see the harm in filling it in. It probably would be fine to just use it to store the LLVM major version, so that we can detect incompatibilities. I still would be opposed to using a granularity finer than the "intended" breakage cycle (major versions), since that would give the impression that it is "ok" to break compatibility since it can be detected through the version field.

-- Sean Silva

Robinson, Paul

Oct 6, 2014, 5:27:23 PM
to Sean Silva, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu

One advantage to including the minor version number is that it allows the IR to evolve in more flexible ways (semantic difference without a syntactic difference) but auto-upgrade stays feasible. That is, it can help *avoid* breakages.

Another advantage would be for cases like debug-info metadata, where IIRC there's no backward compatibility at all; we just throw away old stuff. If we had a minor version in there, it becomes more reasonable to support one-minor-version backward compatibility.  (This is a use case that we would be interested in.)

--paulr

David Blaikie

Oct 6, 2014, 5:43:47 PM
to Robinson, Paul, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On Mon, Oct 6, 2014 at 2:24 PM, Robinson, Paul <Paul_R...@playstation.sony.com> wrote:

One advantage to including the minor version number is that it allows the IR to evolve in more flexible ways (semantic difference without a syntactic difference) but auto-upgrade stays feasible. That is, it can help *avoid* breakages.

Another advantage would be for cases like debug-info metadata, where IIRC there's no backward compatibility at all; we just throw away old stuff. If we had a minor version in there, it becomes more reasonable to support one-minor-version backward compatibility.  (This is a use case that we would be interested in.)


FWIW we already have a debug info metadata version flag which could be used to facilitate this functionality. Kind of a painful thing to do, depending on the particular breakage. Current strategy is that old version debug info is dropped, no reason it couldn't be upgraded instead.

- David

Eric Christopher

Oct 6, 2014, 5:49:08 PM
to Robinson, Paul, Yung, Douglas, cfe...@cs.uiuc.edu, llv...@cs.uiuc.edu
On Mon, Oct 6, 2014 at 2:24 PM, Robinson, Paul
<Paul_R...@playstation.sony.com> wrote:
> One advantage to including the minor version number is that it allows the IR
> to evolve in more flexible ways (semantic difference without a syntactic
> difference) but auto-upgrade stays feasible. That is, it can help *avoid*
> breakages.
>
> Another advantage would be for cases like debug-info metadata, where IIRC
> there's no backward compatibility at all; we just throw away old stuff. If
> we had a minor version in there, it becomes more reasonable to support
> one-minor-version backward compatibility. (This is a use case that we would
> be interested in.)
>

This just isn't something we're willing to support now.

-eric
