[LLVMdev] [RFC] Removing static initializers for command line options

256 views
Skip to first unread message

Chris Bieneman

unread,
Aug 18, 2014, 2:51:30 PM8/18/14
to LLVM Dev
Today command line arguments in LLVM are global variables. An example argument from Scalarizer.cpp is:

static cl::opt<bool> ScalarizeLoadStore
  ("scalarize-load-store", cl::Hidden, cl::init(false),
   cl::desc("Allow the scalarizer pass to scalarize loads and store"));

This poses a problem for clients of LLVM that aren’t traditional compilers (i.e. WebKit, and Mesa). My proposal is to take a phased approach at addressing this issue.

The first phase is to move the ownership of command line options to a singleton, OptionRegistry. The OptionRegistry can be made to work with the existing global command line definitions so that the changes to migrate options can be done in small batches. The primary purpose of this change is to move the ownership of the command line options out of the global scope, and to provide a vehicle for threading them through the compiler. At the completion of this phase, all the command line arguments will be constructed during LLVM initialization and registered under the OptionRegistry. This will replace the 100’s of static initialized cl::opt objects with a single static initialized OptionRegistry.

With this change options can be constructed during initialization. For the example option above the pass initialization would get a line like:

cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
  "scalarize-load-store", cl::Hidden, cl::init(false),
  cl::desc("Allow the scalarizer pass to scalarize loads and store"));

Also the pass would add a boolean member to store the value of the option which would be initialized in the pass’s constructor like this:

ScalarizeLoadStore = cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");

These two operations need to occur at separate times due to object lifespans. At the time when command lines are parsed passes have been initialized, but not constructed. That means making options live in passes doesn’t really work, but since we want the data to be part of the pass we need to initialize it during construction.

A large part of this phase will be finding appropriate places for all the command line options to be initialized, and identifying all the places where the option data will need to be threaded through the compiler. One of the goals here is to get rid of all global state in the compiler to (eventually) enable better multi-threading by clients like WebKit.

The second phase is to split the OptionRegistry into two pieces. The first piece is the parsing logic, and the second piece is the Option data store. The main goal of this phase is to make the OptionRegistry represent everything needed to parse command line options and to define a new second object, OptionStore, that is populated with values by parsing the command line. The OptionRegistry will be responsible for initializing “blank” option stores which can then be populated by either the command line parser, or API calls.

The OptionRegistry should remain a singleton so that the parsing logic for all options remains universally available. This is required to continue supporting plugin loadable options.

The OptionStore should be created when a command line is parsed, or by an API call in libraries, and can be passed through the pass manager and targets to populate option data. The OptionStore should have a lifetime independent of contexts, and pass managers because it can be re-used indiscriminately.

The core principle in this design is that the objects involved in parsing options only need to exist once, but you need a full list of all options in order to parse a command line. You should be able to have multiple copies of the actual stored option data. Having multiple copies of the data store is one step toward enabling two instances of LLVM in the same process to use optimization passes with different options.

I haven’t come up with a specific implementation proposal for this yet, but I do have some rough ideas. The basic flow that I’m thinking of is for command line parsing to create an object that maps option names to their values without any of the parsing data involved. This would allow for either parsing multiple command lines, or generally just constructing multiple option data stores. **Here is where things get foggy because I haven’t yet looked too deep.** Once you construct a data store it will get passed into the pass manager (and everywhere else that needs it), and it will be used to initialize all the option values.

There has been discussion about making the option store reside within the context, but this doesn’t feel right because the biggest consumer of option data is the passes, and you can use a single pass manager with multiple contexts.

Thanks,
-Chris
cl_opt.diff

Rafael Espíndola

unread,
Aug 18, 2014, 5:47:47 PM8/18/14
to Chris Bieneman, LLVM Dev

For the first step it might be better to keep the option value as a
global. That way we only switch to using something like


static bool ScalarizeLoadStore;
cl::OptionRegistry::CreateOption<bool>(&ScalarizeLoadStore,


"ScalarizeLoadStore",
"scalarize-load-store", cl::Hidden, cl::init(false),
cl::desc("Allow the scalarizer pass to scalarize loads and store"));

and everything else remains as is.

Some passes take options directly in the constructor. For example

Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)

Maybe we could just say that there are two different types of options.
The ones we want to expose to users and the ones which we use for
testing llvm itself. The options we want to expose should be just
constructor arguments. With that distinction we should be able to just
not use the options added by cl::OptionRegistry::CreateOption unless
cl::ParseCommandLineOptions is called. WebKit like clients would just
not call cl::ParseCommandLineOptions. Would that work?

Cheers,
Rafael

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Chris Bieneman

unread,
Aug 18, 2014, 6:00:49 PM8/18/14
to Rafael Espíndola, LLVM Dev
I’d prefer to do the removal of global storage all at once. This can be done one pass at a time, but doing all at once means I don’t need to revisit each pass as many times. Whereas if I do this in two passes (where the option becomes owned, but the storage remains global) I’ll need to revisit each pass again to get rid of the global storage.

Keep in mind one of the ultimate goals that this is building toward is allowing a single process to compile multiple programs at the same time with different options.
This is actually how some of our internal clients are already working. There are a few caveats with this approach:

(1) You can’t allow the pass manager to allocate your passes for you, because those passes only read from cl::opts
(2) Not all of our passes have constructors for overriding their cl::opts (the legacy Scalarizer is one)

I think it would in general be cleaner to provide a way for library clients to use cl::opts without being forced to parse a command line.

Thanks,
-Chris


Cheers,
Rafael

Rafael Espíndola

unread,
Aug 18, 2014, 6:13:56 PM8/18/14
to Chris Bieneman, LLVM Dev
> Some passes take options directly in the constructor. For example
>
> Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)
>
> Maybe we could just say that there are two different types of options.
> The ones we want to expose to users and the ones which we use for
> testing llvm itself. The options we want to expose should be just
> constructor arguments. With that distinction we should be able to just
> not use the options added by cl::OptionRegistry::CreateOption unless
> cl::ParseCommandLineOptions is called. WebKit like clients would just
> not call cl::ParseCommandLineOptions. Would that work?
>
>
> This is actually how some of our internal clients are already working. There
> are a few caveats with this approach:
>
> (1) You can’t allow the pass manager to allocate your passes for you,
> because those passes only read from cl::opts

You mean PassManagerBuilder, right?

> (2) Not all of our passes have constructors for overriding their cl::opts
> (the legacy Scalarizer is one)
>
> I think it would in general be cleaner to provide a way for library clients
> to use cl::opts without being forced to parse a command line.

I guess it really depends on how many options there are that we want
to expose via an API. I have the impression that there are few, which
would make changing the constructors and PassManagerBuilder better.

If there is a large number of options that we want to expose, then I
can see the value of having a llvm "configuration object" that is
passed around and is queried by the passes. If we do go down this
road, we should change passes like the inliner to use the
configuration object instead of constructor options. We should also
drop the "cl" from the names if it is not going to be handling command
lines :-)

Chris Bieneman

unread,
Aug 18, 2014, 6:25:50 PM8/18/14
to Rafael Espíndola, LLVM Dev

> On Aug 18, 2014, at 3:09 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> Some passes take options directly in the constructor. For example
>>
>> Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)
>>
>> Maybe we could just say that there are two different types of options.
>> The ones we want to expose to users and the ones which we use for
>> testing llvm itself. The options we want to expose should be just
>> constructor arguments. With that distinction we should be able to just
>> not use the options added by cl::OptionRegistry::CreateOption unless
>> cl::ParseCommandLineOptions is called. WebKit like clients would just
>> not call cl::ParseCommandLineOptions. Would that work?
>>
>>
>> This is actually how some of our internal clients are already working. There
>> are a few caveats with this approach:
>>
>> (1) You can’t allow the pass manager to allocate your passes for you,
>> because those passes only read from cl::opts
>
> You mean PassManagerBuilder, right?

Yes.

>
>> (2) Not all of our passes have constructors for overriding their cl::opts
>> (the legacy Scalarizer is one)
>>
>> I think it would in general be cleaner to provide a way for library clients
>> to use cl::opts without being forced to parse a command line.
>
> I guess it really depends on how many options there are that we want
> to expose via an API. I have the impression that there are few, which
> would make changing the constructors and PassManagerBuilder better.
>
> If there is a large number of options that we want to expose, then I
> can see the value of having a llvm "configuration object" that is
> passed around and is queried by the passes. If we do go down this
> road, we should change passes like the inliner to use the
> configuration object instead of constructor options. We should also
> drop the "cl" from the names if it is not going to be handling command
> lines :-)

I’m curious if Tom Stellard or Filip Pizlo have any input on this as they are more directly involved on the client side.

I do agree that we should ultimately drop the cl namespace if we’re going in this direction.

-Chris

Filip Pizlo

unread,
Aug 18, 2014, 7:30:11 PM8/18/14
to Chris Bieneman, LLVM Dev

The fewer options we fiddle with, the better for WebKit. Hence we would be fine with a solution that exposes relatively few options.

The main option that we use now - turning on stack map liveness calculation - is something that feels like it shouldn't be an "option" at all but maybe an attribute instead.

Chris Bieneman

unread,
Aug 18, 2014, 7:32:42 PM8/18/14
to Filip Pizlo, LLVM Dev
How do you construct you PassManager?

-Chris

Keno Fischer

unread,
Aug 19, 2014, 2:19:05 AM8/19/14
to Chris Bieneman, LLVM Dev
I've recently experienced a similar pain with multiple
ExecutionEngines generating code (multiple LLVM-based JITs in the same
process). I don't have anything specific to add at this point, but I
wanted to express my whole hearted +1 on the project.

Filip Pizlo

unread,
Aug 19, 2014, 11:32:33 AM8/19/14
to Chris Bieneman, LLVM Dev
LLVMCreatePassManager() and then we add target data, add a target machine analysis pass, and then we construct our pipeline and call LLVMRunPassManager().

-Filip

Chris Bieneman

unread,
Aug 19, 2014, 1:49:44 PM8/19/14
to Filip Pizlo, LLVM Dev
I’d like to propose moving forward with the first phase of my proposal to make the cl::opt structures owned and eliminate global option storage. I’d also like to add to it that when updating passes I will ensure that each pass that has cl::opts also has a default constructor, an overridden constructor to populate each option, and the corresponding factory methods for the C API.

Does this sound reasonable for a first step?

-Chris

Rafael Espíndola

unread,
Aug 19, 2014, 2:12:52 PM8/19/14
to Chris Bieneman, LLVM Dev
On 19 August 2014 13:47, Chris Bieneman <be...@apple.com> wrote:
> I’d like to propose moving forward with the first phase of my proposal to
> make the cl::opt structures owned and eliminate global option storage.

For now, please eliminate only the static constructor and leave the
storage. Since it seems only a few options that need to be exposed by
non-command line APIs, we might be able to avoid the need for a
cl::OptionRegistry::GetValue.

> I’d
> also like to add to it that when updating passes I will ensure that each
> pass that has cl::opts also has a default constructor, an overridden
> constructor to populate each option, and the corresponding factory methods
> for the C API.

And *please* don't add anything to the C api unless someone has a real
need for that particular interface and there is a good discussion on
the review thread about it. The C api has more backwards compatibility
promises, which makes it incredibly painful :-(

Even the C++ api is not free, so I would also only modify a pass
constructor if someone wants to pass that option for something other
then llvm development or testing. For that command lines are a
perfectly reasonable solution.

> Does this sound reasonable for a first step?

Removing the static constructors does.

Filip Pizlo

unread,
Aug 19, 2014, 2:23:25 PM8/19/14
to Rafael Espíndola, LLVM Dev

> On Aug 19, 2014, at 11:09 AM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> On 19 August 2014 13:47, Chris Bieneman <be...@apple.com> wrote:
>> I’d like to propose moving forward with the first phase of my proposal to
>> make the cl::opt structures owned and eliminate global option storage.
>
> For now, please eliminate only the static constructor and leave the
> storage. Since it seems only a few options that need to be exposed by
> non-command line APIs, we might be able to avoid the need for a
> cl::OptionRegistry::GetValue.
>
>> I’d
>> also like to add to it that when updating passes I will ensure that each
>> pass that has cl::opts also has a default constructor, an overridden
>> constructor to populate each option, and the corresponding factory methods
>> for the C API.
>
> And *please* don't add anything to the C api unless someone has a real
> need for that particular interface and there is a good discussion on
> the review thread about it. The C api has more backwards compatibility
> promises, which makes it incredibly painful :-(

This is a good point. I see two sensible options:

1) don't add anything to the C API unless someone specifically asks, as Rafael suggests.

2) make options passed to passes use some kind of loose coupling, like an array of strings or even better an options object where the user sets key/value pairs by some call (eg. LLVMSetOption(optionObject, keyString, valueString).

The upside of (2) is that it preserves current functionality and new options can be added easily. The downside is that we'd have to get very particular about whether an option needs to be supported indefinitely if it is ever exposed. Probably nobody wants that strong of a contract.

-Filip

Rafael Espíndola

unread,
Aug 19, 2014, 2:29:30 PM8/19/14
to Filip Pizlo, LLVM Dev
> This is a good point. I see two sensible options:
>
> 1) don't add anything to the C API unless someone specifically asks, as Rafael suggests.
>
> 2) make options passed to passes use some kind of loose coupling, like an array of strings or even better an options object where the user sets key/value pairs by some call (eg. LLVMSetOption(optionObject, keyString, valueString).
>
> The upside of (2) is that it preserves current functionality and new options can be added easily. The downside is that we'd have to get very particular about whether an option needs to be supported indefinitely if it is ever exposed. Probably nobody wants that strong of a contract.

This depends on what we mean by "ABI". Do me mean "it links"? If so,
yes, something like LLVMConfigSetBoolValue(Config,
"ScalarizeLoadStore", 1) is not part of the C abi. We could remove or
rename the ScalarizeLoadStore option.

But in my opinion that is a bit too narrow a definition. What we
really should care about is "WebKit still works with the next llvm".
The same is true for other users, and if one of those users depends on
an option, having that option be symbol, a constructor argument or or
a well know string seems like an implementation detail.

Chris Bieneman

unread,
Aug 19, 2014, 2:46:49 PM8/19/14
to Rafael Espíndola, LLVM Dev

> On Aug 19, 2014, at 11:09 AM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
> On 19 August 2014 13:47, Chris Bieneman <be...@apple.com> wrote:
>> I’d like to propose moving forward with the first phase of my proposal to
>> make the cl::opt structures owned and eliminate global option storage.
>
> For now, please eliminate only the static constructor and leave the
> storage. Since it seems only a few options that need to be exposed by
> non-command line APIs, we might be able to avoid the need for a
> cl::OptionRegistry::GetValue.

I strongly feel this is the wrong decision. If you have a single process using two LLVM clients (say WebKit and Mesa), and they both are using an opt pass with different settings. If the storage is global this will not work.

>
>> I’d
>> also like to add to it that when updating passes I will ensure that each
>> pass that has cl::opts also has a default constructor, an overridden
>> constructor to populate each option, and the corresponding factory methods
>> for the C API.
>
> And *please* don't add anything to the C api unless someone has a real
> need for that particular interface and there is a good discussion on
> the review thread about it. The C api has more backwards compatibility
> promises, which makes it incredibly painful :-(
>
> Even the C++ api is not free, so I would also only modify a pass
> constructor if someone wants to pass that option for something other
> then llvm development or testing. For that command lines are a
> perfectly reasonable solution.

I can agree with all of this.

-Chris

Rafael Espíndola

unread,
Aug 19, 2014, 3:39:40 PM8/19/14
to Chris Bieneman, LLVM Dev
> I strongly feel this is the wrong decision. If you have a single process using two LLVM clients (say WebKit and Mesa), and they both are using an opt pass with different settings. If the storage is global this will not work.

That is only an issue if they can set the option at all.

To summarize, the point is that there are two *very* different types of options

* Nobs for which there is not a single right answer for all users.
There are very few of these currently and we expect it to remain like
that. These should not use cl::opt or static storage at all. They
should be an option passed to the constructor of the pass. It may
also involve exposing it to the C api and the PassManagerBuilder.

* Options that we use during development of llvm. They are useful for
testing, tracking a bug or enabling/disabling a feature that is still
under development. These can use a static storage since external
clients like webkit will never change them. We do have to avoid these
options requiring static constructors, since otherwise the client is
paying for something it will never use.

Chris Bieneman

unread,
Aug 19, 2014, 4:06:34 PM8/19/14
to Rafael Espíndola, LLVM Dev

> On Aug 19, 2014, at 12:32 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> I strongly feel this is the wrong decision. If you have a single process using two LLVM clients (say WebKit and Mesa), and they both are using an opt pass with different settings. If the storage is global this will not work.
>
> That is only an issue if they can set the option at all.
>
> To summarize, the point is that there are two *very* different types of options
>
> * Nobs for which there is not a single right answer for all users.
> There are very few of these currently and we expect it to remain like
> that. These should not use cl::opt or static storage at all. They
> should be an option passed to the constructor of the pass. It may
> also involve exposing it to the C api and the PassManagerBuilder.

How would you suggest we expose cl::opts for modifying these options in tools like opt? A good example of this type of option would be the options in LoopUnrollPass.cpp.

>
> * Options that we use during development of llvm. They are useful for
> testing, tracking a bug or enabling/disabling a feature that is still
> under development. These can use a static storage since external
> clients like webkit will never change them. We do have to avoid these
> options requiring static constructors, since otherwise the client is
> paying for something it will never use.

What about when we're debugging the WebKit JIT? For development of libraries using LLVM it would be nice to be able to toggle these values too, which is why Filip’s suggestion of an API like LLVMConfigSetBoolValue(Config, "ScalarizeLoadStore", 1) would be nice.

-Chris

Rafael Espíndola

unread,
Aug 19, 2014, 4:35:20 PM8/19/14
to Chris Bieneman, LLVM Dev
>> * Nobs for which there is not a single right answer for all users.
>> There are very few of these currently and we expect it to remain like
>> that. These should not use cl::opt or static storage at all. They
>> should be an option passed to the constructor of the pass. It may
>> also involve exposing it to the C api and the PassManagerBuilder.
>
> How would you suggest we expose cl::opts for modifying these options in tools like opt? A good example of this type of option would be the options in LoopUnrollPass.cpp.

Opt uses the PassManagerBuilder. Opt itself could have a command line
options controlling its use of PassManagerBuilder. That is probably
fine since we expect very few of these.

>>
>> * Options that we use during development of llvm. They are useful for
>> testing, tracking a bug or enabling/disabling a feature that is still
>> under development. These can use a static storage since external
>> clients like webkit will never change them. We do have to avoid these
>> options requiring static constructors, since otherwise the client is
>> paying for something it will never use.
>
> What about when we're debugging the WebKit JIT? For development of libraries using LLVM it would be nice to be able to toggle these values too, which is why Filip’s suggestion of an API like LLVMConfigSetBoolValue(Config, "ScalarizeLoadStore", 1) would be nice.

Most llvm bugs reproduce with just opt or llc, but if that is not the
case, cl::ParseCommandLineOptions when debugging seems fine.

The advantages of this setup are

* Options that are exposed to the users are done so in a very natural
way (constructor arguments).
* Internal options are still really easy to create, but now don't have
static constructors.
* We don't need a LLVMConfig object that gets passed around.
* No stringly typed interface.

Chris Bieneman

unread,
Aug 19, 2014, 5:43:46 PM8/19/14
to Rafael Espíndola, LLVM Dev

> On Aug 19, 2014, at 1:32 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>>> * Nobs for which there is not a single right answer for all users.
>>> There are very few of these currently and we expect it to remain like
>>> that. These should not use cl::opt or static storage at all. They
>>> should be an option passed to the constructor of the pass. It may
>>> also involve exposing it to the C api and the PassManagerBuilder.
>>
>> How would you suggest we expose cl::opts for modifying these options in tools like opt? A good example of this type of option would be the options in LoopUnrollPass.cpp.
>
> Opt uses the PassManagerBuilder. Opt itself could have a command line
> options controlling its use of PassManagerBuilder. That is probably
> fine since we expect very few of these.
>
>>>
>>> * Options that we use during development of llvm. They are useful for
>>> testing, tracking a bug or enabling/disabling a feature that is still
>>> under development. These can use a static storage since external
>>> clients like webkit will never change them. We do have to avoid these
>>> options requiring static constructors, since otherwise the client is
>>> paying for something it will never use.
>>
>> What about when we're debugging the WebKit JIT? For development of libraries using LLVM it would be nice to be able to toggle these values too, which is why Filip’s suggestion of an API like LLVMConfigSetBoolValue(Config, "ScalarizeLoadStore", 1) would be nice.
>
> Most llvm bugs reproduce with just opt or llc, but if that is not the
> case, cl::ParseCommandLineOptions when debugging seems fine.

There are two reasons this doesn’t work:

(1) Cases where I might want to set a debug variable for the WebKit JIT but not for Mesa - if the option storage is global it will get overwritten for all users
(2) cl::ParseCommandLineOptions today is C++ API, WebKit restricts itself to the C API, so at the least we’d need to expose it as part of the C API

>
> The advantages of this setup are
>
> * Options that are exposed to the users are done so in a very natural
> way (constructor arguments).

I’m on board with this, but not to the exclusion of other use cases.

> * Internal options are still really easy to create, but now don't have
> static constructors.

We’re in agreement here

> * We don't need a LLVMConfig object that gets passed around.

For the sake of this discussion, let’s just scrap my phase two idea for a configuration object and treat that as a separate issue.

> * No stringly typed interface.

I think that there are advantages to a string-based interface. Sean Silva actually suggested that interface when I first sent out an email about our plans to rework command line options back on 8/6. Based on Sean’s feedback and a few discussions I had offline with other LLVM contributors I thought a stringly typed interface was the best approach to eliminating both the static constructors and the static storage which are explicit goals for our project.

Thanks,
-Chris

Rafael Espíndola

unread,
Aug 19, 2014, 6:13:04 PM8/19/14
to Chris Bieneman, LLVM Dev
> There are two reasons this doesn’t work:
>
> (1) Cases where I might want to set a debug variable for the WebKit JIT but not for Mesa - if the option storage is global it will get overwritten for all users

This really comes up? Really, we are not talking -O2/-O3 or inlining
thresholds. We are talking things like lcr-max-depth being needed for
a debugging session.

> (2) cl::ParseCommandLineOptions today is C++ API, WebKit restricts itself to the C API, so at the least we’d need to expose it as part of the C API

This seems a much smaller change than adding a LLVMConfig object.

>> * We don't need a LLVMConfig object that gets passed around.
>
> For the sake of this discussion, let’s just scrap my phase two idea for a configuration object and treat that as a separate issue.

But it makes a big difference on how the first phase is handled. If we
don't want the LLVMConfig object, the first phase should really just
remove the static constructors and not add a stringly typed interface.

> I think that there are advantages to a string-based interface. Sean Silva actually suggested that interface when I first sent out an email about our plans to rework command line options back on 8/6. Based on Sean’s feedback and a few discussions I had offline with other LLVM contributors I thought a stringly typed interface was the best approach to eliminating both the static constructors and the static storage which are explicit goals for our project.

Sorry I missed the original discussion. Sean, would you mind
summarizing the why of the stringly interface? Even if we do need a
LLVMConfig object (seems unlikely), I would still suggest using some
other key format. With strings we would suddenly be exposing every
command line option to the C API, which seems highly undesirable.

Chris Bieneman

unread,
Aug 19, 2014, 6:59:26 PM8/19/14
to Rafael Espíndola, LLVM Dev

> On Aug 19, 2014, at 3:10 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> There are two reasons this doesn’t work:
>>
>> (1) Cases where I might want to set a debug variable for the WebKit JIT but not for Mesa - if the option storage is global it will get overwritten for all users
>
> This really comes up? Really, we are not talking -O2/-O3 or inlining
> thresholds. We are talking things like lcr-max-depth being needed for
> a debugging session.

I’ve toggled on SROA's "sroa-random-shuffle-slices” before for testing, I've played with InstCombine’s "enable-double-float-shrink”, so it does (although admittedly not terribly often).

>
>> (2) cl::ParseCommandLineOptions today is C++ API, WebKit restricts itself to the C API, so at the least we’d need to expose it as part of the C API
>
> This seems a much smaller change than adding a LLVMConfig object.
>
>>> * We don't need a LLVMConfig object that gets passed around.
>>
>> For the sake of this discussion, let’s just scrap my phase two idea for a configuration object and treat that as a separate issue.
>
> But it makes a big difference on how the first phase is handled. If we
> don't want the LLVMConfig object, the first phase should really just
> remove the static constructors and not add a stringly typed interface.

The stringly typed interface doesn’t just have to do with an LLVMConfig object, it also allows you to use non-static storage for options.

>
>> I think that there are advantages to a string-based interface. Sean Silva actually suggested that interface when I first sent out an email about our plans to rework command line options back on 8/6. Based on Sean’s feedback and a few discussions I had offline with other LLVM contributors I thought a stringly typed interface was the best approach to eliminating both the static constructors and the static storage which are explicit goals for our project.
>
> Sorry I missed the original discussion. Sean, would you mind
> summarizing the why of the stringly interface? Even if we do need a
> LLVMConfig object (seems unlikely), I would still suggest using some
> other key format. With strings we would suddenly be exposing every
> command line option to the C API, which seems highly undesirable.

The problem with alternate key formats from strings is it gets tricky when you start talking about supporting dynamically loaded passes and their options (which our current cl::opts do support).

Thanks,
-Chris

Chandler Carruth

unread,
Aug 19, 2014, 9:48:32 PM8/19/14
to Chris Bieneman, LLVM Dev
FWIW, I largely agree with Rafael's position here, at least in the near term.

The nice thing about it is that even if we don't stay there forever, it is a nice incremental improvement over the current state of the world, and we can actually be mindful going forward of whether the restriction it imposes (an inability to use "internal" knobs from within a library context that have multiple different library users in a single address space) proves to be a significant on-going burden.

On Tue, Aug 19, 2014 at 3:57 PM, Chris Bieneman <be...@apple.com> wrote:
> On Aug 19, 2014, at 3:10 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> There are two reasons this doesn’t work:
>>
>> (1) Cases where I might want to set a debug variable for the WebKit JIT but not for Mesa - if the option storage is global it will get overwritten for all users
>
> This really comes up? Really, we are not talking -O2/-O3 or inlining
> thresholds. We are talking things like lcr-max-depth being needed for
> a debugging session.

I’ve toggled on SROA's "sroa-random-shuffle-slices” before for testing, I've played with InstCombine’s "enable-double-float-shrink”, so it does (although admittedly not terribly often).

I'm somewhat surprised that this comes up much in a context where you *can't* extract a test case and play with it using 'opt' or some other stand-alone context.

If these come up so rarely, would it be unreasonable to just flip the flag in the source code, and build a DSO to test with? For example, this is how I have done counter-based bisection and combine-based bisection of things (a similarly rare but necessary activity I suspect) and it seems to work well.

-Chandler

Sean Silva

unread,
Aug 20, 2014, 12:45:41 AM8/20/14
to Chris Bieneman, LLVM Dev
One interesting issue with moving away from the current system of static initializers for cl::opt is that we will no longer have the automatic registration of all the options so that -help will print everything available and generally we will not be able to issue an error for an "unknown command line option" (without calling into any other code).

The auto-registration is fundamentally tied with the globalness and the static initializers; pondering this has led me down an interesting path that has made me understand better my suggestion in the other thread. As I see it, there are two very different sorts of uses of llvm::cl in LLVM:

1. For regular command line processing. E.g. if a tool accepts an output file, then we need something that will parse the argument from the command line.

2. As a way to easily set up a conduit from A to B, where A is the command line and B is some place "deep" inside the LLVM library code that will do something in response to the command line.

(and, pending discussion, someday point A might include a proper programmatic interface (i.e. in a way other than hijacking the command line processing))

llvm::cl does a decent job for #1 and that is what it was designed for AFAICT; these uses of llvm::cl live outside of library code and everything is pretty happy, despite them being global and having static initializers.

The problem is that llvm::cl is not very well-suited to #2, yet it is used for #2, and that is the real problem. We need a solution to problem #2 which does not use llvm::cl. Thus, I don't think that the solution you propose here is the right direction.

The first step is to clearly differentiate between #1 and #2. I will say "command line options" for #1 and "configuration/tweak points" for #2. (maybe "library options" is better for #2; neither is perfect terminology)

The strawman I suggested in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a stab at #2. There is no way to dodge being stringly typed since command lines are stringly typed, so really it is just a question of how long a solution stays stringly typed.

My thought process for staying stringly typed "the whole time" (possibly with some caching) comes from these two desires:
- adding a c/t point should require adding just one call into the c/t machinery (this is both for convenience and for DRY/SPOT), and 
- this change should be localized to the code being configured/tweaked
This is the thought process:

Note that llvm::cl is stringly typed until it parses the options. llvm::cl gives the appearance of a typed interface because it uses static initialization as a backdoor to globally transport the knowledge of the expected type to the option parsing machinery (very early in the program lifetime). Without this backdoor, we need to stay stringly typed longer, at least until we reach the "localized" place where the single call into the c/t machinery is made; this single call is the only place that has the type information needed for the c/t value to become properly typed. But there is no way to know how long it will be until we reach that point (or even *if* we reach that point; consider passes that are not run on this invocation).

Hence my suggestion of just putting a stringly typed key-value store (or whatever) in an easily accessible place (like LLVMContext), and just translating any unrecognized command line options (ones that are not for #1) into that stringly typed storage.

I agree with Rafael that "constructor arguments to passes" are not c/t points. In the future, there might be some way to integrate the two (from the referenced post, you can probably tell that I kind of like the idea of doing so), but for now, I think the clear incremental step is to attack #2 and solve it without llvm::cl. I have suggested a way to do this that I think makes sense.

-- Sean Silva






Sean Silva

unread,
Aug 20, 2014, 12:54:26 AM8/20/14
to Chris Bieneman, LLVM Dev
To be clear: I agree with Rafael that we need to tread very carefully about how we expose this machinery in the C API, if we expose it at all. My suggestion is completely orthogonal to this though; all I'm talking about is how to avoid the static constructors and global state caused by the cl::opt's in library code, which as I understand it is the motivation for the OP.

-- Sean Silva

Sean Silva

unread,
Aug 20, 2014, 1:08:14 AM8/20/14
to Rafael Espíndola, LLVM Dev
I think that what/how/if we expose in the C API is orthogonal to my suggestions, which are more about removing the static initializers and global state. The stringly typed interface is motivated by the need to late-resolve which options there are and what their types are; this need itself arises from eliminating the static initializers which serve as a backdoor for globally propagating type and option information on startup.

For more details, see my lengthy musings that I just replied to the OP.

Personally, I am very wary of exposing a stringly typed interface in the C API.

-- Sean Silva
 

Cheers,
Rafael

Pete Cooper

unread,
Aug 20, 2014, 1:12:13 AM8/20/14
to Chandler Carruth, LLVM Dev
On Aug 19, 2014, at 6:45 PM, Chandler Carruth <chan...@google.com> wrote:

FWIW, I largely agree with Rafael's position here, at least in the near term.

The nice thing about it is that even if we don't stay there forever, it is a nice incremental improvement over the current state of the world, and we can actually be mindful going forward of whether the restriction it imposes (an inability to use "internal" knobs from within a library context that have multiple different library users in a single address space) proves to be a significant on-going burden.
I actually disagree with this for two reasons.

The first is that if there are going to be changes to the code anyway to remove static initializers, and we can move the storage to the pass at the same time, the we should make just one change and not two.

The second reason is that in most cases these knobs should not be globals. If I had a piece of data (not a CL::opt) in global scope, only used by one pass, then I'm sure people would question why it's a global at all and move it inside the class. We're treating cl::opt as special here when there's no reason for the opt or the storage to be global. 

We frown upon the use of globals, otherwise LLVM would be littered with them like many other C++ code bases. I don't think cl::opts should be special at all in this respect.

Thanks
Pete


On Tue, Aug 19, 2014 at 3:57 PM, Chris Bieneman <be...@apple.com> wrote:
> On Aug 19, 2014, at 3:10 PM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> There are two reasons this doesn’t work:
>>
>> (1) Cases where I might want to set a debug variable for the WebKit JIT but not for Mesa - if the option storage is global it will get overwritten for all users
>
> This really comes up? Really, we are not talking -O2/-O3 or inlining
> thresholds. We are talking things like lcr-max-depth being needed for
> a debugging session.

I’ve toggled on SROA's "sroa-random-shuffle-slices” before for testing, I've played with InstCombine’s "enable-double-float-shrink”, so it does (although admittedly not terribly often).

I'm somewhat surprised that this comes up much in a context where you *can't* extract a test case and play with it using 'opt' or some other stand-alone context.

If these come up so rarely, would it be unreasonable to just flip the flag in the source code, and build a DSO to test with? For example, this is how I have done counter-based bisection and combine-based bisection of things (a similarly rare but necessary activity I suspect) and it seems to work well.

-Chandler
_______________________________________________

Chandler Carruth

unread,
Aug 20, 2014, 1:27:01 AM8/20/14
to Pete Cooper, LLVM Dev
On Tue, Aug 19, 2014 at 10:10 PM, Pete Cooper <peter_...@apple.com> wrote:
On Aug 19, 2014, at 6:45 PM, Chandler Carruth <chan...@google.com> wrote:

FWIW, I largely agree with Rafael's position here, at least in the near term.

The nice thing about it is that even if we don't stay there forever, it is a nice incremental improvement over the current state of the world, and we can actually be mindful going forward of whether the restriction it imposes (an inability to use "internal" knobs from within a library context that have multiple different library users in a single address space) proves to be a significant on-going burden.
I actually disagree with this for two reasons.

The first is that if there are going to be changes to the code anyway to remove static initializers, and we can move the storage to the pass at the same time, the we should make just one change and not two.

No one is suggesting otherwise that I have seen? At least, my interpretation (perhaps incorrect, I've not had time to read all of this thread in 100% detail) was that the removal of static initializers wouldn't require changing every cl::opt variable.
 

The second reason is that in most cases these knobs should not be globals. If I had a piece of data (not a CL::opt) in global scope, only used by one pass, then I'm sure people would question why it's a global at all and move it inside the class. We're treating cl::opt as special here when there's no reason for the opt or the storage to be global. 

We frown upon the use of globals, otherwise LLVM would be littered with them like many other C++ code bases. I don't think cl::opts should be special at all in this respect.

Sure, you're arguing against the core design of cl::opt. However, we have it, and it wasn't an accident or for lack of other patterns that we chose it.

For example, we don't require all constants to be per-pass, and instead have per-file constants. Rafael is suggesting that one use case for cl::opt global variables is, in essence, a constant that is somewhat easier for a developer of LLVM (*not* a user) to change during debugging and active development. I don't think the desire for convenience and only supporting the default value in production contexts are completely invalid.

Once you factor those in, we have a tradeoff. Historically, the tradeoff was made in the direction of convenience, relying on a very narrow interpretation of the use cases. It isn't clear to me that we should, today, make a different tradeoff. It certainly doesn't make sense why you would gate removing static initializers (a clear win) on forcing a change on a core design pattern within LLVM which not all of the developers are really supportive of (at this point).

Renato Golin

unread,
Aug 20, 2014, 1:31:43 AM8/20/14
to Sean Silva, LLVM Dev
On 20 August 2014 05:43, Sean Silva <chiso...@gmail.com> wrote:
> 2. As a way to easily set up a conduit from A to B, where A is the command
> line and B is some place "deep" inside the LLVM library code that will do
> something in response to the command line.

I never liked this as a solution to #2. Heck, I never liked that we do
have #2 in the first place.


> Hence my suggestion of just putting a stringly typed key-value store (or
> whatever) in an easily accessible place (like LLVMContext), and just
> translating any unrecognized command line options (ones that are not for #1)
> into that stringly typed storage.

I fully support this idea, and is in line with my strawman proposal
for FPU/CPU parsing on all tools shared (beyond the boundaries of
LLVM).

String parsing is common and, even being target specific or pass
specific, is still string parsing and should be identical to all users
of the *same* feature.

I also believe a more sane command line option scheme is in order.
Today we have a zillion of options that are completely disconnected,
documented by accident in the initializer, without any bigger context
whatsoever. This is in part to follow what GCC has always done, and
probably we'll still need to support GCC's and our own legacy for
decades, but all that can also live in this commoned up parser.

Specifically to #2, my idea was something like:
--vectorizer-opts="foo,bar,baz=10" --tbaa-opts="...", etc. That could
use a common parser all the way down to parsing "foo", which would be
left by the vectorizer's back-end to the parser to deal with and setup
the right flags in the right structure, used because "vectorizer" in
"vectorizer-opts" tell the factory to return a vectorizer parser's
back-end.

The FPU/CPU parsing (which has to parse command line options and
assembly directives, which happens to have the same syntax), would
have a similar structure.

Such flags in LLVMContext (or whatever) would have to be structured
like a tree and each pass should receive its own tree root, which most
of the time would have just a list of key/values, but some times have
a more nested structure.

cheers,
--renato

Pete Cooper

unread,
Aug 20, 2014, 2:01:54 AM8/20/14
to Sean Silva, LLVM Dev
I have no problem with this either.

As an alternative, and one which is likely to be fairly stable, we could just expose ParseCommandLineOptions to the C API.  That parsing will cause the options to change anyway, and there’s actually nothing to stop you from calling it multiple times to change different options at different times.

Pete

-- Sean Silva
 

Cheers,
Rafael

Pete Cooper

unread,
Aug 20, 2014, 2:14:24 AM8/20/14
to Sean Silva, LLVM Dev
On Aug 19, 2014, at 9:43 PM, Sean Silva <chiso...@gmail.com> wrote:

One interesting issue with moving away from the current system of static initializers for cl::opt is that we will no longer have the automatic registration of all the options so that -help will print everything available and generally we will not be able to issue an error for an "unknown command line option" (without calling into any other code).
Not automatic no, but in the proposal Chris puts the addOption call inside the pass initializer which is called before ParseCommandLineOptions.  This means you’ll still get options listed as you currently do, so long as you continue to calls the pass initializers before parse (something you have to do anyway to get the pass name visible to the command line)


The auto-registration is fundamentally tied with the globalness and the static initializers; pondering this has led me down an interesting path that has made me understand better my suggestion in the other thread. As I see it, there are two very different sorts of uses of llvm::cl in LLVM:

1. For regular command line processing. E.g. if a tool accepts an output file, then we need something that will parse the argument from the command line.

2. As a way to easily set up a conduit from A to B, where A is the command line and B is some place "deep" inside the LLVM library code that will do something in response to the command line.

(and, pending discussion, someday point A might include a proper programmatic interface (i.e. in a way other than hijacking the command line processing))
That would be nice.  I just suggested in another thread that we expose ParseCommandLineOptions to the C API to hack around this, but a nice clean interface would of course be better.


llvm::cl does a decent job for #1 and that is what it was designed for AFAICT; these uses of llvm::cl live outside of library code and everything is pretty happy, despite them being global and having static initializers.

The problem is that llvm::cl is not very well-suited to #2, yet it is used for #2, and that is the real problem. We need a solution to problem #2 which does not use llvm::cl. Thus, I don't think that the solution you propose here is the right direction.

The first step is to clearly differentiate between #1 and #2. I will say "command line options" for #1 and "configuration/tweak points" for #2. (maybe "library options" is better for #2; neither is perfect terminology)

The strawman I suggested in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a stab at #2. There is no way to dodge being stringly typed since command lines are stringly typed, so really it is just a question of how long a solution stays stringly typed.

My thought process for staying stringly typed "the whole time" (possibly with some caching) comes from these two desires:
- adding a c/t point should require adding just one call into the c/t machinery (this is both for convenience and for DRY/SPOT), and 
Right.  The current point is in the pass initializer.

There is another point in the pass constructor to read the option value.  This is the only point at which something will change from being a string to its actual type and value.

- this change should be localized to the code being configured/tweaked
This is the thought process:

Note that llvm::cl is stringly typed until it parses the options. llvm::cl gives the appearance of a typed interface because it uses static initialization as a backdoor to globally transport the knowledge of the expected type to the option parsing machinery (very early in the program lifetime). Without this backdoor, we need to stay stringly typed longer, at least until we reach the "localized" place where the single call into the c/t machinery is made; this single call is the only place that has the type information needed for the c/t value to become properly typed. But there is no way to know how long it will be until we reach that point (or even *if* we reach that point; consider passes that are not run on this invocation).
The current proposal exposes the type in addOption (as well as later when we get the option). So the type continues to be known to the command line parser.  Whether you want to actually type check in the command line is a point i’m open to discuss.  Personally i want a command line option to be type checked because it was registered, even if no-one actually gets the value of the option later.


Hence my suggestion of just putting a stringly typed key-value store (or whatever) in an easily accessible place (like LLVMContext), and just translating any unrecognized command line options (ones that are not for #1) into that stringly typed storage.
I’m against it being in the context because you may want to set up and reuse passes multiple times with the same options, and use that configuration to compile multiple LLVMContexts.  But I do agree that having a store with some lifetime is useful.

I think the current proposal is to have the store be a singleton, but there’s nothing to stop further work to have the storage for options be one per thread for example.  If you wanted to have one pass manager per thread with its own set of passes, configured (currently) via their own call to ParseCommandLineOptions then that would be possible with little work beyond the current proposal.


I agree with Rafael that "constructor arguments to passes" are not c/t points. In the future, there might be some way to integrate the two (from the referenced post, you can probably tell that I kind of like the idea of doing so), but for now, I think the clear incremental step is to attack #2 and solve it without llvm::cl. I have suggested a way to do this that I think makes sense.
If you change the current proposal so that it doesn’t read cl::opt, then I think this reads to me like what is being proposed now.  Really its creating a string->string map with addOption, and getting the values with getOption.  The passes don’t care (or know) whether the options are set via the command line or any other API.  I hope i’ve understood your proposal correctly here.  Please correct me otherwise.

Thanks,
Pete

Chris Bieneman

unread,
Aug 20, 2014, 2:22:37 AM8/20/14
to Chandler Carruth, LLVM Dev
On Aug 19, 2014, at 10:24 PM, Chandler Carruth <chan...@google.com> wrote:


On Tue, Aug 19, 2014 at 10:10 PM, Pete Cooper <peter_...@apple.com> wrote:
On Aug 19, 2014, at 6:45 PM, Chandler Carruth <chan...@google.com> wrote:

FWIW, I largely agree with Rafael's position here, at least in the near term.

The nice thing about it is that even if we don't stay there forever, it is a nice incremental improvement over the current state of the world, and we can actually be mindful going forward of whether the restriction it imposes (an inability to use "internal" knobs from within a library context that have multiple different library users in a single address space) proves to be a significant on-going burden.
I actually disagree with this for two reasons.

The first is that if there are going to be changes to the code anyway to remove static initializers, and we can move the storage to the pass at the same time, the we should make just one change and not two.

No one is suggesting otherwise that I have seen? At least, my interpretation (perhaps incorrect, I've not had time to read all of this thread in 100% detail) was that the removal of static initializers wouldn't require changing every cl::opt variable.

I think I may be misunderstanding you here. If I understand Rafael correctly he wants all the option storage to be done using the cl::opt external storage capabilities, so that the option values are stored globally.

My original proposal was to leave the storage in cl::opt, but to move the cl::opt to being owned, and have a keyed lookup. My plan was geared around being able to change one cl::opt at a time, but ultimately to get rid of the static initializers you do need to change every cl::opt variable so that they aren’t global.

As part of this work I did want to eliminate the global storage for all these options in favor of having the data stored in the passes. It seems that this last is the contentious part, which is what Pete is talking about WRT the use of globals.

I’m also not sure how Rafael’s proposal will work with eliminating static initializers for cl::opts with class or list typed data.

 

The second reason is that in most cases these knobs should not be globals. If I had a piece of data (not a CL::opt) in global scope, only used by one pass, then I'm sure people would question why it's a global at all and move it inside the class. We're treating cl::opt as special here when there's no reason for the opt or the storage to be global. 

We frown upon the use of globals, otherwise LLVM would be littered with them like many other C++ code bases. I don't think cl::opts should be special at all in this respect.

Sure, you're arguing against the core design of cl::opt. However, we have it, and it wasn't an accident or for lack of other patterns that we chose it.

For example, we don't require all constants to be per-pass, and instead have per-file constants. Rafael is suggesting that one use case for cl::opt global variables is, in essence, a constant that is somewhat easier for a developer of LLVM (*not* a user) to change during debugging and active development. I don't think the desire for convenience and only supporting the default value in production contexts are completely invalid.

I don’t think that my proposal adversely impacts the ability for a developer of LLVM to change the value of an option during debugging and development. If anything it makes this easier because it provides a way to do so without modifying source code (while not preventing you from doing it by modifying source code).


Once you factor those in, we have a tradeoff. Historically, the tradeoff was made in the direction of convenience, relying on a very narrow interpretation of the use cases. It isn't clear to me that we should, today, make a different tradeoff. It certainly doesn't make sense why you would gate removing static initializers (a clear win) on forcing a change on a core design pattern within LLVM which not all of the developers are really supportive of (at this point).

I guess my point here is that there doesn’t need to be a tradeoff. My proposal doesn’t adversely impact convenience, and supports a wider use case.

I do agree that we shouldn’t gate removing the static initializers on removing the globals, but I also don’t think this is really forcing a core design pattern change. If we can’t come to an agreement on the global data we can shelve the conversation.

-Chris

Pete Cooper

unread,
Aug 20, 2014, 2:25:26 AM8/20/14
to Chandler Carruth, LLVM Dev
On Aug 19, 2014, at 10:24 PM, Chandler Carruth <chan...@google.com> wrote:


On Tue, Aug 19, 2014 at 10:10 PM, Pete Cooper <peter_...@apple.com> wrote:
On Aug 19, 2014, at 6:45 PM, Chandler Carruth <chan...@google.com> wrote:

FWIW, I largely agree with Rafael's position here, at least in the near term.

The nice thing about it is that even if we don't stay there forever, it is a nice incremental improvement over the current state of the world, and we can actually be mindful going forward of whether the restriction it imposes (an inability to use "internal" knobs from within a library context that have multiple different library users in a single address space) proves to be a significant on-going burden.
I actually disagree with this for two reasons.

The first is that if there are going to be changes to the code anyway to remove static initializers, and we can move the storage to the pass at the same time, the we should make just one change and not two.

No one is suggesting otherwise that I have seen? At least, my interpretation (perhaps incorrect, I've not had time to read all of this thread in 100% detail) was that the removal of static initializers wouldn't require changing every cl::opt variable.
It won’t no.  The majority of static initializers are globals in passes, but cl::opt’s and other globals in tools for example are out of scope for this work right now (there’s no opt.cpp in a dylib so we don’t care about its globals for example).

 

The second reason is that in most cases these knobs should not be globals. If I had a piece of data (not a CL::opt) in global scope, only used by one pass, then I'm sure people would question why it's a global at all and move it inside the class. We're treating cl::opt as special here when there's no reason for the opt or the storage to be global. 

We frown upon the use of globals, otherwise LLVM would be littered with them like many other C++ code bases. I don't think cl::opts should be special at all in this respect.

Sure, you're arguing against the core design of cl::opt. However, we have it, and it wasn't an accident or for lack of other patterns that we chose it.
Oh yeah, its a fine solution for a tricky problem, but now that we’re having this discussion its clear that its use of static initializers is itself a problem.


For example, we don't require all constants to be per-pass, and instead have per-file constants. Rafael is suggesting that one use case for cl::opt global variables is, in essence, a constant that is somewhat easier for a developer of LLVM (*not* a user) to change during debugging and active development. I don't think the desire for convenience and only supporting the default value in production contexts are completely invalid.
I can see what you’re saying here, but i’m not convinced that changing the value of a constant in global scope is any easier than changing its value in the pass constructor.


Once you factor those in, we have a tradeoff. Historically, the tradeoff was made in the direction of convenience, relying on a very narrow interpretation of the use cases. It isn't clear to me that we should, today, make a different tradeoff. It certainly doesn't make sense why you would gate removing static initializers (a clear win) on forcing a change on a core design pattern within LLVM which not all of the developers are really supportive of (at this point).
I agree here.  There’s not a strict need to move the option storage for something like an unsigned in to a pass using it (there may be for a std::string for which we’d again have a static initializer).  However, I do think its the right thing to do in terms of coding guidelines.  If the code to get and set an option is already in the pass initializer/constructor respectively, then I don’t think the storage should have to live outside the pass just to minimize a patch.

Now saying this, I don’t think if the community agreed to this proposal as is, that it would mean blanket approval to change all cl::opts in all cases.  But I think where its an easy change, and obviously the right choice, that options as well as their storage should be allowed to be moved in to the pass.  If it makes sense to move only the option but not the storage then that should be done as a first step, and a discussion started on the right thing for the storage.

Thanks,
Pete

Rafael Espíndola

unread,
Aug 20, 2014, 11:54:14 AM8/20/14
to Pete Cooper, LLVM Dev
> FWIW, I largely agree with Rafael's position here, at least in the near
> term.
>
> The nice thing about it is that even if we don't stay there forever, it is a
> nice incremental improvement over the current state of the world, and we can
> actually be mindful going forward of whether the restriction it imposes (an
> inability to use "internal" knobs from within a library context that have
> multiple different library users in a single address space) proves to be a
> significant on-going burden.
>
> I actually disagree with this for two reasons.
>
> The first is that if there are going to be changes to the code anyway to
> remove static initializers, and we can move the storage to the pass at the
> same time, the we should make just one change and not two.
>
> The second reason is that in most cases these knobs should not be globals.
> If I had a piece of data (not a CL::opt) in global scope, only used by one
> pass, then I'm sure people would question why it's a global at all and move
> it inside the class. We're treating cl::opt as special here when there's no
> reason for the opt or the storage to be global.
>
> We frown upon the use of globals, otherwise LLVM would be littered with them
> like many other C++ code bases. I don't think cl::opts should be special at
> all in this respect.


Note that the call

cl::OptionRegistry::CreateOption<bool>(&ScalarizeLoadStore,
"ScalarizeLoadStore",
"scalarize-load-store", cl::Hidden, cl::init(false),
cl::desc("Allow the scalarizer pass to scalarize loads and store"));

ScalarizeLoadStore can actually be a member variable as long as caller
guarantees it is still around when the command line is parsed. I have
no objections to doing this move in the first pass if you want to.

What I would really like to avoid for now is the

cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");

Cheers,
Rafael

Pete Cooper

unread,
Aug 20, 2014, 12:18:06 PM8/20/14
to Rafael Espíndola, LLVM Dev
Ah, I totally see what you mean now. Sorry if i had been confused before.

To be honest, I had exactly the same reservations myself, but then i looked in to how we initialize and create passes and there actually isn’t a nice way to do this right now.

The trouble is that INITIALIZE_PASS doesn’t actually construct a pass. It actually constructs a PassInfo which has a pointer to the default constructor of the pass. Later, if the pass manager needs to (say -scalarize was on the commandline), it will get the default constructor from PassInfo and construct the pass.

Unfortunately, this completely detaches the lifetime of the pass instance where we want to store the ScalarizeLoadStore bool, from the option code to hook in to the cl::opt stuff. I originally wanted to do something gross like pass __offset_of(Scalarizer::ScalarizeLoadStore) to the PassInfo and hide the initialization of the option in there. But that doesn’t work either, as there’s no guarantee you’ll even create a instance of Scalarizer, even though you could pass -scalarize-load-store to the command line.

So, we do have 2 solutions: a global variable, and a call to GetValue. As you can see, either way isn’t perfect, but if you can think of another solution i’m sure we can discuss it.

Thanks,
Pete

Pete Cooper

unread,
Aug 20, 2014, 12:34:23 PM8/20/14
to Rafael Espíndola, LLVM Dev
This is very raw, so excuse any mistakes in the code, but I think i came up with a third option.

What if we added a static method to the Scalarizer class. This method takes pointers to each option storage. If a pointer is null, the function is being called from INITIALIZE_PASS and so we create all the options. If a pointer is not null, we’re being called from the pass constructor and we set the value of that option. I think it would look something like this (which can of course be tidied up).

static void addOptions(bool *ScalarizeLoadStore = nullptr) {
Option::iterator ScalarizeLoadStoreOpt =
getOrAddOption<bool>("scalarize-load-store");
if (ScalarizeLoadStore) {
// Get the value of the option.
*ScalarizeLoadStore = ScalarizeLoadStoreOpt.getValue();
} else {
// Adding the option with this name. Set up its properties.
ScalarizeLoadStoreOpt.init(cl::Hidden, cl::init(false),


cl::desc("Allow the scalarizer pass to scalarize loads and store"));
}
}

Pete

Rafael Espíndola

unread,
Aug 20, 2014, 12:35:59 PM8/20/14
to Pete Cooper, LLVM Dev
> So, we do have 2 solutions: a global variable, and a call to GetValue. As you can see, either way isn’t perfect, but if you can think of another solution i’m sure we can discuss it.

Pass the passinfo to the pass constructor maybe?

I still don't understand what the problem with the global is. It has
the same value for all users. As Chandler pointed out, having

static cl::opt<bool> ScalarizeLoadStore

("scalarize-load-store", cl::Hidden, cl::init(false),


cl::desc("Allow the scalarizer pass to scalarize loads and store"));

right now is just our way of making

// Allow the scalarizer pass to scalarize loads and store
const static bool ScalarizeLoadStore = false;

more convenient to developers so they don't have to edit/compile to
test it with a different value. From a library user point of view they
should be exactly the same: a constant that is *always* false.

Another option (not my preference) would be to use globals just as keys:

typedef <something> LLVMOptionKey;
...
static LLVMOptionKey ScalarizeLoadStoreKey; //global
...
cl::OptionRegistry::createOption<bool>(&ScalarizeLoadStoreKey,


"ScalarizeLoadStore",
"scalarize-load-store", cl::Hidden, cl::init(false),
cl::desc("Allow the scalarizer pass to scalarize loads and store"));

....
bool ScalarizeLoadStore =
cl::OptionRegistry::getValue<bool>(&ScalarizeLoadStoreKey); // local

That way we avoid exposing ScalarizeLoadStore to library users since
the getValue takes a key they cannot guess instead of well known
string.

Rafael Espíndola

unread,
Aug 20, 2014, 1:00:28 PM8/20/14
to Pete Cooper, LLVM Dev
> This is very raw, so excuse any mistakes in the code, but I think i came up with a third option.
>
> What if we added a static method to the Scalarizer class. This method takes pointers to each option storage. If a pointer is null, the function is being called from INITIALIZE_PASS and so we create all the options. If a pointer is not null, we’re being called from the pass constructor and we set the value of that option. I think it would look something like this (which can of course be tidied up).
>
> static void addOptions(bool *ScalarizeLoadStore = nullptr) {
> Option::iterator ScalarizeLoadStoreOpt =
> getOrAddOption<bool>("scalarize-load-store");
> if (ScalarizeLoadStore) {
> // Get the value of the option.
> *ScalarizeLoadStore = ScalarizeLoadStoreOpt.getValue();
> } else {
> // Adding the option with this name. Set up its properties.
> ScalarizeLoadStoreOpt.init(cl::Hidden, cl::init(false),
> cl::desc("Allow the scalarizer pass to scalarize loads and store"));
> }
> }

As long as we are careful to make sure only the command line parsing
can set this, it should be fine. It has some nice properties missing
from the original proposal (including stage2)

* "scalarize-load-store" is written once. Not extra string keys.
* It seems easier to hide it.


* We don't need a LLVMConfig object that gets passed around.

* There is exactly one of exposing features to libraries: change the
constructor and maybe the PassManagerBuilder.

I still don't see the advantage over a static storage, but I am not
strongly oppose to it, as long as there is nothing like
cl::OptionRegistry::SetValue<bool>("ScalarizeLoadStore", true) that
libraries can use.

Pete Cooper

unread,
Aug 20, 2014, 1:15:05 PM8/20/14
to Rafael Espíndola, LLVM Dev

> On Aug 20, 2014, at 9:31 AM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> So, we do have 2 solutions: a global variable, and a call to GetValue. As you can see, either way isn’t perfect, but if you can think of another solution i’m sure we can discuss it.
>
> Pass the passinfo to the pass constructor maybe?
>
> I still don't understand what the problem with the global is. It has
> the same value for all users. As Chandler pointed out, having
I just assumed that all globals are bad. Perhaps thats not true in all cases.

>
> static cl::opt<bool> ScalarizeLoadStore
> ("scalarize-load-store", cl::Hidden, cl::init(false),
> cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
> right now is just our way of making
>
> // Allow the scalarizer pass to scalarize loads and store
> const static bool ScalarizeLoadStore = false;
>
> more convenient to developers so they don't have to edit/compile to
> test it with a different value. From a library user point of view they
> should be exactly the same: a constant that is *always* false.
>
> Another option (not my preference) would be to use globals just as keys:
>
> typedef <something> LLVMOptionKey;
> ...
> static LLVMOptionKey ScalarizeLoadStoreKey; //global
You could do this if the key is the name of the option, but that detaches the name from the rest of the calls which may be quite far apart in the code. Otherwise so long as the key type doesn’t have a static constructor, this is fine with me.

Thanks,
Pete

Rafael Espíndola

unread,
Aug 20, 2014, 1:19:56 PM8/20/14
to Pete Cooper, LLVM Dev
>> Another option (not my preference) would be to use globals just as keys:
>>
>> typedef <something> LLVMOptionKey;
>> ...
>> static LLVMOptionKey ScalarizeLoadStoreKey; //global
> You could do this if the key is the name of the option, but that detaches the name from the rest of the calls which may be quite far apart in the code. Otherwise so long as the key type doesn’t have a static constructor, this is fine with me.

The Key global is only used for its unique address, so LLVMOptionKey
could be just "void *" for example.

Owen Anderson

unread,
Aug 20, 2014, 2:07:32 PM8/20/14
to Chandler Carruth, LLVM Dev

On Aug 19, 2014, at 10:24 PM, Chandler Carruth <chan...@google.com> wrote:

For example, we don't require all constants to be per-pass, and instead have per-file constants. Rafael is suggesting that one use case for cl::opt global variables is, in essence, a constant that is somewhat easier for a developer of LLVM (*not* a user) to change during debugging and active development. I don't think the desire for convenience and only supporting the default value in production contexts are completely invalid.

I think this statement ignores important debugging use cases, and over-simplifies the model of how LLVM-based frameworks are often developed.  As an example, say I develop an LLVM backend for a custom co-processor, and the compiler for it runs inside the software stack of the driver for my co-processor.  Per what you and Rafael are saying, the driver should not be able to programmatically set command line options to LLVM.  But this ignores the reality that, when developing my backend, I may still need to be able to twiddle debugging options when debugging a live driver stack.

Fundamentally, you argument is rooted in a world where LLVM debugging is always done on the llc or clang binaries, where the LLVM developer can easily modify the command line arguments themselves.  That’s not true of many use cases where LLVM as a shared library is relevant.

—Owen

Evan Cheng

unread,
Aug 20, 2014, 3:15:16 PM8/20/14
to Rafael Espíndola, LLVM Dev

> On Aug 20, 2014, at 9:31 AM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
>
>> So, we do have 2 solutions: a global variable, and a call to GetValue. As you can see, either way isn’t perfect, but if you can think of another solution i’m sure we can discuss it.
>
> Pass the passinfo to the pass constructor maybe?
>
> I still don't understand what the problem with the global is. It has
> the same value for all users. As Chandler pointed out, having

Globals are bad for many reasons: namespace pollution, concurrency issues, lack of access control, etc. etc.. Some of them have been discussed in this thread. Perhaps it’s not a big concern for most of the LLVM users. But we have an unique environment where LLVM is shared by multiple clients, and where the concern around exploits are especially strong. So while eliminating globals is not strictly tied to the elimination of static initializers, it is still a strong goal towards providing a LLVM dylib.

Evan

Rafael Avila de Espindola

unread,
Aug 20, 2014, 3:18:07 PM8/20/14
to Owen Anderson, LLVM Dev
There nothing magical about llc or opt. If you need to pass debug options to your own tools, you can do that with command line options like llc or opt, but the important factors remain:

* this is a debugging session, not regular use.
* the option storage can remain static.
* we don't need and API for setting the value of the option, command line is sufficient.

Sent from my iPhone
_______________________________________________

Rafael Espíndola

unread,
Aug 20, 2014, 3:45:43 PM8/20/14
to Evan Cheng, LLVM Dev
>> Pass the passinfo to the pass constructor maybe?
>>
>> I still don't understand what the problem with the global is. It has
>> the same value for all users. As Chandler pointed out, having
>
> Globals are bad for many reasons: namespace pollution, concurrency issues, lack of access control, etc. etc.. Some of them have been discussed in this thread. Perhaps it’s not a big concern for most of the LLVM users. But we have an unique environment where LLVM is shared by multiple clients, and where the concern around exploits are especially strong. So while eliminating globals is not strictly tied to the elimination of static initializers, it is still a strong goal towards providing a LLVM dylib.

Fair enough. I would still make at separate stage since how to remove
them is the part that is still a bit contentious. Note that from a
security point of view a well known string probably makes things worse
than the global option storage since the address of the storage is at
least randomized.

It seems that both mine and Peter's proposal would avoid the global
storage and not introduce well known string for setting the options.

Sean Silva

unread,
Aug 20, 2014, 3:54:24 PM8/20/14
to Pete Cooper, LLVM Dev
On Tue, Aug 19, 2014 at 11:12 PM, Pete Cooper <peter_...@apple.com> wrote:

On Aug 19, 2014, at 9:43 PM, Sean Silva <chiso...@gmail.com> wrote:

One interesting issue with moving away from the current system of static initializers for cl::opt is that we will no longer have the automatic registration of all the options so that -help will print everything available and generally we will not be able to issue an error for an "unknown command line option" (without calling into any other code).
Not automatic no, but in the proposal Chris puts the addOption call inside the pass initializer which is called before ParseCommandLineOptions.  This means you’ll still get options listed as you currently do, so long as you continue to calls the pass initializers before parse (something you have to do anyway to get the pass name visible to the command line)


The auto-registration is fundamentally tied with the globalness and the static initializers; pondering this has led me down an interesting path that has made me understand better my suggestion in the other thread. As I see it, there are two very different sorts of uses of llvm::cl in LLVM:

1. For regular command line processing. E.g. if a tool accepts an output file, then we need something that will parse the argument from the command line.

2. As a way to easily set up a conduit from A to B, where A is the command line and B is some place "deep" inside the LLVM library code that will do something in response to the command line.

(and, pending discussion, someday point A might include a proper programmatic interface (i.e. in a way other than hijacking the command line processing))
That would be nice.  I just suggested in another thread that we expose ParseCommandLineOptions to the C API to hack around this, but a nice clean interface would of course be better.


llvm::cl does a decent job for #1 and that is what it was designed for AFAICT; these uses of llvm::cl live outside of library code and everything is pretty happy, despite them being global and having static initializers.

The problem is that llvm::cl is not very well-suited to #2, yet it is used for #2, and that is the real problem. We need a solution to problem #2 which does not use llvm::cl. Thus, I don't think that the solution you propose here is the right direction.

The first step is to clearly differentiate between #1 and #2. I will say "command line options" for #1 and "configuration/tweak points" for #2. (maybe "library options" is better for #2; neither is perfect terminology)

The strawman I suggested in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a stab at #2. There is no way to dodge being stringly typed since command lines are stringly typed, so really it is just a question of how long a solution stays stringly typed.

My thought process for staying stringly typed "the whole time" (possibly with some caching) comes from these two desires:
- adding a c/t point should require adding just one call into the c/t machinery (this is both for convenience and for DRY/SPOT), and 
Right.  The current point is in the pass initializer.

There is another point in the pass constructor to read the option value.  This is the only point at which something will change from being a string to its actual type and value.

Yes, that is two places. My design takes it as a requirement that there be one place, which drives it to be completely different.
 

- this change should be localized to the code being configured/tweaked
This is the thought process:

Note that llvm::cl is stringly typed until it parses the options. llvm::cl gives the appearance of a typed interface because it uses static initialization as a backdoor to globally transport the knowledge of the expected type to the option parsing machinery (very early in the program lifetime). Without this backdoor, we need to stay stringly typed longer, at least until we reach the "localized" place where the single call into the c/t machinery is made; this single call is the only place that has the type information needed for the c/t value to become properly typed. But there is no way to know how long it will be until we reach that point (or even *if* we reach that point; consider passes that are not run on this invocation).
The current proposal exposes the type in addOption (as well as later when we get the option). So the type continues to be known to the command line parser.  Whether you want to actually type check in the command line is a point i’m open to discuss.  

My solution stays stringly typed explicitly because it doesn't have that type information until later. If you have the type and option information earlier, then it is hard to imagine a reason not to resolve it immediately into a properly typed form, issue diagnostics, etc.
 
Personally i want a command line option to be type checked because it was registered, even if no-one actually gets the value of the option later.


Hence my suggestion of just putting a stringly typed key-value store (or whatever) in an easily accessible place (like LLVMContext), and just translating any unrecognized command line options (ones that are not for #1) into that stringly typed storage.
I’m against it being in the context because you may want to set up and reuse passes multiple times with the same options, and use that configuration to compile multiple LLVMContexts.  But I do agree that having a store with some lifetime is useful.

I think the current proposal is to have the store be a singleton, but there’s nothing to stop further work to have the storage for options be one per thread for example.  If you wanted to have one pass manager per thread with its own set of passes, configured (currently) via their own call to ParseCommandLineOptions then that would be possible with little work beyond the current proposal.


I agree with Rafael that "constructor arguments to passes" are not c/t points. In the future, there might be some way to integrate the two (from the referenced post, you can probably tell that I kind of like the idea of doing so), but for now, I think the clear incremental step is to attack #2 and solve it without llvm::cl. I have suggested a way to do this that I think makes sense.
If you change the current proposal so that it doesn’t read cl::opt, then I think this reads to me like what is being proposed now. Really its creating a string->string map with addOption, and getting the values with getOption.  The passes don’t care (or know) whether the options are set via the command line or any other API.  I hope i’ve understood your proposal correctly here.  Please correct me otherwise.

It is completely different. The Chris's proposal drops both requirements I set out which completely drive the design:
- adding a c/t point should require adding just one call into the c/t machinery (this is both for convenience and for DRY/SPOT)
  + Chris's OP requires 2 calls. thus it is less convenient, and violates Don't Repeat Yourself / Single Point Of Truth
- this change should be localized to the code being configured/tweaked
  + Some c/t points are in places that don't have a "localized" place that is called upon initialization, thus a non-localized change is required to insert the call to addOption

I'm open to discussing whether these are actually the true requirements we want to pursue for c/t points. However, if you assume these two as requirements, I believe that my solution makes a lot of sense.

It might be useful split #2 into two parts, which I call Chandler and Renato, since their posts suggested each use case:
- Chandler: The convenience of cl::opt is immense for developers. If we use cl::opt for things that the user of LLVM shouldn't be changing (but a developer might), then we can just not expose them to users. In production, these are not configurable, so they can be compiled down to hardcoded values and thus not have static initializers.
- Renato: Have a more formalized system for exposing certain kinds of options (e.g. pass options, target options). Splitting out the Chandler part eliminates the "convenience" requirement which drove a lot of my original thoughts. Thus we can have some designated well-defined ways to declare them up-front, and they can be less convenient to use.

However, given Owen's input, it might be that we really do not want to split #2 like this. In that case, I think my solution is a direction that makes sense:
- It eliminates the static initializers that arise from usage of cl::opt in library code (the overarching problem, which impacts LLVM's usability as a library)
- It opens the door to responsibly exposing access to the current cl::opt's in library code in a proper programmatic way (non-global, and even easily controlling this access by filtering keys, etc.) (Owen highlighted this as important; many users of LLVM as a library pointed out the desire to be given access to configuration/tweak points)
- It does not significantly regress on the convenience of adding a new "library option"/"configuration/tweak point" in the future for development (which Chandler has highlighted is important)
- It opens the door to a more formalized and well-defined system where needed (which Renato said is important)

I would also like to reiterate that uses of cl::opt in library code are fundamentally different from uses of cl::opt in tool code. The uses of cl::opt in library code are more about transporting a value "deep" into some part of the LLVM libraries to change their behavior, as a side-channel to the ordinary flow of information. We need to focus on that reality, and design a solution which addresses the needs to transport information deep into the library code.

-- Sean Silva

Owen Anderson

unread,
Aug 20, 2014, 4:10:45 PM8/20/14
to Rafael Avila de Espindola, LLVM Dev
On Aug 20, 2014, at 12:16 PM, Rafael Avila de Espindola <rafael.e...@gmail.com> wrote:

There nothing magical about llc or opt. If you need to pass debug options to your own tools, you can do that with command line options like llc or opt, but the important factors remain:

* this is a debugging session, not regular use.
* the option storage can remain static.
* we don't need and API for setting the value of the option, command line is sufficient.

#2 and #3 are not true.

•  As Chris has pointed out, it’s a perfectly normal circumstance in the systems we deal with to have multiple instances of LLVM resident for different clients.  Lots of applications bring in both WebKit and the OpenGL, including the browser itself via WebGL!  If I’m debugging or testing one component of the system, I definitely do not want debug options I have set to cause differences in the other components.

• A lot of these users are not launched at the command line in a typical fashion.  They’re either embedded into the address space of an existing application (which may have a totally different command line interface, or indeed none at all), or running in daemons whose lifetime is controlled by other parts of the system.  Either way, setting options at the command line in not always viable.

I agree that llc, opt and clang *shouldn’t* be special, but everything you’re saying here is firmly rooted in a world where they are special, or at least all use cases for debugging LLVM look like them.

—Owen

Chris Bieneman

unread,
Aug 20, 2014, 5:18:58 PM8/20/14
to Owen Anderson, LLVM Dev
I think Rafael’s suggestion of using the address of a global as the key to the option store gives us the ability to accomplish all our goals without introducing a string keyed option store.

I’ve updated my patches from the original proposal to reflect those changes.

Does this look reasonable?

Thanks,
-Chris
cl_opt.diff

Sean Silva

unread,
Aug 20, 2014, 6:14:21 PM8/20/14
to Chris Bieneman, LLVM Dev
The design seems to require no less than 4 not-too-close-to-each-other code changes to add an option (and LLVMOptionKey declaration, an LLVMOptionKey initialization, a call to GetValue, and a call to CreateOption). I think that the convenience for developers is an important aspect and this solution regresses significantly on that front.

Especially, I think that we need to evaluate the viability of an explicit CreateOption call. Do all uses of cl::opt in the library code have a relatively convenient place to put that call? If not, then we won't be able to migrate them, and we will be stuck with a partial solution. If we are willing to accept a partial solution, then the design discussion needs to change; at the very least we need to consider the scope of the partial solution.

-- Sean Silva
 

Thanks,
-Chris

Chris Bieneman

unread,
Aug 20, 2014, 6:58:15 PM8/20/14
to Sean Silva, LLVM Dev
On Aug 20, 2014, at 3:12 PM, Sean Silva <chiso...@gmail.com> wrote:




On Wed, Aug 20, 2014 at 2:16 PM, Chris Bieneman <be...@apple.com> wrote:
I think Rafael’s suggestion of using the address of a global as the key to the option store gives us the ability to accomplish all our goals without introducing a string keyed option store.

I’ve updated my patches from the original proposal to reflect those changes.

Does this look reasonable?

The design seems to require no less than 4 not-too-close-to-each-other code changes to add an option (and LLVMOptionKey declaration, an LLVMOptionKey initialization, a call to GetValue, and a call to CreateOption). I think that the convenience for developers is an important aspect and this solution regresses significantly on that front.

Especially, I think that we need to evaluate the viability of an explicit CreateOption call. Do all uses of cl::opt in the library code have a relatively convenient place to put that call? If not, then we won't be able to migrate them, and we will be stuck with a partial solution. If we are willing to accept a partial solution, then the design discussion needs to change; at the very least we need to consider the scope of the partial solution.

Most cl::opts are in the passes, so they do have a convenient place to put the CreateOption call. There are others in places that are less convenient - like ISel lowering. However there are always initialization calls that CreateOption calls can be put into (in targets, libraries, etc).

Fundamentally you and I are looking at this problem differently. My proposal has the following requirements:
1) Preserve existing command line option function exactly as it is today
2) Eliminate all static initializers associated with command line options and their storage
3) Design a solution that can be phased in with existing code so we don’t need to change all cl::opts in the same commit

These requirements unfortunately made your requirements unfeasible.

One important point I would make, is that while my changes may make it slightly less convenient to create new command line options, it does not impact the ability of a developer to manipulate the values of the options. The flags behave as they do today, and there is still one line of code somewhere that sets the value the compiler actually uses. In my example this is in the constructor for the pass.

-Chris

Sean Silva

unread,
Aug 20, 2014, 7:20:09 PM8/20/14
to Chris Bieneman, LLVM Dev
On Wed, Aug 20, 2014 at 3:53 PM, Chris Bieneman <be...@apple.com> wrote:

On Aug 20, 2014, at 3:12 PM, Sean Silva <chiso...@gmail.com> wrote:




On Wed, Aug 20, 2014 at 2:16 PM, Chris Bieneman <be...@apple.com> wrote:
I think Rafael’s suggestion of using the address of a global as the key to the option store gives us the ability to accomplish all our goals without introducing a string keyed option store.

I’ve updated my patches from the original proposal to reflect those changes.

Does this look reasonable?

The design seems to require no less than 4 not-too-close-to-each-other code changes to add an option (and LLVMOptionKey declaration, an LLVMOptionKey initialization, a call to GetValue, and a call to CreateOption). I think that the convenience for developers is an important aspect and this solution regresses significantly on that front.

Especially, I think that we need to evaluate the viability of an explicit CreateOption call. Do all uses of cl::opt in the library code have a relatively convenient place to put that call? If not, then we won't be able to migrate them, and we will be stuck with a partial solution. If we are willing to accept a partial solution, then the design discussion needs to change; at the very least we need to consider the scope of the partial solution.

Most cl::opts are in the passes, so they do have a convenient place to put the CreateOption call. There are others in places that are less convenient - like ISel lowering. However there are always initialization calls that CreateOption calls can be put into (in targets, libraries, etc).

Fundamentally you and I are looking at this problem differently. My proposal has the following requirements:
1) Preserve existing command line option function exactly as it is today
2) Eliminate all static initializers associated with command line options and their storage
3) Design a solution that can be phased in with existing code so we don’t need to change all cl::opts in the same commit

I think your approach makes sense in view of those requirements.

-- Sean Silva

Chris Bieneman

unread,
Aug 21, 2014, 12:20:13 AM8/21/14
to Rafael Espíndola, LLVM Dev
I think Rafael’s suggestion of using the address of a global as the key to the option store gives us the ability to accomplish all our goals without introducing a string keyed option store.

I’ve updated my patches from the original proposal to reflect those changes.

Does this look reasonable?

Thanks,
-Chris
cl_opt.diff

Rafael Espíndola

unread,
Aug 21, 2014, 1:57:48 AM8/21/14
to Chris Bieneman, LLVM Dev
I don't have any objections to this approach. Just some comments:

+ LLVMOptionKey* Key; // The name of the option used by the OptionRegistry

The comment looks out of date.

+ std::map<LLVMOptionKey*,Option*> Options;

Why std::map? DenseMap maybe?

+ void AddOption(LLVMOptionKey* key, Option *O);
+ void RemoveOption(Option *O);

addOption, removeOption. Same for other methods.

+LLVMOptionKey Scalarizer::ScalarizeLoadStoreKey = 0;

the value is not used, so I would probably omit the "= 0;".

Chandler Carruth

unread,
Aug 21, 2014, 2:21:41 AM8/21/14
to Chris Bieneman, LLVM Dev
I'll try to provide some feedback on this patch assuming that we want to try to completely move away from global variables in a single step (something that I do generally support, although I don't understand why a more incremental approach is unacceptable -- it seems like a bit more work over all but a somewhat smoother transition for the community).

I feel like the APIs you're ending up with are bending around backwards to achieve things that it isn't even clear are valuable:

1) They seem to be trying to working very hard to behave similarly to the existing cl::opt interfaces. If we're so radically changing the core design of cl::opt, different interfaces might be in order.

2) Perhaps this was explained up-thread (I'm going to try to read most of the older posts, but I'm still catching up on this thread) but if we're going this direction I would really expect the options to be per-LLVMContext... I feel like these are just "lazy globals with private storage" which don't seem all that conceptually better than globals. They've just engineered around the technical problems caused.

3) I really dislike the whole INITIALIZE_PASS_{BEGIN,END} pattern. I'm not sure if there is any realistic alternative, but I'm expecting to essentially have to rework every single one of these for the new pass manager. =[


I'm going to try to read Sean's proposal, but I'm suspect it is essentially a strongly typed option registry inside the LLVMContext? With some way to collect the options first and then to parse flags? That kind of approach seems more likely to really solve this issue once and for all and be an excellent, and scalable long-term solution... but as I said, I'm still catching up on the thread. I'll try to post more coherent thoughts when I've had time to read the other posts fully and skim through the other proposals.



Chris Bieneman

unread,
Aug 21, 2014, 1:23:42 PM8/21/14
to Chandler Carruth, LLVM Dev
Thanks Chandler. I would very much appreciate having your feedback on this.

On Aug 20, 2014, at 11:18 PM, Chandler Carruth <chan...@google.com> wrote:

I'll try to provide some feedback on this patch assuming that we want to try to completely move away from global variables in a single step (something that I do generally support, although I don't understand why a more incremental approach is unacceptable -- it seems like a bit more work over all but a somewhat smoother transition for the community).

I feel like the APIs you're ending up with are bending around backwards to achieve things that it isn't even clear are valuable:

1) They seem to be trying to working very hard to behave similarly to the existing cl::opt interfaces. If we're so radically changing the core design of cl::opt, different interfaces might be in order.

I wouldn’t say the APIs are working hard to behave similarly to the existing cl::opt interfaces as much as I designed this patch to provide incremental improvements to cl::opt, and changing the API was not on the table for the first change.

The primary focus of this change is to provide a new way of defining existing cl::opt objects such that they are not globally scoped. This change should not preclude us from coming up with a new better API for options, but that should be a separate discussion from using the existing API while avoiding some of the less desirable side-effects.


2) Perhaps this was explained up-thread (I'm going to try to read most of the older posts, but I'm still catching up on this thread) but if we're going this direction I would really expect the options to be per-LLVMContext... I feel like these are just "lazy globals with private storage" which don't seem all that conceptually better than globals. They've just engineered around the technical problems caused.

Earlier in the thread we’ve danced around this. My original proposal was to eventually create an option store object that would be passed into the context, but would be independent of it. The reason for not putting it in the context is because largest consumers of command line options are passes, and we already have use cases where we re-use pass managers with different contexts. Due to some early feedback on the thread I thought it would be best to shelve that conversation for a later date, but I’d love to know what feedback you have in that area.


3) I really dislike the whole INITIALIZE_PASS_{BEGIN,END} pattern. I'm not sure if there is any realistic alternative, but I'm expecting to essentially have to rework every single one of these for the new pass manager. =[

I agree the INITIALIZE_PASS macros are nasty, but fundamentally I think you either need to be okay with global initializers (which we’re not), or you need to have some form of an explicit initialization.



I'm going to try to read Sean's proposal, but I'm suspect it is essentially a strongly typed option registry inside the LLVMContext? With some way to collect the options first and then to parse flags? That kind of approach seems more likely to really solve this issue once and for all and be an excellent, and scalable long-term solution... but as I said, I'm still catching up on the thread. I'll try to post more coherent thoughts when I've had time to read the other posts fully and skim through the other proposals.

I don’t disagree that there are merits to Sean’s proposal, but there are also issues. I’d be more than happy to debate this in more detail, but I would actually ask that we set it aside for now. I’d really prefer if we didn’t gate removing static initializers and global option storage, which I think we can both agree are good things, on a larger change to a core design pattern within LLVM.

Thanks,
-Chris

Chandler Carruth

unread,
Aug 21, 2014, 1:36:56 PM8/21/14
to Chris Bieneman, LLVM Dev

On Thu, Aug 21, 2014 at 10:16 AM, Chris Bieneman <be...@apple.com> wrote:
I’d really prefer if we didn’t gate removing static initializers and global option storage, which I think we can both agree are good things, on a larger change to a core design pattern within LLVM.

FWIW, while I think it may be possible to remove static initializers without changing core design patterns within LLVM, I don't think that's the case for removing global storage. Quite fundamentally, once you're not using global storage I think you want a *different* interface to these things. =/

Jonathan Roelofs

unread,
Aug 21, 2014, 1:50:15 PM8/21/14
to Owen Anderson, Rafael Avila de Espindola, LLVM Dev


On 8/20/14, 2:07 PM, Owen Anderson wrote:
>
>> On Aug 20, 2014, at 12:16 PM, Rafael Avila de Espindola
>> <rafael.e...@gmail.com <mailto:rafael.e...@gmail.com>> wrote:
>>
>> There nothing magical about llc or opt. If you need to pass debug options to
>> your own tools, you can do that with command line options like llc or opt, but
>> the important factors remain:
>>
>> * this is a debugging session, not regular use.
>> * the option storage can remain static.
>> * we don't need and API for setting the value of the option, command line is
>> sufficient.
>
> #2 and #3 are not true.
>
> • As Chris has pointed out, it’s a perfectly normal circumstance in the systems
> we deal with to have multiple instances of LLVM resident for different clients.
> Lots of applications bring in both WebKit and the OpenGL, including the
> browser itself via WebGL! If I’m debugging or testing one component of the
> system, I definitely do not want debug options I have set to cause differences
> in the other components.
+ 1
>
> • A lot of these users are not launched at the command line in a typical
> fashion. They’re either embedded into the address space of an existing
> application (which may have a totally different command line interface, or
> indeed none at all), or running in daemons whose lifetime is controlled by other
> parts of the system. Either way, setting options at the command line in not
> always viable.
+ 1
>
> I agree that llc, opt and clang *shouldn’t* be special, but everything you’re
> saying here is firmly rooted in a world where they are special, or at least all
> use cases for debugging LLVM look like them.
>
> —Owen
>
>
>> Sent from my iPhone
>>
>> On Aug 20, 2014, at 13:42, Owen Anderson <resi...@mac.com
>> <mailto:resi...@mac.com>> wrote:
>>
>>>
>>>> On Aug 19, 2014, at 10:24 PM, Chandler Carruth <chan...@google.com
>>>> <mailto:chan...@google.com>> wrote:
>>>>
>>>> For example, we don't require all constants to be per-pass, and instead have
>>>> per-file constants. Rafael is suggesting that one use case for cl::opt
>>>> global variables is, in essence, a constant that is somewhat easier for a
>>>> developer of LLVM (*not* a user) to change during debugging and active
>>>> development. I don't think the desire for convenience and only supporting
>>>> the default value in production contexts are completely invalid.
>>>
>>> I think this statement ignores important debugging use cases, and
>>> over-simplifies the model of how LLVM-based frameworks are often developed.
>>> As an example, say I develop an LLVM backend for a custom co-processor, and
>>> the compiler for it runs inside the software stack of the driver for my
>>> co-processor. Per what you and Rafael are saying, the driver should not be
>>> able to programmatically set command line options to LLVM. But this ignores
>>> the reality that, when developing my backend, I may still need to be able to
>>> twiddle debugging options when debugging a live driver stack.
>>>
>>> Fundamentally, you argument is rooted in a world where LLVM debugging is
>>> always done on the llc or clang binaries, where the LLVM developer can easily
>>> modify the command line arguments themselves. That’s not true of many use
>>> cases where LLVM as a shared library is relevant.
>>>
>>> —Owen
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLV...@cs.uiuc.edu <mailto:LLV...@cs.uiuc.edu> http://llvm.cs.uiuc.edu
>>> <http://llvm.cs.uiuc.edu/>
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Chris Bieneman

unread,
Aug 21, 2014, 1:56:38 PM8/21/14
to Chandler Carruth, LLVM Dev
The problem that led me to tackling both together was that there are certain command line options that we use today where the storage itself will incur static initializers if it is global. Examples of these are command line options where the stores are lists, strings, and classes.

The fact that our options support these types makes the problem of eliminating static initializers bound to the problem of eliminating global storage. This problem also gets more complicated when you take into account that the logical place to move the storage for many of these options (the passes themselves) don’t actually exist at the time that you are parsing the options.

If you have suggestions on how to tackle these problems I’d be grateful for the feedback.

-Chris

Chandler Carruth

unread,
Aug 26, 2014, 1:45:31 PM8/26/14
to Chris Bieneman, LLVM Dev
A bunch of IRC discussions etc have taken place, and I wanted to surface some of the points from them here. Also, I wanted to give a more fully thought-through set of my own perspectives.


First, I'd like to re-state the goals as I think those have gotten muddied.

1) Remove most of LLVM's static initializers (that is, dynamic initialization code for static storage duration objects) by moving away from global objects with non-trivial constructors of 'llvm::cl::opt' types in the LLVM *libraries* (they remain fine in the tools of course).
 
2) Support separate LLVM users in a single address space having different values for the debugging knobs currently managed through the 'llvm::cl::opt' command line flags mechanism.

3) (related to #2) Support some library API for setting the debugging knobs currently managed through the 'llvm::cl::opt' command line flags mechanism.


Now, I want to change my position on one point because after talking to Chris, Jim, and others I understand the problem significantly better. I am now convinced that it is impossible to fix even #1 without making a change for every single option in LLVM. Why? Because we have to *register* all of the flags, and this registration inherently cannot be done from a global's constructor without introducing the undesirable static initializers[1].

I also think that in order to address #2 we would need to make some change to essentially every option in LLVM as we would have to connect it to non-global storage.

Given that, I think it does make sense to try to address #1 and #2 at the same time -- trying to limit the change to just one or the other would very likely result in a significantly different API for creating, registering, and using these options, and I would rather not make all of the LLVM developers re-learn how to do this 2 times if we can instead do it 1 time.


However, doing #2 only really makes sense if we also do #3 to let people *use* this new freedom. =] Owen convinced me that there really is a good reason to support #3 as part of the LLVM libraries, however I wanted to raise my primary concern about this on the mailing list because it hasn't really been discussed.

These "debugging knobs" or library level options which *aren't* exposed in the library's APIs are very risky to expose in a side-channel API: they are often completely unsupported or unsupportable options that were never intended for "production" usage. Once we expose these options via some API to library users, we run a very serious risk of getting locked into supporting behaviors or debugging misbehavior that were never expected or planned for...

However, the debugging use case is real and important. So I want to propose that we guard many of the paths to setting these options with NDEBUG so that in non-asserts builds, library users can't override these options. This would mean the C++ API for setting this in LLVMContext (and the C API if we ever expose it there) would only support setting these in asserts builds. We could even sink the checking into the registry. The name should probably have "debug" in it somewhere. It would then have some InternalUnsupportedScaryBad API which is not actually in the header files but is used in LLVM's tools to maintain their current behavior without documenting a public API. It's not bulletproof, but it seemed that it would be sufficient for the debugging use cases and is a strong statement of "unsupported" IMO.


The only other observation I would like to make is that if we're going to do this, we really should take the time to try to sort out a really *good* API. Hopefully we'll be slightly less wrong this time, and maybe the next transition can be more incremental.


Ok, on to the more low-level parts of this email, here some specific thoughts for how to design the API for further discussion:

- I really like the idea of using a 'void*' derived from some static tag as a way to identify uniquely each option. I also think we can use some minimal "reflection" techniques to significantly simplify this.

- The need for a global registry is... distasteful, but I agree inevitable.

- I'd like to avoid a separate "option store". My current suggestion is to use the LLVMContext, and to add a context to the layers which don't have an LLVMContext and where these kinds of options make sense.

- I would actually use a simple 'void*' -> string mapping in the "store" of the context. (Maybe we can actually use StringMap?) I would still provide the type to the registry and use that to validate options when parsing. When getting the value out of the "store" in the context, we can parse it into the data-type and assert fail on parse errors. I don't feel strongly about this, but without "std::any" it seems simpler than rolling our own with a void* or a union.

- I would provide a simple API on the LLVMContext to get an "option" string for a given key. This API could be exposed by any context, and we can write the option value extraction library to accept a generic T which supports such a query method. That way there is *very* loose coupling between these and it is easy to support any context object at any layer.

- The registry should essentially provide an interface to parse an argv string list into such a map from 'void*' -> string. The LLVMContext (or any other context) can then be populated by the result of this parse. Each library can parse whatever argv it wants.

- I don't think we should keep piling on the interface design of the existing cl::opt stuff. I would make a clean break to a new header / library, and just build some code to map these into the cl::opt parsing code from the registry (we already do stuff like this for pass names). This will also help the two systems co-exist but stay separate during the (inevitably long) transition period.

- As a specific example of what I would change from the cl::opt stuff: I think we should make the required arguments just be required and positional, and the optional arguments use a fluent-style interface (yea, I know, but I think it works well here) rather than the weird variadic-ish thing...


Once there is a patch that seems like a reasonable start at this, I'd actually like to get the new library added and just one pass (the one you're using seems great) converted. I'd then like to try a few experiments to see if we can improve some of the nitty-gritty details of the API before we roll it out widely.

Seem reasonable? I'm actually happy to help with some of the API experiments and the rollout here.
-Chandler


[1]: I'd actually like to understand why on some platforms the explicit global initialization routines are so much better than global constructors... But that's probably a topic for on off-list or IRC conversation. I don't want to derail this thread with that.

Rafael Espíndola

unread,
Aug 26, 2014, 2:54:52 PM8/26/14
to Chandler Carruth, Joerg Sonnenberger, LLVM Dev
> These "debugging knobs" or library level options which *aren't* exposed in
> the library's APIs are very risky to expose in a side-channel API: they are
> often completely unsupported or unsupportable options that were never
> intended for "production" usage. Once we expose these options via some API
> to library users, we run a very serious risk of getting locked into
> supporting behaviors or debugging misbehavior that were never expected or
> planned for...

Agreed. This is my only real concern in here.

> However, the debugging use case is real and important. So I want to propose
> that we guard many of the paths to setting these options with NDEBUG so that
> in non-asserts builds, library users can't override these options. This
> would mean the C++ API for setting this in LLVMContext (and the C API if we
> ever expose it there) would only support setting these in asserts builds. We
> could even sink the checking into the registry. The name should probably
> have "debug" in it somewhere. It would then have some
> InternalUnsupportedScaryBad API which is not actually in the header files
> but is used in LLVM's tools to maintain their current behavior without
> documenting a public API. It's not bulletproof, but it seemed that it would
> be sufficient for the debugging use cases and is a strong statement of
> "unsupported" IMO.

Just to be clear, the idea is that in a release build the only way of
setting options is via the command line? That they can still be passed
to opt or llc, but they would not be considered part of the API.

> - I really like the idea of using a 'void*' derived from some static tag as
> a way to identify uniquely each option. I also think we can use some minimal
> "reflection" techniques to significantly simplify this.

I like it too. The reason I do is that it is one more layer in a
defense in depth against making our internal options part of the API.
The key is not just a well known value (like a string).

> - I'd like to avoid a separate "option store". My current suggestion is to
> use the LLVMContext, and to add a context to the layers which don't have an
> LLVMContext and where these kinds of options make sense.

I agree. There are few places in LLVM without a LLVMContext. For
example, there is one cl::opt in lib/MC and Joerg already has a patch
to turn it into a proper API.

Cheers,
Rafael

Chandler Carruth

unread,
Aug 26, 2014, 3:52:54 PM8/26/14
to Rafael Espíndola, Joerg Sonnenberger, LLVM Dev

On Tue, Aug 26, 2014 at 11:52 AM, Rafael Espíndola <rafael.e...@gmail.com> wrote:
Just to be clear, the idea is that in a release build the only way of
setting options is via the command line? That they can still be passed
to opt or llc, but they would not be considered part of the API.

Yep. The path to being part of the API is, well, to put it in the API -- constructors, methods, whatever.

Pete Cooper

unread,
Aug 26, 2014, 8:11:33 PM8/26/14
to Rafael Espíndola, Joerg Sonnenberger, LLVM Dev
I disagree with this point. The majority of the fields of the context are IR. An MC only tool should not be required to link the LLVM IR just because they want to use a command line option.

Chris’ suggestion is to make an OptionStore class and pass it throughout the APIs where we want to handle options. At first the store should be a global singleton. Users of options don’t need to know about the LLVMContext or any other unrelated data structure. If the options are stored as a field of the context then that can easily be done too.

If we later decide that the OptionStore should live in the LLVMContext or any other location then thats fine, but moving it to that point would be a minimal change as no APIs have to be rewritten. Also, if we decide that we want the LLVMContext to have an option store, but MC level tools to be allowed to define their own, then that would also be possible.

Thanks,
Pete

Sean Silva

unread,
Aug 26, 2014, 8:32:43 PM8/26/14
to Pete Cooper, LLVM Dev, Joerg Sonnenberger
The way I understand it, Chandler's proposal is to let any object act as an object store if it satisfies the interface requirements (tested through SFINAE or whatever). I'm guessing that whatever method the LLVMContext uses to store the options will probably be reusable.

Just using a concrete ObjectStore class might be simpler than a more loosely-coupled but more complex metaprogramming based approach (which as I understand it is his suggestion).

-- Sean Silva

Chandler Carruth

unread,
Aug 26, 2014, 8:35:17 PM8/26/14
to Pete Cooper, LLVM Dev, Joerg Sonnenberger
On Tue, Aug 26, 2014 at 5:09 PM, Pete Cooper <peter_...@apple.com> wrote:
>> - I'd like to avoid a separate "option store". My current suggestion is to
>> use the LLVMContext, and to add a context to the layers which don't have an
>> LLVMContext and where these kinds of options make sense.
>
> I agree. There are few places in LLVM without a LLVMContext. For
> example, there is one cl::opt in lib/MC and Joerg already has a patch
> to turn it into a proper API.
I disagree with this point.  The majority of the fields of the context are IR.  An MC only tool should not be required to link the LLVM IR just because they want to use a command line option.

Sorry, this is my fault, I phrased something poorly.

When I said "add a context to the layers ..." I didn't mean to add an LLVMContext to them. I agree, that's crazy. =] I meant add some context object *for* that layer. Maybe an MCContext. This would *only* traffic in the shared context and state needed by that layer, not any other layer. If the layer needs these debug options, they can expose a generic interface from that context and the option management code will happily use it.

The reason I mention this is that I don't want us to develop N different objects which are essentially "side cars" to carry additional state around a layer of abstraction. We've been using LLVMContext to do that successfully at the IR layer. At other layers where we need to do the same, I'd like to add a single context *type* to represent that layer, and use objects of that type to pass around both options and any other common structures needed at that layer.

-Chandler

Pete Cooper

unread,
Aug 26, 2014, 8:36:25 PM8/26/14
to Chandler Carruth, LLVM Dev, Joerg Sonnenberger
Sounds good.  Thanks for clarifying that.

Thanks,
Pete

-Chandler

Chandler Carruth

unread,
Aug 26, 2014, 8:37:37 PM8/26/14
to Sean Silva, Joerg Sonnenberger, LLVM Dev

On Tue, Aug 26, 2014 at 5:29 PM, Sean Silva <chiso...@gmail.com> wrote:
more complex metaprogramming based approach (which as I understand it is his suggestion).

FWIW, I think the complexity here is going to be *tiny*.

Robinson, Paul

unread,
Aug 26, 2014, 9:21:31 PM8/26/14
to Chandler Carruth, Chris Bieneman, LLVM Dev

I count 15 occurrences of "-backend-option" in clang/lib/Driver/Tools.cpp and most (all? didn't look at them real hard) of those aren't debugging options.  Presumably those can all be solved by adding more plumbing, but it would be *really* nice to make that plumbing easier to do.

--paulr

Chris Bieneman

unread,
Aug 27, 2014, 5:39:16 PM8/27/14
to Robinson, Paul, LLVM Dev
Chandler brought up a lot of really great points. One of his big points was that if we’re going to touch every pass, we may as well fix the cl::opt API too.

Toward that end I’ve put together some patches with a new fluent-style API as a starting point of a conversation.

Disclaimer: To make these patches work with the existing cl::opts there is some nasty hackery going on in CommandLine.h — For the sake of this conversation I’d like to focus our attention on the API for creating and accessing options in Scalarizer.cpp

My intent with these changes is to provide a groundwork for meeting all of Chandler’s main points. These patches have the following high-level changes from my previous patches:

1) I’ve switched to some crazy template-foo for generating IDs to identify options
2) I’ve moved all the new API components out of the cl namespace
3) There is no option storage outside the OptionRegistry, but I have defined a get API and setup template methods for reading from a store and suggested API for LLVMContext to implement
4) I’ve defined a new API for defining options in the form of the opt_builder object. This is intended to be a transitional API only, but it allows us to migrate off the cl::opt template-foo

As with my previous patches, these patches are designed to work with existing cl::opts.

Thanks,
-Chris
cl_opt-revised.diff

Chandler Carruth

unread,
Aug 29, 2014, 4:52:27 AM8/29/14
to Chris Bieneman, LLVM Dev
On Wed, Aug 27, 2014 at 2:36 PM, Chris Bieneman <be...@apple.com> wrote:
Chandler brought up a lot of really great points. One of his big points was that if we’re going to touch every pass, we may as well fix the cl::opt API too.

Toward that end I’ve put together some patches with a new fluent-style API as a starting point of a conversation.

Disclaimer: To make these patches work with the existing cl::opts there is some nasty hackery going on in CommandLine.h — For the sake of this conversation I’d like to focus our attention on the API for creating and accessing options in Scalarizer.cpp

Yep, I'll actually only use that part of the patch to comment on, I agree the rest are largely details.
 

My intent with these changes is to provide a groundwork for meeting all of Chandler’s main points. These patches have the following high-level changes from my previous patches:

1) I’ve switched to some crazy template-foo for generating IDs to identify options
2) I’ve moved all the new API components out of the cl namespace
3) There is no option storage outside the OptionRegistry, but I have defined a get API and setup template methods for reading from a store and suggested API for LLVMContext to implement
4) I’ve defined a new API for defining options in the form of the opt_builder object. This is intended to be a transitional API only, but it allows us to migrate off the cl::opt template-foo

As with my previous patches, these patches are designed to work with existing cl::opts.

One very high-level comment about this interaction...

I think it would be nice to end up with these debugging options fully separated from the 'cl::opt' stuff from the perspective of code registering and querying an option. What I'm thinking is that we should be able to give simple guidance within LLVM: library code doesn't use CommandLine.h, it uses DebugOptions.h (or whatever its called).

However, as you say, we obviously need to integrate cleanly with the cl::opt world that exists. =] I would probably structure it such that these debugging options didn't depend on any of the details of the cl::opt infrastructure, and instead the cl::opt stuff queried these debugging options to collect their names and parse them when parsing command line options. I suspect long term the library interface for setting and toggling these may well be unrelated to the cl::opt interface; it would at least be nice to leave that option open, which I why I would suggest layering it in that direction. Does that make sense? Perhaps there are implementation reasons that preclude it, I've not checked as I agree that the current code there isn't important, but I wanted to mention the layering thing just so it was on your radar.

Now, on to the code! =]

diff --git a/lib/Transforms/Scalar/Scalarizer.cpp b/lib/Transforms/Scalar/Scalarizer.cpp
index 813041a..d830a48 100644
--- a/lib/Transforms/Scalar/Scalarizer.cpp
+++ b/lib/Transforms/Scalar/Scalarizer.cpp
@@ -127,9 +127,22 @@ class Scalarizer : public FunctionPass,
 public:
   static char ID;
 
+  template<typename OptStore>
+  void init(const OptStore &OS) {
+    initializeScalarizerPass(*PassRegistry::getPassRegistry());
+    ScalarizeLoadStore = OS.template
+      get<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>();
+  }
+
   Scalarizer() :
     FunctionPass(ID) {
-    initializeScalarizerPass(*PassRegistry::getPassRegistry());
+    init(OptionRegistry::instance());
+  }
+
+  template<typename OptStore>
+  Scalarizer(const OptStore &OS) :
+    FunctionPass(ID) {
+    init(OS);

What do you think about my suggestion to use doInitialization(Module &) and getting the options from LLVMContext?

I understand that some layers of LLVM will need to use something other than LLVMContext, but as mentioned in IRC, I'm interested in just having one object that gathers common state for a layer and you can introduce new types/objects for layers that shouldn't be querying LLVMContext.


@@ -150,6 +163,18 @@ public:
   bool visitLoadInst(LoadInst &);
   bool visitStoreInst(StoreInst &);
 
+  static void RegisterOptions() {
+    // This is disabled by default because having separate loads and stores makes
+    // it more likely that the -combiner-alias-analysis limits will be reached.
+    OptionRegistry::CreateOption<bool,
+                                Scalarizer,
+                                &Scalarizer::ScalarizeLoadStore>()
+      .setInitialValue(false)
+      .setArgStr("scalarize-load-store")
+      .setHiddenFlag(cl::Hidden)
+      .setDescription("Allow the scalarizer pass to scalarize loads and store");

As this is a new interface, I would follow the new naming conventions here.

Also, I don't think it being a static method is really useful. What about just making this a free function "registerOption"?


I wouldn't set the initial value here -- I think it is more clear and more flexible to pass that into the "get" API.


The argument string isn't optional, right? The initial idea I was thinking of for a fluent-style API was to have the required parameters as normal ones, and the fluent API for optional. I had also imagined (but I'm not really attached to it, curious what you think here) using it as an option-struct builder for an optional parameter... but I've no idea what to call it. Here is a rough example of what I'm thinking just so that we can both see it:


  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        OptionOptions
            .hidden()
            .description("Allow the scalarizer pass to scalarize loads and stores"));
  }


However, now that I write all this, I think I was just really wrong about the fluent API. It's not that the fluent API isn't a good way to implement something like what cl::opt is providing, its just that seeing this example makes me think it's doing the wrong thing: these shouldn't be optional flags at all.

1) The initial value *should* be optional, but that already can be optional if its in the "get" API, and we can report errors nicely when the option is missing.

2) The name of the option is pretty clearly something that should be required I think.

3) Making the description optional seems like a mistake in retrospect. Essentially all of them have the description, and *especially* the ones that this API is designed to replace: highly specialized options for configuring the innards of some part of the library.

4) I think the "hidden" distinction is no longer needed. All of the options using this new API should be "hidden" options -- they're just debugging tuning knobs. It's the options that continue to use cl::opt from inside of the tools themselves that would want to be non-hidden I think?

At that point, I think this becomes something pleasingly simple:

  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        "Allow the scalarizer pass to scalarize loads and stores");
  }

Thoughts?

Chris Bieneman

unread,
Aug 29, 2014, 11:55:37 AM8/29/14
to Chandler Carruth, LLVM Dev
On Aug 29, 2014, at 1:50 AM, Chandler Carruth <chan...@google.com> wrote:


On Wed, Aug 27, 2014 at 2:36 PM, Chris Bieneman <be...@apple.com> wrote:
Chandler brought up a lot of really great points. One of his big points was that if we’re going to touch every pass, we may as well fix the cl::opt API too.

Toward that end I’ve put together some patches with a new fluent-style API as a starting point of a conversation.

Disclaimer: To make these patches work with the existing cl::opts there is some nasty hackery going on in CommandLine.h — For the sake of this conversation I’d like to focus our attention on the API for creating and accessing options in Scalarizer.cpp

Yep, I'll actually only use that part of the patch to comment on, I agree the rest are largely details.
 

My intent with these changes is to provide a groundwork for meeting all of Chandler’s main points. These patches have the following high-level changes from my previous patches:

1) I’ve switched to some crazy template-foo for generating IDs to identify options
2) I’ve moved all the new API components out of the cl namespace
3) There is no option storage outside the OptionRegistry, but I have defined a get API and setup template methods for reading from a store and suggested API for LLVMContext to implement
4) I’ve defined a new API for defining options in the form of the opt_builder object. This is intended to be a transitional API only, but it allows us to migrate off the cl::opt template-foo

As with my previous patches, these patches are designed to work with existing cl::opts.

One very high-level comment about this interaction...

I think it would be nice to end up with these debugging options fully separated from the 'cl::opt' stuff from the perspective of code registering and querying an option. What I'm thinking is that we should be able to give simple guidance within LLVM: library code doesn't use CommandLine.h, it uses DebugOptions.h (or whatever its called).

However, as you say, we obviously need to integrate cleanly with the cl::opt world that exists. =] I would probably structure it such that these debugging options didn't depend on any of the details of the cl::opt infrastructure, and instead the cl::opt stuff queried these debugging options to collect their names and parse them when parsing command line options. I suspect long term the library interface for setting and toggling these may well be unrelated to the cl::opt interface; it would at least be nice to leave that option open, which I why I would suggest layering it in that direction. Does that make sense? Perhaps there are implementation reasons that preclude it, I've not checked as I agree that the current code there isn't important, but I wanted to mention the layering thing just so it was on your radar.

That makes sense. I’ll think about how to best re-structure the code and propose some patches.
We can go down that route. I would prefer to phase that change. If we make doInitialization a template (like above with init), we can have it initially called with the OptionRegistry, and replace it with the context later and not need to revisit the pass code. I do think that whatever object we pass into doInitialization, it should conform to the interface we decide on for retrieving options.


I understand that some layers of LLVM will need to use something other than LLVMContext, but as mentioned in IRC, I'm interested in just having one object that gathers common state for a layer and you can introduce new types/objects for layers that shouldn't be querying LLVMContext.


@@ -150,6 +163,18 @@ public:
   bool visitLoadInst(LoadInst &);
   bool visitStoreInst(StoreInst &);
 
+  static void RegisterOptions() {
+    // This is disabled by default because having separate loads and stores makes
+    // it more likely that the -combiner-alias-analysis limits will be reached.
+    OptionRegistry::CreateOption<bool,
+                                Scalarizer,
+                                &Scalarizer::ScalarizeLoadStore>()
+      .setInitialValue(false)
+      .setArgStr("scalarize-load-store")
+      .setHiddenFlag(cl::Hidden)
+      .setDescription("Allow the scalarizer pass to scalarize loads and store");

As this is a new interface, I would follow the new naming conventions here.

Also, I don't think it being a static method is really useful. What about just making this a free function "registerOption”?

If Scalarizer::ScalarizeLoadStore is a private member (and it kinda should be), then you need this to be a static method on the class otherwise the code won’t compile. That’s why I did it this way instead of as a standalone function.



I wouldn't set the initial value here -- I think it is more clear and more flexible to pass that into the "get" API.

I preserved the initial value where it is based on a conversation with Jim. I’m not particularly opinionated either way. I’ll ask Jim to chime in if he feels strongly about this.



The argument string isn't optional, right? The initial idea I was thinking of for a fluent-style API was to have the required parameters as normal ones, and the fluent API for optional. I had also imagined (but I'm not really attached to it, curious what you think here) using it as an option-struct builder for an optional parameter... but I've no idea what to call it. Here is a rough example of what I'm thinking just so that we can both see it:


  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        OptionOptions
            .hidden()
            .description("Allow the scalarizer pass to scalarize loads and stores"));
  }


However, now that I write all this, I think I was just really wrong about the fluent API. It's not that the fluent API isn't a good way to implement something like what cl::opt is providing, its just that seeing this example makes me think it's doing the wrong thing: these shouldn't be optional flags at all.

1) The initial value *should* be optional, but that already can be optional if its in the "get" API, and we can report errors nicely when the option is missing.

2) The name of the option is pretty clearly something that should be required I think.

3) Making the description optional seems like a mistake in retrospect. Essentially all of them have the description, and *especially* the ones that this API is designed to replace: highly specialized options for configuring the innards of some part of the library.

4) I think the "hidden" distinction is no longer needed. All of the options using this new API should be "hidden" options -- they're just debugging tuning knobs. It's the options that continue to use cl::opt from inside of the tools themselves that would want to be non-hidden I think?

At that point, I think this becomes something pleasingly simple:

  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        "Allow the scalarizer pass to scalarize loads and stores");
  }

Thoughts?

Oddly enough the argument string IS optional, but you must have either an argument string or cl::Positional (but not both). It is probably reasonable to define two registerOption() calls one that takes an argument string and description, and one that only takes a description and implies cl::Positional.

I also think we still need a fluent-style API for toggling some options that aren’t always default values (like hidden). In the case of hidden, maybe the better way to handle it is to make registerOption default to creating hidden options and have a “.visible()” API for toggling it?

It seems to me like you’re suggesting an end solution where cl::opt still exists for tool-specific options, and the register/get API exists for all of the other options littered around the compiler. While I’m not going to object strongly to this idea because it does serve my purposes (getting rid of static initializers in the libraries), I don’t particularly care for this. I think fundamentally if we are aiming to provide a better API for defining options, I don’t think there is any reason not to use that API everywhere and to abandon the existing one completely. If you disagree, please let me know.

-Chris

Chris Bieneman

unread,
Aug 29, 2014, 2:14:32 PM8/29/14
to Chandler Carruth, LLVM Dev
Now that I’ve had my coffee and played with code a bit…

I think your idea of using doInitialization and fetching the context off the module makes sense.

Here are some diffs to review. I still haven’t moved the default off the registration, but I have adjusted the API a bit.

Again, please focus on Scalarizer.cpp as there is a growing amount of nastiness outside the pass to make this all work.

Thanks,
-Chris
cl_opt-revised.diff

Jim Grosbach

unread,
Aug 29, 2014, 2:33:59 PM8/29/14
to Chris Bieneman, LLVM Dev
It depends on the specifics, but in general having the default value for the options be specified (potentially differently) on every call to the getter method feels very wrong to me. Perhaps I am misunderstanding the proposal?




The argument string isn't optional, right? The initial idea I was thinking of for a fluent-style API was to have the required parameters as normal ones, and the fluent API for optional. I had also imagined (but I'm not really attached to it, curious what you think here) using it as an option-struct builder for an optional parameter... but I've no idea what to call it. Here is a rough example of what I'm thinking just so that we can both see it:


  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        OptionOptions
            .hidden()
            .description("Allow the scalarizer pass to scalarize loads and stores"));
  }


However, now that I write all this, I think I was just really wrong about the fluent API. It's not that the fluent API isn't a good way to implement something like what cl::opt is providing, its just that seeing this example makes me think it's doing the wrong thing: these shouldn't be optional flags at all.

1) The initial value *should* be optional, but that already can be optional if its in the "get" API, and we can report errors nicely when the option is missing.

2) The name of the option is pretty clearly something that should be required I think.

3) Making the description optional seems like a mistake in retrospect. Essentially all of them have the description, and *especially* the ones that this API is designed to replace: highly specialized options for configuring the innards of some part of the library.

4) I think the "hidden" distinction is no longer needed. All of the options using this new API should be "hidden" options -- they're just debugging tuning knobs. It's the options that continue to use cl::opt from inside of the tools themselves that would want to be non-hidden I think?

At that point, I think this becomes something pleasingly simple:

  static void registerOptions() {
    registerOption<bool, Scalarizer, &Scalarizer::ScalarizeLoadStore>(
        "scalarize-load-store",
        "Allow the scalarizer pass to scalarize loads and stores");
  }

Thoughts?

Oddly enough the argument string IS optional, but you must have either an argument string or cl::Positional (but not both). It is probably reasonable to define two registerOption() calls one that takes an argument string and description, and one that only takes a description and implies cl::Positional.

I also think we still need a fluent-style API for toggling some options that aren’t always default values (like hidden). In the case of hidden, maybe the better way to handle it is to make registerOption default to creating hidden options and have a “.visible()” API for toggling it?

It seems to me like you’re suggesting an end solution where cl::opt still exists for tool-specific options, and the register/get API exists for all of the other options littered around the compiler. While I’m not going to object strongly to this idea because it does serve my purposes (getting rid of static initializers in the libraries), I don’t particularly care for this. I think fundamentally if we are aiming to provide a better API for defining options, I don’t think there is any reason not to use that API everywhere and to abandon the existing one completely. If you disagree, please let me know.

-Chris

Chandler Carruth

unread,
Aug 29, 2014, 3:49:50 PM8/29/14
to Jim Grosbach, Chris Bieneman, LLVM Dev

On Fri, Aug 29, 2014 at 11:29 AM, Jim Grosbach <gros...@apple.com> wrote:
It depends on the specifics, but in general having the default value for the options be specified (potentially differently) on every call to the getter method feels very wrong to me. Perhaps I am misunderstanding the proposal?

No, I think you're understanding the proposal.

I agree that the default changing is... ew.

However, the more I think about it, the less sense ever *not* having a default makes for these options. So maybe it would be fine to just always require an initial value when registering?

Jim Grosbach

unread,
Aug 29, 2014, 3:55:19 PM8/29/14
to Chandler Carruth, Chris Bieneman, LLVM Dev
Yeah, I think that’s the right way to go here.

-Jim

Chandler Carruth

unread,
Aug 29, 2014, 4:12:05 PM8/29/14
to Chris Bieneman, LLVM Dev
So, trying to quickly reply to what I think may be a high-level point of confusion that we should sift through early:

On Fri, Aug 29, 2014 at 8:53 AM, Chris Bieneman <cbie...@apple.com> wrote:
Oddly enough the argument string IS optional, but you must have either an argument string or cl::Positional (but not both). It is probably reasonable to define two registerOption() calls one that takes an argument string and description, and one that only takes a description and implies cl::Positional.

I think all of this comes down to the point you make below: 

It seems to me like you’re suggesting an end solution where cl::opt still exists for tool-specific options, and the register/get API exists for all of the other options littered around the compiler. While I’m not going to object strongly to this idea because it does serve my purposes (getting rid of static initializers in the libraries), I don’t particularly care for this. I think fundamentally if we are aiming to provide a better API for defining options, I don’t think there is any reason not to use that API everywhere and to abandon the existing one completely. If you disagree, please let me know.

Ah, ok, here we have uncovered what I suspect is the underpinning set of different assumptions / directions that are creating some of the confusion. Now that you've put your finger on it (thanks!) I'll try actually address this. Sorry if this is a bit of a re-statement, but hopefully it'll at least let you respond directly with where my thinking has gone off the rails.

So I am definitely trying to design this as a system totally divorced from the tool-specific options, or actually a command line system of any form. That's why stuff like positional arguments and the cl::opt tie-ins aren't showing up in my thoughts and suggestions.  What you say makes perfect sense if we're building a replacement for *all* of cl::opt's functionality. While I would be interested in trying to do that (and we already have one replacement in tree used by Clang and LLD to parse command line options), I'm suggesting that what the LLVM *libraries* need is something completely separable from a command line interface, but which is easy to build a command line interface around.

The way I'm thinking of this is as a generic collection of optional key->value settings. The libraries don't have any business mucking with positional parameters, non-hidden options, or the other concerns of a command line system, and the code it uses (and the APIs it uses) to register and use these options is a *better* interface when it is narrower and doesn't have to support these features.

Now, I think we should still be able to expose these via the 'cl::opt' layer in the tools that are using that layer. But we should also be able to expose it via an API to say webkit or whomever needs it there. And we should be able to wrap it in a nicer command line interface in Clang.


Is that a convincing argument for you to restrict the scope of this interface? Maybe there are other things that you'd like to do here that motivate going the other direction?

Chris Bieneman

unread,
Aug 29, 2014, 4:41:23 PM8/29/14
to Chandler Carruth, LLVM Dev
This is where I thought you might be going, and I’m glad we’re now talking on the same page.

I think the philosophically I agree with you on most points, but I do have a few comments and disagreements.

On Aug 29, 2014, at 1:09 PM, Chandler Carruth <chan...@google.com> wrote:

So, trying to quickly reply to what I think may be a high-level point of confusion that we should sift through early:

On Fri, Aug 29, 2014 at 8:53 AM, Chris Bieneman <cbie...@apple.com> wrote:
Oddly enough the argument string IS optional, but you must have either an argument string or cl::Positional (but not both). It is probably reasonable to define two registerOption() calls one that takes an argument string and description, and one that only takes a description and implies cl::Positional.

I think all of this comes down to the point you make below: 

It seems to me like you’re suggesting an end solution where cl::opt still exists for tool-specific options, and the register/get API exists for all of the other options littered around the compiler. While I’m not going to object strongly to this idea because it does serve my purposes (getting rid of static initializers in the libraries), I don’t particularly care for this. I think fundamentally if we are aiming to provide a better API for defining options, I don’t think there is any reason not to use that API everywhere and to abandon the existing one completely. If you disagree, please let me know.

Ah, ok, here we have uncovered what I suspect is the underpinning set of different assumptions / directions that are creating some of the confusion. Now that you've put your finger on it (thanks!) I'll try actually address this. Sorry if this is a bit of a re-statement, but hopefully it'll at least let you respond directly with where my thinking has gone off the rails.

So I am definitely trying to design this as a system totally divorced from the tool-specific options, or actually a command line system of any form. That's why stuff like positional arguments and the cl::opt tie-ins aren't showing up in my thoughts and suggestions.  What you say makes perfect sense if we're building a replacement for *all* of cl::opt's functionality. While I would be interested in trying to do that (and we already have one replacement in tree used by Clang and LLD to parse command line options), I'm suggesting that what the LLVM *libraries* need is something completely separable from a command line interface, but which is easy to build a command line interface around.

I think this is absolutely correct in the general sense. When you are talking about libraries as clients the notion of a command line just doesn’t makes sense in all use cases, so it stands to reason you need some completely new abstraction. But...


The way I'm thinking of this is as a generic collection of optional key->value settings. The libraries don't have any business mucking with positional parameters, non-hidden options, or the other concerns of a command line system, and the code it uses (and the APIs it uses) to register and use these options is a *better* interface when it is narrower and doesn't have to support these features.

While it is true that we probably don’t need to allow the libraries to mess around with positional parameters, non-hidden options are (I think) a different story. If you look at include/llvm/CodeGen/CommandFlags.h, there are a number of flags defined here that are not positional, not hidden, but only relevant to the CodeGen library. It is probably reasonable that these flags be transitioned to the new API, but they also should be exposed on the command line.

Aside from those options there are probably very few cases where it actually makes sense for library options to not be hidden, although I must add the caveat that I’m sure there are plenty of library options that are not hidden today and making this change will make them hidden. I’m ok with this, but it does change one of my original goals because it will actually modify the behavior of the command line.


Now, I think we should still be able to expose these via the 'cl::opt' layer in the tools that are using that layer. But we should also be able to expose it via an API to say webkit or whomever needs it there. And we should be able to wrap it in a nicer command line interface in Clang.

I think this is a big step in the right direction for clients of LLVM libraries that aren’t command line driven, so this would be a big win.



Is that a convincing argument for you to restrict the scope of this interface? Maybe there are other things that you'd like to do here that motivate going the other direction?

How do you feel about also providing a fluent-style API for toggling the (hopefully) uncommon options?

-Chris

Chandler Carruth

unread,
Aug 29, 2014, 5:22:29 PM8/29/14
to Chris Bieneman, LLVM Dev
On Fri, Aug 29, 2014 at 1:38 PM, Chris Bieneman <cbie...@apple.com> wrote:
While it is true that we probably don’t need to allow the libraries to mess around with positional parameters, non-hidden options are (I think) a different story. If you look at include/llvm/CodeGen/CommandFlags.h, there are a number of flags defined here that are not positional, not hidden, but only relevant to the CodeGen library. It is probably reasonable that these flags be transitioned to the new API, but they also should be exposed on the command line.

So, the 'hidden' flags are *all* exposed on the commandline, they just aren't in the *help* output.

My thinking (which perhaps is wrong) is that the library options of this form probably shouldn't be in the help output. The help output should probably talk about the tool commandline flags which drive actual API knobs in constructors of things.
 

Aside from those options there are probably very few cases where it actually makes sense for library options to not be hidden, although I must add the caveat that I’m sure there are plenty of library options that are not hidden today and making this change will make them hidden. I’m ok with this, but it does change one of my original goals because it will actually modify the behavior of the command line.

This should only change the behavior of '-help' which I think is fine.
Is that a convincing argument for you to restrict the scope of this interface? Maybe there are other things that you'd like to do here that motivate going the other direction?

How do you feel about also providing a fluent-style API for toggling the (hopefully) uncommon options?

I think that if we have optional things that don't make sense as one or two unsurprising optional arguments, the fluent API design is the right one.

But I'd like to understand what those optional properties are. Currently, all of the ones we've discussed seem fine to either be required, or be completely omitted.

Chris Bieneman

unread,
Aug 29, 2014, 7:43:48 PM8/29/14
to Chandler Carruth, LLVM Dev
This all sounds fine to me.

How do you feel about the idea of committing changes with some cleanup from my last patches and converting a few  cl::opts to the new API to see how the new API feels? I can provide patches either later today or this weekend.

-Chris

Chandler Carruth

unread,
Sep 16, 2014, 8:32:10 PM9/16/14
to Chris Bieneman, LLVM Dev
Picking this back up, I think that your patch with some cleanups is probably the right starting point. I think the only really substantive issue I currently have is that I'd rather have the OptionRegistry in the LLVMContext rather than in a singleton method. I understand we'll also have to add similar registries to other context objects (MCContext at least), but I think it's useful to start off with just one context initially.

How do you want to go about the review / cleanups on the patch? Maybe a new thread? I have a couple of significant things and probably a bunch of really minor cleanup stuff.

While you may already have much of this fixed, just listing the things I've seen here so I don't forget:

- I would use a DenseMap or a sorted vector for the lookups. While not super hot, std::map is pretty terrible.
- Naming of parameters likely needs updating to match conventions...
- I think you can kill off the hidden flag from the new API and just always mark these as hidden with cl::opt. Any flags which *aren't* marked as hidden probably shouldn't migrate to this API and instead should become constructor parameters controlled at API boundaries (and surfacing with non-hidden cl::opt objects in the tools rather than the libraries).
- I think you can kill off the builder based on the above. Most of the builder is already redundant with the register API.
- Go ahead and clang-format the CommandLine.{h,cpp} code you're going to need to touch so that the diff is sane.

I think everything else I see will just fall out of the cleanup.

Sound like a plan?


Chris Bieneman

unread,
Sep 17, 2014, 1:41:00 PM9/17/14
to Chandler Carruth, LLVM Dev
On Sep 16, 2014, at 5:28 PM, Chandler Carruth <chan...@google.com> wrote:

Picking this back up, I think that your patch with some cleanups is probably the right starting point. I think the only really substantive issue I currently have is that I'd rather have the OptionRegistry in the LLVMContext rather than in a singleton method. I understand we'll also have to add similar registries to other context objects (MCContext at least), but I think it's useful to start off with just one context initially.

In my last patches I had the OptionRegistry really just being a template type that conformed to the get interface. I was having it be a singleton so that we have a way to gradually transition. To make it the context means we'll need to change all the pass constructors and the pass macros up front. My thoughts on how to phase this in are that we first use the template/singleton that I had in my last patches. Then we can update the context to conform to the API, and migrate the macros and pass construction to pass the context in. After that we can remove the default constructors from the passes and replace the template argument with LLVMContext.

Does that sound like a good roadmap for a transition?


How do you want to go about the review / cleanups on the patch? Maybe a new thread? I have a couple of significant things and probably a bunch of really minor cleanup stuff.

Sounds good to me. How about I clean up my patches based on your recommendations below and start a Phabricator review?

-Chris

Chandler Carruth

unread,
Sep 17, 2014, 5:12:21 PM9/17/14
to Chris Bieneman, LLVM Dev
On Wed, Sep 17, 2014 at 10:35 AM, Chris Bieneman <be...@apple.com> wrote:
On Sep 16, 2014, at 5:28 PM, Chandler Carruth <chan...@google.com> wrote:

Picking this back up, I think that your patch with some cleanups is probably the right starting point. I think the only really substantive issue I currently have is that I'd rather have the OptionRegistry in the LLVMContext rather than in a singleton method. I understand we'll also have to add similar registries to other context objects (MCContext at least), but I think it's useful to start off with just one context initially.

In my last patches I had the OptionRegistry really just being a template type that conformed to the get interface. I was having it be a singleton so that we have a way to gradually transition. To make it the context means we'll need to change all the pass constructors and the pass macros up front. My thoughts on how to phase this in are that we first use the template/singleton that I had in my last patches. Then we can update the context to conform to the API, and migrate the macros and pass construction to pass the context in. After that we can remove the default constructors from the passes and replace the template argument with LLVMContext.

Does that sound like a good roadmap for a transition?

Hmm, OK. I disagree with some specifics (I don't think we want to do this in the constructor) but it should be easy to sort that out as it comes up.
 


How do you want to go about the review / cleanups on the patch? Maybe a new thread? I have a couple of significant things and probably a bunch of really minor cleanup stuff.

Sounds good to me. How about I clean up my patches based on your recommendations below and start a Phabricator review?

Sure.
Reply all
Reply to author
Forward
0 new messages