[llvm-dev] Suggestion / Help regarding new calling convention

55 views
Skip to first unread message

vivek pandya via llvm-dev

unread,
Jun 20, 2016, 10:39:41 AM6/20/16
to llvm-dev
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {
   DEBUG(dbgs() << "Function has LocalLinkage \n");
F->setCallingConv(CallingConv::GHC); 
 }

but we think threre should be clean and properway to do this perhaps like:

if (F->hasLocalLinkage() && !F->hasAddressTaken()) {
    DEBUG(dbgs() << "Function has LocalLinkage \n");
    F->setCallingConv(CallingConv::NO_Callee_Saved);
  }

So I would like to know any better suggestions and if it is better to add a new CC for this purpose then what aspects should be considered while defining a new CC. Actually in this case the new CC does not really required to define how parameters should be passed or any special rule for return value etc , it just required to set callee saved registers to be none. So what are the minimal things required to define such a CC?

Other alternative that I have thought was to add new attribute for function and use it like following in TargetFrameLowering::determineCalleeSaves()

 // In Naked functions we aren't going to save any registers.
  if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
    return;

Any suggestions / thoughts are welcomed !

Sincerely,
Vivek

John Criswell via llvm-dev

unread,
Jun 20, 2016, 11:12:52 AM6/20/16
to vivek pandya, llvm-dev
On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

If you see that happening, you can simply scan through a function's def-use chains and see if any "indirect calls" are really direct calls that cast the function pointer.  I believe SAFECode has some code somewhere that does this if you need it.



   DEBUG(dbgs() << "Function has LocalLinkage \n");
F->setCallingConv(CallingConv::GHC); 
 }

but we think threre should be clean and properway to do this perhaps like:

if (F->hasLocalLinkage() && !F->hasAddressTaken()) {
    DEBUG(dbgs() << "Function has LocalLinkage \n");
    F->setCallingConv(CallingConv::NO_Callee_Saved);
  }

So I would like to know any better suggestions and if it is better to add a new CC for this purpose then what aspects should be considered while defining a new CC. Actually in this case the new CC does not really required to define how parameters should be passed or any special rule for return value etc , it just required to set callee saved registers to be none. So what are the minimal things required to define such a CC?

Other alternative that I have thought was to add new attribute for function and use it like following in TargetFrameLowering::determineCalleeSaves()

 // In Naked functions we aren't going to save any registers.
  if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
    return;

Any suggestions / thoughts are welcomed !

My humble opinion is that you should avoid hacks as they will likely break as LLVM changes.  If the GHC calling convention or the naked function attribute guarantee that you will always get the behavior that you want on all architectures, then go ahead and use them; just make sure to add a clear and conspicuous comment explaining why are you using them as it is not obvious.

If the GHC calling convention or the naked attribute does not guarantee to give you what you need, I'd add an attribute or a calling convention.

That said, I'm not familiar enough with the code generator or these attributes/calling conventions to tell you what you should do.  You'll need input from others more familiar with them.

My two cents,

John Criswell


Sincerely,
Vivek


_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-- 
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell

Reid Kleckner via llvm-dev

unread,
Jun 20, 2016, 11:25:17 AM6/20/16
to vivek pandya, llvm-dev
GlobalOpt already does this with fastcc, use it as a guide if you want to do this:

You can't just change the convention of F, you need to update the convention used at F's call sites.

Unfortunately, while LLVM has PreserveMost and PreservesAll calling conventions, it does not appear to have an equivalent PreservesNone/Least.

vivek pandya via llvm-dev

unread,
Jun 20, 2016, 12:09:04 PM6/20/16
to Reid Kleckner, llvm-dev


On Mon, Jun 20, 2016 at 8:55 PM, Reid Kleckner <r...@google.com> wrote:
Hello Reid,
Yes actually idea to identify local function is taken from GlobalOpt : )

GlobalOpt already does this with fastcc, use it as a guide if you want to do this:

You can't just change the convention of F, you need to update the convention used at F's call sites.
 Thanks for pointing this out , and looking at RemoveNestAttribute() in GlobalOpt I think even if I add new Function attribute then also it needs to be updated at every call sites.
Unfortunately, while LLVM has PreserveMost and PreservesAll calling conventions, it does not appear to have an equivalent PreservesNone/Least.
Actually we are in favor of adding new CC like No_CSR but what I want to discuss is how much minimal changes are required and also as I mentioned for the purpose of IPRA we don't really have any new rules for parameter passing or return value etc So will it be good to have new CC for this purpose and if new CC is added is it ok to keep parameter passing and other rules same as common CC?

I hope my question is clear.

Sincerely,
Vivek

vivek pandya via llvm-dev

unread,
Jun 20, 2016, 12:18:14 PM6/20/16
to John Criswell, llvm-dev
On Mon, Jun 20, 2016 at 8:42 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

If you see that happening, you can simply scan through a function's def-use chains and see if any "indirect calls" are really direct calls that cast the function pointer.  I believe SAFECode has some code somewhere that does this if you need it.

Dear Professor John,

Thanks for pointing out this , but I wonder that how many such cases may be there on average in a module? Is it too frequently seen?

   DEBUG(dbgs() << "Function has LocalLinkage \n");
F->setCallingConv(CallingConv::GHC); 
 }

but we think threre should be clean and properway to do this perhaps like:

if (F->hasLocalLinkage() && !F->hasAddressTaken()) {
    DEBUG(dbgs() << "Function has LocalLinkage \n");
    F->setCallingConv(CallingConv::NO_Callee_Saved);
  }

So I would like to know any better suggestions and if it is better to add a new CC for this purpose then what aspects should be considered while defining a new CC. Actually in this case the new CC does not really required to define how parameters should be passed or any special rule for return value etc , it just required to set callee saved registers to be none. So what are the minimal things required to define such a CC?

Other alternative that I have thought was to add new attribute for function and use it like following in TargetFrameLowering::determineCalleeSaves()

 // In Naked functions we aren't going to save any registers.
  if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
    return;

Any suggestions / thoughts are welcomed !

My humble opinion is that you should avoid hacks as they will likely break as LLVM changes.  If the GHC calling convention or the naked function attribute guarantee that you will always get the behavior that you want on all architectures, then go ahead and use them; just make sure to add a clear and conspicuous comment explaining why are you using them as it is not obvious.

If the GHC calling convention or the naked attribute does not guarantee to give you what you need, I'd add an attribute or a calling convention.
Actually I am not really in favor of using this kind of hack because using GHC CC or naked attribute have there other rules (in addition to no callee saved registers) and that are not really wanted. So it is desired to add new CC or attribute to achieve the required effect.

Sincerely,
Vivek

John Criswell via llvm-dev

unread,
Jun 20, 2016, 12:28:20 PM6/20/16
to vivek pandya, llvm-dev
On 6/20/16 11:18 AM, vivek pandya wrote:


On Mon, Jun 20, 2016 at 8:42 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

If you see that happening, you can simply scan through a function's def-use chains and see if any "indirect calls" are really direct calls that cast the function pointer.  I believe SAFECode has some code somewhere that does this if you need it.

Dear Professor John,

Thanks for pointing out this , but I wonder that how many such cases may be there on average in a module? Is it too frequently seen?

As of 2-3 years ago, I saw this all the time.  It is easy enough to write an LLVM pass that iterates through all instructions and counts the number of call instructions that case their constant function pointer operand, so you can write that pass and see for yourself whether it is still an issue.

Regards,

John Criswell

Matthias Braun via llvm-dev

unread,
Jun 20, 2016, 3:02:03 PM6/20/16
to vivek pandya, llvm-dev
I just discussed this with vivek on IRC (and I think we agreed on this):

Let me first state the motivation clearly to ease later discussions:
As far as the motivation for this change goes: Changing the calling convention allows us to choose whether a register is saved by the callee or the caller. Usually it is best to have a mix of both as too many caller saved registers leads to unnecessary save/restores when the called function turns out to only touch a fraction of the registers (as is typically for smaller leaf-functions of the call graph). While too many callee saved registers may lead to unnecessary saves/restores of registers even though the calling function didn't have a live value in the register anyway. With IPRA the first problem is mitigated since we propagate the actually clobbered set of registers up the callgraph instead of relying on conventions, so it is best to aim for more caller saved registers (though we should check for code size increases and store/restore code being potentially less good than the tuned sequences generated during FrameLowering).

To the disucssion at hand:
- Introducing a new calling convention at the IR level is the wrong approach: The calling convention is mostly a contract when calling and being called across translation unit boundaries. The details about how this contract is fulfilled are part of CodeGen IMO but do not need to be visible at the IR level.
- The only thing we want to influence here is which registers are saved by the callee. Changing TargetFrameLowering::determineCalleeSaves() is a good place to achieve this without affecting unrelated things like parameter and return value handling which would be part of the calling convention.
- We could experiment with dynamically changing the number of caller saved registers in the future. I could imagine heuristics like functions called from many places using some callee saved registers in order to avoid code size increases because of extra spills/restores at all the call sites. We can hardly create new calling conventions for these combinations of marking registers as callee/caller saved.

- Matthias

Mehdi Amini via llvm-dev

unread,
Jun 21, 2016, 12:29:46 AM6/21/16
to John Criswell, llvm-dev, vivek pandya
On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

Looks like hasAddressTaken could be updated to handle these simple case maybe?


If you see that happening, you can simply scan through a function's def-use chains and see if any "indirect calls" are really direct calls that cast the function pointer.  I believe SAFECode has some code somewhere that does this if you need it.


   DEBUG(dbgs() << "Function has LocalLinkage \n");
F->setCallingConv(CallingConv::GHC); 
 }

but we think threre should be clean and properway to do this perhaps like:

if (F->hasLocalLinkage() && !F->hasAddressTaken()) {
    DEBUG(dbgs() << "Function has LocalLinkage \n");
    F->setCallingConv(CallingConv::NO_Callee_Saved);
  }

So I would like to know any better suggestions and if it is better to add a new CC for this purpose then what aspects should be considered while defining a new CC. Actually in this case the new CC does not really required to define how parameters should be passed or any special rule for return value etc , it just required to set callee saved registers to be none. So what are the minimal things required to define such a CC?

Other alternative that I have thought was to add new attribute for function and use it like following in TargetFrameLowering::determineCalleeSaves()

 // In Naked functions we aren't going to save any registers.
  if (MF.getFunction()->hasFnAttribute(Attribute::Naked))
    return;

Any suggestions / thoughts are welcomed !

My humble opinion is that you should avoid hacks as they will likely break as LLVM changes.  

In-tree code with appropriate tests is unlikely to break. 
But we should avoid hacks anyway because they make LLVM harder to change in general.


— 
Mehdi

John Criswell via llvm-dev

unread,
Jun 21, 2016, 11:28:10 AM6/21/16
to Mehdi Amini, llvm-dev, vivek pandya
On 6/20/16 11:29 PM, Mehdi Amini wrote:

On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

Looks like hasAddressTaken could be updated to handle these simple case maybe?

That might make sense if it has not been fixed already.  Another approach (if in-tree LLVM passes are frequently checking for indirect calls) would be to write a simple analysis pass that lazily computes the information on demand.  That way, if multiple passes are checking the same function repeatedly, it gets cached in the analysis pass instead of being recomputed (so long as the analysis pass is not invalidated by a transform).

Regards,

John Criswell

vivek pandya via llvm-dev

unread,
Jun 21, 2016, 12:27:34 PM6/21/16
to John Criswell, llvm-dev
On Tue, Jun 21, 2016 at 8:58 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 11:29 PM, Mehdi Amini wrote:

On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

Looks like hasAddressTaken could be updated to handle these simple case maybe?

That might make sense if it has not been fixed already.  Another approach (if in-tree LLVM passes are frequently checking for indirect calls) would be to write a simple analysis pass that lazily computes the information on demand.  That way, if multiple passes are checking the same function repeatedly, it gets cached in the analysis pass instead of being recomputed (so long as the analysis pass is not invalidated by a transform).

Addition of new pass will require other passes to be modified. So it will be good have strong reason for adding new pass. Other wise I am in favor to modify hasAddressTaken().

-Vivek

vivek pandya via llvm-dev

unread,
Jun 21, 2016, 2:07:11 PM6/21/16
to John Criswell, llvm-dev
On Tue, Jun 21, 2016 at 9:57 PM, vivek pandya <vivekv...@gmail.com> wrote:


On Tue, Jun 21, 2016 at 8:58 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 11:29 PM, Mehdi Amini wrote:

On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for
I think here you mean "returned true" but it should not return true for such case. Or I am making any mistake in understanding this? 

- Vivek

John Criswell via llvm-dev

unread,
Jun 21, 2016, 4:56:33 PM6/21/16
to vivek pandya, llvm-dev
On 6/21/16 11:27 AM, vivek pandya wrote:


On Tue, Jun 21, 2016 at 8:58 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 11:29 PM, Mehdi Amini wrote:

On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for hasAddressTaken() because some direct calls casted the function to a different function type before calling it.  Such functions are still not address taken, but the simple hasAddressTaken() method can't determine it.

Looks like hasAddressTaken could be updated to handle these simple case maybe?

That might make sense if it has not been fixed already.  Another approach (if in-tree LLVM passes are frequently checking for indirect calls) would be to write a simple analysis pass that lazily computes the information on demand.  That way, if multiple passes are checking the same function repeatedly, it gets cached in the analysis pass instead of being recomputed (so long as the analysis pass is not invalidated by a transform).

Addition of new pass will require other passes to be modified. So it will be good have strong reason for adding new pass. Other wise I am in favor to modify hasAddressTaken().

Addition of a new pass means that existing passes would not get the benefit of the new pass until they are updated to use it.  You would not need to update existing passes though it would probably make sense to do so if using a separate analysis pass is better.

Also keep in mind that the two approaches are not mutually exclusive.  Updating hasAddressTaken() to take casting into account would probably not increase its run-time much.  The point of using a pass is to cache the results of hasAddressTaken(); the pass makes sense if the hasAddressTaken() queries are being performed repeatedly on the same functions by multiple passes.

Regards,

John Criswell

John Criswell via llvm-dev

unread,
Jun 21, 2016, 5:00:52 PM6/21/16
to vivek pandya, llvm-dev
On 6/21/16 1:06 PM, vivek pandya wrote:


On Tue, Jun 21, 2016 at 9:57 PM, vivek pandya <vivekv...@gmail.com> wrote:


On Tue, Jun 21, 2016 at 8:58 PM, John Criswell <jtcr...@gmail.com> wrote:
On 6/20/16 11:29 PM, Mehdi Amini wrote:

On Jun 20, 2016, at 11:12 AM, John Criswell via llvm-dev <llvm...@lists.llvm.org> wrote:

On 6/20/16 9:39 AM, vivek pandya via llvm-dev wrote:
Dear Community,

To improve current interprocedural register allocation (IPRA) , we have planned to set callee saved registers to none for local functions, currently I am doing it in following way:

if (F->hasLocalLinkage()  && !F->hasAddressTaken()) {

As an aside, you might want to analyze how many functions have both local linkage and are not address taken.  I recall that many functions returned false for
I think here you mean "returned true" but it should not return true for such case. Or I am making any mistake in understanding this?

You're correct; I said it backwards: hasAddressTaken() returns true when it could return false because the function pointer is casted to another function pointer type in one or more direct calls.

To be clear, my experience is a bit dated; I'm recalling details from LLVM 3.2.  You should therefore test it for yourself to see if the issue still exists and if it is pessimizing your results.  I mentioned it because the problem existed for a long time during SAFECode's development, and so I got into the habit of never using hasAddressTaken().

Regards,

John Criswell

vivek pandya via llvm-dev

unread,
Jun 24, 2016, 2:35:21 PM6/24/16
to Matthias Braun, llvm-dev
On Tue, Jun 21, 2016 at 12:31 AM, Matthias Braun <ma...@braunis.de> wrote:
I just discussed this with vivek on IRC (and I think we agreed on this):

Let me first state the motivation clearly to ease later discussions:
As far as the motivation for this change goes: Changing the calling convention allows us to choose whether a register is saved by the callee or the caller. Usually it is best to have a mix of both as too many caller saved registers leads to unnecessary save/restores when the called function turns out to only touch a fraction of the registers (as is typically for smaller leaf-functions of the call graph). While too many callee saved registers may lead to unnecessary saves/restores of registers even though the calling function didn't have a live value in the register anyway. With IPRA the first problem is mitigated since we propagate the actually clobbered set of registers up the callgraph instead of relying on conventions, so it is best to aim for more caller saved registers (though we should check for code size increases and store/restore code being potentially less good than the tuned sequences generated during FrameLowering).

To the disucssion at hand:
- Introducing a new calling convention at the IR level is the wrong approach: The calling convention is mostly a contract when calling and being called across translation unit boundaries. The details about how this contract is fulfilled are part of CodeGen IMO but do not need to be visible at the IR level.
- The only thing we want to influence here is which registers are saved by the callee. Changing TargetFrameLowering::determineCalleeSaves() is a good place to achieve this without affecting unrelated things like parameter and return value handling which would be part of the calling convention.
Hello Matthias,

As per our discussion, the above trick will make sure that there is no callee saved registers and also we have thought that RegUsageInfoCalculator.cpp is having regmask that will make caller to save restore registers if both callee and caller is using any common register but this would require following change in RegUsageInfoCalculator.cpp :

if (!F->hasLocalLinkage() || F->hasAddressTaken()) {
      const uint32_t *CallPreservedMask =
        TRI->getCallPreservedMask(MF, MF.getFunction()->getCallingConv());
      // Set callee saved register as preserved.
      for (unsigned i = 0; i < RegMaskSize; ++i)
        RegMask[i] = RegMask[i] | CallPreservedMask[i];
  }

because RegUsageInfoCalculator.cpp marks register as preserved if MF's CC preserves it. But While optimizing for callee saved register we need to skip above code so that register save/restore code is adder around call site.

Apart from that my hunch is that IPO inlining of static function also creates problem for this ( I have this feeling because some test case from test-suite fails when using -O > 0 ). I am still working on this. 

Please share your thoughts on this.

Sincerely,
- Vivek

Matthias Braun via llvm-dev

unread,
Jun 24, 2016, 6:36:57 PM6/24/16
to vivek pandya, llvm-dev
On Jun 24, 2016, at 11:35 AM, vivek pandya <vivekv...@gmail.com> wrote:



On Tue, Jun 21, 2016 at 12:31 AM, Matthias Braun <ma...@braunis.de> wrote:
I just discussed this with vivek on IRC (and I think we agreed on this):

Let me first state the motivation clearly to ease later discussions:
As far as the motivation for this change goes: Changing the calling convention allows us to choose whether a register is saved by the callee or the caller. Usually it is best to have a mix of both as too many caller saved registers leads to unnecessary save/restores when the called function turns out to only touch a fraction of the registers (as is typically for smaller leaf-functions of the call graph). While too many callee saved registers may lead to unnecessary saves/restores of registers even though the calling function didn't have a live value in the register anyway. With IPRA the first problem is mitigated since we propagate the actually clobbered set of registers up the callgraph instead of relying on conventions, so it is best to aim for more caller saved registers (though we should check for code size increases and store/restore code being potentially less good than the tuned sequences generated during FrameLowering).

To the disucssion at hand:
- Introducing a new calling convention at the IR level is the wrong approach: The calling convention is mostly a contract when calling and being called across translation unit boundaries. The details about how this contract is fulfilled are part of CodeGen IMO but do not need to be visible at the IR level.
- The only thing we want to influence here is which registers are saved by the callee. Changing TargetFrameLowering::determineCalleeSaves() is a good place to achieve this without affecting unrelated things like parameter and return value handling which would be part of the calling convention.
Hello Matthias,

As per our discussion, the above trick will make sure that there is no callee saved registers and also we have thought that RegUsageInfoCalculator.cpp is having regmask that will make caller to save restore registers if both callee and caller is using any common register but this would require following change in RegUsageInfoCalculator.cpp :

if (!F->hasLocalLinkage() || F->hasAddressTaken()) {
      const uint32_t *CallPreservedMask =
        TRI->getCallPreservedMask(MF, MF.getFunction()->getCallingConv());
      // Set callee saved register as preserved.
      for (unsigned i = 0; i < RegMaskSize; ++i)
        RegMask[i] = RegMask[i] | CallPreservedMask[i];
  }

because RegUsageInfoCalculator.cpp marks register as preserved if MF's CC preserves it. But While optimizing for callee saved register we need to skip above code so that register save/restore code is adder around call site.
Indeed some adjustment there should improve your results. I would however recommend to use something like this to determine which registers got saved:

const MachineFrameInfo &MFI = *MF.getFrameInfo();
assert(MFI.isCalleeSavedInfoValid());
for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo()) {
   // Mark Info.getReg() and all its subregisters as preserved here!
}

- Matthias

vivek pandya via llvm-dev

unread,
Jun 24, 2016, 11:19:09 PM6/24/16
to Matthias Braun, llvm-dev
On Sat, Jun 25, 2016 at 4:06 AM, Matthias Braun <ma...@braunis.de> wrote:

On Jun 24, 2016, at 11:35 AM, vivek pandya <vivekv...@gmail.com> wrote:



On Tue, Jun 21, 2016 at 12:31 AM, Matthias Braun <ma...@braunis.de> wrote:
I just discussed this with vivek on IRC (and I think we agreed on this):

Let me first state the motivation clearly to ease later discussions:
As far as the motivation for this change goes: Changing the calling convention allows us to choose whether a register is saved by the callee or the caller. Usually it is best to have a mix of both as too many caller saved registers leads to unnecessary save/restores when the called function turns out to only touch a fraction of the registers (as is typically for smaller leaf-functions of the call graph). While too many callee saved registers may lead to unnecessary saves/restores of registers even though the calling function didn't have a live value in the register anyway. With IPRA the first problem is mitigated since we propagate the actually clobbered set of registers up the callgraph instead of relying on conventions, so it is best to aim for more caller saved registers (though we should check for code size increases and store/restore code being potentially less good than the tuned sequences generated during FrameLowering).

To the disucssion at hand:
- Introducing a new calling convention at the IR level is the wrong approach: The calling convention is mostly a contract when calling and being called across translation unit boundaries. The details about how this contract is fulfilled are part of CodeGen IMO but do not need to be visible at the IR level.
- The only thing we want to influence here is which registers are saved by the callee. Changing TargetFrameLowering::determineCalleeSaves() is a good place to achieve this without affecting unrelated things like parameter and return value handling which would be part of the calling convention.
Hello Matthias,

As per our discussion, the above trick will make sure that there is no callee saved registers and also we have thought that RegUsageInfoCalculator.cpp is having regmask that will make caller to save restore registers if both callee and caller is using any common register but this would require following change in RegUsageInfoCalculator.cpp :

if (!F->hasLocalLinkage() || F->hasAddressTaken()) {
      const uint32_t *CallPreservedMask =
        TRI->getCallPreservedMask(MF, MF.getFunction()->getCallingConv());
      // Set callee saved register as preserved.
      for (unsigned i = 0; i < RegMaskSize; ++i)
        RegMask[i] = RegMask[i] | CallPreservedMask[i];
  }

because RegUsageInfoCalculator.cpp marks register as preserved if MF's CC preserves it. But While optimizing for callee saved register we need to skip above code so that register save/restore code is adder around call site.
Indeed some adjustment there should improve your results.
I think before improvement I need to get this functionally correct while  local functions are optimized not to have callee saved registers, caller is not saving appropriate registers as TargetFrameLowering::determineCalleeSaves() just effects callee saved register. RegMask needs to be updated to reflect change too.
-Vivek
Reply all
Reply to author
Forward
0 new messages