This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fix stack slot for argument types (Bug 49500)
ClosedPublic

Authored by frasercrmck on Mar 22 2021, 9:21 AM.

Details

Summary

This is an complementary/alternative fix for D99068. It takes a slightly
different approach by explicitly summing up all of the required split
part type sizes and ensuring we allocate enough space for them. It also
takes the maximum alignment of each part.

Compared with D99068 there are fewer changes to the stack objects in
existing tests. However, @luismarques has shown in that patch that there
are opportunities to reduce our stack usage in the future.

Diff Detail

Event Timeline

frasercrmck created this revision.Mar 22 2021, 9:21 AM
frasercrmck requested review of this revision.Mar 22 2021, 9:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2021, 9:21 AM

There are certainly fewer changes to existing tests, compared with

D99068. But if @luismarques is right in that there is excessive stack
alignment then this patch needs further work.

My analysis was a bit rushed. Looking at just the stack-slot-size.ll tests, the allocas are fine. The stack has some spare space (due to the correct alignment) and the alloca offset was chosen differently than I had expected when I originally looked at it (36 vs 32), but both options are valid.
The other tests I haven't yet looked into in detail to verify if the changes are correct. On the one hand, having this patch with fewer changes is appealing, because it's much easier to review. On the other hand, I see that my patch has reduced the stack size for several test cases, so if those changes are correct they are an improvement, and it would be nice to reap that reward.
While looking into this issue now, I explored some other patch variants and they had even more test case changes, possibly (correct) improvements, so maybe it's worth it looking into this more carefully? Or maybe, given the priorities, we should just commit this patch for now (looks OK to me)... Decisions, decisions. @asb any thoughts?

My analysis was a bit rushed. Looking at just the stack-slot-size.ll tests, the allocas are fine. The stack has some spare space (due to the correct alignment) and the alloca offset was chosen differently than I had expected when I originally looked at it (36 vs 32), but both options are valid.
The other tests I haven't yet looked into in detail to verify if the changes are correct. On the one hand, having this patch with fewer changes is appealing, because it's much easier to review. On the other hand, I see that my patch has reduced the stack size for several test cases, so if those changes are correct they are an improvement, and it would be nice to reap that reward.
While looking into this issue now, I explored some other patch variants and they had even more test case changes, possibly (correct) improvements, so maybe it's worth it looking into this more carefully? Or maybe, given the priorities, we should just commit this patch for now (looks OK to me)... Decisions, decisions. @asb any thoughts?

Sorry for the slow reply, I've been away from LLVM for a few days, and have kind of been waiting in case @asb had some input. In general, I don't mind which patch goes in but, conceptually, splitting it more into "non-functional" and "optimize stack usage" changes probably sounds like better design. I do think it's worth investigating whether there's work to do on reducing the stack size.

asb added a comment.Mar 31 2021, 6:40 AM

I've been a bit busy with some none-LLVM things the past few days too, so sorry for the delay in commenting. I'll try and loop back shortly.

asb added a comment.Apr 1 2021, 7:10 AM

@frasercrmck you mentioned this patch doesn't quite seem right to you - is there a particular part you're concerned about. I've spent some time going over both this and D99068. The logic of this patch makes more intuitive sense to me, but that may be as it's a long time since I have a better memory of the logic it's modifying. Though Luis' approach does indeed seem to reduce stack size in some cases, and of course directly mirrors the matching SystemZ fix.

In D99087#2663805, @asb wrote:

@frasercrmck you mentioned this patch doesn't quite seem right to you - is there a particular part you're concerned about. I've spent some time going over both this and D99068. The logic of this patch makes more intuitive sense to me, but that may be as it's a long time since I have a better memory of the logic it's modifying. Though Luis' approach does indeed seem to reduce stack size in some cases, and of course directly mirrors the matching SystemZ fix.

Er no, it was more a three-fold hunch, given that I had only dipped my toe into this bit of the ABI: that I couldn't find a hook or setting that does it automatically, that other targets hadn't been required to do the same, and that other targets hadn't been able to fix it "nicely" (without two loops).

Thanks for going over it, though. It's a relief that both you and @luismarques seem to agree it does the job.

This revision is now accepted and ready to land.Apr 1 2021, 10:10 AM

BTW, it shouldn't affect correctness but the original CreateStackTemporary would use (as part of that method's implementation) getPrefTypeAlign instead of getEVTAlign. In principle that would allow better alignment choices, although at the moment it probably would never make any difference.

  • rebase on top of pre-committed test
frasercrmck retitled this revision from [RISCV][WIP] Fix stack slot for argument types (Bug 49500) to [RISCV] Fix stack slot for argument types (Bug 49500).Apr 5 2021, 4:06 AM
frasercrmck edited the summary of this revision. (Show Details)
frasercrmck edited the summary of this revision. (Show Details)

BTW, it shouldn't affect correctness but the original CreateStackTemporary would use (as part of that method's implementation) getPrefTypeAlign instead of getEVTAlign. In principle that would allow better alignment choices, although at the moment it probably would never make any difference.

Hmm good catch. Shame there's no DAG wrapper for that method; targets don't typically use it. I could look into it before merging, if you like?

Hmm good catch. Shame there's no DAG wrapper for that method; targets don't typically use it. I could look into it before merging, if you like?

Using the preferred type align does produce changes in 7 of our tests. Maybe we should leave it to a follow-up patch? We'd be best verifying that each is correct.

Hmm good catch. Shame there's no DAG wrapper for that method; targets don't typically use it. I could look into it before merging, if you like?

Using the preferred type align does produce changes in 7 of our tests. Maybe we should leave it to a follow-up patch? We'd be best verifying that each is correct.

My concern with leaving it up to a follow-up patch is that 1) for "normal" types, we're already using getPrefTypeAlign (under the covers), so in a sense this patch would be a slight regression; 2) it's easy to never follow up on these kinds of things.
Do you reckon it would be hard to verify each? (I imagine that the stack allocation is either the same or increases for all of those impacted tests, right?)

  • update for preferred EVT alignment

My concern with leaving it up to a follow-up patch is that 1) for "normal" types, we're already using getPrefTypeAlign (under the covers), so in a sense this patch would be a slight regression; 2) it's easy to never follow up on these kinds of things.
Do you reckon it would be hard to verify each? (I imagine that the stack allocation is either the same or increases for all of those impacted tests, right?)

Yes, fair enough. I've updated the diff to use the preferred type alignment. We can always revert for whatever reason. I'll try and take a look through the diffs later.

Yes, fair enough. I've updated the diff to use the preferred type alignment. We can always revert for whatever reason. I'll try and take a look through the diffs later.

Thanks. Also, it seems I was wrong, you do sometimes get smaller stack sizes despite getPrefTypeAlign >= getEVTAlign. Weird.

Yes, fair enough. I've updated the diff to use the preferred type alignment. We can always revert for whatever reason. I'll try and take a look through the diffs later.

Thanks. Also, it seems I was wrong, you do sometimes get smaller stack sizes despite getPrefTypeAlign >= getEVTAlign. Weird.

Yeah it's interesting. Sorry, I haven't had time to investigate yet. I'll do that today. Have you taken a look, by any chance?

Thanks. Also, it seems I was wrong, you do sometimes get smaller stack sizes despite getPrefTypeAlign >= getEVTAlign. Weird.

Yeah it's interesting. Sorry, I haven't had time to investigate yet. I'll do that today. Have you taken a look, by any chance?

Sorry, I didn't.

Yeah it's interesting. Sorry, I haven't had time to investigate yet. I'll do that today. Have you taken a look, by any chance?

Sorry, I didn't.

No worries. I just noticed my latest "pref" diff mistakenly changed the initial StackAlign from max(align(ArgValueVT), align(Outs[i].ArgVT)) to max(prefalign(ArgValueVT), prefalign(ArgValueVT)). I now think that this is where the diffs came from. The align of the Outs[i].ArgVT type is often likely larger than the split type (like i128 vs. XLEN), so by removing it we reduce the stack size.

Right now I suspect I should revert this back to the preferred alignment of Outs[i].ArgVT so we're at least keeping the same alignment (or safely increasing it to the split type). Any thoughts?

I don't know if we strictly need the alignment of the original type or not.

  • rebase
  • fix preferred alignment, keeping alignment of original ArgVT
  • undo most of the test changes

No worries. I just noticed my latest "pref" diff mistakenly changed the initial StackAlign from max(align(ArgValueVT), align(Outs[i].ArgVT)) to max(prefalign(ArgValueVT), prefalign(ArgValueVT)). I now think that this is where the diffs came from. The align of the Outs[i].ArgVT type is often likely larger than the split type (like i128 vs. XLEN), so by removing it we reduce the stack size.

Right now I suspect I should revert this back to the preferred alignment of Outs[i].ArgVT so we're at least keeping the same alignment (or safely increasing it to the split type). Any thoughts?

Ahh, that explains it. Yes, I think we need that change (that you already made).

I don't know if we strictly need the alignment of the original type or not.

I think we do, per the psABI alignment requirements.
I'll go over the patch more thoroughly later, but this now seems more like what I would expect and could probably be committed.

I think we do, per the psABI alignment requirements.

Yeah that makes sense.

I'll go over the patch more thoroughly later, but this now seems more like what I would expect and could probably be committed.

Have you had a chance to look over it?

Have you had a chance to look over it?

Sorry, not yet. I'll go over it again later today, just to confirm everything, but IIRC the patch was now in mergeable shape.

luismarques accepted this revision.Apr 28 2021, 4:28 PM

LGTM. Thank you!

This revision was automatically updated to reflect the committed changes.