This is an archive of the discontinued LLVM Phabricator instance.

[AttributorAttributes] Remove hardcoded parameters
AbandonedPublic

Authored by Bryce-MW on Jan 10 2022, 1:45 PM.

Details

Summary

Following up with some of the recent refactoring of MemoryBuiltins, the size and allignment parameters used in heap-to-stack can be removed. This allows heap-to-stack to work with any allocation functions (that allocate undefined or zeroed memory). The alignment is important now because in the past, aligned operator new may not create a properly aligned allocation when being moved to the stack.

There are however a few remaining issues that I would apprecitate someone looking at.

  • Is it possible to get away without adding getAllocSizeArgs to MemoryBuiltins. Based on some email conversations that I have had, getObjectSize would be prefered but in this case, we want to get the size arguments even if actual size is not a known constant. It's also used to get one of the types for the memset on line 5999. I don't fully understand what that type is doing so there may be a better way to do that.
  • If the alignment is specified but is not a known constant, I use the maximum alignment as the alignment. Is this acceptable or should we use the minimum alignment or perhaps there is some way or change that could allow the alignment to be dynamic.

Diff Detail

Event Timeline

Bryce-MW created this revision.Jan 10 2022, 1:45 PM
Bryce-MW requested review of this revision.Jan 10 2022, 1:45 PM
Herald added a reviewer: sstefan1. · View Herald Transcript
Herald added a reviewer: baziotis. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Bryce-MW updated this revision to Diff 398734.Jan 10 2022, 1:50 PM
  • getAllocSizeArgs doesn't work with strdup

Is it possible to get away without adding getAllocSizeArgs to MemoryBuiltins. Based on some email conversations that I have had, getObjectSize would be prefered but in this case, we want to get the size arguments even if actual size is not a known constant. It's also used to get one of the types for the memset on line 5999. I don't fully understand what that type is doing so there may be a better way to do that.

I think h2s for non-constant size allocations should be possible (through a user switch). Thus, we should not require a constant object size. We might even use our own constant value info later to get an upper bound for non-constant allocations.
The memset type things is an artifact though. The idea is that the type used for the size parameter must be good enough for the memset size parameter, though, we probably can use some other type as well.

If the alignment is specified but is not a known constant, I use the maximum alignment as the alignment. Is this acceptable or should we use the minimum alignment or perhaps there is some way or change that could allow the alignment to be dynamic.

Setting the alignment to maximum doesn't make sense (and won't work). Minimum, or any other value that is too low, won't work either. So the alignment actually has to be a constant, or at least bounded by a (small) constant.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
5972

I'm not sure I understand why we take the maximum of something and the MaximumAlignment?

At least for now, I'll leave in the getAllocSizeArgs function since I don't see another way to get the non constant size. If we can't handle non-constant alignment, what should we do with calls that have it? I suspect that's a situation that would be very rare so I would be fine just cancelling the conversion but I'm not sure how to do that inside of manifest. Maybe I can just change line 5967 to continue; but I want to check if I have to set something to invalid or something else. It seems like the old version just asserts that it's constant so I suppose I could do that as well but it doesn't feel right to me.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
5972

Now that I look at it again, I do agree that it doesn't make any sense. Especially since Alignment starts at 1

Based on the build results. It looks like there are some other issues. I'll try to debug them later today or tomorrow.

Bryce-MW updated this revision to Diff 398765.Jan 10 2022, 3:50 PM
  • Check for undef value correctly
Bryce-MW updated this revision to Diff 398787.Jan 10 2022, 5:13 PM
  • Possibly fix unknown alignment issue
Bryce-MW updated this revision to Diff 398789.Jan 10 2022, 5:20 PM
Bryce-MW marked an inline comment as done.
  • Possibly fix unknown alignment issue
Bryce-MW updated this revision to Diff 398790.Jan 10 2022, 5:21 PM
  • Possibly fix unknown alignment issue
Bryce-MW updated this revision to Diff 399140.Jan 11 2022, 5:27 PM
  • Rebase
  • Try different getSize

The only thing I could think of that could cause that test to fail would be the change to getSize since it was changed to use getObjectSize which I don't fully understand the workings of. I've been told that getObjectSize is prefered but if this works and it doesn't then I'm not sure what I can do about it.

Bryce-MW updated this revision to Diff 399141.Jan 11 2022, 5:29 PM
  • Forgot the constant alignment check

I am unsure why tools/llvm-libtool-darwin/L-and-l.test failed on Windows. It didn't fail before and seems completely unrelated. I've done a rebase in case that helps.

I would prefer not to use getAllocSizeArgs inside of getSize over using getObjectSize but the former passes the tests and the latter doesn't. Perhaps it is a bug in getObjectSize but I don't understand that function well enough.

Bryce-MW updated this revision to Diff 399200.Jan 11 2022, 9:41 PM
  • Allocation must be removable to put it on the stack. Currently, all allocations are but this may change for some languages.
Bryce-MW updated this revision to Diff 399207.Jan 11 2022, 10:20 PM
  • Rebase to fix failing test on Windows

There are some parts of this which look obviously correct, and a couple that I don't understand and want to leave to @jdoerfert. Bryce, do you mind if I split and land a few pieces of this so you can rebase leaving only the interesting parts?

If so, please confirm again that this is intended for contribution and you agree to the license terms. (You should probably just email Chris for commit rights...)

llvm/lib/Analysis/MemoryBuiltins.cpp
315

Please replace this with assert(isAllocationFn()). There's no reason we can't return a result for e.g. op new. The current client won't use it, but we should be generic.

reames added inline comments.Jan 12 2022, 8:13 AM
llvm/lib/Transforms/IPO/AttributorAttributes.cpp
5840

Can be null in general, might be prevented by the kind precondition, might not. Haven't looked.

5843

I think all you need in client code is a boolean IsZeroInit.

5989

I might be missing something, but I think you can simplify/generalize this code by querying the init value as an i8 before deleting the call using AI.CB, use that value in the memset, and then just check that the value being memset is not undef/poison. (The last check is in fact optional - it's a minor compile time win.)

I think this removes the need for the whole alloc state enum, and upfront detection.

Bryce-MW updated this revision to Diff 399428.Jan 12 2022, 12:56 PM
Bryce-MW marked 2 inline comments as done.
  • Check initial value when needed
llvm/lib/Analysis/MemoryBuiltins.cpp
315

I believe MallocOrCallocLike already includes op new.

MallocLike         = 1<<1 | OpNewLike

My intention was to include everything but StrDupLike since it has size parameters that work differently.

llvm/lib/Transforms/IPO/AttributorAttributes.cpp
5840

I had meant for null to be a possibility. That would be the only case where the else would happen.

5843

Good point

5989

The main reason for the early detection is if getInitialValueOfAllocation returned null, then we wouldn't be able to remove the call. But maybe it makes sense to do it twice. Once to see if we know the initial value, and once to see what to memset. That's probably a better design (in case we ever have an allocation function that allocates all FF for some reason or something like that). I'll change that

I'm not totally sure why it it failing. Something about the memset parameters not being correct though I'm not entirely sure why. I'll look into it. I also forgot to confirm that this is intended for contribution and I agree to the license terms. I'll ask for commit rights after this. I think having a track record of a few commits is probably good.

Bryce-MW updated this revision to Diff 399501.Jan 12 2022, 3:57 PM
  • Attempt to fix type issue
Bryce-MW updated this revision to Diff 399513.Jan 12 2022, 4:58 PM
  • Rebase since test issue seems unrelated

Landed a couple changes inspired by this review and discussion here. None were directly code copied from the review, but all inspired thereby.

commit d1f4c6a6112a9868f178d3a4c666f04ac5dc7415 (HEAD -> main)
Author: Philip Reames <listmail@philipreames.com>
Date: Wed Jan 12 16:43:02 2022 -0800

[Attributor] Generalize calloc handling in heap-to-stack for any init value [NFC]

commit 8e76720cf2c0fb230777e98ad5a41175efc4d404
Author: Philip Reames <listmail@philipreames.com>
Date: Wed Jan 12 16:16:36 2022 -0800

[Attributor] Reuse object size evaluation code [NFC]

commit db57065b368ab4a078b749ccf14b0f384ab4571f
Author: Philip Reames <listmail@philipreames.com>
Date: Wed Jan 12 16:04:18 2022 -0800

[Attributor] Use getAllocAlignment where possible [NFC]

Remaining pieces I see:

  • Figuring out how to generalize getSize. This is the main change which requires the size arg table. I keep thinking there should be a way to reuse existing code for this, but if so, I am not currently seeing it.
  • Rewriting the entry condition guards (e.g. removable with known initial value) and removing the enum.

I suspect these should be separate patches.

One more cleanup

commit 9979299705817c90ef7e4928ba1fb10243cecb67
Author: Philip Reames <listmail@philipreames.com>
Date: Wed Jan 12 17:32:05 2022 -0800

[Attributor] Simplify how we handle required alignment during heap-to-stack [NFC]

This should simplify the getSize rewrite.

Bryce-MW updated this revision to Diff 399524.Jan 12 2022, 6:21 PM

I've rebased onto your commits. I don't really understand how the size system you put in works so I'll just have to trust you. Hopefully we can remove getAllocSizeArgs from getSize. Using getObjectSize was causing errors in tests before but it would be good to remove this extra function. Thanks for those commits!

Went to split out and land your getSize changes, and noticed two things:

  1. The existing code is wrong. There's a missing trunc/zext to pointer index type which can result in wrong results in edge cases. (Not a bug in the patch, a bug in the code you're changing.)
  2. I think I found a better approach than your getAllocSizeParams. We can expose essentially the body of the visitCallBase used by getObjectSize with a callback for replacing an operand with a constant. I'm going to write that up tomorrow, and see if it works out the way I'm currently picturing.

Thanks! I wasn't a big fan of getAllocSizeParams anyway. I'll wait to see how your new version works.

Bryce-MW abandoned this revision.Jan 13 2022, 3:30 PM

I'm abandoning this in favour of D117241