
Today

nlopes added a comment to D97924: [LangRef] clarify the semantics of nocapture.

@aqjune ping: there are a few sentences that need tweaking.

Mon, Apr 12, 11:25 AM · Restricted Project
nlopes accepted D69428: [GlobalOpt] Remove valgrind specific hacks (revert r160529).

Please go ahead. The relevant stakeholders didn't reply, so let's assume they are not interested in this functionality anymore.
Anyway, these days people can use @llvm.used if needed.

Mon, Apr 12, 11:13 AM

Thu, Apr 8

nlopes added a comment to D99642: For non-null pointer checks, do not descend through out-of-bounds GEPs.

LGTM, thanks!

@nlopes do you have any more thoughts on the difference between LLVM & Alive2 on this topic?

Thu, Apr 8, 2:01 AM · Restricted Project

Sat, Apr 3

nlopes added a comment to rGef1f90ba6761: [SLP]Added a test for min/max reductions with the key store inside, NFC..

It should be fixed by https://reviews.llvm.org/rG5fcb07a07020 already.

Sat, Apr 3, 11:50 AM
nlopes added a comment to rGef1f90ba6761: [SLP]Added a test for min/max reductions with the key store inside, NFC..

Is this bug being tracked somewhere? Alive2 complains about this:

@arr = global 128 bytes, align 16
@var = global 4 bytes, align 8
Sat, Apr 3, 3:19 AM

Wed, Mar 31

nlopes added a comment to D99642: For non-null pointer checks, do not descend through out-of-bounds GEPs.

alive2 again doesn't agree that non-inbounds GEP is allowed to produce null pointer: https://alive2.llvm.org/ce/z/9wfL5x

Interesting, but also surprising, especially because the LangRef explicitly calls out GEPs without inbounds to wrap silently?

Yes.
I think it's another example of alive2 trying to invent a memory model that isn't fit for the real world.

Let me try to explain our reasoning:

  • GEP doesn't change the object of a pointer, just the offset within that object. Having this rule enables a lot of optimizations (e.g. you know "for free" that gep %p, %x and gep %q, %y can't alias if %p/%q are the result of e.g. distinct alloca/malloc, even if you know nothing about %x/%y).
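A toy sketch of that rule, assuming a pointer is modeled as an (object, offset) pair. `Ptr`, `gep`, and `may_alias` are hypothetical names for illustration, not Alive2 or LLVM APIs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ptr:
    obj: str     # hypothetical label of the allocation the pointer is based on
    offset: int  # offset within that allocation


def gep(p: Ptr, delta: int) -> Ptr:
    # A GEP only moves the offset; the underlying object never changes.
    return Ptr(p.obj, p.offset + delta)


def may_alias(a: Ptr, b: Ptr) -> bool:
    # Pointers based on distinct allocations cannot alias, whatever the offsets.
    # Pointers into the same allocation are conservatively treated as may-alias.
    return a.obj == b.obj


p = Ptr("alloca_p", 0)
q = Ptr("alloca_q", 0)
# Nothing is known about the offsets, yet the answer is always "no alias".
no_alias_everywhere = all(
    not may_alias(gep(p, x), gep(q, y))
    for x in range(-4, 5)
    for y in range(-4, 5)
)
```

In this model the "for free" part is that `may_alias` never has to look at the offsets when the objects differ.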

I think that's fine for AA, which generally looks at pointers that are dereferenced. It seems a bit unfortunate, though, that we can determine that a pointer is based on an object and yet the pointer may not be dereferenceable even if the object is.

Wed, Mar 31, 3:36 AM · Restricted Project
nlopes added a comment to D99642: For non-null pointer checks, do not descend through out-of-bounds GEPs.

alive2 again doesn't agree that non-inbounds GEP is allowed to produce null pointer: https://alive2.llvm.org/ce/z/9wfL5x

Interesting, but also surprising, especially because the LangRef explicitly calls out GEPs without inbounds to wrap silently?

Yes.
I think it's another example of alive2 trying to invent a memory model that isn't fit for the real world.

Wed, Mar 31, 3:12 AM · Restricted Project

Tue, Mar 30

nlopes updated subscribers of D69428: [GlobalOpt] Remove valgrind specific hacks (revert r160529).

@nlopes

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

Yeah, it would be nice to get rid of this valgrind legacy (btw, there is already D70006, which eliminates this kind of unneeded global for thin LTO). However, I don't know whom to contact.

Tue, Mar 30, 3:07 AM
nlopes committed rGad613b149733: [docs] remove references to checking out svn repos (authored by nlopes).
[docs] remove references to checking out svn repos
Tue, Mar 30, 2:01 AM
nlopes added a comment to D69428: [GlobalOpt] Remove valgrind specific hacks (revert r160529).

Not sure you are still interested in this patch. If so, I would suggest you get in touch with some Google folks and check with them if their codebase is ready for this patch. They were the only reason for this workaround.

Tue, Mar 30, 1:24 AM

Thu, Mar 25

nlopes added a comment to D99121: [IR][InstCombine] IntToPtr Produces Typeless Pointer To Byte.

Given that it gives a decent speedup for some important workload, and provided it doesn't regress others, I think this should go in then.
It's easy to revert this once opaque pointers arrive.

Thu, Mar 25, 4:39 AM · Restricted Project, Restricted Project

Wed, Mar 24

nlopes resigned from D99135: [deref] Implement initial set of inference rules for deref-at-point.
Wed, Mar 24, 9:24 AM · Restricted Project
nlopes added inline comments to D99135: [deref] Implement initial set of inference rules for deref-at-point.
Wed, Mar 24, 4:38 AM · Restricted Project
nlopes added a comment to D99121: [IR][InstCombine] IntToPtr Produces Typeless Pointer To Byte.

The pointee type in LLVM doesn't really matter. It's even supposed to disappear one day after the migration is completed.
E.g., i8* and i64* are exactly the same thing: they are pointers to data.

Yep. That will indeed be great to see.

So, I don't understand the motivation for this patch. It doesn't solve the root cause of the problem (which one btw?).

It is indeed temporary until Opaque pointers are here.
The problem has been stated last time in D99051 by @ruiling:
https://godbolt.org/z/x7E1EjWvv, i.e. given the same integer,
there can be any number of pointers inttoptr'd from it,
and passes won't be able to tell that they are identical.

@dblaikie @t.p.northover can anyone comment on the Opaque Pointers progress? Is there a checklist somewhere?

no checklist, unfortunately - myself, @t.p.northover, @jyknight, and @arsenm have all done bits and pieces of work on it lately.

I think we've got most of the big IR changes (adding explicit types where they'll be needed when they're no longer carried on the type of pointer parameters) - @arsenm's D98146 is another piece in that area, hopefully near the last I think.

After all that's in place, the next step I think would be to introduce the typeless pointer, support it as an operand to these various operations - and then try producing it as a result of instructions too. But I'm probably missing a bunch of important steps we'll find are necessary...

Wed, Mar 24, 3:49 AM · Restricted Project, Restricted Project

Tue, Mar 23

nlopes added inline comments to D99135: [deref] Implement initial set of inference rules for deref-at-point.
Tue, Mar 23, 12:40 PM · Restricted Project
nlopes added inline comments to D99135: [deref] Implement initial set of inference rules for deref-at-point.
Tue, Mar 23, 11:26 AM · Restricted Project
nlopes added inline comments to D99135: [deref] Implement initial set of inference rules for deref-at-point.
Tue, Mar 23, 9:58 AM · Restricted Project
nlopes added inline comments to D99138: [deref] Use readonly to infer global dereferenceability in a callee.
Tue, Mar 23, 4:33 AM · Restricted Project
nlopes requested changes to D99135: [deref] Implement initial set of inference rules for deref-at-point.
Tue, Mar 23, 4:19 AM · Restricted Project
nlopes added a comment to D99121: [IR][InstCombine] IntToPtr Produces Typeless Pointer To Byte.

The pointee type in LLVM doesn't really matter. It's even supposed to disappear one day after the migration is completed.
E.g., i8* and i64* are exactly the same thing: they are pointers to data.
So, I don't understand the motivation for this patch. It doesn't solve the root cause of the problem (which one btw?).

Tue, Mar 23, 3:59 AM · Restricted Project, Restricted Project

Sat, Mar 20

nlopes added a comment to D98179: [lit] Sort test start times based on prior test timing data.

Why are timeouts important? Our use case is running Alive2 with the test suite. Alive2 is heavy machinery and runs into timeouts. Running the tests in roughly the same order every time is important to avoid timeout tests flipping to failed or vice-versa. Plus slow tests = tests that consume a lot of memory (in our scenario), so we can't bundle slow tests together.
Adding a --ignore-timing-data would be great, yes! But I still feel that sorting the list of failed tests is important for user experience. I diff these all the time.

That still sounds incredibly brittle. If there is any variety in test machine performance, then any and all attempts at sorting should be futile because the underlying hardware will perturb different timeouts. Is this not your experience? How do you reconcile hardware performance and configuration details (like SMT) with timeout settings?

Of course it's brittle :) Changing from a time-based setting to a ticks-based system is ongoing work, such that resource exhaustion becomes deterministic.
Nevertheless, on the same machine, we don't see many test flips. It's quite stable most of the time (just one test flip once in a while).

This seems really beyond the scope and purpose of sorting the tests.

If you don't mind and given that the workaround is trivial (delete the timing data), I'd like to hold off on adding --ignore-timing-data. If enough people complain then we can add that option. Is that okay with you?

Sat, Mar 20, 10:46 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
nlopes added a comment to D98179: [lit] Sort test start times based on prior test timing data.

Why are timeouts important? Our use case is running Alive2 with the test suite. Alive2 is heavy machinery and runs into timeouts. Running the tests in roughly the same order every time is important to avoid timeout tests flipping to failed or vice-versa. Plus slow tests = tests that consume a lot of memory (in our scenario), so we can't bundle slow tests together.
Adding a --ignore-timing-data would be great, yes! But I still feel that sorting the list of failed tests is important for user experience. I diff these all the time.

That still sounds incredibly brittle. If there is any variety in test machine performance, then any and all attempts at sorting should be futile because the underlying hardware will perturb different timeouts. Is this not your experience? How do you reconcile hardware performance and configuration details (like SMT) with timeout settings?

Sat, Mar 20, 7:37 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
nlopes added a comment to rG5cbe2279f723: [lit] Sort testing summary output.

Thank you!

Sat, Mar 20, 7:24 AM
nlopes added a comment to D98179: [lit] Sort test start times based on prior test timing data.

I'm talking about sorting just the summary of failed tests, not the whole output. We need the whole -vv output, but that can be out of order.

Sat, Mar 20, 4:25 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Fri, Mar 19

nlopes added a comment to D98179: [lit] Sort test start times based on prior test timing data.

Can we revert to the previous behavior please? The current behavior is not user friendly. Thanks!

To clarify: you care about the order in the final summary, not the actual execution order, right? (the goal of this patch is the latter, if it changes the former this is just a side-effect I believe)

Fri, Mar 19, 4:23 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
nlopes added a comment to D98179: [lit] Sort test start times based on prior test timing data.

This patch makes the order of the list of failing tests non-deterministic. This is extremely annoying because you can't do a simple diff between test dumps anymore.
Previously, the list of failed tests was sorted.
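To illustrate why a sorted summary matters for diffing, here is a minimal sketch. It is not lit's actual code; `failure_summary` and the test names are made up for the example:

```python
def failure_summary(failed_tests):
    # Sort by name so the summary does not depend on the order in which
    # the tests happened to finish.
    return "\n".join(sorted(failed_tests))


# Two runs with the same failures, reported in different completion orders.
run_a = ["Transforms/b.ll", "Transforms/a.ll", "CodeGen/c.ll"]
run_b = ["CodeGen/c.ll", "Transforms/b.ll", "Transforms/a.ll"]

# With sorting, a plain diff of the two summaries is empty.
same_summary = failure_summary(run_a) == failure_summary(run_b)
```

Only the summary is sorted here; the execution order (and the -vv output) can stay in whatever order the scheduler picks.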

Fri, Mar 19, 3:37 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
nlopes accepted D98908: Update basic deref API to account for possiblity of free [NFC].

It is LGTM, yes.
Thanks for the details on the plan. Sounds great, thanks!

Fri, Mar 19, 10:06 AM · Restricted Project
nlopes added a comment to D98908: Update basic deref API to account for possiblity of free [NFC].

Sounds good to me, as long as you commit to removing that cmd switch within a reasonable time frame. Otherwise we accumulate technical debt.

Fri, Mar 19, 7:03 AM · Restricted Project

Thu, Mar 18

nlopes accepted D94964: [LangRef] Describe memory layout for vectors types.

Thanks for expanding on the padding. Sounds good to me.
LGTM.

Thu, Mar 18, 3:15 AM · Restricted Project

Wed, Mar 17

nlopes added a comment to D94964: [LangRef] Describe memory layout for vectors types.

Overall LGTM. Thanks for documenting this! It was painful to reverse-engineer this when implementing it in Alive2..

Wed, Mar 17, 12:28 PM · Restricted Project

Mar 9 2021

nlopes added a comment to D97924: [LangRef] clarify the semantics of nocapture.

Consider this example:

f(nocapture p) {
  p2 = load glb
  ret p2
}

According to your definition, this function triggers UB if p2 == p, because it increases the number of observations of p. It's counter-intuitive to me that a function that doesn't touch its nocapture argument captures that argument.

I agree, this is slightly counter intuitive. I think it's also fundamentally necessary.

Mar 9 2021, 3:08 AM · Restricted Project

Mar 8 2021

nlopes added a comment to D97924: [LangRef] clarify the semantics of nocapture.
// G is only used in this function and always written first, no leakage!
static int *G;
void noescape1(int *p) {
    int q;
    G = cond() ? p : &q;
    *G = 3;
}
Mar 8 2021, 6:16 AM · Restricted Project

Mar 5 2021

nlopes added a comment to D97924: [LangRef] clarify the semantics of nocapture.

If we have an object which has not yet been captured passed to a nocapture argument of a function, we know that the object remains uncaptured after the call. Additionally, we also know that the callee hasn't increased the number of locations in which references to the object can be observed after the call. Thus, if the caller can precisely enumerate said set before the call, that set remains precise after the call completes.

Mar 5 2021, 10:23 AM · Restricted Project
nlopes added a comment to D94002: [LangRef] Make lifetime intrinsic's semantics consistent with StackColoring's comment.

The exact semantics of lifetime.start depends on the pattern matching patterns in the stack coloring algorithm. So this intrinsic cannot be abused. It must be used for the uses it was created for only.

That's fair, but then shouldn't the docs say that? Usually one would expect the docs to say everything there is to be said; so in this case a dedicated warning might be in order to document the caveats you just mentioned.

Mar 5 2021, 4:40 AM · Restricted Project
nlopes added inline comments to D94002: [LangRef] Make lifetime intrinsic's semantics consistent with StackColoring's comment.
Mar 5 2021, 4:16 AM · Restricted Project

Mar 2 2021

nlopes added a comment to D81678: Introduce noundef attribute at call sites for stricter poison analysis.

@rsmith already gave his blessing, so please go ahead.

Mar 2 2021, 11:19 AM · Restricted Project, Restricted Project
nlopes accepted D94002: [LangRef] Make lifetime intrinsic's semantics consistent with StackColoring's comment.

We've had a months-long discussion on this topic. I think we've reached quorum to move forward.
The patch looks great, thanks for your work. Please go ahead and commit it!

Mar 2 2021, 10:15 AM · Restricted Project
nlopes resigned from D42879: InstCombine: 1./x >= 0. -> x >= 0..
Mar 2 2021, 9:55 AM · Restricted Project
nlopes resigned from D36878: Inst Combine GEP Flatten.
Mar 2 2021, 9:53 AM

Mar 1 2021

nlopes committed rW8377e21fa6bd: [GSoC] add 2021 project on LLVM IR issues (authored by nlopes).
[GSoC] add 2021 project on LLVM IR issues
Mar 1 2021, 11:00 AM

Feb 26 2021

nlopes added a comment to D88287: [NARY-REASSOCIATE] Support reassociation of min/max.

I think the solution is to use >= instead of > when we do min/max reassociation. In other words, originally we had 'any' > MAX_INT which is known to be false. If we want semantically equal but reassociated expression we should invert the comparison logic. In other words we should check MAX_INT >= 'any' which is known to be true and MAX_INT will be selected.
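The difference between the strict and non-strict comparison can be checked by brute force. This is a toy model on i4 (a width chosen only to keep the enumeration small), assuming the min/max is written as icmp+select and that each use of undef may take a different value; `results` is a hypothetical helper, not Alive2 code:

```python
BITS = 4  # small width, assumed only to keep the enumeration cheap
VALUES = range(-(1 << (BITS - 1)), 1 << (BITS - 1))  # every i4 value
MAX_INT = (1 << (BITS - 1)) - 1


def results(expr):
    # The compared undef (u1) and the selected undef (u2) are independent
    # uses, so they are enumerated separately.
    return {expr(u1, u2) for u1 in VALUES for u2 in VALUES}


# select (undef > MAX_INT), undef, MAX_INT   -- the original form
original = results(lambda u1, u2: u2 if u1 > MAX_INT else MAX_INT)

# select (MAX_INT > undef), MAX_INT, undef   -- reassociated, strict compare
strict = results(lambda u1, u2: MAX_INT if MAX_INT > u1 else u2)

# select (MAX_INT >= undef), MAX_INT, undef  -- reassociated, non-strict compare
non_strict = results(lambda u1, u2: MAX_INT if MAX_INT >= u1 else u2)
```

Under these assumptions the original and the `>=` form can only produce MAX_INT, while the strict form can produce any value (when the compared undef happens to equal MAX_INT), so it is not a refinement of the original.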

Feb 26 2021, 3:21 AM · Restricted Project
nlopes added a comment to D88287: [NARY-REASSOCIATE] Support reassociation of min/max.

I think this is an issue of verification itself. In the first case max(0, undef)=>any and max(any, max_int)=>max_int. In the second case max(max_int, undef)=>x03002006. I believe the behavior of the verifier is inconsistent in these two cases and max(max_int, undef) should be evaluated to max_int as well. We can do the following trivial transformations to prove that: max(max_int, undef) is trivially equal to max(max_int, max(undef, undef)) and max(undef, undef) should be evaluated to 'any' since max(0, undef) is evaluated to 'any' in the first case. Thus we get max(max_int, any) which is evaluated to 'max_int' in the first case. So max(max_int, undef) should be evaluated to 'max_int' but not 'x03002006'.

Makes sense?

Feb 26 2021, 2:38 AM · Restricted Project

Feb 25 2021

nlopes added a comment to D88287: [NARY-REASSOCIATE] Support reassociation of min/max.

This patch regressed the following tests:

  • LLVM :: Transforms/NaryReassociate/nary-smax.ll
  • LLVM :: Transforms/NaryReassociate/nary-smin.ll
  • LLVM :: Transforms/NaryReassociate/nary-umax.ll
  • LLVM :: Transforms/NaryReassociate/nary-umin.ll
Feb 25 2021, 8:39 AM · Restricted Project

Jan 11 2021

nlopes added a comment to D94014: [InstCombine] reduce icmp(ashr X, C1), C2 to sign-bit test.

Anyone see problems with this Alive2 implementation using count-leading-*?
https://alive2.llvm.org/ce/z/SWxadd

I also manually entered all of the i4 regression tests with fixed constants in Alive1 (rise4fun), and they appear to be correct as shown in the test diffs.

Jan 11 2021, 9:05 AM · Restricted Project
nlopes added a comment to D94014: [InstCombine] reduce icmp(ashr X, C1), C2 to sign-bit test.

alive1 does not actually have a countLeadingOnes() precondition

Weird - maybe @nlopes can tell us how this example parses at all then: https://rise4fun.com/Alive/juX1

(at least as per https://github.com/nunoplopes/alive/blob/master/constants.py),

Jan 11 2021, 8:00 AM · Restricted Project

Jan 10 2021

nlopes added a comment to D93820: [InstSimplify] Don't fold gep p, -p to null.

Based on what @RalfJung mentioned on zulip, the question of whether the transform is legal for inbounds comes down to the particular choice of inbounds semantics. I was using the semantics specified in LangRef, which make the optimization illegal, while @nlopes used the semantics from https://people.mpi-sws.org/~jung/twinsem/twinsem.pdf (or something similar), which makes it legal. The relevant difference to the LangRef semantics (if we stick to the gep-inbounds-logical case) would be:

- The base pointer has an in bounds address of an allocated object [...]
+ The base pointer has an in bounds address of the allocated object it is based on [...]

In any case, regardless of whether this is legal for the inbounds case, I think everyone agrees it's not legal for the non-inbounds case (and not legal for the non-null case regardless of inbounds). Is that enough to move forward here, or do you want me to thread inbounds information through SimplifyGEPInst and retain this optimization for the inbounds case?

Jan 10 2021, 7:52 AM · Restricted Project

Jan 7 2021

nlopes added a comment to D89697: * [x86] Implement smarter instruction lowering for FP_TO_UINT from vXf32 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction..

No regression appeared in our internal testcases.
It seems the transform is correct, have you verified it with alive-tv?

I was curious to see if I could model it:
https://alive2.llvm.org/ce/z/RXcYY9

Converting #x4f800000 (4294967296) to uint32_t is poison, not 0 though. Am I reading the Alive output correctly? (cc @lebedev.ri @aqjune @nlopes @nikic )
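The arithmetic in that question can be checked with a short sketch. `f32_from_bits` and `fptoui_i32` are hypothetical helpers, and `None` stands in for poison; the range rule follows the LangRef statement that fptoui yields poison when the value cannot fit in the destination type:

```python
import math
import struct


def f32_from_bits(bits: int) -> float:
    # Reinterpret a 32-bit pattern as an IEEE-754 single-precision float.
    return struct.unpack(">f", bits.to_bytes(4, "big"))[0]


def fptoui_i32(x: float):
    # Model of fptoui float -> i32: round toward zero, and return None
    # (standing in for poison) when the result does not fit in 32 bits.
    if not math.isfinite(x):
        return None
    t = math.trunc(x)
    return t if 0 <= t < 2**32 else None


value = f32_from_bits(0x4F800000)  # exponent field 159, i.e. 2**32
converted = fptoui_i32(value)      # out of range for uint32_t
```

Since 2**32 is one past the largest uint32_t value, the model returns the poison stand-in rather than 0.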

Jan 7 2021, 11:18 AM · Restricted Project
nlopes added a comment to D89697: * [x86] Implement smarter instruction lowering for FP_TO_UINT from vXf32 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction..

No regression appeared in our internal testcases.
It seems the transform is correct, have you verified it with alive-tv?

I was curious to see if I could model it:
https://alive2.llvm.org/ce/z/RXcYY9

Converting #x4f800000 (4294967296) to uint32_t is poison, not 0 though. Am I reading the Alive output correctly? (cc @lebedev.ri @aqjune @nlopes @nikic )

Jan 7 2021, 10:43 AM · Restricted Project

Jan 2 2021

nlopes added a comment to D87188: [InstCombine] Canonicalize SPF to abs intrinc.

Heads up: Breaks a test for us: https://bugs.chromium.org/p/chromium/issues/detail?id=1161542

(No reduced repro yet, might be UB, just fyi at this point.)

Thanks for the heads-up. For now I'll deal with the problem @nlopes pointed out above in a bit.

Just to follow up, this ended up being UB on our end (fix: https://android-review.googlesource.com/c/platform/external/perfetto/+/1535483)

Jan 2 2021, 8:51 AM · Restricted Project, Restricted Project

Dec 31 2020

nlopes added a comment to rGa2513cb8655e: remove pessimizing moves (reported by gcc 10).

I reverted this commit because it was causing problems with clang on several of the buildbots with the general flavor of the following error log:

http://lab.llvm.org:8011/#/builders/21/builds/5863/steps/5/logs/stdio

llvm-project/llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1519:14: error: call to deleted constructor of 'llvm::Error'
      return Err;
             ^~~

This is where the changes were made, and:

llvm-project/llvm/include/llvm/ExecutionEngine/Orc/Shared/RPCUtils.h:1232:29: error: no matching member function for call to 'callB'
              Impl.template callB<OrcRPCNegotiate>(Func::getPrototype())) {
Dec 31 2020, 2:42 PM
nlopes committed rGa2513cb8655e: remove pessimizing moves (reported by gcc 10) (authored by nlopes).
remove pessimizing moves (reported by gcc 10)
Dec 31 2020, 12:36 PM
nlopes committed rGf760d57052d8: LangRef: fix significand bits of fp128 (authored by nlopes).
LangRef: fix significand bits of fp128
Dec 31 2020, 3:14 AM

Dec 26 2020

nlopes added a comment to D93820: [InstSimplify] Don't fold gep p, -p to null.

I only looked at the tests and they were correct before, see here: https://alive2.llvm.org/ce/z/UzW3pv
The tests are weird because they have 'gep inbounds'. The reason they are correct (and weird) is that the only way p - (int)p/sizeof(*p) is inbounds is p being null. Anything else will overflow.

This doesn't look right to me, at least not given current LangRef wording. Let's say we have gep inbounds p, -p, where p = ptr(base_addr = 1, offset = -1). This means that the address value of p is 0, but it has provenance of the object at base_addr = 1. As such, the inbounds is not violated (both p and the gep results are inbounds of the zero address), but we still change provenance.

There's an extra catch: gep inbounds requires both the input and output pointers to be in bounds. This part is explicit in LangRef, at least.
Some examples:

p = malloc()
q = gep inbounds p, -1  // poison
r = gep p, -1           // ok
s = gep inbounds r, 1   // poison: r is not inbounds
t = gep r, 1            // ok, offset = 0
u = gep inbounds t, 1   // ok, offset = 1 (assuming malloc size > 0)
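The same examples as a small executable model, assuming a pointer is just an offset into a single allocation of `SIZE` bytes and that one-past-the-end counts as in bounds. `gep`, `POISON`, and `SIZE` are names invented for this sketch:

```python
POISON = "poison"  # stand-in for a poison value in this sketch
SIZE = 8           # assumed malloc size (> 0)


def gep(ptr, delta, inbounds=False):
    # ptr is an offset into the one allocation, or POISON.
    if ptr is POISON:
        return POISON
    out = ptr + delta
    # inbounds requires BOTH the input and the output to be in bounds.
    if inbounds and not (0 <= ptr <= SIZE and 0 <= out <= SIZE):
        return POISON
    return out


p = 0                          # p = malloc()
q = gep(p, -1, inbounds=True)  # poison
r = gep(p, -1)                 # ok, offset = -1
s = gep(r, 1, inbounds=True)   # poison: r is not inbounds
t = gep(r, 1)                  # ok, offset = 0
u = gep(t, 1, inbounds=True)   # ok, offset = 1
```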

Right, but inbounds and provenance are, as far as I can tell, orthogonal concepts. Alive claims that this code has UB due to use of gep inbounds: https://alive2.llvm.org/ce/z/zTctIR At the same time, the gep inbounds itself is not poison: https://alive2.llvm.org/ce/z/wxGGyu That makes it look like Alive also constrains provenance based on gep inbounds, not just the value of the pointer.

Dec 26 2020, 11:21 AM · Restricted Project
nlopes added a comment to D93820: [InstSimplify] Don't fold gep p, -p to null.

I only looked at the tests and they were correct before, see here: https://alive2.llvm.org/ce/z/UzW3pv
The tests are weird because they have 'gep inbounds'. The reason they are correct (and weird) is that the only way p - (int)p/sizeof(*p) is inbounds is p being null. Anything else will overflow.

This doesn't look right to me, at least not given current LangRef wording. Let's say we have gep inbounds p, -p, where p = ptr(base_addr = 1, offset = -1). This means that the address value of p is 0, but it has provenance of the object at base_addr = 1. As such, the inbounds is not violated (both p and the gep results are inbounds of the zero address), but we still change provenance.

Dec 26 2020, 8:15 AM · Restricted Project
nlopes added a comment to D93818: [LangRef] Update shufflevector's semantics to return poison if the mask is undef.

LGTM

Dec 26 2020, 4:44 AM · Restricted Project
nlopes added a comment to D93820: [InstSimplify] Don't fold gep p, -p to null.

I only looked at the tests and they were correct before, see here: https://alive2.llvm.org/ce/z/UzW3pv
The tests are weird because they have 'gep inbounds'. The reason they are correct (and weird) is that the only way p - (int)p/sizeof(*p) is inbounds is p being null. Anything else will overflow.

Dec 26 2020, 4:10 AM · Restricted Project

Dec 24 2020

nlopes added inline comments to D93793: [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder.
Dec 24 2020, 2:57 AM · Restricted Project, Restricted Project

Dec 22 2020

nlopes added a comment to D87188: [InstCombine] Canonicalize SPF to abs intrinc.

This patch regressed the following test of Transforms/InstCombine/abs-1.ll:
(need to drop NSW in this case).

define i8 @nabs_canonical_3(i8 %x) {
%0:
  %cmp = icmp slt i8 %x, 0
  %neg = sub nsw i8 0, %x
  %abs = select i1 %cmp, i8 %x, i8 %neg
  ret i8 %abs
}
=>
define i8 @nabs_canonical_3(i8 %x) {
%0:
  %1 = abs i8 %x, 1
  %abs = sub nsw i8 0, %1
  ret i8 %abs
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source
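A brute-force sketch over all i8 inputs that looks for the kind of counterexample behind this report. It is a hand-written model of the two functions above, not Alive2 output, and `POISON`, `source`, and `target` are names made up here:

```python
POISON = "poison"  # stand-in for a poison value in this sketch
INT_MIN, INT_MAX = -128, 127


def sub_nsw(a, b):
    # sub nsw: signed overflow yields poison.
    r = a - b
    return r if INT_MIN <= r <= INT_MAX else POISON


def source(x):
    # select (x < 0), x, (sub nsw 0, x): select returns only the chosen arm,
    # so a poison %neg is harmless whenever x < 0.
    return x if x < 0 else sub_nsw(0, x)


def target(x):
    # abs(x, 1): INT_MIN gives poison, which then flows through the sub nsw.
    a = POISON if x == INT_MIN else abs(x)
    return POISON if a is POISON else sub_nsw(0, a)


# Inputs where the target is poison but the source is a defined value.
more_poisonous = [
    x for x in range(INT_MIN, INT_MAX + 1)
    if target(x) is POISON and source(x) is not POISON
]
```

In this model the only offending input is x = -128: the source returns -128, while the target is poison.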
Dec 22 2020, 6:26 AM · Restricted Project, Restricted Project

Dec 21 2020

nlopes added a comment to D93065: [InstCombine] Disable optimizations of select instructions that causes propagation of poison values.

In practice, probably not a lot. But it may have implications for loop optimization, like:

for (i=0; some_bool && i < limit; ++i) {
...
}

If you remove the poison from the i+1 < limit bit it may make the work of SCEV harder (or impossible; didn't think the example through carefully).

Can I just make sure my understanding is correct -- so when we check the SCEV of some_bool && i < limit; we do recursion backwards on this select instruction (after this patch) or on an AND instruction (before this patch). If we choose the freeze approach, we'll do recursion on the AND instruction and eventually hit a freeze instruction which SCEV does not know how to handle, hence SCEV will just return CouldNotCompute?

Dec 21 2020, 4:07 AM · Restricted Project, Restricted Project

Dec 18 2020

nlopes added a comment to D93376: [LangRef] Clarify the semantics of lifetime intrinsics.

As Ralf mentioned, the ship has sailed. Alloca and lifetime intrinsics were implemented like this several years ago. They were a quick hack to save stack space after inlining. That's it, and their design reflects the goals at the time.
We simply want to document what is implemented. @jdoerfert you seem to want to change the implementation and/or the design, which is a separate discussion. I suggest we first document how LLVM works and then if you want to make changes you start a *separate* discussion on the things you want to change, why, and what's the upgrade path, etc. We can't change the semantics of either alloca or lifetime intrinsics without an automatic upgrade path as otherwise we would break all frontends out there.

Dec 18 2020, 9:01 AM · Restricted Project
nlopes added a comment to D93065: [InstCombine] Disable optimizations of select instructions that causes propagation of poison values.

Using freeze loses information (if any of the inputs was poison). Plus, it requires an extra op.
If we canonicalize around select there's no loss of information and it's just 1 instruction.

The disadvantage is that we then have 2 ways of doing boolean ANDs/ORs. Though most analyses can be patched easily, as most LLVM analyses' results are of the form "x has property foo unless it's poison". So for those analyses using and/or or select is the same (as the only difference between these is propagation of poison).
Other analyses/optimizations can learn about select as needed.
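A toy model of the three options on i1 values, to make the poison-propagation difference concrete. `and_i1`, `select`, `logical_and`, and `freeze` are simplified stand-ins written for this sketch, and `freeze` picks a fixed value where the real instruction may pick any:

```python
POISON = "poison"  # stand-in for a poison value in this sketch


def and_i1(a, b):
    # `and` propagates poison from either operand.
    if a is POISON or b is POISON:
        return POISON
    return a and b


def select(c, t, f):
    # `select` is poison if the condition is; otherwise only the chosen
    # arm matters.
    if c is POISON:
        return POISON
    return t if c else f


def logical_and(a, b):
    # a && b expressed as `select a, b, false`.
    return select(a, b, False)


def freeze(v, pick=False):
    # freeze replaces poison with some fixed value (here `pick`).
    return pick if v is POISON else v


blocked_by_select = logical_and(False, POISON)      # False
leaked_by_and = and_i1(False, POISON)               # poison
kept_by_select = logical_and(True, POISON)          # poison (information kept)
lost_by_freeze = and_i1(True, freeze(POISON))       # False (information lost)
```

The last two lines show the trade-off: the select form still reports poison when the second operand is actually used, whereas the frozen form silently turns it into an ordinary value.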

Thank you for raising this good point! I understand that we lose information by using freeze to prevent poison values from propagating. But it's unclear to me what the side effect or problem with that would be. I'd appreciate it if you could clarify a bit, thanks!

Dec 18 2020, 7:13 AM · Restricted Project, Restricted Project

Dec 17 2020

nlopes added a comment to D93376: [LangRef] Clarify the semantics of lifetime intrinsics.

What is the reason to restrict it to allocas? Just that we don't emit it right now? I don't see how that makes it conceptually better.

Dec 17 2020, 12:08 PM · Restricted Project
nlopes added a comment to D93065: [InstCombine] Disable optimizations of select instructions that causes propagation of poison values.

Using freeze loses information (if any of the inputs was poison). Plus, it requires an extra op.
If we canonicalize around select there's no loss of information and it's just 1 instruction.

Dec 17 2020, 11:48 AM · Restricted Project, Restricted Project
nlopes added a comment to D78938: Make LLVM build in C++20 mode.

@BRevzin @nlopes This is causing an MSVC build failure; can you please take a look?

E:\llvm\llvm-project\llvm\include\llvm/DebugInfo/DWARF/DWARFDie.h(405): note: see declaration of 'std::reverse_iterator<llvm::DWARFDie::iterator>'
E:\llvm\llvm-project\llvm\lib\DWARFLinker\DWARFLinker.cpp(383): note: see reference to function template instantiation 'bool std::operator !=<llvm::DWARFDie::iterator,llvm::DWARFDie::iterator>(const std::reverse_iterator<llvm::DWARFDie::iterator> &,const std::reverse_iterator<llvm::DWARFDie::iterator> &)' being compiled
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.28.29333\include\xutility(2086): error C2039: '_Get_current': is not a member of 'std::reverse_iterator<llvm::DWARFDie::iterator>'
E:\llvm\llvm-project\llvm\include\llvm/DebugInfo/DWARF/DWARFDie.h(405): note: see declaration of 'std::reverse_iterator<llvm::DWARFDie::iterator>'
Dec 17 2020, 6:12 AM · Restricted Project, Restricted Project
nlopes committed rG92310454bf0f: Make LLVM build in C++20 mode (authored by BRevzin).
Make LLVM build in C++20 mode
Dec 17 2020, 2:45 AM
nlopes closed D78938: Make LLVM build in C++20 mode.
Dec 17 2020, 2:45 AM · Restricted Project, Restricted Project

Dec 16 2020

nlopes added a comment to D93376: [LangRef] Clarify the semantics of lifetime intrinsics.

A contradiction in the proposed semantics vs the comments in the code is that we state that subsequent lifetime.start don't change the address of the alloca. There's only one address per alloca, even if it may have multiple disjoint liveness ranges.
It would be good to get confirmation from some CodeGen folks that the implemented algorithm respects this condition and that there are no plans to make the algorithm more aggressive in a way that may break this assumption.
Having multiple start/end pairs for the same alloca is not common, so I think imposing this condition is fine. It gives us free movement of gep/ptr2int, which is a good tradeoff.

Dec 16 2020, 7:00 AM · Restricted Project
nlopes added a reviewer for D93376: [LangRef] Clarify the semantics of lifetime intrinsics: MatzeB.
Dec 16 2020, 6:56 AM · Restricted Project

Dec 14 2020

nlopes added a comment to D93229: [VectorCombine] allow peeking through GEPs when creating a vector load.
/builddirs/llvm-project/build-Clang11-unknown$ /builddirs/llvm-project/build-Clang11-unknown/bin/opt -load /repositories/alive2/build-Clang-release/tv/tv.so -tv -vector-combine -mtriple=x86_64-- -mattr=avx2 -tv -o /dev/null --tv-smt-to=60000 /tmp/D93229.ll 

----------------------------------------
define <8 x i16> @t(* dereferenceable(128) align(128) %base) {
%0:
  %ptr = gep inbounds * dereferenceable(128) align(128) %base, 1 x i64 1
  %p = bitcast * %ptr to *
  %gep = gep inbounds * %p, 16 x i64 0, 2 x i64 1
  %s = load i16, * %gep, align 1
  %r = insertelement <8 x i16> undef, i16 %s, i64 0
  ret <8 x i16> %r
}
=>
define <8 x i16> @t(* dereferenceable(128) align(128) %base) {
%0:
  %ptr = gep inbounds * dereferenceable(128) align(128) %base, 1 x i64 1
  %p = bitcast * %ptr to *
  %gep = gep inbounds * %p, 16 x i64 0, 2 x i64 1
  %1 = bitcast * %gep to *
  %r = load <8 x i16>, * %1, align 1
  ret <8 x i16> %r
}
Transformation doesn't verify!
Dec 14 2020, 11:36 AM · Restricted Project

Dec 13 2020

nlopes added a comment to D78938: Make LLVM build in C++20 mode.

Thanks @lebedev.ri for the pointer!
I started working on exactly the same thing as I was trying to link a C++20 project with LLVM.
@BRevzin is there anything missing in this patch? Do you have commit access or do you need help to land this?

Dec 13 2020, 10:16 AM · Restricted Project, Restricted Project
nlopes added a comment to D90529: Allow nonnull/align attribute to accept poison.

For the partial undef memory access example that Juneyoung gave.. Well, maybe we need to make it UB to dereference a non-deterministic value. Doesn't seem like it's a very useful thing to do, and this non-determinism comes from some previous undefined behavior, so it seems fine to just make dereference of partial undef UB. Simplifies things.

There was a discussion for this: https://groups.google.com/g/llvm-dev/c/2Qk4fOHUoAE/m/OxZa3bIhAgAJ
This partially undef thing is a bit painful.. :/

Dec 13 2020, 4:31 AM · Restricted Project

Dec 11 2020

nlopes added a comment to D90529: Allow nonnull/align attribute to accept poison.

I like where this is going. Most of LLVM's alias analyses produce information that only holds if the value is not poison. Since these attributes are derived from said analyses, it makes sense that they have the same "X is poison or foo(X) holds" semantics.
I agree that certain attributes are different, like dereferenceable. It is useless if the value might be poison as well. Though we may go with the same semantics and then require the noundef attribute to make it useful. Seems like a good way to go as well.
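
The "X is poison or foo(X) holds" reading can be written down as a tiny model. This is only a sketch: the `POISON` sentinel and the helper names are hypothetical and do not correspond to any LLVM API.

```python
# Hypothetical model: POISON stands in for LLVM's poison value.
POISON = object()

def attr_holds(value, pred):
    """'X is poison or foo(X) holds': a poison value satisfies the attribute vacuously."""
    return value is POISON or pred(value)

def attr_holds_with_noundef(value, pred):
    """Same attribute combined with noundef: poison is no longer accepted."""
    return value is not POISON and pred(value)

def is_nonnull(ptr):
    return ptr != 0

print(attr_holds(POISON, is_nonnull))               # vacuously satisfied
print(attr_holds_with_noundef(POISON, is_nonnull))  # rejected once noundef is required
```

Under this reading, an attribute such as dereferenceable says nothing useful by itself when the pointer may be poison, which is why pairing it with noundef is suggested above.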

Dec 11 2020, 2:07 PM · Restricted Project

Dec 10 2020

nlopes committed rGd2a7b83c5c7b: AA: make AliasAnalysis.h compatible with C++20 (NFC) (authored by nlopes).
AA: make AliasAnalysis.h compatible with C++20 (NFC)
Dec 10 2020, 7:32 AM

Dec 8 2020

nlopes committed rG3c01af9aeebe: DenseMap: fix build with clang in C++20 mode (authored by nlopes).
DenseMap: fix build with clang in C++20 mode
Dec 8 2020, 10:40 AM

Nov 12 2020

nlopes added a comment to D90708: [LangRef] Clarify GEP inbounds wrapping semantics.

LGTM!
Thanks a lot for working on this!

Nov 12 2020, 10:06 AM · Restricted Project

Nov 11 2020

nlopes added inline comments to D90708: [LangRef] Clarify GEP inbounds wrapping semantics.
Nov 11 2020, 12:20 PM · Restricted Project
nlopes added inline comments to D90708: [LangRef] Clarify GEP inbounds wrapping semantics.
Nov 11 2020, 11:41 AM · Restricted Project

Nov 9 2020

nlopes added a comment to D91055: [clang-tidy] Introduce misc No Integer To Pointer Cast check.

Nice!
BTW, another popular idiom is to store data in the last few bits of the pointer (e.g., LLVM's own PointerIntPair). I guess that one can also be implemented by casting the ptr to char* and doing operations over that.

Nov 9 2020, 2:05 AM · Restricted Project, Restricted Project, Restricted Project

Nov 5 2020

nlopes added a comment to D90382: [InstCombine] foldSelectRotate - generalize to foldSelectFunnelShift .

@nlopes I think we should adjust the funnel shift definition to say that it blocks poison on one operand if the shift amount is zero. Basically the poison semantics should be "as if" the funnel shift were expanded, which does include an explicit select for the zero shift amount case.

Nov 5 2020, 5:32 AM · Restricted Project
nlopes added a comment to D90382: [InstCombine] foldSelectRotate - generalize to foldSelectFunnelShift .

Alive2 says this test is incorrect (because select blocks poison and funnel shift doesn't):

define i8 @fshr_select(i8 %x, i8 %y, i8 %shamt) {
%0:
  %cmp = icmp eq i8 %shamt, 0
  %sub = sub i8 8, %shamt
  %shr = lshr i8 %y, %shamt
  %shl = shl i8 %x, %sub
  %or = or i8 %shl, %shr
  %r = select i1 %cmp, i8 %y, i8 %or
  ret i8 %r
}
=>
define i8 @fshr_select(i8 %x, i8 %y, i8 %shamt) {
%0:
  %r = fshr i8 %x, i8 %y, i8 %shamt
  ret i8 %r
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source
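
The poison mismatch can be reproduced with a small model of the two functions. This is a sketch under simplifying assumptions: poison is modelled as `None`, shifts by an amount greater than or equal to the bit width yield poison, and the helper names are made up for the illustration.

```python
POISON = None            # modelling assumption: None stands in for poison
BW = 8                   # i8, as in the test above
MASK = (1 << BW) - 1

def shl(a, s):
    if a is POISON or s is POISON or s >= BW:
        return POISON
    return (a << s) & MASK

def lshr(a, s):
    if a is POISON or s is POISON or s >= BW:
        return POISON
    return a >> s

def or_(a, b):
    if a is POISON or b is POISON:
        return POISON
    return a | b

def select(cond, if_true, if_false):
    # select only propagates poison from the condition and the chosen operand
    if cond is POISON:
        return POISON
    return if_true if cond else if_false

def source(x, y, shamt):
    cmp = POISON if shamt is POISON else (shamt == 0)
    sub = POISON if shamt is POISON else (BW - shamt) & MASK
    shr = lshr(y, shamt)
    shl_ = shl(x, sub)
    return select(cmp, y, or_(shl_, shr))

def target(x, y, shamt):
    # fshr is modelled as poison if any operand is poison
    if x is POISON or y is POISON or shamt is POISON:
        return POISON
    s = shamt % BW
    return ((x << (BW - s)) | (y >> s)) & MASK

# shamt == 0 with a poison %x: the select returns %y, the funnel shift is poison
print(source(POISON, 5, 0), target(POISON, 5, 0))
```

For non-poison inputs with a shift amount below 8 the two models agree; the difference only shows up when `%x` is poison and the shift amount is zero.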
Nov 5 2020, 4:19 AM · Restricted Project

Nov 4 2020

nlopes added a reviewer for D90708: [LangRef] Clarify GEP inbounds wrapping semantics: rsmith.
Nov 4 2020, 1:04 AM · Restricted Project
nlopes added a comment to D90708: [LangRef] Clarify GEP inbounds wrapping semantics.

LGTM modulo the two comments. Thanks for writing this down!

Nov 4 2020, 1:03 AM · Restricted Project

Nov 3 2020

nlopes added a comment to D90637: [ValueTracking] Inbounds does not imply nsw.

Hm, I think we need to clarify this in LangRef. We definitely assume this interpretation (unsigned base and signed offset) in some places (e.g. https://github.com/llvm/llvm-project/blob/c938b4a1ed43f3075155e16a7c2792ca8c122258/llvm/lib/Analysis/ScalarEvolution.cpp#L5061-L5072 and I'm pretty sure I've seen it elsewhere as well), but LangRef is really not clear on this point. It's also not completely obvious where the assumption that the pointer address space is unsigned comes from. E.g. on x86-64 the canonical address space is signed (but I don't know about other architectures). We need to clarify whether having an allocated object at [0xffffffff, 0x00000001] is legal (signed address space), [0x7fffffff, 0x80000001] is legal (unsigned address space) or both.
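
The two candidate objects can be checked against both interpretations with a short sketch. The 32-bit width follows the addresses in the comment; the helper names are hypothetical.

```python
BITS = 32

def as_signed(addr):
    """Reinterpret a BITS-wide unsigned address as a two's-complement value."""
    return addr - (1 << BITS) if addr >= 1 << (BITS - 1) else addr

def wraps_unsigned(first, last):
    """True if walking upward from first to last crosses 0xffffffff -> 0x00000000."""
    return last < first

def wraps_signed(first, last):
    """True if walking upward from first to last crosses 0x7fffffff -> 0x80000000."""
    return as_signed(last) < as_signed(first)

for first, last in [(0xFFFFFFFF, 0x00000001), (0x7FFFFFFF, 0x80000001)]:
    print(hex(first), hex(last),
          "wraps unsigned:", wraps_unsigned(first, last),
          "wraps signed:", wraps_signed(first, last))
```

Each object is contiguous under exactly one of the two interpretations, which is why LangRef would have to pick one (or explicitly allow both).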

Nov 3 2020, 5:31 AM · Restricted Project

Nov 2 2020

nlopes added a comment to D90637: [ValueTracking] Inbounds does not imply nsw.

FWIW, here's a related bug (fixed already): https://bugs.llvm.org/show_bug.cgi?id=42699

Nov 2 2020, 2:28 PM · Restricted Project

Oct 11 2020

nlopes added a comment to D88783: [InstCombine] matchFunnelShift - fold or(shl(a,x),lshr(b,sub(bw,x))) -> fshl(a,b,x) iff x < bw.

Alive2 complains about one of the test cases:

define i64 @fshr_sub_mask(i64 %x, i64 %y, i64 %a) {
  %mask = and i64 %a, 63
  %shr = lshr i64 %x, %mask
  %sub = sub nsw nuw i64 64, %mask
  %shl = shl i64 %y, %sub
  %r = or i64 %shl, %shr
  ret i64 %r
}
=>
define i64 @fshr_sub_mask(i64 %x, i64 %y, i64 %a) {
  %r = fshr i64 %x, i64 %y, i64 %a
  ret i64 %r
}
Transformation doesn't verify!
ERROR: Value mismatch
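
The comment does not show the counterexample Alive2 found. One possible reading, sketched below as an assumption rather than as Alive2's output, is that the source computes `(y << (64 - m)) | (x >> m)`, which matches `fshr(y, x, a)` and not `fshr(x, y, a)`. Poison is modelled as `None`, and the helper names are invented for the illustration.

```python
POISON = None              # modelling assumption: None stands in for poison
BW = 64
MASK = (1 << BW) - 1

def fshr(hi, lo, amt):
    """Funnel shift right: take the low BW bits of (hi:lo) >> (amt mod BW)."""
    s = amt % BW
    return ((hi << (BW - s)) | (lo >> s)) & MASK

def source(x, y, a):
    mask = a & 63
    shr = x >> mask
    sub = BW - mask
    if sub >= BW:          # shl by 64 is poison, so the or is poison as well
        return POISON
    shl = (y << sub) & MASK
    return shl | shr

def target(x, y, a):
    return fshr(x, y, a)

x, y, a = 1, 0, 1
print(hex(source(x, y, a)), hex(target(x, y, a)))
```

For these inputs the source is a defined value, so the target would have to return the same value, and in this model it does not. With the operands swapped, `fshr(y, x, a)` matches the source whenever the masked shift amount is non-zero.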
Oct 11 2020, 3:30 PM · Restricted Project
nlopes accepted D88979: [InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load.

LGTM

@nlopes does this look good to you?

Looking through other uses of isNoopCast(), I don't think it makes sense to push this change into it, as many other usages do need it to work with ptrtoint/inttoptr (some of them using it specifically for them). The comment above the function indicates that "no-op" is to be understood as "generates no code" here. Possibly it could do with a rename.

I think I don't agree with you there.
I agree with @nlopes, the end goal will be to basically disallow fusing of inttoptr/ptrtoint into loads,
disallow dropping inttoptr-of-ptrtoint/ptrtoint-of-inttoptr, etc.
And all that eventually boils down to updating CastInst::isNoopCast()/CastInst::isEliminableCastPair().

Oct 11 2020, 10:03 AM · Restricted Project, Restricted Project
nlopes added inline comments to D88979: [InstCombine] combineLoadToOperationType(): don't fold int<->ptr cast into load.
Oct 11 2020, 8:16 AM · Restricted Project, Restricted Project
nlopes updated subscribers of D88995: Support vectors in CastInst::isBitOrNoopPointerCastable.
Oct 11 2020, 7:46 AM · Restricted Project
nlopes added inline comments to D88995: Support vectors in CastInst::isBitOrNoopPointerCastable.
Oct 11 2020, 7:46 AM · Restricted Project

Oct 6 2020

nlopes added a comment to D88788: [SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try getTypePartition() before falling back to largest integral use type (PR47592).

I guess if we're relying on the allocated type of the alloca anyway, preferring it over an integer type isn't terrible.

Really, though, we should avoid relying on the allocated type where possible. Here, we could check if any of the load/store operations use a pointer type, and choose a pointer type in that case.

Agreed. But until LLVM removes pointer sub-types it's convenient to get the alloca type right to avoid bitcast on every access anyway.
When pointer sub-types go away, I guess all this code in SROA to find the right type for alloca would go away, but as you say it would have to be replaced with code to get the right load/store type instead. (FWIW Alive2's alloca only takes the number of bytes to allocate as argument)
So I see this patch as a step in the right direction.

So that I don't do this blindly only to find out that it's a bad idea, can we agree on the baseline here?
How should this be done properly? Instead of relying on the allocation type, would the D88842's approach be applicable here?

Oct 6 2020, 3:52 AM · Restricted Project
nlopes added a comment to D88806: [SCEV] Model ptrtoint(SCEVUnknown) cast not as unknown, but as zext/trunc/self of SCEVUnknown.

ptrtoint and inttoptr are different beasts.
Supporting ptrtoint is much simpler. It's inttoptr that makes me uncomfortable. I don't know enough about SCEV to know how it handles these unknown nodes.
To me, the patch makes sense for ptrtoint (as you say, it's just a zext/sext of some unknown value; worst case it's poison if we want to be strict about OOB). Though I can't comment on the inttoptr case (some SCEV expert needs to chime in).

Oct 6 2020, 3:24 AM · Restricted Project
nlopes added a comment to D88788: [SROA] rewritePartition()/findCommonType(): if uses have conflicting type, try getTypePartition() before falling back to largest integral use type (PR47592).

I guess if we're relying on the allocated type of the alloca anyway, preferring it over an integer type isn't terrible.

Really, though, we should avoid relying on the allocated type where possible. Here, we could check if any of the load/store operations use a pointer type, and choose a pointer type in that case.

Oct 6 2020, 3:17 AM · Restricted Project
nlopes added a comment to D88860: [LangRef] Describe why the pointer aliasing rules are currently unsound..

My meta-comment about this patch is that I'm not sure LangRef is the right place for this content. I see LangRef as the stuff that is set in stone, not necessarily for ongoing discussions.
However, since LangRef doesn't get these bits right, it might be ok to have a warning section about stuff that is disputed/under discussion so that readers know that part is not set in stone.

Oct 6 2020, 3:05 AM · Restricted Project

Oct 5 2020

nlopes added a comment to D88842: [InstCombine] inttoptr(load) -> load.

I don't like the direction of this patch because it will remove inttoptr instructions that were present in the original program. In the same way that optimizations shouldn't introduce inttoptr, they shouldn't fuse them with loads either.
If the original program had the inttoptr cast and you fold it with a load instruction, then you are asking the load to do the cast. This sort of type punning is not ok. I strongly disagree with Chandler that LLVM's memory is not typed. It needs to distinguish between integers and pointers, otherwise I don't know how to make LLVM correct without a significant perf penalty.
If LLVM's memory were untyped, we would have to assume that every pointer load could be doing an implicit inttoptr cast, which would wreak havoc on most optimizations. The fact that some optimizations, like SROA, assume that the memory is untyped and most of the others don't is a source of miscompilations.

Oct 5 2020, 12:24 PM · Restricted Project

Oct 3 2020

nlopes accepted D88789: [InstCombine] Revert rL226781 "Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available." (PR47592).

Love it, thanks!
This gets rid of a lot of type punning issues through load/store of integers. Not introducing inttoptr during optimization is a very healthy goal.

Oct 3 2020, 2:58 PM · Restricted Project, Restricted Project

Sep 19 2020

nlopes added inline comments to D87965: [InstCombine] replace phi values from unreachable blocks with 'undef'.
Sep 19 2020, 8:43 AM · Restricted Project

Sep 10 2020

nlopes added a comment to D87149: [InstCombine] erase instructions leading up to unreachable.

Ok, let me make it more concrete.
It seems we have 3 possible semantics:

  1. volatile accesses never trap, but rather trigger UB when the address is not dereferenceable
  2. they trap if the address is not dereferenceable
  3. they may trap regardless (i.e., they can never be removed). Alternatively we can state that the load/store address traces are externally observable and can't change
Sep 10 2020, 4:35 AM · Restricted Project
nlopes added a comment to D87149: [InstCombine] erase instructions leading up to unreachable.

I would say let's write an RFC and see if there are other opinions. Also, @nlopes what does alive2 think of such a proposal?

Sep 10 2020, 4:27 AM · Restricted Project

Sep 6 2020

nlopes added a comment to D86815: [LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments..

FYI: I've just run Alive2 (already patched for this new semantics) on the test suite and no regressions reported.

Sep 6 2020, 11:52 AM · Restricted Project

Sep 1 2020

nlopes added a comment to D86815: [LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments..

Why would we change this? What's the point of having separate memcpy and memmove intrinsics?

I'm reading this as clang mis-uses llvm.memcpy when it probably should be using llvm.memmove

There is a longstanding assumption made by ~every compiler that memcpy(p, p, n) is safe. That's what we should be encoding here. We should not be removing all overlap restrictions.
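
The rule argued for here (identical pointers are fine, any other overlap is not) can be sketched as a predicate over byte ranges. This models the comment's position only; it is not the LangRef wording, and the function name is hypothetical.

```python
def memcpy_args_allowed(dst, src, n):
    """True if the n-byte ranges at dst and src are either identical or disjoint."""
    if dst == src:
        return True                          # memcpy(p, p, n) is assumed safe
    return dst + n <= src or src + n <= dst  # no partial overlap

print(memcpy_args_allowed(100, 100, 16))   # same pointer
print(memcpy_args_allowed(100, 116, 16))   # adjacent, disjoint ranges
print(memcpy_args_allowed(100, 108, 16))   # partial overlap: memmove territory
```

Partial overlap stays disallowed in this model, which keeps a separate memmove meaningful.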

Sep 1 2020, 3:26 AM · Restricted Project