Page MenuHomePhabricator

nlopes (Nuno Lopes)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 3 2013, 11:31 AM (495 w, 1 d)

Recent Activity

Sat, Mar 18

nlopes added a comment to D143287: [Clang][X86] Change X86 cast intrinsics to use __builtin_nondeterministic_value.

Implementing the 128 to 512 casts by filling the rest of the vector with the same definition of a nondeterministic_value is not correct because :

a = freeze poison
v = <a, a>

is not the same as

v = freeze poison

The only solution I'm seeing ,using the shufflevector, is doing the conversion in two steps:

  • build a 256 vector with the upper half being undefined( freeze poison)
  • build a 512 vector where the lower half is the previous 256 vector and the upper half being undefined

I think this would require two shuffles which is unfortunate.

This would ensure no miscompilations due to multiple uses of the same freeze undef/poison but would probably require some backend work to ensure the pattern is recognized to emit efficient assembly.

Sat, Mar 18, 2:58 AM · Restricted Project, Restricted Project

Mon, Mar 13

nlopes accepted D145920: [IR] Allow !range on vector of integer instructions.

LGTM

Mon, Mar 13, 4:25 AM · Restricted Project, Restricted Project

Fri, Mar 10

nlopes added a comment to D144903: [X86] Drop single use check for freeze(undef) in LowerAVXCONCAT_VECTORS.

I think the point of the hasOneUse check is to avoid a possible miscompile; if a FREEZE has more than one use, all users need to see the same value. So not sure dropping the check is correct in general.

Fri, Mar 10, 11:14 AM · Restricted Project, Restricted Project, Restricted Project

Thu, Mar 9

nlopes added a comment to D145632: [test] Remove occurences of br undef in CodeGen/PowerPC tests.

tests are failing. please run all the tests before submitting patches.

I checked these files with ninja check-llvm and all tests passed, but some fail here. Is there other checks I can perform before submitting a patch?

Thu, Mar 9, 4:39 AM · Restricted Project, Restricted Project
nlopes requested changes to D145630: [test] Remove occurences of br undef in CodeGen/Hexagon tests.

tests failing

Thu, Mar 9, 12:06 AM · Restricted Project, Restricted Project
nlopes requested changes to D145622: [test] Remove occurences of br undef in CodeGen/AMDGPU tests.

tests failing

Thu, Mar 9, 12:05 AM · Restricted Project, Restricted Project
nlopes requested changes to D145627: [test] Remove occurences of br undef in CodeGen/ARM and CodeGen/AVR tests.

tests failing

Thu, Mar 9, 12:05 AM · Restricted Project, Restricted Project
nlopes requested changes to D145632: [test] Remove occurences of br undef in CodeGen/PowerPC tests.

tests are failing. please run all the tests before submitting patches.

Thu, Mar 9, 12:05 AM · Restricted Project, Restricted Project

Sun, Mar 5

nlopes added inline comments to D145331: [test] Remove occurences of br undef in test/Analysis tests.
Sun, Mar 5, 1:56 PM · Restricted Project, Restricted Project
nlopes added inline comments to D145331: [test] Remove occurences of br undef in test/Analysis tests.
Sun, Mar 5, 1:54 PM · Restricted Project, Restricted Project

Mar 2 2023

nlopes added inline comments to D145163: Add support for vectorization of interleaved memory accesses for scalable VF.
Mar 2 2023, 8:09 AM · Restricted Project, Restricted Project

Feb 22 2023

nlopes added a comment to D144590: [Clang][OpenMP] Fix shared memory allocation on AMDGPU.

Please use poison values as place holders instead of undef values as we're trying to get rid of Undef. Thank you!

Feb 22 2023, 1:56 PM · Restricted Project, Restricted Project

Feb 15 2023

nlopes accepted D144106: [InstSimplify] Fix poison safety in insertvalue fold.

This is great, thank you!

Feb 15 2023, 7:55 AM · Restricted Project, Restricted Project

Feb 12 2023

nlopes committed rGfebe740b2c8f: [test] Remove occurences of br undef in Transform/Util tests [NFC] (authored by kritgpt).
[test] Remove occurences of br undef in Transform/Util tests [NFC]
Feb 12 2023, 2:14 AM · Restricted Project, Restricted Project
nlopes closed D143770: [test] Remove occurences of br undef in Transform/Util tests.
Feb 12 2023, 2:14 AM · Restricted Project, Restricted Project

Feb 10 2023

nlopes accepted D143770: [test] Remove occurences of br undef in Transform/Util tests.
Feb 10 2023, 1:27 PM · Restricted Project, Restricted Project

Feb 5 2023

nlopes committed rW93e900b74f16: add gsoc23 project (authored by nlopes).
add gsoc23 project
Feb 5 2023, 10:17 AM · Restricted Project

Jan 20 2023

nlopes added a comment to D141386: [LangRef] Make !range, !nonnull and !align return poison instead of IUB.

Three failures:

  • Transforms/InstCombine/loadstore-metadata.ll
  • Transforms/Mem2Reg/preserve-nonnull-load-metadata.ll
  • Transforms/SROA/preserve-nonnull.ll

The last two should be fixed by https://github.com/llvm/llvm-project/commit/e6241cbdcbf3cc9beb49460578466e18936ef220.

Jan 20 2023, 7:55 AM · Restricted Project, Restricted Project

Jan 19 2023

nlopes accepted D142115: [ValueTracking] Take poison-generating metadata into account (PR59888).
Jan 19 2023, 10:32 AM · Restricted Project, Restricted Project
nlopes added a comment to D142115: [ValueTracking] Take poison-generating metadata into account (PR59888).

LGTM, except for the align metadata.

Jan 19 2023, 8:35 AM · Restricted Project, Restricted Project

Jan 16 2023

nlopes added a comment to D138958: [clang] Better UX for Clang’s unwind-affecting attributes.

The issue is the function call, not the stores.
Stores are valid as long as they are to local memory (stack or allocated by the function).

Jan 16 2023, 4:02 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Jan 12 2023

nlopes added a comment to D141386: [LangRef] Make !range, !nonnull and !align return poison instead of IUB.

Three failures:

  • Transforms/InstCombine/loadstore-metadata.ll
  • Transforms/Mem2Reg/preserve-nonnull-load-metadata.ll
  • Transforms/SROA/preserve-nonnull.ll
Jan 12 2023, 1:54 PM · Restricted Project, Restricted Project
nlopes added a comment to D141386: [LangRef] Make !range, !nonnull and !align return poison instead of IUB.

Sounds good, thanks!
I implemented this in Alive2 and I got zero failures in LLVM's test suite. Either it doesn't have many !ranges or we are good.

Shouldn't this test have failed? https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/Mem2Reg/preserve-nonnull-load-metadata.ll Or did you only adjust !range but not !nonnull?

Jan 12 2023, 9:19 AM · Restricted Project, Restricted Project
nlopes added a comment to D141386: [LangRef] Make !range, !nonnull and !align return poison instead of IUB.

Sounds good, thanks!
I implemented this in Alive2 and I got zero failures in LLVM's test suite. Either it doesn't have many !ranges or we are good.

Shouldn't this test have failed? https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/Mem2Reg/preserve-nonnull-load-metadata.ll Or did you only adjust !range but not !nonnull?

Jan 12 2023, 6:41 AM · Restricted Project, Restricted Project

Jan 11 2023

nlopes accepted D141386: [LangRef] Make !range, !nonnull and !align return poison instead of IUB.

Sounds good, thanks!
I implemented this in Alive2 and I got zero failures in LLVM's test suite. Either it doesn't have many !ranges or we are good.

Jan 11 2023, 10:11 AM · Restricted Project, Restricted Project
nlopes accepted D141494: [Clang] Emit noundef metadata next to range metadata.

LGTM

Jan 11 2023, 8:42 AM · Restricted Project, Restricted Project

Jan 9 2023

nlopes added a comment to D141251: Replace un-demanded values with undef in BDCE..

I must say that this patch is a bit annoying :)
We are trying to remove undef, and you're adding yet another use.
It's true we can't use poison here. The alternative in the no-undef world is 'freeze poison'. Or just leave the zero there.
Is this change important? If not, I would prefer to not do it, as we'll have to revert it when removing undef (hopefully later this year).
Thanks!

Jan 9 2023, 3:07 AM · Restricted Project, Restricted Project

Jan 4 2023

nlopes accepted D140991: [DebugInfo] Replace UndefValue with PoisonValue in DIArgList::handleChangedOperand.

LGTM, thanks!

Jan 4 2023, 8:48 AM · Restricted Project, Restricted Project, debug-info

Jan 3 2023

nlopes accepted D140905: [DebugInfo] Replace UndefValue with PoisonValue in setKillLocation.

LGTM!

Jan 3 2023, 2:15 PM · Restricted Project, Restricted Project, debug-info

Dec 23 2022

nlopes added a comment to D134410: [clang][CodeGen] Add noundef metadata to load instructions (preliminary 1 or 5).

The other problem I see here is that we still need to do something about the memcpy -> load/store fold. As soon as we have poison from uninit values (either directly or via !uninit_is_poison) this will start causing miscompiles very quickly. The only way to do this right now is again with an <N x i8> vector load/store, which still optimizes terribly. This needs either load+store of integer to preserve poison, or again some form of byte type.

Something I've only recently realized is that we also run into this problem when inserting spurious load/store pairs, e.g. as done by LICM scalar promotion. If we're promoting say an i32 value to scalar that previously used a conditional store, then promotion will now introduce an unconditional load and store, which will spread poison from individual bytes. So that means that technically scalar promotion (and SimplifyCFG store speculation) also need to do some special dance to preserve poison. And without preservation of poison across load/store/phi in plain ints, this is going to be a bad optimization outcome either way: We'd have to use either a vector of i8 or a byte type for the inserted phi nodes and only cast to integer and back when manipulating the value, which would probably defeat the optimization. At that point probably best to freeze the first load (which again needs a freezing load).

Dec 23 2022, 10:32 AM · Restricted Project, Restricted Project

Dec 22 2022

nlopes added inline comments to D140546: [AMDGPU] Remove permlane discard vdst_in optimization from isel.
Dec 22 2022, 6:59 AM · Restricted Project, Restricted Project
nlopes accepted D140538: [Clang][CodeGen] Use poison instead of undef for dummy values in CGExpr [NFC].

LGTM, thank you!

Dec 22 2022, 2:51 AM · Restricted Project, Restricted Project, Restricted Project

Dec 20 2022

nlopes added inline comments to D140404: Patch for dbg variable instrinsics to point towards cloned values in JumpThreading.
Dec 20 2022, 10:48 AM · Restricted Project, Restricted Project

Dec 16 2022

nlopes accepted D131628: [LangRef] Add description for nocallback attribute.

@nlopes With D137360 up, do you still have concerns about this specification?

Dec 16 2022, 3:22 AM · Restricted Project, Restricted Project

Dec 14 2022

nlopes added a comment to rG4446f71ce392: [InstCombine] try to fold a pair of insertelements into one insertelement.

This because once you bitcast the vector to have fewer elements, poison propagates to neighbors. I think we can't do this part of the optimization.

Interesting - so we can't ever bitcast an arbitrary vector to fewer (larger) elements. But the tests where we are inserting into a full undef/poison base vector are still ok? (because that guarantees there's no potential to spill poison over to an element that wasn't already poison)

Dec 14 2022, 1:42 PM · Restricted Project, Restricted Project
nlopes added a comment to D138766: [InstCombine] If loading from small alloca, load whole alloca and perform variable extraction.

Thank you for looking into it!

FWIW, I've discovered today that GVN does a similar optimization (but without the freeze..).
See here (scroll to the bottom): https://web.ist.utl.pt/nuno.lopes/alive2/index.php?hash=aed14c64378404c9&test=Transforms%2FPhaseOrdering%2FX86%2Fvec-load-combine.ll

That seems to be with constant indexes, though?

True.
So it could use a simple extractelement rather than bit masking.

Define "could"? Define "simple"?
I've looked at alternative lowerings (shufflevector or chain of extractelement's),
and they all result in worse codegen. We can not use a single `extractelement,
because the byte offset may not be a multiple of the element size.
The shift is the optimal lowering here, any alternative chosen lowering
would need to be canonicalized into it, and and which point why bother?

Dec 14 2022, 1:38 PM · Restricted Project, Restricted Project
nlopes added a comment to D138766: [InstCombine] If loading from small alloca, load whole alloca and perform variable extraction.

FWIW, I've discovered today that GVN does a similar optimization (but without the freeze..).
See here (scroll to the bottom): https://web.ist.utl.pt/nuno.lopes/alive2/index.php?hash=aed14c64378404c9&test=Transforms%2FPhaseOrdering%2FX86%2Fvec-load-combine.ll

That seems to be with constant indexes, though?

Dec 14 2022, 1:06 PM · Restricted Project, Restricted Project
nlopes raised a concern with rG4446f71ce392: [InstCombine] try to fold a pair of insertelements into one insertelement.

Alive complains about this test case:

define <8 x i16> @insert_10_v8i16(i32 %x, <8 x i16> %v) {
%0:
  %hi32 = lshr i32 %x, 16
  %hi16 = trunc i32 %hi32 to i16
  %lo16 = trunc i32 %x to i16
  %ins0 = insertelement <8 x i16> %v, i16 %lo16, i64 1
  %ins1 = insertelement <8 x i16> %ins0, i16 %hi16, i64 0
  ret <8 x i16> %ins1
}
=>
define <8 x i16> @insert_10_v8i16(i32 %x, <8 x i16> %v) {
%0:
  %1 = bitcast <8 x i16> %v to <4 x i32>
  %2 = insertelement <4 x i32> %1, i32 %x, i64 0
  %ins1 = bitcast <4 x i32> %2 to <8 x i16>
  ret <8 x i16> %ins1
}
Transformation doesn't verify! (unsound)
ERROR: Target is more poisonous than source
Dec 14 2022, 11:55 AM · Restricted Project, Restricted Project
nlopes added a comment to D138766: [InstCombine] If loading from small alloca, load whole alloca and perform variable extraction.

FWIW, I've discovered today that GVN does a similar optimization (but without the freeze..).
See here (scroll to the bottom): https://web.ist.utl.pt/nuno.lopes/alive2/index.php?hash=aed14c64378404c9&test=Transforms%2FPhaseOrdering%2FX86%2Fvec-load-combine.ll

Dec 14 2022, 11:40 AM · Restricted Project, Restricted Project

Dec 11 2022

nlopes committed rG45a892d0124a: Use poison instead of undef where its used as a placeholder [NFC] (authored by ManuelJBrito).
Use poison instead of undef where its used as a placeholder [NFC]
Dec 11 2022, 9:18 AM · Restricted Project, Restricted Project
nlopes closed D139789: Use poison instead of undef where its used as a placeholder [NFC].
Dec 11 2022, 9:18 AM · Restricted Project, Restricted Project
nlopes accepted D139789: Use poison instead of undef where its used as a placeholder [NFC].

LGTM

Dec 11 2022, 9:16 AM · Restricted Project, Restricted Project
nlopes added a comment to D139785: [InstCombine] preserve signbit semantics of NAN with fold to fabs.

The actual issue is that X > 0.0 is false for negative NaNs

Dec 11 2022, 9:10 AM · Restricted Project, Restricted Project
nlopes committed rG2482dbff461e: [Clang] Use poison instead of undef where its used as placeholder [NFC] (authored by ManuelJBrito).
[Clang] Use poison instead of undef where its used as placeholder [NFC]
Dec 11 2022, 8:18 AM · Restricted Project, Restricted Project
nlopes closed D139745: [Clang]Use poison instead of undef where its used as placeholder[NFC].
Dec 11 2022, 8:18 AM · Restricted Project, Restricted Project, Restricted Project
nlopes accepted D139745: [Clang]Use poison instead of undef where its used as placeholder[NFC].

LGTM

Dec 11 2022, 8:14 AM · Restricted Project, Restricted Project, Restricted Project

Dec 6 2022

nlopes committed rG481170cb5511: [Clang][CodeGen] Use poison instead of undef for extra argument in… (authored by ManuelJBrito).
[Clang][CodeGen] Use poison instead of undef for extra argument in…
Dec 6 2022, 4:41 AM · Restricted Project, Restricted Project
nlopes closed D138755: [Clang][CodeGen] Use poison instead of undef for extra argument in __builtin_amdgcn_mov_dpp [NFC].
Dec 6 2022, 4:41 AM · Restricted Project, Restricted Project, Restricted Project

Dec 5 2022

nlopes accepted D139349: LangRef: Clarify semantics of lround/llround and lrint/llrint.

LGTM

Dec 5 2022, 11:49 AM · Restricted Project, Restricted Project
nlopes added a comment to D139349: LangRef: Clarify semantics of lround/llround and lrint/llrint.

Thanks.
unspecified is not a term used in LLVM IR semantics. I would write, for example, "a non-deterministic value (equivalent to freeze poison)".

Dec 5 2022, 11:03 AM · Restricted Project, Restricted Project
nlopes added inline comments to D139314: ValueTracking: Teach canCreateUndefOrPoison about FP ops.
Dec 5 2022, 9:25 AM · Restricted Project, Restricted Project
nlopes added a comment to D139312: ValueTracking: Teach CannotBeOrderedLessThanZero about copysign.

Seems reasonable. Can we get alive2 proofs for floating point things like this yet @nlopes?

Dec 5 2022, 6:07 AM · Restricted Project, Restricted Project
nlopes accepted D139313: ValueTracking: Teach canCreateUndefOrPoison about more intrinsics.

LGTM

Dec 5 2022, 6:02 AM · Restricted Project, Restricted Project
nlopes committed rG553bdf4fde5a: [NFC][clang] Strengthen checks in matrix-type-operators.c (authored by jmciver).
[NFC][clang] Strengthen checks in matrix-type-operators.c
Dec 5 2022, 2:14 AM · Restricted Project, Restricted Project
nlopes committed rG64428c0ddf55: [NFC][clang] Strengthen checks in matrix-type-operators.cpp (authored by jmciver).
[NFC][clang] Strengthen checks in matrix-type-operators.cpp
Dec 5 2022, 2:12 AM · Restricted Project, Restricted Project

Dec 4 2022

nlopes committed rGee13633c46cd: [NFC][clang] Strengthen checks in avx512fp16-builtins.c (authored by jmciver).
[NFC][clang] Strengthen checks in avx512fp16-builtins.c
Dec 4 2022, 6:58 AM · Restricted Project, Restricted Project
nlopes committed rG238948843773: [NFC][clang] Strengthen checks in avx512f-builtins.c (authored by jmciver).
[NFC][clang] Strengthen checks in avx512f-builtins.c
Dec 4 2022, 6:56 AM · Restricted Project, Restricted Project

Dec 1 2022

nlopes added inline comments to D139074: Vectorization Of Conditional Statements Using BOSCC.
Dec 1 2022, 12:31 AM · Restricted Project, Restricted Project

Nov 30 2022

nlopes added inline comments to D137811: InstCombine: Perform basic isnan combines on llvm.is.fpclass.
Nov 30 2022, 7:36 AM · Restricted Project, Restricted Project

Nov 24 2022

nlopes committed rGf408635b26b4: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC] (authored by ManuelJBrito).
[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]
Nov 24 2022, 12:43 AM · Restricted Project, Restricted Project
nlopes closed D138483: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC].
Nov 24 2022, 12:42 AM · Restricted Project, Restricted Project

Nov 22 2022

nlopes added a reverting change for rGf50423c1a442: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]: rGb50e1bd60520: Revert "[CodeGen] Use poison instead of undef as placeholder in….
Nov 22 2022, 4:42 AM · Restricted Project, Restricted Project
nlopes committed rGb50e1bd60520: Revert "[CodeGen] Use poison instead of undef as placeholder in… (authored by nlopes).
Revert "[CodeGen] Use poison instead of undef as placeholder in…
Nov 22 2022, 4:42 AM · Restricted Project, Restricted Project
nlopes added a reverting change for D138483: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]: rGb50e1bd60520: Revert "[CodeGen] Use poison instead of undef as placeholder in….
Nov 22 2022, 4:41 AM · Restricted Project, Restricted Project
nlopes added a comment to D138483: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC].

Manuel: there are quite a few test cases failing in the build bots: https://lab.llvm.org/buildbot/#/builders/16/builds/38352

Nov 22 2022, 3:48 AM · Restricted Project, Restricted Project
nlopes committed rGf50423c1a442: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC] (authored by ManuelJBrito).
[CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC]
Nov 22 2022, 3:40 AM · Restricted Project, Restricted Project
nlopes closed D138483: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC].
Nov 22 2022, 3:40 AM · Restricted Project, Restricted Project
nlopes accepted D138483: [CodeGen] Use poison instead of undef as placeholder in AtomicExpandPass [NFC].
Nov 22 2022, 3:35 AM · Restricted Project, Restricted Project

Nov 21 2022

nlopes committed rG1e55d5b1f252: Use poison instead of undef as placeholder for vector construction [NFC] (authored by ManuelJBrito).
Use poison instead of undef as placeholder for vector construction [NFC]
Nov 21 2022, 10:43 AM · Restricted Project, Restricted Project
nlopes closed D138450: Use poison instead of undef as placeholder for vector construction [NFC].
Nov 21 2022, 10:43 AM · Restricted Project, Restricted Project
nlopes accepted D138450: Use poison instead of undef as placeholder for vector construction [NFC].

LGTM

Nov 21 2022, 10:37 AM · Restricted Project, Restricted Project

Nov 17 2022

nlopes added inline comments to D138203: [AArch64][InstCombine] Simplify repeated complex patterns in dupqlane.
Nov 17 2022, 5:42 AM · Restricted Project, Restricted Project
nlopes added inline comments to D119013: [ArgPromotion][AMDGPU] New MSSA-based function argument promotion pass with input/output argument support.
Nov 17 2022, 12:21 AM · Restricted Project, Restricted Project

Nov 16 2022

nlopes added inline comments to D138141: [amdgpu] Reimplement LDS lowering.
Nov 16 2022, 1:33 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project
nlopes added a comment to D136320: [Assignment Tracking Analysis][1/*] Add analysis pass core.

Please avoid using UndefValue::get as much as possible as we are trying to get rid of undef. Please use PoisonValue whenever possible.
Thank you!

Hi @nlopes. This patch is part of the series in which we agreed this could be addressed after landing the stack: https://reviews.llvm.org/D133293#inline-1286892. That is the next item on my TODO list and I plan to make the change for all debug info modes (rather than just this new one). Is that still okay with you?

Nov 16 2022, 4:10 AM · Restricted Project, Restricted Project, debug-info

Nov 2 2022

nlopes added a comment to D131628: [LangRef] Add description for nocallback attribute.

@nlopes, can we create a separate issue for finding the right semantics, and move forward by making this change describe as accurately as possible the current state of affairs?
Given that the original patch has been in LLVM for a long time, spanning many versions, it seems like we'd want to rectify the undocumentedness of the attribute ASAP, then work towards changing its semantics afterwards.

Nov 2 2022, 4:12 AM · Restricted Project, Restricted Project

Nov 1 2022

nlopes added a comment to D131628: [LangRef] Add description for nocallback attribute.

So, dropping nocallback during linking seems the way to go (and adjust the docs accordingly). Once you link, you can do inter-proc analysis and recover that info if needed anyway.

This would make Clang the only implementation of __attribute__((leaf)) to do so.
It's also not generally possible to recover the nocallback information about functions external to the LTO unit, which is the only thing __attribute__((leaf)) is actually good for in full LTO.

Nov 1 2022, 10:50 AM · Restricted Project, Restricted Project
nlopes added a comment to D134410: [clang][CodeGen] Add noundef metadata to load instructions (preliminary 1 or 5).

We wanted this patch to make us switch uninitialized loads to poison at will, since they become UB. In practice, this helps us fixing bugs in SROA and etc without perf degradation.

Can you elaborate on this? I don't see how this is necessary for switching uninitialized loads to poison.

It's not mandatory, it's a simple way of achieving it as !noundef already exists.

We cannot change the default behavior of load as it would break BC.

FWIW, I don't agree with this. It's fine to change load semantics, as long as bitcode autoupgrade is handled correctly. I'd go even further and say that at least long term, any solution that does not have a plain load instruction return poison for uninitialized memory will be a maintenance mess.

I think the core problem that prevents us from moving in this direction is that we have no way to represent a frozen load at all (as freeze(load) will propagate poison before freezing). If I understand correctly, you're trying to solve this by making *all* loads frozen loads, and use load !noundef to allow dropping the freeze. I think this would be a very bad outcome, especially as we're going to lose that !noundef on most load speculations, and I don't think we want to be introducing freezes when speculating loads (e.g. during LICM).

I expect that the path of least resistance is going to be to support bitwise poison for load/store/phi/select and promote it to full-value poison on any other operation, allowing a freezing load to be expressed as freeze(load).

Please let me know if I completely misunderstood the scheme you have in mind...

Nov 1 2022, 2:54 AM · Restricted Project, Restricted Project
nlopes added a comment to D131628: [LangRef] Add description for nocallback attribute.

How should we proceed on this? AFAICT, we did not get a clear answer about the semantics of leaf attribute for LTO from LTO semantics for __attribute__((leaf)).
Would that be ok to drop nocalback attribute during linking? There is an LLVM bug now (Missing LangRef documentation for nocallback), and leaf attribute support will be reverted if we cannot come up with the description.

Nov 1 2022, 2:40 AM · Restricted Project, Restricted Project

Oct 30 2022

nlopes added a comment to D134410: [clang][CodeGen] Add noundef metadata to load instructions (preliminary 1 or 5).

We wanted this patch to make us switch uninitialized loads to poison at will, since they become UB. In practice, this helps us fixing bugs in SROA and etc without perf degradation.

Can you elaborate on this? I don't see how this is necessary for switching uninitialized loads to poison.

Oct 30 2022, 3:14 PM · Restricted Project, Restricted Project
nlopes added a comment to D63401: SROA: Simplify addrspacecasted allocas with volatile accesses.

It seems this patch contradicts the proposed LangRef patch. The LangRef patch says that you can't assume that a volatile operation yields the same result in different spaces.
And I think that's a fair condition. Otherwise we need to document that constraint for hardware vendors and change LangRef. Pick one of the patches basically :)

This isn't contradictory. This is leaving the address space untouched, and doing the other simplifications SROA does (like replace byte arrays with int). A non-volatile access in this case would have been rewritten to point to the original alloca address space

Oct 30 2022, 11:21 AM · Restricted Project, Restricted Project
nlopes added a comment to D134410: [clang][CodeGen] Add noundef metadata to load instructions (preliminary 1 or 5).

I think adding this under a default-disabled flag is fine for evaluation purposes, but I have doubts that we will ever be able to enable this by default. There is a lot of code out there assuming that copying uninitialized data around is fine.

Oct 30 2022, 11:06 AM · Restricted Project, Restricted Project
nlopes added a comment to D63401: SROA: Simplify addrspacecasted allocas with volatile accesses.

It seems this patch contradicts the proposed LangRef patch. The LangRef patch says that you can't assume that a volatile operation yields the same result in different spaces.
And I think that's a fair condition. Otherwise we need to document that constraint for hardware vendors and change LangRef. Pick one of the patches basically :)

Oct 30 2022, 9:29 AM · Restricted Project, Restricted Project
nlopes accepted D63525: LangRef: Attempt to formulate some rules for addrspacecast.

I like this latest writing. But please wait for another reviewer to make sure I didn't miss smth.

Oct 30 2022, 5:20 AM · Restricted Project, Restricted Project

Oct 29 2022

nlopes added inline comments to D136998: Try to implement lambdas with inalloca parameters by inlining the call operator function..
Oct 29 2022, 1:29 AM · Restricted Project, Restricted Project

Oct 28 2022

nlopes updated subscribers of D136882: ConstantFold: Fold out compare of addrspacecasted null with global.

I'm not completely convinced that this is correct. Note that icmp on pointers is a pure bitwise comparison, it is not affected by provenance. I think it's pretty clear that ptr null and ptr addrspace(3) @lds are distinct underlying objects, with distinct provenance, and that's all that matters to alias analysis. But it's less obvious to me that casting ptr null to ptr addrspace(3) may not produce a bit pattern that just so happens to match ptr addrspace(3) @lds.

@nlopes This silently fails to compile in the online version: https://alive2.llvm.org/ce/z/XC2RGM Locally it prints:

ERROR: Unsupported instruction:   %__constexpr_0 = icmp eq ptr addrspace(3) addrspacecast (ptr null to ptr addrspace(3)), @lds
ERROR: Unsupported instruction:   ret i1 icmp eq (ptr addrspace(3) addrspacecast (ptr null to ptr addrspace(3)), ptr addrspace(3) @lds)
ERROR: Could not translate 'src' to Alive IR

Would be great if these errors would show up online as well. Maybe the problem is that the error gets printed to stdout rather than stderr?

Oct 28 2022, 2:19 AM · Restricted Project, Restricted Project

Oct 26 2022

nlopes added inline comments to D63525: LangRef: Attempt to formulate some rules for addrspacecast.
Oct 26 2022, 2:29 PM · Restricted Project, Restricted Project

Oct 24 2022

nlopes added inline comments to D136583: use riscv vsetvl to tailfolding Co-authored-by: lidawei lidawei.1226@bytedance.com.
Oct 24 2022, 1:27 AM · Restricted Project, Restricted Project

Oct 21 2022

nlopes added a comment to D136284: SROA should freeze undefs for loads with no prior stores.

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

That discussion is orthogonal to this patch.
This patch is not desired because it's not needed per the current LLVM IR semantics. If you want to change something, you need to start by proposing a change to the LLVM IR semantics. You'll need to justify why it's needed, why it's correct, the perf impact, how to make it backwards compatible, why it's better than the proposals over the table right now.

Anyway, a patch like this solves no problem. LLVM allows loads to be duplicated. Your patch does nothing to prevent that and to ensure all loads see the same value. The issue is way more complicated than what this patch implies.

I'm not trying to flog a dead horse (I've already abandoned this) but I am trying to understand this statement. I do not dispute that there may be other situations similar to this but, assuming that we did want to ensure that the loads had the same value, why is freezing them at this point not the correct thing to do? Whether they are poison or undef, freezing them would ensure that they compare equal. Yes, I understand it may have performance impacts, there may be better ways, etc. But, ignoring all that, isn't this exactly what freeze is designed for?

Oct 21 2022, 8:58 AM · Restricted Project, Restricted Project, Restricted Project
nlopes added a comment to D136284: SROA should freeze undefs for loads with no prior stores.

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

Oct 21 2022, 7:33 AM · Restricted Project, Restricted Project, Restricted Project

Oct 20 2022

nlopes added a comment to D136320: [Assignment Tracking Analysis][1/*] Add analysis pass core.

Please avoid using UndefValue::get as much as possible as we are trying to get rid of undef. Please use PoisonValue whenever possible.
Thank you!

Oct 20 2022, 1:21 AM · Restricted Project, Restricted Project, debug-info

Oct 19 2022

nlopes requested changes to D136284: SROA should freeze undefs for loads with no prior stores.

However, multiple loads of the same memory must be equal

Oct 19 2022, 1:20 PM · Restricted Project, Restricted Project, Restricted Project
nlopes added inline comments to D135876: [InstCombine] Remove redundant splats in InstCombineVectorOps.
Oct 19 2022, 8:59 AM · Restricted Project, Restricted Project

Oct 18 2022

nlopes committed rG57ba07c06f8b: [clang] fix typo in unit test (authored by susmonteiro).
[clang] fix typo in unit test
Oct 18 2022, 8:32 AM · Restricted Project, Restricted Project
nlopes added a comment to D136097: InstSimplify: Fold fdiv nnan nsz x, 0 -> inf.

It looks like by 754 sec 7.3 a (-x)/(+0) == -Inf, but the nsz is documented as saying the sign of the result is "insignificant", so I think we're good here.

LGTM

"nsz" means the sign of a zero result is insignificant, not any result. -42.0 / 0.0 should be -Inf, not +Inf?

It's documented as ignoring the sign of a zero argument or a zero result.
Thus, when nsw is involved, we should consider the result of -42.0 / 0.0 and -42.0 / -0.0. This case the 0.0 is a constant, but that doesn't matter.

Alive2 is correct here AFAICT.

Ah, thanks for clearing that up. The "nsz" wording has always been a less solid than I'd like, but I'm not sure how to change it.
I suspect this transform would cause some fallout with users because it's a stretch to go from -0.0 isn't different than 0.0, therefore, -Inf isn't different than +Inf. There was a suggestion in a previous discussion that we really need entirely new FP types to use with fast-math-flags, so it's clear that -0.0, Inf, or NaN values simply do not exist when the corresponding compile flag is used, but I don't think anyone has looked seriously at implementing that. There's more discussion about the general problem here:
https://github.com/llvm/llvm-project/issues/51601

Oct 18 2022, 1:28 AM · Restricted Project, Restricted Project

Oct 17 2022

nlopes added a comment to D136097: InstSimplify: Fold fdiv nnan nsz x, 0 -> inf.

It looks like by 754 sec 7.3 a (-x)/(+0) == -Inf, but the nsz is documented as saying the sign of the result is "insignificant", so I think we're good here.

LGTM

"nsz" means the sign of a zero result is insignificant, not any result. -42.0 / 0.0 should be -Inf, not +Inf?

Oct 17 2022, 1:15 PM · Restricted Project, Restricted Project
nlopes added a comment to D136055: [ValueTracking] Make !range metadata imply noundef for load & call results.

This is correct as specified, but I think there were plans to change !range and !nonnull to return poison, and require an additional !noundef to make them immediate UB, which is the way it works for attributes. I'm not sure what the state of that is though.

Oct 17 2022, 3:36 AM · Restricted Project, Restricted Project

Oct 11 2022

nlopes added a comment to D135451: [TTI] New PPC target hook enableUncondDivisionSpeculation.

As I understand it integer divide by zero is considered undefined behaviour in C while the same may not be true in other languages. Furthermore presence of side effects may be dependent on other configurations including target hardware (eg default treatment on Power vs x86).

The LLVM IR rule is more driven by the behavior of the instructions on various targets. If a target only has a trapping divide, we'd need to wrap it in control flow to implement a non-trapping divide. And particularly for signed divide, that check isn't cheap. We tend to prefer poison where it makes sense (for example, out-of-bounds shifts).

Frontends can always use control flow to get whatever user-visible behavior they want. (For example, the Rust divide operator panics if you divide by zero.)

To allow more flexibility we could either leave the IR neutral and let the optimizer decide based on config info (eg. TTI)

We try to avoid making core IR semantics depend on TTI. Not that we can completely ignore target differences when writing IR optimizations, but we want to keep IR understandable without reference to target-specific semantics.

I mean, it would be self-consistent to write in LangRef something like "whether division by zero is undefined behavior, or defined to produce a poison value, depends on the current target/current subtarget/bitwith of the operation/current moon cycle". But I don't want to go there. If the rules are the same across all targets, it's easier to understand, and easier to implement tools like Alive2 to validate transforms.

I'm sorry but I don't quite follow the logic that tries to justify the current design. On the one hand the IR rules are driven by target requirements, and at the same time you avoid making IR semantics dependent on TTI (which is the proper way to query about target requirements). They seem like recipes for the exact problem we are dealing with here, ie there are target-specific assumptions baked into the IR that are not configurable.

Oct 11 2022, 8:27 AM · Restricted Project, Restricted Project

Oct 8 2022

nlopes added a comment to D135451: [TTI] New PPC target hook enableUncondDivisionSpeculation.

I would also prefer to not go this route.
Alternatives include using predicated builtins, stick to pre-headers, or introduce a new intrinsic that yields poison on division by zero and/or INT_MAX/-1.

Oct 8 2022, 1:37 AM · Restricted Project, Restricted Project

Oct 7 2022

nlopes added inline comments to D135091: Fix assert in generated `direct` property getter/setters due to removal of `_cmd` parameter..
Oct 7 2022, 2:13 PM · Restricted Project, Restricted Project