This is an archive of the discontinued LLVM Phabricator instance.

SROA should freeze undefs for loads with no prior stores
AbandonedPublic

Authored by jamieschmeiser on Oct 19 2022, 12:21 PM.

Details

Summary

SROA will replace loads of an alloca with undef when there are no
prior stores. However, multiple loads of the same memory must be
equal. Insert freeze instructions so that loads of the same alloca
with no prior store will compare correctly.

See new lit test /Transforms/SROA/same-promoted-undefs.ll for sample
IR that is fixed by this change.

Also fix up existing lit tests. @tstellar, please examine the AMD test changes.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptOct 19 2022, 12:21 PM
jamieschmeiser requested review of this revision.Oct 19 2022, 12:21 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptOct 19 2022, 12:21 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
nikic requested changes to this revision.Oct 19 2022, 12:42 PM
nikic added a reviewer: nlopes.

The current behavior here is intentional -- in fact, LLVM will move towards returning poison for loads from uninitialized memory in the future (though precisely how this will happen is still uncertain, see https://discourse.llvm.org/t/rfc-making-bit-field-codegen-poison-compatible/63250 for some recent discussion on the topic).

This revision now requires changes to proceed.Oct 19 2022, 12:42 PM
nlopes requested changes to this revision.Oct 19 2022, 1:20 PM

However, multiple loads of the same memory must be equal

This premise is incorrect w.r.t. with current semantics, which say that loads return undef for uninit memory.

As Nikita mentioned we are working towards changing this semantics. The work is ongoing, but we pivoted to make a couple of improvements in other areas to avoid perf regressions. For example, see the clang !noundef patches.

However, multiple loads of the same memory must be equal

This premise is incorrect w.r.t. with current semantics, which say that loads return undef for uninit memory.

As Nikita mentioned we are working towards changing this semantics. The work is ongoing, but we pivoted to make a couple of improvements in other areas to avoid perf regressions. For example, see the clang !noundef patches.

This is a C/C++ language semantics statement. Yes, I realize that LLVM is not language specific, but this is what is generated for this, and also, what freeze appears specifically designed to handle.

The current behavior here is intentional -- in fact, LLVM will move towards returning poison for loads from uninitialized memory in the future (though precisely how this will happen is still uncertain, see https://discourse.llvm.org/t/rfc-making-bit-field-codegen-poison-compatible/63250 for some recent discussion on the topic).

The current behaviour implies that the content of uninitialized memory is volatile, which is not correct. One would still need the freeze to ensure that 2 loads of the same uninitialized memory are the same, whether this is undef or poison. So this would not affect that future change and would still be required. Besides, are you sure that poison is appropriate here? Loading uninitialized memory is not erroneous; it is undefined. Comparing the same uninitialized memory, however, is defined, hence the freeze.

nikic added a comment.Oct 21 2022, 6:55 AM

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

That discussion is orthogonal to this patch.
This patch is not desired because it's not needed per the current LLVM IR semantics. If you want to change something, you need to start by proposing a change to the LLVM IR semantics. You'll need to justify why it's needed, why it's correct, the perf impact, how to make it backwards compatible, why it's better than the proposals over the table right now.

Anyway, a patch like this solves no problem. LLVM allows loads to be duplicated. Your patch does nothing to prevent that and to ensure all loads see the same value. The issue is way more complicated than what this patch implies.

jamieschmeiser abandoned this revision.Oct 21 2022, 8:10 AM

I checked with a member of the C++ standards committee and he verified that comparing an uninitialized value against itself is, indeed, undefined behaviour, in the general case. I am abandoning this revision.

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

That discussion is orthogonal to this patch.
This patch is not desired because it's not needed per the current LLVM IR semantics. If you want to change something, you need to start by proposing a change to the LLVM IR semantics. You'll need to justify why it's needed, why it's correct, the perf impact, how to make it backwards compatible, why it's better than the proposals over the table right now.

Anyway, a patch like this solves no problem. LLVM allows loads to be duplicated. Your patch does nothing to prevent that and to ensure all loads see the same value. The issue is way more complicated than what this patch implies.

I'm not trying to flog a dead horse (I've already abandoned this) but I am trying to understand this statement. I do not dispute that there may be other situations similar to this but, assuming that we did want to ensure that the loads had the same value, why is freezing them at this point not the correct thing to do? Whether they are poison or undef, freezing them would ensure that they compare equal. Yes, I understand it may have performance impacts, there may be better ways, etc. But, ignoring all that, isn't this exactly what freeze is designed for?

At least in C++, working with uninitialized memory is pretty much always immediate undefined behavior, see https://eel.is/c++draft/basic.indet for the relevant wording. The only exception are "copy-like" operations on unsigned character types, which comparisons do not fall under.

I believe the C specification is less clear cut about this, but Clang and LLVM assume basically the same to also hold for C code.

What version of the C++ standard is this? Every version that I have seen has basics as section 3 and I cannot find this section, nor anything similar. Section 6 is Statements.

That discussion is orthogonal to this patch.
This patch is not desired because it's not needed per the current LLVM IR semantics. If you want to change something, you need to start by proposing a change to the LLVM IR semantics. You'll need to justify why it's needed, why it's correct, the perf impact, how to make it backwards compatible, why it's better than the proposals over the table right now.

Anyway, a patch like this solves no problem. LLVM allows loads to be duplicated. Your patch does nothing to prevent that and to ensure all loads see the same value. The issue is way more complicated than what this patch implies.

I'm not trying to flog a dead horse (I've already abandoned this) but I am trying to understand this statement. I do not dispute that there may be other situations similar to this but, assuming that we did want to ensure that the loads had the same value, why is freezing them at this point not the correct thing to do? Whether they are poison or undef, freezing them would ensure that they compare equal. Yes, I understand it may have performance impacts, there may be better ways, etc. But, ignoring all that, isn't this exactly what freeze is designed for?

It's not enough to ensure the semantics you want. What about optimizations that happen before and after SROA? This patch only deals with a subset of the cases (the ones that are detected by SROA's algorithm). Again, load duplication happens, and this patch doesn't deal with it. So it's inconsistent.
To implement the proposed semantics, you would need to change quite a few optimizations, not just SROA.

Yes, I agree it is incomplete (aside from being incorrect here :-) I've just been asking to ensure that my understanding of freeze is correct. Thanks.