This is an archive of the discontinued LLVM Phabricator instance.

[RFC WIP] Fix DSE for asm outputs (aka PR44913)
Needs RevisionPublic

Authored by glider on Feb 19 2020, 10:40 AM.

Details

Reviewers
fhahn
efriedma
Summary

We need to collect the outputs of inline assembly statements, which in
the same time do not serve as inputs (i.e. those declared as "=m", not
"+m"). Doing so requires changing the DSE code so that every instruction
can potentially have multiple outputs.
Right now it seems sufficient to just change eliminateDeadStores() to
handle that case, but probably other places also need to be changed.
It also makes sense to factor out the "for (auto Loc: Locs)" loop from
that function to make it more readable.

Event Timeline

glider created this revision.Feb 19 2020, 10:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2020, 10:40 AM

Hi Florian, Eli et al.,

I tried attacking https://bugs.llvm.org/show_bug.cgi?id=44913, and the provided patch seems to fix the problem.
Can you please take a look and tell if the general idea makes sense?
If it does, I can polish it by replacing all uses of getLocForWrite() with getLocsForWrite() (we may also need a similar function for reads) and move the code around a bit.

fhahn added inline comments.Feb 19 2020, 2:01 PM
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
291

I didn't have time to take a close look yet, but I think the reasoning here should be based on semantics defined in the LangRef. It might also be necessary to improve the LangRef.

I guess reasoning about inline asm directly while we don't have call attributes that allow equivalent reasoning makes sense. Sort of serves as a demonstration for why the attributes are useful. But you'll need to be *very* careful that we're actually specifying and implementing it properly. LangRef needs to clearly state the rules, and we need a bunch of tests to compare various combinations against gcc.

I assume you're aware this is missing a lot of logic that's necessary for correctness.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
320

Iterating over a SmallSet of pointers is non-deterministic.

glider marked 2 inline comments as done.Mar 3 2020, 7:40 AM
glider added inline comments.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
291

You are right, and my understanding that "=r" must always precede "=*m" didn't match LangRef.
For example, some inline assembly instructions in the Linux kernel, e.g.

{ i8, i32 } (i32*, i32, i32*, i32)* asm sideeffect ".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; cmpxchgl $3, $1\0A\09/* output condition code z*/\0A", "={@ccz},=*m,={ax},r,*m,2,~{memory},~{dirflag},~{fpsr},~{flags}"

have the following constraints: `"={@ccz},=*m,={ax},r,*m,2,~{memory},~{dirflag},~{fpsr},~{flags}", where a memory output stands between two register outputs.

Instead, we need to iterate over the constraints and pick all indirect outputs that don't have a matching input.

320

Correct, will fix.

fhahn requested changes to this revision.Nov 9 2020, 12:48 PM

Not sure if you are still interested in pushing this patch, but I think it would need to be re-based to work on the MemorySSA-backed DSE implementation, which is the default now. Also, as Eli mentioned there are missing correctness checks.

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
291

Instead, we need to iterate over the constraints and pick all indirect outputs that don't have a matching input.

I am not sure I follow.Could you elaborate? As Eli mentioned, there probably are some additional legality checks missing. It would be good to start by adding some tests. Also, we also need to account for the assembly using already assigned registers to read/write memory?

This revision now requires changes to proceed.Nov 9 2020, 12:48 PM