We need to collect the outputs of inline assembly statements, which in
the same time do not serve as inputs (i.e. those declared as "=m", not
"+m"). Doing so requires changing the DSE code so that every instruction
can potentially have multiple outputs.
Right now it seems sufficient to just change eliminateDeadStores() to
handle that case, but probably other places also need to be changed.
It also makes sense to factor out the "for (auto Loc: Locs)" loop from
that function to make it more readable.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 46829 Build 49477: arc lint + arc unit
Event Timeline
Hi Florian, Eli et al.,
I tried attacking https://bugs.llvm.org/show_bug.cgi?id=44913, and the provided patch seems to fix the problem.
Can you please take a look and tell if the general idea makes sense?
If it does, I can polish it by replacing all uses of getLocForWrite() with getLocsForWrite() (we may also need a similar function for reads) and move the code around a bit.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
291 | I didn't have time to take a close look yet, but I think the reasoning here should be based on semantics defined in the LangRef. It might also be necessary to improve the LangRef. |
I guess reasoning about inline asm directly while we don't have call attributes that allow equivalent reasoning makes sense. Sort of serves as a demonstration for why the attributes are useful. But you'll need to be *very* careful that we're actually specifying and implementing it properly. LangRef needs to clearly state the rules, and we need a bunch of tests to compare various combinations against gcc.
I assume you're aware this is missing a lot of logic that's necessary for correctness.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
320 | Iterating over a SmallSet of pointers is non-deterministic. |
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
291 | You are right, and my understanding that "=r" must always precede "=*m" didn't match LangRef. { i8, i32 } (i32*, i32, i32*, i32)* asm sideeffect ".pushsection .smp_locks,\22a\22\0A.balign 4\0A.long 671f - .\0A.popsection\0A671:\0A\09lock; cmpxchgl $3, $1\0A\09/* output condition code z*/\0A", "={@ccz},=*m,={ax},r,*m,2,~{memory},~{dirflag},~{fpsr},~{flags}" have the following constraints: `"={@ccz},=*m,={ax},r,*m,2,~{memory},~{dirflag},~{fpsr},~{flags}", where a memory output stands between two register outputs. Instead, we need to iterate over the constraints and pick all indirect outputs that don't have a matching input. | |
320 | Correct, will fix. |
Not sure if you are still interested in pushing this patch, but I think it would need to be re-based to work on the MemorySSA-backed DSE implementation, which is the default now. Also, as Eli mentioned there are missing correctness checks.
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp | ||
---|---|---|
291 |
I am not sure I follow.Could you elaborate? As Eli mentioned, there probably are some additional legality checks missing. It would be good to start by adding some tests. Also, we also need to account for the assembly using already assigned registers to read/write memory? |
I didn't have time to take a close look yet, but I think the reasoning here should be based on semantics defined in the LangRef. It might also be necessary to improve the LangRef.