This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Unfold masked merge with constant mask
ClosedPublic

Authored by lebedev.ri on Apr 20 2018, 2:13 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

  • Rebased.
  • This differential is the one to review, it has no unmerged deps/prerequisites.

Not related to this patch exactly but since you work in this area... maybe you can check this too?
https://godbolt.org/g/wksfqb

Not related to this patch exactly but since you work in this area... maybe you can check this too?
https://godbolt.org/g/wksfqb

That particular example should be handled by dead code elimination pass i guess,
because it is literally optimized down to return 0;,
since it only reads global uint32_t b; once, and ultimately does nothing with that.

But, i have zero clues how it should be done.

xbolva00 added a comment.EditedApr 28 2018, 2:02 PM

Not related to this patch exactly but since you work in this area... maybe you can check this too?
https://godbolt.org/g/wksfqb

That particular example should be handled by dead code elimination pass i guess,
because it is literally optimized down to return 0;,
since it only reads global uint32_t b; once, and ultimately does nothing with that.

But, i have zero clues how it should be done.

It is interesting, since even if we dont use "b", just leave it as "uint32_t a = 5;", the code is not eliminated since optimizer may think it could be infinite loop (?) (cc @chandlerc , @spatel )

Not related to this patch exactly but since you work in this area... maybe you can check this too?
https://godbolt.org/g/wksfqb

That particular example should be handled by dead code elimination pass i guess,
because it is literally optimized down to return 0;,
since it only reads global uint32_t b; once, and ultimately does nothing with that.

But, i have zero clues how it should be done.

It is interesting, since even if we dont use "b", just leave it as "uint32_t a = 5;", the code is not eliminated since optimizer may think it could be infinite loop (?) (cc @chandlerc , @spatel )

You probably want to move that to a bugzilla issue, as you have noted yourself, this is completely unrelated from the patch a hand.

spatel accepted this revision.Apr 30 2018, 7:49 AM

LGTM - see inline comments.

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2417–2422

We should state our reasoning here - because I know I won't remember for long. :)

So something like:
We can canonicalize by swapping the final xor operand to eliminate the 'not' of the mask.
If M is a constant, we transform to 'and' / 'or' ops because that shortens the dependency chain and improves analysis.

2444

iC -> NotC

This revision is now accepted and ready to land.Apr 30 2018, 7:49 AM
lebedev.ri marked 2 inline comments as done.

@spatel thank you for the review!

lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
2430

I'm wondering if by simply capturing M here we miss some opportunities where
this first matched M does not satisfy any of these following checks (inverted/constant).

But i'm also not sure how to squeeze that into this one match(), in particular
i don't see how to know which of the matches is the real match, and which one is
stale pointer from the previous partial match.

This revision was automatically updated to reflect the committed changes.