This is an archive of the discontinued LLVM Phabricator instance.

[Reassociate] move check to ignore boolean expressions before canonicalizing binary operands
AbandonedPublic

Authored by mhillenbrand on Jan 5 2021, 8:00 AM.

Details

Reviewers
spatel
lebedev.ri
Summary

The Reassociate pass intends to leave alone boolean (i1) expressions
to preserve the order of short-circuited comparisons, which SimplifyCFG
has folded into AND/OR expressions. The check to achieve that behavior
currently is ineffective since it only happens after
canonicalizeOperands() may already have flipped the order of operands
(and, consequentially, the order of short-circuited comparisons after
CodeGen). Move that check up to happen before the call to
canonicalizeOperands().

Bug: https://bugs.llvm.org/show_bug.cgi?id=48529

Diff Detail

Event Timeline

mhillenbrand created this revision.Jan 5 2021, 8:00 AM
mhillenbrand requested review of this revision.Jan 5 2021, 8:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 5 2021, 8:00 AM
fhahn added a subscriber: fhahn.Jan 5 2021, 8:12 AM

Can you add a test for the change?

lebedev.ri requested changes to this revision.Jan 5 2021, 8:34 AM

I do not understand the example in https://bugs.llvm.org/show_bug.cgi?id=48529
By "intent", are you saying that the canonicalization of commutative ops
is causing the code to be miscompiled, leading to different observable side-effects?

This revision now requires changes to proceed.Jan 5 2021, 8:34 AM

@lebedev.ri No, I do not see differences in observable side-effects. Though, runtime differs significantly depending on the order of evaluation. Thus, "intent" would be out-of-band information that the programmer has about the likelihood of each of the terms being true, and placing the most-likely term first, while that cannot be derived from the source by the compiler.

In the example

if (ab > de || ac > df || bc > cd)
      return 0;

the programmer may know that (ab > de) is more commonly true than (ac >df) and thus place the former check first. With short-circuited evaluation strictly in the order of the source code, the resulting code would execute fewer comparisons and run faster, and vice versa.

@fhahn I will add a test (seeking feedback whether this actually makes sense with this initial submission).

Add regression test

spatel added inline comments.Feb 2 2021, 5:57 AM
llvm/lib/Transforms/Scalar/Reassociate.cpp
2190–2195

For reference, this was added in 2010:
27dfb1e

I'm sympathetic to the motivation, but that was a hack: one goal of IR is to overcome side-channel source-level optimization like this.

For example, we added several IR transforms to reduce compiled-code differences in source-level diffs like "if (bool1 && bool2)" vs. "if (bool1 & bool2)" (logical vs. bitwise ops).

If the expression order matters, then we want the programmer to indicate that explicitly with something like "__builtin_expect" or profile metadata.

So this patch is really taking us away from the end goal (assuming we can get there). It might help solve the motivating problem in the short-term, but there's no guarantee that the fix will last.

mhillenbrand abandoned this revision.Feb 11 2021, 1:21 AM

Sanjay, thanks for providing context. In that light, I'm dropping this patch.