This patch enables folding following expressions under -ffast-math flag: exp(log(x)) -> x, exp2(log2(x)) -> x, log(exp(x)) -> x, log2(exp2(x)) -> x
Details
Diff Detail
Event Timeline
I don't have access to my checkout here, but I'm fairly sure we do something similar in SimplifyLibCalls.
I understand many of the pattern proposed will be duplicated, which is, not ideal.
Also, I'm pretty sure @scanon once taught me (when I implemented a similar transformation in simplifylibcalls) that these can underflow/overflow pretty dramatically, so, shouldn't these only be enabled under -ffast-math ?
These transforms are only enabled under -ffast-math, II->isFast() checks here for it. Tests explicitly show correctness. SimlifyLibCalls doesn't have these optimizations, but I can move transforms to that file if it's necessary.
Since this hasn't been committed yet, let me ask: why are we doing this in instcombine rather than instsimplify? When no new instructions are created, we generally want that transform in InstSimplify because that makes passes like GVN more powerful. InstSimplify already has "SimplifyIntrinsic", so I think this could fit neatly in there.
What is the expected result for this kind of test (please add at least one test like this)?
define double @exp_log_partly_fast(double %a) { %1 = call double @llvm.log.f64(double %a) ; strict %2 = call fast double @llvm.exp.f64(double %1) ret double %2 }
LGTM - see inline comments for some potential improvements.
lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
4520 ↗ | (On Diff #128440) | I don't see much value in naming this local (could just use *ArgBegin in the line below here). So you could shrink this down to: Value *X; if (Q.CxtI->isFast() && match(*ArgBegin, m_Intrinsic<Intrinsic::log>(m_Value(X)))) return X; return nullptr; |
4521 ↗ | (On Diff #128440) | As I've said in other reviews, I think explicitly setting this to nullptr is just noise. |
test/Transforms/InstSimplify/exp-intrinsic.ll | ||
4–5 ↗ | (On Diff #128440) | It's a matter of taste, but since these are highly similar transforms, I'd put all of the tests together in one test file to match the fact that they're all handled together in one block of code. |
27–45 ↗ | (On Diff #128440) | Looking back at the discussion in D5584, I think this is correct. Only the later op needs to be relaxed to do the transform. |