Page MenuHomePhabricator

Support {S,U}REMEqFold before legalization
Needs ReviewPublic

Authored by nagisa on Oct 3 2020, 11:33 AM.

Details

Summary

This allows these optimisations to apply to e.g. urem i16 directly
before urem is promoted to i32 on architectures where i16 operations
are not intrinsically legal (such as on Aarch64). The legalization then
later can happen more directly and generated code gets a chance to avoid
wasting time on computing results in types wider than necessary, in the end.

Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like D87976: Support the division-by-constant strength reduction for more integer types.

Diff Detail

Unit TestsFailed

TimeTest
1,090 mslinux > libarcher.parallel::parallel-nosuppression.c
Script: -- : 'RUN: at line 15'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/parallel/parallel-nosuppression.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0' /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/parallel/Output/parallel-nosuppression.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/parallel/parallel-nosuppression.c

Event Timeline

nagisa created this revision.Oct 3 2020, 11:33 AM
nagisa requested review of this revision.Oct 3 2020, 11:33 AM
nagisa edited the summary of this revision. (Show Details)Oct 3 2020, 11:37 AM
nagisa updated this revision to Diff 295999.Oct 3 2020, 12:49 PM

Dont attempt to force legalization of non-trivial ops

(Mostly VSELECT)

nagisa added inline comments.Oct 3 2020, 1:09 PM
llvm/test/CodeGen/AArch64/urem-seteq-vec-tautological.ll
77 ↗(On Diff #295999)

This is a regression.

Originally urem-seteq fold did not fire because isOperationLegalOrCustom(ISD::XOR, SETCCVT) was returning false and thus we never actually executed the lowering this is supposed to be testing in the first place.

Instead DAG would move on to lower UREM via BuildDIV (i.e. mul-shift strength reduction) – which, turns out, produces significantly better code in this particular instance.

I suspect that most of the regressions seen here will be due to a similar cause as this one.

nagisa updated this revision to Diff 296002.Oct 3 2020, 1:11 PM

Don't force lowering of weird xors for aarch64

nagisa edited the summary of this revision. (Show Details)Oct 3 2020, 1:19 PM
nagisa added a comment.Oct 3 2020, 6:03 PM

The build error appears to be unrelated to the changes here – other differentials fail in much the same way.

Is there anything I could do to make this proceed?

nagisa edited the summary of this revision. (Show Details)Oct 25 2020, 8:36 AM