This is an archive of the discontinued LLVM Phabricator instance.

Use TuningFastScalarFSQRT for default X86 tuning
Abandoned · Public

Authored by andrew.w.kaylor on Jul 5 2022, 10:54 AM.

Details

Summary

When the "tune-cpu" attribute is not set or is set to "x86-64", we currently use an approximation sequence (when permitted) rather than the sqrtss instruction. Since this instruction is available with the default x86-64 ISA and is more accurate, it is better to assume fast sqrt by default, as we do with "tune-cpu"="generic".
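To make the trade-off concrete, here is a minimal sketch of the two strategies using SSE intrinsics. It assumes the approximation sequence is a hardware reciprocal square-root estimate (rsqrtss) refined by one Newton-Raphson step; the exact sequence LLVM emits may differ, and the function names `sqrt_exact` and `sqrt_estimate` are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <xmmintrin.h>  // SSE intrinsics; x86 / x86-64 only

// Exact path: a single scalar square root (maps to the sqrtss instruction).
float sqrt_exact(float x) {
    return _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x)));
}

// Approximate path (illustrative assumption, not LLVM's exact output):
// start from the rsqrtss estimate of 1/sqrt(x), refine it with one
// Newton-Raphson step, then multiply by x to recover sqrt(x).
float sqrt_estimate(float x) {
    // Zero, negative, infinite, and NaN inputs need extra handling that
    // this sketch deliberately leaves out.
    assert(x > 0.0f && std::isfinite(x));
    float e = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));  // e ~ 1/sqrt(x)
    e = e * (1.5f - 0.5f * x * e * e);                      // one refinement step
    return x * e;                                           // x / sqrt(x) == sqrt(x)
}
```

Calling both functions on the same positive input and comparing against `std::sqrt` shows the point made above: the estimate path is close but not guaranteed to be correctly rounded, while the sqrtss path is.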

The clang front end sets "tune-cpu"="generic" if no tuning or target processor is specifically requested, but other front ends that set "target-cpu"="x86-64" will get the "x86-64" tuning, which is different from "generic".

I've also started a discussion of this here: https://discourse.llvm.org/t/fast-scalar-fsqrt-tuning-in-x86/63605

Event Timeline

Herald added a project: Restricted Project. Jul 5 2022, 10:54 AM
andrew.w.kaylor requested review of this revision. Jul 5 2022, 10:54 AM
RKSimon added inline comments. Jul 5 2022, 12:12 PM
llvm/test/CodeGen/X86/sqrt-fastmath-tunecpu-attr.ll, line 2

Add an explicit triple for cases when it's run on other architectures.

Abandon this after D129647?

andrew.w.kaylor abandoned this revision. Jul 22 2022, 11:34 AM

Superseded by D129647