This is an archive of the discontinued LLVM Phabricator instance.

[X86][Reduce] Preserve fast math flags when change it. NFCI
ClosedPublic

Authored by pengfei on Dec 21 2022, 1:09 AM.

Details

Summary

@arsenm raised a good question that we should use a flag guard.
But I found it is not a problem as long as user uses intrinsics only: https://godbolt.org/z/WoYsqqjh3
Anyway, it is still nice to have.

Diff Detail

Event Timeline

pengfei created this revision.Dec 21 2022, 1:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 21 2022, 1:09 AM
pengfei requested review of this revision.Dec 21 2022, 1:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 21 2022, 1:09 AM
barannikov88 added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
14746

Could be just:

return Builder.CreateCall(F, {Ops[0], Ops[1]})
    ->getFastMathFlags()
    .setAllowReassoc();

so that you don't have to save and restore Builder's flags.

barannikov88 added inline comments.Dec 21 2022, 1:25 AM
clang/lib/CodeGen/CGBuiltin.cpp
14746

No, it couldn't be, just tried :(

foad added a subscriber: foad.Dec 21 2022, 2:30 AM
foad added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
14744

We have FastMathFlagGuard for automatically saving and restoring fast math flags.

Needs tests. I couldn’t find any for the base builtins either

pengfei updated this revision to Diff 484521.Dec 21 2022, 3:35 AM
pengfei marked an inline comment as done.

Use FastMathFlagGuard instead, thanks @foad!

Needs tests. I couldn’t find any for the base builtins either

I don't think so. We have tested their intrinsic wrappers, e.g., https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGen/X86/avx512fp16-builtins.c#L4451

We don't need to test those target specific builtins. They are neither guaranteed to be cross compiler compatible nor release by release.

Besides, we have thousands of intrinsics in X86. Adding their builtin tests have no any benefits and increase the time in lit tests.

clang/lib/CodeGen/CGBuiltin.cpp
14746

Thanks for trying it :)

arsenm requested changes to this revision.Dec 21 2022, 5:19 AM

Use FastMathFlagGuard instead, thanks @foad!

Needs tests. I couldn’t find any for the base builtins either

I don't think so. We have tested their intrinsic wrappers, e.g., https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGen/X86/avx512fp16-builtins.c#L4451

These are not testing use of these builtins correctly

We don't need to test those target specific builtins. They are neither guaranteed to be cross compiler compatible nor release by release.

If it exists it must be tested.

Besides, we have thousands of intrinsics in X86. Adding their builtin tests have no any benefits and increase the time in lit tests.

Every piece of code generation needs to be tested.

This revision now requires changes to proceed.Dec 21 2022, 5:19 AM

Use FastMathFlagGuard instead, thanks @foad!

Needs tests. I couldn’t find any for the base builtins either

I don't think so. We have tested their intrinsic wrappers, e.g., https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGen/X86/avx512fp16-builtins.c#L4451

These are not testing use of these builtins correctly

This test amounts to a builtin header test for immintrin.h. This should move to clang/test/Headers and replaced with a builtin test directly checking the builtin handling

If it exists it must be tested.
Every piece of code generation needs to be tested.

Let me show you the number:

$ grep -rho '__builtin_ia32\w\+' clang/test/CodeGen | sort|uniq |wc -l
337
$ grep -rho '_mm512_\w\+' clang/test/CodeGen | sort|uniq |wc -l
2304

Note most of __builtin_ia32 tests are negative ones and _mm512_ is less than 1/3 of the total x86 intrinsics. Two orders of magnitude!

I took a look at the tests. The positive tests come from clang/test/CodeGen/builtins-x86.c. Other targets don't have a big test either
wc -l clang/test/CodeGen/builtins-*

These are not testing use of these builtins correctly

As I have explained, users are not suggested to use these builtins given we have provided the more stable, well documented corresponding intrinsics. The only case user has to use it is the intrinsic is missing. In that case, we do need test case for it.

In a word, as titled, it is NFC from the perspective of intrinsics. So I think we don't need test cases for them.

This test amounts to a builtin header test for immintrin.h. This should move to clang/test/Headers and replaced with a builtin test directly checking the builtin handling

Not get your point. But currently no builtin tests under clang/test/Headers/.

lebedev.ri requested changes to this revision.Dec 21 2022, 7:02 AM
lebedev.ri added a subscriber: lebedev.ri.

If it exists it must be tested.
Every piece of code generation needs to be tested.

Let me show you the number:

$ grep -rho '__builtin_ia32\w\+' clang/test/CodeGen | sort|uniq |wc -l
337
$ grep -rho '_mm512_\w\+' clang/test/CodeGen | sort|uniq |wc -l
2304

Note most of __builtin_ia32 tests are negative ones and _mm512_ is less than 1/3 of the total x86 intrinsics. Two orders of magnitude!

I took a look at the tests. The positive tests come from clang/test/CodeGen/builtins-x86.c. Other targets don't have a big test either
wc -l clang/test/CodeGen/builtins-*

These are not testing use of these builtins correctly

As I have explained, users are not suggested to use these builtins given we have provided the more stable, well documented corresponding intrinsics. The only case user has to use it is the intrinsic is missing. In that case, we do need test case for it.

In a word, as titled, it is NFC from the perspective of intrinsics. So I think we don't need test cases for them.

This test amounts to a builtin header test for immintrin.h. This should move to clang/test/Headers and replaced with a builtin test directly checking the builtin handling

Not get your point. But currently no builtin tests under clang/test/Headers/.

I agree with @arsenm. At least for clang irgen, we should have good test coverage.

As I have explained, users are not suggested to use these builtins given we have provided the more stable, well documented corresponding intrinsics. The only case user has to use it is the intrinsic is missing. In that case, we do need test case for it.

Tests don't exist for users, they exist for compiler developers. We shouldn't have frail infrastructure that depends on people not using things that are available. I'm not asking to test every builtin in this change, but this does need a test that shows a case where fast math flags were incorrectly applied. The broader issue of missing builtin coverage is a separate problem.

In a word, as titled, it is NFC from the perspective of intrinsics. So I think we don't need test cases for them.

This is not NFC. This changes codegen.

This test amounts to a builtin header test for immintrin.h. This should move to clang/test/Headers and replaced with a builtin test directly checking the builtin handling

Not get your point. But currently no builtin tests under clang/test/Headers/.

These "builtin tests" are checking wrappers implemented in a builtin header, not builtins. These are two levels of functionality that should be independently tested

pengfei updated this revision to Diff 484826.Dec 22 2022, 7:03 AM

Add test case to check FastMathFlagGuard works.

Tests don't exist for users, they exist for compiler developers...
I agree with @arsenm. At least for clang irgen, we should have good test coverage.

You are right. Added a test case.

This test amounts to a builtin header test for immintrin.h. This should move to clang/test/Headers and replaced with a builtin test directly checking the builtin handling

I got your point now, thanks! However, I don't agree with you. Although most intrinsics are simple wrappers of builtins, we have many intrinsics are combinations of more builtins. It's common in mask intrinsics. We also have intrinsics that permute their arguments order when calling to builtins.
IIUC, -fsyntax-only only checks for types, builtins existence. It even doesn't check if intrinsics mapped to the right builtins. Not to mention above cases. So it doesn't sound feasible to me.

Add test case to check FastMathFlagGuard works.

Not to mention above cases. So it doesn't sound feasible to me.

Testing is always feasible. You could even just generate all the combinations

clang/lib/CodeGen/CGBuiltin.cpp
13120

Should only do this in the cases that actually set the flags

clang/test/CodeGen/builtins-x86-reduce.c
9

Should test the builtins from both sets

10–12

use update_cc_test_checks, check-not is super fragile

pengfei updated this revision to Diff 484838.Dec 22 2022, 7:42 AM
pengfei marked 2 inline comments as done.

Address review comments. Thanks!

clang/test/CodeGen/builtins-x86-reduce.c
9

Do you mean this?

arsenm added inline comments.Dec 22 2022, 9:17 AM
clang/test/CodeGen/builtins-x86-reduce.c
9

Almost. You added the guard to 4 switch cases, so I would expect at minimum one builtin from each of the cases. These are both the same builtin. You also should have an fmul, fmin and fmax case

pengfei updated this revision to Diff 485016.Dec 22 2022, 7:12 PM

Add fmul, fmin and fmax cases.

pengfei marked an inline comment as done.Dec 22 2022, 7:12 PM
arsenm accepted this revision.Dec 23 2022, 3:15 PM
This revision was not accepted when it landed; it landed in state Needs Review.Dec 23 2022, 8:17 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.