- User Since
- Dec 12 2018, 5:57 PM (109 w, 4 d)
Fri, Jan 15
This doesn't add metadata to llvm intrinsics that are not constrained.
Oh, right. I misunderstood what's doing in these patches and thought we can add metadata to any intrinsics by CGFPOptionsRAII now. :-)
If a relevant #pragma is used then without this change the metadata ...
Yeah, I understand it now. Thank you! But why we still have the wrong maytrap with this patch?
It's true that the middle-end optimizers know nothing about the constrained intrinsics. Today. I plan on changing that in the near future.
If use of the constrained intrinsics will cause breakage of the target-specific clang builtins then that's important to know and to fix.
I had a look at these changes and didn't find anything will cause breakage for now. I'm still not sure if we need to teach target-specific clang builtins to respect strict FP or not. But it has nothing to do with this patch.
Thu, Jan 14
What is the reason for treating this differently in LLVM?
tile config register is the user of each AMX instruction
Wed, Jan 13
Hi Kevin, what's the intention of adding constrained FP metadata for target dependent builtins? I believe the middle-end passes always ignore these builtins. What's more, will it imply user these builtins have different behaviors under different FP model? But it's not true for most platform builtins, since they are just representative of given instructions.
Tue, Jan 12
Mon, Jan 11
LGTM, but I'd like to see if @lebedev.ri has any objections.
Sun, Jan 10
to amx intrinsics the the
Fri, Jan 8
Thu, Jan 7
Is inline assembly the only case emms instruction will be needed? But inline assembly doesn't enable mmx attribute automatically, right? E.g. https://godbolt.org/z/43ases
Analyzing asm block and appending the mmx attribute if we see mmx instructions might be needed. But if we do the analysis, just adding an emms instruction at the end of the block seems better.
Tue, Jan 5
I saw the document mentions ESP/RSP should be preserve for regcall. But I cannot think out a situation that ESP/RSP need to preserve either.
Mon, Jan 4
Wed, Dec 30
Let me know if I am not answering/understanding the questions.
Tue, Dec 29
Mon, Dec 28
Hi Sanjay, pairwise reduction seems not adopted by any targets. But do we need to consider the FMF here? I found LangRef says the intrinsic is sequential for fadd and fmul if reassoc flag is not set.
Hi Simon, I found we have the same problem for fadd/fmul. See https://godbolt.org/z/3YKaGx
X86 Intrinsics imply reassoc flag, but llvm.vector.reduce.* doesn't.
Thu, Dec 24
LGTM. Thanks for the refactors. Maybe better to wait for a few days to see if others have objections.
Wed, Dec 23
In my test case, it is transformed after Combine redundant instructions.
Tue, Dec 22
Mon, Dec 21
Dec 19 2020
Dec 17 2020
Dec 16 2020
Dec 15 2020
Dec 14 2020
update_test_prefix.py assumes the conflicting output. You may need to change the expection of it as well.
What's the difference with the existing code? It looks to me that you just brought the warning out of loop, right?
Yes, it is using maxpd/minpd here. Sorry for missing the nan cases.
In fact, I doubt about the behavior when using fast math with target intrinsics. In my opinion, target intrinsics are always associated with given instructions (reduce* are exceptions). So the behavior of intrinsics, e.g. respect nans, signaling, rounding, exception etc. are concordance with their associated instructions. That said the fast math flags won't change the behavior of intrinsics.
Assume above, I'm happy to set these expansions to fast math. Either keeping the existing implementaion or expansion LGTM.
Dec 11 2020
LGTM. Thanks for bringing this refactor.
I also verified that ICC and GCC both do reduce math in an binary tree way, though sometimes ICC has a different LSB from GCC and Clang.
Dec 10 2020
looks like quite a few (3K) of tests in CodeGen (14K or so) use it. So I wanted to understand why removing a prefix used nowhere in the test lead to an incorrect rewrite of some assertions.
There are more then 6K tests. update_llc_test_checks.py is just one of the update scripts.
Add initial author to have a review.
I think you meant --allow-unused-prefixes=true.
Yes, that's what I meant. Thanks for correct me.
Dec 9 2020
LGTM. I think we can land this patch as a beginning. Cheers~
I left about 12 tests when I fixed the prefixes issues using the script, because the changes on prefixes will resulted in conflict when re-generating them. I suspected if it is a bug in update_llc_test_checks.py, but I didn't find time to look through. Maybe we can add --allow-unused-prefixes=true for these failed tests as a workaround?
The changes on the tests look good to me. But I'm not sure the changes on the code. I'd like to see opinions from other reviewers.
Dec 8 2020
Dec 7 2020
Dec 6 2020
Dec 3 2020
Dec 2 2020
Glad to see you are using the script. I'm not very sure its behavior. Please pay attention to its changes especially marked CHANGE
It's better to set parent since the patch is based on D91927.
Nov 30 2020
Nov 29 2020
Nov 27 2020
Nov 26 2020
Fix a missed change.
Adress Craig's comments.
Revert unrelated change.
LGTM. You'd better wait few days to see if others object. Thanks.
Nov 25 2020
Can you add more detail for the background on summary?