- User Since
- Oct 13 2017, 7:01 AM (283 w, 3 d)
Apr 5 2022
Apr 1 2022
Just to check that I understand this correctly.
Is the intention behind the function prefix to make it configurable in the end, so that one can also check for Legalized selection DAG and others?
Feb 25 2022
Yes, looks great, thanks
Feb 23 2022
Sorry for taking so long to review this.
Feb 14 2022
Jan 12 2022
This looks like it was fixed by D114652, but that got reverted?
Nov 22 2021
The change of PrivateLabelPrefix was pushed in D114273.
May 13 2021
May 12 2021
I left a nit inline. Apart from that, LGTM.
Mar 22 2021
It would be nice if we could put the code shared with AArch64 into some generic place instead of duplicating it.
Other than that, looks good.
Feb 24 2021
If these registers keep changing, we probably want to switch between architectures at some point. But it’s easier to look at (slightly off) names than at hex numbers, so LGTM.
Jan 29 2021
We have slight problems when it comes to differences between hardware versions.
COMPUTE_USER_ACCUM_1 on gfx9 has the same number as COMPUTE_SHADER_CHKSUM on gfx10.
Also, some have slightly different names. E.g. IA_MULTI_VGT_PARAM for gfx9 is called IA_MULTI_VGT_PARAM_PIPED for gfx10.
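The naming clash described above can be sketched as a lookup that takes the hardware generation into account. This is only an illustration: the `GfxLevel` enum, the `registerName` helper, and both offset constants are hypothetical, and the numeric values are placeholders rather than real register numbers. Only the four register names come from the comment above.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical enum covering only the two generations mentioned above.
enum class GfxLevel { Gfx9, Gfx10 };

// Placeholder offsets (assumptions, NOT the real register numbers).
constexpr std::uint32_t kAccumOrChksumOffset = 0x1000;
constexpr std::uint32_t kMultiVgtParamOffset = 0x2000;

// Map a register offset to a readable name for the given generation.
// Unknown offsets fall back to their hex representation.
std::string registerName(GfxLevel level, std::uint32_t offset) {
  const bool isGfx9 = (level == GfxLevel::Gfx9);
  switch (offset) {
  case kAccumOrChksumOffset:
    // Same number, different register depending on the generation.
    return isGfx9 ? "COMPUTE_USER_ACCUM_1" : "COMPUTE_SHADER_CHKSUM";
  case kMultiVgtParamOffset:
    // Same register, slightly different name.
    return isGfx9 ? "IA_MULTI_VGT_PARAM" : "IA_MULTI_VGT_PARAM_PIPED";
  default: {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "0x%X", static_cast<unsigned>(offset));
    return buf;
  }
  }
}
```

Keeping the generation as an explicit parameter means a single table can serve both architectures, at the cost of one extra argument at every call site.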
Jan 27 2021
Jan 26 2021
Jan 12 2021
Also stop passing them for amdgpu_gfx, since the DAG path seems to skip these. I'm unclear on what amdgpu_gfx's expectations are.
Dec 23 2020
Looks good to me, I left some nits inline.
Someone who is more familiar with GlobalISel should review this.
Nov 3 2020
Nov 2 2020
Oct 30 2020
Add test with loop
Oct 28 2020
Add pre-committed tests.
Oct 27 2020
Fix lld test failure
Ah, the dynamically sized alloca provoked a message that it is unsupported.
An alloca in a branch works fine.
Oct 26 2020
Fix code and add more tests.
Looks good to me.
I tested it with the AMDVLK Vulkan driver (needs a PAL-specific patch) and a short Vulkan CTS test ran through fine (except for PAL-related failures).
Also set MEM_ORDERED and WGP_MODE for supported PGMRSrc1 registers.
Oct 23 2020
Oct 21 2020
I don’t see a way to add a test case. It fails an assertion on Windows when compiling with msvc.
Oct 20 2020
Disallow calls with C calling convention from shaders
Update from internal review comments.
Oct 19 2020
Oct 16 2020
Oct 14 2020
Oct 12 2020
Looks good, I have one comment.
Oct 8 2020
Fix return value handling; the COPYs following the call also need to be copied.
Oct 7 2020
Oct 6 2020
Oct 2 2020
Sep 30 2020
So someone has a preference :)
friendly ping for review
Sep 29 2020
Remove debug dumps
Sep 25 2020
Sep 24 2020
Sep 23 2020
Thanks for the heads-up, I reverted it for now.
Sep 16 2020
Add FIXME noting that the wait on function entry/return should depend on the calling convention.
Improve is-widened check in CustomWidenLowerNode to determine if a value was widened or not.
Having this one use a different multiclass is weird looking. Why can't it directly use the same multiclass as the other cases?
I would expect this to look something like
class MTBUF_LoadIntrinsicPat<SDPatternOperator node, ValueType memvt, ValueType vt = memvt>
and then only override the vt in the weird v3 cases
Move the is-gs-done check to its own function.
Sep 15 2020
Improve comment as suggested
Use DebugLoc from call for waitcnt and return early.
Sep 1 2020
Wow, good catch. Looks good to me.
Aug 21 2020
Looks good, thanks!
Right, changed unsigned to uint8_t for offsets in ImageDimIntrinsicInfo.
Aug 20 2020
Aug 18 2020
Aug 14 2020
Address review comments: Move patterns to SIInstrInfo.td and use MemoryVT.
Preserve fast-math flags and add test that ensures a16 combining is not done on gfx8.
Aug 13 2020
Fix review comments
Jul 23 2020
Thanks for the notification @davezarzycki, an auto-bisecting bot is cool!
Jul 21 2020
I’m also trying to get it working properly (currently for SDag). I think I got the legalization/widening part working but I’m still trying to figure out how to select the right instruction patterns.
Rebased and fixed the triple for Thumb2 tests as suggested.
Jul 17 2020
Here you go.
Rebased and added some docs.
Jul 13 2020
Jul 10 2020
Rebased (no conflicts this time).
Jul 6 2020
Rebased and removed a few includes as suggested.
Make the TargetTransformInfo a private member of InstCombiner because it should not be used in general inst combines.
Move CreateOverflowTuple out of InstCombiner and make CreateNonTerminatorUnreachable static.
Jun 30 2020
Rebased and call target-specific combining only for target-specific intrinsics as suggested.
Add Function::isTargetIntrinsic() for this purpose.
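To illustrate the idea of running target-specific combines only for target-specific intrinsics, here is a minimal standalone sketch. It does not use LLVM and is not how `Function::isTargetIntrinsic()` is implemented; the helper `looksLikeTargetIntrinsic` and its prefix list are assumptions made for illustration, relying only on the convention that target intrinsics are named `llvm.<target>.*`.

```cpp
#include <string>

// Hypothetical, incomplete list of target prefixes (an assumption for this sketch).
static const char *const kTargetPrefixes[] = {
    "llvm.amdgcn.", "llvm.aarch64.", "llvm.arm.", "llvm.x86."};

// Return true if the intrinsic name starts with one of the known target prefixes.
bool looksLikeTargetIntrinsic(const std::string &name) {
  for (const char *prefix : kTargetPrefixes) {
    const std::string p(prefix);
    // compare() on a name shorter than the prefix simply reports a mismatch.
    if (name.compare(0, p.size(), p) == 0)
      return true;
  }
  return false;
}
```

A caller would consult such a check first and hand the call to the target hook only when it returns true, so generic intrinsics never reach target code.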