- User Since
- Jan 27 2020, 1:17 AM (126 w, 2 d)
Mon, Jun 20
I found one more place where type information from DXIL needs to be extracted: metadata.
The patch now adds a second callback, which allows changing metadata while it’s read in.
In the test, that’s used to replace a pointer value metadata with a tuple of the original value and metadata that stores its type information.
I wasn’t able to store type info about metadata without replacing it because the values become indistinguishable after being read (e.g. an i8* and an i32* both end up as a ptr).
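The tuple replacement described above can be illustrated with a toy model. This is only a sketch in Python, not the LLVM API: `Ptr`, `read_metadata` and `keep_pointee_type` are hypothetical names, and the model assumes the reader collapses every typed pointer into an opaque `ptr` unless a callback intervenes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ptr:
    """Hypothetical stand-in for an opaque pointer value after reading."""
    name: str


def read_metadata(raw, callback=None):
    """Toy reader: raw operands are (value name, typed pointer string) pairs.

    The typed pointer string is dropped when the value is created, which
    mimics an i8* and an i32* both ending up as a plain ptr. The optional
    callback sees the raw operand and may replace the value that is stored.
    """
    result = []
    for operand in raw:
        name, _typed = operand
        value = Ptr(name)  # type information is lost here
        result.append(callback(operand, value) if callback else value)
    return result


def keep_pointee_type(operand, value):
    """Hypothetical callback: wrap the value in a tuple with its original type."""
    return (value, operand[1])


RAW = [("p", "i8*"), ("p", "i32*")]

plain = read_metadata(RAW)                      # both entries are Ptr("p")
typed = read_metadata(RAW, keep_pointee_type)   # tuples keep "i8*" / "i32*"
```

In this model the two entries of `plain` compare equal, while the entries of `typed` stay distinguishable because the type string travels alongside the value.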
Since there have been no other reactions and I think this is a reasonable patch, I’ll accept it.
Please wait a day before submitting it in case others have comments.
Fri, Jun 17
I think exposing something like this is reasonable.
Tue, Jun 14
May 24 2022
May 23 2022
Looks good to me, thanks for moving out the one negative test.
May 18 2022
May 17 2022
Both patches look good to me!
May 13 2022
The highlighting looks nice with this.
I didn’t see anything blatantly wrong or missing, so thumbs up from me.
May 12 2022
Looks good to me
Apr 28 2022
Apr 27 2022
I wanted to add this combine before, but I don’t think there is a way to add d16 to an instruction without potentially breaking the code.
The reason is, when an image_sample has the d16 flag enabled, it will use f32→f16 truncation or i32→i16 truncation, depending on the texture format in the descriptor.
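To make the concern concrete, here is a small sketch of why the two truncation modes cannot be swapped freely. It is not hardware-accurate: the helper names are made up, and it assumes that i32→i16 truncation keeps the low 16 bits and that f32→f16 behaves like an ordinary float-to-half conversion.

```python
import struct


def f32_bits(x: float) -> int:
    """Return the 32-bit pattern of x stored as an IEEE f32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def trunc_as_float(bits: int) -> int:
    """Treat the 32 bits as f32, convert to f16, return the 16-bit pattern."""
    value = struct.unpack("<f", struct.pack("<I", bits))[0]
    return struct.unpack("<H", struct.pack("<e", value))[0]


def trunc_as_int(bits: int) -> int:
    """Treat the 32 bits as i32 and keep the low 16 bits (assumed behaviour)."""
    return bits & 0xFFFF


bits = f32_bits(1.0)              # 0x3F800000
as_float = trunc_as_float(bits)   # 0x3C00, which is 1.0 as f16
as_int = trunc_as_int(bits)       # 0x0000
```

The same 32-bit value yields different 16-bit results depending on which conversion is applied, and which one applies is only known from the descriptor at run time.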
Apr 22 2022
I’m not sure why the writelane registers are added as live-in to every block. Is the same happening for WWM registers and VGPRs used for SGPR spills?
Apr 19 2022
Mar 29 2022
Mar 16 2022
Looks good, thanks, just left some small comments.
Mar 14 2022
Can you add lit tests for the fixes you made please?
Mar 3 2022
Mar 2 2022
Looks good to me
Mar 1 2022
Feb 28 2022
Feb 22 2022
Feb 21 2022
friendly ping for review
Feb 18 2022
friendly ping for review
Feb 17 2022
The pre-merge builds report some test failures in MC/AMDGPU and MC/Disassembler/AMDGPU. I think the assembler fix in VOPCInstructions.td and these test changes could be a separate patch.
Feb 14 2022
Feb 11 2022
Feb 10 2022
Improve WQM comment
Feb 8 2022
Thanks! Fixed your comments
Two more typos
Feb 7 2022
Looks good to me. With all the comments and fixes, the generated tests look a lot better than in the first version.
Feb 4 2022
Feb 2 2022
Forgot to accept as amdgpu last time.
Feb 1 2022
Jan 28 2022
Jan 27 2022
Jan 26 2022
@phosek, this patch fixes a regression that was introduced with D116521.
Could we fix this regression first with a simple patch that does not risk being reverted again and do further refactorings afterwards?
Our downstream gcc build is broken and we’d like to re-enable it sooner rather than later.
Jan 25 2022
Thanks, the patch got smaller with applyBuildFn.
Jan 24 2022
Thanks for the review, I fixed the comments.
Wouldn't it be nice if there was some way to autogenerate the Offset to NoOffset optimization mapping table?
Jan 21 2022
I think I found the problem and I’m a little surprised it currently works.
In the llpc pipeline, a module pass (PipelineStateClearer) writes pal metadata into the amdgpu.pal.metadata.msgpack metadata.
This metadata is read by the AMDGPUAsmPrinter, extended as functions are emitted and finally written out in AMDGPUTarget[…]Streamer::finish.
Jan 20 2022
Under the assumption that opencl or hip on pal do not use this, it looks fine to me.
Both patches seem to fix the bug
Jan 19 2022
This change seems to sink v_cmp instructions, which produces different results if the exec mask has changed in between, and that makes several Vulkan tests fail.
I put a reproducer here: https://gist.github.com/Flakebi/fd1d91a806b60ec330e9f61e19fe62ac
Compile with llc -mtriple=amdgcn--amdpal -mcpu=gfx1010 -verify-machineinstrs -start-before=machine-sink -stop-after=machine-sink PipelineVsFs_0xDD57C231E25DA514.mir -o PipelineVsFs_0xDD57C231E25DA514-after.mir
and the %104:sreg_64 = V_CMP_NE_U32_e64 %89, %101, implicit $exec instruction will be sunk from bb.5 into bb.6. For reference, the pipeline is from the dEQP-VK.subgroups.arithmetic.framebuffer.subgroupexclusiveadd_float_vertex CTS test.
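As a rough illustration of why sinking the compare matters, the following toy model treats a v_cmp as a per-lane comparison whose result bits are only set for lanes that are active in exec. The lane values and the two exec masks are made up for the example and are not taken from the reproducer.

```python
def v_cmp_ne(a, b, exec_mask):
    """Toy model of a lane-mask compare: bit i is set if lane i is active
    in exec_mask and a[i] != b[i]; inactive lanes contribute 0."""
    mask = 0
    for lane, (x, y) in enumerate(zip(a, b)):
        if (exec_mask >> lane) & 1 and x != y:
            mask |= 1 << lane
    return mask


# Hypothetical per-lane operands for a 4-lane wave.
A = [1, 2, 3, 4]
B = [0, 2, 0, 4]

# Hypothetical exec masks: all lanes active in bb.5, only two lanes in bb.6.
EXEC_BB5 = 0b1111
EXEC_BB6 = 0b0011

original = v_cmp_ne(A, B, EXEC_BB5)  # compare executed in bb.5
sunk = v_cmp_ne(A, B, EXEC_BB6)      # same compare after sinking into bb.6
```

With these inputs `original` is 0b0101 and `sunk` is 0b0001, so any later use of the mask sees a different value once the instruction has moved past the exec change.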