- User Since
- Feb 19 2019, 1:58 AM (83 w, 4 d)
Fri, Sep 18
Thu, Sep 17
- Added run line with -mattr=+unaligned-access-mode to test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
- Removed dword-access-mode from tests.
This helps Vulkan gfx9 Windows tests.
Wed, Sep 16
- Reimplemented as an AMDGPU GICombineRule.
Wed, Sep 9
Tue, Sep 8
- Added FeatureLdsMisalignedBug to GFX 10.1.1
Fri, Sep 4
- Updated description of FeatureLdsMisalignedBug to match what is covered by tests.
Thu, Sep 3
Note that tests with flat instructions are not copied to GlobalISel/lds-misaligned-bug.ll. While we could put a similar check in allowsMisalignedMemoryAccessesImpl for the flat address space as well, it would cause SDag to produce less optimal code. For some reason it breaks down a load 16, align 8 into four flat_load_dword instructions instead of two flat_load_dwordx2 instructions (but does not do the same to similar stores). This patch should fix the problems mentioned in D84403 while I look into this.
Wed, Sep 2
Tue, Sep 1
Aug 21 2020
Aug 20 2020
- Updated llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
- Removed FeatureDoesNot* ones.
The issue was in (load 16, align 4, addrspace 3), which should not be legal for gfx7. Because of -global-isel-abort=0, if it crashes it would just give the same MIR as the input.
I've changed those to align 8. D81638 will update it again to pick DS_READ_B128 or DS_READ2_B64.
Aug 18 2020
Sorry, unfortunate timing. I removed your "accept revision".
- Addressed comments
- Addressed comments.
Aug 10 2020
Jul 24 2020
Moved tests from here to parent revision (mistakenly put them in wrong patch):
Moved tests from child revision to here (mistakenly put them in wrong patch):
- Addressed comments.
Jul 23 2020
Also added another child revision, https://reviews.llvm.org/D84403, that selects these instructions for SDag the same way as for GlobalISel, so the tests will make more sense.
- Updated to reflect the changes in parent revision.
This is the way these requirements and options make sense to me and match the docs.
As for a feature that turns on unaligned access for both buffer and ds instructions, I guess it could be something that just acts like -mattr=+unaligned-buffer-access,+unaligned-ds-access, but I'm not sure if it's really necessary.
Jul 14 2020
Jul 13 2020
Jul 10 2020
- Also renamed and updated SDag tests.
Jul 9 2020
- Updated tests.
- Added tests with waterfall loops.
- Changed them to -stop-after=instruction-select like others for GlobalISel.
Jul 7 2020
- Addressed comments
- Also renamed and updated tests for SDag. Let me know if you would rather have this as a separate patch.
Jul 6 2020
Jul 2 2020
Reduced duplicated code for SelectDS64Bit4ByteAligned and SelectDS128Bit8ByteAligned.
Jun 29 2020
Code that was changing alignment requirements from SITargetLowering::allowsMisalignedMemoryAccessesImpl is now in D82788.
Jun 22 2020
Sorry, I was away for a few days.
Jun 16 2020
Looking at the ISA .pdf docs for SI (gfx6) and onward, I have not found any alignment requirements for local loads and stores. There are mentions of dword alignment for reads and writes of dword and larger for buffer instructions, but nothing more specific for LDS or GDS. SDag likes to break down ds_read/write_b128 in certain cases but does not know about b96. It seems to me that the code has not been updated since SI.
Now b96 and b128 will be picked for align 4 and larger (align 2 and 1 are broken down the same way as before). Furthermore, several Vulkan conformance tests that have align 4 loads and stores (96 and 128 bits) will now pass.
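To make the rule above concrete, here is a minimal sketch of the decision as a standalone predicate. It is not the code from the patch: the function name and parameters are hypothetical, and it models only the statement that 96-bit and 128-bit local accesses are kept whole when the alignment is at least 4 bytes.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not part of LLVM.
// Returns true when a local (LDS) access of the given size would be kept as
// a single b96/b128 instruction under the rule described above.
constexpr bool keepsWideLocalAccess(uint32_t SizeInBits, uint32_t AlignInBytes) {
  // Only the 96-bit and 128-bit cases are covered by this sketch.
  const bool IsWide = SizeInBits == 96 || SizeInBits == 128;
  // Assumption taken from the comment above: align 4 and larger is enough.
  return IsWide && AlignInBytes >= 4;
}

// Compile-time checks of the cases named in the comment above.
static_assert(keepsWideLocalAccess(96, 4), "b96 with align 4 stays whole");
static_assert(keepsWideLocalAccess(128, 8), "b128 with align 8 stays whole");
static_assert(!keepsWideLocalAccess(128, 2), "align 2 is still broken down");
static_assert(!keepsWideLocalAccess(96, 1), "align 1 is still broken down");
```

How the align 2 and align 1 cases are broken down is not modeled here, since the comment only says that it is unchanged.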
Jun 12 2020
Yes, it basically avoids problems of not being able to select 3x32 for local address space. SDag was breaking these down to a ds_read_b64 and ds_read_b32 so I did the same thing for GlobalISel.
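The split described above can be sketched as a small standalone function. This is an illustration under assumptions, not the GlobalISel implementation: the struct, the function name, and the choice to emit the 64-bit piece first are all hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one piece of a split access; not an LLVM type.
struct AccessPiece {
  uint32_t OffsetInBytes; // byte offset from the start of the original access
  uint32_t SizeInBits;    // width of this piece
};

// Greedily splits an access into 64-bit pieces followed by 32-bit pieces.
// Assumes SizeInBits is a multiple of 32; any smaller remainder is ignored.
std::vector<AccessPiece> splitLocalAccess(uint32_t SizeInBits) {
  std::vector<AccessPiece> Pieces;
  uint32_t OffsetInBytes = 0;
  for (uint32_t PieceBits : {64u, 32u}) {
    while (SizeInBits >= PieceBits) {
      Pieces.push_back({OffsetInBytes, PieceBits});
      OffsetInBytes += PieceBits / 8;
      SizeInBits -= PieceBits;
    }
  }
  return Pieces;
}
```

For a 96-bit (3x32) access this yields a 64-bit piece at offset 0 and a 32-bit piece at offset 8, which corresponds to the ds_read_b64 plus ds_read_b32 pair mentioned above.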
Jun 11 2020
Jun 10 2020
What about DS_READ? Following are also broken:
Feb 11 2020
- Rename ldrq_w to ldr_w; Rename strq_w to str_w.
Feb 7 2020
Not yet, a proposal was made to both GCC and LLVM and as far as I can tell no work was done on GCC yet. If we accept these names I'll let them know so we end up with matching names.
Jan 30 2020
We could do that for loads. For example on Mips32r5 (where we need most instructions) for intrinsic ldr_d instead of:
Jan 29 2020
A few notes/questions:
Dec 12 2019
Dec 11 2019
Sorry @atanasyan, you already reviewed this, but for Mips64 the tests would fail with the -verify-machineinstrs option. Apparently both 'SLT' and 'SLT64' use GPR32 for the result. It has been corrected now and there should be no issues for EXPENSIVE_CHECKS builds. Can you take a quick look at the new changes? Thanks.
Dec 5 2019
Dec 4 2019
There is a trick we can use to avoid taking an additional register: we can reuse either OldVal or Incr for intermediate results. I know that the return value needs to be the same as OldVal, but I don't know if changing Incr is allowed.