- User Since: Apr 4 2014, 4:14 AM (338 w, 9 h)
Wed, Sep 23
Tue, Sep 22
OK, thanks Alex.
Mon, Sep 21
Added legalization. The code is worse, as expected.
LGTM. Looks like Matt has no concerns about Global ISel as well.
This is obviously LGTM from the AMDGPU BE point of view, we did it ourselves.
Can this appear later in codegen? It also does not cover Global ISel, so the part in operand folding probably needs to remain in addition to the patterns.
Fri, Sep 18
Doesn't a loop block dominate itself?
I think you need to drop denorm checks and move the check outside of the address space check.
Thu, Sep 17
Actually, it can be done with a single v_perm_b32.
Replaced v_pack with two instructions.
Wed, Sep 16
Tue, Sep 15
Mon, Sep 14
LGTM, but please also wait for Matt's review.
Wed, Sep 9
Tue, Sep 8
Fri, Sep 4
I still do not believe the problem is specific to the EXEC = 0 case. In fact, the problem could occur with any EXEC value; it is sufficient for it to differ from what is expected. It is not valid to insert a split into any block before EXEC is restored, not just when the previous value was zero.
Thu, Sep 3
I do not think it should be at the block level. You can allow splitting after all EXEC modifications are done. For example:
Wed, Sep 2
LGTM, thanks. I never really understood why we are doing this copy.
Tue, Sep 1
LGTM with a nit: please run opt -instnamer on the test before submission.
LGTM even if we need to rework it as Matt suggests.
LGTM, but please run PSDB before submission. This stuff is quite nontrivial.
Mon, Aug 31
Unlimited is really too aggressive. It can slow down compilation dramatically in some cases.
Also it would be nice to see a relevant test.
Fri, Aug 28
Thu, Aug 27