This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Do not use undef on indirect source
ClosedPublic

Authored by rampitec on Jul 29 2020, 5:24 PM.

Details

Summary

We are using undef on the indirect move source subreg and then
using implicit super-reg. This creates a problem in RA when
Greedy decides to split the register. It reassigns the implicit
super-reg but does not bother to change undef source because
it is really does not matter. The fix is to stop lying to RA and
drop undef flag.

This has also hit a problem in SIFoldOperands as it can fold
immediate into an indirect move since there is no undef flag
anymore. That results in multiple test failures, so added the
check for this case.

Diff Detail

Event Timeline

rampitec created this revision.Jul 29 2020, 5:24 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2020, 5:24 PM
rampitec requested review of this revision.Jul 29 2020, 5:24 PM

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

Also this needs to fix GlobalISel

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
3661

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation rather than here

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

That's probably orthogonal. We can switch to pseudos, but we shall not fold immediate into an indirect move anyway, just in case it would be somehow produced. If you do not mind I would like to land this before any big changes as there is a blocker associated with the problem. Meanwhile I have started PSDB to see if it has not broke anything.

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

That's probably orthogonal. We can switch to pseudos, but we shall not fold immediate into an indirect move anyway, just in case it would be somehow produced. If you do not mind I would like to land this before any big changes as there is a blocker associated with the problem. Meanwhile I have started PSDB to see if it has not broke anything.

This isn't the first time this technique has broken.

This still needs to change GlobalISel to match

arsenm added inline comments.Jul 29 2020, 5:47 PM
llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
473

This also needs a comment for why

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

That's probably orthogonal. We can switch to pseudos, but we shall not fold immediate into an indirect move anyway, just in case it would be somehow produced. If you do not mind I would like to land this before any big changes as there is a blocker associated with the problem. Meanwhile I have started PSDB to see if it has not broke anything.

This isn't the first time this technique has broken.

This still needs to change GlobalISel to match

Sure, I will add global isel part. Meanwhile do you remember why that undef was even set there?

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

That's probably orthogonal. We can switch to pseudos, but we shall not fold immediate into an indirect move anyway, just in case it would be somehow produced. If you do not mind I would like to land this before any big changes as there is a blocker associated with the problem. Meanwhile I have started PSDB to see if it has not broke anything.

This isn't the first time this technique has broken.

This still needs to change GlobalISel to match

Sure, I will add global isel part. Meanwhile do you remember why that undef was even set there?

It doesn't actually read the content of that register. I think there was an associated liveness verifier error

I think I ran into these problems before. Last time I think I concluded we need to use pseudoinstructions and expand them after register allocation. I think the GPR index mode should definitely be using a pseudo instead of checking for the m0 implicit use.

That's probably orthogonal. We can switch to pseudos, but we shall not fold immediate into an indirect move anyway, just in case it would be somehow produced. If you do not mind I would like to land this before any big changes as there is a blocker associated with the problem. Meanwhile I have started PSDB to see if it has not broke anything.

This isn't the first time this technique has broken.

This still needs to change GlobalISel to match

Sure, I will add global isel part. Meanwhile do you remember why that undef was even set there?

It doesn't actually read the content of that register. I think there was an associated liveness verifier error

Hm... My understanding that it may read that register. Not necessarily will but may. Anyway, I hope PSDB will show problems if there are any.

It did pass the PSDB.

rampitec updated this revision to Diff 281949.Jul 30 2020, 8:57 AM
rampitec marked an inline comment as done.

Added global-isel part.
Added comment.

arsenm accepted this revision.Jul 30 2020, 10:44 AM
This revision is now accepted and ready to land.Jul 30 2020, 10:44 AM
This revision was automatically updated to reflect the committed changes.