This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Insert readfirstlane on SGPR returns
ClosedPublic

Authored by arsenm on Feb 14 2020, 6:25 PM.

Details

Summary

In case the source value ends up in a VGPR, insert a readfirstlane to
avoid producing an illegal copy later. If it turns out to be
unnecessary, it can be folded out.

Diff Detail

Event Timeline

arsenm created this revision.Feb 14 2020, 6:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 14 2020, 6:25 PM
arsenm updated this revision to Diff 244802.Feb 14 2020, 6:33 PM

Remove workarounds in test

nhaehnle requested changes to this revision.Feb 16 2020, 3:41 AM

Please remove the changes in the bswap test. It is critically important that we do not insert unnecessary v_readfirstlane insructions. Mesa performance leans on this quite heavily.

This revision now requires changes to proceed.Feb 16 2020, 3:41 AM

Come to think of it, is there an explicit test that we *don't* insert v_readfirstlane when none is needed at all?

Please remove the changes in the bswap test. It is critically important that we do not insert unnecessary v_readfirstlane insructions. Mesa performance leans on this quite heavily.

This is just noise for the purpose of the bswap test. We already fold out unneeded. readfirstlanes in 3 different places which already have tests. I’m also planning on adding another in the regbank aware combiner

foad added a comment.Feb 27 2020, 2:04 AM

Looks OK to me if Nicolai is happy with it now.

This revision is now accepted and ready to land.Mar 10 2020, 2:45 AM