[NVPTX] allow register copy between float and int
ClosedPublic

Authored by jingyue on Jul 29 2015, 11:13 PM.

Details

Summary

Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, cross-class
register copies (e.g. i32 to f32) become possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.
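
For illustration only, here is a minimal standalone sketch of the kind of selection logic this implies. It is not the code in this diff and does not use LLVM's types: RegClass, widthInBits, isFloat and selectCopy are hypothetical names. The assumption is that a same-class copy uses a typed PTX mov, a cross-class copy of equal width uses an untyped mov.bNN, and a copy between registers of different widths is rejected.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for NVPTX register classes (not LLVM's real types).
enum class RegClass { Int32, Int64, Float32, Float64 };

// Width in bits of a register in the given class.
constexpr unsigned widthInBits(RegClass RC) {
  return (RC == RegClass::Int32 || RC == RegClass::Float32) ? 32u : 64u;
}

// True for the floating-point classes.
constexpr bool isFloat(RegClass RC) {
  return RC == RegClass::Float32 || RC == RegClass::Float64;
}

// Picks a PTX mov mnemonic for copying Src into Dest.
// Returns std::nullopt when the widths differ, since a plain register
// copy cannot change the number of bits.
std::optional<std::string> selectCopy(RegClass Dest, RegClass Src) {
  if (widthInBits(Dest) != widthInBits(Src))
    return std::nullopt;

  const std::string Width = std::to_string(widthInBits(Dest));
  if (Dest == Src) {
    // Same class: a typed move (.f for float, .u for integer).
    return std::string("mov.") + (isFloat(Dest) ? "f" : "u") + Width;
  }
  // Cross-class, same width: an untyped bit move reinterprets the bits.
  return "mov.b" + Width;
}
```

Under these assumptions, selectCopy(RegClass::Float32, RegClass::Int32) picks an untyped 32-bit move, while an Int64-from-Float32 request yields no instruction at all.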

Event Timeline

jingyue updated this revision to Diff 30991. Jul 29 2015, 11:13 PM
jingyue retitled this revision to [NVPTX] allow register copy between float and int.
jingyue updated this object.
jingyue added a reviewer: jholewinski.
jingyue added subscribers: bruno, llvm-commits.
jholewinski accepted this revision. Aug 1 2015, 3:46 AM
jholewinski edited edge metadata.

This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just use .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the sass level, and I'd like to match the hardware as much as possible (given that we can't just generate sass).

This revision is now accepted and ready to land. Aug 1 2015, 3:46 AM
eliben added a subscriber: eliben. Aug 1 2015, 8:19 AM

> This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just use .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the sass level, and I'd like to match the hardware as much as possible (given that we can't just generate sass).

It's a shame we can't generate SASS ;-) Working around limitations in ptxas makes LLVM's job somewhat miserable here.

>> This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just use .bXX for all registers. Typing is nice for readability, but code gen can be cleaner without it. They all compile down to the same set of registers at the sass level, and I'd like to match the hardware as much as possible (given that we can't just generate sass).
>
> It's a shame we can't generate SASS ;-) Working around limitations in ptxas makes LLVM's job somewhat miserable here.

Oh, I agree 110%. PTX serves a purpose, but it's not very useful in a full compilation chain like what LLVM can provide. But that decision is way above my pay grade.

jingyue closed this revision. Aug 1 2015, 11:02 AM