Fixes PR24303. With Bruno's WIP (D11197) on PeepholeOptimizer, cross-class
register copies (e.g. i32 to f32) become possible. Enhance
NVPTXInstrInfo::copyPhysReg to handle these cases.
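To illustrate the idea, here is a minimal standalone sketch of how a copy between register classes might be chosen. It is not the code in this diff and does not use LLVM's types: `RegClass`, `widthInBits`, `isFloat`, and `selectCopyMnemonic` are hypothetical names invented for the example. It assumes that a copy between classes of the same width can be expressed as an untyped `mov.bNN`, and that a copy between different widths is rejected.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for NVPTX register classes; these are not LLVM types.
enum class RegClass { Int32, Float32, Int64, Float64 };

// Width in bits of a register in the given class.
unsigned widthInBits(RegClass RC) {
  switch (RC) {
  case RegClass::Int32:
  case RegClass::Float32:
    return 32;
  case RegClass::Int64:
  case RegClass::Float64:
    return 64;
  }
  throw std::logic_error("unknown register class");
}

// True for the floating-point classes.
bool isFloat(RegClass RC) {
  return RC == RegClass::Float32 || RC == RegClass::Float64;
}

// Pick a PTX mov mnemonic for copying a register of class Src into Dest.
// Assumption for this sketch: same-class copies use a typed mov, while
// cross-class copies of equal width use the untyped, bit-preserving mov.bNN.
std::string selectCopyMnemonic(RegClass Dest, RegClass Src) {
  if (widthInBits(Dest) != widthInBits(Src))
    throw std::invalid_argument(
        "cannot copy between registers of different widths");

  const std::string Width = std::to_string(widthInBits(Dest));
  if (Dest != Src)
    return "mov.b" + Width; // e.g. i32 to f32: keep the bits, change the class
  return std::string(isFloat(Dest) ? "mov.f" : "mov.u") + Width;
}
```

Under these assumptions, `selectCopyMnemonic(RegClass::Float32, RegClass::Int32)` yields `mov.b32`, whereas a same-class copy such as `Int32` to `Int32` keeps the typed `mov.u32`.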
Event Timeline
This looks reasonable for now. Longer term, I'd like to experiment with doing away with the typed registers completely and just using .bXX for all registers. Typing is nice for readability, but codegen can be cleaner without it. They all compile down to the same set of registers at the SASS level, and I'd like to match the hardware as much as possible (given that we can't just generate SASS).
It's a shame we can't generate SASS ;-) Working around limitations in ptxas makes LLVM's job somewhat miserable here.
Oh, I agree 110%. PTX serves a purpose, but it's not very useful in a full compilation chain like what LLVM can provide. But that decision is way above my pay grade.