I used the implementation for floor instead of round. It also turns
out the OpenCL builtin library wasn't using the round builtin, but
implemented the expanded form.
Looks OK technically. Why is it useful to lower round and floor in terms of trunc? Is trunc somehow more primitive, or more commonly legal? Does the AMDGPU backend use any of this?
We need the round lowering. I think the floor lowering is now dead in the DAG path (but R600 might need it for f64 if that were ever implemented).
I looked through the implementation of lowerIntrinsicRound; it looks good to me.
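For readers following the thread, here is a scalar sketch of what expressing round in terms of trunc looks like. It is an illustration only, not the code from the patch: the function name `roundViaTrunc` is hypothetical, and it assumes round means "round half away from zero", as the C library `round` does.

```cpp
#include <cmath>

// Hypothetical scalar model of a round expansion built on trunc.
// Assumes round-half-away-from-zero semantics (like std::round).
float roundViaTrunc(float x) {
  float t = std::trunc(x);           // integer part, rounded toward zero
  float d = std::fabs(x - t);        // distance from x to its integer part
  float o = std::copysign(1.0f, x);  // +1.0 or -1.0, matching the sign of x
  // Step one unit away from zero when the fraction is at least one half.
  return t + (d >= 0.5f ? o : 0.0f);
}
```

With this shape, a target only needs trunc, fabs, copysign, a compare, and a select to be legal in order to support round.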
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-intrinsic-round.mir | ||
---|---|---|
1 | If the previous version of this file tested intrinsic floor, maybe it makes sense to keep that version (with necessary changes) under a new name, to keep tests for floor? |
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp | ||
---|---|---|
4691 | Not related to your patch, but couldn't this be if (result > src) ? |
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp | ||
---|---|---|
4691 | I don't see why not. I'm guessing this has been blindly copied around since AMDIL times, which didn't have all the compare types. | |
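To make the suggestion concrete, the sketch below compares the two forms on scalars. It assumes the comment at line 4691 refers to a floor expansion of the shape "truncate, then subtract one if the source is negative and not already an integer"; that shape and both function names are assumptions for illustration, not quoted from LegalizerHelper.cpp.

```cpp
#include <cmath>

// Assumed existing shape: two compares combined with a logical and.
float floorTwoCompares(float src) {
  float result = std::trunc(src);
  if (src < 0.0f && src != result)
    result -= 1.0f;
  return result;
}

// Suggested shape: trunc rounds toward zero, so result > src can only
// hold when src is negative and has a fractional part.
float floorOneCompare(float src) {
  float result = std::trunc(src);
  if (result > src)
    result -= 1.0f;
  return result;
}
```

Both forms leave NaN inputs untouched, because every ordered comparison involving NaN is false.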
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-intrinsic-round.mir | ||
---|---|---|
1 | There's already a floor test; we didn't actually need the floor expansion. |