If we're doing a compare that only tests the sign bit and only the sign bit is demanded, we can just bypass the node. This removes one of the blend dependencies in our v2i64->v2f32 uint_to_fp codegen on pre-sse4.2 targets.
Details
Details
Diff Detail
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Comment Actions
Are you OK with the increase in register pressure? I think this is mainly due to SSE41 BLENDVPS being hard coded to using xmm0 for the selection mask.