This attempts to expand the handling for G_FMAXNUM/G_FMINNUM for vector types, which is hopefully fairly straightforward now that fptrunc and fpext are working.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AArch64/fminmax.ll | ||
---|---|---|
271 | It's just uglyness from buildvectors with undef elements. You can see it moving out-of and into the same register: mov s2, v0.s[1] .. mov v0.s[1], v2.s[0] And it shouldn't be setting v0.s[3], as those value are just undef. All that can be cleaned up with other combines, they just haven't been implemented yet (and they resolve around what happens to propagated undef elements, so have some dragons). | |
402 | The calling convention here is very odd. They seem to match between the two, at least. | |
1409 | The SDAG for fp16 without fullfp16 has always scalarized many instructions, and I don't believe anyone has looked into fixing it, as it's fairly low priority. It's worth getting it right with GISel if we can though. |
May I ask you to add :
... --- name: 3xs32_legal_with_full_fp16 alignment: 4 body: | bb.0: liveins: $h0, $h1 %a:_(<3 x s32>) = IMPLICIT_DEF %b:_(<3 x s32>) = IMPLICIT_DEF %minnum:_(<3 x s32>) = G_FMINNUM %a, %b %c:_(<3 x s32>) = COPY %minnum(<3 x s32>) RET_ReallyLR implicit %c
to llvm/test/CodeGen/AArch64/GlobalISel/legalize-fminnum.mir?
LGTM (with other comments addressed), thanks.
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp | ||
---|---|---|
929–934 | General musing out loud, not for this patch: we should find a way to factor out these common vector rules cleanly. |
I've added a test, but it's a little different. I am not a huge fan of tests with IMPLICIT_DEF/undef as they could just be simplified away in the future, and the <3 x s32> needs to go somewhere legal in order to be valid (I think). It adds a test for the legalization that is currently seen when lowering.
General musing out loud, not for this patch: we should find a way to factor out these common vector rules cleanly.