This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Combine concat_vectors of scalars into build_vector.
ClosedPublic

Authored by ab on Apr 9 2015, 6:29 PM.

Details

Summary

Replaces D8884 and D8885, which focused on the wrong issue.

Combine something like:

(v8i8 concat_vectors (v2i8 bitcast (i16)) x4)

into:

(v8i8 (bitcast (v4i16 BUILD_VECTOR (i16) x4)))

Only when the concatenated vector type isn't legal. Also, I'm not sure if the bitcast to an integer scalar (rather than something smarter, say picking FP if all scalars are FP) is an issue?

Diff Detail

Repository
rL LLVM

Event Timeline

ab updated this revision to Diff 23553.Apr 9 2015, 6:29 PM
ab retitled this revision from to [CodeGen] Combine concat_vectors of scalars into build_vector..
ab updated this object.
ab edited the test plan for this revision. (Show Details)
ab added reviewers: RKSimon, chandlerc.
ab added a subscriber: Unknown Object (MLST).
RKSimon edited edge metadata.Apr 10 2015, 8:19 AM

Looks promising, thank you. I agree reducing unnecessary bitcasts (as you said scalar float -> integer ..... vector integer -> float) would make it easier for later optimizations.

ab updated this revision to Diff 23638.Apr 10 2015, 7:29 PM
ab edited edge metadata.
  • Bitcast to FP or integer scalars depending on the most common type in the inputs.
ab added a comment.Apr 10 2015, 7:31 PM

The tests only cover 1 or 2 scalar inputs. More than 2 is more interesting, and is actually a good testcase for D8885, so I'll revive it and add those there.

-Ahmed

chandlerc accepted this revision.Apr 10 2015, 8:10 PM
chandlerc edited edge metadata.

Bunch of nit-picky comments, but feel free to submit this once addressed. None of this seems terribly fundamental or important, and the transform seems quite good.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11510 ↗(On Diff #23638)

Since this is essentially a new function, please use 'DL' to be consistent with the coding conventions.

11516–11524 ↗(On Diff #23638)

Are integer vectors the common case here? I would somewhat imagine they are as they seem the most likely to be widened or narrowed, but I've no idea what tests you're looking at. The rest assumes the default should be integer, but invert it appropriately if that's not the case.

I would construct a fresh UNDEF node here with the integer type so that if we don't need to cast to floating point, we can just skip ahead.

11527–11528 ↗(On Diff #23638)

I'm not sure this is the right tactic. If we have a mixture of floating point and integer inputs, I think we should assume the entire vector is intended to be floating point. We don't form floating point vectors randomly anywhere, but the integers could come from generic loads or some such that had no typed operation. Does that make sense to you? It also simplifies the logic -- you can track whether we see any non-floating-point operands and any floating point operands, and if we see both, bitcast all the inputs.

11532–11534 ↗(On Diff #23638)

Why std::for_each rather than a range based for loop?

I would also test the type of the op before bitcasting it to avoid the (sadly non-trivial) machinery involved in constantly folding no-op bitcasts.

This revision is now accepted and ready to land.Apr 10 2015, 8:10 PM
ab added inline comments.Apr 13 2015, 11:34 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11527–11528 ↗(On Diff #23638)

I sat down and actually thought about this: I agree, if any of the inputs is floating point, all should be, but the real-world code I'm looking at doesn't involve floating point at all, so I might have the wrong reasons?

We can either look at:

  • the vectors: floating point vectors is a strong hint the result type should really be floating point as well, and I think this is what you're saying above.
  • the scalars: I see only one case where a floating point scalar value is "unintentionally" bitcast to a vector, and that's a target with say illegal i64 but legal f64 loads. In that case, it's also a hint that we should prefer FP, but that's more because the integer scalar type is illegal than anything else, really. When the IR does an "intentional" bitcast, the AArch64 "_mixed_" testcase shows another reason this is a good idea: moving the scalars to the vector/FP bank once is better (modulo micro-architectures) than doing all the insertions directly from the GPRs.

My conclusion is that if there's any floating point type anywhere (input or output), everything should be floating point. Thoughts?

I'll commit something like that later, thanks for the review!

11532–11534 ↗(On Diff #23638)

Oof, not sure what happened there, my brain was clearly off when I wrote the patch.

This revision was automatically updated to reflect the committed changes.