Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
llvm/test/Transforms/InstCombine/vscale_sext_and_zext.ll | ||
---|---|---|
43 | Nit: Missing space. |
I think you might need to be a bit more careful to justify these transforms.
LangRef says "If the result value does not fit in the result type, then the result is a poison value." But does that mean the sign bit is guaranteed to be zero?
We could use the max-vscale attribute to determine if the runtime value fits the type. If this cannot be determined, the optimization would be skipped, otherwise we change the type of llvm.vscale to the extended type.
Would that be a suitable approach?
I assume the max-vscale attribute will usually be present? In that case, checking it seems best.
(We might actually want to consider relaxing the LangRef rule; if we can easily determine whether the result fits into a given bitwidth, there isn't really any reason to make llvm.vscale return poison.)
There's instances where max-vscale won't be known at compile time, but this route should allow for the optimizations wherever possible.
I believe in terms of the LangRef, it's still possible for llvm.vscale to return poision, and it could still happen in cases where max-vscale is unknown until runtime. So personally I don't think any adjustment is required.
Just to make sure that I understand the suggestion, do you mean changing the wording in the LangRef that the result is undefined instead of poison? If so, does that mean we could replace llvm.vscale.i8() with undef if max(vscale_range) doesn't fit the i8?
My suggestion is that, for example, @llvm.vscale.i8() returns "vscale & 0xFF". Not really sure if it's worthwhile, but it seems like an unnecessary source of poison.
It would allow folding "trunc vscale". And it might be a little simpler to understand. Either way, not a big deal. :)
Added check for vscale_range attribute before optimisation
If the attribute isn't present, or if the maximum value exceeds the bitwidth of the original instrinsic, the optimization is skipped
Updated .ll test to test the extra logic
llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp | ||
---|---|---|
1371 | This shift can overflow. I'd suggest using Log2_32(MaxVScale) instead. |
This shift can overflow.
I'd suggest using Log2_32(MaxVScale) instead.