Extracting the first lane of a predicate created with the
llvm.get.active.lane.mask intrinsic should give the same codegen as
extracting it from a predicate created with the llvm.aarch64.sve.whilelo
intrinsic, since get.active.lane.mask is lowered to whilelo. This patch
ensures the codegen is the same by recognizing
llvm.get.active.lane.mask as a flag-setting operation in this case.
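
To illustrate why the first lane can be derived without materializing the whole predicate, here is a minimal scalar sketch of the lane-mask semantics. It is a model for illustration only, not code from the patch or from LLVM: the names activeLaneMask and firstLaneActive are hypothetical, and it assumes lane i is active exactly when base + i < n as an unsigned comparison without wrap-around.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model (not LLVM code): lane i is active iff
// base + i < n, compared as unsigned values and without wrapping.
std::vector<bool> activeLaneMask(std::uint64_t base, std::uint64_t n,
                                 std::size_t lanes) {
  assert(lanes > 0 && "a predicate has at least one lane");
  std::vector<bool> mask(lanes, false);
  // Number of iterations still to run; zero if base is already at or past n.
  const std::uint64_t remaining = base < n ? n - base : 0;
  for (std::size_t i = 0; i < lanes; ++i)
    mask[i] = i < remaining;
  return mask;
}

// Lane 0 depends only on the scalar comparison base < n, so in this model
// it can be computed without building the rest of the mask.
bool firstLaneActive(std::uint64_t base, std::uint64_t n) { return base < n; }
```

In this model, activeLaneMask(base, n, lanes)[0] always equals firstLaneActive(base, n), which is the property that lets the first-lane extract be answered from the result of the mask-generating operation itself.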
Event Timeline
LGTM! Thanks for the codegen improvement @RosieSumpter. :)
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp, line 14654:
nit: Could you add a short comment here explaining that get_active_lane_mask is lowered to a whilelo instruction?