By default, fixed-width i1 vectors are promoted to a wider element type.
When SVE is available, however, some expression trees can be rewritten to
use <vscale x M x i1> types so that all operations are performed on
predicate registers, avoiding unnecessary sign-extends and truncates.
The patch does this by bubbling the 'sign-extend + extract' operations up
the tree until they reach nodes that can be performed on SVE predicate
registers.
The example chosen in this patch is an OR reduction of an <N x i1>
vector, which can be implemented directly with a PTEST instruction.
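
For illustration, an input of roughly the following shape contains the
kind of OR reduction described above. This is a minimal sketch, not a
test from the patch: the function name, the 16-lane width, and the use
of an icmp to produce the mask are all assumptions made for the example.

```llvm
; Hypothetical input; the name and the <16 x i8> operand type are
; illustrative choices, not taken from the patch.
define i1 @reduce_or_v16i1(<16 x i8> %a) {
  ; The compare yields a fixed-width <16 x i1> mask.
  %mask = icmp ne <16 x i8> %a, zeroinitializer
  ; OR all lanes of the mask together into a single i1.
  %any = call i1 @llvm.vector.reduce.or.v16i1(<16 x i1> %mask)
  ret i1 %any
}

declare i1 @llvm.vector.reduce.or.v16i1(<16 x i1>)
```

Compiling something like this with `llc -mtriple=aarch64 -mattr=+sve`
should show whether the reduction stays in predicate registers; the
exact instruction sequence depends on the LLVM version.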
This patch is a rework of D117574.
This feels weird; we're adding nodes to the worklist without having actually done any transforms. Maybe this should be somewhere else?