When computing the scalar cost of a predicated block, the loop vectorizer by default treats the predication as a sign of if-conversion and divides the block's scalar cost by 2, on the assumption that the block executes only half the time. However, this makes no sense when the predication was introduced to tail-predicate the loop, since such a block still executes on every scalar iteration.
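To make the intent concrete, here is a minimal sketch of the costing rule described above. It is not the vectorizer's actual code: `BlockInfo`, `PredicationKind`, `kReciprocalPredBlockProb` and `scalarLoopCost` are hypothetical names invented for illustration, and the only thing taken from the summary is the divide-by-2 discount for if-converted blocks.

```cpp
#include <vector>

// Hypothetical: why a block would need a mask once the loop is vectorized.
enum class PredicationKind { None, IfConverted, TailFolded };

// Hypothetical, simplified view of a basic block for costing purposes.
struct BlockInfo {
  unsigned ScalarCost;   // summed scalar cost of the block's instructions
  PredicationKind Kind;  // reason (if any) the block is predicated
};

// Assumption from the summary: an if-converted block runs half the time.
constexpr unsigned kReciprocalPredBlockProb = 2;

// Sum the per-iteration scalar cost of the loop body.
unsigned scalarLoopCost(const std::vector<BlockInfo> &Blocks) {
  unsigned Cost = 0;
  for (const BlockInfo &BB : Blocks) {
    unsigned BlockCost = BB.ScalarCost;
    // Discount only blocks guarded by a real condition in the scalar loop.
    // A block predicated purely for tail folding runs on every scalar
    // iteration, so it keeps its full cost.
    if (BB.Kind == PredicationKind::IfConverted)
      BlockCost /= kReciprocalPredBlockProb;
    Cost += BlockCost;
  }
  return Cost;
}
```

For example, with block costs of 10 (if-converted), 10 (tail-folded) and 4 (unpredicated), the sketch gives 5 + 10 + 4 = 19, whereas halving every predicated block would give 14 and understate the scalar loop's cost.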
Can you add the test separately and then just have the diff of the cost-model change in this patch?
Are the lines actually auto-generated? Looks like they are just a subset of the debug output?
nit: are dso_local, nocapture, readnone and local_unnamed_addr actually needed?
Are all those attributes needed? Can you limit them to the minimum required?
Yep, that's the plan. It will need a number of other patches though. I think one for adding a cost to predicated blocks is important, along with the one for costing masked loads more correctly under MVE.
Thanks. We need to be careful with this, as it has the potential to cause some regressions, especially since we (here in MVE land) make heavy use of predication nowadays. I am working through some of the details.