The accumulator value always seems to be ignored when creating
non-strict VECREDUCE_FADD/FMUL ISD nodes. This patch fixes that.
Imho we should definitely make this change, and consider also making it for undef -- it seems like a bad idea to special-case it, because it may negatively affect optimization.
If there are concerns about backwards compatibility for experimental intrinsics, we should rename and autoupgrade them. (Upgrade fadd to fadd.acc, either with original argument or 0.0 argument, depending on whether it is fast).
@nikic thanks for pointing me to that discussion! I clearly misread the LangRef for this change :)
I agree it makes more sense to change these intrinsics and to make its accumulator argument always relevant (regardless of what flags are set) and AutoUpgrading older IR. Given that I'm currently working on this, I'd be happy to move this forward with patches and a proposal/discussion on the mailing list to change the experimental reduction intrinsics. @aemerson you expressed an intention to work on it later this year, do you have any objection to me moving forward with this now?
If you're going down the auto-upgrade route then I suggest proposing that we promote these from experimental to first class intrinsics. That way you can auto-upgrade form one intrinsic to another without any risk of breaking older code (i.e. you can't just start using an accumulator arg that before could be unused and therefore undef).
I'm not sure I agree - we shouldn't drop the experimental state unless we're certain the intrinsic is not going to need to be further tweaked in the future. I still think not including an accumulator argument at all would be for the best.
There needs to be some way to determine when it's safe to upgrade the IR, either through a name change or some other mechanism. The old implementations which pass undef/whatever as an accumulator can't start being compiled to code making use of it.
If we don't include an accumulator, then we have to split the intrinsics into strictly ordered and non-strictly ordered. But that's what the fast math flag is for on the call, so the flag becomes useless.