This patch introduces an "ftz" function attribute and uses it to enable
vectorization for ARM Neon when -ffast-math is not specified. It would be nicer
to encode FTZ as part of FastMathFlags, but we've run out of space there.
If this approach looks workable, I'll change the NVPTX backend to use this
backend-independent "ftz" attribute instead of its custom "nvptx-f32ftz"
attribute, and I'll add an entry to the LangRef.
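
For reviewers less familiar with why FTZ matters here: Neon on 32-bit ARM
flushes denormals to zero, so a vectorized loop can produce different results
from the scalar IEEE code unless the function has opted in to FTZ. The sketch
below only emulates that difference in portable C++. The helper names
(flush_to_zero, mul_ieee, mul_ftz) are hypothetical and are not part of this
patch or of LLVM, and it assumes the code is built without -ffast-math and
with denormals enabled on the host.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: emulate flush-to-zero by replacing a denormal
// (subnormal) value with a zero of the same sign. Other values pass through.
inline float flush_to_zero(float x) {
    return std::fpclassify(x) == FP_SUBNORMAL ? std::copysign(0.0f, x) : x;
}

// Scalar IEEE multiply: a denormal result is preserved.
inline float mul_ieee(float a, float b) {
    return a * b;
}

// Emulated FTZ multiply: denormal inputs and the denormal result are
// flushed to zero, which is how an FTZ vector unit would behave.
inline float mul_ftz(float a, float b) {
    return flush_to_zero(flush_to_zero(a) * flush_to_zero(b));
}
```

Multiplying the smallest normal float by 0.5f gives a denormal under
mul_ieee and zero under mul_ftz. That mismatch is what the vectorizer has to
be told is acceptable, which is the role the "ftz" attribute plays in this
patch.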