This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Codegen FP16 vmaxnm/vminnm scalar instructions
ClosedPublic

Authored by SjoerdMeijer on Mar 20 2018, 3:29 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

SjoerdMeijer created this revision.Mar 20 2018, 3:29 AM

Hi Sjoerd,

Looking at the vminmaxnm-safe.ll test file, there are many other condition code tests and patterns. Are these also relevant to FP16 as well?

cheers,
sam

Hi Sam, thanks for checking this and pointing this out. These instructions should be selected when 'fast math' is enabled. Tests in vminmaxnm-safe.ll are checking that this won't happen when fast math is not enabled, which I forgot to add. So I've added new file fp16-vminmaxnm-safe.ll. I've also moved the actual codegen tests to new file fp16-vminmaxnm.ll, because some cases were missing and there are a lot of them.
Cheers.

samparker added inline comments.Apr 12 2018, 1:51 AM
test/CodeGen/ARM/fp16-vminmaxnm-safe.ll
154 ↗(On Diff #142123)

Could you school an FP noob and explain why we don't expect two min instructions?

SjoerdMeijer updated this revision to Diff 142178.EditedApr 12 2018, 7:55 AM

Could you school an FP noob and explain why we don't expect two min instructions?

The VMINNM and VMAXNM variants handles NaNs in consistence with the IEEE754-2008 spec, the variants without NM apparently not.

Long story short, the initial selection dag of the "safe" tests contain nodes:

t14: f16 = fminnum t10, ConstantFP:f16<APFloat(18944)>
  t18: f16 = fminnan t14, ConstantFP:f16<APFloat(20544)>

whereas the "fast math" tests contain:

t14: f16 = fminnum t10, ConstantFP:f16<APFloat(18944)>
  t24: f16 = fminnum t14, ConstantFP:f16<APFloat(20544)>

as the "safe" test contains 1 fminnum node and 1 fminnan, we expect only 1 vminnm.

I forgot to mention, that due to your last comment, I had a closer look at the generated code and spotted a missed optimisation. Therefore I've also added a new pattern, as it was not always generating a vmin/vmax pattern. I've changed the "safe" tests to check the actual expected pattern; before the tests were not really great because they were not testing the full pattern.

samparker added a comment.EditedApr 13 2018, 12:12 AM

Hi Sjoerd,

Would you mind generating the tests with the llc_update script please? In their current state, they're not showing that codegen is correct. And thanks for the schooling!

cheers,

Thanks, I've updated the tests with more checks.

samparker accepted this revision.Apr 13 2018, 8:31 AM

LGTM, cheers!

This revision is now accepted and ready to land.Apr 13 2018, 8:31 AM
This revision was automatically updated to reflect the committed changes.