This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add initialization of MXCSR in llvm-exegesis
ClosedPublic

Authored by pengfei on Nov 30 2019, 9:50 PM.

Details

Summary

This patch is used to initialize the new added register MXCSR.

Event Timeline

pengfei created this revision.Nov 30 2019, 9:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2019, 9:50 PM

I'm not sure I understand why this needs to be initialized. Why don't we need to do it for FPCW?

llvm/tools/llvm-exegesis/lib/X86/Target.cpp
506

Isn't the default value of MXCSR 0x1f80 not 0x1f8?

pengfei updated this revision to Diff 231613.Dec 1 2019, 1:56 AM
pengfei marked an inline comment as done.

Fix typo.

llvm/tools/llvm-exegesis/lib/X86/Target.cpp
506

Yes! I missed an 0. Thanks!

I'm not sure I understand why this needs to be initialized. Why don't we need to do it for FPCW?

I'm sorry for not giving comments. Yes, we do need for FPCW if this is a reasonable approach. But I'm not sure if we need to explicatly initialize registers like MXCSR and FPCW or not either.
Can we just return a MCInstBuilder(X86::NOOP) for them to make the check pass (In Assembler.cpp L44, it judges the IsSnippetSetupComplete by checking if the return value is empty).

courbet accepted this revision.Dec 2 2019, 12:07 AM

I'm not sure I understand why this needs to be initialized. Why don't we need to do it for FPCW?

We make sure that every register that is used by an instruction in the snippet is initialized. This is to avoid having fluctuations in measurements due to performance depending on values in registers. I think it's great if SSE/AVX instructions start explicitly state their deps on MXCSR, because the behaviour does indeed depend on the value of these flags.

Regarding whether masking instruction is the right thing to do: Yes for now, because this will ensure that these instructions will at least pass, but on the other hand that might not be perfectly representative of the real performance characteristics. AFAICT @gchatelet still wants to work on value constraints & exploration.

So this looks good to me if Craig has no other remarks.

This revision is now accepted and ready to land.Dec 2 2019, 12:07 AM
This revision was automatically updated to reflect the committed changes.

We make sure that every register that is used by an instruction in the snippet is initialized. This is to avoid having fluctuations in measurements due to performance depending on values in registers. I think it's great if SSE/AVX instructions start explicitly state their deps on MXCSR, because the behaviour does indeed depend on the value of these flags.

Thanks for the explanation. Please note that currently we only state the rounding modes and the IEEE masks bits of MXCSR. This means instructions like VRSQRT14PS, which only have dependence on FTZ and DAZ, wouldn't be modeled.

Please note that currently we only state the rounding modes and the IEEE masks bits of MXCSR. This means instructions like VRSQRT14PS, which only have dependence on FTZ and DAZ, wouldn't be modeled.

FTZ and DAZ are modeled now by commit rGc8995de06994. Thanks.

Regarding whether masking instruction is the right thing to do: Yes for now, because this will ensure that these instructions will at least pass, but on the other hand that might not be perfectly representative of the real performance characteristics. AFAICT @gchatelet still wants to work on value constraints & exploration.

Yes definitely. LGTM as well.