This is an archive of the discontinued LLVM Phabricator instance.

[llvm-exegesis] Proposal to add Exegesis Configuration to td files.
Abandoned Public

Authored by gchatelet on Jul 31 2018, 6:54 AM.

Details

Summary

This patch is not to be submitted as is. I'd like to get some input first, but starting with some code will help.
The latency of some instructions depends on the values of their operands (e.g. DIV on NaN or subnormal operands takes longer to execute than on normal numbers).
For llvm-exegesis to explore the latency space of the instructions it needs some special knowledge of the instruction.
I envision that the best place to store this knowledge is the td file.

Let's take the following div example: by choosing special starting values, we can make sure that even with 100000 iterations divss will always stay in the normal range.

movss xmm0, 0x3F800000 // 1.0f
movss xmm1, 0x3F800001 // 1.0f + 1ulp
.rept 100000
divss xmm0, xmm1
.endr
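
The claim that the loop never leaves the normal range can be checked on the host with a small emulation. This is only a sketch: `floatFromBits` and `staysNormal` are hypothetical helpers, not part of llvm-exegesis, and it assumes IEEE-754 single precision with the default rounding mode and no fast-math flags.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as an IEEE-754 float.
inline float floatFromBits(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Hypothetical helper: emulates "divss xmm0, xmm1" repeated Iterations
// times (Dividend = Dividend / Divisor) and reports whether every
// intermediate result is a normal number.
inline bool staysNormal(float Dividend, float Divisor, unsigned Iterations) {
  for (unsigned I = 0; I < Iterations; ++I) {
    Dividend /= Divisor;
    if (!std::isnormal(Dividend))
      return false;
  }
  return true;
}
```

Calling `staysNormal(floatFromBits(0x3F800000), floatFromBits(0x3F800001), 100000)` corresponds to the snippet above, while a pair such as (1.0f, 0.0f) leaves the normal range on the first iteration.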

So an exegesis configuration for divss would be (1.0f, 1.0f + ulp).
Another could be (1.0f, 0.0f) to test infinity generation.
Another could be (NaN, Inf) to test the impact of NaNs.

The configuration should allow specifying bit-precise values and packing them in the case of vector instructions.
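
As an illustration of what "bit-precise and packed" could mean, here is a minimal sketch that packs four single-precision bit patterns into the 16-byte image of a 128-bit register. `packLanes` is a hypothetical helper and the layout (lane 0 in the lowest bytes, little-endian within a lane) is an assumption matching x86, not something the patch defines.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: packs four 32-bit patterns into a 16-byte register
// image. Lane 0 occupies bytes 0..3; bytes within a lane are little-endian.
inline std::array<std::uint8_t, 16>
packLanes(const std::array<std::uint32_t, 4> &Lanes) {
  std::array<std::uint8_t, 16> Image{};
  for (std::size_t Lane = 0; Lane < Lanes.size(); ++Lane)
    for (std::size_t Byte = 0; Byte < 4; ++Byte)
      Image[4 * Lane + Byte] =
          static_cast<std::uint8_t>(Lanes[Lane] >> (8 * Byte));
  return Image;
}
```

Working on raw bit patterns rather than `float` values keeps NaN payloads and the distinction between 1.0f and 1.0f + 1ulp intact, which a decimal literal in a configuration file might not.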

This configuration should work for all floating point types (including X87) and for as many arithmetic operations as possible (div, mul, sqrt, ...).

It can be part of the Instruction Records or separated as it is in this Patch.

Addendum

I started a document to gather the different strategies (feel free to comment).
From this it becomes clear that we only need to annotate an Instruction with its semantics; llvm-exegesis would then run strategies depending on the semantics and the type of the arguments (floating point precision, packed/scalar).
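
A rough sketch of how such a mapping could look, using only the three division configurations from the summary. Every name here (`Semantic`, `Precision`, `OperandConfig`, `strategiesFor`) is hypothetical and is not the API of this patch; the bit patterns are the standard IEEE-754 encodings.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types; not the patch's actual API.
enum class Semantic { Div };
enum class Precision { Single, Double };

struct OperandConfig {
  std::string Name;
  std::uint64_t Dividend; // bit pattern, zero-extended to 64 bits
  std::uint64_t Divisor;  // bit pattern, zero-extended to 64 bits
};

struct Patterns {
  std::uint64_t One, OnePlusUlp, Zero, Inf, QuietNaN;
};

// IEEE-754 encodings of the special values for each precision.
inline Patterns patternsFor(Precision P) {
  if (P == Precision::Single)
    return {0x3F800000u, 0x3F800001u, 0x0u, 0x7F800000u, 0x7FC00000u};
  return {0x3FF0000000000000ull, 0x3FF0000000000001ull, 0x0ull,
          0x7FF0000000000000ull, 0x7FF8000000000000ull};
}

// Picks operand configurations from the semantic and the precision alone.
inline std::vector<OperandConfig> strategiesFor(Semantic S, Precision P) {
  const Patterns V = patternsFor(P);
  switch (S) {
  case Semantic::Div:
    return {{"normal", V.One, V.OnePlusUlp},
            {"infinity", V.One, V.Zero},
            {"nan", V.QuietNaN, V.Inf}};
  }
  return {};
}
```

With this shape, the td file would only need to carry the semantic tag, and the per-type bit patterns would live in the tool.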

I'll update the code with a simpler proposal.

Event Timeline

gchatelet created this revision. Jul 31 2018, 6:54 AM
gchatelet updated this revision to Diff 158244. Jul 31 2018, 6:59 AM
  • remove unrelated files
gchatelet updated this revision to Diff 158245. Jul 31 2018, 7:00 AM
  • remove unrelated files
gchatelet updated this revision to Diff 158249. Jul 31 2018, 7:06 AM
  • Added documentation
gchatelet retitled this revision from "Proposal to add Exegesis Configuration to td files." to "[llvm-exegesis] Proposal to add Exegesis Configuration to td files.". Aug 1 2018, 3:02 AM
gchatelet edited the summary of this revision.
gchatelet updated this revision to Diff 158512. Aug 1 2018, 5:45 AM
  • Simpler version with only the semantic of the instruction.

I'm not sure this version will scale: we need to spell out every instruction we want to support.
The nice thing is that it's pretty orthogonal to the rest of the td files.

Any suggestions?

I'm worried about the scalability of this as well. How much can you achieve just by using the domain of the instruction?

By domain, do you mean the register classes of the operands?
If so, then not much: DIVPDrr, for instance, has the following input operands:

dag InOperandList = (ins VR128:$src1, VR128:$src2);

VR128 does not tell how many values are packed, nor whether they are half, float, double, or integer values.
Sure, you can infer it from the name of the instruction, but that is not going to scale to other architectures.

For llvm-exegesis to do a good job, we need to add some semantics to the instruction description.

I meant the X86 execution domain info:

X86InstrInfo::getExecutionDomain(const MachineInstr &MI) const {

uint16_t domain = (MI.getDesc().TSFlags >> X86II::SSEDomainShift) & 3;
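
For readers unfamiliar with this idiom, the expression extracts a 2-bit field from the instruction's target-specific flags, so it can distinguish at most four domains. A standalone sketch follows; the shift value is a placeholder, not the real `X86II::SSEDomainShift`, and `executionDomain` is a hypothetical free function rather than the LLVM member.

```cpp
#include <cstdint>

// Placeholder shift for illustration only; NOT the real X86II value.
constexpr unsigned HypotheticalDomainShift = 8;

// Extracts the 2-bit domain field located at bit position Shift.
inline std::uint16_t
executionDomain(std::uint64_t TSFlags,
                unsigned Shift = HypotheticalDomainShift) {
  return static_cast<std::uint16_t>((TSFlags >> Shift) & 3);
}
```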

Ah, I see. Well, it's very specific to X86 and would not work for other architectures.
Also, it still doesn't tell much about the semantics of the instruction, which are needed to come up with good testing strategies.

Ultimately, I'd like llvm-exegesis to explore instructions and report the ones whose execution time varies. Two use cases for this:

  • Timing attack identification,
  • Arithmetic performance: floating point operations on subnormal / Inf / NaN may be slower than on normal floats. For instance, llvm-exegesis currently reports unrealistic latencies for division (VDIVPDrr is 172 cycles on Skylake).
gchatelet abandoned this revision.Oct 9 2018, 1:49 AM