Page MenuHomePhabricator

Add llvm-exegesis tool.
ClosedPublic

Authored by courbet on Mar 15 2018, 7:47 AM.

Details

Summary

[llvm-exegesis][RFC] Automatic Measurement of Instruction Latency/Uops

This is the code corresponding to the RFC "llvm-exegesis Automatic Measurement of Instruction Latency/Uops".

The RFC is available on the LLVM mailing lists as well as the following document
for easier reading:
https://docs.google.com/document/d/1QidaJMJUyQdRrFKD66vE1_N55whe0coQ3h1GpFzz27M/edit?usp=sharing

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Eugene.Zelenko added inline comments.Mar 16 2018, 6:16 AM
tools/llvm-exegesis/Uops.cpp
209 ↗(On Diff #138678)

It's better to specify type explicitly, since it's not obvious. Same in other places.

210 ↗(On Diff #138678)

Brackets could be omitted.

214 ↗(On Diff #138678)

Brackets could be omitted.

223 ↗(On Diff #138678)

Brackets could be omitted.

Eugene.Zelenko added inline comments.Mar 16 2018, 6:18 AM
tools/llvm-exegesis/BenchmarkResult.h
16 ↗(On Diff #138678)

Final underscore is not needed. Same in other places.

courbet updated this revision to Diff 138698.Mar 16 2018, 7:05 AM
courbet marked 7 inline comments as done.
  • Split llvm-exegesis into lib and main.
  • Add unit tests
  • Adress Eugene's comments.
gbedwell added inline comments.
docs/CommandGuide/llvm-exegesis.rst
1 ↗(On Diff #138698)

I think you'll need to add a reference to this in CommandGuide/index.rst to prevent llvm-sphinx-docs from complaining (see rL326999)

courbet updated this revision to Diff 138721.Mar 16 2018, 8:59 AM
courbet marked an inline comment as done.
  • do not fail to compile if libpfm is not available.
  • Add doc in index.
courbet updated this revision to Diff 138883.Mar 19 2018, 1:21 AM
  • clang-format
courbet updated this revision to Diff 138889.Mar 19 2018, 3:20 AM
  • A few more style fixes.

Some initial style comments - braces, assertion messages, StringRef and for-loop cleanup in particular need addressing throughout.

tools/llvm-exegesis/lib/BenchmarkRunner.cpp
19 ↗(On Diff #138889)

Move to header?

21 ↗(On Diff #138889)

Move to header?

tools/llvm-exegesis/lib/InMemoryAssembler.cpp
54 ↗(On Diff #138889)
if (!PI->getNormalCtor()) {
  llvm::errs() << " cannot create pass: " << PI->getPassName() << "\n";
  return true;
}
llvm::Pass *P = PI->getNormalCtor()();
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
29 ↗(On Diff #138889)

(style) Remove braces where you can

55 ↗(On Diff #138889)
Var.IsDef = IsDef;
Var.IsUse = !IsDef;
72 ↗(On Diff #138889)

use llvm::any_of ?

86 ↗(On Diff #138889)

(style)

for (size_t I = 0, E = InstrInfo.getNumOperands(); I <E; ++I) {
tools/llvm-exegesis/lib/Latency.cpp
46 ↗(On Diff #138889)

(style) Add assertion message as well as condition

tools/llvm-exegesis/lib/LlvmState.cpp
46 ↗(On Diff #138889)

Put all getters in the headers?

tools/llvm-exegesis/lib/OperandGraph.h
59 ↗(On Diff #138889)

Should these be pass by value or ref?

tools/llvm-exegesis/lib/PerfHelper.cpp
79 ↗(On Diff #138889)

Return StringRef?

tools/llvm-exegesis/lib/PerfHelper.h
45 ↗(On Diff #138889)

Return StringRef?

55 ↗(On Diff #138889)

Return StringRef?

unittests/tools/llvm-exegesis/BenchmarkResultTest.cpp
37 ↗(On Diff #138889)

StringRef?

courbet updated this revision to Diff 139088.Mar 20 2018, 2:05 AM
courbet marked 9 inline comments as done.
  • Address Simon's comments.
courbet added inline comments.Mar 20 2018, 2:12 AM
tools/llvm-exegesis/lib/BenchmarkRunner.cpp
19 ↗(On Diff #138889)

I'm using the dtor as vtable anchor.

21 ↗(On Diff #138889)

ditto.

tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
55 ↗(On Diff #138889)

We're actually conditionally updating the Var values here.

72 ↗(On Diff #138889)

We're returning Var here, so I'm not sure I can.

tools/llvm-exegesis/lib/OperandGraph.h
59 ↗(On Diff #138889)

Node is very small, we decided to pass by value.

Matt added a subscriber: Matt.Mar 20 2018, 3:46 AM
docs/ReleaseNotes.rst
47 ↗(On Diff #139088)

Please insert empty line above and use :doc: to refer to documentation. See D44636 as example.

courbet updated this revision to Diff 139253.Mar 21 2018, 12:20 AM
courbet marked an inline comment as done.
  • Better llvm-exegesis doc.
docs/ReleaseNotes.rst
47 ↗(On Diff #139088)

Thanks for the pointer.

spatel added a subscriber: spatel.Mar 22 2018, 9:45 AM

Any further comments ?

It'd be interesting to get some opinions from non-X86 teams, whether the current architecture can be made to work for ARM/PPC/MIPS etc.?

tools/llvm-exegesis/lib/InMemoryAssembler.cpp
54 ↗(On Diff #138889)

Why not this?

if (!PI->getNormalCtor()) {
  llvm::errs() << " cannot create pass: " << PI->getPassName() << "\n";
  return true;
}
llvm::Pass *P = PI->getNormalCtor()();
95 ↗(On Diff #140045)

All these braces (for/if) can go.

138 ↗(On Diff #140045)

Remove braces from single line if()

tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
151 ↗(On Diff #140045)

Do you need the const on the ArrayRef? I don't think you can change anything - that's what MutableArrayRef is for.

159 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < E; ++I) {
230 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < E; ++I) {
tools/llvm-exegesis/lib/LlvmState.h
32 ↗(On Diff #140045)

Return StringRef instead for these?

tools/llvm-exegesis/lib/Uops.cpp
101 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < e; ++I) {
courbet updated this revision to Diff 140430.Mar 30 2018, 8:00 AM
courbet marked 4 inline comments as done.
  • Address Simon's comments.
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
151 ↗(On Diff #140045)

Thanks for pointing that out. Our internal ArrayRef (Span) can change its extent if not its elements. Fixed here and in 14 other places.

159 ↗(On Diff #140045)

The compiler should be able to inline these, right ?

230 ↗(On Diff #140045)

ditto

tools/llvm-exegesis/lib/Uops.cpp
101 ↗(On Diff #140045)

ditto

RKSimon added inline comments.Mar 30 2018, 8:11 AM
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
159 ↗(On Diff #140045)
courbet updated this revision to Diff 140433.Mar 30 2018, 8:25 AM
courbet marked an inline comment as done.
  • Always cache for loop end iterators.

Hi Clement,

Thanks for the tool. Could you please rebase the patch?

I finally managed to test the tool on my linux machine. However, I was unable to get some numbers on AMD cpus.

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.

I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.

So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

About the patch:
I noticed that you added your name in most TODO comments. Example "// TODO(courbet): Handle memory.". Please remove your name, and possibly make some of those comments more descriptive.

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

The dependency is optional. People who do not have libpfm get an error message when trying to use llvm-exegesis for now. Libpfm is supposed to be available on windows, though I have absolutely no knowledge of how LLVM finds its dependencies on windows. I don;t think it's an issue in the short term since CPUs (hopefully :)) behave the same under all OSes, but I agree that it would be a ice to have.

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.
So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

Definitely, see my answer above.

courbet updated this revision to Diff 140752.Apr 3 2018, 3:57 AM
  • Address Andrea's comments.
andreadb accepted this revision.Apr 3 2018, 3:57 AM

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

Cool. Thanks! I like the idea :-).

I don't have other problems with the tool.

If you fix the TODO comments, then it looks good to me.

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

The dependency is optional. People who do not have libpfm get an error message when trying to use llvm-exegesis for now. Libpfm is supposed to be available on windows, though I have absolutely no knowledge of how LLVM finds its dependencies on windows. I don;t think it's an issue in the short term since CPUs (hopefully :)) behave the same under all OSes, but I agree that it would be a ice to have.

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.
So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

Definitely, see my answer above.

This revision is now accepted and ready to land.Apr 3 2018, 3:57 AM

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

Cool. Thanks! I like the idea :-).

I don't have other problems with the tool.

If you fix the TODO comments, then it looks good to me.

Great, thanks for the review Andrea !

I plan to submit this today unless there are more comments.

RKSimon accepted this revision.Apr 4 2018, 1:02 AM

LGTM - I'm keen to see this tested/working for a wider range of x86 (btver2, slm, znver1 etc.) and other targets but this is a good first step.

LGTM - I'm keen to see this tested/working for a wider range of x86 (btver2, slm, znver1 etc.) and other targets but this is a good first step.

Thanks Simon, we'll definitely address this.

Closed by commit rL329156: Add llvm-exegesis tool. (authored by courbet, committed by ). · Explain WhyApr 4 2018, 1:16 AM
This revision was automatically updated to reflect the committed changes.

Looks like CMake is not checking for external dependencies properly. I got build error on RHEL 6 because of missing perfmon/perf_event.h. Most likely perfmon version is too old on RHEL6.

I found /usr/lib/libpfm.so.3.10.0

Hi, I am failing to build llvm-exegesis' unit tests when llvm is a part of another project. I have posted a fix here: D45328. Can you take a look? Thanks.