This is an archive of the discontinued LLVM Phabricator instance.

Add llvm-exegesis tool.
ClosedPublic

Authored by courbet on Mar 15 2018, 7:47 AM.

Details

Summary

[llvm-exegesis][RFC] Automatic Measurement of Instruction Latency/Uops

This is the code corresponding to the RFC "llvm-exegesis Automatic Measurement of Instruction Latency/Uops".

The RFC is available on the LLVM mailing lists as well as the following document
for easier reading:
https://docs.google.com/document/d/1QidaJMJUyQdRrFKD66vE1_N55whe0coQ3h1GpFzz27M/edit?usp=sharing

Diff Detail

Event Timeline

courbet created this revision.Mar 15 2018, 7:47 AM
courbet updated this revision to Diff 138573.Mar 15 2018, 9:07 AM

clang-format the patch

Please run Clang-format and Clang-tidy modernize.

tools/llvm-exegesis/llvm-exegesis.rst should be moved to docs/Command. Please mention new tool in Release Notes.

tools/llvm-exegesis/BenchmarkRunner.cpp
24

Please use = default;

tools/llvm-exegesis/InMemoryAssembler.cpp
37

Please move static definitions out of anonymous namespace.

41

static should be used instead of anonymous namespace for functions in LLVM. Same in other places.

159

Please use = default;

198

Unnecessary comment.

tools/llvm-exegesis/InstructionSnippetGenerator.cpp
50

static should be used instead of anonymous namespace for functions in LLVM. Same in other places.

tools/llvm-exegesis/Latency.cpp
27

static should be used instead of anonymous namespace for functions in LLVM. Same in other places.

43

Please use = default;

tools/llvm-exegesis/X86.cpp
16

static should be used instead of anonymous namespace for functions in LLVM. Same in other places.

tools/llvm-exegesis/llvm-exegesis.rst
16 ↗(On Diff #138573)

Unnecessary empty line.

tools/llvm-exegesis/BenchmarkRunner.h
20

Headers order is till wrong (C++ headers should be after LLVM ones). Please remove empty lines between header groups. Same for other files.

Thanks for sharing this - this looks like a really promising tool.
I wonder if you have thought about adding unit/regression tests? Having them would make it quite a bit easier for others to contribute to its development.
Off the top of my head - to be able to unit/regression test this, you'd need a mock/fake "libpfm-like" implementation to make the test not rely on the results coming from whatever machine you're running on?
If that's a direction you want to take, designing that in earlier may be easier than retrofitting it in later?

tools/llvm-exegesis/README.txt
1 ↗(On Diff #138573)

A README file doesn't seem necessary given llvm-exegesis.rst exists?

Thanks for sharing this - this looks like a really promising tool.
I wonder if you have thought about adding unit/regression tests? Having them would make it quite a bit easier for others to contribute to its development.

Sure, we do have unit tests (they are gmock unit tests, not lit tests). I did not include them in this diff to make review more manageable, but I'll add them now if you want.

Thanks for sharing this - this looks like a really promising tool.
I wonder if you have thought about adding unit/regression tests? Having them would make it quite a bit easier for others to contribute to its development.

Sure, we do have unit tests (they are gmock unit tests, not lit tests). I did not include them in this diff to make review more manageable, but I'll add them now if you want.

I had expected something based on gmock unit tests to be needed here.
I think it could help the review as the unit tests might give further guidance to how the tool works by documenting examples of inputs and expected outputs?
I probably won't be the one to be able to review all of this - but at least I would find it helpful to see the unit tests as part of trying to review at least part of this.
So, yes, please do add them :)

courbet updated this revision to Diff 138678.Mar 16 2018, 3:53 AM
courbet marked 11 inline comments as done.
  • Address Eugene's comments.
  • Remove README
tools/llvm-exegesis/README.txt
1 ↗(On Diff #138573)

Right, this slipped in by mistake.

Eugene.Zelenko added inline comments.Mar 16 2018, 6:16 AM
tools/llvm-exegesis/InMemoryAssembler.cpp
9

Please separate with empty line.

tools/llvm-exegesis/InstructionSnippetGenerator.cpp
9

Please separate with empty line.

Eugene.Zelenko added inline comments.Mar 16 2018, 6:16 AM
tools/llvm-exegesis/Uops.cpp
209

It's better to specify type explicitly, since it's not obvious. Same in other places.

210

Brackets could be omitted.

214

Brackets could be omitted.

223

Brackets could be omitted.

Eugene.Zelenko added inline comments.Mar 16 2018, 6:18 AM
tools/llvm-exegesis/BenchmarkResult.h
16

Final underscore is not needed. Same in other places.

courbet updated this revision to Diff 138698.Mar 16 2018, 7:05 AM
courbet marked 7 inline comments as done.
  • Split llvm-exegesis into lib and main.
  • Add unit tests
  • Adress Eugene's comments.
gbedwell added inline comments.
docs/CommandGuide/llvm-exegesis.rst
2

I think you'll need to add a reference to this in CommandGuide/index.rst to prevent llvm-sphinx-docs from complaining (see rL326999)

courbet updated this revision to Diff 138721.Mar 16 2018, 8:59 AM
courbet marked an inline comment as done.
  • do not fail to compile if libpfm is not available.
  • Add doc in index.
courbet updated this revision to Diff 138883.Mar 19 2018, 1:21 AM
  • clang-format
courbet updated this revision to Diff 138889.Mar 19 2018, 3:20 AM
  • A few more style fixes.

Some initial style comments - braces, assertion messages, StringRef and for-loop cleanup in particular need addressing throughout.

tools/llvm-exegesis/lib/BenchmarkRunner.cpp
19 ↗(On Diff #138889)

Move to header?

21 ↗(On Diff #138889)

Move to header?

tools/llvm-exegesis/lib/InMemoryAssembler.cpp
54 ↗(On Diff #138889)
if (!PI->getNormalCtor()) {
  llvm::errs() << " cannot create pass: " << PI->getPassName() << "\n";
  return true;
}
llvm::Pass *P = PI->getNormalCtor()();
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
29 ↗(On Diff #138889)

(style) Remove braces where you can

55 ↗(On Diff #138889)
Var.IsDef = IsDef;
Var.IsUse = !IsDef;
72 ↗(On Diff #138889)

use llvm::any_of ?

86 ↗(On Diff #138889)

(style)

for (size_t I = 0, E = InstrInfo.getNumOperands(); I <E; ++I) {
tools/llvm-exegesis/lib/Latency.cpp
46 ↗(On Diff #138889)

(style) Add assertion message as well as condition

tools/llvm-exegesis/lib/LlvmState.cpp
46 ↗(On Diff #138889)

Put all getters in the headers?

tools/llvm-exegesis/lib/OperandGraph.h
59 ↗(On Diff #138889)

Should these be pass by value or ref?

tools/llvm-exegesis/lib/PerfHelper.cpp
79 ↗(On Diff #138889)

Return StringRef?

tools/llvm-exegesis/lib/PerfHelper.h
45 ↗(On Diff #138889)

Return StringRef?

55 ↗(On Diff #138889)

Return StringRef?

unittests/tools/llvm-exegesis/BenchmarkResultTest.cpp
37 ↗(On Diff #138889)

StringRef?

courbet updated this revision to Diff 139088.Mar 20 2018, 2:05 AM
courbet marked 9 inline comments as done.
  • Address Simon's comments.
courbet added inline comments.Mar 20 2018, 2:12 AM
tools/llvm-exegesis/lib/BenchmarkRunner.cpp
19 ↗(On Diff #138889)

I'm using the dtor as vtable anchor.

21 ↗(On Diff #138889)

ditto.

tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
55 ↗(On Diff #138889)

We're actually conditionally updating the Var values here.

72 ↗(On Diff #138889)

We're returning Var here, so I'm not sure I can.

tools/llvm-exegesis/lib/OperandGraph.h
59 ↗(On Diff #138889)

Node is very small, we decided to pass by value.

Matt added a subscriber: Matt.Mar 20 2018, 3:46 AM
docs/ReleaseNotes.rst
45

Please insert empty line above and use :doc: to refer to documentation. See D44636 as example.

courbet updated this revision to Diff 139253.Mar 21 2018, 12:20 AM
courbet marked an inline comment as done.
  • Better llvm-exegesis doc.
docs/ReleaseNotes.rst
45

Thanks for the pointer.

spatel added a subscriber: spatel.Mar 22 2018, 9:45 AM

Any further comments ?

It'd be interesting to get some opinions from non-X86 teams, whether the current architecture can be made to work for ARM/PPC/MIPS etc.?

tools/llvm-exegesis/lib/InMemoryAssembler.cpp
95 ↗(On Diff #140045)

All these braces (for/if) can go.

138 ↗(On Diff #140045)

Remove braces from single line if()

54 ↗(On Diff #138889)

Why not this?

if (!PI->getNormalCtor()) {
  llvm::errs() << " cannot create pass: " << PI->getPassName() << "\n";
  return true;
}
llvm::Pass *P = PI->getNormalCtor()();
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
151 ↗(On Diff #140045)

Do you need the const on the ArrayRef? I don't think you can change anything - that's what MutableArrayRef is for.

159 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < E; ++I) {
230 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < E; ++I) {
tools/llvm-exegesis/lib/LlvmState.h
32 ↗(On Diff #140045)

Return StringRef instead for these?

tools/llvm-exegesis/lib/Uops.cpp
101 ↗(On Diff #140045)
for (size_t I = 0, E = Vars.size(); I < e; ++I) {
courbet updated this revision to Diff 140430.Mar 30 2018, 8:00 AM
courbet marked 4 inline comments as done.
  • Address Simon's comments.
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
151 ↗(On Diff #140045)

Thanks for pointing that out. Our internal ArrayRef (Span) can change its extent if not its elements. Fixed here and in 14 other places.

159 ↗(On Diff #140045)

The compiler should be able to inline these, right ?

230 ↗(On Diff #140045)

ditto

tools/llvm-exegesis/lib/Uops.cpp
101 ↗(On Diff #140045)

ditto

RKSimon added inline comments.Mar 30 2018, 8:11 AM
tools/llvm-exegesis/lib/InstructionSnippetGenerator.cpp
159 ↗(On Diff #140045)
courbet updated this revision to Diff 140433.Mar 30 2018, 8:25 AM
courbet marked an inline comment as done.
  • Always cache for loop end iterators.

Hi Clement,

Thanks for the tool. Could you please rebase the patch?

I finally managed to test the tool on my linux machine. However, I was unable to get some numbers on AMD cpus.

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.

I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.

So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

About the patch:
I noticed that you added your name in most TODO comments. Example "// TODO(courbet): Handle memory.". Please remove your name, and possibly make some of those comments more descriptive.

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

The dependency is optional. People who do not have libpfm get an error message when trying to use llvm-exegesis for now. Libpfm is supposed to be available on windows, though I have absolutely no knowledge of how LLVM finds its dependencies on windows. I don;t think it's an issue in the short term since CPUs (hopefully :)) behave the same under all OSes, but I agree that it would be a ice to have.

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.
So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

Definitely, see my answer above.

courbet updated this revision to Diff 140752.Apr 3 2018, 3:57 AM
  • Address Andrea's comments.
andreadb accepted this revision.Apr 3 2018, 3:57 AM

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

Cool. Thanks! I like the idea :-).

I don't have other problems with the tool.

If you fix the TODO comments, then it looks good to me.

The other problem is the dependency with libpfm. At the moment, this is not a big deal for people that work on linux. I don't know if there is an equivalent library for windows. However, if I remember correctly, libpfm builds on top of the linux perf events. Is there a plan to get rid of that dependency?

The dependency is optional. People who do not have libpfm get an error message when trying to use llvm-exegesis for now. Libpfm is supposed to be available on windows, though I have absolutely no knowledge of how LLVM finds its dependencies on windows. I don;t think it's an issue in the short term since CPUs (hopefully :)) behave the same under all OSes, but I agree that it would be a ice to have.

Overally, I really like the tool and the goals that it plans to achieve.
At the moment, my concern is that the tool is limited to a very specific system and CPU family.
So, I am okay with this patch as long as there is a plan to remove the dependency with cpu vendor specific hardware perf events (at least in the medium term).

Definitely, see my answer above.

This revision is now accepted and ready to land.Apr 3 2018, 3:57 AM

The major problem I see is that the tool expects perf event "UNHALTED_CORE_CYCLES" to be always available. However, that event is Intel specific.
I tried to manually change it to something meaningful for my system. However, it would be really nice if we find a way to abstract "generic perf events".
For example, we could have the "generic cpu clock" event, which targets/subtargers can define as a alias for a specific hw event type. So, on Intel chips, that "generic cpu clock" event would be wired to UNHALTED_CORE_CYCLES. Is there a plan for doing it?

If you look at the code where the event is defined, there is a TODO to get the name of the event from the sched model (or resource Resource Unit for uops). Our plan is to add the name of the counters to use as an optional property of the SchedModel in TD files.

Cool. Thanks! I like the idea :-).

I don't have other problems with the tool.

If you fix the TODO comments, then it looks good to me.

Great, thanks for the review Andrea !

I plan to submit this today unless there are more comments.

RKSimon accepted this revision.Apr 4 2018, 1:02 AM

LGTM - I'm keen to see this tested/working for a wider range of x86 (btver2, slm, znver1 etc.) and other targets but this is a good first step.

LGTM - I'm keen to see this tested/working for a wider range of x86 (btver2, slm, znver1 etc.) and other targets but this is a good first step.

Thanks Simon, we'll definitely address this.

This revision was automatically updated to reflect the committed changes.

Looks like CMake is not checking for external dependencies properly. I got build error on RHEL 6 because of missing perfmon/perf_event.h. Most likely perfmon version is too old on RHEL6.

I found /usr/lib/libpfm.so.3.10.0

Hi, I am failing to build llvm-exegesis' unit tests when llvm is a part of another project. I have posted a fix here: D45328. Can you take a look? Thanks.