This is an archive of the discontinued LLVM Phabricator instance.

[Clang][WIP] Don't call distributeTypeAttrsFromDeclarator() on empty list.
AbandonedPublic

Authored by mboehme on Jun 23 2022, 6:08 AM.

Details

Reviewers
None
Summary

This check was present before https://reviews.llvm.org/D126061. I eliminated it
as part of that patch because it seemed superfluous; the check has no influence
on behavior. However, as @yurai007 points out in
https://reviews.llvm.org/D128097, this change may have had a negative impact on
compile time, so I'm adding the check back in.

Diff Detail

Event Timeline

mboehme created this revision.Jun 23 2022, 6:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2022, 6:08 AM
mboehme requested review of this revision.Jun 23 2022, 6:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2022, 6:08 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Marking as {WIP] for the time being because I'm uploading this change for discussion.

nikic added a subscriber: nikic.Jun 23 2022, 8:06 AM

Not seeing a statistically significant impact from this change: http://llvm-compile-time-tracker.com/compare.php?from=8b6f69a4da5baaf3748798a84dd16a2481b7ca7f&to=797ba50f5fd88017925fe765427b1f5f136c3310&stat=instructions Maybe it's an improvement at the 0.05% level, but it may just as well be noise. As such, I probably wouldn't bother with the change.

mboehme added a comment.EditedJun 24 2022, 1:09 AM

Not seeing a statistically significant impact from this change: http://llvm-compile-time-tracker.com/compare.php?from=8b6f69a4da5baaf3748798a84dd16a2481b7ca7f&to=797ba50f5fd88017925fe765427b1f5f136c3310&stat=instructions Maybe it's an improvement at the 0.05% level, but it may just as well be noise. As such, I probably wouldn't bother with the change.

Thanks for performing the measurement for me!

Just a question for my understanding before I abandon the change: Shouldn't the "instructions" count be relatively resistant to noise? (I was assuming this is based on performance counters? What would cause noise in this metric?)

Edit: As an empirical experiment, I ran perf stat -e instructions,cpu-cycles multiple times on the same Clang invocation. (I assume perf is what the tracker uses to obtain the numbers it reports?) I did indeed observe variations in the instruction count from run to run, with a spread of about 0.15% between the smallest and largest value. The cycle count was really noisy (as you already pointed out to me on https://reviews.llvm.org/D128097), with a spread of over 7%.

I found these docs that state that the retired instruction counter can be incremented, among other things, on hardware interrupts and page faults. I assume this is one source for the variation?

I had previously assumed that instruction counts would be identical from run to run because that is what I observed with Callgrind (which is what I've been using to do most of my profiling). I assume that Callgrind uses some other mechanism, not based on hardware performance counters, to count instructions?

Anyway, given that the measured difference for this patch is below the noise level, I will abandon this change.

mboehme abandoned this revision.Jun 24 2022, 1:26 AM
nikic added a comment.Jun 24 2022, 1:41 AM

Just a question for my understanding before I abandon the change: Shouldn't the "instructions" count be relatively resistant to noise? (I was assuming this is based on performance counters? What would cause noise in this metric?)

There are two primary sources of noise. The first is kernel activity, as the instructions metric is a combined userspace and kernel metric. I do collect userspace-only as well (this is instructions:u). On that metric this does look like a small improvement just above the noise floor: http://llvm-compile-time-tracker.com/compare.php?from=8b6f69a4da5baaf3748798a84dd16a2481b7ca7f&to=797ba50f5fd88017925fe765427b1f5f136c3310&stat=instructions:u

The other is ASLR. If ASLR is disabled, the instructions:u measurements become perfectly noise-free. This is actually bad, because a lot of changes will then show statistically significant and perfectly reproducible compile-time changes that are ultimately just down to how exactly the address space gets laid out. So this is intentional noise :)

And yes, execution statistics are collected using perf. callgrind uses a CPU simulator instead, which is a much more controlled environment...

Just a question for my understanding before I abandon the change: Shouldn't the "instructions" count be relatively resistant to noise? (I was assuming this is based on performance counters? What would cause noise in this metric?)

There are two primary sources of noise. The first is kernel activity, as the instructions metric is a combined userspace and kernel metric. I do collect userspace-only as well (this is instructions:u).

Ah -- that makes sense.

On that metric this does look like a small improvement just above the noise floor: http://llvm-compile-time-tracker.com/compare.php?from=8b6f69a4da5baaf3748798a84dd16a2481b7ca7f&to=797ba50f5fd88017925fe765427b1f5f136c3310&stat=instructions:u

So based on that, would you recommend that I un-abandon this revision after all and submit it for review?

The other is ASLR. If ASLR is disabled, the instructions:u measurements become perfectly noise-free. This is actually bad, because a lot of changes will then show statistically significant and perfectly reproducible compile-time changes that are ultimately just down to how exactly the address space gets laid out. So this is intentional noise :)

And yes, execution statistics are collected using perf. callgrind uses a CPU simulator instead, which is a much more controlled environment...

Thanks for the clarification. I learned a lot about performance measurement from this discussion!

Indeed, measuring performance is tricky and chasing slow downs can be challenging especially those at the level of ~0.2%-0.1% or less.
Personally, every time when I need to use perf on Clang I follow those short guidelines to reduce noise and get relevant numbers: https://llvm.org/docs/Benchmarking.html#linux
It contains some extra information about controlling mentioned ASLR (via *randomize_va_space), SMT, cpu isolation and more.

Thanks for the pointer -- I wasn't aware of that document yet!