Page MenuHomePhabricator

Speedup llvm-dwarfdump 3.9x
ClosedPublic

Authored by jankratochvil on Aug 22 2020, 1:17 PM.

Details

Summary

Currently strace llvm-dwarfdump x.debug >/tmp/file:

ioctl(1, TCGETS, 0x7ffd64d7f340)        = -1 ENOTTY (Inappropriate ioctl for device)
write(1, "           DW_AT_decl_line\t(89)\n"..., 4096) = 4096
ioctl(1, TCGETS, 0x7ffd64d7f400)        = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(1, TCGETS, 0x7ffd64d7f410)        = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(1, TCGETS, 0x7ffd64d7f400)        = -1 ENOTTY (Inappropriate ioctl for device)

After this patch:

write(1, "0000000000001102 \"strlen\")\n     "..., 4096) = 4096
write(1, "site\n                  DW_AT_low"..., 4096) = 4096
write(1, "d53)\n\n0x000e4d4d:       DW_TAG_G"..., 4096) = 4096

The same speedup can be achieved by --color=0 but that is not much convenient.
After this patch it still uses ioctl for printing DWARF parsing errors.
Maybe raw_ostream &OS could be moved into DIDumpOptions DumpOpts which would somehow make this patch more simple, I can investigate it more and change it upon request.
I can also drop this patch if it looks too hacky.

Diff Detail

Event Timeline

jankratochvil created this revision.Aug 22 2020, 1:17 PM
jankratochvil requested review of this revision.Aug 22 2020, 1:17 PM

Thanks for bringing this up!

I believe llvm-dwarfdump isn't the only thing that prints using colours. I think we need a more general solution than this.

Also, perhaps this should be done by the stream itself - it seems like the stream knows whether or not it supports colours, so should we not just have the stream ignore attempts to use colours if the underlying thing doesn't support them?

jankratochvil planned changes to this revision.Aug 24 2020, 12:26 AM

Also, perhaps this should be done by the stream itself - it seems like the stream knows whether or not it supports colours, so should we not just have the stream ignore attempts to use colours if the underlying thing doesn't support them?

OK, I agree, I will move it there. That should also solve the caching problem.

If I understand the patch correctly, the speedup is gained by caching the mode instead of trying to recompute it every time we call colorsEnabled()? This seems like a huge speedup for such a small thing, maybe I'm missing something?

Also, perhaps this should be done by the stream itself - it seems like the stream knows whether or not it supports colours, so should we not just have the stream ignore attempts to use colours if the underlying thing doesn't support them?

We already disable colors when they're not supported by the stream.

OK, I agree, I will move it there. That should also solve the caching problem.

I don't think we should move it into the stream, if that's what you mean by "there". The color cl::opt is part of WithColor so we should check its value there.

Can we achieve the same thing by resolving the ColorMode to Enable and Disable in the WihtColor constructor?

If I understand the patch correctly, the speedup is gained by caching the mode instead of trying to recompute it every time we call colorsEnabled()? This seems like a huge speedup for such a small thing, maybe I'm missing something?

Also, perhaps this should be done by the stream itself - it seems like the stream knows whether or not it supports colours, so should we not just have the stream ignore attempts to use colours if the underlying thing doesn't support them?

We already disable colors when they're not supported by the stream.

Maybe I misunderstood what this bit of the strace output was complaining about then:
ioctl(1, TCGETS, 0x7ffd64d7f340) = -1 ENOTTY (Inappropriate ioctl for device)
(I have never used strace, so was only guessing based on its general appearance). I took it to mean that an attempt was made to change the colour, but failed because files don't support different colours, and that this failure is expensive.

OK, I agree, I will move it there. That should also solve the caching problem.

I don't think we should move it into the stream, if that's what you mean by "there". The color cl::opt is part of WithColor so we should check its value there.

Can we achieve the same thing by resolving the ColorMode to Enable and Disable in the WihtColor constructor?

If I understand the patch correctly, the speedup is gained by caching the mode instead of trying to recompute it every time we call colorsEnabled()?

Yes.

This seems like a huge speedup for such a small thing, maybe I'm missing something?

Detecting whether fd 1 supports colors means an ioctl() call which is a kernel call. That is several orders of magnitue more expensive than any in-process codeflow and in-process calculations. This is on Linux, I did not check OSX.

I don't think we should move it into the stream, if that's what you mean by "there". The color cl::opt is part of WithColor so we should check its value there.

I plan to make raw_ostream to have auto/on/off mode (default=auto). WithColor's ctor will set the mode of the raw_ostream. raw_ostream will do the expensive ioctl() detection only during first time changing color if raw_ostream has auto mode and raw_ostream will cache the detected value.

(Not yet written, busy with DWZ stuff, thanks for asking.)

Can we achieve the same thing by resolving the ColorMode to Enable and Disable in the WithColor constructor?

No because WithColor is ctored/dtored many times for every DIE. This is what results in so many expensive ioctl()s.

It could be done in WithColor if its Mode is changed to a static member but then that static member would be set according to current raw_ostream and it would be incorrect if WithColor's static member is later used with different raw_ostream.

Maybe I misunderstood what this bit of the strace output was complaining about then:
ioctl(1, TCGETS, 0x7ffd64d7f340) = -1 ENOTTY (Inappropriate ioctl for device)
(I have never used strace, so was only guessing based on its general appearance). I took it to mean that an attempt was made to change the colour, but failed because files don't support different colours, and that this failure is expensive.

Expensive is the detection. Color output is just printing some bytes like "\x1B[0;36m".

joerg added a subscriber: joerg.Aug 25 2020, 4:59 AM

Can you change raw_fd_ostream::has_colors to memoize the result of FileDescriptorHasColors(FD) instead? That should give most of the performance gain without breaking the abstractions as much.

Can you change raw_fd_ostream::has_colors to memoize the result of FileDescriptorHasColors(FD) instead?

Done, yes, that looks simple and it works. The new code is not thread-safe but other member variables there also are not.

joerg added inline comments.Aug 25 2020, 10:18 AM
llvm/include/llvm/Support/raw_ostream.h
298

I'd keep the original const in the interface contract and mark HasColors as mutable instead. It's an implementation detail that the field is set lazily and the property itself should be invariant.

jankratochvil marked an inline comment as done.
jhenderson accepted this revision.Aug 25 2020, 11:58 PM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 25 2020, 11:58 PM
labath accepted this revision.Aug 26 2020, 1:25 AM
This revision was automatically updated to reflect the committed changes.