As the first step, this patch highlights some elements: registers and immediate values,
and some symbol names.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 37021 Build 37020: arc lint + arc unit
Event Timeline
- Replace blue with cyan since it may be hard to read on some environments (for example, "Pro" profile of macOS Terminal.app).
i think the title of the diff should be adjusted a bit: 's/llvm-objcopy/llvm-objdump/g'
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
336 | --color is a more common option name: ls, less, grep, zstd, etc Can you investigate a bit if it is possible to have both --color and --color=when ? |
Do you have a design or spec for what you consider to be "syntax highlighting?" From the early example you show for coloring literals, registers, etc., I don't see the benefit. For the current output, you can tell literals from registers, right? After all, the assembler is able to re-parse them in.
And in general, I dislike color markups in output as they always end up relying on some system/platform screen/color support. For instance, I don't think your highlighting will work on my OpenVMS x86 system.
If you haven't already, take a look at WithColor.h. I believe it allows you to stream things in a specific colour properly, including with disabling the feature via --color. From the WithColor class description, you can use it like WithColor(outs(), raw_ostream::CYAN) << "some text"; to print something to the stream in cyan.
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
2 ↗ | (On Diff #211432) | REQUIRES usually go first in the output. |
4 ↗ | (On Diff #211432) | Move this comment to near the top of the file, and make it a bit more verbose, so that it describes the test as a whole. Take a look at recent new tests in various of the LLVM tools for examples. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
336 | LLVM already has a --color option (see for example llvm-dwarfdump output, WithColor::warning() and error() calls etc etc). I'm inclined to think that syntax highlighting can be on by default and just use --color=false (or whatever it is) to turn it off. If we go that route, the --color option should be documented in the llvm-objdump command guide, and also we should check it is in the tool help text (since it's in a different option category). |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
1 ↗ | (On Diff #211432) | Try and avoid adding a binary if you can. There's no particular reason this can't use yaml2obj to do things. Just add a .text section with appropriate content. |
5 ↗ | (On Diff #211432) | I might be missing something, but how does this show that the syntax highlighting works? Is it cross-platform? |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
803–804 | How about instead of "HeaderLength" this is called "InstructionCol". The thing you're measuring isn't really a "Header". | |
1227 | Use ArrayRef here. |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
4 ↗ | (On Diff #211432) | We usually start comment from ## nowadays. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
350 | It is probably not clear what is Reset. Can it be Default? | |
354 | Make it const? | |
368 | It is generally not a good practice to do a lookup twice. (Also I do not think you will be able to use operator[] after making HighlightColors const) | |
1230 | You use BeforeText only once, so you do not need it it seems (just inline it). | |
1244 | Seems this would better look as ifs: ObjdumpHighlightColor Color; if (Span.Type == MarkupType::Imm) Color = ObjdumpHighlightColor::Immediate; else if (Span.Type == MarkupType::Reg) Color = ObjdumpHighlightColor::Register; else Color = ObjdumpHighlightColor::Reset; |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
336 | It looks lib/Support/WithColor.cpp#L106 already handles this... bool WithColor::colorsEnabled() { if (DisableColors) return false; if (UseColor == cl::BOU_UNSET) return OS.has_colors(); // e.g. is a terminal return UseColor == cl::BOU_TRUE; } | |
345 | Maybe just ObjdumpColor. | |
369 | Use WithColor::changeColor |
The original rough design is described in this document.
In addition to this patch, I plan to implement support for symbol names, relocations, and ascii strings in .rodata, and hopefully, highlight lines by instruction types (for example, highlight jmp instructions with green just like radare2 theme). As a handy (not feature-rich like radare2) disassembler, simple syntax highlighting feature is attractive at least to me.
After all, the assembler is able to re-parse them in.
I'm sorry but I can't see your point. Syntax highlighting is not for programs, but for humans to improve readability by emphasizing contexts.
And in general, I dislike color markups in output as they always end up relying on some system/platform screen/color support. For instance, I don't think your highlighting will work on my OpenVMS x86 system.
I agree with you that platform-independence is important too. I don't have experience with OpenVMS at all so could you explain why this patch won't work on it? This patch uses llvm::sys::Process::OutputColor internally (through raw_fd_ostream::changeColor) to avoid being platform-specific.
llvm/include/llvm/Support/WithColor.h | ||
---|---|---|
102 | Changes to WithColor.cpp and WithColor.h should be a separate patch, but before creating that I'd like to hear your comments. | |
llvm/test/tools/llvm-objdump/highlight.test | ||
5 ↗ | (On Diff #211432) | Oh I should have added shell to REQUIRES. This test makes sure that some elements in the disassembly are highlighted by checking ANSI escape sequences. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
336 |
What do you think about the compatibility with GNU objdump? Making highlighting enabled by default is incompatible behavior with GNU objcopy. Using with objdump in script (e.g., llvm-objdump -d foo | grep bar) won't be broken (it checks if stdout is a tty) so it won't be problem I think. | |
336 |
Did you mean whether can we allow both --color and --color=true in command-line options or not and how it is handled in colorsEnabled? |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
4 ↗ | (On Diff #212532) | # UNSUPPORTED: system-windows # REQUIRES: shell, x86-registered-target |
8 ↗ | (On Diff #212532) | llvm-objdump -d --no-show-raw-insn --color %t | tr -d '\033' | FileCheck %s --check-prefix=ENABLED |
15 ↗ | (On Diff #212532) | # ENABLED-NEXT: 619 movl 2099697([0;36m%rip[0m), [0;36m%edx[0m |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
2416 | Place extern cl::opt<cl::boolOrDefault> UseColor; at toplevel. ColorsEnabled = UseColor == cl::BOU_UNSET; may be simpler |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
8 ↗ | (On Diff #212532) | It looks much better. Thanks! |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
4 ↗ | (On Diff #212532) | Do we need both UNSUPPORTED: system-windows and REQUIRES: shell? I'd think the latter effectively implies the former (but could be wrong). |
1 ↗ | (On Diff #211432) | I'm being dumb. This should probably just be an assembly file and use llvm-mc to compile it. That'll make the input text more readable too. Also, the test should be in the X86 directory. |
1 ↗ | (On Diff #212692) | What is "it" here? Perhaps make this more verbose: "Show that llvm-objdump highlights ... by default." |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
336 |
Compatibility is only important for things that can affect parsers (though I have no idea why anybody would be parsing disassembly...). If this can't affect parser behaviour, then I don't think we need to worry about it. If it can, then it should be off by default. | |
369 |
This is marked as done, but I don't see it being used? | |
369–371 | Rather than static casting things to int, for the assert, compare the enum values directly: assert(Color != ObjdumpColor::Sentinel) The whole point of a enum class is that you can't implicitly convert from/to them. You shouldn't be casting them ever. Also add a message to the assert i.e. assert(... && "Color cannot be Sentinel");. This all becomes moot if you follow this suggestion: instead of HighlightColors being an array, it should probably be a std::unordered_map or similar with ObjdumpColor as the key. That way you don't need the cast for the lookup, although you will need to check against HighlightColors.end(). |
llvm/test/tools/llvm-objdump/highlight.test | ||
---|---|---|
4 ↗ | (On Diff #212532) | Basically, REQUIRES: shell implies that the test environment is not Windows, but it looks that setting $LIT_USE_INTERNAL_SHELL forces running tests on Windows that requires shell [1]. Namely, UNSUPPORTED: system-windows disables this test on Windows even if $LIT_USE_INTERNAL_SHELL is set. I'm not familiar with Windows, but I think this it makes sense because I suppose that SetConsoleTextAttribute API does not emit ANSI escape sequences. |
1 ↗ | (On Diff #211432) | The input file should be an executable to test symbol highlighting because currently only RIP-relative symbols are supported. That said, I think this YAML lacks readability too. Do you have any idea? |
1 ↗ | (On Diff #212692) |
"it" referred to llvm-objdump. Fixed the comment. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
336 | I think so too (nobody parse the disassembly output written to a tty not a pipe, I believe). Do you think syntax highlighting should be enabled by default? This current implementation enables the feature only if --color is explicitly specified (ColorsEnabled = UseColor == cl::BOU_TRUE;). | |
369 | I've replaced outs().changeColor with WithColor. | |
369–371 | I used an array since it should be faster lookup but it was a premature optimization. I've replaced with std::map instead of std::unordered_map here. Enum class as key of std::unordered_map is not supported because of "the specified hash does not meet the Hash requirements" (I must have misunderstood something... it is supported in C++14). |
llvm/test/tools/llvm-objdump/X86/highlight.test | ||
---|---|---|
1 ↗ | (On Diff #213841) | Let's rename this test "disassembly-highlighting.test" (or whatever best matches the naming style of other disassembly tests in llvm-objdump). In particular, saying "disassemble" or "disassembly" or similar in the test name is important. On that note, this comment should say "when disassembling" somewhere. |
3 ↗ | (On Diff #213841) | You can remove "and the X86 disassembler" now. |
44 ↗ | (On Diff #213841) |
I don't I'm afraid. You might be able to simulate it using an assembly file with a single text section still. If you do stick with the YAML, I think you probably have too much content in your sections. You only check for 3 instructions, so you can reduce the bytes in the content down to those few. You can also significantly simplify much of this object. You probably don't need either .data and .bss. You probably don't need the section alignment fields. You probably can simplify the addresses to round numbers. You may not need all those symbols either (if I'm reading the checks correctly, only 'foo' is important). You don't need the entry label. Basically, you probably are better off starting with an empty YAML and building it up to the bare minimum for the test rather than using obj2yaml on a binary you want to simulate. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
336 | Yes, I think syntax highlighting can be on by default (it is for llvm-dwarfdump for example). |
- Enable syntax highlighting by default.
- Added WithColor::colorsAvailable
- I'll submit this as a separate patch if it's a good approach.
- Replaced the lengthy YAML with a hand-assembled shorter one.
llvm/include/llvm/Support/WithColor.h | ||
---|---|---|
97 | I think this needs rephrasing. How about "Determine whether the specified stream supports colors."? | |
llvm/lib/Support/WithColor.cpp | ||
99 | How about naming this colorsEnabled too? | |
102 | I'm not sure this clause makes sense to me? If the stream doesn't have colours, saying colours are available for it because somebody has specified --color seems wrong... Note that colorsEnabled() handles this case via DisableColors, I believe. | |
llvm/test/tools/llvm-objdump/X86/disassembly-highlighting.test | ||
3 | The comment says "by default", but the test doesn't show the default behaviour! It only shows behaviour when --color or --color=false is specified. |
llvm/include/llvm/Support/WithColor.h | ||
---|---|---|
97 | That sounds not sufficient to me. colorsAvailable returns true if the specified stream supports colors or --color=1 is given. As you suggested in another comment, I think adding another colorsEnabled() would be better: /// Determine whether colors are displayed. bool colorsEnabled(); static bool colorsEnabled(raw_ostream &OS); | |
llvm/lib/Support/WithColor.cpp | ||
102 |
That makes sense. I'll rename it to colorsEnabled.
DisableColors is not related to the --color option: it disables colors regardless of the option: /// @param DisableColors Whether to ignore color changes regardless of -color /// and support in OS WithColor(raw_ostream &OS, HighlightColor S, bool DisableColors = false) |
llvm/test/tools/llvm-objdump/X86/disassembly-highlighting.test | ||
---|---|---|
3 | Good catch! I'll fix this. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
366 | IIUC, using WithColor to do nested coloring doesn't really work -- ~WithColor resets the terminal color to plain, *not* to what the color was before. So if you have some stream of characters like: In practice, I'm not sure any assembly actually has anything after nested spans -- the end of any nested span will line up with the span it's nested in. Still, it seems fragile/something that should be checked for somehow. Is this something you're handling somewhere that I'm just missing? |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
366 | Good point. As you said, this patch doesn't support such a case. A good example is memory operand: movq <imm:$1>, <mem:(<reg:%rax>, <reg:%rbx>)> ^ ^ Change color. WithColor resets the color here. I'll consider implementing a new class WithHighlightColor, which behaves like WithColor, but it saves/restores colors using a stack or something. |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
---|---|---|
350 | = 0 can be deleted. You don't use its integral value. |
- Added WithNestedColor to support nested spans.
- See https://reviews.llvm.org/D65191#inline-594208 about the nested spans.
- Addressed other review comments.
llvm/include/llvm/Support/WithColor.h | ||
---|---|---|
97 | Updated to "Determine whether colors are displayed..." instead of "... stream supports colors" because even if the stream supports colors, it can be disabled by -color=false. | |
llvm/test/tools/llvm-objdump/X86/disassembly-highlighting.test | ||
3 | I just remembered that I added --color intentionally because stdout is not a tty. I've updated the comment: "... by default" → "... if colors are enabled". |
llvm/lib/Support/WithColor.cpp | ||
---|---|---|
102 | Under what situations can OS.has_colors() return false? I'm not convinced we want to return true if a stream doesn't have colours, even if the switch has been specified. | |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
368 | a -> A Would it make sense for this code to exist in WithColor.h/.cpp? In fact, I wonder if this should be used by WithColor directly, i.e. have WithColor restore the previous state by default, rather than the plain version. | |
370 | Consider replacing the pair with a simple struct e.g. ColorState. That'll make the code using it much clearer. | |
397 | destructed -> destroyed |
llvm/lib/Support/WithColor.cpp | ||
---|---|---|
102 | In UNIX,OS.has_colors() returns false if either the stdout is not a tty (say, pipe) or the it does not support colors. Even if it does not support colors, it's sometimes useful to enable colors with the switch. For example, in disassembly-highlighting.test, since the stdout of llvm-objdump is a pipe to FileCheck, OS.has_colors() will return false. However, in order to check that the output includes colors, we need a way to force colors by the switch. | |
llvm/tools/llvm-objdump/llvm-objdump.cpp | ||
368 |
I think so too. I'll submit this part as a separate patch.
That sounds good to me. I'll try it. |
I think this needs rephrasing. How about "Determine whether the specified stream supports colors."?