Add support for syntax highlighting assembly. The current patch uses a different color to highlight registers, immediate and addresses.
I like how the first patch merges the "markup" and the "highlighting": that's something I considered myself. I'll need to take a more thorough look tomorrow to understand the purpose of the span and stack it uses. At least for the arm64 backend I don't think that's necessary as the start and end are clearly delineated.
Could be enum class instead of namespace.
Some targets may want to highlight other operand types [that are unique to the target] and / or using different color set. Consider Hexagon: one might want to hightlight braces and other punctuators like '++', for example. Can this method be made AArch64-specific?
It might make sense to use RAII object instead of resetting the style to None every time.
Overall this is a very cool idea. It does seem like it will need sanity checking for each architecture that enables it, but that doesn't mean opting in can't be as simple as a single bool somewhere.
I applied this and the lldb change and used lldb/test/Shell/Commands/command-disassemble-aarch64-extensions.s to test it since it's got most of the possible syntax in it.
First thing I notice is that in something like smmla v1.4s, v16.16b, v31.16b we don't treat the .16b as part of the operand. Same with /m and friends for SVE. I thought this would be bad but in fact I like it. It makes it clear what is register and what is format/mode so no one is thinking that p0/m is literally a register called p0/m, it's p0 with the /m mode for disabled lanes.
One strange thing was in one instruction the immediate is a different colour between objdump and lldb. Objump:
38: 54000490 bc.eq <red>0xc8</red> <lbl>
test.o[0x38] <+56>: bc.eq <yellow>0xc8</yellow>
Then I realised that this is because lldb doesn't resolve the branch target to a label because the function ends before it would see it. so it's still an immediate operand. Objdump finds the target which makes it an address. If I add a nop after the label so that the label is part of the chunk lldb looks at, I get:
test.o[0x38] <+56>: bc.eq <yellow>0xc8</yellow> ; <+200>
So FYI in case you also stumble into that.
I unified coloring and markup and used a WithColor-like RAII object. I think compared to the existing code, it improves the readability a lot, while simultaneously completely abstracting the coloring.
Perhaps we can add an option to llvm-mc, similar to --mdis, and tests near test/MC/Disassembler/AArch64/marked-up.txt.
Then keep very simple tests in llvm-objdump to test that coloring can be enabled.
Limit formatted_raw_ostream workaround to disassembling with color only. Apparently this changes output for certain tests. The formatted_raw_ostream class has several issues related to buffering, but that's orthogonal to this patch.
When testing something that is format sensitive, we want one test to test every space and typically use --strict-whitespace --match-full-lines, e.g. llvm/test/tools/llvm-readobj/ELF/relocations.test
I wonder whether this is sufficient to cover changes? Things like printImmSVE seem uncovered?
Document this a bit in the description?
IP is immediately dereferenced, so the assert seems unneeded (even if AC_MDisassemble uses it as well)
Since the patch uses the same mechanism for emitting markup annotations and colors, I don't see the need to duplicate the testing for the two. The latter is (at least partially) covered by marked-up.txt. If we want to make sure every instruction type is covered, then the markup annotations are a much better way to do so: they're much easier to read and write and they are supported on all platforms (unlike ANSI escape codes).
Looks reasonable to me, although it's been a while since I last looked at this sort of area.
Just making sure I fully follow here: this lis essentially setting the markup to Immediate for the rest of the function, whereas the cases like in printShifter below just do it for the sequence of << commands?
[[nodiscard]] on constructors is a C++20 feature, even though it works on older compilers (like gcc 7.4 we claim to support), it still produces an annoying (especially if you compile with -Werror) warning. Just wanted to mention.
https://github.com/llvm/llvm-project/commit/48987fb8cc844667c6ee2a315140cb8f3d817fc1 removes it, just because it's blocking some downstream stuff we have.
I do see the logic in having it so we don't forget to colour things but not sure how to do that in a compatible way right now.
We could do the macro back-compat thing:
// [[nodiscard]] on constructors is a C++20 feature... #if __cplusplus >= 202002L # define LLVM_CTOR_NODISCARD [[nodiscard]] #else # define LLVM_CTOR_NODISCARD #endif
Since this was a defect report against C++17, it should've been applied against it retroactively, so the proper check would be something like this:
#ifdef __has_cpp_attribute # if __has_cpp_attribute(nodiscard) >= 201907L ...