This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Predictably disassemble system registers with the same encoding
AbandonedPublic

Authored by chill on Feb 4 2020, 4:30 AM.

Details

Summary

The registers TRCEXTINSELR and TRCEXTINSELR0 are distinct registers, defined by
separate extension specifications (ETM and ETE, respectively), yet they use the
same encoding in MSR/MRS.

When performing a system register lookup by encoding, we would essentially
return a random one, depending on the number, relative position in the TableGen
file, whether the TableGen records for system registers are named or not, and,
if they are names, depending on record (not register!) name as well.

This issue could be solved by some system register aliases plus preferred
disassembly mechanism. We lack that, implementing it, is moderately non-trivial,
and not clear if it's worth it at all.

As a workaround, if we give the "conflicting" records names, and relying on
TableGen internals (keeping records in an ordered map, emitting lookup tables in
record name order, using std::lower_bound for the lookup), we get disassembly
always consistent with the lexicographical order of the record names.

Diff Detail

Event Timeline

chill created this revision.Feb 4 2020, 4:30 AM

We already have a hack for a similar problem in AArch64InstPrinter::printMRSSystemRegister and AArch64InstPrinter::printMSRSystemRegister, I think it would be better to do this there to keep all of the related hacks together. This avoids relying on the sorting order of the table (which isn't obvious that it needs to be preserved), and will make it more obvious once we have enough of these hacks to be worth improving the tablegen backend to handle them.

If you still think it's better to do it here, please add comments both here (explaining why these names are significant) and in tablegen (explaining that we rely on sorting by name).

FYI - @kmclaughlin sees a failure with test/MC/Disassembler/AArch64/trace-regs.txt when applying D73719, which is an otherwise totally unrelated patch; this patch seems to resolve the issue.
I like the suggestion by @ostannard to have an explicit workaround in AArch64InstPrinter instead, as we probably shouldn't rely on the order of definitions in the TableGen file to get the output we expect.

chill added a comment.Feb 5 2020, 4:04 AM

we probably shouldn't rely on the order of definitions in the TableGen file to get the output we expect.

The unnamed TableGen records get a name like annonymous_NNN...NN where NNN...NN
are monotonically increasing numbers, assigned in the order of definitions. The lexicographic order of the names
does not correspond to the numerical order. We are not relying on the order of definitions, but on the order
of explicitly spelled names.

We are not relying on the order of definitions, but on the order of explicitly spelled names.

I'd rather we avoid doing either of them, because they are both surprising and non-obvious. How many languages can you name where the lexicographic ordering of variable names matters?

chill abandoned this revision.Feb 5 2020, 5:44 AM

Alright, I'll do it the other way.