This is an archive of the discontinued LLVM Phabricator instance.

[AsmPrinter][ORE] use correct opcode name
ClosedPublic

Authored by shchenz on Nov 4 2021, 1:10 AM.

Details

Reviewers
jsji
fhahn
arsenm
paquette
Group Reviewers
Restricted Project
Commits
rG50acbbe3cd19: [AsmPrinter][ORE] use correct opcode name
Summary

This is found on AIX when using opt-viewer.py.

On AIX, we have a pseudo instruction:

def BCTRL_LWZinto_toc:
  XLForm_2_ext_and_DForm_1<19, 528, 20, 0, 1, 32, (outs),
   (ins memri:$src), "bctrl\n\tlwz 2, $src", IIC_BrB,
   [(PPCbctrl_load_toc iaddr:$src)]>, Requires<[In32BitMode]>;

This pseudo instruction consists of two instructions: bctrl without any explicit operands and lwz with two operands 2 and $src.

getMnemonic introduced in D90040 returns a very strange opcode name for the above instruction, it is bctrl\n\tlwz 2, because AsmWriterInst::AsmWriterInst() treats $ as separator between opcode and operands.

So in the output YAML when generating remarks for asm-printer pass, we get:

- String:          "\n"
- String:          "bctrl\n\tlwz 2, "
- String:          ': '
- INST_bctrl
      lwz 2,: '1' 
- String:          "\n"

The invalid YAML will cause opt-viewer.py to work abnormally.

Seems AsmWriterInst::AsmWriterInst() treating $ as the only separator for opcode name and operands is right, because \n or \t are both treated as a normal character in opcode name or operands name, and \n\t exists very common in instruction mnemonic on some targets. so we can not fix the issue there.

And using bctrl as the opcode name for the pseudo instruction BCTRL_LWZinto_toc also follows the current design, the base class for the instruction form indicates:

// Two joined instructions; used to emit two adjacent instructions as one.
// The itinerary from the first instruction is used for scheduling and
// classification.
class I2 {
}

Diff Detail

Event Timeline

shchenz created this revision.Nov 4 2021, 1:10 AM
shchenz requested review of this revision.Nov 4 2021, 1:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 4 2021, 1:10 AM
jsji added inline comments.Nov 4 2021, 12:02 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1418

Can we just use getToken()?

-        auto Name = (Twine("INST_") + KV.first.trim()).str();
+        auto Name = (Twine("INST_") + getToken(KV.first).first.trim()).str();
shchenz updated this revision to Diff 384948.Nov 4 2021, 8:51 PM

address comment

shchenz marked an inline comment as done.Nov 4 2021, 8:52 PM
shchenz added inline comments.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
1418

Yes, getToken() is simpiler

shchenz updated this revision to Diff 384950.Nov 4 2021, 8:56 PM
shchenz marked an inline comment as done.

delete the unnecessary default paramater

jsji accepted this revision as: jsji.Nov 5 2021, 6:38 AM
This revision is now accepted and ready to land.Nov 5 2021, 6:38 AM
This revision was landed with ongoing or failed builds.Nov 7 2021, 5:55 PM
This revision was automatically updated to reflect the committed changes.
fhahn added inline comments.Nov 8 2021, 6:44 AM
llvm/test/CodeGen/PowerPC/instruction-mix-remarks-BCTRL_LWZinto_toc.ll
9

shouldn't this be 2 instructions and the count for tld is missing at the moment?

I am not very familiar with PPC instructions, but the description contains:

This pseudo instruction consists of two instructions: bctrl without any explicit operands and lwz with two operands 2 and $src.
shchenz added inline comments.Nov 8 2021, 4:32 PM
llvm/test/CodeGen/PowerPC/instruction-mix-remarks-BCTRL_LWZinto_toc.ll
9

This is a known limitation for this kind of pseudo instructions for scheduling/classification, and now one more, for opcode mnemonic if we want to get the name from getMnemonic

// Two joined instructions; used to emit two adjacent instructions as one.
// The itinerary from the first instruction is used for scheduling and
// classification.
class I2 {
}