This is an archive of the discontinued LLVM Phabricator instance.

Differential D155329

[TableGen][CodeEmitterGen] Add support for querying operand bit offsets
ClosedPublic

Authored by iii on Jul 14 2023, 12:23 PM.

Details

Reviewers

craig.topper
uweigand

Commits

rG8b655e1f0a70: [TableGen][CodeEmitterGen] Add support for querying operand bit offsets

Summary

In order to generate relocations or to apply fixups after the layout
has been computed, the targets need to know the offsets of the
respective operands. There are indirect ways to figure them out in some
cases, for example, on SystemZ, the first memory operand is always at
offset 2, and the second one is always at offset 4. But there are no
such tricks for the immediate operands on SystemZ, so one has to refer
to individual instruction encodings.

This information, however, is available to TableGen. Generate
the getOperandBitOffset() method to access it, and use it to simplify
getting memory operand offsets on SystemZ. This also paves the way for
implementing symbolic immediates on this platform.

For the multi-lit operands, getOperandBitOffset() returns the offset of
the first lit.

An alternative way to obtain offsets would be to pass them to the
encoder methods, but this would require reworking all targets. Also,
VarLenCodeEmitter already does this, but adopting it requires
reworking the respective targets without other significant benefits.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

iii created this revision. Jul 14 2023, 12:23 PM

Herald added a project: Restricted Project. Jul 14 2023, 12:23 PM

Herald added a subscriber: hiraditya.

iii requested review of this revision. Jul 14 2023, 12:23 PM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B245472: Diff 540532. Jul 14 2023, 1:24 PM

craig.topper added inline comments. Jul 14 2023, 7:37 PM

llvm/utils/TableGen/CodeEmitterGen.cpp
226	If I'm reading this right, this is the offset from the MSB of the encoding to the MSB of the field? Which works for SystemZ because the encoding bytes are emitted in big endian, but would not be useful for a little endian target?

Add little-endian support.
Print operand names.
Fix a typo in the #endif comment.

iii added inline comments. Jul 17 2023, 1:38 AM

llvm/utils/TableGen/CodeEmitterGen.cpp
226	You are right, I thought that reverseBitsForLittleEndianEncoding() would handle that, but it doesn't. I've updated the patch and verified the update with PPC, which sets isLittleEndianEncoding. For BCCTR (https://www.ibm.com/docs/en/aix/7.2?topic=set-bcctr-bcc-branch-conditional-count-register-instruction) we get:

switch (OpNum) {
case 0:
  // op: BI
  return 20;
}

which matches the drawing and the encoding in practice:

   0:	20 1c 22 4c 	bcctr 1, 2, 3

                         bit 20
                           |
                           v
BitV  00100000 00011100 00100010 01001100
Bit#  76543210 54321098 32109876 10987654
      +++++++- ...--+++ +++----- ------++
Field 528    LK   BH    BO BI    19    BO
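
The check above can be reproduced by hand. The following is a minimal, self-contained sketch and not LLVM code: loadLE32 and extractField are hypothetical helper names introduced only for illustration. It assumes the four bytes from the dump are read as a little-endian 32-bit word and that bit 0 is the least significant bit, with BI in bits 20..16, BO in bits 25..21 and BH in bits 12..11 as drawn.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combine four bytes, lowest address first, into a
// little-endian 32-bit word.
inline uint32_t loadLE32(const uint8_t (&Bytes)[4]) {
  return uint32_t(Bytes[0]) | (uint32_t(Bytes[1]) << 8) |
         (uint32_t(Bytes[2]) << 16) | (uint32_t(Bytes[3]) << 24);
}

// Hypothetical helper: extract Width bits whose least significant bit sits at
// index Lsb, where bit 0 is the least significant bit of Word.
inline uint32_t extractField(uint32_t Word, unsigned Lsb, unsigned Width) {
  assert(Width > 0 && Width < 32 && Lsb + Width <= 32 && "field out of range");
  return (Word >> Lsb) & ((uint32_t(1) << Width) - 1);
}

// Decode the BI field of the "bcctr 1, 2, 3" bytes from the dump.
inline uint32_t decodeBcctrBI() {
  const uint8_t Bytes[4] = {0x20, 0x1c, 0x22, 0x4c};
  return extractField(loadLE32(Bytes), /*Lsb=*/16, /*Width=*/5);
}
```

With these assumptions, BI decodes to the second assembly operand, and bit 20 is the most significant bit of that field.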

Harbormaster completed remote builds in B245752: Diff 540906. Jul 17 2023, 2:38 AM

iii marked an inline comment as done. Jul 17 2023, 8:11 AM

I'm not sure when targets are supposed to set isLittleEndianEncoding(). RISC-V encoding is definitely little endian, but we don't set isLittleEndianEncoding().

Hmm, yes, this indeed does not work for RISC-V. E.g., for LUI I get:

switch (OpNum) {
case 1:
  // op: imm20
  return 0;
case 0:
  // op: rd
  return 20;
}

which is in the wrong order:

0000000000000000 <.text>:
   0:	000c72b7          	lui	t0,0xc7

        b7 72 0c 00  # in memory

I'll need to find (or introduce) a different criterion.

Right now there doesn't seem to be a generic way to get the instruction endianness.
The best example is ARM:

ARMDisassembler(const MCSubtargetInfo &STI, MCContext &Ctx,
                const MCInstrInfo *MCII)
    : MCDisassembler(STI, Ctx), MCII(MCII) {
  InstructionEndianness = STI.hasFeature(ARM::ModeBigEndianInstructions)
                              ? llvm::support::big
                              : llvm::support::little;

where it's separate from data endianness and is not available to TableGen.
Even though it's used only for disassembly, it demonstrates the level of flexibility that may be required.

So I wonder if it would be acceptable to pass endianness as a parameter to getOperandBitOffset() like this?

--- a/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp
+++ b/llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp
@@ -55,6 +55,7 @@ private:
                                  SmallVectorImpl<MCFixup> &Fixups,
                                  const MCSubtargetInfo &STI) const;
   uint32_t getOperandBitOffset(const MCInst &MI, unsigned OpNum,
+                               bool IsLittleEndian,
                                const MCSubtargetInfo &STI) const;
 
   // Called by the TableGen code to get the binary encoding of operand
@@ -174,7 +175,7 @@ uint64_t SystemZMCCodeEmitter::getDispOpValue(const MCInst &MI, unsigned OpNum,
   if (MO.isImm())
     return static_cast<uint64_t>(MO.getImm());
   if (MO.isExpr()) {
-    uint32_t BitOffset = getOperandBitOffset(MI, OpNum, STI);
+    uint32_t BitOffset = getOperandBitOffset(MI, OpNum, false, STI);
     Fixups.push_back(MCFixup::create(BitOffset >> 3, MO.getExpr(),
                                      (MCFixupKind)Kind, MI.getLoc()));
     assert(Fixups.size() <= 2 && "More than two memory operands in MI?");
diff --git a/llvm/utils/TableGen/CodeEmitterGen.cpp b/llvm/utils/TableGen/CodeEmitterGen.cpp
index dcac1efb7132..ddca6f0a2a68 100644
--- a/llvm/utils/TableGen/CodeEmitterGen.cpp
+++ b/llvm/utils/TableGen/CodeEmitterGen.cpp
@@ -211,12 +211,11 @@ bool CodeEmitterGen::addCodeToMergeInOperand(Record *R, BitsInit *BI,
     unsigned hiBit = loBit + N;
     unsigned loInstBit = beginInstBit - N + 1;
     if (!BitOffsetCaseEmitted) {
-      unsigned bitOffset = Target.isLittleEndianEncoding()
-                               ? beginInstBit
-                               : BI->getNumBits() - beginInstBit - 1;
       BitOffsetCase += "      case " + utostr(OpIdx) + ":\n";
       BitOffsetCase += "        // op: " + VarName + "\n";
-      BitOffsetCase += "        return " + utostr(bitOffset) + ";\n";
+      BitOffsetCase += "        return IsLittleEndian ? " +
+                       utostr(beginInstBit) + " : " +
+                       utostr(BI->getNumBits() - beginInstBit - 1) + ";\n";
       BitOffsetCaseEmitted = true;
     }
     if (UseAPInt) {
@@ -537,6 +536,7 @@ void CodeEmitterGen::run(raw_ostream &o) {
       << "uint32_t " << Target.getName()
       << "MCCodeEmitter::getOperandBitOffset(const MCInst &MI,\n"
       << "    unsigned OpNum,\n"
+      << "    bool IsLittleEndian,\n"
       << "    const MCSubtargetInfo &STI) const {\n"
       << "  switch (MI.getOpcode()) {\n";
     emitCaseMap(o, BitOffsetCaseMap);

Pass endianness explicitly.

Fix the handling of multi-lit instructions on little-endian. The code used to return the start of the last lit, which is not useful.

Example output with this change:

case RISCV::AUIPC:
case RISCV::JAL:
case RISCV::LUI: {
  switch (OpNum) {
  case 1:
    // op: imm20
    return IsLittleEndian ? 12 : 0;
  case 0:
    // op: rd
    return IsLittleEndian ? 7 : 20;
  }
  break;
}

case SystemZ::CFI:
case SystemZ::CGFI:
case SystemZ::CIH:
case SystemZ::CLFI:
case SystemZ::CLGFI:
case SystemZ::CLIH:
case SystemZ::IIHF:
case SystemZ::IILF:
case SystemZ::LGFI:
case SystemZ::LLIHF:
case SystemZ::LLILF: {
  switch (OpNum) {
  case 0:
    // op: R1
    return IsLittleEndian ? 36 : 8;
  case 1:
    // op: I2
    return IsLittleEndian ? 0 : 16;
  }
  break;
}
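
The two values in each generated ternary follow from the field's Inst{hi-lo} range and the total number of encoding bits. A minimal sketch of that arithmetic, assuming each field occupies one contiguous range; FieldOffsets and computeOffsets are hypothetical names used for illustration and are not part of TableGen. The LUI ranges used in the note below (imm20 in Inst{31-12}, rd in Inst{11-7}) are taken from the RISC-V U-type format rather than from this review.

```cpp
#include <cassert>

// Hypothetical result type holding both conventions for one field.
struct FieldOffsets {
  unsigned FromLsb; // index of the field's lowest bit, counting from bit 0
  unsigned FromMsb; // distance from the encoding's top bit to the field's top bit
};

// Hypothetical helper: NumBits is the encoding width, Hi and Lo are the
// bounds of the field as written in Inst{Hi-Lo}.
inline FieldOffsets computeOffsets(unsigned NumBits, unsigned Hi, unsigned Lo) {
  assert(Lo <= Hi && Hi < NumBits && "field must lie inside the encoding");
  return {Lo, NumBits - 1 - Hi};
}
```

For example, computeOffsets(32, 31, 12) gives 12 and 0 for imm20 of LUI, and computeOffsets(48, 39, 36) gives 36 and 8 for R1 of LLILF, matching the output above.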

In D155329#4509748, @iii wrote:

case SystemZ::LLILF: {
  switch (OpNum) {
  case 0:
    // op: R1
    return IsLittleEndian ? 36 : 8;
  case 1:
    // op: I2
    return IsLittleEndian ? 0 : 16;
  }
  break;
}

I guess these numbers are still a bit confusing to me. What exactly does IsLittleEndian specify here? TableGen doesn't really inherently know about the in-memory representation, it defines the instruction as a single integer. E.g. looking at how this instruction format is defined:

let Inst{47-40} = op{11-4};
let Inst{39-36} = R1;
let Inst{35-32} = op{3-0};
let Inst{31-0}  = I2;

we see it's represented as a single 48-bit integer. Its encoding into bytes in memory only happens in the platform-specific encodeInstruction routine here:

void SystemZMCCodeEmitter::encodeInstruction(const MCInst &MI,
                                             SmallVectorImpl<char> &CB,
                                             SmallVectorImpl<MCFixup> &Fixups,
                                             const MCSubtargetInfo &STI) const {
  MemOpsEmitted = 0;
  uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);
  unsigned Size = MCII.get(MI.getOpcode()).getSize();
  // Big-endian insertion of Size bytes.
  unsigned ShiftValue = (Size * 8) - 8;
  for (unsigned I = 0; I != Size; ++I) {
    CB.push_back(uint8_t(Bits >> ShiftValue));
    ShiftValue -= 8;
  }
}

So I would have expected the getOperandBitOffset call to return one of the values explicitly called out in the instruction format definition, i.e. 39 or 36 for R1 in the above example. (I guess it doesn't matter which one, as the user will know the field length, so they are able to compute the other one.) Using that value, plus the total size of the current instruction (which can be gotten via MCII.get(MI.getOpcode()).getSize()), the caller should be able to compute whatever they need.
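
The computation suggested here can be sketched as follows. This is an illustration only: fixupByteOffset is a hypothetical helper, and it assumes the instruction bytes are emitted most significant byte first, as in the encodeInstruction loop quoted above, and that the operand is a single contiguous field.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper. SizeInBytes is the instruction size, LowBit is the
// lowest Inst{...} index occupied by the operand, Width is its width in bits.
// Returns the index of the emitted byte that contains the operand's top bit.
inline uint32_t fixupByteOffset(unsigned SizeInBytes, uint32_t LowBit,
                                unsigned Width) {
  const uint32_t TotalBits = SizeInBytes * 8;
  assert(LowBit + Width <= TotalBits && "operand must fit in the encoding");
  // Distance in bits from the first emitted bit to the operand's top bit.
  const uint32_t BitsFromStart = TotalBits - LowBit - Width;
  return BitsFromStart / 8;
}
```

For the 6-byte format above, R1 (Inst{39-36}) lands in byte 1 and I2 (Inst{31-0}) starts at byte 2.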

Harbormaster completed remote builds in B246147: Diff 541428. Jul 18 2023, 8:16 AM

Move the endianness handling to the target C++ code.

Harbormaster completed remote builds in B246270: Diff 541592. Jul 18 2023, 4:51 PM

The SystemZ parts LGTM. The TableGen changes also look reasonable to me, but I'd also like to wait for Craig's input on whether his concerns are resolved with this version.

It would be good to have a comment spelling out the explicit semantics of the new getOperandBitOffset routine that targets can rely on. Maybe the top of CodeEmitterGen.cpp would be a good place for this.

Document the new function.

craig.topper added inline comments. Jul 19 2023, 4:23 PM

llvm/utils/TableGen/CodeEmitterGen.cpp
11	MachineInstr should be MCInst here and in the new paragraph.

Harbormaster completed remote builds in B246707: Diff 542235. Jul 19 2023, 4:59 PM

s/MachineInstr/MCInst/g

Harbormaster completed remote builds in B246780: Diff 542326. Jul 20 2023, 12:41 AM

LGTM

This revision is now accepted and ready to land. Jul 20 2023, 12:54 AM

This revision was landed with ongoing or failed builds. Jul 20 2023, 1:11 AM

Closed by commit rG8b655e1f0a70: [TableGen][CodeEmitterGen] Add support for querying operand bit offsets (authored by iii).

This revision was automatically updated to reflect the committed changes.

iii added a commit: rG8b655e1f0a70: [TableGen][CodeEmitterGen] Add support for querying operand bit offsets.

@iii I'm seeing build warnings (which we treat as errors in most builds) due to this:

E:\llvm\ninja\lib\Target\SystemZ\SystemZGenMCCodeEmitter.inc(15414): warning C4060: switch statement contains no 'case' or 'default' labels

which came from:

    case SystemZ::CSCH:
    case SystemZ::HSCH:
    case SystemZ::IPK:
    case SystemZ::NNPA:
    case SystemZ::NOP_bare:
    case SystemZ::PALB:
    case SystemZ::PCC:
    case SystemZ::PCKMO:
    case SystemZ::PFPO:
    case SystemZ::PR:
    case SystemZ::PTFF:
    case SystemZ::PTLB:
    case SystemZ::RCHP:
    case SystemZ::RSCH:
    case SystemZ::SAL:
    case SystemZ::SAM24:
    case SystemZ::SAM31:
    case SystemZ::SAM64:
    case SystemZ::SCHM:
    case SystemZ::SCKPF:
    case SystemZ::TAM:
    case SystemZ::TEND:
    case SystemZ::TRAP2:
    case SystemZ::UPT:
    case SystemZ::XSCH: {
      switch (OpNum) {
      }
      break;
    }
  }
  std::string msg;
  raw_string_ostream Msg(msg);
  Msg << "Not supported instr[opcode]: " << MI << "[" << OpNum << "]";
  report_fatal_error(Msg.str().c_str());
}

#endif // GET_OPERAND_BIT_OFFSET

Please can you update this so that empty case map entries are not emitted?
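
One way to avoid the empty switch is to skip instructions that have no operand cases when the text is generated. The sketch below only illustrates that idea under simplifying assumptions. It is not the change made in the follow-up revision, and emitOperandSwitches and its input layout (instruction name mapped to the text of its case lines) are hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical input: instruction name -> text of its per-operand case lines.
using OperandCases = std::map<std::string, std::string>;

// Emit one "case <instr>: { switch (OpNum) { ... } break; }" block per
// instruction, skipping instructions whose operand case text is empty.
inline std::string emitOperandSwitches(const OperandCases &Cases) {
  std::ostringstream OS;
  for (const auto &Entry : Cases) {
    if (Entry.second.empty())
      continue; // would otherwise produce a switch with no labels (C4060)
    OS << "case " << Entry.first << ": {\n"
       << "  switch (OpNum) {\n"
       << Entry.second
       << "  }\n"
       << "  break;\n"
       << "}\n";
  }
  return OS.str();
}
```

In this sketch a skipped instruction simply has no case label, so a query for it would fall through to whatever follows the outer switch, which in the generated code quoted above is the report_fatal_error call.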

Thanks for letting me know, I'll fix this.

https://reviews.llvm.org/D155805

Revision Contents

Path                                                           Size
llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp  36 lines
llvm/utils/TableGen/CodeEmitterGen.cpp                         149 lines

Diff 542356

llvm/lib/Target/SystemZ/MCTargetDesc/SystemZMCCodeEmitter.cpp

[31 lines not shown]
#define DEBUG_TYPE "mccodeemitter"		#define DEBUG_TYPE "mccodeemitter"

namespace {		namespace {

class SystemZMCCodeEmitter : public MCCodeEmitter {		class SystemZMCCodeEmitter : public MCCodeEmitter {
const MCInstrInfo &MCII;		const MCInstrInfo &MCII;
MCContext &Ctx;		MCContext &Ctx;

mutable unsigned MemOpsEmitted;

public:		public:
SystemZMCCodeEmitter(const MCInstrInfo &mcii, MCContext &ctx)		SystemZMCCodeEmitter(const MCInstrInfo &mcii, MCContext &ctx)
: MCII(mcii), Ctx(ctx) {		: MCII(mcii), Ctx(ctx) {
}		}

~SystemZMCCodeEmitter() override = default;		~SystemZMCCodeEmitter() override = default;

// OVerride MCCodeEmitter.		// OVerride MCCodeEmitter.
void encodeInstruction(const MCInst &MI, SmallVectorImpl<char> &CB,		void encodeInstruction(const MCInst &MI, SmallVectorImpl<char> &CB,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const override;		const MCSubtargetInfo &STI) const override;

private:		private:
// Automatically generated by TableGen.		// Automatically generated by TableGen.
uint64_t getBinaryCodeForInstr(const MCInst &MI,		uint64_t getBinaryCodeForInstr(const MCInst &MI,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;
		uint32_t getOperandBitOffset(const MCInst &MI, unsigned OpNum,
		const MCSubtargetInfo &STI) const;

// Called by the TableGen code to get the binary encoding of operand		// Called by the TableGen code to get the binary encoding of operand
// MO in MI. Fixups is the list of fixups against MI.		// MO in MI. Fixups is the list of fixups against MI.
uint64_t getMachineOpValue(const MCInst &MI, const MCOperand &MO,		uint64_t getMachineOpValue(const MCInst &MI, const MCOperand &MO,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const;		const MCSubtargetInfo &STI) const;

// Return the displacement value for the OpNum operand. If it is a symbol,		// Return the displacement value for the OpNum operand. If it is a symbol,
// add a fixup for it and return 0.		// add a fixup for it and return 0.
uint64_t getDispOpValue(const MCInst &MI, unsigned OpNum,		uint64_t getDispOpValue(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI, unsigned OpSize,
SystemZ::FixupKind Kind) const;		SystemZ::FixupKind Kind) const;

// Called by the TableGen code to get the binary encoding of an address.		// Called by the TableGen code to get the binary encoding of an address.
// The index or length, if any, is encoded first, followed by the base,		// The index or length, if any, is encoded first, followed by the base,
// followed by the displacement. In a 20-bit displacement,		// followed by the displacement. In a 20-bit displacement,
// the low 12 bits are encoded before the high 8 bits.		// the low 12 bits are encoded before the high 8 bits.
template <unsigned N>		template <unsigned N>
uint64_t getLenEncoding(const MCInst &MI, unsigned OpNum,		uint64_t getLenEncoding(const MCInst &MI, unsigned OpNum,
[61 lines not shown]
};		};

} // end anonymous namespace		} // end anonymous namespace

void SystemZMCCodeEmitter::encodeInstruction(const MCInst &MI,		void SystemZMCCodeEmitter::encodeInstruction(const MCInst &MI,
SmallVectorImpl<char> &CB,		SmallVectorImpl<char> &CB,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
MemOpsEmitted = 0;
uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);		uint64_t Bits = getBinaryCodeForInstr(MI, Fixups, STI);
unsigned Size = MCII.get(MI.getOpcode()).getSize();		unsigned Size = MCII.get(MI.getOpcode()).getSize();
// Big-endian insertion of Size bytes.		// Big-endian insertion of Size bytes.
unsigned ShiftValue = (Size * 8) - 8;		unsigned ShiftValue = (Size * 8) - 8;
for (unsigned I = 0; I != Size; ++I) {		for (unsigned I = 0; I != Size; ++I) {
CB.push_back(uint8_t(Bits >> ShiftValue));		CB.push_back(uint8_t(Bits >> ShiftValue));
ShiftValue -= 8;		ShiftValue -= 8;
}		}
}		}

uint64_t SystemZMCCodeEmitter::		uint64_t SystemZMCCodeEmitter::
getMachineOpValue(const MCInst &MI, const MCOperand &MO,		getMachineOpValue(const MCInst &MI, const MCOperand &MO,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
if (MO.isReg())		if (MO.isReg())
return Ctx.getRegisterInfo()->getEncodingValue(MO.getReg());		return Ctx.getRegisterInfo()->getEncodingValue(MO.getReg());
if (MO.isImm())		if (MO.isImm())
return static_cast<uint64_t>(MO.getImm());		return static_cast<uint64_t>(MO.getImm());
llvm_unreachable("Unexpected operand type!");		llvm_unreachable("Unexpected operand type!");
}		}

uint64_t SystemZMCCodeEmitter::		uint64_t SystemZMCCodeEmitter::getDispOpValue(const MCInst &MI, unsigned OpNum,
getDispOpValue(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
		const MCSubtargetInfo &STI,
		unsigned OpSize,
SystemZ::FixupKind Kind) const {		SystemZ::FixupKind Kind) const {
const MCOperand &MO = MI.getOperand(OpNum);		const MCOperand &MO = MI.getOperand(OpNum);
if (MO.isImm()) {		if (MO.isImm())
++MemOpsEmitted;
return static_cast<uint64_t>(MO.getImm());		return static_cast<uint64_t>(MO.getImm());
}
if (MO.isExpr()) {		if (MO.isExpr()) {
// All instructions follow the pattern where the first displacement has a		unsigned MIBitSize = MCII.get(MI.getOpcode()).getSize() * 8;
// 2 bytes offset, and the second one 4 bytes.		uint32_t RawBitOffset = getOperandBitOffset(MI, OpNum, STI);
unsigned ByteOffs = MemOpsEmitted++ == 0 ? 2 : 4;		uint32_t BitOffset = MIBitSize - RawBitOffset - OpSize;
Fixups.push_back(MCFixup::create(ByteOffs, MO.getExpr(), (MCFixupKind)Kind,		Fixups.push_back(MCFixup::create(BitOffset >> 3, MO.getExpr(),
MI.getLoc()));		(MCFixupKind)Kind, MI.getLoc()));
assert(Fixups.size() <= 2 && "More than two memory operands in MI?");		assert(Fixups.size() <= 2 && "More than two memory operands in MI?");
return 0;		return 0;
}		}
llvm_unreachable("Unexpected operand type!");		llvm_unreachable("Unexpected operand type!");
}		}

template <unsigned N>		template <unsigned N>
uint64_t		uint64_t
SystemZMCCodeEmitter::getLenEncoding(const MCInst &MI, unsigned OpNum,		SystemZMCCodeEmitter::getLenEncoding(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
return getMachineOpValue(MI, MI.getOperand(OpNum), Fixups, STI) - 1;		return getMachineOpValue(MI, MI.getOperand(OpNum), Fixups, STI) - 1;
}		}

uint64_t		uint64_t
SystemZMCCodeEmitter::getDisp12Encoding(const MCInst &MI, unsigned OpNum,		SystemZMCCodeEmitter::getDisp12Encoding(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
return getDispOpValue(MI, OpNum, Fixups, SystemZ::FixupKind::FK_390_12);		return getDispOpValue(MI, OpNum, Fixups, STI, 12,
		SystemZ::FixupKind::FK_390_12);
}		}

uint64_t		uint64_t
SystemZMCCodeEmitter::getDisp20Encoding(const MCInst &MI, unsigned OpNum,		SystemZMCCodeEmitter::getDisp20Encoding(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {		const MCSubtargetInfo &STI) const {
return getDispOpValue(MI, OpNum, Fixups, SystemZ::FixupKind::FK_390_20);		return getDispOpValue(MI, OpNum, Fixups, STI, 20,
		SystemZ::FixupKind::FK_390_20);
}		}

uint64_t		uint64_t
SystemZMCCodeEmitter::getPCRelEncoding(const MCInst &MI, unsigned OpNum,		SystemZMCCodeEmitter::getPCRelEncoding(const MCInst &MI, unsigned OpNum,
SmallVectorImpl<MCFixup> &Fixups,		SmallVectorImpl<MCFixup> &Fixups,
unsigned Kind, int64_t Offset,		unsigned Kind, int64_t Offset,
bool AllowTLS) const {		bool AllowTLS) const {
SMLoc Loc = MI.getLoc();		SMLoc Loc = MI.getLoc();
[18 lines not shown; context: SystemZMCCodeEmitter::getPCRelEncoding(const MCInst &MI, unsigned OpNum,]
if (AllowTLS && OpNum + 1 < MI.getNumOperands()) {		if (AllowTLS && OpNum + 1 < MI.getNumOperands()) {
const MCOperand &MOTLS = MI.getOperand(OpNum + 1);		const MCOperand &MOTLS = MI.getOperand(OpNum + 1);
Fixups.push_back(MCFixup::create(		Fixups.push_back(MCFixup::create(
0, MOTLS.getExpr(), (MCFixupKind)SystemZ::FK_390_TLS_CALL, Loc));		0, MOTLS.getExpr(), (MCFixupKind)SystemZ::FK_390_TLS_CALL, Loc));
}		}
return 0;		return 0;
}		}

		#define GET_OPERAND_BIT_OFFSET
#include "SystemZGenMCCodeEmitter.inc"		#include "SystemZGenMCCodeEmitter.inc"

MCCodeEmitter *llvm::createSystemZMCCodeEmitter(const MCInstrInfo &MCII,		MCCodeEmitter *llvm::createSystemZMCCodeEmitter(const MCInstrInfo &MCII,
MCContext &Ctx) {		MCContext &Ctx) {
return new SystemZMCCodeEmitter(MCII, Ctx);		return new SystemZMCCodeEmitter(MCII, Ctx);
}		}

llvm/utils/TableGen/CodeEmitterGen.cpp

//===- CodeEmitterGen.cpp - Code Emitter Generator ------------------------===//		//===- CodeEmitterGen.cpp - Code Emitter Generator ------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// CodeEmitterGen uses the descriptions of instructions and their fields to		// CodeEmitterGen uses the descriptions of instructions and their fields to
// construct an automated code emitter: a function that, given a MachineInstr,		// construct an automated code emitter: a function called
// returns the (currently, 32-bit unsigned) value of the instruction.		// getBinaryCodeForInstr() that, given a MCInst, returns the value of the
		// instruction - either as an uint64_t or as an APInt, depending on the
		// maximum bit width of all Inst definitions.
		//
		// In addition, it generates another function called getOperandBitOffset()
		// that, given a MCInst and an operand index, returns the minimum of indices of
		// all bits that carry some portion of the respective operand. When the target's
		// encodeInstruction() stores the instruction in a little-endian byte order, the
		// returned value is the offset of the start of the operand in the encoded
		// instruction. Other targets might need to adjust the returned value according
		// to their encodeInstruction() implementation.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "CodeGenHwModes.h"		#include "CodeGenHwModes.h"
#include "CodeGenInstruction.h"		#include "CodeGenInstruction.h"
#include "CodeGenTarget.h"		#include "CodeGenTarget.h"
#include "InfoByHwMode.h"		#include "InfoByHwMode.h"
#include "VarLenCodeEmitterGen.h"		#include "VarLenCodeEmitterGen.h"
[21 lines not shown]

public:		public:
CodeEmitterGen(RecordKeeper &R) : Records(R) {}		CodeEmitterGen(RecordKeeper &R) : Records(R) {}

void run(raw_ostream &o);		void run(raw_ostream &o);

private:		private:
int getVariableBit(const std::string &VarName, BitsInit *BI, int bit);		int getVariableBit(const std::string &VarName, BitsInit *BI, int bit);
std::string getInstructionCase(Record *R, CodeGenTarget &Target);		std::pair<std::string, std::string>
std::string getInstructionCaseForEncoding(Record *R, Record *EncodingDef,		getInstructionCases(Record *R, CodeGenTarget &Target);
CodeGenTarget &Target);		void addInstructionCasesForEncoding(Record *R, Record *EncodingDef,
		CodeGenTarget &Target, std::string &Case,
		std::string &BitOffsetCase);
bool addCodeToMergeInOperand(Record *R, BitsInit *BI,		bool addCodeToMergeInOperand(Record *R, BitsInit *BI,
const std::string &VarName,		const std::string &VarName, std::string &Case,
std::string &Case, CodeGenTarget &Target);		std::string &BitOffsetCase,
		CodeGenTarget &Target);

void emitInstructionBaseValues(		void emitInstructionBaseValues(
raw_ostream &o, ArrayRef<const CodeGenInstruction *> NumberedInstructions,		raw_ostream &o, ArrayRef<const CodeGenInstruction *> NumberedInstructions,
CodeGenTarget &Target, int HwMode = -1);		CodeGenTarget &Target, int HwMode = -1);
		void
		emitCaseMap(raw_ostream &o,
		const std::map<std::string, std::vector<std::string>> &CaseMap);
unsigned BitWidth = 0u;		unsigned BitWidth = 0u;
bool UseAPInt = false;		bool UseAPInt = false;
};		};

// If the VarBitInit at position 'bit' matches the specified variable then		// If the VarBitInit at position 'bit' matches the specified variable then
// return the variable bit position. Otherwise return -1.		// return the variable bit position. Otherwise return -1.
int CodeEmitterGen::getVariableBit(const std::string &VarName,		int CodeEmitterGen::getVariableBit(const std::string &VarName,
BitsInit *BI, int bit) {		BitsInit *BI, int bit) {
if (VarBitInit *VBI = dyn_cast<VarBitInit>(BI->getBit(bit))) {		if (VarBitInit *VBI = dyn_cast<VarBitInit>(BI->getBit(bit))) {
if (VarInit *VI = dyn_cast<VarInit>(VBI->getBitVar()))		if (VarInit *VI = dyn_cast<VarInit>(VBI->getBitVar()))
if (VI->getName() == VarName)		if (VI->getName() == VarName)
return VBI->getBitNum();		return VBI->getBitNum();
} else if (VarInit *VI = dyn_cast<VarInit>(BI->getBit(bit))) {		} else if (VarInit *VI = dyn_cast<VarInit>(BI->getBit(bit))) {
if (VI->getName() == VarName)		if (VI->getName() == VarName)
return 0;		return 0;
}		}

return -1;		return -1;
}		}

// Returns true if it succeeds, false if an error.		// Returns true if it succeeds, false if an error.
bool CodeEmitterGen::addCodeToMergeInOperand(Record *R, BitsInit *BI,		bool CodeEmitterGen::addCodeToMergeInOperand(Record *R, BitsInit *BI,
const std::string &VarName,		const std::string &VarName,
std::string &Case,		std::string &Case,
		std::string &BitOffsetCase,
CodeGenTarget &Target) {		CodeGenTarget &Target) {
CodeGenInstruction &CGI = Target.getInstruction(R);		CodeGenInstruction &CGI = Target.getInstruction(R);

// Determine if VarName actually contributes to the Inst encoding.		// Determine if VarName actually contributes to the Inst encoding.
int bit = BI->getNumBits()-1;		int bit = BI->getNumBits()-1;

// Scan for a bit that this contributed to.		// Scan for a bit that this contributed to.
for (; bit >= 0; ) {		for (; bit >= 0; ) {
[79 lines not shown; context: for (--tmpBit; tmpBit >= 0;) {]
if (varBit == -1 || varBit != (beginVarBit - N))		if (varBit == -1 || varBit != (beginVarBit - N))
break;		break;
++N;		++N;
--tmpBit;		--tmpBit;
}		}
++numOperandLits;		++numOperandLits;
}		}

		unsigned BitOffset = -1;
for (; bit >= 0; ) {		for (; bit >= 0; ) {
int varBit = getVariableBit(VarName, BI, bit);		int varBit = getVariableBit(VarName, BI, bit);

// If this bit isn't from a variable, skip it.		// If this bit isn't from a variable, skip it.
if (varBit == -1) {		if (varBit == -1) {
--bit;		--bit;
continue;		continue;
}		}

// Figure out the consecutive range of bits covered by this operand, in		// Figure out the consecutive range of bits covered by this operand, in
// order to generate better encoding code.		// order to generate better encoding code.
int beginInstBit = bit;		int beginInstBit = bit;
int beginVarBit = varBit;		int beginVarBit = varBit;
int N = 1;		int N = 1;
for (--bit; bit >= 0;) {		for (--bit; bit >= 0;) {
varBit = getVariableBit(VarName, BI, bit);		varBit = getVariableBit(VarName, BI, bit);
if (varBit == -1 || varBit != (beginVarBit - N)) break;		if (varBit == -1 || varBit != (beginVarBit - N)) break;
++N;		++N;
--bit;		--bit;
}		}

std::string maskStr;		std::string maskStr;
int opShift;		int opShift;

unsigned loBit = beginVarBit - N + 1;		unsigned loBit = beginVarBit - N + 1;
unsigned hiBit = loBit + N;		unsigned hiBit = loBit + N;
unsigned loInstBit = beginInstBit - N + 1;		unsigned loInstBit = beginInstBit - N + 1;
		BitOffset = loInstBit;
if (UseAPInt) {		if (UseAPInt) {
std::string extractStr;		std::string extractStr;
if (N >= 64) {		if (N >= 64) {
extractStr = "op.extractBits(" + itostr(hiBit - loBit) + ", " +		extractStr = "op.extractBits(" + itostr(hiBit - loBit) + ", " +
itostr(loBit) + ")";		itostr(loBit) + ")";
Case += " Value.insertBits(" + extractStr + ", " +		Case += " Value.insertBits(" + extractStr + ", " +
itostr(loInstBit) + ");\n";		itostr(loInstBit) + ");\n";
} else {		} else {
extractStr = "op.extractBitsAsZExtValue(" + itostr(hiBit - loBit) +		extractStr = "op.extractBitsAsZExtValue(" + itostr(hiBit - loBit) +
", " + itostr(loBit) + ")";		", " + itostr(loBit) + ")";
Case += " Value.insertBits(" + extractStr + ", " +		Case += " Value.insertBits(" + extractStr + ", " +
[22 lines not shown; context: if (UseAPInt) {]
Case += " Value |= (op & " + maskStr + ") >> " +		Case += " Value |= (op & " + maskStr + ") >> " +
itostr(-opShift) + ";\n";		itostr(-opShift) + ";\n";
} else {		} else {
Case += " Value |= (op & " + maskStr + ");\n";		Case += " Value |= (op & " + maskStr + ");\n";
}		}
}		}
}		}
}		}

		if (BitOffset != (unsigned)-1) {
		BitOffsetCase += " case " + utostr(OpIdx) + ":\n";
		BitOffsetCase += " // op: " + VarName + "\n";
		BitOffsetCase += " return " + utostr(BitOffset) + ";\n";
		}

return true;		return true;
}		}

std::string CodeEmitterGen::getInstructionCase(Record *R,		std::pair<std::string, std::string>
CodeGenTarget &Target) {		CodeEmitterGen::getInstructionCases(Record *R, CodeGenTarget &Target) {
std::string Case;		std::string Case, BitOffsetCase;

		auto append = [&](const char *S) {
		Case += S;
		BitOffsetCase += S;
		};

if (const RecordVal *RV = R->getValue("EncodingInfos")) {		if (const RecordVal *RV = R->getValue("EncodingInfos")) {
if (auto *DI = dyn_cast_or_null<DefInit>(RV->getValue())) {		if (auto *DI = dyn_cast_or_null<DefInit>(RV->getValue())) {
const CodeGenHwModes &HWM = Target.getHwModes();		const CodeGenHwModes &HWM = Target.getHwModes();
EncodingInfoByHwMode EBM(DI->getDef(), HWM);		EncodingInfoByHwMode EBM(DI->getDef(), HWM);
Case += " switch (HwMode) {\n";		append(" switch (HwMode) {\n");
Case += " default: llvm_unreachable(\"Unhandled HwMode\");\n";		append(" default: llvm_unreachable(\"Unhandled HwMode\");\n");
for (auto &KV : EBM) {		for (auto &KV : EBM) {
Case += " case " + itostr(KV.first) + ": {\n";		append((" case " + itostr(KV.first) + ": {\n").c_str());
Case += getInstructionCaseForEncoding(R, KV.second, Target);		addInstructionCasesForEncoding(R, KV.second, Target, Case,
Case += " break;\n";		BitOffsetCase);
Case += " }\n";		append(" break;\n");
		append(" }\n");
}		}
Case += " }\n";		append(" }\n");
return Case;		return std::make_pair(std::move(Case), std::move(BitOffsetCase));
}		}
}		}
return getInstructionCaseForEncoding(R, R, Target);		addInstructionCasesForEncoding(R, R, Target, Case, BitOffsetCase);
		return std::make_pair(std::move(Case), std::move(BitOffsetCase));
}		}

std::string CodeEmitterGen::getInstructionCaseForEncoding(Record *R, Record *EncodingDef,		void CodeEmitterGen::addInstructionCasesForEncoding(
CodeGenTarget &Target) {		Record *R, Record *EncodingDef, CodeGenTarget &Target, std::string &Case,
std::string Case;		std::string &BitOffsetCase) {
BitsInit *BI = EncodingDef->getValueAsBitsInit("Inst");		BitsInit *BI = EncodingDef->getValueAsBitsInit("Inst");

// Loop over all of the fields in the instruction, determining which are the		// Loop over all of the fields in the instruction, determining which are the
// operands to the instruction.		// operands to the instruction.
bool Success = true;		bool Success = true;
		BitOffsetCase += " switch (OpNum) {\n";
for (const RecordVal &RV : EncodingDef->getValues()) {		for (const RecordVal &RV : EncodingDef->getValues()) {
// Ignore fixed fields in the record, we're looking for values like:		// Ignore fixed fields in the record, we're looking for values like:
// bits<5> RST = { ?, ?, ?, ?, ? };		// bits<5> RST = { ?, ?, ?, ?, ? };
if (RV.isNonconcreteOK() || RV.getValue()->isComplete())		if (RV.isNonconcreteOK() || RV.getValue()->isComplete())
continue;		continue;

Success &=		Success &= addCodeToMergeInOperand(R, BI, std::string(RV.getName()), Case,
addCodeToMergeInOperand(R, BI, std::string(RV.getName()),		BitOffsetCase, Target);
Case, Target);
}		}
		BitOffsetCase += " }\n";

if (!Success) {		if (!Success) {
// Dump the record, so we can see what's going on...		// Dump the record, so we can see what's going on...
std::string E;		std::string E;
raw_string_ostream S(E);		raw_string_ostream S(E);
S << "Dumping record for previous error:\n";		S << "Dumping record for previous error:\n";
S << *R;		S << *R;
PrintNote(E);		PrintNote(E);
}		}

StringRef PostEmitter = R->getValueAsString("PostEncoderMethod");		StringRef PostEmitter = R->getValueAsString("PostEncoderMethod");
if (!PostEmitter.empty()) {		if (!PostEmitter.empty()) {
Case += " Value = ";		Case += " Value = ";
Case += PostEmitter;		Case += PostEmitter;
Case += "(MI, Value";		Case += "(MI, Value";
Case += ", STI";		Case += ", STI";
Case += ");\n";		Case += ");\n";
}		}

return Case;
}		}

static void emitInstBits(raw_ostream &OS, const APInt &Bits) {		static void emitInstBits(raw_ostream &OS, const APInt &Bits) {
for (unsigned I = 0; I < Bits.getNumWords(); ++I)		for (unsigned I = 0; I < Bits.getNumWords(); ++I)
OS << ((I > 0) ? ", " : "") << "UINT64_C(" << utostr(Bits.getRawData()[I])		OS << ((I > 0) ? ", " : "") << "UINT64_C(" << utostr(Bits.getRawData()[I])
<< ")";		<< ")";
}		}

[34 lines not shown]
for (const CodeGenInstruction *CGI : NumberedInstructions) {		for (const CodeGenInstruction *CGI : NumberedInstructions) {
}		}
o << " ";		o << " ";
emitInstBits(o, Value);		emitInstBits(o, Value);
o << "," << '\t' << "// " << R->getName() << "\n";		o << "," << '\t' << "// " << R->getName() << "\n";
}		}
o << " UINT64_C(0)\n };\n";		o << " UINT64_C(0)\n };\n";
}		}

		void CodeEmitterGen::emitCaseMap(
		raw_ostream &o,
		const std::map<std::string, std::vector<std::string>> &CaseMap) {
		std::map<std::string, std::vector<std::string>>::const_iterator IE, EE;
		for (IE = CaseMap.begin(), EE = CaseMap.end(); IE != EE; ++IE) {
		const std::string &Case = IE->first;
		const std::vector<std::string> &InstList = IE->second;

		for (int i = 0, N = InstList.size(); i < N; i++) {
		if (i)
		o << "\n";
		o << " case " << InstList[i] << ":";
		}
		o << " {\n";
		o << Case;
		o << " break;\n"
		<< " }\n";
		}
		}

void CodeEmitterGen::run(raw_ostream &o) {		void CodeEmitterGen::run(raw_ostream &o) {
emitSourceFileHeader("Machine Code Emitter", o);		emitSourceFileHeader("Machine Code Emitter", o);

CodeGenTarget Target(Records);		CodeGenTarget Target(Records);
std::vector<Record*> Insts = Records.getAllDerivedDefinitions("Instruction");		std::vector<Record*> Insts = Records.getAllDerivedDefinitions("Instruction");

// For little-endian instruction bit encodings, reverse the bit order		// For little-endian instruction bit encodings, reverse the bit order
Target.reverseBitsForLittleEndianEncoding();		Target.reverseBitsForLittleEndianEncoding();
[65 lines not shown]
if (!HwModes.empty()) {		if (!HwModes.empty()) {
o << " case " << I << ": InstBits = InstBits_" << HWM.getMode(I).Name		o << " case " << I << ": InstBits = InstBits_" << HWM.getMode(I).Name
<< "; break;\n";		<< "; break;\n";
}		}
o << " };\n";		o << " };\n";
}		}

// Map to accumulate all the cases.		// Map to accumulate all the cases.
std::map<std::string, std::vector<std::string>> CaseMap;		std::map<std::string, std::vector<std::string>> CaseMap;
		std::map<std::string, std::vector<std::string>> BitOffsetCaseMap;

// Construct all cases statement for each opcode		// Construct all cases statement for each opcode
for (Record *R : Insts) {		for (Record *R : Insts) {
if (R->getValueAsString("Namespace") == "TargetOpcode" ||		if (R->getValueAsString("Namespace") == "TargetOpcode" ||
R->getValueAsBit("isPseudo"))		R->getValueAsBit("isPseudo"))
continue;		continue;
std::string InstName =		std::string InstName =
(R->getValueAsString("Namespace") + "::" + R->getName()).str();		(R->getValueAsString("Namespace") + "::" + R->getName()).str();
std::string Case = getInstructionCase(R, Target);		std::string Case, BitOffsetCase;
		std::tie(Case, BitOffsetCase) = getInstructionCases(R, Target);

CaseMap[Case].push_back(std::move(InstName));		CaseMap[Case].push_back(InstName);
		BitOffsetCaseMap[BitOffsetCase].push_back(std::move(InstName));
}		}

// Emit initial function code		// Emit initial function code
if (UseAPInt) {		if (UseAPInt) {
int NumWords = APInt::getNumWords(BitWidth);		int NumWords = APInt::getNumWords(BitWidth);
o << " const unsigned opcode = MI.getOpcode();\n"		o << " const unsigned opcode = MI.getOpcode();\n"
<< " if (Scratch.getBitWidth() != " << BitWidth << ")\n"		<< " if (Scratch.getBitWidth() != " << BitWidth << ")\n"
<< " Scratch = Scratch.zext(" << BitWidth << ");\n"		<< " Scratch = Scratch.zext(" << BitWidth << ");\n"
<< " Inst = APInt(" << BitWidth << ", ArrayRef(InstBits + opcode * "		<< " Inst = APInt(" << BitWidth << ", ArrayRef(InstBits + opcode * "
<< NumWords << ", " << NumWords << "));\n"		<< NumWords << ", " << NumWords << "));\n"
<< " APInt &Value = Inst;\n"		<< " APInt &Value = Inst;\n"
<< " APInt &op = Scratch;\n"		<< " APInt &op = Scratch;\n"
<< " switch (opcode) {\n";		<< " switch (opcode) {\n";
} else {		} else {
o << " const unsigned opcode = MI.getOpcode();\n"		o << " const unsigned opcode = MI.getOpcode();\n"
<< " uint64_t Value = InstBits[opcode];\n"		<< " uint64_t Value = InstBits[opcode];\n"
<< " uint64_t op = 0;\n"		<< " uint64_t op = 0;\n"
<< " (void)op; // suppress warning\n"		<< " (void)op; // suppress warning\n"
<< " switch (opcode) {\n";		<< " switch (opcode) {\n";
}		}

// Emit each case statement		// Emit each case statement
std::map<std::string, std::vector<std::string>>::iterator IE, EE;		emitCaseMap(o, CaseMap);
for (IE = CaseMap.begin(), EE = CaseMap.end(); IE != EE; ++IE) {
const std::string &Case = IE->first;
std::vector<std::string> &InstList = IE->second;

for (int i = 0, N = InstList.size(); i < N; i++) {
if (i)
o << "\n";
o << " case " << InstList[i] << ":";
}
o << " {\n";
o << Case;
o << " break;\n"
<< " }\n";
}

// Default case: unhandled opcode		// Default case: unhandled opcode
o << " default:\n"		o << " default:\n"
<< " std::string msg;\n"		<< " std::string msg;\n"
<< " raw_string_ostream Msg(msg);\n"		<< " raw_string_ostream Msg(msg);\n"
<< " Msg << \"Not supported instr: \" << MI;\n"		<< " Msg << \"Not supported instr: \" << MI;\n"
<< " report_fatal_error(Msg.str().c_str());\n"		<< " report_fatal_error(Msg.str().c_str());\n"
<< " }\n";		<< " }\n";
if (UseAPInt)		if (UseAPInt)
o << " Inst = Value;\n";		o << " Inst = Value;\n";
else		else
o << " return Value;\n";		o << " return Value;\n";
o << "}\n\n";		o << "}\n\n";

		o << "#ifdef GET_OPERAND_BIT_OFFSET\n"
		<< "#undef GET_OPERAND_BIT_OFFSET\n\n"
		<< "uint32_t " << Target.getName()
		<< "MCCodeEmitter::getOperandBitOffset(const MCInst &MI,\n"
		<< " unsigned OpNum,\n"
		<< " const MCSubtargetInfo &STI) const {\n"
		<< " switch (MI.getOpcode()) {\n";
		emitCaseMap(o, BitOffsetCaseMap);
		o << " }\n"
		<< " std::string msg;\n"
		<< " raw_string_ostream Msg(msg);\n"
		<< " Msg << \"Not supported instr[opcode]: \" << MI << \"[\" << OpNum "
		"<< \"]\";\n"
		<< " report_fatal_error(Msg.str().c_str());\n"
		<< "}\n\n"
		<< "#endif // GET_OPERAND_BIT_OFFSET\n\n";
}		}
}		}

} // end anonymous namespace		} // end anonymous namespace

static TableGen::Emitter::OptClass<CodeEmitterGen>		static TableGen::Emitter::OptClass<CodeEmitterGen>
X("gen-emitter", "Generate machine code emitter");		X("gen-emitter", "Generate machine code emitter");
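The new emitCaseMap() helper relies on the case maps being keyed by the generated case body, so instructions whose bodies are textually identical end up sharing one `case` group. The following is a simplified, self-contained sketch of that grouping, assuming `std::ostream` in place of `raw_ostream`; the names `emitCaseMapSketch`, `renderExample`, `Demo::InstA` and `Demo::InstB` are hypothetical and do not appear in the patch.

```cpp
#include <cstddef>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Key: the text of a case body. Value: every instruction that shares it.
using CaseMapTy = std::map<std::string, std::vector<std::string>>;

// Simplified stand-in for emitCaseMap(): print all labels that share a body,
// then the body once, then a closing "break".
void emitCaseMapSketch(std::ostream &OS, const CaseMapTy &CaseMap) {
  for (CaseMapTy::const_iterator IE = CaseMap.begin(), EE = CaseMap.end();
       IE != EE; ++IE) {
    const std::string &Body = IE->first;
    const std::vector<std::string> &InstList = IE->second;
    for (std::size_t I = 0, N = InstList.size(); I < N; ++I) {
      if (I)
        OS << "\n";
      OS << "    case " << InstList[I] << ":";
    }
    OS << " {\n" << Body << "      break;\n    }\n";
  }
}

// Two made-up instructions with the same operand layout collapse into a
// single case group in the emitted text.
std::string renderExample() {
  const std::string Body = "      switch (OpNum) {\n"
                           "      case 0:\n"
                           "        // op: BI\n"
                           "        return 20;\n"
                           "      }\n";
  CaseMapTy BitOffsetCaseMap;
  BitOffsetCaseMap[Body].push_back("Demo::InstA");
  BitOffsetCaseMap[Body].push_back("Demo::InstB");

  std::ostringstream OS;
  emitCaseMapSketch(OS, BitOffsetCaseMap);
  return OS.str();
}
```

Keying on the body text keeps the generated switch compact when many instructions share an encoding layout, which is presumably why the patch reuses the same helper for both the encoder cases and the bit-offset cases.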