This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add RDPRU instruction
ClosedPublic

Authored by probinson on Jun 30 2022, 11:59 AM.

Details

Summary

Add support for the RDPRU instruction on Zen2 processors.

User-facing features:

  • Clang option -m[no-]rdpru to enable/disable the feature
  • Support is implicit for znver2/znver3 processors
  • Preprocessor symbol RDPRU to indicate support
  • Header rdpruintrin.h to define intrinsics
  • "rdpru" mnemonic supported for assembler code

Internal features:

  • Clang builtin __builtin_ia32_rdpru
  • IR intrinsic @llvm.x86.rdpru

Diff Detail

Event Timeline

probinson created this revision.Jun 30 2022, 11:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2022, 11:59 AM
probinson requested review of this revision.Jun 30 2022, 11:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 30 2022, 11:59 AM

If this patch is too big, I'm happy to break it into two (LLVM/Clang) or 4 (piecemeal features) parts.

Two questions:

  • Should the clang builtin be in BuiltinsX86_64.td instead of BuiltinsX86.td? It's supported only by Zen.
  • I didn't need to add a case for this in CGBuiltin.cpp, which was surprising; just adding the entry to BuiltinsX86.def was enough to automatically lower the Clang builtin to the IR intrinsic. Does that auto-magic mean there are cases that could be removed from CGBuiltin.cpp?

Oh, third question: We invented a new header for this. If gcc or somebody has added different intrinsics, or a different header, or whatever, let me know and I'm happy to adjust.

If this patch is too big, I'm happy to break it into two (LLVM/Clang) or 4 (piecemeal features) parts.

Two questions:

  • Should the clang builtin be in BuiltinsX86_64.td instead of BuiltinsX86.td? It's supported only by Zen.

BuiltinsX86_64.def generates a check that the triple is x86_64 and not i386/i686. It doesn't look like this instruction is 64-bit mode only so BuiltinsX86.def is the right place

  • I didn't need to add a case for this in CGBuiltin.cpp, which was surprising; just adding the entry to BuiltinsX86.def was enough to automatically lower the Clang builtin to the IR intrinsic. Does that auto-magic mean there are cases that could be removed from CGBuiltin.cpp?

The ClangBuiltin in IntrinsicsX86.td sets this up. All X86 code in CGBuiltin.cpp should be for builtins that don't appear in ClangBuiltin because they require some special handling.

craig.topper added inline comments.Jun 30 2022, 12:22 PM
llvm/lib/Target/X86/X86InstrSystem.td
742

I think TB here should be PS. Per table "Table A-8. Opcode 01h ModRM Extensions", the instruction is not valid with a 0xF2 or 0xF3 prefix. PS is our marker for that.

llvm/test/CodeGen/X86/rdpru.ll
3

Add i686 command lines to test the ReplaceNodeResults code.

llvm/test/MC/X86/x86-64.s
1581 ↗(On Diff #441474)

Please add a disassembler test too.

For reference the instruction is documented here: https://www.amd.com/system/files/TechDocs/24594.pdf (AMD64 APM Volume 3: General Purpose and System Instructions)

We definitely need to add i686 codegen test coverage (clang, llvm ir + mc) - as the instruction returns the timer values in EDX:EAX and we need to confirm that this is handled correctly

probinson updated this revision to Diff 441501.Jun 30 2022, 1:27 PM

Address review comments. Encode/decode test moved into its own file.

probinson marked 2 inline comments as done.Jun 30 2022, 1:29 PM

i686 tests added, but I have to leave it to the subject-matter experts whether they are correct.

llvm/lib/Target/X86/X86InstrSystem.td
742

Done, but is there a way I should be testing this?

32-bit codegen looks ok. 64-bit values are returned in edx:eax like the instruction.

llvm/lib/Target/X86/X86InstrSystem.td
742

You'd need to make sure it didn't disassemble with 0xf2 or 0xf3 before the regular instruction, but we're pretty lax on testing those cases for other instructions so it's not a big deal.

Please can you add a disassembler test?

Other than that - does anyone have any further comments?

llvm/test/CodeGen/X86/rdpru.ll
6

Worth adding -mcpu=znver2/znver3 tests?

  1. Update ReleaseNotes;
  2. Update clang/lib/Headers/cpuid.h and llvm/lib/Support/Host.cpp if there's a new bit for it;
  3. Add macro test in clang/test/Preprocessor/x86_target_features.c
clang/lib/Headers/rdpruintrin.h
25

Conflicted with line 11. Use x86intrin.h instead. The same below.

probinson updated this revision to Diff 442347.EditedJul 5 2022, 10:26 AM
probinson marked an inline comment as done.

Add __RDPRU__ test, release note, fix doc for intrinsics.

probinson marked an inline comment as done.Jul 5 2022, 10:31 AM
  1. Update ReleaseNotes;

Done, thanks for the reminder!

  1. Update clang/lib/Headers/cpuid.h and llvm/lib/Support/Host.cpp if there's a new bit for it;

Not done; I had considered this, but the bit lives in a CPUID leaf that is not currently supported at all by the header or Host.cpp. I'm willing to add RDPRU, with FIXMEs for the other feature bits, if people want it.

  1. Add macro test in clang/test/Preprocessor/x86_target_features.c

Done. For some reason I thought the clang intrinsic test would verify this, but actually it doesn't.

Please can you add a disassembler test?

Isn't that what test/MC/X86/RDPRU.s does (via llvm-objdump)? If "disassembler test" means something else, please point to an example.

probinson updated this revision to Diff 442355.Jul 5 2022, 10:46 AM

Add znver2/znver3 cases to rdpru.ll test

probinson marked an inline comment as done.Jul 5 2022, 10:46 AM

Please can you add a disassembler test?

Isn't that what test/MC/X86/RDPRU.s does (via llvm-objdump)? If "disassembler test" means something else, please point to an example.

https://github.com/llvm/llvm-project/tree/main/llvm/test/MC/Disassembler/X86 - they're kept inside the MC section, but its the "reverse lookup" disasm tests

probinson updated this revision to Diff 442377.Jul 5 2022, 12:53 PM

Add MC/Disassembler/X86 tests

@RKSimon is this what you meant?

Yes - thank you!

@pengfei any more comments?

pengfei accepted this revision.Jul 5 2022, 5:58 PM

LGTM.

llvm/docs/ReleaseNotes.rst
166–167

Nit: It's no ideal to put Clang info here. Maybe move it to Clang's Release Notes?

This revision is now accepted and ready to land.Jul 5 2022, 5:58 PM
probinson added inline comments.Jul 6 2022, 6:36 AM
llvm/docs/ReleaseNotes.rst
166–167

Will do.

This revision was landed with ongoing or failed builds.Jul 6 2022, 7:18 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2022, 7:18 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
probinson added a comment.EditedAug 11 2022, 8:02 AM
  1. Update clang/lib/Headers/cpuid.h and llvm/lib/Support/Host.cpp if there's a new bit for it;

Not done; I had considered this, but the bit lives in a CPUID leaf that is not currently supported at all by the header or Host.cpp. I'm willing to add RDPRU, with FIXMEs for the other feature bits, if people want it.

@RKSimon pointed out to me that I was mistaken about this. Not sure how I missed it... I'm about to go on holiday so he is authorized to clean up my mess.