This is an archive of the discontinued LLVM Phabricator instance.

Support prefix padding for alignment purposes (Relaxable instructions only)
ClosedPublic

Authored by reames on Feb 27 2020, 1:21 PM.

Details

Summary

Now that D75203 has landed and baked for a few days, extend the basic approach to prefix padding as well. The patch itself is fairly straight forward.

I did hit a small snag which is that we don't really have a good model for how many prefixes are safe to add with the current information. I'd planned to use the nop length information as a proxy, but it appears that this has never been updated for modern Intels. I haven't investigated to figure out why just yet.

For the moment, this patch adds the functional support and some basic testing there of, but defaults to not enabling prefix padding. I want to be able to phrase a separate patch which adds the target specific reasoning and test it cleanly. I haven't decided whether I want to common it with the nop logic or not.

Diff Detail

Event Timeline

reames created this revision.Feb 27 2020, 1:21 PM
reames updated this revision to Diff 248383.Mar 4 2020, 7:42 PM

Rebase on recently landed patches, haven't yet addressed major todos.

reames updated this revision to Diff 249207.Mar 9 2020, 1:52 PM
reames retitled this revision from [POC] Demonstrate prefix padding on top of D75203 to Support prefix padding for alignment purposes (Relaxable instructions only).Mar 9 2020, 1:57 PM
reames edited the summary of this revision. (Show Details)
reames set the repository for this revision to rG LLVM Github Monorepo.
skan added inline comments.Mar 10 2020, 4:37 AM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
752

uint8_t TargetPrefixMax = static_cast<uint8_t>(X86PadMaxPrefixSize)

reames updated this revision to Diff 249512.Mar 10 2020, 4:10 PM

Address review comment

reames marked an inline comment as done.Mar 10 2020, 4:10 PM
skan added a comment.Mar 10 2020, 7:39 PM

Encountered a compile error when building SPEC with option -fno-lto -mllvm --x86-align-branch-boundary=32 -mllvm -x86-align-branch=fused+jcc+jmp -mllvm -x86-pad-max-prefix-size=5.

fatal error: error in backend: value of -130 is too large for field of 1 byte.

This error is reported when applying fixups

void X86AsmBackend::applyFixup(const MCAssembler &Asm, const MCFixup &Fixup,
                               const MCValue &Target,
                               MutableArrayRef<char> Data,
                               uint64_t Value, bool IsResolved,
                               const MCSubtargetInfo *STI) const {
  unsigned Size = getFixupKindSize(Fixup.getKind());

  assert(Fixup.getOffset() + Size <= Data.size() && "Invalid fixup offset!");

  int64_t SignedValue = static_cast<int64_t>(Value);
  if ((Target.isAbsolute() || IsResolved) &&
      getFixupKindInfo(Fixup.getKind()).Flags &
      MCFixupKindInfo::FKF_IsPCRel) {
    // check that PC relative fixup fits into the fixup size.
    if (Size > 0 && !isIntN(Size * 8, SignedValue))
      Asm.getContext().reportError(
                                   Fixup.getLoc(), "value of " + Twine(SignedValue) +
                                   " is too large for field of " + Twine(Size) +
                                   ((Size == 1) ? " byte." : " bytes."));

SKan, any chance you could bugpoint the failing IR? Having to setup a full clang and SPEC build to reproduce will be very time intensive.

skan added a comment.Mar 11 2020, 4:14 AM

SKan, any chance you could bugpoint the failing IR? Having to setup a full clang and SPEC build to reproduce will be very time intensive.

I have minimized the test case

  .text
.L1:
.rept 126
  int3
.endr
  jmp .L1
.rept 30
  int3
.endr
.p2align 5

Compile this with command

clang "-cc1as" "-triple" "x86_64-unknown-linux-gnu" "-filetype" "obj" "-main-file-name" "toke.s" "-target-cpu" "x86-64" "-mllvm" "-x86-pad-max-prefix-size=5" "-o" "test.o" "test.s"

and you would get the error

test.s:8:3: error: value of -130 is too large for field of 1 byte.
  jmp .L1

Reason of failure:
In this patch, we try to relax the instruction first and then try to add prefixes to it, which makes the jmp require a larger negative offset than it can encode.

reames updated this revision to Diff 250002.Mar 12 2020, 11:46 AM

Address bug found by skan. Doing so unfortunately limits the scope quite a bit, but getting access to the value for the fixup will require a bit of redesign. I would much prefer to get the reduced scope version enabled, and then broaden the scope with testing in place.

reames updated this revision to Diff 250004.Mar 12 2020, 11:48 AM

Use the proper diff command this time...

skan added inline comments.Mar 12 2020, 7:39 PM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
848–851

I'm not sure only checking FKF_IsPCRel here is safe enough, since the logic in X86AsmBackend::applyFixup also checks whether the value can fit in the fixup field when kind of fixup is not FKF_IsPCRel. Can we return false directly if !isFullyRelaxed(RF)?

reames updated this revision to Diff 250228.Mar 13 2020, 9:09 AM

Per review comment, be more conservative about non-fully relaxed instructions.

I think the previous code was in fact correct, but given it doesn't help any practical test I know of, being super conservative is reasonable for now.

Can I get an LGTM on this?

skan added a comment.Mar 13 2020, 9:13 PM

Per review comment, be more conservative about non-fully relaxed instructions.

I think the previous code was in fact correct, but given it doesn't help any practical test I know of, being super conservative is reasonable for now.

Can I get an LGTM on this?

I am running some tests for this patch, I will give it a LGTM once the tests are done.

skan accepted this revision.Mar 14 2020, 9:03 AM

I applied this patch and found no new fail for SPEC.

This revision is now accepted and ready to land.Mar 14 2020, 9:03 AM
skan added inline comments.Mar 14 2020, 9:15 AM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
110

Before landing this patch, could you make this description better? I don't think the word "redundant" is appropriate here since X86PadMaxPrefixSize counts in the existing prefixes of the instructions,which may be useful, e.g. escape prefix.

This revision was automatically updated to reflect the committed changes.