Page MenuHomePhabricator

[X86] Limit prefix padding w/target specific padding amount
Needs ReviewPublic

Authored by reames on Wed, Mar 18, 5:13 PM.

Details

Summary

This change defines the maximum number of prefixes safe to use for padding in a target specific way. As noted in the comments, different processor generations behave very differently w.r.t. redundant prefixes. Some processors - mostly, but not entirely older ones - encounter serious stalls if too many prefixes are used.

Note: Originally, this patch also enabled the prefix padding by default. That's currently on hold.

Diff Detail

Event Timeline

reames created this revision.Wed, Mar 18, 5:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptWed, Mar 18, 5:13 PM
reames planned changes to this revision.Wed, Mar 18, 5:23 PM

I need to restructure this. As I noted over in https://reviews.llvm.org/D76176, we have a problem for the assembler use case w/inserting prefixes between previous data encoded prefixes and the instruction. This isn't relevant for the compiler (since we have the suppression mechanism), but we need to structure this such that only the compiler enables this feature. I'm very unhappy about how ugly that is, but I don't see a way around it.

skan added a comment.Wed, Mar 18, 8:35 PM

D76285 really reveals a bug when we want to enable prefix padding. I don't believe we can enable prefix padding of relaxable instructions by default without fixing that.

reames updated this revision to Diff 251413.Thu, Mar 19, 10:28 AM
reames retitled this revision from [X86] Enable prefix padding w/target specific padding amount to [X86] Limit prefix padding w/target specific padding amount.
reames edited the summary of this revision. (Show Details)

Restructure patch not to enable by default. This allows the target specific logic to be tested in tree while avoiding the optimized assembler issue for now.

@skan - Will reply to your point separately via email. I disagree and think we need a broader discussion on this point.

reames updated this revision to Diff 251420.Thu, Mar 19, 11:05 AM

Add an extra test to cover the target specific limits. Test structure is analogous to x86_long_nop.s.

LuoYuanke added inline comments.Thu, Mar 19, 8:38 PM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
844

I'm not sure the long nop feature can determine the prefix size. I'd leave it to Craig. :)

craig.topper added inline comments.Thu, Mar 19, 9:16 PM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
820

manditory->mandatory

846

What does conservatively handled mean here?

reames marked an inline comment as done.Wed, Mar 25, 11:00 AM
reames added inline comments.
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
846

Basically, do all targets have the right setting for the number of prefixes allowed (proxied by nop length atm) and that escape bytes are counted where appropriate. (e.g. atom can decode three prefixes, but silvermont only gets three of either prefix or escape bytes.)

I haven't dug through all the code involved to figure out escape bytes are represented. In some cases, we appear to treat them as prefixes, but I'm not sure this handling is universal. This why I'm disabling prefix padding on targets which need to care about presence of escape bytes.

skan added inline comments.Fri, Mar 27, 1:29 AM
llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
865

Use a C++ style cast here.

865

Since the bug fix patch D76285 was landed, I think we can allow all targets to decide how many prefixes can be added by default,
and user can pass option -x86-pad-max-prefix-size to override the default value.

llvm/test/MC/X86/prefix-padding-length.s
11

From my knowledge, atom can decode 5 prefixes without penalty.