Page MenuHomePhabricator

reames (Philip Reames)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 23 2013, 6:32 PM (326 w, 12 h)

Recent Activity

Tue, Jan 21

reames planned changes to D72721: [BranchAlign] Disable autopadding in cold blocks to reduce code size impact.

@reames
I'm a little bit concerned about the performance impact of this patch. Do you have any data for the patch? Is there any performance penalty and how much code size is reduced?

Can you expand a little bit on your concern? This is driven by profile data and only applies if that data is available. If it is, it suppresses additional padding only in blocks thought to be globally cold.

Tue, Jan 21, 10:29 AM · Restricted Project

Wed, Jan 15

reames updated the diff for D72721: [BranchAlign] Disable autopadding in cold blocks to reduce code size impact.

Address review comment

Wed, Jan 15, 10:12 AM · Restricted Project
reames accepted D72754: [BasicBlock] add helper getPostdominatingDeoptimizeCall.
Wed, Jan 15, 8:12 AM · Restricted Project

Tue, Jan 14

reames committed rG1a7398eca204: [BranchAlign] Add master --x86-branches-within-32B-boundaries flag (authored by reames).
[BranchAlign] Add master --x86-branches-within-32B-boundaries flag
Tue, Jan 14, 6:19 PM
reames closed D72738: [BranchAlign] Add master --x86-branches-within-32B-boundaries flag.
Tue, Jan 14, 6:19 PM · Restricted Project
reames updated the diff for D72738: [BranchAlign] Add master --x86-branches-within-32B-boundaries flag.

Address review comment

Tue, Jan 14, 4:23 PM · Restricted Project
reames updated the diff for D72738: [BranchAlign] Add master --x86-branches-within-32B-boundaries flag.

Address review comment

Tue, Jan 14, 3:33 PM · Restricted Project
reames planned changes to D72291: Reimplement BoundaryAlign mechanism (mostly, but not quite NFC).

Getting this off review queues for the moment. Will return to this once other more urgent items in this area settle out.

Tue, Jan 14, 3:23 PM · Restricted Project
reames added a comment to D72225: Align branches within 32-Byte boundary(Prefix padding).

I have serious reservations about the rush to land this patch. I have expressed some of this privately, and other bits have been in previous review threads, but I want to put everything in one public place.

Tue, Jan 14, 3:23 PM · Restricted Project
reames created D72738: [BranchAlign] Add master --x86-branches-within-32B-boundaries flag.
Tue, Jan 14, 2:55 PM · Restricted Project
reames created D72721: [BranchAlign] Disable autopadding in cold blocks to reduce code size impact.
Tue, Jan 14, 10:23 AM · Restricted Project

Sat, Jan 11

reames added a comment to D72519: [LoopInfo] Support multi edge in getLoopLatch().

I think this may break a fairly major assumption that there's a single backedge if there is a latch block. As a simple example, here's the comment from LoopSimplify:

// If the header has more than two predecessors at this point (from the
// preheader and from multiple backedges), we must adjust the loop.
BasicBlock *LoopLatch = L->getLoopLatch();
if (!LoopLatch) {
Sat, Jan 11, 9:24 AM · Restricted Project
reames added a comment to D71563: [SCEV] Recognise the hardwareloop "loop.decrement.reg" intrinsic.

SCEV code looks fine, no comment on the expected semantics of the intrinsic.

Sat, Jan 11, 9:06 AM · Restricted Project
reames added a comment to D72215: [AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry.

@reames Is D19046 the LLVM function attribute "patchable-function" and the target opcode PATCHABLE_OP used by any projects? Sanjoy said you may know.

Sorry, for the slow response.

Sat, Jan 11, 9:06 AM · Restricted Project
reames committed rG1d641daf2603: [X86] Adjust nop emission by compiler to consider target decode limitations (authored by reames).
[X86] Adjust nop emission by compiler to consider target decode limitations
Sat, Jan 11, 8:47 AM
reames committed rG563d3e344452: [X86AsmBackend] Move static function before sole use [NFC] (authored by reames).
[X86AsmBackend] Move static function before sole use [NFC]
Sat, Jan 11, 8:47 AM
reames committed rG6cb3957730e9: [X86AsmBackend] Be consistent about placing definitions out of line [NFC] (authored by reames).
[X86AsmBackend] Be consistent about placing definitions out of line [NFC]
Sat, Jan 11, 8:47 AM

Fri, Jan 10

reames committed rG0c29d3ff2233: [Tests] Precommit tests showing default branch padding on skylake (authored by reames).
[Tests] Precommit tests showing default branch padding on skylake
Fri, Jan 10, 11:56 AM

Thu, Jan 9

reames added a comment to D72225: Align branches within 32-Byte boundary(Prefix padding).

I'm having a really hard time wrapping my head around the implementation. Can you give a high level summary of the usage of BoundaryAlign after this change? A couple of examples w/all the fragments written out might also help a lot.

Thu, Jan 9, 2:49 PM · Restricted Project
reames added a comment to D72225: Align branches within 32-Byte boundary(Prefix padding).

First set of purely minor stylistic comments as I start to get familiar with the code.

Thu, Jan 9, 1:41 PM · Restricted Project

Wed, Jan 8

reames committed rGba181d0063e4: [X86] Keep cl::opts at top of file [NFC] (authored by reames).
[X86] Keep cl::opts at top of file [NFC]
Wed, Jan 8, 10:59 AM
reames committed rG29ccb12e2c12: [BranchAlign] Compiler support for suppressing branch align (authored by reames).
[BranchAlign] Compiler support for suppressing branch align
Wed, Jan 8, 10:12 AM
reames closed D72303: [BranchAlign] Compiler support for suppressing branch align.
Wed, Jan 8, 10:11 AM · Restricted Project
reames added a comment to D72303: [BranchAlign] Compiler support for suppressing branch align.

I've addressed the vast majority of style comments and have noted why I don't think the remainder are appropriate.

Wed, Jan 8, 9:43 AM · Restricted Project
reames updated the diff for D72303: [BranchAlign] Compiler support for suppressing branch align.

Missed a review comment, now addressed.

Wed, Jan 8, 9:43 AM · Restricted Project
reames updated the diff for D72303: [BranchAlign] Compiler support for suppressing branch align.
Wed, Jan 8, 9:43 AM · Restricted Project

Tue, Jan 7

reames added inline comments to D72303: [BranchAlign] Compiler support for suppressing branch align.
Tue, Jan 7, 6:37 PM · Restricted Project
reames updated the diff for D72303: [BranchAlign] Compiler support for suppressing branch align.
Tue, Jan 7, 4:49 PM · Restricted Project
reames committed rG312a532dc045: [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats (authored by reames).
[GVN/FP] Considate logic for reasoning about equality vs equivalance for floats
Tue, Jan 7, 4:10 PM
reames closed D67126: [GVN/FP] Considate logic for reasoning about equality vs equivalance for floats.
Tue, Jan 7, 4:09 PM · Restricted Project
reames added inline comments to D72303: [BranchAlign] Compiler support for suppressing branch align.
Tue, Jan 7, 4:00 PM · Restricted Project
reames updated the diff for D72303: [BranchAlign] Compiler support for suppressing branch align.

Completely rework patch to support integrated assembler only + comments for testing.

Tue, Jan 7, 3:04 PM · Restricted Project

Mon, Jan 6

reames planned changes to D72303: [BranchAlign] Compiler support for suppressing branch align.

It looks like the actual padding support got lost in one of the rebases. I'll need to rewrite that part. Though, I think I'm going to take an alternate approach which prioritizes the integrated assembler and simple marks nopadding regions with comments (i.e. rather than directives). That gives me most of what I'm looking for testing wise, and is a lot less likely to be controversial.

Mon, Jan 6, 3:52 PM · Restricted Project
reames committed rG08d17cb065da: [X86] Move an enum definition into a header to simplify future patches [NFC] (authored by reames).
[X86] Move an enum definition into a header to simplify future patches [NFC]
Mon, Jan 6, 3:22 PM
reames updated the diff for D72303: [BranchAlign] Compiler support for suppressing branch align.

Really helps if I upload the patch correctly...

Mon, Jan 6, 2:07 PM · Restricted Project
reames created D72303: [BranchAlign] Compiler support for suppressing branch align.
Mon, Jan 6, 1:29 PM · Restricted Project
reames created D72291: Reimplement BoundaryAlign mechanism (mostly, but not quite NFC).
Mon, Jan 6, 10:16 AM · Restricted Project

Dec 22 2019

reames committed rGbe051f4312aa: [Test] Add examples of problematic assembler auto-padding (authored by reames).
[Test] Add examples of problematic assembler auto-padding
Dec 22 2019, 9:01 AM

Dec 20 2019

reames committed rGdcda6be7579c: Add a set of tests with basic coverage of the recently added boundary align… (authored by reames).
Add a set of tests with basic coverage of the recently added boundary align…
Dec 20 2019, 5:22 PM
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I've gone ahead and landed the patch so that we can iterate in tree. See commit 14fc20ca62821b5f85582bf76a467d412248c248.

Dec 20 2019, 12:34 PM · Restricted Project, Restricted Project
reames committed rGc148e2e2ef86: More style cleanups following rG14fc20ca6282 [NFC] (authored by reames).
More style cleanups following rG14fc20ca6282 [NFC]
Dec 20 2019, 12:24 PM
reames committed rG4024d49edc15: Fix a memory leak introduced w/the instruction padding support in rG14fc20ca6282 (authored by reames).
Fix a memory leak introduced w/the instruction padding support in rG14fc20ca6282
Dec 20 2019, 12:13 PM
reames committed rG8b725f0459ee: Comment and adjust style in the newly introduced MCBoundaryAlignFragment… (authored by reames).
Comment and adjust style in the newly introduced MCBoundaryAlignFragment…
Dec 20 2019, 12:12 PM
reames committed rG14fc20ca6282: Align branches within 32-Byte boundary (NOP padding) (authored by reames).
Align branches within 32-Byte boundary (NOP padding)
Dec 20 2019, 11:41 AM
reames closed D70157: Align branches within 32-Byte boundary(NOP padding).
Dec 20 2019, 11:41 AM · Restricted Project, Restricted Project

Dec 19 2019

reames accepted D70157: Align branches within 32-Byte boundary(NOP padding).

LGTM. The patch isn't perfect, but this has reached the point where landing and iterating in tree is the fastest way to convergence. To be clear, this LGTM *only* applies to the current patch contents, and the *internal only* flag names this introduces. These may change before we expose this publicly.

Dec 19 2019, 2:53 PM · Restricted Project, Restricted Project
reames committed rG8277c91cf342: [StackMaps] Be explicit about label formation [NFC] (try 2) (authored by reames).
[StackMaps] Be explicit about label formation [NFC] (try 2)
Dec 19 2019, 2:11 PM
reames committed rGbc7595d934b9: [StackMaps] Be explicit about label formation [NFC] (authored by reames).
[StackMaps] Be explicit about label formation [NFC]
Dec 19 2019, 12:43 PM
reames committed rGcf6aafa47c37: [FaultMaps] Make label formation a bit more explicit [NFC] (authored by reames).
[FaultMaps] Make label formation a bit more explicit [NFC]
Dec 19 2019, 12:43 PM
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

Specifically on the revised patch, I remain confused by the need for multiple subtypes. The need for fragments *between* the potentially fused instructions doesn't make sense to me. What I was expecting to see was the following:
BoundaryAlign w/target=the branch fragment
.. some possibly empty sequence of fragments (i.e. the test/cmp/etc..) ...
the branch fragment
a new data fragment if the branch fragment was a DF

(i.e. a single BounaryAlign fragment which aligns a payload which is defined as "next fragment to target fragment inclusive".)

To be specific, I'd expect to see the following for an example fused sequence:

  1. BoundaryAlign w/Target = 3
  2. DataFragment containing TEST RAX, RAX
  3. RelaxeableFragment containing JNE symbo

    Why do we need anything between the two fragments of the fused pair?

    (As a reminder, I am new to this code. If I'm missing the obvious, please just point it out.)

JUMP is not always emiteed into MCRelaxableFragment, it also can be emitted into MCDataFragment and arithmetic ops with constant arguments of unknown value (e.g. ADD,AND) can be emitted into MCRelaxableFragment , you can find related code in MCObjectStreamer::EmitInstructionImpl, X86AsmBackend::mayNeedRelaxation. Let's say JCC is fused with TEST, there are four possible positions for JCC and CMP

  1. JCC and CMP are in same MCDataFragment
  2. JCC and CMP are in two different MCDataFragment
  3. JCC and CMP are in two different MCRelaxableFragment
  4. JCC in a MCRelaxableFragment, CMP is in a MCDataFragment

    and since MCCompactEncodedInstFragment is not applicable yet, i don't what's its behaviour.

    In order to compute the total size of CMP and JCC in MCAssembler::relaxBoundaryAlign, I insert a FusedJccSplit to force CMP and JCC in two fragments. Do you have any better idea?

I agree there are multiple cases here, see the original generic description instead of the specific example. The general question is why a *range* of fragments can't be defined. Computing the instruction size for the entire range then just requires walking from first to last fragment in the range summing the size of each. If both instructions are within the same data fragment, then no relaxation is needed, and the size of both is simply the size of the data fragment.

Dec 19 2019, 9:02 AM · Restricted Project, Restricted Project

Dec 17 2019

reames added inline comments to D71581: [X86] Fix an 8 bit testb being selected when folding a volatile i32 load pattern..
Dec 17 2019, 5:48 PM · Restricted Project
reames updated the diff for D71315: [WIP] Draft assembler support for branch alignment.

Update to proposed .push_align_branch_boundary syntax. Note that there's no implementation - as that's in the original patch - this is just round trip serialization.

Dec 17 2019, 5:38 PM · Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I've updated the draft assembler support in https://reviews.llvm.org/D71315 to match the proposal here. Again, this is very much WIP and mostly just to show proposed syntax/usage.

Dec 17 2019, 5:38 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

Specifically on the revised patch, I remain confused by the need for multiple subtypes. The need for fragments *between* the potentially fused instructions doesn't make sense to me. What I was expecting to see was the following:
BoundaryAlign w/target=the branch fragment
.. some possibly empty sequence of fragments (i.e. the test/cmp/etc..) ...
the branch fragment
a new data fragment if the branch fragment was a DF

Dec 17 2019, 12:23 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

.push_align_branch_boundary [N,] [instruction,]*

I'd like to raise again the possibility of using a more general region directive to denote "It is allowable to add prefixes/nops before instructions in this region if the assembler wants to", as I'd started discussing in https://reviews.llvm.org/D71238#1786885 (but let's move the discussion here).

James, I think this proposal is increasing the scope of this proposal too much. It also ignores some of the use cases identified and described in the writeup (i.e. the scoped semantics). I'm open to discussing such a feature more generally, but I'd prefer to see a more narrowly focused feature immediately.

Dec 17 2019, 12:07 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

There is a precedant: .pushsection/.popsection (MCStreamer::SectionStack). With .push_align_branch/.pop_align_branch, we probably don't need the 'switch-to-default' action.

I don't know how likely we may ever need nested states (e.g. an .include directive inside an .align_branch region where the included file has own idea about branch alignment), but .push/.pop does not seem to be more complex than disable/enable/default.

I rethink about the directives and prefer the .push/.pop pair as @MaskRay suggested. To be specified, I'd suggest to use .push_align_branch_boundary and .pop_align_branch_boundary to align with MC command line options. They will cowork with the command line options and overwrite the options if both are existing.

I agree that we need the push/pop semantics.

Dec 17 2019, 12:01 PM · Restricted Project, Restricted Project

Dec 16 2019

reames abandoned D71238: Align non-fused branches within 32-Byte boundary (basic case).

Abandoning this patch. After call, we're consolidating back on the original patch now that direction has been agreed on.

Dec 16 2019, 5:49 PM · Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

Here are the minutes from our phone call a few minutes ago.

Dec 16 2019, 5:40 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

Noting another issue we found in local testing (with an older version of this patch). This interacts badly with the implicit exception mechanism in LLVM. For that mechanism, we end up generating assembly which looks more or less like this:
Ltmp:

cmp %rsi, (%rdi)
jcc <target>
Dec 16 2019, 3:47 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

In offline discussion, there was an agreement that we needed further coordination to make sure this patch moves forward quickly. For that reason, there will be a call happening today at 4pm Pacific. Interested parties are welcome to attend.

Dec 16 2019, 10:53 AM · Restricted Project, Restricted Project

Dec 12 2019

reames accepted D69914: [LVI] Normalize pointer behavior.

LGTM again.

Dec 12 2019, 4:35 PM · Restricted Project
reames requested changes to D69914: [LVI] Normalize pointer behavior.
Dec 12 2019, 12:01 PM · Restricted Project
reames added a comment to D71383: [LoopGuard] Instructions in loop preheader and loop exit must be safe to execute speculatively..

This doesn't make sense to me. I don't see why we need the speculation restriction here. The discussion mentions concern about side exits, but if so, that's a different property. (i.e. isGuaranteedToTransferToSuccessor)

Dec 12 2019, 11:39 AM · Restricted Project

Dec 11 2019

reames updated the diff for D71315: [WIP] Draft assembler support for branch alignment.

Rebase after CodePadding removal.

Dec 11 2019, 10:49 AM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

Do we need some option or syntax to disable ".boundary_align 4", so that the same code can be compiled for some machine that do not need align branch?

I wouldn't think so. Macros could be used for this purpose if needed. We - to my admittedly limited knowledge - don't have anything analogous for other forms of alignment for instance.

Dec 11 2019, 9:08 AM · Restricted Project

Dec 10 2019

reames updated the summary of D71315: [WIP] Draft assembler support for branch alignment.
Dec 10 2019, 4:36 PM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

I've gone as far as writing a rough draft of textual assembler support.

I posted the WIP draft here (https://reviews.llvm.org/D71315). Please keep the discussion here or the original thread. I don't want to fork discussion further. Once we've settled direction, I can cleanup and repost the patch if needed.

Dec 10 2019, 4:36 PM · Restricted Project
reames created D71315: [WIP] Draft assembler support for branch alignment.
Dec 10 2019, 4:36 PM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

Honestly, I both strongly agree with this, and see the argument as a somewhat lost cause. I think in practical terms, we *will* have to support something which works for both compiler generated assembly and legacy hand written assembly. I dislike that fact and the design it encourages, but it's reality.

Hand-written assembly is generally responsible for its own performance optimizations, and we do not usually expect the assembler to muck around with the code. So, why is it assumed that _this_ optimization should be done automatically on such assembly code?

The more I think about this, the more I think James is right here.

Dec 10 2019, 2:37 PM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

I just want to mention here, that my comment on the original patch still stands and really needs to be addressed:

We need to have detailed performance and binary size numbers for this kind of change. We should be looking at the test suite and SPEC for performance. I'd like to see before/after performance numbers on a system that is *not* using the microcode feature that addresses jCC (either a newer CPU, an older CPu, or an unpatched CPU). I'd also like to see before/after performance numbers on a system that *is* using the microcode feature to address the jCC issue. I'd suggest Clang itself, Firefox, and Chromium as binaries to show the binary size impact of this patch. Happy for others to suggest binaries to use to evaluate the binary size impact.

I strongly agree that Intel should share these numbers in detail. So far, we've seen some summary shared on llvm-dev, but nothing in detail.

Dec 10 2019, 9:50 AM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

In function X86AsmBackend::alignBranchesBegin, I added the comment

// The prefix or nop isn't inserted if the previous item is hard code, which
// may be used to hardcode an instruction, since there is no clear instruction
// boundary.

I'm sorry if I didn't make it clear. As far as i am concerned, I think the hard code case should be dealt with.

I'm still not following. Can you explain the use case here? What test case differs based on the presence of the "hard code" support in the original patch?

Consider hand-written assembly code which looks like this:
...
Inserting anything between the tacked-on-prefix with .byte, and the rest of the instruction written mnemonically, would be bad.

Ok, this makes more sense now.

Dec 10 2019, 9:32 AM · Restricted Project

Dec 9 2019

reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

The patch doesn't handle with the hard code case either, but as far as I can see that change was not mentioned in the description of this patch.

Unless I'm misreading the original patch, the notion of "hard code" is only relevant to the prefix padding scheme. Admittedly, I might be misreading, it isn't clearly stated anywhere in the original patch exactly what "hard code" is or what purpose it serves. .

In function X86AsmBackend::alignBranchesBegin, I added the comment

// The prefix or nop isn't inserted if the previous item is hard code, which
// may be used to hardcode an instruction, since there is no clear instruction
// boundary.

I'm sorry if I didn't make it clear. As far as i am concerned, I think the hard code case should be dealt with.

I'm still not following. Can you explain the use case here? What test case differs based on the presence of the "hard code" support in the original patch?

Dec 9 2019, 7:26 PM · Restricted Project
reames added a comment to D71238: Align non-fused branches within 32-Byte boundary (basic case).

The patch doesn't handle with the hard code case either, but as far as I can see that change was not mentioned in the description of this patch.

Unless I'm misreading the original patch, the notion of "hard code" is only relevant to the prefix padding scheme. Admittedly, I might be misreading, it isn't clearly stated anywhere in the original patch exactly what "hard code" is or what purpose it serves. .

Dec 9 2019, 6:49 PM · Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I just posted an alternate review (https://reviews.llvm.org/D71238) which attempts to carve out a minimum reviewable piece of complexity. The hope is that we can review that one quickly (as there are fewer interacting concerns), and then rebase this one (possibly splitting further).

Dec 9 2019, 5:18 PM · Restricted Project, Restricted Project
reames created D71238: Align non-fused branches within 32-Byte boundary (basic case).
Dec 9 2019, 5:16 PM · Restricted Project

Dec 6 2019

reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

Recording something so I don't forget it when we get back to the prefix padding version. The write up on the bundle align mode stuff mentions a concerning memory overhead for the feature. Since the basic implementation techniques are similar, we need to make sure we assess the memory overhead of the prefix padding implementation. See https://www.chromium.org/nativeclient/pnacl/aligned-bundling-support-in-llvm for context. I don't believe this is likely to be an issue for the nop padding variant.

Dec 6 2019, 11:18 AM · Restricted Project, Restricted Project
reames added a comment to D71106: [MC] Delete MCCodePadder.

LGTM w/minor comment addressed before submit.

Dec 6 2019, 11:09 AM · Restricted Project

Dec 5 2019

reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I've been digging through the code for this for the last day or so. This is a new area for me, so it's possible I'm off base, but I have some concerns about the current design.

First, there appears to already be support for instruction bundling and alignment in the assembler today. I stumbled across the .bundle_align_mode, .bundle_start, and .bundle_end mechanism (https://lists.llvm.org/pipermail/llvm-dev/2012-December/056723.html) which seems to *heavily* overlap with this proposal. I suspect that the compiler support suggested by James and myself earlier in this thread could be implemented on to this existing mechanism.

Second, the new callbacks and infrastructure added appear to overlap heavily w/the MCCodePadding infrastructure. (Which, admitted, appears unused and untested.)

My conclusion after looking at all of that was actually that I plan to propose removing both the MCCodePadding and all the bundle-padding infrastructure, not add new stuff on top of it -- the former is unused, and I believe the latter is only for Chrome's NaCL, which is deprecated, and fairly close to being removed. If we need something similar in the future, we should certainly look to both of those for inspiration, but I don't think we need to be constrained by them.

I can definitely see removing the code padding stuff, since it's unused and untested.

Dec 5 2019, 5:47 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I've been digging through the code for this for the last day or so. This is a new area for me, so it's possible I'm off base, but I have some concerns about the current design.

Dec 5 2019, 3:17 PM · Restricted Project, Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

We uncovered another functional issue with this patch, or at least, the interaction of this patch and other parts of LLVM. In our support for STATEPOINT, PATCHPOINT, and STACKMAP we use N-byte nop sequences for regions of code which might be patched out. It's important that these regions are exactly N bytes as concurrent patching which doesn't replace an integral number of instructions is ill-defined on X86-64. This patch causes the N-byte nop sequence to sometimes become (N+M) bytes which breaks the patching. I believe that the XRAY support may have a similar issue.

Dec 5 2019, 2:39 PM · Restricted Project, Restricted Project

Dec 2 2019

reames added a comment to D70376: [LVI] Restructure caching.

Adding @reames in case he has some thoughts on how we handle caching/invalidation.

This looks very reasonable to me. I ran through it and didn't spot anything obviously problematic, so also LGTM.

Dec 2 2019, 5:17 PM · Restricted Project
reames resigned from D70901: [Matrix] Update shape propagation to iterate until done..
Dec 2 2019, 4:52 PM · Restricted Project
reames resigned from D70900: [Matrix] Propagate and use shape information for loads..
Dec 2 2019, 4:52 PM · Restricted Project
reames resigned from D70898: [Matrix] Propagate and use shape info for binary operators..
Dec 2 2019, 4:52 PM · Restricted Project
reames resigned from D70899: [Matrix] Implement back-propagation of shape information..
Dec 2 2019, 4:52 PM · Restricted Project
reames resigned from D70897: [Matrix] Add forward shape propagation and first shape aware lowerings..
Dec 2 2019, 4:52 PM · Restricted Project
reames abandoned D70623: [SCEV] Compute trip counts w/frozen conditions.

I agree with the analysis from Nuno, and thus the patch as written is wrong.

Dec 2 2019, 4:43 PM · Restricted Project
reames added a comment to D70157: Align branches within 32-Byte boundary(NOP padding).

I want to chime in support of jyknight's meta comments - particularly the one about the need to balance execution speed vs code size differently in hot vs cold code. For our use case, we have a very large amount of branch dense known cold paths, and being able to only align fast path branches would be a substantial space savings.

Dec 2 2019, 9:39 AM · Restricted Project, Restricted Project

Nov 22 2019

reames created D70623: [SCEV] Compute trip counts w/frozen conditions.
Nov 22 2019, 3:11 PM · Restricted Project
reames committed rG3f8a2af8f43f: Slightly speculative buildbot fix for issue reported in 8293f74 commit thread (authored by reames).
Slightly speculative buildbot fix for issue reported in 8293f74 commit thread
Nov 22 2019, 11:39 AM

Nov 21 2019

reames committed rGdfb7a9091aff: [LoopPred] Robustly handle partially unswitched loops (authored by reames).
[LoopPred] Robustly handle partially unswitched loops
Nov 21 2019, 3:52 PM
reames added inline comments to D70502: Broaden the definition of a "widenable branch".
Nov 21 2019, 3:52 PM · Restricted Project
reames committed rG8293f7434577: Further cleanup manipulation of widenable branches [NFC] (authored by reames).
Further cleanup manipulation of widenable branches [NFC]
Nov 21 2019, 3:16 PM
reames committed rGaaea24802bf5: Broaden the definition of a "widenable branch" (authored by reames).
Broaden the definition of a "widenable branch"
Nov 21 2019, 10:50 AM
reames committed rGd9426c336089: [Tests] Autogenerate a bunch of SCEV trip count tests for readability. Will… (authored by reames).
[Tests] Autogenerate a bunch of SCEV trip count tests for readability. Will…
Nov 21 2019, 10:50 AM
reames closed D70502: Broaden the definition of a "widenable branch".
Nov 21 2019, 10:50 AM · Restricted Project
reames committed rG70d173fb1f7b: [SCEV] Add a mode to skip classification when printing analysis (authored by reames).
[SCEV] Add a mode to skip classification when printing analysis
Nov 21 2019, 10:32 AM
reames committed rGf1a9a8323223: [SCEV] Be robust against IR generated by simple-loop-unswitch (authored by reames).
[SCEV] Be robust against IR generated by simple-loop-unswitch
Nov 21 2019, 9:55 AM
reames closed D70459: [SCEV] Be robust against IR generated by simple-loop-unswitch.
Nov 21 2019, 9:55 AM · Restricted Project
reames added inline comments to D70502: Broaden the definition of a "widenable branch".
Nov 21 2019, 9:54 AM · Restricted Project