chandlerc (Chandler Carruth)Administrator
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 7 2012, 2:54 PM (275 w, 4 d)
Roles
Administrator

Recent Activity

Yesterday

chandlerc added inline comments to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..
Tue, Oct 17, 11:16 AM
chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Add a test case covering critical edge splitting.

Tue, Oct 17, 11:15 AM
chandlerc added inline comments to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..
Tue, Oct 17, 3:26 AM
chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Add a stricter form of checking for profitability to avoid one potential issue
with this patch. Also clean up comments in various places to try and help
address code review feedback.

Tue, Oct 17, 3:24 AM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Hopefully responding to all of the comments. Note the last one which is probably the most interesting point of discussion around cost modeling.

Tue, Oct 17, 12:03 AM

Mon, Oct 16

chandlerc updated the diff for D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple..

Update addressing the remaining comments from Sanjoy.

Mon, Oct 16, 1:03 AM
chandlerc added inline comments to D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple..
Mon, Oct 16, 1:02 AM
chandlerc added a comment to D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple..

Updates and/or responses. Largely addressed all the comments with the next patch upload.

Mon, Oct 16, 12:52 AM

Sun, Oct 15

chandlerc updated the diff for D34200: [PM/unswitch] Teach SimpleLoopUnswitch to do non-trivial unswitching, making it no longer even remotely simple..

Rebase onto the new LoopInfo API and get everything working. Also rebase into
a monorepo checkout.

Sun, Oct 15, 12:31 AM

Fri, Oct 13

chandlerc added a comment to D38085: Use the basic cost if a GEP is not used as addressing mode.

This patch got reverted again, and I wanted to provide some insight as to why...

Fri, Oct 13, 11:00 AM

Thu, Oct 12

chandlerc added a comment to D38489: TargetMachine: Merge TargetMachine and LLVMTargetMachine.

Turns out this doesn't work nice with the current layering. There are some tools that just link libTarget but not libCodeGen. Merging LLVMTargetMachine and TargetMachine however forces us to move TargetMachine into libCodeGen breaking such users.

I think when merging LLVMTargetMachine and TargetMachine we also have to be consequent and merge libTarget into libCodeGen to facilitate that. Not sure how people think about that, and I'm not sure I will find the time to prepare patches for all llvm users linking to libTarget.

Thu, Oct 12, 6:23 PM

Wed, Oct 11

chandlerc added inline comments to D38824: [X86] Synchronize the existing CPU predefined macros with the cases that gcc defines them.
Wed, Oct 11, 4:12 PM
chandlerc added inline comments to D38824: [X86] Synchronize the existing CPU predefined macros with the cases that gcc defines them.
Wed, Oct 11, 3:18 PM

Tue, Oct 10

chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

so it is possible that the large improvement in compile time is due to we broke the pipeline and due to this less optimization was possible/executed.

Tue, Oct 10, 11:48 PM
chandlerc added a comment to D38433: Introduce a specialized data structure to be used in a subsequent change.

(Haven't addressed the code comments yet since the design isn't settled)

Have you considered building a ChunkedVector instead of a ChunkedList? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.

With a vector-of-buffers implementation, I'm a bit worried about the space overhead on the smaller cases. For instance, this is the histogram of how this data structure is populated from a clang-bootstrap (also in https://reviews.llvm.org/D38434):

     Count: 731310
       Min: 1
      Mean: 8.555150
50th %tile: 4
95th %tile: 25
99th %tile: 53
       Max: 433

If I used a vector-of-buffers, I will either have to recompute the capacity and end (of the last buffer) on every insert (which will require an additional deref and some computation) or have to keep two words in the data structure over the three that smallvector keeps anyway. This adds a lot of relative overhead on the median case (4 elements). In fact, the current situation of two extra words also qualifies as a "lot" of relative overhead IMO, and I want to think of an SSO to improve the situation.

Tue, Oct 10, 1:20 PM
chandlerc added a comment to D38433: Introduce a specialized data structure to be used in a subsequent change.

Have you considered building a ChunkedVector instead of a ChunkedList? Specifically, there is a great trick where you use a single index with the low bits being an index into the chunk and the high bits being an index into a vector of pointers. It has many of the benefits you list and is a bit simpler I think. It also supports essentially the entire vector API if desired. Both bi-directional and even random access are reasonably efficient. Good locality, etc.

Tue, Oct 10, 10:31 AM

Sat, Oct 7

chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

when this was merged in to rust I noticed that it did not give any gain.
I think that I made some thing wrong when I tested your modification from my original patch.
as I understand it is to late at this location due to the most time is spent "in" addFunctionSimplificationPasses.

what was the problem with having the GDCE pass directly after the add of the inliner?

Sat, Oct 7, 4:12 PM

Thu, Oct 5

chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

(quick responses, still working on code updates from the review)

Thu, Oct 5, 6:08 PM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

I've taken some time to go through the code in NewGVN that computes similar things based on the suggestion from Danny. These now do *very* similar things in terms of walk. The differences I've seen are:

Thu, Oct 5, 10:06 AM
chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

I would've checked in this a while ago :) Anyway, I think it was good to run some tests, but this is clearly a good thing to do.
I'm going ahead and committing this as it seems we built enough consensus around it. In the worst case, we're going to revert it :)

Thu, Oct 5, 9:58 AM
chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

I did a small test in rust with -Os and the compile time do not change with the patch so a check for OptLevel > 2 is maybe good to have.

Thu, Oct 5, 12:54 AM

Wed, Oct 4

chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

FWIW, I think with the latest patch update this has now addressed Danny's primary concerns, and should be much more effective at avoiding re-visiting regions of IR.

Wed, Oct 4, 2:38 AM
chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Rebase (and move to mono-repo layout, sorry for any disruption).

Wed, Oct 4, 2:36 AM

Tue, Oct 3

chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

I'll admit to not staring at the cost model too hard (I rarely have useful input on that kind of target specific thing), but it looked at a glance like you trying to calculate which might constant fold as well.
If not, or it's not part of the goal, awesome.

Tue, Oct 3, 7:09 PM
chandlerc added a reviewer for D38359: [X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565: rnk.

Thanks Chandler for reviewing the code.

Is there any way that the debug instructions can become invalid due to this transform? I'm not seeing anything obvious, but wanted to make sure you've thought about this too.

This is how the code I am trying to fix looks like:

%vreg2<def,tied1> = CMOVB32rr %vreg6<tied0>, %vreg1, %EFLAGS<imp-use>
DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7
DBG_VALUE %vreg2, %noreg, !"b", <!DIExpression()>; GR32:%vreg2 line no:7
%vreg3<def,tied1> = CMOVB32rr %vreg0<tied0>, %vreg6, %EFLAGS<imp-use>

I assume 3 facts at this point:

  1. DBG_VALUE has only virtual register and cannot have physical register, that is true because we are still working with in SSA machine code.
  2. DBG_VALUE instruction generates a void value, i.e., no instruction can use it (or refer to it).
  3. When we replace all uses of the CMOV instruction with the new PHINode instruction, that also changes the virtual register in the DBG_VALUE.

    Saying that, we can always move all DBG_VALUE instructions forward up to the end of Machine Basic Block, right?
Tue, Oct 3, 12:54 AM

Mon, Oct 2

chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

I'd recommend using LNT,

This seems like a really cumbersome process to expect folks to go through for changes intending to improve compile time. Is there no way to simplify it?

Which part is cumbersome?

Mon, Oct 2, 10:23 PM
chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

I'd recommend using LNT,

Mon, Oct 2, 10:05 PM
chandlerc added a reviewer for D38482: TargetMachine: Use LLVMTargetMachine in CodeGen: craig.topper.
  1. If in #1 you mean actual targets -- are there any left now that the C backend is gone? What are these targets? I wonder if we could simplify things by collapsing a no longer necessary layer.
  • There is none in open source, and I don't know of any internal ones at my company.
  • According to the mailing list this may be good for SPIR-V (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114593.html) which doesn't really need any of the CodeGen infrastructure.
  • Indeed collapsing the two classes would slightly simplify the code, and I'd be perfectly fine with that.
  • On the other hand it's kinda neat to have a CodeGen internal interface with LLVMTargetMachine and an outside interface with TargetMachine
Mon, Oct 2, 7:40 PM
chandlerc added a comment to D38359: [X86] Ignore DBG instructions in X86CmovConversion optimization to resolve PR34565.

Is there any way that the debug instructions can become invalid due to this transform? I'm not seeing anything obvious, but wanted to make sure you've thought about this too.

Mon, Oct 2, 6:21 PM
chandlerc added inline comments to D38154: [PassManager] Run global opts after the inliner.
Mon, Oct 2, 6:02 PM
chandlerc added a comment to D38482: TargetMachine: Use LLVMTargetMachine in CodeGen.

I generally like the direction of the cleanup. A few high level questions:

Mon, Oct 2, 5:57 PM
chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

I'm generally fine with this going in given the measurements.

Mon, Oct 2, 5:31 PM

Sun, Oct 1

chandlerc added inline comments to D38154: [PassManager] Run global opts after the inliner.
Sun, Oct 1, 10:45 PM

Wed, Sep 27

chandlerc accepted D38201: Use a BumpPtrAllocator for Loop objects.

LGTM!

Wed, Sep 27, 5:58 PM

Tue, Sep 26

chandlerc added a comment to D38201: Use a BumpPtrAllocator for Loop objects.

There is a problem with this patch -- the legacy pass manager checks if a loop is valid or not by querying isInvalid, which is incorrect; since we call the destructor for loop objects to mark them as "invalid" calling isInvalid for possibly invalid loop objects has UB. We already have this UB in User::operator delete though, and so given that and that the legacy pass manager is going away at some point, perhaps this is okay? Alternatively, I could add a mechanism on the legacy loop pass manager that lets passes denote loop objects to have been deleted (by remembering the pointers in a set) and use that to decide if a Loop object is valid or not (and with that, I can guard isInvalid with #ifndef NDEBUG).

Sadly, I think this would be problematic. I think it'll trip ASan's use-after-scope or use-after-lifetime checks.... =/ But maybe there is a way around that. Or maybe just implement the solution you suggest. Dunno, but definitely worth checking.

We already do this for User instances -- we read from fields in the User object to be destroyed in its operator delete, after the destructors have run. However, two wrongs don't make a right so I'll first try to see if I can fix this in the old pass manager.

Tue, Sep 26, 11:29 PM
chandlerc accepted D37198: [InlineCost] add visitSelectInst().

(Marking as accepted now that Easwaran has had a chance to look as well)

Tue, Sep 26, 11:24 PM
chandlerc added a comment to D37198: [InlineCost] add visitSelectInst().

This is looking pretty good. Some nit-picky improvements below. I'd love for Easwaran to get another chance to look at this too though...

Tue, Sep 26, 3:29 PM
chandlerc added a comment to D38201: Use a BumpPtrAllocator for Loop objects.

There is a problem with this patch -- the legacy pass manager checks if a loop is valid or not by querying isInvalid, which is incorrect; since we call the destructor for loop objects to mark them as "invalid" calling isInvalid for possibly invalid loop objects has UB. We already have this UB in User::operator delete though, and so given that and that the legacy pass manager is going away at some point, perhaps this is okay? Alternatively, I could add a mechanism on the legacy loop pass manager that lets passes denote loop objects to have been deleted (by remembering the pointers in a set) and use that to decide if a Loop object is valid or not (and with that, I can guard isInvalid with #ifndef NDEBUG).

Tue, Sep 26, 3:29 PM

Fri, Sep 22

chandlerc requested changes to D37198: [InlineCost] add visitSelectInst().

Mostly minor suggestions below...

Fri, Sep 22, 11:16 AM

Thu, Sep 21

chandlerc added inline comments to D37198: [InlineCost] add visitSelectInst().
Thu, Sep 21, 10:15 PM
chandlerc added a comment to D37902: [CodeExtractor] Fix multiple bugs under certain shape of extracted region.

Can you revert the unrelated formatting changes? Otherwise reviewing this is a bit hard.

Thu, Sep 21, 4:06 PM
chandlerc added a comment to D38154: [PassManager] Run global opts after the inliner.

Could you include analogous updates to the new PM's pipeline? (I can do it myself in a follow-up if you'd rather...)

Thu, Sep 21, 2:28 PM

Wed, Sep 20

chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Rebase (no substantive changes yet...)

Wed, Sep 20, 12:31 AM

Tue, Sep 19

chandlerc accepted D38055: Tighten the invariants around LoopBase::invalidate.

Generally looks good. Description should likely be updated to reflect changes made during review. Feel free to submit!

Tue, Sep 19, 6:57 PM
chandlerc edited reviewers for D37900: [SROA] Really remove associated dbg.declare when removing dead alloca, added: rnk; removed: chandlerc.

I suspect Reid is in a much better position to look at debug info changes to SROA than I am, but let me know if I can help...

Tue, Sep 19, 6:28 PM
chandlerc requested changes to D37939: [Mem2Reg] Also handle memcpy.

(marking as needing changes to clear dashboard)

Tue, Sep 19, 6:27 PM
chandlerc requested changes to D38042: EmitAssemblyHelper: CodeGenOpts.DisableLLVMOpts should not overrule CodeGenOpts.VerifyModule..

Marking as needing changes to clear dashboard.

Tue, Sep 19, 6:26 PM
chandlerc removed a reviewer for D38068: [DebugInfo] Use a DenseMap to coalesce MachineOperand locations: chandlerc.
Tue, Sep 19, 6:26 PM
chandlerc requested changes to D37198: [InlineCost] add visitSelectInst().

Just wanted to say thanks so much for working on this -- its a huge improvement to the inline cost.

Tue, Sep 19, 6:15 PM
chandlerc requested changes to D33946: [InlineCost] Find identical loads in the callee.

Sorry for the awful delay when I got pulled into stuff, should be able to stay with the (small) stuff left in the review.

Tue, Sep 19, 6:01 PM
chandlerc requested changes to D38055: Tighten the invariants around LoopBase::invalidate.
Tue, Sep 19, 5:34 PM
chandlerc added a comment to D38042: EmitAssemblyHelper: CodeGenOpts.DisableLLVMOpts should not overrule CodeGenOpts.VerifyModule..

Absolutely. I think the verifier should never, under any circumstances, mutate the IR. Think about it, with the current design if a pass corrupts the debug info the verifier may "hide" this by stripping it out rather than allowing us to find it.

Okay, I think I agree. Before I venture off implementing this, do you think that separating out a StripBrokenDebugInfoPass that depends on the Verifier is right way forward?

Tue, Sep 19, 4:06 PM
chandlerc added a comment to D38055: Tighten the invariants around LoopBase::invalidate.

Generally like the API changes here. The enum seems a big improvement.

Tue, Sep 19, 3:59 PM
chandlerc accepted D37996: [LoopInfo] Make LoopBase and Loop destructors non-public.

LGTM, nits below.

Tue, Sep 19, 2:49 PM
chandlerc added a comment to D38042: EmitAssemblyHelper: CodeGenOpts.DisableLLVMOpts should not overrule CodeGenOpts.VerifyModule..

But, the verifier itself will just "crash". It won't print a stack trace, but I don't see why that's much better? And this flag is supposed to be a developer option and not a user facing one, so I'm somewhat confused at what the intent is here...

No, it will report_fatal_error() instead of crashing the compiler later on.
In any case, that is not my primary motivation: The intent is exactly what is illustrated by the testcase, stripping malformed debug info metadata produced by older, buggy versions of clang. The backstory to this is that historically the Verifier was very weak when it came to verifying debug info. To allow us to make the Verifier better (stricter), while still supporting importing of older .bc files produced by older compilers, we added a mechanism that strips all debug info metadata if the verification of the debug info failed, but the bitcode is otherwise correct.

Ok, that use case makes way more sense. I'd replace the change description with some discussion of this use case.

My next question is -- why is this done by the verifier? It seems *really* bad that the verifier *changes the IR*. Don't get me wrong, what you're trying to do (strip malformed debug info) makes perfect sense. But I think that the thing which does that shouldn't be called a verifier. It might *use* the verifier of course.

That was a purely pragmatic decision: Most (but not all) LLVM-based tools are running the Verifier as an LLVM pass so adding the stripping into the pass was the least invasive way of implementing this feature. If we are truly bothered by this, I think what could work is to separate out a second StripBrokenDebugInfo pass that depends on the Verifier and is guaranteed to run immediately after it. I don't see this adding much value though, and we would have to modify all tools to schedule the new pass explicitly. Do you think that would be worth pursuing?

Tue, Sep 19, 11:39 AM
chandlerc added a comment to D37939: [Mem2Reg] Also handle memcpy.

Hi @efriedma,

this is the julia pass pipeline (https://github.com/JuliaLang/julia/blob/master/src/jitlayers.cpp#L148). IIRC the original list of passes came from VMKit,
but the pass list was adjusted as needed over the years.

Tue, Sep 19, 10:48 AM
chandlerc added a comment to D38042: EmitAssemblyHelper: CodeGenOpts.DisableLLVMOpts should not overrule CodeGenOpts.VerifyModule..

But, the verifier itself will just "crash". It won't print a stack trace, but I don't see why that's much better? And this flag is supposed to be a developer option and not a user facing one, so I'm somewhat confused at what the intent is here...

No, it will report_fatal_error() instead of crashing the compiler later on.
In any case, that is not my primary motivation: The intent is exactly what is illustrated by the testcase, stripping malformed debug info metadata produced by older, buggy versions of clang. The backstory to this is that historically the Verifier was very weak when it came to verifying debug info. To allow us to make the Verifier better (stricter), while still supporting importing of older .bc files produced by older compilers, we added a mechanism that strips all debug info metadata if the verification of the debug info failed, but the bitcode is otherwise correct.

Tue, Sep 19, 10:47 AM
chandlerc added a comment to D38042: EmitAssemblyHelper: CodeGenOpts.DisableLLVMOpts should not overrule CodeGenOpts.VerifyModule..

But, the verifier itself will just "crash". It won't print a stack trace, but I don't see why that's much better? And this flag is supposed to be a developer option and not a user facing one, so I'm somewhat confused at what the intent is here...

Tue, Sep 19, 10:30 AM

Sep 15 2017

chandlerc committed rL313435: [git] Update the llvm git helper script to work correctly with the.
[git] Update the llvm git helper script to work correctly with the
Sep 15 2017, 7:15 PM
chandlerc committed rL313409: [SLP] Revert r312791 and other necessary commits, except for TTI and.
[SLP] Revert r312791 and other necessary commits, except for TTI and
Sep 15 2017, 3:25 PM

Sep 14 2017

chandlerc committed rL313242: [PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle.
[PM/CGSCC] Teach the CGSCC pass manager components to gracefully handle
Sep 14 2017, 1:37 AM

Sep 8 2017

chandlerc committed rL312809: [x86] Fix GCC pedantic warnings about default arguments for lambdas..
[x86] Fix GCC pedantic warnings about default arguments for lambdas.
Sep 8 2017, 11:25 AM
chandlerc added a comment to D37177: [X86] Don't disable slow INC/DEC if optimizing for size.

I generally like the direction, peanut gallery comment. No need to wait for me to land or anything...

Sep 8 2017, 11:08 AM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

So, I'm still parsing your first comment Danny, but seeing the second, I think there may be some confusion here...

Sep 8 2017, 8:29 AM

Sep 7 2017

chandlerc added inline comments to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..
Sep 7 2017, 6:45 PM
chandlerc updated the diff for D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Update based on code review from David and Hal.

Sep 7 2017, 6:45 PM
chandlerc added a comment to D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..

Just some initial comments around the test cases.

It seems that all test cases are testing non speculative code hoisting -- there are no speculative hoisting tests.

Sep 7 2017, 6:31 PM
chandlerc committed rL312768: [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to.
[x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to
Sep 7 2017, 5:19 PM
chandlerc closed D37141: [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to cover the bitwise operators. by committing rL312768: [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to.
Sep 7 2017, 5:19 PM
chandlerc added a comment to D33960: ErrorHandling: report_fatal_error: call abort() instead of exit().
In D33960#864213, @rnk wrote:

I'm not sure this is a good idea. For years, report_fatal_error outside clang hasn't printed a stack trace or a banner, and that's intentional behavior.

Sep 7 2017, 5:06 PM
chandlerc committed rL312764: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory.
[x86] Extend the manual ISel of `add` and `sub` with both RMW memory
Sep 7 2017, 4:56 PM
chandlerc closed D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands. by committing rL312764: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory.
Sep 7 2017, 4:55 PM
chandlerc accepted D33960: ErrorHandling: report_fatal_error: call abort() instead of exit().

Hrm.

Seeing that closing stderr test - I wouldn't mind some more eyes on this change. There's a fair amount of report_fatal_error usage in LLVM tools that probably aren't abort-worthy? But I'm not sure. But maybe that's OK (I mean they're mostly internal tools anyway)

Sep 7 2017, 4:42 PM
chandlerc added a comment to D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Rebased, fixes included. Anything else before I land this?

Sep 7 2017, 12:03 AM
chandlerc updated the diff for D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Rebase, including Craig's fixes.

Sep 7 2017, 12:03 AM
chandlerc commandeered D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..
Sep 7 2017, 12:02 AM

Sep 6 2017

chandlerc added inline comments to D37504: [x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register..
Sep 6 2017, 1:59 AM

Sep 5 2017

chandlerc committed rL312620: [x86] Fix PR34377 by disabling cmov conversion when we relied on it.
[x86] Fix PR34377 by disabling cmov conversion when we relied on it
Sep 5 2017, 11:29 PM
chandlerc closed D37504: [x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register. by committing rL312620: [x86] Fix PR34377 by disabling cmov conversion when we relied on it.
Sep 5 2017, 11:29 PM
chandlerc created D37504: [x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register..
Sep 5 2017, 10:45 PM
chandlerc created D37467: Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable..
Sep 5 2017, 4:34 AM

Aug 29 2017

chandlerc accepted D37280: [X86] Provide a separate feature bit for macro fusion support instead of basing it on the AVX flag.

LGTM!

Aug 29 2017, 9:28 PM
chandlerc added a comment to D37280: [X86] Provide a separate feature bit for macro fusion support instead of basing it on the AVX flag.

A quick skim of Agner's suggsets that all of Core2, Nehalem, Westmere, Bulldozer, Piledriver, Steamroler, and Zen all do nearly full blown macro fusion for cmp/test and a branch. Even the Via Nana apparently does some of this apparently....

Aug 29 2017, 6:42 PM
chandlerc added a comment to D37250: [X86] Apply SlowIncDec feature to Sandybridge/Ivybridge CPUs as well.

There is definitely a false dependency issue that can be triggered by "shift CL" because variables shifts are defined to not update any flags if CL is 0, but the shift value isn't know until it executes so the shift is dependent on the old flags including CF

Aug 29 2017, 1:26 PM
chandlerc added a comment to D37250: [X86] Apply SlowIncDec feature to Sandybridge/Ivybridge CPUs as well.

Put differently, maybe we only want this on processors *older* than sandybridge... but today, we don't. and we don't have a lot of tuning for older processors... so maybe we should just nuke it going forward.

Aug 29 2017, 11:54 AM
chandlerc added a comment to D37250: [X86] Apply SlowIncDec feature to Sandybridge/Ivybridge CPUs as well.

I think Pentium 4 and Silvermont read the old CF value and pass it through the INC/DEC instruction.

I think the "core" line prior to Sandy Bridge any instruction that reads the CF flag after INC/DEC experienced a stall until the INC/DEC instruction retired.

I think from Sandy Bridge onward, it is no longer a stall until retirement, but an instruction that reads the CF flag after INC/DEC become dependent on the last writer of CF before the INC/DEC.

Aug 29 2017, 11:53 AM
chandlerc accepted D37250: [X86] Apply SlowIncDec feature to Sandybridge/Ivybridge CPUs as well.

I'm fine with this, but it would be good to get confirmation that there is a genuine partial flag update stall issue we're avoiding with this... As long as you all can confirm that this is a real issue even on modern processors, then all this makes sense. =D

Aug 29 2017, 8:50 AM

Aug 26 2017

chandlerc added a comment to D37184: [X86] Add constant pool decoding to more instructions.

Generally speaking, I love this kind of thing. I think it makes tests and the actual assembly dramatically easier to read.

Aug 26 2017, 4:08 PM

Aug 25 2017

chandlerc added a comment to D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Comments addressed, thanks!

Aug 25 2017, 5:36 PM
chandlerc updated the diff for D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Update with minor formatting fixes and a major fix to correctly detect usage of
CF from an operation and suppress toggling between ADD and SUB in that case.

Aug 25 2017, 5:35 PM
chandlerc committed rL311806: [x86] Teach the backend to fold more read-modify-write memory operands.
[x86] Teach the backend to fold more read-modify-write memory operands
Aug 25 2017, 3:52 PM
chandlerc closed D37130: [x86] Teach the backend to fold more read-modify-write memory operands to instructions. by committing rL311806: [x86] Teach the backend to fold more read-modify-write memory operands.
Aug 25 2017, 3:52 PM
chandlerc added a dependent revision for D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands.: D37141: [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to cover the bitwise operators..
Aug 25 2017, 5:56 AM
chandlerc created D37141: [x86] Flesh out the custom ISel for RMW aritmetic ops with used flags to cover the bitwise operators..
Aug 25 2017, 5:56 AM
chandlerc updated the diff for D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Switch to a more verbose (more lines of code) pattern for selecting opcodes in
the ADD and SUB block. This will become more important when adding AND, OR, and
XOR which all want the same behavior.

Aug 25 2017, 5:42 AM
chandlerc updated the diff for D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..

Clean up some stale cruft in the test.

Aug 25 2017, 5:19 AM
chandlerc added a dependent revision for D37130: [x86] Teach the backend to fold more read-modify-write memory operands to instructions.: D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..
Aug 25 2017, 5:02 AM
chandlerc created D37139: [x86] Extend the manual ISel of `add` and `sub` with both RMW memory operands and used flags to support matching immediate operands..
Aug 25 2017, 5:02 AM

Aug 24 2017

chandlerc added inline comments to D37130: [x86] Teach the backend to fold more read-modify-write memory operands to instructions..
Aug 24 2017, 11:30 PM
chandlerc updated the diff for D37130: [x86] Teach the backend to fold more read-modify-write memory operands to instructions..

Another update from review.

Aug 24 2017, 11:30 PM