Page MenuHomePhabricator

morisset (Robin Morisset)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 14 2014, 6:06 PM (275 w, 22 h)

Recent Activity

Sep 11 2015

morisset accepted D12557: [AtomicExpand] Refactor load/cmpxchg callbacks to make LLSC explicit. NFCI..

LGTM, there is just an assert to fix.

Sep 11 2015, 2:50 AM

Sep 4 2015

morisset added a comment to D12338: Make `llvm::expandAtomicRMWToCmpXchg`'s initial load atomic..

It sounds great to me (although a lot of work for such a corner case, but I don't see any better solution).

Sep 4 2015, 1:59 AM

Aug 31 2015

morisset added a reviewer for D12338: Make `llvm::expandAtomicRMWToCmpXchg`'s initial load atomic.: reames.

After thinking some more about it, I realized that having the load stay non-atomic is actually unsound because there might be a race, which would make it undefined behaviour.
Sadly, making it monotonic would still introduce these redundant cmpxchg8b/16b since monotonic requires the access to be atomic.
I don't know well enough the semantics of unordered accesses to know if they would do the trick, and the llvm reference manual description of them is fairly lightweight on details.

Aug 31 2015, 5:07 PM
morisset added a comment to D12338: Make `llvm::expandAtomicRMWToCmpXchg`'s initial load atomic..
In D12338#232597, @jfb wrote:

This isn't just for PNaCl, so the description should be updated. You should mention something along the lines of:

On weak memory systems the CPU can speculate on subsequent loads (e.g. the cmpxchg) and observe them without honoring the happens-before ordering of the corresponding stores. This is the "load buffering" problem in literature, and occurs on ARM and POWER.

Aug 31 2015, 4:39 PM

Jul 23 2015

morisset added a comment to D11382: x86 atomic: optimize a.store(reg op a.load(acquire), release).

LGTM for the integer part.

Jul 23 2015, 6:46 AM

Nov 12 2014

morisset added a comment to D5758: Add a fence elimination pass.

To dberlin: I looked at this article, but they explain well that the reason min-cut is so expensive for PRE is because it must be repeated for each computation in the function (of which there can be 10s of thousand in very large function) and must look at a potentially huge graph. In comparison we only run this twice: once for hwsync and one for lwsync. Furthermore, because the graph is stopped by any memory access (and not just use/kill of some very specific computation as in PRE), I expect each of these runs of min-cut to be quite cheap. I have not had the time to benchmark the compile-time cost of this pass (deadline tomorrow for PLDI..), but in summary I expect it to be small, even for large functions full of fences.

Nov 12 2014, 4:44 AM

Oct 31 2014

morisset added a reviewer for D5758: Add a fence elimination pass: hfinkel.

Hi Hal,

Oct 31 2014, 10:45 AM

Oct 27 2014

morisset added a comment to D5758: Add a fence elimination pass.

ping.

Oct 27 2014, 11:31 AM

Oct 21 2014

morisset added a reviewer for D5758: Add a fence elimination pass: reames.
Oct 21 2014, 2:45 PM
morisset updated the diff for D5758: Add a fence elimination pass.

Initialize the FlowGraphNode::Next iterator more cleanly

Oct 21 2014, 2:41 PM
morisset updated the diff for D5758: Add a fence elimination pass.

Add an option "-view-fences-pre-flow-graph" to visualize the graph being used,
per the request of jfb (it makes things much easier to debug).

Oct 21 2014, 10:59 AM

Oct 17 2014

morisset added inline comments to D5758: Add a fence elimination pass.
Oct 17 2014, 2:30 PM

Oct 16 2014

morisset closed D5474: Erase fence insertion from SelectionDAGBuilder.cpp (NFC).

Closed by commit rL219957 (authored by @morisset).

Oct 16 2014, 1:45 PM
morisset updated the diff for D5758: Add a fence elimination pass.

Apply clang-format, fix small style issues pointed by jfb.
I haven't done the bigger modifications yet.

Oct 16 2014, 12:24 PM
morisset added a comment to D5758: Add a fence elimination pass.

I missed your idea of running this pass for C++11 fences instead. In short: this is probably useless, as C++11 fences are expected to be rather rare (compared to atomic accesses that result in target-specific fences), and the semantics of C++11 fences are hairy enough I don't want to try without a clear benefit. So while the infrastructure could support it, I don't plan on doing it.

Oct 16 2014, 11:59 AM
morisset updated D5758: Add a fence elimination pass.
Oct 16 2014, 11:21 AM
morisset added a comment to D5758: Add a fence elimination pass.

Thanks for the review !

Oct 16 2014, 11:20 AM

Oct 13 2014

morisset added a reviewer for D5758: Add a fence elimination pass: t.p.northover.
Oct 13 2014, 12:53 PM
morisset retitled D5758: Add a fence elimination pass from to Add a fence elimination pass.
Oct 13 2014, 12:43 PM

Oct 9 2014

morisset added a comment to D5683: Statepoint infrastructure for garbage collection.

I don't know enough about this topic to comment on the details of the ideas (although they seem reasonable and match what had been discussed on the mailing list), but I have a read part of the patch and found a few typos and have a few inline comments.

Oct 9 2014, 3:38 PM

Oct 8 2014

morisset closed D5678: [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow.

r219357

Oct 8 2014, 5:43 PM
morisset updated the diff for D5678: [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow.

Follow jfb suggestion.

Oct 8 2014, 4:10 PM
morisset added inline comments to D5678: [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow.
Oct 8 2014, 3:45 PM
morisset closed D5677: [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load()).

Closed by commit rL219336 (authored by @morisset).

Oct 8 2014, 12:48 PM
morisset retitled D5678: [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow from to [X86] Don't transform atomic-load-add into an inc/dec when inc/dec is slow.
Oct 8 2014, 11:54 AM
morisset retitled D5677: [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load()) from to [X86] Avoid generating inc/dec when slow for x.atomic_store(1 + x.atomic_load()).
Oct 8 2014, 11:52 AM
morisset updated the diff for D5474: Erase fence insertion from SelectionDAGBuilder.cpp (NFC).

Rebase on trunk + use doxygen trick suggested by jfb

Oct 8 2014, 11:02 AM

Oct 7 2014

morisset closed D5655: [X86] Fix a bug with fetch_add(INT32_MIN).

Closed by commit rL219257 (authored by @morisset).

Oct 7 2014, 5:04 PM
morisset updated the diff for D5655: [X86] Fix a bug with fetch_add(INT32_MIN).

Apply jfb suggestions.

Oct 7 2014, 4:26 PM
morisset updated the diff for D5655: [X86] Fix a bug with fetch_add(INT32_MIN).

Based on an offline comment by jf, improve the patch so that the immediate is
merged into the addq.

Oct 7 2014, 3:14 PM
morisset updated D5655: [X86] Fix a bug with fetch_add(INT32_MIN).
Oct 7 2014, 3:04 PM
morisset retitled D5655: [X86] Fix a bug with fetch_add(INT32_MIN) from to [X86] Fix a bug with fetch_add(INT32_MIN).
Oct 7 2014, 3:03 PM

Oct 3 2014

morisset added inline comments to D5474: Erase fence insertion from SelectionDAGBuilder.cpp (NFC).
Oct 3 2014, 3:56 PM
morisset closed D5317: [Power] Use lwsync for non-seq_cst fences.

Closed by commit rL218995 (authored by @morisset).

Oct 3 2014, 11:14 AM
morisset added a comment to D5317: [Power] Use lwsync for non-seq_cst fences.

I will add the test with ppc440.

Oct 3 2014, 8:50 AM

Oct 2 2014

morisset closed D5590: Update Atomics.rst.

Closed by commit rL218937 (authored by @morisset).

Oct 2 2014, 6:14 PM
morisset updated the diff for D5317: [Power] Use lwsync for non-seq_cst fences.

Rebase on trunk.

Oct 2 2014, 6:13 PM
morisset abandoned D5316: [Power] Add a new feature flag for support of the sync instruction.

Duplicate with commit r218923 by Hal FInkel.

Oct 2 2014, 3:55 PM
morisset retitled D5590: Update Atomics.rst from to Update Atomics.rst.
Oct 2 2014, 3:41 PM
morisset closed D5587: [Power] Improve the expansion of atomic loads/stores.

Closed by commit rL218922 (authored by @morisset).

Oct 2 2014, 3:37 PM
morisset updated the diff for D5587: [Power] Improve the expansion of atomic loads/stores.

Add tests for indexed loads/stores

Oct 2 2014, 3:33 PM
morisset added a comment to D5587: [Power] Improve the expansion of atomic loads/stores.

Thank you ! This paragraph of the documentation was indeed what I was
thinking about, when saying that atomicity is not a problem, I should have
made it more explicit. I will try to generate tests for the indexed
versions and add them in.

Oct 2 2014, 2:37 PM
morisset retitled D5587: [Power] Improve the expansion of atomic loads/stores from to [Power] Improve the expansion of atomic loads/stores.
Oct 2 2014, 1:32 PM

Sep 29 2014

morisset added a comment to D5532: [MemoryDependenceAnalysis] Fix compile time slowdown.

I don't think I know this code well enough to give the LGTM, but the general approach looks good.
Also, thank you for the detailed testing/explanation.

Sep 29 2014, 5:00 PM

Sep 25 2014

morisset closed D5422: Lower idempotent RMWs to fence+load.

Closed by commit rL218455 (authored by @morisset).

Sep 25 2014, 10:37 AM

Sep 23 2014

morisset retitled D5474: Erase fence insertion from SelectionDAGBuilder.cpp (NFC) from to Erase fence insertion from SelectionDAGBuilder.cpp (NFC).
Sep 23 2014, 3:59 PM
morisset closed D5404: [X86] Make wide loads be managed by AtomicExpand.

Closed by commit rL218332 (authored by @morisset).

Sep 23 2014, 2:09 PM
morisset updated the diff for D5422: Lower idempotent RMWs to fence+load.

Add requested comments

Sep 23 2014, 2:02 PM
morisset closed D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate.

Closed by commit rL218331 (authored by @morisset).

Sep 23 2014, 1:57 PM
morisset closed D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.

Closed by commit rL218329 (authored by @morisset).

Sep 23 2014, 1:41 PM
morisset added inline comments to D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.
Sep 23 2014, 11:44 AM
morisset updated the diff for D5422: Lower idempotent RMWs to fence+load.

Partial cleanup of the control-flow + extra test + fixed formatting/typos.

Sep 23 2014, 11:30 AM
morisset added a comment to D5422: Lower idempotent RMWs to fence+load.

Thanks for the review. Answers inline, and patch next.

Sep 23 2014, 11:29 AM
morisset updated the diff for D5404: [X86] Make wide loads be managed by AtomicExpand.

Use CHECK-LABEL and CHECK-NEXT

Sep 23 2014, 10:25 AM
morisset added inline comments to D5404: [X86] Make wide loads be managed by AtomicExpand.
Sep 23 2014, 10:24 AM
morisset updated the diff for D5404: [X86] Make wide loads be managed by AtomicExpand.

Preserve FIXME comment, and refine tests.

Sep 23 2014, 10:01 AM
morisset added a comment to D5404: [X86] Make wide loads be managed by AtomicExpand.

The tests for cmpxchg16b were indeed testing for the lock prefix, but not the tests for the 8b version. I will upload a new patch with an improved test (and preserved FIXME).

Sep 23 2014, 9:58 AM
morisset updated the diff for D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.

Updated comments.

Sep 23 2014, 9:36 AM

Sep 22 2014

morisset added a comment to D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.

Thanks for the review, I will fix the comments as described below.

Sep 22 2014, 10:37 AM

Sep 19 2014

morisset abandoned D5091: [X86] replace (atomic fetch_add of 0) by (mfence; mov).

Doing this optimization so late was looking increasingly brittle as it must interact with AtomicExpandPass earlier at the IR level.
I reimplemented it at that level in D5422, and it turned out a lot cleaner.

Sep 19 2014, 4:53 PM
morisset retitled D5422: Lower idempotent RMWs to fence+load from to Lower idempotent RMWs to fence+load.
Sep 19 2014, 4:51 PM

Sep 18 2014

morisset retitled D5404: [X86] Make wide loads be managed by AtomicExpand from to [X86] Make wide loads be managed by AtomicExpand.
Sep 18 2014, 3:52 PM
morisset updated the diff for D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.

Cleanup of some now redundant code that I had missed.

Sep 18 2014, 1:02 PM
morisset closed D5386: Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors".

Closed by commit rL218066 (authored by @morisset).

Sep 18 2014, 12:06 PM

Sep 17 2014

morisset retitled D5386: Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors" from to Restore "[ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors".
Sep 17 2014, 12:43 PM
morisset closed D5304: [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors.

Closed by commit rL217965 (authored by @morisset).

Sep 17 2014, 10:51 AM

Sep 16 2014

morisset updated the diff for D5304: [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors.

Rebase on trunk.

Sep 16 2014, 5:30 PM
morisset added a comment to D5304: [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors.

There is no barrier generated by the other path, the entire operation (store/load/rmw/whatever) is turned into a libcall on these old processors. It is tested (summarily) at the end of atomic-load-store.ll at least.

Sep 16 2014, 5:29 PM
morisset closed D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

r217928

Sep 16 2014, 5:22 PM
morisset added a comment to D5364: Refactoring SimplifyLibCalls to remove static initializers and generally cleaning up the code..

It mostly seems reasonable to me, but I don't know this area of the code, and do not feel at all comfortable giving an LGTM, so someone more experienced should look at it.

Sep 16 2014, 5:13 PM
morisset added a comment to D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

Thanks !
I will commit it soon (currently testing after rebase on trunk).

Sep 16 2014, 4:50 PM
morisset added a comment to D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

ping. Is this patch good to go ? There are several other patches with dependencies on it (such as D5179 - also needing review, and D5180 - that already received an LGTM).

Sep 16 2014, 2:37 PM

Sep 12 2014

morisset updated the diff for D5304: [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors.

Thanks jf for catching this forgotten test!

Sep 12 2014, 10:45 AM

Sep 11 2014

morisset added a comment to D5316: [Power] Add a new feature flag for support of the sync instruction.

I did not know that ppc is also supposed to be a generic processor. I can
either add HasSYNC to it, or remove it from generic (and change the broken
tests). Which would you prefer ?

Sep 11 2014, 3:48 PM
morisset retitled D5317: [Power] Use lwsync for non-seq_cst fences from to [Power] Use lwsync for non-seq_cst fences.
Sep 11 2014, 2:56 PM
morisset added a comment to D5316: [Power] Add a new feature flag for support of the sync instruction.

I had originally left HasSYNC out of generic, but added it in because it broke a bunch of tests (which did not specify anything about the processor and yet were expecting syncs). Should I instead modify all of these tests to use a given processor ? If so which one ?

Sep 11 2014, 2:56 PM
morisset retitled D5316: [Power] Add a new feature flag for support of the sync instruction from to [Power] Add a new feature flag for support of the sync instruction.
Sep 11 2014, 2:38 PM

Sep 10 2014

morisset retitled D5304: [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors from to [ARM, Fix] Fix emitLeading/TrailingFence on old ARM processors.
Sep 10 2014, 5:39 PM

Sep 5 2014

morisset updated the diff for D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

Improve comments.

Sep 5 2014, 11:24 AM

Sep 4 2014

morisset updated the diff for D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate.

Fix typo and add comment.

Sep 4 2014, 3:17 PM
morisset updated the diff for D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate.

Add a Fixme about the use of IsBookE flag for sync support.

Sep 4 2014, 10:40 AM
morisset updated the diff for D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate.

Fix formatting based on hfinkel comments.

Sep 4 2014, 10:34 AM

Sep 3 2014

morisset added parent revisions for D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate: D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder, D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.
Sep 3 2014, 4:27 PM
morisset retitled D5180: [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate from to [Power] Use AtomicExpandPass for fence insertion, and use lwsync where appropriate.
Sep 3 2014, 4:23 PM
morisset retitled D5179: Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder from to Add AtomicExpandPass::bracketInstWithFences, and use it whenever getInsertFencesForAtomic would trigger in SelectionDAGBuilder.
Sep 3 2014, 4:10 PM
morisset closed D5035: Refactor AtomicExpandPass and add a generic isAtomic() method to Instruction.

Closed by commit rL217080 (authored by @morisset).

Sep 3 2014, 2:39 PM
morisset closed D4960: Implement emitLeading/TrailingFence in the ARM backend.

r217076
Thanks everyone

Sep 3 2014, 2:19 PM
morisset added inline comments to D5091: [X86] replace (atomic fetch_add of 0) by (mfence; mov).
Sep 3 2014, 2:09 PM
morisset added a comment to D4960: Implement emitLeading/TrailingFence in the ARM backend.

Ok, can the revision be accepted then ? Thanks.

Sep 3 2014, 12:43 PM
morisset updated the diff for D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

Fix wrong update

Sep 3 2014, 11:54 AM
morisset updated the diff for D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

Update comment.

Sep 3 2014, 11:53 AM
morisset updated the diff for D4960: Implement emitLeading/TrailingFence in the ARM backend.

MakeDMB -> makeDMB
Erase comment.

Sep 3 2014, 11:52 AM
morisset added a comment to D5091: [X86] replace (atomic fetch_add of 0) by (mfence; mov).

Thanks for the review, I will modify the comments accordingly.

Sep 3 2014, 11:43 AM
morisset added a comment to D5090: [X86] Use the generic AtomicExpandPass instead of X86AtomicExpandPass.

There was already a lot of tests for atomics on X86 (test/CodeGen/X86/atomic*), so I did not modify them in this patch, just checked that they still pass.

Sep 3 2014, 10:50 AM
morisset added a comment to D4960: Implement emitLeading/TrailingFence in the ARM backend.

Thanks for the review !

Sep 3 2014, 10:38 AM

Sep 2 2014

morisset added a comment to D4960: Implement emitLeading/TrailingFence in the ARM backend.

ping.

Sep 2 2014, 3:52 PM
morisset closed D4796: [X86] Allow atomic operations using immediates to avoid using a register.

Commited r216980

Sep 2 2014, 3:37 PM
morisset accepted D4796: [X86] Allow atomic operations using immediates to avoid using a register.

Got a lgtm from Nadav Rotem on the mailing list.

Sep 2 2014, 3:36 PM
morisset closed D5019: Fix MemoryDependenceAnalysis in cases where QueryInstr is a CmpXchg or a AtomicRMW.

Closed by commit rL216940 (authored by @morisset).

Sep 2 2014, 1:27 PM

Aug 29 2014

morisset updated the diff for D5019: Fix MemoryDependenceAnalysis in cases where QueryInstr is a CmpXchg or a AtomicRMW.

Fix the formatting

Aug 29 2014, 5:29 PM