marels (Martin Elshuber)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 11 2018, 6:41 AM (203 w, 5 d)

Recent Activity

Dec 10 2018

marels added inline comments to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
Dec 10 2018, 3:06 AM

Dec 4 2018

marels retitled D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls from [RecursionStackElimination]: Pass to eliminate recursions to [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
Dec 4 2018, 8:10 AM
marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

I implemented it in such a way that it should work correctly on all architectures. I use TargetTransformInfo only for the heuristics; everything else is completely target independent.
So I think it is unlikely to be fundamentally wrong. Whether other architectures profit from converting recursion into loops by explicitly modelling the state on the stack, I cannot tell (especially regarding performance).
As I can benchmark in a suitable way only on AArch64, I prefer to enable this patch only for AArch64 and leave other architectures out of this initial version.
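
To illustrate what "converting recursion into loops by explicitly modelling the state on the stack" means in general, here is a minimal Python sketch. It is not the code the pass generates; the `Node` type and both function names are hypothetical and exist only for this illustration. A function that calls itself once per child is rewritten as a loop that keeps the pending nodes on an explicit stack.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node used only for this illustration."""
    value: int
    children: list = field(default_factory=list)


def total_recursive(node):
    # One recursive call per child: several calls per invocation,
    # so a single tail-call rewrite cannot remove the recursion.
    return node.value + sum(total_recursive(child) for child in node.children)


def total_iterative(root):
    # Same traversal as a loop: pending work lives on an explicit stack
    # instead of the call stack.
    total = 0
    stack = [root]
    while stack:
        node = stack.pop()
        total += node.value
        stack.extend(node.children)
    return total
```

Both functions visit every node exactly once; only the place where the pending state is stored differs. Whether the loop form is faster depends on the target, which is the open question raised above.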

Dec 4 2018, 8:09 AM
marels added inline comments to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
Dec 4 2018, 7:39 AM
marels updated the diff for D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
  • Renamed the pass to MultiTailCallElimination, as it seems more fitting
  • Removed the unnecessary marker implementation
  • Changed the code generation quite a bit
  • Added a heuristic that allows skipping the transformation if it is not profitable
  • Added test cases
  • Enabled the pass only for O3
Dec 4 2018, 7:19 AM

Nov 20 2018

marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

I also did some measurements on a use case to compare the marker algorithm (AArch64): a physics simulation whose hot loop traverses a full octree.

Nov 20 2018, 9:02 AM
marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

I think there was some confusion in how the lists are managed.

Nov 20 2018, 8:38 AM

Nov 19 2018

marels added inline comments to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
Nov 19 2018, 11:03 AM
marels added inline comments to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
Nov 19 2018, 10:42 AM

Nov 14 2018

marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

Hi & Thank you for the input:

Nov 14 2018, 3:31 AM

Nov 12 2018

marels added a comment to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

Unless I receive further comments, I think it is OK to commit this patch.
Thank you for your support.

Nov 12 2018, 8:06 AM
marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

ping

Nov 12 2018, 6:38 AM
marels updated the diff for D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
  • fix typos
  • update commit message
Nov 12 2018, 5:45 AM

Oct 29 2018

marels added a comment to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

I also added missing colons in some of the tests, and added a missing pop_back in the main loop.

Oct 29 2018, 10:45 AM
marels updated the diff for D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
  • removed VectorWidth related thing
  • removed some comments
  • added testcases to check correct behavior in weird but possible IR inputs
Oct 29 2018, 10:35 AM

Oct 25 2018

marels added a comment to D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.

Revised broken formatting in summary

Oct 25 2018, 8:41 AM
marels updated the summary of D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
Oct 25 2018, 8:40 AM
marels created D53706: [MultiTailCallElimination]: Pass to eliminate multiple tail calls.
Oct 25 2018, 8:26 AM

Oct 17 2018

marels updated the diff for D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

Add context of all changed files. Sorry, I missed it in the previous update.

Oct 17 2018, 4:22 AM
marels updated the diff for D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
  • Change Code to reflect comments from John Brawn.
  • Add InterleavedLoadCombineImpl class to explicitly avoid state propagation between different invocations
  • Remove shared_ptr<> constructs in favor of lists and in-place object handling
  • Move isInterleavedLoad into VectorInfo (also changed name to isInterleaved)
Oct 17 2018, 4:17 AM

Oct 15 2018

marels added a comment to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

@john.brawn thanks for the input. I commented on some and will upload a new revision shortly

Oct 15 2018, 11:19 AM
marels added a comment to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

@john.brawn thanks for the input. I commented on some and will upload a new revision shortly

Oct 15 2018, 11:12 AM

Oct 12 2018

marels added a comment to D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.

ping

Oct 12 2018, 9:27 AM

Sep 28 2018

marels created D52653: [CodeGen, AArch64] Combine Interleaved Loads which are not covered by the Vectorizer.
Sep 28 2018, 8:07 AM

Sep 27 2018

marels added a comment to D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

Thanks

Sep 27 2018, 8:14 AM
marels added a comment to D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

No, I do not think so. Can you do this for me?

Sep 27 2018, 8:07 AM
marels updated the diff for D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

Fixed according to comments from @spatel

Sep 27 2018, 6:36 AM

Sep 26 2018

marels updated the diff for D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

Changed code to use pattern matching API. When the inputs are vectors only splat vectors are considered.

Sep 26 2018, 7:40 AM
marels added a comment to D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

I was not able to find a way to match the following predicates with the existing API.

Sorry this wasn't clear - I was only suggesting that we handle vector splat (all constants within the vector are identical or undef) patterns in this patch. You're correct that handling arbitrary vector constants is a harder problem. The API I would use here is "m_APFloat" (it deals with splat constants internally, so you probably don't need to do anything special in the calling code for this patch).
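
As a rough illustration of the two ideas in this exchange, the Python sketch below models a splat check and the scalar fold numerically. It does not use the LLVM API; `splat_value`, `fdiv_gt_zero`, and `folded_compare` are hypothetical names, `None` stands in for an undef lane, and the fold is only compared for finite, nonzero `c` and `x` where the division neither overflows nor underflows (all of these are assumptions of the sketch, not statements about the patch).

```python
def splat_value(lanes):
    """Return the common constant if all defined lanes are identical, else None.

    None in `lanes` models an undef lane (an assumption of this sketch).
    """
    defined = [lane for lane in lanes if lane is not None]
    if defined and all(lane == defined[0] for lane in defined):
        return defined[0]
    return None


def fdiv_gt_zero(c, x):
    # Original form: (C / x) > 0
    return c / x > 0.0


def folded_compare(c, x):
    # Folded form: the division is gone, only the sign of x is tested.
    # A negative constant flips the direction of the comparison.
    return x > 0.0 if c > 0.0 else x < 0.0


# The two forms agree on these finite, nonzero samples.
constants = [-8.0, -0.5, 0.5, 8.0]
inputs = [-4.0, -0.25, 0.25, 4.0]
agree = all(fdiv_gt_zero(c, x) == folded_compare(c, x)
            for c in constants for x in inputs)
```

The samples deliberately avoid `x == 0` and extreme magnitudes, since infinities or underflow in `c / x` would make the two forms differ.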

Sep 26 2018, 4:23 AM

Sep 25 2018

marels added a comment to D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.

Thank you for the input,

Sep 25 2018, 10:59 AM

Sep 11 2018

marels created D51942: [InstCombine] Fold (C/x)>0 into x>0 if possible.
Sep 11 2018, 10:44 AM