Page MenuHomePhabricator

fhahn (Florian Hahn)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 18 2016, 4:39 AM (126 w, 6 d)

Recent Activity

Today

fhahn added a comment to D57123: [MergeSets] Add infrastructure to build merge sets based on Das and Ramakrishna's paper..

The patch currently implements variant 1 form the paper (TDMSC-I) as it is slightly simpler. I will follow up with variant 2 (TDMSC-II) in a bit and do some more detailed compile time measurement. Also the only reason MergeSets live in IteratedDominanceFrontier.h/cpp was convenience, they could be moved to their own files.

Wed, Jan 23, 3:03 PM
fhahn created D57123: [MergeSets] Add infrastructure to build merge sets based on Das and Ramakrishna's paper..
Wed, Jan 23, 3:00 PM
fhahn committed rL351945: [HotColdSplitting] Remove unused SSAUpdater.h include (NFC)..
[HotColdSplitting] Remove unused SSAUpdater.h include (NFC).
Wed, Jan 23, 3:51 AM
fhahn created D57092: [HotColdSplitting] Get DT and PDT from the pass manager..
Wed, Jan 23, 3:48 AM

Yesterday

fhahn added a comment to D57033: [SSAUpdater] Handle case with single available value faster..

Sounds good to me, I'll move the code to LCSSA without DT checks. Thanks!

Tue, Jan 22, 1:19 PM
fhahn added a comment to D57033: [SSAUpdater] Handle case with single available value faster..

If LCSSA just doesn't care that it's creating dead references to variables instead of undefs, it would be faster to completely skip the check. Otherwise, sure, that makes sense.

Tue, Jan 22, 1:12 PM
fhahn added a comment to D57033: [SSAUpdater] Handle case with single available value faster..

SSAUpdater has code to insert undef in certain cases. Can you prove those cases don't apply to all callers of GetValueAtEndOfBlock?

Tue, Jan 22, 12:20 PM

Mon, Jan 21

fhahn created D57033: [SSAUpdater] Handle case with single available value faster..
Mon, Jan 21, 12:44 PM

Fri, Jan 18

fhahn added a comment to D56921: [LCSSA] Add expensive verification of LCSSA form for sub-loops..

Just please make sure to run stage3 build and verify this doesn't break.

Fri, Jan 18, 11:46 AM
fhahn committed rL351571: [SelectionDAG] Split very large token factors for chained stores to 64k chunks..
[SelectionDAG] Split very large token factors for chained stores to 64k chunks.
Fri, Jan 18, 10:43 AM
fhahn closed D56740: [SelectionDAG] Split very large token factors for chained stores to 64k chunks..
Fri, Jan 18, 10:42 AM
fhahn created D56921: [LCSSA] Add expensive verification of LCSSA form for sub-loops..
Fri, Jan 18, 9:56 AM
fhahn committed rL351567: [LCSSA] Skip blocks in sub-loops when scanning for uses..
[LCSSA] Skip blocks in sub-loops when scanning for uses.
Fri, Jan 18, 9:40 AM
fhahn closed D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..
Fri, Jan 18, 9:40 AM
fhahn added inline comments to D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..
Fri, Jan 18, 9:39 AM
fhahn committed rL351552: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..
[SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands.
Fri, Jan 18, 6:10 AM
fhahn closed D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..
Fri, Jan 18, 6:09 AM
fhahn committed rL351537: [SelectionDAG] Add static getMaxNumOperands function to SDNode..
[SelectionDAG] Add static getMaxNumOperands function to SDNode.
Fri, Jan 18, 2:06 AM
fhahn closed D56859: [SelectionDAG] Add static getMaxNumOperands function to SDNode..
Fri, Jan 18, 2:06 AM

Thu, Jan 17

fhahn added inline comments to D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..
Thu, Jan 17, 1:06 PM
fhahn updated the diff for D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..

state explicitly that sub-loops must be in LCSSA form already for formLCSSA()

Thu, Jan 17, 1:03 PM
fhahn updated the diff for D56859: [SelectionDAG] Add static getMaxNumOperands function to SDNode..

Fix typo.

Thu, Jan 17, 9:15 AM
fhahn added inline comments to D56859: [SelectionDAG] Add static getMaxNumOperands function to SDNode..
Thu, Jan 17, 9:02 AM
fhahn updated the diff for D56859: [SelectionDAG] Add static getMaxNumOperands function to SDNode..

Update places to use getMaxNumOperands, add more context to description.

Thu, Jan 17, 8:56 AM
fhahn added inline comments to D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..
Thu, Jan 17, 8:48 AM
fhahn updated the diff for D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..

updated to use SDNode::getMaxNumOperands

Thu, Jan 17, 8:46 AM
fhahn created D56859: [SelectionDAG] Add static getMaxNumOperands function to SDNode..
Thu, Jan 17, 8:44 AM
fhahn updated the summary of D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..
Thu, Jan 17, 5:11 AM
fhahn created D56848: [LCSSA] Skip blocks in sub-loops when scanning for uses..
Thu, Jan 17, 5:11 AM

Wed, Jan 16

fhahn updated the diff for D56740: [SelectionDAG] Split very large token factors for chained stores to 64k chunks..

update to use getTokenFactor.

Wed, Jan 16, 2:03 PM
fhahn added a comment to D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..

@fhahn Are you intending to look at the similar SDNode::NumValues limit? PR7250 + PR37000 ?

Wed, Jan 16, 2:01 PM
fhahn updated the diff for D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..

Thanks Eli, changed to getTokenFactor.

Wed, Jan 16, 1:59 PM
fhahn committed rL351318: [SelectionDAG] Update check in createOperands to reflect max() is a valid value..
[SelectionDAG] Update check in createOperands to reflect max() is a valid value.
Wed, Jan 16, 2:10 AM
fhahn closed D56738: [SelectionDAG] Update check in createOperands to reflect max() is a valid value..
Wed, Jan 16, 2:10 AM

Tue, Jan 15

fhahn added a comment to D56738: [SelectionDAG] Update check in createOperands to reflect max() is a valid value..

Can any test case be added here, or would it be just too big to realistically test?

Tue, Jan 15, 12:10 PM
fhahn added a parent revision for D56740: [SelectionDAG] Split very large token factors for chained stores to 64k chunks.: D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..
Tue, Jan 15, 12:03 PM
fhahn added a child revision for D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands.: D56740: [SelectionDAG] Split very large token factors for chained stores to 64k chunks..
Tue, Jan 15, 12:03 PM
fhahn created D56740: [SelectionDAG] Split very large token factors for chained stores to 64k chunks..
Tue, Jan 15, 12:03 PM
fhahn created D56739: [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands..
Tue, Jan 15, 12:01 PM
fhahn created D56738: [SelectionDAG] Update check in createOperands to reflect max() is a valid value..
Tue, Jan 15, 11:58 AM
fhahn closed D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..

Committed in rL351187, thanks!

Tue, Jan 15, 3:27 AM
fhahn committed rL351187: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..
[InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs.
Tue, Jan 15, 3:22 AM

Mon, Jan 14

fhahn added inline comments to D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..
Mon, Jan 14, 12:25 PM
fhahn updated the diff for D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..

Thanks, added check for ConstantExpr.

Mon, Jan 14, 12:18 PM
fhahn updated the summary of D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..
Mon, Jan 14, 11:32 AM
fhahn retitled D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs. from [InstCombine] Avoid undoing 0 - (X * Y) canonicalization when combining subs. to [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..
Mon, Jan 14, 11:31 AM
fhahn created D56679: [InstCombine] Don't undo 0 - (X * Y) canonicalization when combining subs..
Mon, Jan 14, 11:31 AM
fhahn created D56668: [WIP][CodeGenPrepare] Duplicate and sink shuffles and extends if they can be done for free..
Mon, Jan 14, 8:23 AM

Sat, Jan 12

fhahn accepted D53349: [VPlan] Changes to implement VPlan based predication for VPlan-native path..

LGTM, thanks! Please wait a bit with committing, in case anybody else has additional comments.

Sat, Jan 12, 10:50 AM
fhahn accepted D56572: [AArch64] Add new target feature to fuse arithmetic and logic operations.

LGTM, thanks! It would be great if you could extend the test cases a bit more before committing, covering cases with shifted regs and cases where some SUs have multiple successors and not all of them should be fused.

Sat, Jan 12, 10:17 AM

Fri, Jan 11

fhahn requested changes to D46283: [AArch64] Set vectorizer-maximize-bandwidth as default true.

Requesting changes as additional info on the bzip and PDF regressions is needed.

Fri, Jan 11, 9:20 AM
fhahn added a comment to D44450: [InlineCost] Don't return early when allowSizeGrowth(CS) is false.

Jessica, are you still planing on pushing this change? :)

Fri, Jan 11, 9:17 AM
fhahn added reviewers for D56596: Enable fma formation for fp16 on x86 and aarch64: SjoerdMeijer, olista01.

The AArch64 side LGTM. Also adding Oliver and Sjoerd as they worked on FP16 side as well, in case they have any thoughts.

Fri, Jan 11, 9:16 AM
fhahn added a comment to D47949: [callsitesplit] Limit the # of predecessors walk when recording condition.

Thanks for the change @fhahn. I will dig up my test case and run it.

Fri, Jan 11, 8:58 AM
fhahn added a comment to D56572: [AArch64] Add new target feature to fuse arithmetic and logic operations.

Would it be possible to have a MIR test here, which just runs the scheduler?

I tried to keep the existing pattern, but I could add one too. Should it be done now or later?

Fri, Jan 11, 8:58 AM
fhahn added a comment to D48818: [VPlan] Introduce simplifyPlainCFG step in H-CFG builder..

Diego, are you still planning to land this patch?

Fri, Jan 11, 8:52 AM
fhahn requested changes to D47943: Sample code for porting MachinePipeliner to AArch64+SVE.

Marking this as requiring changes, as the MachinePipeliner code is getting refactored a bit in D56084. Please let us know when you want us to review the patch again.

Fri, Jan 11, 8:47 AM
fhahn added a comment to D56135: Allow reverse iterators for `MachineBasicBlock::findDebugLoc` and `MachineBasicBlock::findPrevDebugLoc`.

Are there any instances in the current codebase where this would be useful?

For example X86InstrInfo::AnalyzeBranchImpl can be rewritten with more natural reverse iterators.

Fri, Jan 11, 8:39 AM
fhahn added a comment to D56572: [AArch64] Add new target feature to fuse arithmetic and logic operations.

Would it be possible to have a MIR test here, which just runs the scheduler?

Fri, Jan 11, 8:35 AM

Wed, Jan 9

fhahn added a comment to D56406: Remove check for single use in ShrinkDemandedConstant.

LGTM

LGTM, thanks, but it would be good if someone more familiar with AMDGPU could also have a quick look.

Can you commit a regression testcase for this?

Wed, Jan 9, 1:09 PM
fhahn committed rL350762: [AArch64] Add test for constant shrinking with multiple users (NFC)..
[AArch64] Add test for constant shrinking with multiple users (NFC).
Wed, Jan 9, 1:08 PM
fhahn added a comment to D56482: DO NOT SUBMIT. Draft for guidelines on using Phabricator..

It might be good to mention those guidelines explicitly here https://llvm.org/docs/Contributing.html#how-to-submit-a-patch.

Wed, Jan 9, 12:32 PM
fhahn added a comment to rL350647: [NewPM] Port tsan.

The bot is back to passing after reverting: http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52757/. Please let me know if you have trouble reproducing the failure.

Wed, Jan 9, 7:16 AM
fhahn added a comment to rL350647: [NewPM] Port tsan.

One of the errors with TSan and this patch on macOS is the following while running binaries compiled with TSan:

Wed, Jan 9, 5:39 AM
fhahn committed rL350719: Revert r350647: "[NewPM] Port tsan".
Revert r350647: "[NewPM] Port tsan"
Wed, Jan 9, 5:36 AM
fhahn committed rC350718: Revert r350648: "Fix clang for r350647: Missed a function rename".
Revert r350648: "Fix clang for r350647: Missed a function rename"
Wed, Jan 9, 5:34 AM
fhahn committed rL350718: Revert r350648: "Fix clang for r350647: Missed a function rename".
Revert r350648: "Fix clang for r350647: Missed a function rename"
Wed, Jan 9, 5:34 AM
fhahn added a comment to rL350647: [NewPM] Port tsan.

It looks like this patch may have broken some bots on green dragon, e.g lots of ThreadSanitizer and UBSan-ThreadSanitizer-x86_64 tests stared failing after this went in: http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/52725/

Wed, Jan 9, 3:21 AM

Tue, Jan 8

fhahn added a comment to D56084: Resubmit of rL345008 "Split MachinePipeliner code into header and cpp files".

I am seeing lldb-cmake failing again, with the same error as with this patch earlier: http://green.lab.llvm.org/green/job/lldb-cmake/16557/

Tue, Jan 8, 12:38 PM
fhahn committed rL350630: [MachineVerifier] Include offending register in allocatable live-in error msg..
[MachineVerifier] Include offending register in allocatable live-in error msg.
Tue, Jan 8, 7:20 AM
fhahn closed D55946: [MachineVerifier] Include offending register in allocatable live-in error msg..
Tue, Jan 8, 7:20 AM

Mon, Jan 7

fhahn accepted D56406: Remove check for single use in ShrinkDemandedConstant.

LGTM, thanks, but it would be good if someone more familiar with AMDGPU could also have a quick look.

Mon, Jan 7, 3:08 PM
fhahn added a comment to D56406: Remove check for single use in ShrinkDemandedConstant.

Great thanks! IMO the caller of ShrinkDemandedConstant should make sure the demanded bits valid for all users, as the existing code in TargetLowering.cpp already should do.

Mon, Jan 7, 2:42 PM
fhahn added a comment to D56289: Added single use check to ShrinkDemandedConstant.

I am seeing a regression with this change on AArch64, where we do not use shrinked constants, e.g. for and/or if they have multiple users. I am not entirely sure what the problem is this patch is solving, as I am not familiar with AMDGPU. Is it a case where shrinking the constant leads to shrinking the related Op and in the end this introduces unnecessary conversion code, as multiple users require different bitwidths?

Mon, Jan 7, 6:33 AM

Fri, Jan 4

fhahn committed rL350395: [ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset.
[ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset
Fri, Jan 4, 6:57 AM
fhahn closed D38501: [ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset.
Fri, Jan 4, 6:57 AM
fhahn added a comment to D38501: [ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset.

Thanks! Do you need someone to commit the change on your behalf?

He does. I'm planning to do it this morning, but if you do it first, that's fine too :-)

Fri, Jan 4, 5:59 AM
fhahn added a comment to D38501: [ValueTracking] Fix a misuse of APInt in GetPointerBaseWithConstantOffset.

Thanks! Do you need someone to commit the change on your behalf?

Fri, Jan 4, 3:18 AM

Thu, Jan 3

fhahn added a comment to D56242: Elevate instructions across if-else blocks for better constant propagation.

Thanks for the patch! I plan to have a closer look in the next few days.

Thu, Jan 3, 2:45 PM
fhahn added a comment to rL350290: Resubmit rL345008 "Split MachinePipeliner code into header and cpp files".

Looks like the patch has just been reverted in rL350345. Please let me know if you have problems reproducing the failure.

Thu, Jan 3, 11:17 AM
fhahn added a comment to rL350290: Resubmit rL345008 "Split MachinePipeliner code into header and cpp files".

It looks like this commit breaks the lldb-cmake bot again, with the following error (http://green.lab.llvm.org/green/job/lldb-cmake/16085/consoleFull#-15612593049ba4694-19c4-4d7e-bec5-911270d8a58c) :

Thu, Jan 3, 11:12 AM

Wed, Jan 2

fhahn added a comment to D56135: Allow reverse iterators for `MachineBasicBlock::findDebugLoc` and `MachineBasicBlock::findPrevDebugLoc`.

Are there any instances in the current codebase where this would be useful?

Wed, Jan 2, 8:53 AM

Dec 21 2018

fhahn committed rL349935: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
[ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR.
Dec 21 2018, 10:12 AM
fhahn closed D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
Dec 21 2018, 10:11 AM

Dec 20 2018

fhahn added inline comments to D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
Dec 20 2018, 6:20 PM
fhahn updated the diff for D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..

Updated to just use Defs = [CPSR], as suggested by Eli.

Dec 20 2018, 6:19 PM
fhahn added inline comments to D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
Dec 20 2018, 5:38 PM
fhahn updated the diff for D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..

Reduced patch to just add CPSR to live-ins of newly created blocks, if required.

Dec 20 2018, 4:36 PM
fhahn added a comment to D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..

The interesting constraint here is that ExpandISelPesudos runs immediately after ISel. And if you have an instruction which produces physical registers from an allocatable register class, ISel will COPY the values to virtual registers immediately after that instruction (and similarly for instructions that consume physical registers). So in practice, on ARM, the only physical register with a non-trivial live interval in ExpandISelPesudos is CPSR.

Granted, that's a bit fragile, and it isn't formally enforced anywhere.

Dec 20 2018, 1:27 PM
fhahn added a comment to D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..

I'm a little concerned computeAndAddLiveIns is more expensive than checkAndUpdateCPSRKill. In theory, they're both quadratic, but in practice I would expect checkAndUpdateCPSRKill is much cheaper: it only tracks one register, and quits immediately when it finds a def of CPSR.

Dec 20 2018, 12:43 PM
fhahn created D55946: [MachineVerifier] Include offending register in allocatable live-in error msg..
Dec 20 2018, 12:18 PM
fhahn updated the diff for D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..

Use LivePhysRegs for liveins computation, pushed code to ARMISelLowering, handled all cases I could spot there. Adding Eli, as he fixed a similar problem in D54192, which I think should also be handled by the general liveins computation.

Dec 20 2018, 12:01 PM
fhahn committed rL349794: [LAA] Avoid generating RT checks for known deps preventing vectorization..
[LAA] Avoid generating RT checks for known deps preventing vectorization.
Dec 20 2018, 10:53 AM
fhahn closed D55798: [LAA] Avoid generating RT checks for known deps preventing vectorization..
Dec 20 2018, 10:53 AM
fhahn added inline comments to D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
Dec 20 2018, 10:24 AM

Dec 19 2018

fhahn added inline comments to D55798: [LAA] Avoid generating RT checks for known deps preventing vectorization..
Dec 19 2018, 6:39 PM
fhahn updated the diff for D55798: [LAA] Avoid generating RT checks for known deps preventing vectorization..

Rename to PossiblySafeWithRtChecks and FoundNonConstantDistanceDependence.

Dec 19 2018, 6:35 PM
fhahn created D55909: [ARM] Set Defs = [CPSR] for COPY_STRUCT_BYVAL, as it clobbers CPSR..
Dec 19 2018, 6:16 PM
fhahn added a comment to D55538: [InterleavedAccessAnalysis] Fix integer overflow in insertMember..

I don't understand the implications of returning false here...

I'd like to see a general plan for eventually excising the use of "int" from all the interleaving code before we start making changes here, so it's clear we're heading in the right direction.

Dec 19 2018, 2:16 PM
fhahn added inline comments to D55798: [LAA] Avoid generating RT checks for known deps preventing vectorization..
Dec 19 2018, 1:09 PM