fhahn (Florian Hahn)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 18 2016, 4:39 AM (68 w, 4 d)

Recent Activity

Yesterday

fhahn added a reviewer for D41003: Silence GCC 7 warning by using an enum class.: RKSimon.

@RKSimon I saw that you committed a couple of changes fixing some GCC warnings in the Hexagon backend. Thank you very much for that! This patch here fixes the same warning as rL320254, in a different way.

Sun, Dec 10, 4:13 AM

Sat, Dec 9

fhahn committed rL320252: [InlineFunction] Set debug loc for call to forward varargs..
[InlineFunction] Set debug loc for call to forward varargs.
Sat, Dec 9, 6:26 AM
fhahn closed D40432: [InlineFunction] Set debug loc for call to forward varargs..
Sat, Dec 9, 6:26 AM
fhahn added a comment to D38585: Enable interprocedural optimization in libquantum - LLVM-part [WIP].

Hi Tobias,

Sat, Dec 9, 4:51 AM

Fri, Dec 8

fhahn committed rL320199: [CodeExtractor] Add debug locations for new call and branch instrs..
[CodeExtractor] Add debug locations for new call and branch instrs.
Fri, Dec 8, 1:49 PM
fhahn closed D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Fri, Dec 8, 1:49 PM
fhahn committed rL320145: Publication: Fix date for I/O Optimisation and elimination via partial….
Publication: Fix date for I/O Optimisation and elimination via partial…
Fri, Dec 8, 6:05 AM
fhahn committed rL320143: Publication: I/O Optimisation and elimination via partial evaluation.
Publication: I/O Optimisation and elimination via partial evaluation
Fri, Dec 8, 4:39 AM
fhahn updated the diff for D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

Fix code to use the first debug location found, not the last! Updated the test case with multiple debug locations, to check we actually pick the first one, if there are multiple blocks with debug info.

Fri, Dec 8, 4:17 AM
fhahn created D41003: Silence GCC 7 warning by using an enum class..
Fri, Dec 8, 3:10 AM

Thu, Dec 7

fhahn added a reviewer for D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's: fhahn.

Thanks! I agree with Matthias in that a target feature would be better to enable/disable aggressive FMA. In AArch64.td there are already quite a lot of similar target features, like FeatureFuseAES or FeatureSlowSTRQro that enable certain optimizations for some micro architectures.

Thu, Dec 7, 12:05 PM
fhahn added a comment to D40973: [LV] Remove unnecessary DoExtraAnalysis guard (silent bug).

This seems like a bug indeed! Could you add a test case without a preheader, to make sure we catch this regression in the future?

Thu, Dec 7, 11:28 AM
fhahn added inline comments to D40728: [CallSiteSplitting] Refactor creating callsites..
Thu, Dec 7, 10:05 AM
fhahn updated the diff for D40728: [CallSiteSplitting] Refactor creating callsites..

Added more test cases and fixed infinite loop/

Thu, Dec 7, 10:01 AM

Wed, Dec 6

fhahn committed rL319982: Remove `lnt update` command..
Remove `lnt update` command.
Wed, Dec 6, 2:55 PM
fhahn closed D40704: Remove `lnt update` command. by committing rL319982: Remove `lnt update` command..
Wed, Dec 6, 2:55 PM
fhahn closed D40704: Remove `lnt update` command..
Wed, Dec 6, 2:55 PM
fhahn added inline comments to D40728: [CallSiteSplitting] Refactor creating callsites..
Wed, Dec 6, 2:53 PM
fhahn committed rL319980: [AArch64] Add patterns to replace fsub fmul with fma fneg..
[AArch64] Add patterns to replace fsub fmul with fma fneg.
Wed, Dec 6, 2:49 PM
fhahn closed D40306: [AArch64] Add patterns to replace fsub fmul with fma fneg..
Wed, Dec 6, 2:49 PM
fhahn committed rL319951: [MachineCombiner] Add up latencies of all instructions in new pattern..
[MachineCombiner] Add up latencies of all instructions in new pattern.
Wed, Dec 6, 12:28 PM
fhahn closed D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..
Wed, Dec 6, 12:28 PM
fhahn committed rL319947: [InlineFunction] Only replace call if there are VarArgs to forward..
[InlineFunction] Only replace call if there are VarArgs to forward.
Wed, Dec 6, 11:48 AM
fhahn closed D40412: [InlineFunction] Only replace call if there are VarArgs to forward..
Wed, Dec 6, 11:47 AM
fhahn added a comment to D40412: [InlineFunction] Only replace call if there are VarArgs to forward..

Thanks for having a look Eli! Dropping all the info is not intentional, this was missed during review. I've patch that adds debug locations D40432 and will follow this up with another patch preserving other metadata & co

Wed, Dec 6, 11:21 AM
fhahn accepted D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..

Accept in Phabricator after LGTM by Gerolf and Evandro, to make Arc happy.

Wed, Dec 6, 9:27 AM
fhahn updated the diff for D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..

Thanks for having a look. There are no perf regressions on Cortex-A57 on SPEC2k, the lnt test suite and SPEC2006. I'll commit this now, but will also run SPEC2017 on another board.

Wed, Dec 6, 9:26 AM
fhahn added a reviewer for D40412: [InlineFunction] Only replace call if there are VarArgs to forward.: efriedma.
Wed, Dec 6, 9:21 AM
fhahn updated the diff for D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

Thanks for all the feedback, it's been really helpful. I have updated the code to look for a valid debug location in all extracted blocks and added a test case where there is no debug loc in the header.

Wed, Dec 6, 9:19 AM
fhahn accepted D36704: [CodeGen] Improve the consistency of instruction fusion.

Thanks Evandro. I think this looks good to me now. I still think it's unfortunate that we basically duplicate the loop to add artificial dependences, but I don't see a way to share the code without making it more complicated than it is now. Please wait a bit with committing, to give Matthias a chance to voice additional concerns.

Wed, Dec 6, 7:03 AM
fhahn added a comment to D40883: [LV] Ignore the cost of values that will not appear in the vectorized loop.

The comment next to the VecValuesToIgnore declaration explicitly states that those instructions should be skipped during cost modelling if VF > 1 (/// Values to ignore in the cost model when VF > 1.)

Wed, Dec 6, 2:23 AM

Tue, Dec 5

fhahn added inline comments to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Tue, Dec 5, 1:39 PM
fhahn added a comment to D38279: [MachineScheduler] Enable latency heuristic based on scheduled lat..

This makes sense to me as well, but I wonder why Issued gets a different value depending on top/bot ?

Tue, Dec 5, 6:53 AM
fhahn updated the diff for D40704: Remove `lnt update` command..

Remove lnt update instead of fixing it.

Tue, Dec 5, 6:45 AM
fhahn added a comment to D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..

ping

Tue, Dec 5, 6:30 AM
fhahn added inline comments to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Tue, Dec 5, 6:29 AM
fhahn added a comment to D40833: [DAGCombine] isLegalNarrowLoad function (NFC).

Thanks Sam, LGTM too.

Tue, Dec 5, 5:49 AM

Mon, Dec 4

fhahn added a comment to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

Thanks for having a look! I'll address the assertion before committing tomorrow and also think a bit about the PGO case.

Mon, Dec 4, 2:57 PM
fhahn added a comment to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

This is probably okay. Is there an introduction into partial inlining somewhere? I'd like to better understand what the transformation is doing, so I can give more helpful review feedback.

Mon, Dec 4, 10:10 AM
fhahn updated the diff for D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Mon, Dec 4, 9:58 AM
fhahn added a comment to D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's.

On a related note, the MachineCombiner ignores enableAggressiveFMAFusion, so at -O3, the example won't be combined. I plan to put up a patch soon that updates the machine combiner to consider combining multiple uses, if profitable.

Mon, Dec 4, 8:35 AM
fhahn added inline comments to D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's.
Mon, Dec 4, 8:29 AM

Fri, Dec 1

fhahn added a comment to D40704: Remove `lnt update` command..

I would have expected that you can just setup the postgresql connection in lnt.cfg and the tables get created on first server startup. Though admittedly I haven't setup any production servers, so I may be missing something...

Fri, Dec 1, 3:27 PM
fhahn updated the diff for D40728: [CallSiteSplitting] Refactor creating callsites..

Thanks for having a look! I ran clang-format now.

Fri, Dec 1, 3:20 PM
fhahn added a comment to D40704: Remove `lnt update` command..

I used

Fri, Dec 1, 10:45 AM
fhahn added inline comments to D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's.
Fri, Dec 1, 9:21 AM
fhahn added a comment to D40412: [InlineFunction] Only replace call if there are VarArgs to forward..

Ping. This change only replaces the call site if there actually are varargs to forward.

Fri, Dec 1, 8:04 AM
fhahn added a comment to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

ping

Fri, Dec 1, 8:03 AM
fhahn added inline comments to D40696: Enable aggressive FMA on T99 and provide AArch64 option for other micro-arch's.
Fri, Dec 1, 8:00 AM
fhahn accepted D40695: Improve loop unrolling performance on T99.

LGTM, as long as you are happy with the speedup.

Fri, Dec 1, 7:57 AM
fhahn added a dependent revision for D40728: [CallSiteSplitting] Refactor creating callsites.: D40729: [CallSiteSplitting] Remove isOrHeader restriction..
Fri, Dec 1, 7:36 AM
fhahn added a dependency for D40729: [CallSiteSplitting] Remove isOrHeader restriction.: D40728: [CallSiteSplitting] Refactor creating callsites..
Fri, Dec 1, 7:36 AM
fhahn created D40729: [CallSiteSplitting] Remove isOrHeader restriction..
Fri, Dec 1, 7:36 AM
fhahn created D40728: [CallSiteSplitting] Refactor creating callsites..
Fri, Dec 1, 7:33 AM
fhahn committed rL319538: [InstSimplify] More fcmp cases when comparing against negative constants..
[InstSimplify] More fcmp cases when comparing against negative constants.
Fri, Dec 1, 4:34 AM
fhahn closed D40012: [InstSimplify] More fcmp cases when comparing against negative constants..
Fri, Dec 1, 4:34 AM
fhahn updated the summary of D40012: [InstSimplify] More fcmp cases when comparing against negative constants..
Fri, Dec 1, 4:33 AM
fhahn accepted D40713: [DAGCombine] Remove isAndLoadExtLoad arguments (NFC).

Great, thanks Sam. This is a NFC, could you change the title to something like [DAGCombine] Remove unused isAndLoadExtLoad arguments (NFC)

Fri, Dec 1, 4:30 AM
fhahn created D40704: Remove `lnt update` command..
Fri, Dec 1, 1:58 AM

Thu, Nov 30

fhahn accepted D40667: [DAGCombine] Simplify ISD::AND handling in ReduceLoadWidth.

Nice, thanks Eli. Looks good to me. It might be worth waiting with committing till tomorrow in case Sam has any more thoughts.

Thu, Nov 30, 1:01 PM

Wed, Nov 29

fhahn added inline comments to D40363: [AArch64][SVE] Asm: Improve diagnostics further when +sve is not specified.
Wed, Nov 29, 9:09 AM
fhahn added reviewers for D40363: [AArch64][SVE] Asm: Improve diagnostics further when +sve is not specified: echristo, efriedma.

Thanks Sander. IMO this is a nice improvement to the error messages and not SVE specific. Could you update the title & description? The only potential issue I can think about the same assembler string being enabled by multiple target features. What would happen in that case? The worst thing that can happen is only printing 1 of the target features to enable it?

Wed, Nov 29, 9:04 AM

Tue, Nov 28

fhahn added a comment to D40511: [AArch64] Fix scheduling resources for post indexed loads and stores.

The change for LoadPostIdx looks good, I am not entirely sure about StorePostIdx. Could you elaborate why we should remove ReadAdrBase there? Maybe @javed.absar has some thoughts too.

Tue, Nov 28, 2:23 PM
fhahn added reviewers for D40299: [Complex] Don't use __div?c3 when building with fast-math.: arphaman, GorNishanov, hfinkel, fhahn.

Looks good to me, with some nits. However it seems that we do not use the fast math flags anywhere else in Clang codegen, so it would be good to clarify if it is OK to use it here.

Tue, Nov 28, 12:22 PM
fhahn added a comment to D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..

Thanks Evandro for having a look! I would appreciate any feedback from Gerolf or Sebastian, as they reviewed/added the change summing up the latencies of the deleted instructions.

Tue, Nov 28, 11:22 AM
fhahn added a comment to D38279: [MachineScheduler] Enable latency heuristic based on scheduled lat..

This makes sense to me as well, but I wonder why Issued gets a different value depending on top/bot ?

Tue, Nov 28, 11:21 AM
fhahn committed rL319158: [TailRecursionElimination] Skip debug intrinsics..
[TailRecursionElimination] Skip debug intrinsics.
Tue, Nov 28, 1:33 AM
fhahn closed D40440: [TailRecursionElimination] Skip debug intrinsics..
Tue, Nov 28, 1:32 AM · debug-info

Mon, Nov 27

fhahn accepted D40107: [AArch64] Remove obsoleted feature.

LGTM, given that happy with the slight change in behavior in llvm/test/CodeGen/AArch64/arm64-ldp-cluster.ll for exynos.

Mon, Nov 27, 3:37 PM
fhahn added inline comments to D36704: [CodeGen] Improve the consistency of instruction fusion.
Mon, Nov 27, 9:00 AM
fhahn accepted D40360: [AArch64][SVE] Asm: Add SVE predicate register definitions and parsing support.

Thanks Sander, looks good to me now. I've added Eli and Eric as reviewers, please wait with committing so they can have a look too.

Mon, Nov 27, 6:59 AM
fhahn updated the diff for D40306: [AArch64] Add patterns to replace fsub fmul with fma fneg..

Addressed Evandro's comment.

Mon, Nov 27, 6:52 AM
fhahn added a comment to D40177: performance improvements for ThunderX2 T99.

Also if I understand correctly, this should have an impact on scheduling instructions using WriteAtomic, like CASB.

It doesn't. Maybe it was supposed to, in theory, but in reality it makes no difference whatsoever.

The only noticeable difference is in instruction cost: when Unsupported == 1, the estimated instruction cost is higher than when Unsupported == 0. Which makes no difference whatsoever in practice, given that the real cost of LSE instructions is pretty high anyway.

We've been emitting LSE instructions with no problems for many months. Whether or not Unsupported was 0 or 1 made no difference whatsoever.

Mon, Nov 27, 6:33 AM
fhahn accepted D39595: [DAGCombine] Refactor ReduceLoadWidth.

Thanks Sam, looks good to me. Please wait a day or two with committing, to give others a chance to raise additional concerns.

Mon, Nov 27, 2:29 AM

Fri, Nov 24

fhahn created D40440: [TailRecursionElimination] Skip debug intrinsics..
Fri, Nov 24, 6:52 AM · debug-info
fhahn updated the diff for D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

Improve test slightly.

Fri, Nov 24, 6:45 AM
fhahn added a dependency for D40432: [InlineFunction] Set debug loc for call to forward varargs.: D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Fri, Nov 24, 5:57 AM
fhahn created D40432: [InlineFunction] Set debug loc for call to forward varargs..
Fri, Nov 24, 5:57 AM
fhahn added a dependent revision for D40413: [CodeExtractor] Add debug locations for new call and branch instrs.: D40432: [InlineFunction] Set debug loc for call to forward varargs..
Fri, Nov 24, 5:57 AM
fhahn added a dependency for D40413: [CodeExtractor] Add debug locations for new call and branch instrs.: D40412: [InlineFunction] Only replace call if there are VarArgs to forward..
Fri, Nov 24, 2:54 AM
fhahn added a dependent revision for D40412: [InlineFunction] Only replace call if there are VarArgs to forward.: D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Fri, Nov 24, 2:54 AM
fhahn created D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Fri, Nov 24, 2:54 AM
fhahn created D40412: [InlineFunction] Only replace call if there are VarArgs to forward..
Fri, Nov 24, 2:52 AM

Wed, Nov 22

fhahn added inline comments to D40360: [AArch64][SVE] Asm: Add SVE predicate register definitions and parsing support.
Wed, Nov 22, 9:31 AM
fhahn added inline comments to D40177: performance improvements for ThunderX2 T99.
Wed, Nov 22, 8:57 AM

Tue, Nov 21

fhahn added reviewers for D40012: [InstSimplify] More fcmp cases when comparing against negative constants.: spatel, RKSimon.

Looks like a straight-forward simplification to me, but it would be good if someone else could have a look too.

Tue, Nov 21, 8:25 AM
fhahn added inline comments to D38585: Enable interprocedural optimization in libquantum - LLVM-part [WIP].
Tue, Nov 21, 8:05 AM
fhahn added a comment to D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..

I ran into this being a problem in D40306. The patterns added there are are not profitable on Cortex-A57, but because the NewRootLatency is under-estimated, the unprofitable pattern gets applied.

Tue, Nov 21, 7:55 AM
fhahn added a dependent revision for D40307: [MachineCombiner] Add up latencies of all instructions in new pattern.: D40306: [AArch64] Add patterns to replace fsub fmul with fma fneg..
Tue, Nov 21, 7:54 AM
fhahn added a dependency for D40306: [AArch64] Add patterns to replace fsub fmul with fma fneg.: D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..
Tue, Nov 21, 7:54 AM
fhahn created D40307: [MachineCombiner] Add up latencies of all instructions in new pattern..
Tue, Nov 21, 7:54 AM
fhahn created D40306: [AArch64] Add patterns to replace fsub fmul with fma fneg..
Tue, Nov 21, 7:51 AM

Mon, Nov 20

fhahn added a comment to D39743: [ValueLattice] Add CompactValueLatticeElement..

This needs some more work. I'll update the patch and post some data once I am done.

Mon, Nov 20, 1:55 PM

Sat, Nov 18

fhahn committed rL318593: [CallSiteSplitting] Remove some indirection (NFC)..
[CallSiteSplitting] Remove some indirection (NFC).
Sat, Nov 18, 10:14 AM
fhahn closed D40037: [CallSiteSplitting] Remove some indirection (NFC)..
Sat, Nov 18, 10:14 AM

Fri, Nov 17

fhahn added a comment to D39743: [ValueLattice] Add CompactValueLatticeElement..

> Can I ask why you chose to tackle this one? What motivated you to care about the memory consumption of LVI? Just curious about the background and motivating test cases.

Fri, Nov 17, 8:59 AM
fhahn updated the diff for D39743: [ValueLattice] Add CompactValueLatticeElement..

Thanks @reames ! This patch should introduces a more compact lattice element representation using PointerIntPair, not the ConstantRange folding set.

Fri, Nov 17, 8:56 AM

Thu, Nov 16

fhahn added a comment to D40030: [AArch64][TableGen] Skip tied result operands for InstAlias.

just 2 nits that might be worth considering. Otherwise LGTM as a fix for the reverted D29219

Thu, Nov 16, 6:53 AM
fhahn added inline comments to D40107: [AArch64] Remove obsoleted feature.
Thu, Nov 16, 2:36 AM
fhahn updated the diff for D40037: [CallSiteSplitting] Remove some indirection (NFC)..

Add check to ensure we only split call sites in tryToSplitOnOrPredicatedArgument if the header block has exactly 2 successors. I plan to post a patch relaxing that soonish.

Thu, Nov 16, 2:30 AM