
Fri, Sep 13

efriedma added a comment to D67483: [ARM] Reserve an emergency spill slot for fp16 addressing modes that need it.

I have almost this exact change in a WIP patch, but I didn't get around to posting it because I didn't have a testcase, and I ended up deciding to handle the Thumb1 estimation differently.

Fri, Sep 13, 3:20 PM · Restricted Project
efriedma added a comment to D67562: [MemorySSA] Update MSSA for non-conventional AA..

I guess in theory it's possible, for example, for a load to be ModRefInfo::NoModRef because it loads from a readonly global, and to write a pass that depends on the assumption that we won't create a MemoryUse for that load. (I think we actually create a MemoryUse for that at the moment, but it could change.)

Fri, Sep 13, 2:49 PM · Restricted Project
efriedma created D67571: [LICM] Don't verify domtree/loopinfo unless EXPENSIVE_CHECKS is enabled..
Fri, Sep 13, 2:22 PM · Restricted Project
efriedma added a comment to D67562: [MemorySSA] Update MSSA for non-conventional AA..

Why does an instruction which doesn't read or write memory have an associated MemorySSA memory access? Do we assume everything accesses memory if basicaa is disabled? Would it make sense to fix that, instead of adding checks which always fail normally?

Fri, Sep 13, 12:23 PM · Restricted Project
efriedma added inline comments to D67392: [ARM][ParallelDSP] Change smlad insertion order.
Fri, Sep 13, 12:09 PM
efriedma added inline comments to D66955: [DebugInfo][If-Converter] Update call site info during the optimization.
Fri, Sep 13, 11:53 AM · debug-info

Thu, Sep 12

efriedma added inline comments to D67392: [ARM][ParallelDSP] Change smlad insertion order.
Thu, Sep 12, 1:36 PM
efriedma accepted D67468: [ConstantFolding] Expand folding of some library functions.

LGTM

Thu, Sep 12, 11:54 AM · Restricted Project

Wed, Sep 11

efriedma updated the diff for D67375: [ARM] VFPv2 only supports 16 D registers..

Minor fix for ARMTargetParser: vfp2sp should be FPURestriction::SP_D16. I don't think this has any practical effect beyond making clang emit the "+vfp2sp" attribute in all the cases where it's legal.

Wed, Sep 11, 2:28 PM · Restricted Project
efriedma created D67467: [ARM] Update clang for removal of vfp2d16 and vfp2d16sp.
Wed, Sep 11, 2:25 PM · Restricted Project
efriedma accepted D67459: [ConstantFolding] Refactor math functions to use LLVM ones (NFC).

LGTM

Wed, Sep 11, 2:23 PM · Restricted Project
efriedma committed rG403e08d4cf3a: [ConstantHoisting] Fix non-determinism. (authored by efriedma).
Wed, Sep 11, 11:55 AM
efriedma committed rL371644: [ConstantHoisting] Fix non-determinism..
Wed, Sep 11, 11:53 AM
efriedma closed D66114: [ConstantHoisting] Fix non-determinism..
Wed, Sep 11, 11:53 AM · Restricted Project
efriedma added inline comments to D66955: [DebugInfo][If-Converter] Update call site info during the optimization.
Wed, Sep 11, 9:13 AM · debug-info

Tue, Sep 10

efriedma added a comment to D58233: Allow replacing intrinsic operands with variables.

It looks like the change is essentially a no-op for constant hoisting itself? Or maybe I'm misreading the code somehow? But sure, I'm fine with generally hoisting the check out of canReplaceOperandWithVariable, for all three of the current callers.

Tue, Sep 10, 12:09 PM
efriedma added a comment to D58233: Allow replacing intrinsic operands with variables.

I don't really have any concerns about this from the perspective of correctness; any breakage will be obvious and easy to fix.

Tue, Sep 10, 11:47 AM
efriedma added a comment to D67392: [ARM][ParallelDSP] Change smlad insertion order.

Note: I have no idea why git has decided that I've made a change to an MC test.

Tue, Sep 10, 11:40 AM
efriedma added a comment to D67392: [ARM][ParallelDSP] Change smlad insertion order.

This can help reduce register pressure.

Tue, Sep 10, 11:36 AM
efriedma accepted D67367: [LoopInterchange] Properly move condition, induction increment and ops to latch..

LGTM

Tue, Sep 10, 11:31 AM · Restricted Project

Mon, Sep 9

efriedma added inline comments to D67350: [IfCvt][ARM] Optimise diamond if-conversion for code size.
Mon, Sep 9, 6:52 PM · Restricted Project
efriedma added a comment to D67367: [LoopInterchange] Properly move condition, induction increment and ops to latch..

This makes more sense than D67076, I think.

Mon, Sep 9, 3:45 PM · Restricted Project
efriedma created D67375: [ARM] VFPv2 only supports 16 D registers..
Mon, Sep 9, 3:32 PM · Restricted Project
efriedma added a comment to D67362: [SLP] limit vectorization of Constant subclasses (PR33958).

Maybe not worth trying harder. There are other reasons we might not want to introduce constant pools that require relocations.

Mon, Sep 9, 12:59 PM · Restricted Project
efriedma added a comment to D67362: [SLP] limit vectorization of Constant subclasses (PR33958).

In some cases it might be possible to form a vector constant pool entry. We don't do that currently, though.

Mon, Sep 9, 12:32 PM · Restricted Project
efriedma committed rG79f0d3a6e58b: [IfConversion] Correctly handle cases where analyzeBranch fails. (authored by efriedma).
Mon, Sep 9, 11:29 AM
efriedma committed rL371434: [IfConversion] Correctly handle cases where analyzeBranch fails..
Mon, Sep 9, 11:28 AM
efriedma closed D67306: [IfConversion] Correctly handle cases where analyzeBranch fails..
Mon, Sep 9, 11:28 AM · Restricted Project

Fri, Sep 6

efriedma added a comment to D64884: [PHINode] Preserve use-list order when removing incoming values..

My motivation for this change is to fix a case where we end up with a non-deterministic use-list order after simplifycfg, which I think is caused by this trashing of the use-lists.

Fri, Sep 6, 4:45 PM · Restricted Project
efriedma added a comment to D67178: [SCEV] Use loop guard info when computing the max BE taken count in howFarToZero..

Do you mean in general or for SCEV?

Fri, Sep 6, 4:21 PM · Restricted Project
efriedma created D67306: [IfConversion] Correctly handle cases where analyzeBranch fails..
Fri, Sep 6, 4:14 PM · Restricted Project
efriedma added a comment to D67300: [ConstantFolding] Fold constant calls to log2().

I think a few of the functions we use ConstantFoldFP with actually already have implementations in APFloat: floor, ceil, round, fabs, fmod.

Fri, Sep 6, 3:51 PM · Restricted Project
efriedma accepted D67028: Use musttail for variadic method thunks when possible.

LGTM

Fri, Sep 6, 3:45 PM · Restricted Project, Restricted Project
efriedma added a comment to D65148: [SimplifyCFG] Bump phi-node-folding-threshold from 2 to 3.

That looks mostly fine, then. Maybe still a few more cmovs than I'd like... but close enough.

Fri, Sep 6, 3:23 PM · Restricted Project
efriedma added a comment to D67300: [ConstantFolding] Fold constant calls to log2().

I'm not really happy with adding more uses of ConstantFoldFP/ConstantFoldBinaryFP; they're flawed because they produce results that depend on the host's libm implementation. But I guess there isn't any reason to support log and not log2.

Fri, Sep 6, 3:09 PM · Restricted Project
efriedma added a comment to D62989: [Unroll] Do NOT unroll a loop with small runtime upperbound.

ping

Fri, Sep 6, 2:59 PM · Restricted Project
efriedma updated subscribers of D67281: [AArch64][SimplifyCFG] Add additional cost for instructions in mergeConditionalStoreToAddress.

I'm trying to understand the issue you're seeing... I guess it comes down to something like the following?

Fri, Sep 6, 2:37 PM · Restricted Project
efriedma accepted D67220: [ARM][ParallelDSP] Fix for sext input.

LGTM

Fri, Sep 6, 12:47 PM · Restricted Project
efriedma added inline comments to D67203: [IfConversion] Fix diamond conversion with unanalyzable branches..
Fri, Sep 6, 12:44 PM · Restricted Project

Thu, Sep 5

efriedma updated the diff for D66114: [ConstantHoisting] Fix non-determinism..

Got rid of a few changes that were clearly unnecessary, because we never iterated over the maps in question.
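For background on why iteration is the deciding factor: a hash container keyed by pointers can visit its entries in a different order from run to run, so anything built from that order can differ too, while lookups alone are unaffected. A minimal sketch of an insertion-ordered map, in the spirit of llvm::MapVector; `OrderedMap` and its members are hypothetical names used for illustration, and this is not the code from the patch.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical insertion-ordered map: lookups go through a hash index,
// but iteration walks a vector in insertion order, so the visit order
// does not depend on hash values or pointer addresses.
template <typename K, typename V>
class OrderedMap {
  std::unordered_map<K, std::size_t> index_;   // key -> position in entries_
  std::vector<std::pair<K, V>> entries_;       // entries in insertion order

public:
  // Returns the value for key, default-constructing it on first use.
  V &operator[](const K &key) {
    auto it = index_.find(key);
    if (it == index_.end()) {
      it = index_.emplace(key, entries_.size()).first;
      entries_.emplace_back(key, V());
    }
    return entries_[it->second].second;
  }

  auto begin() const { return entries_.begin(); }
  auto end() const { return entries_.end(); }
  std::size_t size() const { return entries_.size(); }
};
```

A map that is only ever queried by key gains nothing from this, which matches the reasoning in the comment above.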

Thu, Sep 5, 6:06 PM · Restricted Project
efriedma accepted D67244: LangRef: mention MSan's problem with speculative conditional branches..

LGTM

Thu, Sep 5, 4:43 PM · Restricted Project
efriedma committed rG9dd453ce8d6b: [AArch64] Add testcase for codegen for sdiv by 2. (authored by efriedma).
Thu, Sep 5, 4:40 PM
efriedma committed rL371147: [AArch64] Add testcase for codegen for sdiv by 2..
Thu, Sep 5, 4:38 PM
efriedma added a comment to D67087: [X86] Override BuildSDIVPow2 for X86..

I've also blocked INT_MIN as the transform isn't valid for that.
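As context for the kind of transform under discussion, here is a minimal sketch of the classic shift-based lowering of a signed division by a positive power of two. It assumes 32-bit operands and a divisor 2^k with 1 <= k <= 30; `sdivPow2` is a hypothetical name, and the sketch says nothing about how the patch itself handles negative divisors or INT_MIN.

```cpp
#include <cstdint>

// Signed division of x by 2^k (1 <= k <= 30), rounding toward zero like
// the C++ '/' operator. An arithmetic shift alone rounds toward negative
// infinity, so negative dividends get a bias of (2^k - 1) added first.
int32_t sdivPow2(int32_t x, unsigned k) {
  // All ones for negative x, zero otherwise.
  uint32_t sign = x < 0 ? 0xFFFFFFFFu : 0u;
  // Keep only the low k bits of the sign mask: 2^k - 1 or 0.
  uint32_t bias = sign >> (32 - k);
  // Widen so that adding the bias cannot overflow.
  int64_t biased = static_cast<int64_t>(x) + bias;
  // Right shift of a negative value is arithmetic in C++20; earlier
  // standards leave it implementation-defined, though mainstream
  // compilers also shift arithmetically.
  return static_cast<int32_t>(biased >> k);
}
```

For example, `sdivPow2(-7, 1)` yields -3, matching `-7 / 2`, whereas a bare arithmetic shift would give -4.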

Thu, Sep 5, 4:10 PM · Restricted Project
efriedma added a comment to D67028: Use musttail for variadic method thunks when possible.

In your test case, we hit the early return that I linked to, so we don't try to clone, and we don't need to emit an error.

Thu, Sep 5, 2:56 PM · Restricted Project, Restricted Project
efriedma added a comment to D67028: Use musttail for variadic method thunks when possible.

Oops, meant to actually include the testcase in my last comment:

Thu, Sep 5, 2:14 PM · Restricted Project, Restricted Project
efriedma added a comment to D67028: Use musttail for variadic method thunks when possible.

In the MS ABI, deriving a new class may require the creation of new thunks for methods that were not overridden, so we can't use the same trick.

Thu, Sep 5, 2:14 PM · Restricted Project, Restricted Project
efriedma accepted D67205: [SimplifyCFG] Don't SimplifyBranchOnICmpChain with ExtraCase.

Basically, the only rule is that you should not speculatively introduce a conditional branch on a value that might be undef and is not guaranteed to execute in the input IR.

Thu, Sep 5, 1:29 PM · Restricted Project
efriedma committed rGcae1e47f6ed7: [IfConversion] Fix diamond conversion with unanalyzable branches. (authored by efriedma).
Thu, Sep 5, 1:03 PM
efriedma committed rL371111: [IfConversion] Fix diamond conversion with unanalyzable branches..
Thu, Sep 5, 1:03 PM
efriedma closed D67203: [IfConversion] Fix diamond conversion with unanalyzable branches..
Thu, Sep 5, 1:03 PM · Restricted Project
efriedma added inline comments to D67220: [ARM][ParallelDSP] Fix for sext input.
Thu, Sep 5, 12:36 PM · Restricted Project
efriedma added a comment to D67205: [SimplifyCFG] Don't SimplifyBranchOnICmpChain with ExtraCase.

Your description of the issue uses "||" in the pseudo-code; I assume you actually mean a non-short-circuiting "|"?
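The distinction matters because "||" only evaluates its right operand when the left one is false, which implies a branch on the left operand, while "|" evaluates both sides unconditionally. A small sketch that counts evaluations to make the difference visible; the helper names are hypothetical.

```cpp
// Returns value and records that this operand was evaluated.
static bool observe(bool value, int &evaluations) {
  ++evaluations;
  return value;
}

// Short-circuiting: the right operand is skipped when the left is true.
int evaluationsLogicalOr(bool a, bool b) {
  int n = 0;
  bool r = observe(a, n) || observe(b, n);
  (void)r;
  return n;
}

// Non-short-circuiting: both operands are always evaluated.
int evaluationsBitwiseOr(bool a, bool b) {
  int n = 0;
  int r = static_cast<int>(observe(a, n)) | static_cast<int>(observe(b, n));
  (void)r;
  return n;
}
```

With `a == true`, the first function evaluates one operand and the second evaluates two.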

Thu, Sep 5, 12:29 PM · Restricted Project
efriedma added a comment to D67076: [LoopInterchange] Make sure we create PHI nodes for uses in split off latch..

Is there a potential correctness issue here, if some operation that isn't the exit condition or the IV increment somehow ends up in the new latch?

Thu, Sep 5, 12:16 PM · Restricted Project
efriedma added a comment to D67207: [JumpThreading] Fix the AssertVH handling for debug builds..

It seems like there are multiple problems here if "BB" gets deleted eagerly: there's the AssertingVH failure you're mentioning here, LVI->eraseBlock doesn't work, and the for loop "for (auto &BB : F)" also breaks. So I don't think this patch really solves anything.
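The range-for problem is the general one of erasing the current element of a linked list while a loop still holds an iterator to it. A minimal sketch using std::list as a stand-in for a function's block list; this is not LLVM code, and `eraseEvens` is a hypothetical name.

```cpp
#include <list>

// Erases every even element and returns how many were removed.
// A range-for would advance from an iterator that erase() has just
// invalidated; taking the iterator returned by erase() avoids that.
int eraseEvens(std::list<int> &blocks) {
  int erased = 0;
  for (auto it = blocks.begin(); it != blocks.end();) {
    if (*it % 2 == 0) {
      it = blocks.erase(it); // erase() returns the next valid iterator
      ++erased;
    } else {
      ++it;
    }
  }
  return erased;
}
```

The alternative is to defer deletion until the loop has finished, so that nothing is freed while it is still being visited.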

Thu, Sep 5, 12:15 PM · Restricted Project

Wed, Sep 4

efriedma created D67203: [IfConversion] Fix diamond conversion with unanalyzable branches..
Wed, Sep 4, 5:53 PM · Restricted Project
efriedma added inline comments to D67199: [InstCombine] Expand the simplification of log().
Wed, Sep 4, 4:44 PM · Restricted Project
efriedma added a comment to D67178: [SCEV] Use loop guard info when computing the max BE taken count in howFarToZero..

In terms of general API, I don't think we want to expose "applyLoopGuards"; the SCEV transform proposed here isn't really useful outside of trying to find the minimum or maximum, as far as I can tell. Which min/max expressions we want to form depends on whether we're computing a "max" or a "min". And restricting the API so the point in the CFG we're querying has to be a loop header doesn't seem helpful; other places might care about values after a loop etc.

Wed, Sep 4, 4:44 PM · Restricted Project
efriedma added a comment to D67199: [InstCombine] Expand the simplification of log().

Patch uploaded without context.

Wed, Sep 4, 4:16 PM · Restricted Project
efriedma added a comment to D65148: [SimplifyCFG] Bump phi-node-folding-threshold from 2 to 3.

So, what metric specifically do you want to see? A count of CMOV instructions at the end of codegen, and how this patch changes it?

Wed, Sep 4, 4:07 PM · Restricted Project
efriedma added a comment to D67076: [LoopInterchange] Make sure we create PHI nodes for uses in split off latch..

I'm not sure how you never hit this issue before. I mean, I can see from the testcase that the compare for the outer loop's branch can end up in the inner loop... but why is it not *always* in the inner loop? What criteria are we using to sink it in some cases? Should we be sinking in all cases?

Wed, Sep 4, 3:56 PM · Restricted Project
efriedma added a comment to D63972: [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks.

New changes look okay... but maybe someone else should look too, since I've missed multiple serious issues with the CFG updating code.

Wed, Sep 4, 3:41 PM · Restricted Project
efriedma added a comment to D65148: [SimplifyCFG] Bump phi-node-folding-threshold from 2 to 3.

Aggressively flattening the CFG has tradeoffs. If the branch is very unpredictable, or it unblocks some important optimization, it can have a huge benefit. If you don't fall into one of those cases, you're mildly degrading the performance of a bunch of code, by forcing the execution of instructions where the result isn't used.
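To make the tradeoff concrete, here is a source-level sketch of the two shapes. The function names are hypothetical, unsigned arithmetic is used so that computing the unused side is harmless, and a compiler is free to turn either form into the other.

```cpp
// Branchy form: only the selected side is computed.
unsigned pickBranchy(bool c, unsigned a, unsigned b) {
  if (c)
    return a * 3u;
  return b + 1u;
}

// Flattened form: both sides are computed, then one is selected.
// This removes the branch but always pays for the unused computation.
unsigned pickFlattened(bool c, unsigned a, unsigned b) {
  unsigned ifTrue = a * 3u;
  unsigned ifFalse = b + 1u;
  return c ? ifTrue : ifFalse;
}
```

Both functions return the same value for every input; they differ only in how much work is done on each path.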

Wed, Sep 4, 3:36 PM · Restricted Project
efriedma added a comment to D67178: [SCEV] Use loop guard info when computing the max BE taken count in howFarToZero..

Instead of writing a C++ unittest, you should be able to use "opt -analyze -scalar-evolution" to test this.

Wed, Sep 4, 12:21 PM · Restricted Project

Tue, Sep 3

efriedma accepted D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize.

I'm afraid there's a latent bug in the interaction between ShrinkWrap and PEI, but I guess the effect might be sort of hard to spot; if we allocate an unnecessary stack object before PEI, it would be hard to notice.

Tue, Sep 3, 6:11 PM · Restricted Project
efriedma accepted D66993: [ARM][ParallelDSP] SExt mul for accumulation.

LGTM

Tue, Sep 3, 3:28 PM · Restricted Project
efriedma added a comment to D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize.

This seems much better.

Tue, Sep 3, 3:11 PM · Restricted Project
efriedma added inline comments to D67105: [TargetLowering] Fix another potential FPE in expandFP_TO_UINT.
Tue, Sep 3, 12:42 PM · Restricted Project

Fri, Aug 30

efriedma added a comment to D67028: Use musttail for variadic method thunks when possible.

Do we have test coverage for a variadic, covariant thunk for a function without a definition? I don't think there's any way for us to actually emit that, but we should make sure the error message is right.

Fri, Aug 30, 5:33 PM · Restricted Project, Restricted Project
efriedma added inline comments to D67021: [DAGCombiner] improve throughput of shift+logic+shift.
Fri, Aug 30, 5:12 PM · Restricted Project
efriedma accepted D65737: [InstCombine] mempcpy(d,s,n) to memcpy(d,s,n) + n.

LGTM
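For reference, the rewrite named in the patch title can be sketched at the source level as below. `mempcpyViaMemcpy` is a hypothetical name; mempcpy itself is a GNU extension and is not called here.

```cpp
#include <cstddef>
#include <cstring>

// Equivalent of mempcpy(dst, src, n): copy n bytes, then return a
// pointer to the byte just past the last one written.
void *mempcpyViaMemcpy(void *dst, const void *src, std::size_t n) {
  std::memcpy(dst, src, n);
  return static_cast<char *>(dst) + n;
}
```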

Fri, Aug 30, 2:16 PM · Restricted Project
efriedma updated subscribers of D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize.

This method seems to be called by several passes:

Fri, Aug 30, 11:48 AM · Restricted Project
efriedma added a comment to D66993: [ARM][ParallelDSP] SExt mul for accumulation.

Could you describe the complete flow here? If getAccumulator() returns a value, I think I see how it works; the type of that value is the type of the final accumulator, and you need to sign-extend multiplies to match that type. If getAccumulator() returns null, I don't see how this is supposed to work; it looks like the code arbitrarily decides the accumulator should be 32 bits.

Fri, Aug 30, 11:34 AM · Restricted Project

Thu, Aug 29

efriedma added a comment to D66839: Fix stack address builtin for negative numbers.

In the context of __builtin_frame_address, an arbitrary limit is probably okay. Maybe something like 0xFFFF, which is larger than anyone would realistically use, but doesn't take a crazy amount of time to compile.

Thu, Aug 29, 5:25 PM · Restricted Project
efriedma updated subscribers of D66924: [NewGVN] Add phi-of-ops instr as user of FoundVal..
Thu, Aug 29, 3:40 PM · Restricted Project
efriedma added a comment to D66862: Make lround builtin constexpr (and others).

It's probably worth adding testcases for 0.5 and -0.5. I think the current implementation behaves correctly, but it would be easy to mess up with a small change to the code.
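The halfway cases are the interesting ones because lround rounds them away from zero rather than to even. A sketch of the expected values using the runtime std::lround; it does not exercise the constexpr builtin from the patch, and `lroundHalfwayCasesOk` is a hypothetical name.

```cpp
#include <cmath>

// Checks that halfway cases round away from zero, and that values
// just inside the halfway point still round toward zero.
bool lroundHalfwayCasesOk() {
  return std::lround(0.5) == 1 &&
         std::lround(-0.5) == -1 &&
         std::lround(1.5) == 2 &&
         std::lround(-1.5) == -2 &&
         std::lround(0.49) == 0 &&
         std::lround(-0.49) == 0;
}
```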

Thu, Aug 29, 2:54 PM · Restricted Project
efriedma added a comment to D66935: [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize.

Why are we calling determineCalleeSaves in LiveDebugValues, anyway? Can't it just call getCalleeSavedInfo()?

Thu, Aug 29, 12:21 PM · Restricted Project

Tue, Aug 27

efriedma added a comment to D66839: Fix stack address builtin for negative numbers.

We usually prefer to generate error messages for incorrect parameters to builtins in SemaChecking.cpp.

Tue, Aug 27, 5:01 PM · Restricted Project
efriedma accepted D66660: [ARM][ParallelDSP] Change search for muls.

LGTM

Tue, Aug 27, 11:59 AM · Restricted Project

Mon, Aug 26

efriedma added a comment to D66618: [WIP] Expose functions to determine pointer properties (Align & Deref).

Can we remove the CanBeNull argument from getPointerDereferenceableBytes()? It looks like it's currently unused. Or are you planning to use it somewhere?

Mon, Aug 26, 3:40 PM · Restricted Project

Fri, Aug 23

efriedma added a comment to D66309: Introduce infrastructure for an incremental port of SelectionDAG atomic load/store handling.

Is there an llvm-dev thread for the general project? It looks fine to me, but maybe it should have a wider audience.

Fri, Aug 23, 3:39 PM · Restricted Project
efriedma accepted D66639: AArch64: avoid cycle when forming post-increment NEON loads.

On a side-note, we really should try to come up with a better algorithm for forming pre/post-increment operations; hasPredecessorHelper is slow.

Fri, Aug 23, 3:05 PM · Restricted Project
efriedma added a comment to D66664: [FIX] Nonnull is not always implied by dereferenceable.

If you want to really expand out the meaning of "CanBeNull", it means "was the number of dereferenceable bytes computed using a dereferenceable_or_null attribute/metadata". It has nothing to do with whether a null pointer is generally valid in the given address space. The logic has always worked this way, since before it was extracted into a separate function in D17572.

Fri, Aug 23, 2:49 PM · Restricted Project
efriedma added inline comments to D66660: [ARM][ParallelDSP] Change search for muls.
Fri, Aug 23, 1:31 PM · Restricted Project
efriedma added a comment to D66664: [FIX] Nonnull is not always implied by dereferenceable.

getPointerDereferenceableBytes returns some number of dereferenceable bytes. If CanBeNull is true, that result is modified: if the pointer value is null, the number of known dereferenceable bytes is actually zero.
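One way to read that description is as a pair of results that a caller has to combine. The sketch below is a simplified model, not the LLVM API: `DerefResult` and `usableDereferenceableBytes` are hypothetical names, and the non-null flag stands in for whatever separate reasoning the caller has about the pointer.

```cpp
#include <cstdint>

// Hypothetical model of the query result: a byte count plus a flag
// saying the count only holds if the pointer turns out to be non-null.
struct DerefResult {
  uint64_t bytes;
  bool canBeNull;
};

// Bytes a caller may rely on. If the count is conditional on the
// pointer being non-null and the caller cannot prove that, nothing
// is known to be dereferenceable.
uint64_t usableDereferenceableBytes(const DerefResult &r,
                                    bool pointerKnownNonNull) {
  if (r.canBeNull && !pointerKnownNonNull)
    return 0;
  return r.bytes;
}
```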

Fri, Aug 23, 1:17 PM · Restricted Project

Thu, Aug 22

efriedma added inline comments to D66210: [RFC/WIP][RISCV] Enable the machine outliner for RISC-V.
Thu, Aug 22, 2:06 PM · Restricted Project
efriedma added a comment to D66461: [CaptureTracker] Comparisons of allocation pointers do not capture.

Thinking about it a bit more, that definition is probably okay.

Thu, Aug 22, 1:51 PM · Restricted Project
efriedma added a comment to D66461: [CaptureTracker] Comparisons of allocation pointers do not capture.

As I described, the compare will capture only if the other pointer was captured. So, in addition to the checks in the patch, ask the tracker whether that is the case.

Thu, Aug 22, 12:31 PM · Restricted Project
efriedma added a comment to D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes.

Oh, I somehow forgot that was legal. :( That breaks this whole approach (well, maybe the lambda could capture "w", but that seems way too complicated). So we're left with a few possibilities:

Thu, Aug 22, 11:58 AM · Restricted Project, Restricted Project

Wed, Aug 21

efriedma added a comment to D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes.

Added a few more minor comments.

Wed, Aug 21, 1:41 PM · Restricted Project, Restricted Project
efriedma added a comment to D66539: [ELF][ARM] Simplify some llvm-objdump tests with both ARM/Thumb states.

(a) have two MCInstrAnalysis (how), (b) pass an extra parameter to evaluateBranch, or (c) make it stateful?

Wed, Aug 21, 12:16 PM · Restricted Project
efriedma added a comment to D66461: [CaptureTracker] Comparisons of allocation pointers do not capture.

Yes, you can't capture a pointer that's already captured.

Wed, Aug 21, 11:58 AM · Restricted Project
efriedma accepted D65020: [GVN] Do PHI translations across all edges between the load and the unavailable pred..

LGTM

Wed, Aug 21, 11:41 AM · Restricted Project
efriedma accepted D66543: [DAGCombiner] Remove mostly redundant calls to AddToWorklist.

These are redundant because they're new nodes? Sure, LGTM.

Wed, Aug 21, 11:39 AM · Restricted Project

Tue, Aug 20

efriedma added a comment to D66510: Fix for "DICompileUnit not listed in llvm.dbg.cu" verification error after cloning a function from a different module.

Why are the changes to CloneModule.cpp necessary?

Tue, Aug 20, 7:15 PM · Restricted Project
efriedma added a comment to D64174: [DAGCombine] Do several rounds of combine for addcarry nodes..

Changing this led very quickly to a world of pain.

Tue, Aug 20, 7:10 PM · Restricted Project
efriedma added a comment to D66439: [LibFunc] "free" captures the pointer operand.

I'd rather analyze whether the memory is valid separately from whether the pointer itself is captured, I think? Removing nocapture markings makes all the existing uses of capture analysis weaker.

Tue, Aug 20, 5:07 PM · Restricted Project
efriedma added inline comments to D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes.
Tue, Aug 20, 4:56 PM · Restricted Project, Restricted Project
efriedma added inline comments to D66122: [CodeGen] Emit dynamic initializers for static TLS vars in outlined scopes.
Tue, Aug 20, 3:51 PM · Restricted Project, Restricted Project
efriedma added a comment to D66439: [LibFunc] "free" captures the pointer operand.

Is there some practical issue I'm missing?

Tue, Aug 20, 2:57 PM · Restricted Project