Tue, Sep 19

dexonsmith accepted D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.

LGTM! Thanks.

Tue, Sep 19, 8:16 AM · Restricted Project, Restricted Project

Thu, Sep 14

dexonsmith added inline comments to D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.
Thu, Sep 14, 8:54 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D74094: Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas.

I don't yet fully comprehend what's going wrong, and probably need to familiarize myself with the language rules around auto's type deduction.

Thu, Sep 14, 8:11 AM · Restricted Project, Restricted Project

Tue, Sep 12

dexonsmith added a comment to D61878: [libc++] Optimize unordered_{multiset,multimap} equality comparison.

[Github PR transition cleanup]

This patch doesn't seem correct to me: unordered containers are not ordered, so I don't think we can do a linear walk of __first1 and __first2, compare elements side by side and draw any conclusion from that. Abandoning.
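
As a minimal sketch of the objection (not the patch's code; the helper names `naiveEqual`, `permutationEqual`, and `multisetsEqual` are made up for this illustration), two vectors with the same elements in different order stand in for the iteration order of two equal unordered containers. A side-by-side walk rejects them, while an order-insensitive comparison accepts them:

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <vector>

// Side-by-side walk: only meaningful when both ranges share one ordering.
template <class It1, class It2>
bool naiveEqual(It1 First1, It1 Last1, It2 First2, It2 Last2) {
  return std::equal(First1, Last1, First2, Last2);
}

// Order-insensitive comparison: checks that one range is a permutation
// of the other.
template <class It1, class It2>
bool permutationEqual(It1 First1, It1 Last1, It2 First2, It2 Last2) {
  return std::is_permutation(First1, Last1, First2, Last2);
}

// Builds two unordered_multisets from the inputs and compares them with
// the container's own operator==.
inline bool multisetsEqual(const std::vector<int> &X,
                           const std::vector<int> &Y) {
  std::unordered_multiset<int> A(X.begin(), X.end());
  std::unordered_multiset<int> B(Y.begin(), Y.end());
  return A == B;
}

inline void demoUnorderedEquality() {
  // Same elements, different visiting order.
  std::vector<int> A = {1, 2, 2, 3};
  std::vector<int> B = {3, 2, 1, 2};
  assert(!naiveEqual(A.begin(), A.end(), B.begin(), B.end()));
  assert(permutationEqual(A.begin(), A.end(), B.begin(), B.end()));
  assert(multisetsEqual(A, B));
}
```

The vectors are used because the real iteration order of an unordered container is implementation-defined, so the mismatch cannot be forced portably with the containers themselves.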

Tue, Sep 12, 8:02 AM · Restricted Project, Restricted Project

Fri, Sep 8

dexonsmith added inline comments to D159064: [Modules] Make clang modules for the C standard library headers.
Fri, Sep 8, 3:08 PM · Restricted Project, Restricted Project

Wed, Sep 6

dexonsmith requested changes to D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.

I looked at the documentation for User::dropAllReferences(), which this overrides. That has strengthened my opinion so I'm requesting changes.

Wed, Sep 6, 11:57 AM · Restricted Project, Restricted Project

Tue, Sep 5

dexonsmith added inline comments to D154119: Fix: Distinguish CFI Metadata Checks in MergeFunctions Pass.
Tue, Sep 5, 3:03 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D154119: Fix: Distinguish CFI Metadata Checks in MergeFunctions Pass.

ping

Tue, Sep 5, 2:15 PM · Restricted Project, Restricted Project

Mon, Sep 4

dexonsmith added a comment to D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.

Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.

Can we instead somehow enforce that delete is always called right after? Perhaps add an assertion that this is so?

Or, looking back at the motivation (calling deleteBody()), can we have two entry points?

  • One prepares for deletion (99% of callers), skips CPN::get()
  • One makes it a prototype (deleteBody), calls CPN::get()
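
The two-entry-point idea could look roughly like the toy model below. This is only a sketch under stated assumptions: `ToyFunction`, `prepareForDeletion`, `makePrototype`, and the lookup counter are hypothetical names, and nothing here is LLVM's actual `Function` API. The counter stands in for the cost of calling CPN::get(), and the assertion stands in for enforcing that only destruction follows the deletion path.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function object; not LLVM's Function.
class ToyFunction {
  std::vector<int> Blocks;          // stands in for the function body
  bool HasPlaceholders = false;     // stands in for operands reset via CPN::get()
  bool PreparedForDeletion = false; // set by the deletion-only entry point
  int PlaceholderLookups = 0;       // counts simulated CPN::get() calls

public:
  explicit ToyFunction(std::vector<int> B) : Blocks(std::move(B)) {}

  // Entry point 1: the caller is about to destroy the object, so the
  // placeholder lookup is skipped.
  void prepareForDeletion() {
    Blocks.clear();
    PreparedForDeletion = true;
  }

  // Entry point 2: the object lives on as a declaration, so placeholder
  // operands are installed.
  void makePrototype() {
    assert(!PreparedForDeletion && "only destruction may follow");
    Blocks.clear();
    ++PlaceholderLookups;
    HasPlaceholders = true;
  }

  bool isPrototype() const { return Blocks.empty() && HasPlaceholders; }
  bool isPreparedForDeletion() const { return PreparedForDeletion; }
  int placeholderLookups() const { return PlaceholderLookups; }
};
```

In this model, callers that are about to delete never pay for the placeholder lookup, and misuse after `prepareForDeletion()` trips the assertion in builds with assertions enabled.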

Thanks for the comment. I think it is a little redundant. Since we already have a destructor to release the memory, another entry point seems redundant.
So I'd like to keep only the prototype version.

Mon, Sep 4, 8:16 AM · Restricted Project, Restricted Project

Thu, Aug 31

dexonsmith added a comment to D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.

Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.

Can we instead somehow enforce that delete is always called right after? Perhaps add an assertion that this is so?

Thu, Aug 31, 11:51 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D156618: [IR] Fix a memory leak if Function::dropAllReferences() is followed by setHungoffOperand.

Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.

Thu, Aug 31, 11:48 AM · Restricted Project, Restricted Project

Wed, Aug 30

dexonsmith added inline comments to D159064: [Modules] Make clang modules for the C standard library headers.
Wed, Aug 30, 1:57 PM · Restricted Project, Restricted Project

Tue, Aug 29

dexonsmith added inline comments to D159064: [Modules] Make clang modules for the C standard library headers.
Tue, Aug 29, 8:10 AM · Restricted Project, Restricted Project

Mon, Aug 28

dexonsmith added a comment to D137213: [clang][modules] NFCI: Pragma diagnostic mappings: write/read FileID instead of SourceLocation.

Now that the behaviour change is understood, maybe it'd be useful to split the patch in two:

  • First, this patch, plus a call to ASTReader::getInputFile() only for its side effects, to make this patch actually NFC.
  • Second, committed a few days later, a patch that removes the call and adds a test confirming -I is no longer implied by -fembed-all-input-files.
Mon, Aug 28, 6:33 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D154119: Fix: Distinguish CFI Metadata Checks in MergeFunctions Pass.

Sorry, Phabricator lost my comment somehow. Adding back a version of it now.

Mon, Aug 28, 9:54 AM · Restricted Project, Restricted Project
dexonsmith requested changes to D154119: Fix: Distinguish CFI Metadata Checks in MergeFunctions Pass.
Mon, Aug 28, 9:46 AM · Restricted Project, Restricted Project

Aug 22 2023

dexonsmith added inline comments to D158301: Add back overriding-t-options for -m<os>-version-min diagnostic.
Aug 22 2023, 12:56 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D158301: Add back overriding-t-options for -m<os>-version-min diagnostic.
Aug 22 2023, 10:22 AM · Restricted Project, Restricted Project

Aug 19 2023

dexonsmith added inline comments to D158301: Add back overriding-t-options for -m<os>-version-min diagnostic.
Aug 19 2023, 12:12 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D158301: Add back overriding-t-options for -m<os>-version-min diagnostic.
Aug 19 2023, 8:30 AM · Restricted Project, Restricted Project

Aug 17 2023

dexonsmith added a comment to D158137: Rename warn_drv_overriding_flag_option (-Woverriding-t-option) to warn_drv_overriding_flag_option (-Woverriding-option).

Can you explain the downside of leaving behind an alias?

Two minor ones. (a) Existing -Wno-overriding-t-option users will not notice that they need to migrate and (b) Clang has accrued tiny tech debt.
If we eventually remove -Wno-overriding-t-option for tidiness, we will have to break -Werror -Wno-overriding-t-option users.

I guess it's not clear to me we'd need to remove the alias. The usual policy (I think?) is that clang driver options don't disappear. It seems like a small piece of debt to maintain the extra alias in this case, and if it's kept, then users don't actually need to migrate. And then you can feel safe updating Darwin.cpp as well.

-W* options are different from regular driver options in that -Wunknown-unknown-unknown leads to a warning instead of an error, while a regular unrecognized driver option leads to an error.
We deprecate driver options and make uses of them emit warnings, and newer Clang generally emits more warnings. These would break -Werror users as well, but we still do them anyway if reasonable.

I understand that it is a small piece of debt, but my point is that we don't need the debt.

Aug 17 2023, 10:42 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D158137: Rename warn_drv_overriding_flag_option (-Woverriding-t-option) to warn_drv_overriding_flag_option (-Woverriding-option).

Can you explain the downside of leaving behind an alias?

Two minor ones. (a) Existing -Wno-overriding-t-option users will not notice that they need to migrate and (b) Clang has accrued tiny tech debt.
If we eventually remove -Wno-overriding-t-option for tidiness, we will have to break -Werror -Wno-overriding-t-option users.

Aug 17 2023, 6:20 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D155818: [CloneFunction][DebugInfo] Clone DISubprogram's local types.
Aug 17 2023, 5:50 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D158137: Rename warn_drv_overriding_flag_option (-Woverriding-t-option) to warn_drv_overriding_flag_option (-Woverriding-option).

Can you explain the downside of leaving behind an alias?

Aug 17 2023, 3:23 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D158137: Rename warn_drv_overriding_flag_option (-Woverriding-t-option) to warn_drv_overriding_flag_option (-Woverriding-option).

This seems to drop -Woverriding-t-option entirely. Could that break builds if someone has (e.g.) -Werror -Wno-overriding-t-option in their build settings?

Aug 17 2023, 9:34 AM · Restricted Project, Restricted Project

Aug 16 2023

dexonsmith accepted D158137: Rename warn_drv_overriding_flag_option (-Woverriding-t-option) to warn_drv_overriding_flag_option (-Woverriding-option).

Perhaps as a follow-up, rename warn_drv_overriding_flag_option to have “t” in it?

Aug 16 2023, 10:18 PM · Restricted Project, Restricted Project

Aug 15 2023

dexonsmith added a comment to D74094: Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas.

Nice!

Aug 15 2023, 4:53 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D155818: [CloneFunction][DebugInfo] Clone DISubprogram's local types.
Aug 15 2023, 8:39 AM · Restricted Project, Restricted Project

Aug 7 2023

dexonsmith added a comment to D89001: [clang] Don't look into <sysroot> for C++ headers if they are found alongside the toolchain.

SGTM.

Aug 7 2023, 8:28 AM · Restricted Project, Restricted Project
dexonsmith resigned from D157283: [clang] Match -isysroot behaviour with system compiler on Darwin.

This looks correct to me, but I'd rather have someone at Apple confirm. @ldionne (or @arphaman or @jroelofs), can you review and/or help to find the right person?

Aug 7 2023, 7:34 AM · Restricted Project, Restricted Project

Aug 6 2023

dexonsmith accepted D157011: [Clang][Tooling] Accept preprocessed input files.

LGTM.

Aug 6 2023, 2:08 PM · Restricted Project, Restricted Project

Aug 4 2023

dexonsmith added a comment to D89001: [clang] Don't look into <sysroot> for C++ headers if they are found alongside the toolchain.

I don't have access to rdar these days to look into the current state or to refresh my memory.

Aug 4 2023, 12:33 PM · Restricted Project, Restricted Project

Aug 3 2023

dexonsmith updated subscribers of D74094: Reapply: [IRGen] Emit lifetime intrinsics around temporary aggregate argument allocas.

Hi @erik.pilkington, I see this got reverted:

Aug 3 2023, 3:20 PM · Restricted Project, Restricted Project

Jul 28 2023

dexonsmith resigned from D156520: [TII] NFCI: Simplify the interface for isTriviallyReMaterializable.
Jul 28 2023, 5:43 AM · Restricted Project, Restricted Project

Jul 11 2023

dexonsmith added inline comments to D154503: [Sema] Fix handling of functions that hide classes.
Jul 11 2023, 7:28 AM · Restricted Project, Restricted Project

Jul 5 2023

dexonsmith accepted D154502: [AST] Fix bug in UnresolvedSet::erase of last element.

LGTM.

Jul 5 2023, 6:49 AM · Restricted Project, Restricted Project

Jun 5 2023

dexonsmith accepted D152151: AutoUpgrade: Fix crash when tbaa has an empty argument.

LGTM, thanks!

Jun 5 2023, 5:14 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D152151: AutoUpgrade: Fix crash when tbaa has an empty argument.

Should the verifier fail on this? If so, might be nice to make that happen before dropping it; if not, LGTM.

Jun 5 2023, 6:32 AM · Restricted Project, Restricted Project

May 31 2023

dexonsmith resigned from D151786: [Tooling] Remove unused function setRestoreWorkingDir.
May 31 2023, 6:16 AM · Restricted Project, Restricted Project

May 17 2023

dexonsmith added a comment to D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

If you want to ignore all the added includes and review the rest, it should be in a good state. I'm temporarily using this review to test for more missing includes via CI. I guess my local environment is different enough that it didn't find all of them.

May 17 2023, 10:18 AM · Restricted Project, Restricted Project

May 1 2023

dexonsmith updated subscribers of D103930: [clang][HeaderSearch] Fix implicit module when using header maps.

Friendly ping

@arphaman, @jansvoboda11, I have made the patch buildable on all platforms and all tests pass. There was also a small fix to the test (a temp path for the modules artefact) that could fix its run on some platforms. Could you look at it? Does it have any issues on your side?

May 1 2023, 2:55 PM · Restricted Project, Restricted Project

Apr 25 2023

dexonsmith accepted D149125: Remove code only needed to detect a pre-4.0 API break..

LGTM

Apr 25 2023, 7:22 AM · Restricted Project, Restricted Project

Apr 24 2023

dexonsmith accepted D149122: Remove code only needed to detect a pre-4.0 API break..

LGTM!

Apr 24 2023, 10:30 PM · Restricted Project, Restricted Project

Apr 17 2023

dexonsmith added a comment to D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

I'd be curious to see just how bad the perf is with implicitly-discovered-and-explicitly-built modules. Maybe it wouldn't be as bad as suspected. And will get faster with the scanning speedups @jansvoboda11 is working on.

  • Do we know how deep the build graph would be? (Would we get good parallelism when building?)

I made a rough graph that's not quite accurate but is probably close enough to answer that question. It's way more wide and shallow than I thought it would be.


It's not so much parallelism that we're worried about, it's more like launching clang ~200 times to build the algorithm module and later opening 200 pcm files.

Apr 17 2023, 2:23 PM · Restricted Project, Restricted Project
dexonsmith updated subscribers of D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

Discussed this some more with @ldionne, @Bigcheese, @vsapsai, @var-const. We don't love this patch for a few reasons.

  1. It looks arbitrary what got left in std and what got pulled out to top level.
  2. It looks arbitrary which private detail headers are in the new std_abc modules and which ones are top level modules.
  3. Figuring out where the private detail headers go manually is tricky, and possibly difficult to maintain. (I've rebased this a couple of times over the last few months and it keeps adding module cycles that have been tough to resolve. It might be less difficult if the cycles get resolved as the headers are changed, or it might just always be hard.)

The two ideas we had to improve the situation are these.

  1. Make a top level module for all of the headers. This is the simple approach and, if there aren't any include cycles, will be the easiest way to avoid module cycles. But it makes ~950 top level modules which looks kind of goofy and has a fair chance at sub-optimal performance.
Apr 17 2023, 12:57 PM · Restricted Project, Restricted Project
dexonsmith accepted D142318: [Support] Add PerThreadBumpPtrAllocator class..

Sounds good; happy for this to land while you continue working on how to do the checking.

Apr 17 2023, 11:32 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

This sounds okay to me, but I admit I don't know llvm::parallel well enough to understand the implications.

Apr 17 2023, 7:44 AM · Restricted Project, Restricted Project

Apr 15 2023

dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

I think this solves only part of the problem: it checks that the executor has already been created when getThreadIndex() is requested, but it does not check that the thread index is valid. If a thread was not created by ThreadPoolExecutor, it would have index zero, which clashes with the thread index of the main thread and Thread0. I thought we wanted to check that other threads are not used with getThreadIndex.

Checking ThreadPoolExecutor existence is still a useful check and it would be good to implement it. If we find a good way to check thread indexes, that would also be useful.

Yeah, seems like a good start for now. This would catch the case where someone is NOT using llvm::parallel at all, but has a bunch of threads, and is wrongly assuming this allocator is safe for concurrent use in general.

This check will help for pure users of getThreadIndex() but will not help users of PerThreadBumpPtrAllocator as it calls "detail::Executor::getDefaultExecutor()->getThreadsNum();" in the constructor. Thus any call to getThreadIndex() after PerThreadBumpPtrAllocator is created will have HasDefaultExecutor == true.
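
The limitation described here can be sketched with a toy model. All names below (`ToyRuntime`, `ToyPerThreadAllocator`, and their members) are hypothetical stand-ins, not the real llvm::parallel interfaces, and the thread count of 4 is an arbitrary assumption. The point is only that constructing the allocator itself flips the flag, so the check passes afterwards regardless of which thread asks:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the executor state behind llvm::parallel.
struct ToyRuntime {
  bool HasDefaultExecutor = false;
  std::size_t ThreadsNum = 4; // assumed pool size for the example

  // Stand-in for getDefaultExecutor()->getThreadsNum(): creating the
  // default executor sets the flag as a side effect.
  std::size_t getDefaultExecutorThreadsNum() {
    HasDefaultExecutor = true;
    return ThreadsNum;
  }

  // Stand-in for the proposed assertion inside getThreadIndex().
  bool threadIndexCheckPasses() const { return HasDefaultExecutor; }
};

// Hypothetical stand-in for PerThreadBumpPtrAllocator: its constructor
// queries the default executor for the number of threads.
struct ToyPerThreadAllocator {
  std::size_t NumOfAllocators;
  explicit ToyPerThreadAllocator(ToyRuntime &RT)
      : NumOfAllocators(RT.getDefaultExecutorThreadsNum()) {}
};

inline void demoFlagLimitation() {
  ToyRuntime RT;
  assert(!RT.threadIndexCheckPasses()); // pure getThreadIndex() user: caught
  ToyPerThreadAllocator Alloc(RT);
  assert(Alloc.NumOfAllocators == 4);
  assert(RT.threadIndexCheckPasses());  // allocator user: check always passes
}
```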

Apr 15 2023, 8:09 AM · Restricted Project, Restricted Project

Apr 14 2023

dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

I am OK with doing that as a separate patch right after the current patch. I just do not have a good idea for this at the moment.

WDYT of the idea above, to have a Boolean flag that checks whether getDefaultExecutor() has been called, and assert on that in getThreadIndex()?

I think this solves only part of the problem: it checks that the executor has already been created when getThreadIndex() is requested, but it does not check that the thread index is valid. If a thread was not created by ThreadPoolExecutor, it would have index zero, which clashes with the thread index of the main thread and Thread0. I thought we wanted to check that other threads are not used with getThreadIndex.

Checking ThreadPoolExecutor existence is still a useful check and it would be good to implement it. If we find a good way to check thread indexes, that would also be useful.

Apr 14 2023, 4:36 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

I am OK with doing that as a separate patch right after the current patch. I just do not have a good idea for this at the moment.

Apr 14 2023, 1:48 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

Seems like threads are assigned IDs from 1 in the ThreadPoolExecutor constructor via calls to work(). The main thread assigns threadIndex to 0 in the same place:

Aha, looks like I misread the code. The work() calls are coming from within a lambda that's executed by the first created thread. So, right now, the main thread has the same threadIndex as the first spawned thread.

(But if that's the case, doesn't that cause a problem for the allocator? Doesn't the allocator require that the main thread has a different ID from the worker threads?)

Apr 14 2023, 11:16 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

Seems like threads are assigned IDs from 1 in the ThreadPoolExecutor constructor via calls to work(). The main thread assigns threadIndex to 0 in the same place:

Apr 14 2023, 11:09 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

So far I suggest implementing the safety check as a separate patch, once we have a good solution for it.

Looks good to me.

I think if we don't add the check now it's unlikely to happen later.

Apr 14 2023, 10:42 AM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

So far I suggest implementing the safety check as a separate patch, once we have a good solution for it.

Looks good to me.

Apr 14 2023, 10:41 AM · Restricted Project, Restricted Project

Apr 13 2023

dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

BTW, if others feel strongly that such an assertion wouldn't be useful (say, maybe there's reason to believe that even unit tests wouldn't trigger it in practice due to @MaskRay's points?), happy to back away and let this land without it.

Apr 13 2023, 3:59 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

A possible solution might be initializing threadIndex to some unrelated value by default,
e.g. setting threadIndex to -1; threads created by ThreadPoolExecutor would have indexes in range 0 ... ThreadsNum.
It will trigger the assertion "assert(getThreadIndex() < NumOfAllocators);" for wrong threads inside PerThreadAllocator methods. Does it sound OK?
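
A minimal sketch of that sentinel idea, assuming an unsigned index so that -1 becomes the largest value and fails the range check. The names `ToyThreadIndex`, `indexIsUsable`, `checkOnPoolThread`, and `checkOnForeignThread` are invented for this illustration and are not the real llvm::parallel code:

```cpp
#include <cassert>
#include <limits>
#include <thread>

// Hypothetical sentinel: the "-1" from the comment, stored in an unsigned index.
constexpr unsigned InvalidThreadIndex = std::numeric_limits<unsigned>::max();

// Every thread starts without a valid index (toy stand-in for threadIndex).
thread_local unsigned ToyThreadIndex = InvalidThreadIndex;

// The condition the allocator methods would assert on.
inline bool indexIsUsable(unsigned NumOfAllocators) {
  return ToyThreadIndex < NumOfAllocators;
}

// Simulates a thread created by the pool: it is assigned Index before it runs.
inline bool checkOnPoolThread(unsigned Index, unsigned NumOfAllocators) {
  bool Usable = false;
  std::thread T([&] {
    ToyThreadIndex = Index;
    Usable = indexIsUsable(NumOfAllocators);
  });
  T.join(); // join before reading the result written by the thread
  return Usable;
}

// Simulates a thread created outside the pool: nobody assigns it an index.
inline bool checkOnForeignThread(unsigned NumOfAllocators) {
  bool Usable = true;
  std::thread T([&] { Usable = indexIsUsable(NumOfAllocators); });
  T.join();
  return Usable;
}

inline void demoSentinelIndex() {
  assert(checkOnPoolThread(0, 4));  // pool thread with a valid index
  assert(!checkOnForeignThread(4)); // foreign thread keeps the sentinel
}
```

With the sentinel as the default, a thread that never went through the pool no longer aliases index 0, so the range check can tell it apart.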

Apr 13 2023, 3:45 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D142318: [Support] Add PerThreadBumpPtrAllocator class..

I still have a general concern: this utility isn't safe to use in general LLVM library code, and while that's documented in the header, there's nothing enforcing that or checking for it. I think it'd be easy to get this wrong, and our existing test coverage would be unlikely to catch mistakes, but it could be a big problem for tools/libraries that have their own thread pools and depend on LLVM code.

Apr 13 2023, 1:47 PM · Restricted Project, Restricted Project

Mar 12 2023

dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
In D83906#4186887, @hoy wrote:
In D83906#4183453, @hoy wrote:

Wondering if we can come up with a way to tell the optimizer about that, e.g., through a new module flag. When it comes to LTO, the selection of linkonce_odr symbols should already have been done and the optimizer may be able to recompute the attributes based on pre-LTO attributes, or at least we can allow IPO to one module only, which should still do a better job than FE does?

I don't think there's much point in passing anything to LTO. There are very few linkonce_odr symbols in LTO, since LTO has the advantage of an export list from the link. Symbols not on the export list are internalized (they're given local linkage).

That sounds to me like an opportunity to get a broader IPO done precisely in the prelink optimizer, as long as we find a way to tell it the incoming IR has source fidelity. What do you think about the idea of introducing a module flag? Maybe it's worth discussing in the forum as a followup of introducing a cc1 flag for a stable IR gen.

Mar 12 2023, 7:08 PM · Restricted Project, Restricted Project
dexonsmith resigned from D145864: [CodeGen] Introduce threshold option to constrain stack slot sharing during stack coloring..

I'm not the right person to review this. @Gerolf, perhaps you can suggest an alternate?

Mar 12 2023, 9:14 AM · Restricted Project, Restricted Project

Mar 10 2023

dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
In D83906#4183453, @hoy wrote:

Wondering if we can come up with a way to tell the optimizer about that, e.g., through a new module flag. When it comes to LTO, the selection of linkonce_odr symbols should already have been done and the optimizer may be able to recompute the attributes based on pre-LTO attributes, or at least we can allow IPO to one module only, which should still do a better job than FE does?

Mar 10 2023, 7:40 AM · Restricted Project, Restricted Project

Mar 9 2023

dexonsmith updated subscribers of D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.

So your argument is that it would not be possible to recognize that we're doing such an optimization and mark the function as having had a possible semantics change?

Mar 9 2023, 2:37 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
In D83906#4182847, @hoy wrote:

As far as I know, the optimizer IPO pass that infers function attributes (i.e. InferFunctionAttrsPass) is placed at the very beginning of the optimization pipeline. Does this sound to you that the side effects computed for linkonce_odr functions there can be trusted by the rest of the pipeline?

Mar 9 2023, 2:12 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
  • At IRGen time, you know the LLVM attributes have not been adjusted after the optimizer refined the function's behaviour. It should be safe to have IPA peepholes, as long as IRGen's other peepholes don't refine behaviour and add attributes based on that.
  • In the optimizer, if you're looking at a de-refineable function then you don't know which attributes come directly from the source and which were implied by optimizer refinements. You can't trust you'll get the same function attributes at runtime.

Hmm. I see what you're saying, but it's an interesting question how it applies here. In principle, the optimizer should not be changing the observable semantics of functions, which certainly includes things like whether the function throws. Maybe the optimizer can only figure out that a function throws in one TU, but if it "figures that out" and then a function with supposedly the same semantics actually does throw — not just retains the static ability to throw on a path that happens not to be taken dynamically, but actually throws at runtime — then arguably something has gone badly wrong.

Mar 9 2023, 2:01 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
In D83906#4182428, @hoy wrote:

In C++, you get linkonce_odr all over the place. It's basically all functions that are defined in C++ headers that are available for inlining.
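
As a small illustration (my own example, not from the thread), these are the usual kinds of header-style definitions in question. With Clang on common targets, the first three are typically emitted with linkonce_odr linkage when they are used, while the last gets ordinary external linkage; one way to check on a given setup is to inspect the output of `clang++ -S -emit-llvm`.

```cpp
#include <cassert>

// Explicitly inline function: may be defined in every TU that includes it.
inline int twice(int X) { return 2 * X; }

// Member function defined inside the class body: implicitly inline.
struct Counter {
  int Value = 0;
  int bump() { return ++Value; }
};

// Function template: each implicit instantiation can be emitted in every
// TU that uses it, and the linker keeps one copy.
template <class T> T square(T X) { return X * X; }

// For contrast: a non-inline function with a single external definition.
int plainExternal(int X) { return X + 1; }

inline void demoHeaderStyleDefinitions() {
  Counter C;
  assert(twice(3) == 6);
  assert(C.bump() == 1);
  assert(square(4) == 16);
  assert(plainExternal(1) == 2);
}
```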

On the other hand, the frontend knows the token sequence from the source language. It knows whether function B is inherently nounwind based on its ODR token sequence; in which case, it's safe to use the attribute for an IPA peephole.

Thanks for the detailed explanation again! As you pointed out previously, linkonce_odr is something the front end can optimize. I'm wondering why the front end is confident that the linker would not replace the current definition with something else.

Mar 9 2023, 12:20 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.
In D83906#4181981, @hoy wrote:

That said, the LLVM optimizer does not strictly subsume the front-end because of how it fails to handle linkonce_odr functions as in https://reviews.llvm.org/D18634. I'm wondering how common the linkonce_odr linkage is for C++. In @wlei's example, none of the functions there is linkonce_odr. Is there a particular source-level annotation that specifies functions to be linkonce_odr?

Mar 9 2023, 11:21 AM · Restricted Project, Restricted Project
dexonsmith added inline comments to D130303: Fix include order in CXType.cpp.
Mar 9 2023, 2:16 AM · Restricted Project, Restricted Project

Mar 8 2023

dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.

Oh, de-refining is pretty nifty / evil. This patch has background:
https://reviews.llvm.org/D18634

Mar 8 2023, 11:06 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind.

Hi @ahatanak

We recently hit an issue of inconsistent codegen related to this optimization. In one build, the Clang frontend generates different LLVM IR for the same function that originally comes from one header file. It turned out this optimization gives different results for different function definition orders, which is naturally unstable.

See these two repro programs:

p1.cpp: https://godbolt.org/z/bavTYEG1x

void foo() {};
void bar() noexcept {foo();};


p2.cpp: https://godbolt.org/z/zfsnzPrE6

void foo();
void bar() noexcept {foo();};
void foo(){};

The codegen of bar differs between the two: in p2.cpp, the definition of the callee (foo) comes after the caller (bar), so foo is not yet known to be nounwind when bar is emitted, and the invoke is still generated.

This inconsistency affected AutoFDO: part of our work assigns consecutive numeric IDs to the BBs of the CFG, and the unstable CFGs cause the BB IDs to mismatch, so a lot of samples are lost.

Would like to hear your feedback. Wondering if the FE can handle this perfectly or perhaps we can just leave it to the BE. Thank you in advance!

cc @hoy @modimo @wenlei

Mar 8 2023, 5:43 PM · Restricted Project, Restricted Project

Mar 7 2023

dexonsmith added a comment to D130303: Fix include order in CXType.cpp.

@dexonsmith can you weigh in?

Mar 7 2023, 2:47 PM · Restricted Project, Restricted Project

Mar 6 2023

dexonsmith resigned from D144503: [ADT] Allow `llvm::enumerate` to enumerate over multiple ranges.
Mar 6 2023, 1:58 PM · Restricted Project, Restricted Project, Restricted Project
dexonsmith accepted D130303: Fix include order in CXType.cpp.

Pinging alternative reviewer +@dexonsmith for a libclang API addition

Looks reasonable to me -- this only changes behaviour of the existing API when there was corruption before -- but if the goal is to get a vendor of libclang-as-a-stable-API to sign off, I can't help.

@arphaman, if you're busy, is there someone else that could take a quick look?

Mar 6 2023, 11:28 AM · Restricted Project, Restricted Project
dexonsmith accepted D145388: [ADT][NFC] Use declval to suppress warning for nullptr use..

LGTM too.

Mar 6 2023, 11:23 AM · Restricted Project, Restricted Project

Mar 5 2023

dexonsmith accepted D145317: [ValueMapper] Preserve poison types during value mapping.

LGTM.

Mar 5 2023, 9:06 AM · Restricted Project, Restricted Project

Feb 24 2023

dexonsmith added a comment to D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

Sorry, forgot to ask the first time. This change is for Clang modules and not for C++20 modules, right? Asking because I believe C++20 modules have standard-enforced module names.

Yes, as far as I know the module map is only used for Clang modules.

Feb 24 2023, 10:16 PM · Restricted Project, Restricted Project

Feb 20 2023

dexonsmith added a comment to D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

Great to see this making progress; looking forward to seeing the related workarounds removed from the Darwin SDKs.

What's the performance impact (for scanning/building/importing modules) of turning these all into top-level modules? If it's a significant regression, maybe there's a way to isolate the headers that are involved in the cycle with the SDK and the Clang headers from those that aren't, while still using only a handful of modules. And if it's a performance improvement in some cases (due to better parallelism or building less stuff), could be something to highlight.

We haven't tested the performance yet. We could try to optimize: the C headers that aren't included by the other ones could maybe stay in std. That's kind of risky though because we can't really guarantee what the include_next'ed headers will include. I think we're better off keeping the C headers in their own modules across the board (in clang and also the Apple headers).

Feb 20 2023, 6:16 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D144322: [libc++][Modules] Make top level modules for all C++ headers with OS/clang versions.

Great to see this making progress; looking forward to seeing the related workarounds removed from the Darwin SDKs.

Feb 20 2023, 5:44 PM · Restricted Project, Restricted Project

Jan 26 2023

dexonsmith accepted D142651: [llvm][docs] Update old metadata syntax in examples.

LGTM

Jan 26 2023, 10:10 AM · Restricted Project, Restricted Project

Jan 22 2023

dexonsmith updated subscribers of D142318: [Support] Add PerThreadBumpPtrAllocator class..

It's nice how simple this is!

Jan 22 2023, 4:34 PM · Restricted Project, Restricted Project

Jan 6 2023

dexonsmith added inline comments to D133715: [ADT] Add TrieRawHashMap.
Jan 6 2023, 6:06 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

Maybe RawHashTrieMap? It reads better when Raw is in the front, and it contains hash-trie and trie-map, which are both terms describing data structures similar to this but this is much simpler, thus raw.

Jan 6 2023, 5:51 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

I guess one concern with TrieHashMap is that if this is the lower level implementation, and someone might implement a more map-like API on top of this, we might not want to take the "better" name for the data structure that'll be less directly used?

Could prefix with "Raw" or maybe TrieRawHashMap? (since it's the hashing part that's particularly "raw" - relying on the hash being unique, etc)

Jan 6 2023, 2:58 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133713: [Support] Introduce ThreadSafeAllocator.

The reason I don't want to have the CAS blocked by a better allocator is that I need to write and test it, also figure out if it needs to have any code sharing with regular BumpPtrAllocator.

Jan 6 2023, 2:56 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.

+1 for fast concurrent ThreadSafeBumpPtrAllocator.

What do you think about the following alternative implementation?

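// Index of the current thread inside ThreadPoolExecutor.Threads; the
// executor would set it when it starts each worker thread.
static thread_local size_t ThreadIdx = 0;

size_t getThreadIdx() {
  return ThreadIdx;
}

class ThreadSafeBumpPtrAllocator {
public:
  ThreadSafeBumpPtrAllocator() {
    // One allocator per worker thread. Assumes the pool is sized by the
    // global ThreadPoolStrategy, parallel::strategy.
    size_t ThreadsNum = parallel::strategy.compute_thread_count();
    Allocators.resize(ThreadsNum);
  }

  void *Allocate(size_t Size, size_t Alignment) {
    // Each thread only ever touches its own allocator, so no lock is taken.
    size_t AllocatorIdx = getThreadIdx();
    return Allocators[AllocatorIdx].Allocate(Size, Alignment);
  }

private:
  std::vector<BumpPtrAllocator> Allocators;
};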

This implementation relies on the fact that ThreadPoolExecutor creates a fixed number
of threads (ThreadPoolStrategy.compute_thread_count()) and keeps them until it is destroyed.
ThreadPoolExecutor can initialise the thread-local field ThreadIdx to the proper thread index,
so getThreadIdx() returns the index of the thread inside ThreadPoolExecutor.Threads.
ThreadSafeBumpPtrAllocator keeps a separate allocator for each thread, so each thread
always uses its own allocator. There is no need for locks or compare-and-swap operations, and there are no races.
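
The per-thread idea above can be tried outside LLVM with a standard-library-only sketch. Everything here is hypothetical and for illustration: ToyBump stands in for BumpPtrAllocator (it ignores alignment and oversized requests), PerThreadArena plays the role of the proposed allocator, and runWorkers does by hand what the executor would do when it assigns each worker its index.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for BumpPtrAllocator: hands out bytes from
// fixed-size slabs and frees nothing until it is destroyed.
class ToyBump {
  static constexpr std::size_t SlabSize = 4096;
  std::vector<std::unique_ptr<char[]>> Slabs;
  std::size_t Used = SlabSize; // forces a slab on the first allocation

public:
  void *Allocate(std::size_t Num) {
    assert(Num <= SlabSize && "toy arena: no oversized allocations");
    if (Used + Num > SlabSize) {
      Slabs.push_back(std::make_unique<char[]>(SlabSize));
      Used = 0;
    }
    void *P = Slabs.back().get() + Used;
    Used += Num;
    return P;
  }
  std::size_t numSlabs() const { return Slabs.size(); }
};

// Index of the current worker; each worker sets it once at startup.
static thread_local std::size_t ThreadIdx = 0;

class PerThreadArena {
  std::vector<ToyBump> Arenas; // one arena per worker, never shared

public:
  explicit PerThreadArena(std::size_t NumThreads) : Arenas(NumThreads) {}

  void *Allocate(std::size_t Num) {
    assert(ThreadIdx < Arenas.size() && "caller is not a pool worker");
    return Arenas[ThreadIdx].Allocate(Num); // no lock, no atomics
  }
  std::size_t numSlabs(std::size_t Idx) const { return Arenas[Idx].numSlabs(); }
};

// Starts NumThreads workers; each one performs AllocsPerThread allocations
// of Size bytes from its own arena.
inline void runWorkers(PerThreadArena &A, std::size_t NumThreads,
                       std::size_t AllocsPerThread, std::size_t Size) {
  std::vector<std::thread> Workers;
  for (std::size_t I = 0; I < NumThreads; ++I)
    Workers.emplace_back([&A, I, AllocsPerThread, Size] {
      ThreadIdx = I; // what the executor would do when spawning the thread
      for (std::size_t J = 0; J < AllocsPerThread; ++J)
        *static_cast<char *>(A.Allocate(Size)) = 'x';
    });
  for (std::thread &W : Workers)
    W.join();
}
```

The design only holds while every allocating thread has a unique index below the arena count; a thread outside the pool would need its own fallback.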

Jan 6 2023, 11:05 AM · Restricted Project, Restricted Project

Jan 5 2023

dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage since slot usage in the deep nodes is very low.

Could ThreadSafeBumpPtrAllocator be made lock-free? I think at least it would be possible to implement one that only locked when a new allocation is needed, instead of every time the ptr is bumped as now. (I’ll think about it a bit.)

Note that in the CAS use case it’s ideally true that most insertions are duplicates and don’t need to call the allocator at all. This is why we’ve been able to get away with a lock on each allocation.
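
One way to lock only when a new slab is needed, rather than on every bump, might look like the sketch below. This is not the allocator from the patch: MostlyLockFreeBump, its slab size, and the countDistinct helper are hypothetical, and the sketch simply wastes the tail of a slab that cannot fit a request.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical sketch: the pointer bump is a lock-free fetch_add; the mutex
// is taken only when the current slab is exhausted.
class MostlyLockFreeBump {
  struct Slab {
    std::unique_ptr<char[]> Mem;
    std::size_t Size;
    std::atomic<std::size_t> Used{0};
    explicit Slab(std::size_t S) : Mem(new char[S]), Size(S) {}
  };

  static constexpr std::size_t SlabSize = 1 << 16;
  std::atomic<Slab *> Current{nullptr};
  std::mutex SlabLock;                      // guards Slabs and slab creation
  std::vector<std::unique_ptr<Slab>> Slabs; // keeps every slab alive

public:
  void *Allocate(std::size_t Num) {
    // Round up so every returned pointer stays suitably aligned.
    constexpr std::size_t Align = alignof(std::max_align_t);
    Num = (Num + Align - 1) / Align * Align;
    for (;;) {
      Slab *S = Current.load(std::memory_order_acquire);
      if (S) {
        // Fast path: claim [Off, Off + Num) without taking a lock.
        std::size_t Off = S->Used.fetch_add(Num, std::memory_order_relaxed);
        if (Off <= S->Size && Num <= S->Size - Off)
          return S->Mem.get() + Off;
      }
      // Slow path: install a new slab unless another thread already did.
      std::lock_guard<std::mutex> Guard(SlabLock);
      if (Current.load(std::memory_order_relaxed) == S) {
        std::size_t Want = SlabSize;
        if (Num > Want)
          Want = Num; // oversized request gets a slab of its own size
        Slabs.push_back(std::make_unique<Slab>(Want));
        Current.store(Slabs.back().get(), std::memory_order_release);
      }
    }
  }

  std::size_t numSlabs() {
    std::lock_guard<std::mutex> Guard(SlabLock);
    return Slabs.size();
  }
};

// Usage helper: NumThreads threads allocate 24 bytes PerThread times each;
// returns the number of distinct pointers that were handed out.
inline std::size_t countDistinct(MostlyLockFreeBump &A, unsigned NumThreads,
                                 unsigned PerThread) {
  std::vector<std::vector<void *>> Got(NumThreads);
  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < NumThreads; ++T)
    Workers.emplace_back([&A, &Got, T, PerThread] {
      for (unsigned I = 0; I < PerThread; ++I)
        Got[T].push_back(A.Allocate(24));
    });
  for (std::thread &W : Workers)
    W.join();
  std::set<void *> Unique;
  for (const auto &V : Got)
    Unique.insert(V.begin(), V.end());
  return Unique.size();
}
```

Slabs are never freed before the allocator is destroyed, which is what makes it safe for a thread to keep using a stale Current pointer for one more attempt.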

Jan 5 2023, 2:45 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

Ping. All feedback has been addressed.

Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548, as it shows bad scaling during parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future since our use case doesn't expect heavy parallel insertions. However, it is quite obvious that we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be to start with something like 10 bits, then 4 bits, 2 bits, and 1 bit. Shrinking the number of bits can lead to better memory usage since slot usage in the deep nodes is very low.
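
To make the RootBits/SubTrieBits trade-off concrete, here is a small sketch of how a hash could be carved into one slot index per trie level. It assumes a 64-bit hash consumed from the most significant end; levelIndices is a hypothetical helper, not part of the patch, and the real data structure works on longer hashes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: splits a 64-bit hash into slot indices, one per trie
// level. The root consumes RootBits, each sub-trie consumes SubTrieBits, and
// the deepest level takes whatever bits are left over.
inline std::vector<std::uint32_t> levelIndices(std::uint64_t Hash,
                                               unsigned RootBits,
                                               unsigned SubTrieBits) {
  assert(RootBits >= 1 && RootBits <= 32);
  assert(SubTrieBits >= 1 && SubTrieBits <= 32);
  std::vector<std::uint32_t> Indices;
  unsigned Remaining = 64; // bits of the hash not yet consumed
  unsigned Width = RootBits;
  while (Remaining > 0) {
    if (Width > Remaining)
      Width = Remaining; // the last level may be narrower than the others
    // Take the top Width bits of the unconsumed part of the hash.
    std::uint64_t Mask = (std::uint64_t(1) << Width) - 1;
    Indices.push_back(
        static_cast<std::uint32_t>((Hash >> (Remaining - Width)) & Mask));
    Remaining -= Width;
    Width = SubTrieBits;
  }
  return Indices;
}
```

With RootBits = 10 the root has 1024 slots, so concurrent insertions of well-distributed hashes rarely land on the same root slot; with SubTrieBits = 4 the 54 remaining bits give thirteen 16-slot levels plus a final 2-bit level.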

Jan 5 2023, 1:48 PM · Restricted Project, Restricted Project

Jan 3 2023

dexonsmith accepted D140890: llvm-reduce: Reduce individual operands of named metadata.

LGTM with one bit inline.

Jan 3 2023, 7:32 AM · Restricted Project, Restricted Project

Dec 30 2022

dexonsmith added a comment to D138760: BLAKE3: do not try to use neon on big-endian aarch64.

Thanks for the hint / redirect.
I've submitted a pull request at https://github.com/BLAKE3-team/BLAKE3/pull/280

Thank you! If we take the patch here, we may risk losing the change when someone upgrades BLAKE3. So it's better to ensure that it works in the upstream...

Dec 30 2022, 11:15 AM · Restricted Project, Restricted Project

Dec 29 2022

dexonsmith updated subscribers of D132455: [ADT] add ConcurrentHashtable class..

@dexonsmith & co working on the CAS have also proposed a thread safe hash table of sorts ( https://reviews.llvm.org/D133715 )- it's a bit more esoteric/specialized, but I wonder if the use cases overlap enough to be able to unify them?

Dec 29 2022, 8:36 AM · Restricted Project, Restricted Project

Dec 14 2022

dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)

Dec 14 2022, 1:36 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.
  • a test support friend class that can inspect/test the trie layout without relying on stringification;

This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?

IMO the layout is more complex than a flat array with quadratic probing. The logic for equal and/or nearly-equal hashes is a bit subtle, and there were hard-to-reason-about bugs originally. The layout-stringification was necessary to understand what was going wrong, and the tests that use it help ensure future refactorings don't get it wrong.

I don't remember the bugs, but two examples of subtleties:

  • On an exact match you don't want to "sink" the existing entry down a level to a new sub-trie (you need to detect "exact match" before sinking). Getting this wrong will affect performance but not otherwise be user-visible.
  • The deepest sub-trie might be a different/smaller size than the others because there are only a few bits left-over, and the handling needs to be right. It's simpler to check for correct layout directly than to guess about what user-visible effects there might be for errors.

I'd be a bit uneasy with the layout tests being dropped altogether.

Maybe an alternative to testing the layout directly would be to add a verification member function that iterated through the data structure and ensured everything was self-consistent (else crash? else return false?). Then the tests could call the member function after a series of insertions that might trigger a "bad" layout.

Fair enough - if it's sufficient to have a verify operation (maybe "assertValid" - so, yeah, crash when not valid) I'd go with that, but given the argument you've made, if you think verifying the specific structure is significantly more valuable than that, I'd be OK with some private/test-friended introspection API.
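
As a rough illustration of the "verify" idea discussed above, here is a toy trie over 16-bit hashes with 4 bits per level. All names are hypothetical and the layout is far simpler than the one in the patch. The verify member checks two invariants: every leaf sits under the path its hash prescribes, and every sub-trie holds at least two entries, which would catch an exact match being sunk by mistake.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Toy trie over 16-bit hashes, 4 bits per level (all names hypothetical).
class ToyHashTrie {
  struct Node {
    bool IsLeaf = false;
    std::uint16_t Hash = 0;                      // meaningful only for leaves
    std::array<std::unique_ptr<Node>, 16> Slots; // used only by sub-tries
  };
  Node Root;

  static unsigned slotFor(std::uint16_t Hash, unsigned Depth) {
    return (Hash >> (12 - 4 * Depth)) & 0xF;
  }

  // Returns the number of leaves below N, or -1 on an inconsistency.
  static int check(const Node &N, unsigned Depth, std::uint16_t Prefix) {
    int Leaves = 0;
    for (unsigned I = 0; I < 16; ++I) {
      const Node *Child = N.Slots[I].get();
      if (!Child)
        continue;
      unsigned Shift = 12 - 4 * Depth;
      auto ChildPrefix = static_cast<std::uint16_t>(Prefix | (I << Shift));
      if (Child->IsLeaf) {
        // The leaf's leading bits must match the path that leads to it.
        auto Mask = static_cast<std::uint16_t>(0xFFFFu << Shift);
        if ((Child->Hash & Mask) != ChildPrefix)
          return -1;
        ++Leaves;
      } else {
        int Below = check(*Child, Depth + 1, ChildPrefix);
        if (Below < 2)
          return -1; // a sub-trie must hold at least two entries
        Leaves += Below;
      }
    }
    return Leaves;
  }

public:
  void insert(std::uint16_t Hash) {
    Node *N = &Root;
    for (unsigned Depth = 0; Depth < 4; ++Depth) {
      std::unique_ptr<Node> &Slot = N->Slots[slotFor(Hash, Depth)];
      if (!Slot) {
        Slot = std::make_unique<Node>();
        Slot->IsLeaf = true;
        Slot->Hash = Hash;
        return;
      }
      if (Slot->IsLeaf) {
        if (Slot->Hash == Hash)
          return; // exact match: detect it before sinking anything
        // Sink the existing leaf one level down into a fresh sub-trie.
        std::unique_ptr<Node> Old = std::move(Slot);
        Slot = std::make_unique<Node>();
        Slot->Slots[slotFor(Old->Hash, Depth + 1)] = std::move(Old);
      }
      N = Slot.get();
    }
  }

  bool verify() const { return check(Root, 0, 0) >= 0; }
};
```

A test would call verify() after a series of insertions chosen to stress near-equal hashes; an assertValid variant would assert on the same result instead of returning it.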

Dec 14 2022, 12:15 PM · Restricted Project, Restricted Project

Dec 13 2022

dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.
  • a test support friend class that can inspect/test the trie layout without relying on stringification;

This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?

Dec 13 2022, 4:12 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.

  • Maybe it would make sense to lift up? Name could be PerfectHashSet
  • You might want to build a map that also handles the hashing; name could potentially be PerfectHashMap

Maybe this could be PerfectHashSetImpl?

  • Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?
  • ... but can be used directly if you want to manage the hashing yourself?

I take it the current use cases you have in mind/lined up in the CAS use this directly? Maybe a Raw prefix? RawPerfectHashSet, RawConcurrentHashSet, etc... some combination/choice of those sort of things?

Dec 13 2022, 3:17 PM · Restricted Project, Restricted Project
dexonsmith added a comment to D133715: [ADT] Add TrieRawHashMap.

Is it a set or a map?

Dec 13 2022, 2:34 PM · Restricted Project, Restricted Project
dexonsmith updated subscribers of D139163: Utils: Add utility pass to lower ifuncs.

To answer your questions in the comments about what to do about resolvers with arguments: At least glibc always calls ifunc resolvers without any arguments. It just reads the address of the resolver function from the ELF file, casts it to a pointer to a void-returning function with no arguments and calls it.

So, I agree: Resolvers with arguments should not be allowed.

Dec 13 2022, 8:59 AM · Restricted Project, Restricted Project

Dec 10 2022

dexonsmith added a comment to D139767: [DFAPacketizer] Move DefaultVLIWScheduler class declaration to header file.

I don't know much about this code; resigning as reviewer. But I took a quick look. I wonder if this is the right fix, or whether VLIWPacketizerList::VLIWScheduler's type should be ScheduleDAGInstrs, and downcast when necessary on use, allowing DefaultVLIWScheduler to stay private. (Up to others to sort out!)

Thanks @dexonsmith for the suggestion. But I think even if we want to downcast (dynamic_cast), the declaration should be available to the derived class, which is not the case in the current implementation.
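
The constraint behind this exchange can be shown with a small standalone sketch. The class names are hypothetical stand-ins, not the real LLVM classes: a member is stored through its base type and downcast on use, which only compiles where the derived class is fully defined.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real LLVM classes.
struct BaseScheduler {
  virtual ~BaseScheduler() = default; // polymorphic, so dynamic_cast works
};

// The full definition must be visible wherever the downcast happens; a
// forward declaration ("struct DerivedScheduler;") alone would not compile.
struct DerivedScheduler : BaseScheduler {
  int Extra = 42;
};

struct ListOwner {
  BaseScheduler *Scheduler; // stored through the base type

  // Downcast on use; yields 0 when the object is not a DerivedScheduler.
  int extraOrZero() const {
    if (auto *D = dynamic_cast<DerivedScheduler *>(Scheduler))
      return D->Extra;
    return 0;
  }
};
```

So keeping the derived class private to one source file works only if every downcast also lives in that file.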

Dec 10 2022, 10:12 PM · Restricted Project, Restricted Project
dexonsmith resigned from D139767: [DFAPacketizer] Move DefaultVLIWScheduler class declaration to header file.

I don't know much about this code; resigning as reviewer. But I took a quick look. I wonder if this is the right fix, or whether VLIWPacketizerList::VLIWScheduler's type should be ScheduleDAGInstrs, and downcast when necessary on use, allowing DefaultVLIWScheduler to stay private. (Up to others to sort out!)

Dec 10 2022, 9:38 AM · Restricted Project, Restricted Project

Dec 8 2022

dexonsmith added inline comments to D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.
Dec 8 2022, 4:39 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.
Dec 8 2022, 3:21 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.
Dec 8 2022, 2:51 PM · Restricted Project, Restricted Project
dexonsmith added inline comments to D133716: [CAS] Add LLVMCAS library with InMemoryCAS implementation.
Dec 8 2022, 1:48 PM · Restricted Project, Restricted Project