LGTM! Thanks.
Tue, Sep 19
Thu, Sep 14
In D74094#4645562, @nickdesaulniers wrote:I don't yet fully comprehend what's going wrong, and probably need to familiarize myself with the language rules around auto's type deduction.
Tue, Sep 12
In D61878#4644316, @ldionne wrote:[Github PR transition cleanup]
This patch doesn't seem correct to me: unordered containers are not ordered, so I don't think we can do a linear walk of __first1 and __first2, compare elements side by side and draw any conclusion from that. Abandoning.
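(For concreteness, a minimal standalone illustration of the hazard, not code from the patch: two unordered containers can compare equal while iterating in different orders, so a positional walk can report a spurious mismatch.)

    #include <cassert>
    #include <unordered_set>

    int main() {
      std::unordered_multiset<int> a{1, 2, 3};
      std::unordered_multiset<int> b{3, 2, 1};  // same elements, different insertion order
      assert(a == b);  // operator== matches elements regardless of bucket order;
                       // a side-by-side walk of a.begin()/b.begin() could still
                       // see different elements at the same position.
      return 0;
    }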
Fri, Sep 8
Wed, Sep 6
I looked at the documentation for User::dropAllReferences(), which this overrides. That has strengthened my opinion so I'm requesting changes.
Tue, Sep 5
In D154119#4638649, @oskarwirga wrote:ping
Mon, Sep 4
In D156618#4636483, @taolq wrote:In D156618#4632192, @dexonsmith wrote:In D156618#4632179, @dexonsmith wrote:Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.
Can we instead somehow enforce that delete is always called right after? Perhaps add an assertion that this is so?
Or, looking back at the motivation (calling deleteBody()), can we have two entry points?
- One prepares for deletion (99% of callers), skips CPN::get()
- One makes it a prototype (deleteBody), calls CPN::get()
Thanks for the comment. I think it is a little redundant: since we already have a destructor to release the memory, another entry point seems redundant.
So I'd like to keep only the prototype version.
Thu, Aug 31
In D156618#4632179, @dexonsmith wrote:Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.
Can we instead somehow enforce that delete is always called right after? Perhaps add an assertion that this is so?
Seems unfortunate to add calls to ConstantPointerNull::get() when all of the callers are about to call delete.
Wed, Aug 30
Tue, Aug 29
Mon, Aug 28
Now that the behaviour change is understood, maybe it'd be useful to split the patch in two:
- First, this patch, plus a call to ASTReader::getInputFile() only for its side effects, to make this patch actually NFC.
- Second, committed a few days later, a patch that removes the call and adds a test confirming -I is no longer implied by -fembed-all-input-files.
Sorry, Phabricator lost my comment somehow. Adding back a version of it now.
Aug 22 2023
Aug 19 2023
Aug 17 2023
In D158137#4597565, @MaskRay wrote:In D158137#4597491, @dexonsmith wrote:In D158137#4597009, @MaskRay wrote:In D158137#4596948, @dexonsmith wrote:Can you explain the downside of leaving behind an alias?
Two minor ones. (a) Existing -Wno-overriding-t-option users will not notice that they need to migrate, and (b) Clang has accrued a tiny bit of tech debt.
If we eventually remove -Wno-overriding-t-option for tidiness, we will have to break -Werror -Wno-overriding-t-option users.
I guess it's not clear to me we'd need to remove the alias. The usual policy (I think?) is that clang driver options don't disappear. It seems like a small piece of debt to maintain the extra alias in this case, and if it's kept, then users don't actually need to migrate. And then you can feel safe updating Darwin.cpp as well.
-W* options are different from regular driver options in that -Wunknown-unknown-unknown leads to a warning instead of an error, while a regular unrecognized driver option leads to an error.
We deprecate driver options and turn uses of them into warnings, and newer Clang generally emits more warnings. These would break -Werror users as well, but we still do them anyway when reasonable.
I understand that it is a small piece of debt, but my point is that we don't need the debt.
In D158137#4597009, @MaskRay wrote:In D158137#4596948, @dexonsmith wrote:Can you explain the downside of leaving behind an alias?
Two minor ones. (a) Existing -Wno-overriding-t-option users will not notice that they need to migrate, and (b) Clang has accrued a tiny bit of tech debt.
If we eventually remove -Wno-overriding-t-option for tidiness, we will have to break -Werror -Wno-overriding-t-option users.
Can you explain the downside of leaving behind an alias?
This seems to drop -Woverriding-t-option entirely. Could that break builds if someone has (e.g.) -Werror -Wno-overriding-t-option in their build settings?
Aug 16 2023
Perhaps as a follow-up, rename warn_drv_overriding_flag_option to have “t” in it?
Aug 15 2023
Nice!
Aug 7 2023
SGTM.
Aug 6 2023
LGTM.
Aug 4 2023
I don't have access to rdar these days to look into the current state or to refresh my memory.
Aug 3 2023
In D74094#4554327, @foad wrote:Hi @erik.pilkington, I see this got reverted:
Jul 28 2023
Jul 11 2023
Jul 5 2023
LGTM.
Jun 5 2023
LGTM, thanks!
Should the verifier fail on this? If so, might be nice to make that happen before dropping it; if not, LGTM.
May 31 2023
May 17 2023
In D144322#4344301, @iana wrote:If you want to ignore all the added includes and review the rest, it should be in a good state. I'm temporarily using this review to test for more missing includes via CI. I guess my local environment is different enough that it didn't find all of them.
May 1 2023
In D103930#4310061, @ivanmurashko wrote:Friendly ping
@arphaman, @jansvoboda11, I have made the patch buildable on all platforms, and all tests pass. There was also a small fix (temp path for the modules artefact) in the test that should fix its run on some platforms. Could you look at it? Does it have any issues on your side?
Apr 25 2023
LGTM
Apr 24 2023
LGTM!
Apr 17 2023
In D144322#4275378, @iana wrote:In D144322#4275247, @dexonsmith wrote:I'd be curious to see just how bad the perf is with implicitly-discovered-and-explicitly-built modules. Maybe it wouldn't be as bad as suspected. And will get faster with the scanning speedups @jansvoboda11 is working on.
- Do we know how deep the build graph would be? (Would we get good parallelism when building?)
I made a rough graph that's not quite accurate but is probably close enough to answer that question. It's way more wide and shallow than I thought it would be.
Attachment: libcxx-dependencies.gv.pdf (1 MB)
It's not so much parallelism that we're worried about, it's more like launching clang ~200 times to build the algorithm module and later opening 200 pcm files.
In D144322#4274873, @iana wrote:Discussed this some more with @ldionne, @Bigcheese, @vsapsai, @var-const. We don't love this patch for a few reasons.
- It looks arbitrary what got left in std and what got pulled out to top level.
- It looks arbitrary which private detail headers are in the new std_abc modules and which ones are top level modules.
- Figuring out where the private detail headers go manually is tricky, and possibly difficult to maintain. (I've rebased this a couple of times over the last few months and it keeps adding module cycles that have been tough to resolve. It might be less difficult if the cycles get resolved as the headers are changed, or it might just always be hard.)
The two ideas we had to improve the situation are these.
- Make a top level module for all of the headers. This is the simple approach and, if there aren't any include cycles, will be the easiest way to avoid module cycles. But it makes ~950 top level modules which looks kind of goofy and has a fair chance at sub-optimal performance.
Sounds good; happy for this to land while you continue working on how to do the checking.
This sounds okay to me, but I admit I don't know llvm::parallel well enough to understand the implications.
Apr 15 2023
In D142318#4270943, @avl wrote:In D142318#4270235, @dexonsmith wrote:In D142318#4269744, @avl wrote:I think this solves only part of the problem: it checks that the executor has already been created when getThreadIndex() is requested, but it does not check that the thread index is valid. If a thread was created not by ThreadPoolExecutor, it would have index zero, which clashes with the thread index of the main thread and Thread0. I thought we wanted to check that other threads were not used with getThreadIndex.
Checking ThreadPoolExecutor existence is still a useful check, and it would be good to implement it. If we find a good way to check thread indexes, it would also be useful.
Yeah, seems like a good start for now. This would catch the case where someone is NOT using llvm::parallel at all, but has a bunch of threads, and is wrongly assuming this allocator is safe for concurrent use in general.
This check will help for pure users of getThreadIndex() but will not help users of PerThreadBumpPtrAllocator as it calls "detail::Executor::getDefaultExecutor()->getThreadsNum();" in the constructor. Thus any call to getThreadIndex() after PerThreadBumpPtrAllocator is created will have HasDefaultExecutor == true.
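(A tiny sketch of why the proposed flag ends up vacuously true, with hypothetical names standing in for the LLVM internals:)

    #include <cassert>
    #include <cstddef>

    static bool HasDefaultExecutor = false;  // would be set by getDefaultExecutor()

    static size_t getDefaultExecutorThreadsNum() {
      HasDefaultExecutor = true;  // lazily "creates" the executor
      return 8;
    }

    struct PerThreadAllocatorSketch {
      // Constructing the allocator already touches the default executor...
      size_t NumOfAllocators = getDefaultExecutorThreadsNum();
    };

    static size_t getThreadIndexChecked() {
      // ...so after that, this assertion passes even on a thread the
      // executor never created (which would wrongly get index 0).
      assert(HasDefaultExecutor && "executor not initialized");
      return 0;
    }

    int main() {
      PerThreadAllocatorSketch A;
      (void)A;
      return static_cast<int>(getThreadIndexChecked());  // assertion is satisfied vacuously
    }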
Apr 14 2023
In D142318#4269744, @avl wrote:In D142318#4269662, @dexonsmith wrote:In D142318#4269644, @avl wrote:I am OK to do that as a separate patch right after the current patch. I just do not have a good idea for this at the moment.
WDYT of the idea above, to have a Boolean flag that checks whether getDefaultExecutor() has been called, and assert on that in getThreadIndex()?
I think this solves only part of the problem: it checks that the executor has already been created when getThreadIndex() is requested, but it does not check that the thread index is valid. If a thread was created not by ThreadPoolExecutor, it would have index zero, which clashes with the thread index of the main thread and Thread0. I thought we wanted to check that other threads were not used with getThreadIndex.
Checking ThreadPoolExecutor existence is still a useful check, and it would be good to implement it. If we find a good way to check thread indexes, it would also be useful.
In D142318#4269644, @avl wrote:I am OK to do that as a separate patch right after the current patch. I just do not have a good idea for this at the moment.
In D142318#4269161, @dexonsmith wrote:In D142318#4269070, @dexonsmith wrote:Seems like threads are assigned IDs from 1 in the ThreadPoolExecutor constructor via calls to work(). The main thread assigns threadIndex to 0 in the same place:
Aha, looks like I misread the code. The work() calls are coming from within a lambda that's executed by the first created thread. So, right now, the main thread has the same threadIndex as the first spawned thread.
(But if that's the case, doesn't that cause a problem for the allocator? Doesn't the allocator require that the main thread has a different ID from the worker threads?)
In D142318#4269070, @dexonsmith wrote:Seems like threads are assigned IDs from 1 in the ThreadPoolExecutor constructor via calls to work(). The main thread assigns threadIndex to 0 in the same place:
In D142318#4269070, @dexonsmith wrote:In D142318#4268765, @MaskRay wrote:In D142318#4268729, @avl wrote:so far I suggest implementing the safety check as a separate patch, after we have a good solution for this.
Looks good to me.
I think if we don't add the check now it's unlikely to happen later.
In D142318#4268765, @MaskRay wrote:In D142318#4268729, @avl wrote:so far I suggest implementing the safety check as a separate patch, after we have a good solution for this.
Looks good to me.
Apr 13 2023
BTW, if others feel strongly that such an assertion wouldn't be useful (say, maybe there's reason to believe that even unit tests wouldn't trigger it in practice due to @MaskRay's points?), happy to back away and let this land without it.
In D142318#4266426, @avl wrote:A possible solution might be initializing threadIndex to some sentinel value by default,
e.g. setting threadIndex to -1. Threads created by ThreadPoolExecutor would have indexes in the range 0 ... ThreadsNum.
It will trigger the assertion "assert(getThreadIndex() < NumOfAllocators);" for wrong threads inside the PerThreadAllocator methods. Does it sound OK?
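(A minimal sketch of that sentinel scheme, with illustrative names rather than the actual LLVM code:)

    #include <cassert>
    #include <cstddef>

    // Default to a sentinel; ThreadPoolExecutor would overwrite this with a
    // real index in the range 0 ... ThreadsNum-1 for the threads it owns.
    static thread_local size_t threadIndex = static_cast<size_t>(-1);

    struct PerThreadAllocatorSketch {
      size_t NumOfAllocators = 8;  // == the executor's ThreadsNum in the real code

      void *Allocate(size_t /*Size*/) {
        // A thread not owned by ThreadPoolExecutor still holds the sentinel,
        // so this bounds assertion now fires for it instead of silently
        // aliasing the main thread's allocator.
        assert(threadIndex < NumOfAllocators && "thread not owned by executor");
        return nullptr;  // real code: allocators[threadIndex].Allocate(Size)
      }
    };

    int main() {
      PerThreadAllocatorSketch A;
      threadIndex = 0;  // pretend the executor registered this thread
      (void)A.Allocate(16);
    }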
I still have a general concern: this utility isn't safe to use in general LLVM library code, and while that's documented in the header, there's nothing enforcing that or checking for it. I think it'd be easy to get this wrong, and our existing test coverage would be unlikely to catch mistakes, but it could be a big problem for tools/libraries that have their own thread pools and depend on LLVM code.
Mar 12 2023
In D83906#4186887, @hoy wrote:In D83906#4184916, @dexonsmith wrote:In D83906#4183453, @hoy wrote:Wondering if we can come up with a way to tell the optimizer about that, e.g., through a new module flag. When it comes to LTO, the selection of linkonce_odr symbols should already have been done, and the optimizer may be able to recompute the attributes based on pre-LTO attributes, or at least we can restrict IPO to one module only, which should still do a better job than the FE does?
I don't think there's much point in passing anything to LTO. There are very few linkonce_odr symbols in LTO, since LTO has the advantage of an export list from the link. Symbols not on the export list are internalized (they're given local linkage).
That sounds to me like an opportunity to get a broader IPO done precisely in the prelink optimizer, as long as we find a way to tell it the incoming IR has source fidelity. What do you think about the idea of introducing a module flag? Maybe it's worth discussing in the forum as a follow-up to introducing a cc1 flag for stable IR gen.
I'm not the right person to review this. @Gerolf, perhaps you can suggest an alternate?
Mar 10 2023
In D83906#4183453, @hoy wrote:Wondering if we can come up with a way to tell the optimizer about that, e.g., through a new module flag. When it comes to LTO, the selection of linkonce_odr symbols should already have been done, and the optimizer may be able to recompute the attributes based on pre-LTO attributes, or at least we can restrict IPO to one module only, which should still do a better job than the FE does?
Mar 9 2023
In D83906#4182902, @rjmccall wrote:So your argument is that it would not be possible to recognize that we're doing such an optimization and mark the function as having had a possible semantics change?
In D83906#4182847, @hoy wrote:As far as I know, the optimizer IPO pass that infers function attributes (i.e. InferFunctionAttrsPass) is placed at the very beginning of the optimization pipeline. Does this mean that the side effects computed for linkonce_odr functions there can be trusted by the rest of the pipeline?
In D83906#4182777, @rjmccall wrote:In D83906#4182287, @dexonsmith wrote:
- At IRGen time, you know the LLVM attributes have not been adjusted after the optimizer refined the function's behaviour. It should be safe to have IPA peepholes, as long as IRGen's other peepholes don't refine behaviour and add attributes based on that.
- In the optimizer, if you're looking at a de-refineable function, then you don't know which attributes come directly from the source and which were implied by optimizer refinements. You can't trust you'll get the same function attributes at runtime.
Hmm. I see what you're saying, but it's an interesting question how it applies here. In principle, the optimizer should not be changing the observable semantics of functions, which certainly includes things like whether the function throws. Maybe the optimizer can only figure out that a function throws in one TU, but if it "figures that out" and then a function with supposedly the same semantics actually does throw — not just retains the static ability to throw on a path that happens not to be taken dynamically, but actually throws at runtime — then arguably something has gone badly wrong.
In D83906#4182428, @hoy wrote:In D83906#4182287, @dexonsmith wrote:In C++, you get linkonce_odr all over the place. It's basically all functions that are defined in C++ headers that are available for inlining.
On the other hand, the frontend knows the token sequence from the source language. It knows whether function B is inherently nounwind based on its ODR token sequence; in which case, it's safe to use the attribute for an IPA peephole.
Thanks for the detailed explanation again! As you pointed out previously, linkonce_odr is something the front end can optimize. I'm wondering why the front end can be confident that the linker will not replace the current definition with something else.
In D83906#4181981, @hoy wrote:That said, the LLVM optimizer does not strictly subsume the front end because of how it fails to handle linkonce_odr functions, as in https://reviews.llvm.org/D18634. I'm wondering how common the linkonce_odr linkage is for C++. In @wlei's example, none of the functions there is linkonce_odr. Is there a particular source-level annotation that specifies functions to be linkonce_odr?
Mar 8 2023
Oh, de-refining is pretty nifty / evil. This patch has background:
https://reviews.llvm.org/D18634
In D83906#4179512, @wlei wrote:Hi @ahatanak
We recently hit an issue of inconsistent codegen related to this optimization. In one build, the Clang frontend generates different LLVM IR for the same function, which originally comes from one header file. It turned out this optimization gives different results for different function definition orders, which are naturally unstable.
See this two repro programs:
p1.cpp: https://godbolt.org/z/bavTYEG1x
    void foo() {};
    void bar() noexcept { foo(); };

p2.cpp: https://godbolt.org/z/zfsnzPrE6

    void foo();
    void bar() noexcept { foo(); };
    void foo() {};

See that the codegens of bar are different: in p2.cpp the callee (foo)'s definition comes after the caller (bar), so foo is not yet known to be nounwind when bar is emitted, and the compiler still generates the invoke.
This inconsistency affected AutoFDO: one of our tools assigns consecutive numeric IDs to the BBs of the CFG, and the unstable CFGs cause BB ID mismatches, so a lot of samples are lost.
Would like to hear your feedback. Wondering if the FE can handle this perfectly, or perhaps we can just leave it to the BE. Thank you in advance!
Mar 7 2023
In D130303#4175664, @collinbaker wrote:@dexonsmith can you weigh in?
Mar 6 2023
In D130303#3724392, @dexonsmith wrote:In D130303#3724247, @rnk wrote:Pinging alternative reviewer +@dexonsmith for a libclang API addition
Looks reasonable to me -- this only changes behaviour of the existing API when there was corruption before -- but if the goal is to get a vendor of libclang-as-a-stable-API to sign off, I can't help.
@arphaman, if you're busy, is there someone else that could take a quick look?
LGTM too.
Mar 5 2023
LGTM.
Feb 24 2023
In D144322#4151759, @iana wrote:In D144322#4151757, @vsapsai wrote:Sorry, forgot to ask the first time. This change is for Clang modules and not for C++20 modules, right? Asking because I believe C++20 modules have standard-enforced module names.
Yes, as far as I know the module map is only used for Clang modules.
Feb 20 2023
In D144322#4140228, @iana wrote:In D144322#4140227, @dexonsmith wrote:Great to see this making progress; looking forward to seeing the related workarounds removed from the Darwin SDKs.
What's the performance impact (for scanning/building/importing modules) of turning these all into top-level modules? If it's a significant regression, maybe there's a way to isolate the headers that are involved in the cycle with the SDK and the Clang headers from those that aren't, while still using only a handful of modules. And if it's a performance improvement in some cases (due to better parallelism or building less stuff), could be something to highlight.
We haven't tested the performance yet. We could try to optimize: the C headers that aren't included by the other ones could maybe stay in std. That's kind of risky, though, because we can't really guarantee what the include_next'ed headers will include. I think we're better off keeping the C headers in their own modules across the board (in Clang and also in the Apple headers).
Great to see this making progress; looking forward to seeing the related workarounds removed from the Darwin SDKs.
Jan 26 2023
LGTM
Jan 22 2023
It's nice how simple this is!
Jan 6 2023
In D133715#4032713, @steven_wu wrote:Maybe RawHashTrieMap? It reads better when Raw is in the front, and it contains hash-trie and trie-map, which are both terms describing data structures similar to this but this is much simpler, thus raw.
In D133715#4032581, @dblaikie wrote:I guess one concern with TrieHashMap is that if this is the lower level implementation, and someone might implement a more map-like API on top of this, we might not want to take the "better" name for the data structure that'll be less directly used?
Could prefix with "Raw" or maybe TrieRawHashMap? (since it's the hashing part that's particularly "raw" - relying on the hash being unique, etc)
In D133713#4031859, @steven_wu wrote:The reason I don't want to have the CAS blocked by a better allocator is that I need to write and test it, and also figure out whether it needs to share any code with the regular BumpPtrAllocator.
In D133715#4031396, @avl wrote:But having a fast concurrent BumpPtrAllocator would be independently useful, and I'd suggest optimizing the allocator before bloating the default trie size.
+1 for fast concurrent ThreadSafeBumpPtrAllocator.
What do you think about the following alternative implementation?

    class ThreadSafeBumpPtrAllocator {
      ThreadSafeBumpPtrAllocator() {
        size_t ThreadsNum = ThreadPoolStrategy.compute_thread_count();
        allocators.resize(ThreadsNum);
      }
      void *Allocate(size_t Num) {
        size_t AllocatorIdx = getThreadIdx();
        return allocators[AllocatorIdx].Allocate(Num);
      }
      std::vector<BumpPtrAllocator> allocators;
    };

    static thread_local size_t ThreadIdx;
    size_t getThreadIdx() { return ThreadIdx; }

This implementation uses the fact that ThreadPoolExecutor creates a fixed number of threads (ThreadPoolStrategy.compute_thread_count()) and keeps them until destruction. ThreadPoolExecutor can initialise the thread-local field ThreadIdx to the proper thread index, and getThreadIdx() returns the index of the current thread inside ThreadPoolExecutor.Threads. ThreadSafeBumpPtrAllocator keeps a separate allocator for each thread, so each thread always uses its own allocator. No need for locks or CAS operations, and no races...
Jan 5 2023
In D133715#4029879, @steven_wu wrote:In D133715#4029841, @dexonsmith wrote:In D133715#4029442, @steven_wu wrote:Ping. All feedback has been addressed.
Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548 as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future since our use case doesn't expect heavy parallel insertions. However, it is quite obvious we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be starting with something like 10 bits, then 4 bits, 2 bits and 1 bit. Shrinking the number of bits can lead to better memory usage since slot usage in the deep nodes is very low.
Could ThreadSafeBumpPtrAllocator be made lock-free? I think it would at least be possible to implement one that only locks when a new slab is needed, instead of on every pointer bump as now. (I'll think about it a bit.)
Note that in the CAS use case it’s ideally true that most insertions are duplicates and don’t need to call the allocator at all. This is why we’ve been able to get away with a lock on each allocation.
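(For the record, a rough sketch of that "lock only for a new slab" idea; this is an assumed design with illustrative names, not LLVM's ThreadSafeBumpPtrAllocator. The hot path is a single atomic fetch_add; the mutex is taken only to install a fresh slab.)

    #include <atomic>
    #include <cstddef>
    #include <mutex>

    class MostlyLockFreeBumpAllocator {
      static constexpr size_t SlabSize = 1 << 20;  // assumes N <= SlabSize
      struct Slab {
        std::atomic<size_t> Used{0};
        char *Mem = nullptr;
      };
      std::atomic<Slab *> Current{nullptr};
      std::mutex SlabLock;

    public:
      void *Allocate(size_t N) {
        for (;;) {
          Slab *S = Current.load(std::memory_order_acquire);
          if (S) {
            // Fast path: one atomic add, no lock taken.
            size_t Off = S->Used.fetch_add(N, std::memory_order_relaxed);
            if (Off + N <= SlabSize)
              return S->Mem + Off;
          }
          // Slow path: install a fresh slab under the lock; re-check so only
          // one thread replaces an exhausted slab, then everyone retries.
          std::lock_guard<std::mutex> G(SlabLock);
          if (Current.load(std::memory_order_acquire) == S) {
            Slab *NewSlab = new Slab;
            NewSlab->Mem = new char[SlabSize];
            Current.store(NewSlab, std::memory_order_release);
            // A real implementation would keep a list of retired slabs so the
            // destructor can free them; elided in this sketch.
          }
        }
      }
    };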
In D133715#4029442, @steven_wu wrote:Ping. All feedback has been addressed.
Additional notes: I dug a bit deeper into the benchmark from https://reviews.llvm.org/D132548 as it shows bad scaling during the parallel insertion tests (it stops scaling at 8 threads). That is actually caused by our ThreadSafeBumpPtrAllocator, which currently takes a lock. We can improve it in the future since our use case doesn't expect heavy parallel insertions. However, it is quite obvious we should tune RootBits and SubTrieBits. Increasing RootBits can significantly decrease the contention. A better strategy might be starting with something like 10 bits, then 4 bits, 2 bits and 1 bit. Shrinking the number of bits can lead to better memory usage since slot usage in the deep nodes is very low.
Jan 3 2023
LGTM with one nit inline.
Dec 30 2022
In D138760#4020385, @MaskRay wrote:In D138760#4018383, @he32 wrote:Thanks for the hint / redirect.
I've submitted a pull request at https://github.com/BLAKE3-team/BLAKE3/pull/280
Thank you! If we take the patch here, we may risk losing the change when someone upgrades BLAKE3. So it's better to ensure that it works in the upstream...
Dec 29 2022
In D132455#4018472, @dblaikie wrote:@dexonsmith & co working on the CAS have also proposed a thread safe hash table of sorts ( https://reviews.llvm.org/D133715 )- it's a bit more esoteric/specialized, but I wonder if the use cases overlap enough to be able to unify them?
Dec 14 2022
In D133715#3996103, @dblaikie wrote:eh, maybe string's easier to read in expect diagnostics anyway... just seems a bit awkward/circuitous/unfortunate :/ (I guess the stringification could move into the test code, implemented via such a friend relationship)
In D133715#3995750, @dblaikie wrote:In D133715#3993521, @dexonsmith wrote:In D133715#3993489, @dblaikie wrote:
- a test support friend class that can inspect/test the trie layout without relying on stringification;
This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?
IMO the layout is more complex than a flat array with quadratic probing. The logic for equal and/or nearly-equal hashes is a bit subtle, and there were hard-to-reason-about bugs originally. The layout stringification was necessary to understand what was going wrong, and the tests that use it help ensure future refactorings don't get it wrong.
I don't remember the bugs, but two examples of subtleties:
- On an exact match you don't want to "sink" the existing entry down a level to a new sub-trie (you need to detect "exact match" before sinking). Getting this wrong will affect performance but not otherwise be user-visible.
- The deepest sub-trie might be a different/smaller size than the others because there are only a few bits left-over, and the handling needs to be right. It's simpler to check for correct layout directly than to guess about what user-visible effects there might be for errors.
I'd be a bit uneasy with the layout tests being dropped altogether.
Maybe an alternative to testing the layout directly would be to add a verification member function that iterated through the data structure and ensured everything was self-consistent (else crash? else return false?). Then the tests could call the member function after a series of insertions that might trigger a "bad" layout.
Fair enough - if it's sufficient to have a verify operation (maybe "assertValid" - so, yeah, crash when not valid) I'd go with that, but given the argument you've made, if you think verifying the specific structure is significantly more valuable than that, I'd be OK with some private/test-friended introspection API.
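(To make the suggestion concrete, one possible shape for such a hook; this is a hypothetical sketch whose node layout is deliberately simpler than the real trie in D133715.)

    #include <cassert>
    #include <vector>

    struct TrieNodeSketch {
      unsigned StartBit = 0, NumBits = 0;   // hash bits this level consumes
      std::vector<TrieNodeSketch *> Slots;  // non-null entries are sub-tries

      // Walk the structure and crash on any violated layout invariant.
      void assertValid(unsigned NumHashBits) const {
        assert(Slots.size() == (1u << NumBits) && "slot count mismatch");
        assert(StartBit + NumBits <= NumHashBits &&
               "level consumes bits past the end of the hash");
        for (const TrieNodeSketch *Child : Slots)
          if (Child) {
            // A sub-trie must pick up exactly where its parent left off.
            assert(Child->StartBit == StartBit + NumBits && "bit-range gap");
            Child->assertValid(NumHashBits);
          }
      }
    };

    int main() {
      TrieNodeSketch Root;
      Root.NumBits = 2;
      Root.Slots.assign(4, nullptr);
      TrieNodeSketch Child;
      Child.StartBit = 2;
      Child.NumBits = 1;
      Child.Slots.assign(2, nullptr);
      Root.Slots[3] = &Child;
      Root.assertValid(/*NumHashBits=*/8);  // passes: child starts at bit 2
    }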
Dec 13 2022
In D133715#3993489, @dblaikie wrote:
- a test support friend class that can inspect/test the trie layout without relying on stringification;
This bit I'm not so sure about - we don't test the bucketing behavior of DenseHashMap so far as I know (or other similar implementation details of the various other hashing data structures - (since they're the ones with fairly complicated implementation details, they seem the best comparison)), for instance, we only test its interface - why would we do differently for this data structure?
In D133715#3993311, @dblaikie wrote:Note that there's an example data structure called ThreadSafeHashMappedTrieSet buried in the unit tests, which has an interface that hides the hash. Could be useful for some clients.
- Maybe it would make sense to lift it up? The name could be PerfectHashSet.
- You might want to build a map that also handles the hashing; the name could potentially be PerfectHashMap.
Maybe this could be PerfectHashSetImpl?
- Has the "guts" for implementing a client-friendly PerfectHashSet or PerfectHashMap?
- ... but can be used directly if you want to manage the hashing yourself?
I take it the current use cases you have in mind/lined up in the CAS use this directly? Maybe a Raw prefix? RawPerfectHashSet, RawConcurrentHashSet, etc... some combination/choice of those sort of things?
In D133715#3993178, @dblaikie wrote:Is it a set or a map?
In D139163#3991248, @MoritzS wrote:To answer your questions in the comments about what to do about resolvers with arguments: At least glibc always calls ifunc resolvers without any arguments. It just reads the address of the resolver function from the ELF file, casts it to a pointer to an address-returning function taking no arguments, and calls it.
So, I agree: Resolvers with arguments should not be allowed.
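(For reference, the general shape of a GNU ifunc on x86 ELF; a standalone illustration, not code from this patch. The loader invokes the resolver with no arguments and uses the returned pointer as the definition of add.)

    // Two candidate implementations; the resolver picks one at load time.
    static int add_portable(int a, int b) { return a + b; }
    static int add_tuned(int a, int b) { return a + b; }

    extern "C" int (*resolve_add())(int, int) {
      __builtin_cpu_init();  // initialize CPU feature detection in the resolver
      return __builtin_cpu_supports("avx2") ? add_tuned : add_portable;
    }

    // Emits an IFUNC symbol; `add` has no body of its own.
    int add(int, int) __attribute__((ifunc("resolve_add")));

    int main() { return add(1, 2) == 3 ? 0 : 1; }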
Dec 10 2022
In D139767#3986728, @DarshanRamakant wrote:In D139767#3986476, @dexonsmith wrote:I don't know much about this code; resigning as reviewer. But I took a quick look. I wonder if this is the right fix, or whether VLIWPacketizerList::VLIWScheduler's type should be ScheduleDAGInstrs, and downcast when necessary on use, allowing DefaultVLIWScheduler to stay private. (Up to others to sort out!)
Thanks @dexonsmith for the suggestion. But I think even if we want to downcast (dynamic_cast), the declaration should be available to the derived class, which is not the case in the current implementation.
I don't know much about this code; resigning as reviewer. But I took a quick look. I wonder if this is the right fix, or whether VLIWPacketizerList::VLIWScheduler's type should be ScheduleDAGInstrs, and downcast when necessary on use, allowing DefaultVLIWScheduler to stay private. (Up to others to sort out!)