- On Windows, extended lambda has extra issues due to the numbering schemes are different between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI. Additional device side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name.
- A hybrid numbering context MSHIPNumberingContext is introduced to number a lambda for both host- and device-compilations.
It would be nicer if there were a single entry point that does the right thing for all mangling contexts.
I know it's inheritance instead of composition, but maybe the simple thing to do here would be to inherit from the MicrosoftMangleNumberingContext, and then implement an override for getDeviceManglingNumber that delegates to the Itanium one.
This is actually the device's mangling context, and it's an Itanium mangling context. I think it needs comments and a more descriptive variable name.
It seems a shame to grow LambdaDefinitionData by a pointer for all users of C++ that do not use CUDA. Optimizing bitfields may be worth the time, but I'll leave it to @rjmccall or @rsmith to give guidance on whether that's worth it.
An alternative would be to store the device numbers in the mangling context and look them up when needed, since they are so rarely needed.
I'll add comments at the call site.
Rather than having a virtual method that returns bool, call a virtual method that does the entire thing you are trying to do. For example, MSHIP* could be responsible for setting the mangling number on the class. Although, maybe it should be storing the number in some side table now that I think about it.
I like the alternative way by storing all numbering into the mangle/numbering context instead of AST itself. But, it needs additional numbering post-processing after AST importing. Sound to me a major refactoring work likely to be addressed later.
Generally, I don't think we can count on contributors to come back later and optimize memory usage, so it seems reasonable to ask to avoid the regression in the first place. Above I see the bitfield usage optimizing memory usage of the number of captures, and then here we spent lots of memory storing device mangling numbers that are only used for CUDA. I think the memory usage concern still stands. I don't think it's unreasonable to maintain these numbers on the side in a DenseMap.
I agree that we should aim to avoid broad memory impact for uncommon features, and this seems to apply.
I think that's the right idea in general. I don't know how much it really matters for LambdaData, because there aren't a huge number of lambdas in typical translation units. But if you want to optimize this, a hashtable in the ASTContext or ManglingContext seems reasonable.