Page MenuHomePhabricator

Make TargetInfo store an actual DataLayout instead of a string.
ClosedPublic

Authored by jyknight on Feb 11 2016, 6:37 PM.

Details

Summary

Use it to calculate UserLabelPrefix, instead of specifying it (often
incorrectly).

Note that the *actual* user label prefix has always come from the
DataLayout, and is handled within LLVM. The main thing clang's
TargetInfo::UserLabelPrefix did was to set the #define value. Having
these be different from each-other is just silly.

Diff Detail

Repository
rL LLVM

Event Timeline

jyknight updated this revision to Diff 47763.Feb 11 2016, 6:37 PM
jyknight retitled this revision from to Make TargetInfo store an actual DataLayout instead of a string..
jyknight updated this object.
jyknight added a reviewer: echristo.
jyknight added a subscriber: cfe-commits.
yaron.keren added a subscriber: yaron.keren.EditedFeb 11 2016, 7:41 PM

We have tried to keep one copy of DataLayout around
http://reviews.llvm.org/D11103
Can it share the Module->getDataLayout() ?

We have tried to keep one copy of DataLayout around
http://reviews.llvm.org/D11103
Can it share the Module->getDataLayout() ?

We're at the beginning of clang here, before preprocessing. There's no llvm::Module yet, one of those doesn't show up until way later, when the AST is ready, and is going to be turned into IR.

Driveby comment: the changes in unittests look unrelated?

Those changes were necessary due to a name (IIRC it was "Module") existing
in both the clang and llvm namespaces, and after this change, becoming
ambiguous in those files.

Those changes were necessary due to a name (IIRC it was "Module") existing
in both the clang and llvm namespaces, and after this change, becoming
ambiguous in those files.

Ah, that one. Okay.

Ping. I'm pretty sure you said you were going to look at this. :)

echristo accepted this revision.Mar 2 2016, 11:37 AM
echristo edited edge metadata.

Sorry, I've been waffling on this, but I think this is fine.

Thanks!

-eric

This revision is now accepted and ready to land.Mar 2 2016, 11:37 AM

This is awesome!

include/clang/CodeGen/BackendUtil.h
38 ↗(On Diff #47763)

Should TDesc be a reference?

This revision was automatically updated to reflect the committed changes.
jyknight marked an inline comment as done.

Zombie comment time! I kind of don't love this change. clang's libBasic shouldn't depend on llvm/lib/IR: Practically there's no reason that building clang-format should build lib/IR and its deps, and more importantly semantically the dependency feels off too.

The dep from libBasic to lib/IR was added in https://reviews.llvm.org/rG8d043e82ef1179fed1d3d85171905c55bda7312f (oct 2014) then one day later http://reviews.llvm.org/rG86204b21e98b9d73b9ab381c87862837d2357082 formalized that but https://reviews.llvm.org/rGa0ac3c2bf049eb28e77a19dd70cbd570585f4747 (two days later, also oct 2014) removed the dependency again. Alas, not from the CMakeLists.txt. Then, 1.5 years later, this added it back, and it's still here. I'd like to remove the dependency from clangBasic on lib/IR again.

I see these options:

  1. Kind of revert this patch. Make TargetInfo again store the datalayout str and the prefix, and then assert somewhere that sees both layers (CodeGen) that the prefix matches the mangling in the layout string. Practically, I'd give resetDataLayout() the prefix as 2nd arg since that seems like a good place to set both. This abstractly makes sense to me since TargetInfo for the most part duplicates DataLayout.
  1. split llvm/IR/DataLayout into DataLayoutString that just does the parsing and pointer width and primitive things and move that part to llvm/Support, and make IR/DataLayout use that new llvm/Support/DataLayoutString to do the llvm::Type-dependent bits. That way, TargetInfo could lean _more_ on the DataLayoutString for pointer sizes and widths and whatnot. But this general approach leads to everything being in llvm/Support over time :)
  1. (not sure if feasible) move TargetInfo stuff out of Basic into a new libTargetInfo. Then the dep on IR could stay. That fixes layering but doesn't address that TargetInfo is independent of DataLayout except for this one thing.

I lean towards (1) since that feels like it's most in the spirit of the original TargetInfo design.

Does anyone have strong opinions?

rnk added a comment.Mar 29 2021, 10:12 AM

I think 1 is OK as long as we can assert that things don't get out of sync. I like the idea of putting the prefix next to the data layout string.

In the long run, splitting up Support might be reasonable, and pushing DataLayout down out of IR seems like something that could happen at that time. I would like to reduce the set of things that tablgen depends on by making Support into a smaller library of just system APIs and data structures, moving everything else out. Maybe the right way to do that is to leave Support as the ambiguous, kitchen sink library, and make a smaller, more targeted library under it. We already have the notion of the ADT library in our header structure for data structures. We used to have the System library before it was merged with Support. I think the only thing blocking this refactoring is the will to disentangle the circular dependencies.

In general, I think it's extremely unfortunate that Clang and LLVM have different copies of the same information. It's a problem for way more than just this one situation. So, I really don't like choice 1 -- I think it's moving in the wrong direction.

The recent thread about compiler-rt libcalls vs size of "int" is another example of this sort of issue of duplicated info across the llvm/clang boundary being troublesome. Other times, the information in question is buried in the Target implementation in LLVM, and Clang doesn't depend upon targets being compiled in...usually, so it can't access it from there. E.g. the fact that Clang even has to come up with a DataLayout string on its own is an example of this problem.

So, I like option 2. I think a lot more of the information in TargetInfo ought to be shared with LLVM. I note that ARM already pushed a bunch of shared stuff into LLVM to reduce duplication, which is great. But, it had to put it into a rather odd place (llvm/include/llvm/Support/{ARMTargetParser.h,ARMAttributeParser.h,...), because of the layering and optionality concerns. That part is not so great -- putting all this target-specific info into "Support" doesn't feel like the best fit. There should be some sort of Target-specific location for this sort of stuff to live which Clang can depend on.

In general, I think it's extremely unfortunate that Clang and LLVM have different copies of the same information. It's a problem for way more than just this one situation. So, I really don't like choice 1 -- I think it's moving in the wrong direction.

The recent thread about compiler-rt libcalls vs size of "int" is another example of this sort of issue of duplicated info across the llvm/clang boundary being troublesome. Other times, the information in question is buried in the Target implementation in LLVM, and Clang doesn't depend upon targets being compiled in...usually, so it can't access it from there. E.g. the fact that Clang even has to come up with a DataLayout string on its own is an example of this problem.

So, I like option 2. I think a lot more of the information in TargetInfo ought to be shared with LLVM. I note that ARM already pushed a bunch of shared stuff into LLVM to reduce duplication, which is great. But, it had to put it into a rather odd place (llvm/include/llvm/Support/{ARMTargetParser.h,ARMAttributeParser.h,...), because of the layering and optionality concerns. That part is not so great -- putting all this target-specific info into "Support" doesn't feel like the best fit. There should be some sort of Target-specific location for this sort of stuff to live which Clang can depend on.

FWIW I agree here.

Maybe LLVM should grow an LLVMTargetInfo library that sits between LLVMSupport and LLVMCore, as a version of solution 2 that doesn't induce more bloating in Support.