This is an archive of the discontinued LLVM Phabricator instance.

[IR] Add TargetExtTypeClass
AcceptedPublic

Authored by nhaehnle on Apr 6 2023, 3:40 AM.

Details

Reviewers
Flakebi
Summary

This allows for extensibility in how the target type info is obtained,
serving two purposes:

  1. Small step towards "Dialects in LLVM", in providing extensibility for users of LLVM. Users of LLVM can now define their own "target" extension types (now a somewhat misleading name) with custom layout and properties, to represent high-level concepts that are lowered away before code reaches the actual backend.
  1. By storing a pointer to the type class in the type itself, we can avoid string comparisons when looking up type info.

Diff Detail

Event Timeline

nhaehnle created this revision.Apr 6 2023, 3:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2023, 3:40 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
nhaehnle requested review of this revision.Apr 6 2023, 3:40 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2023, 3:40 AM

Is the patch missing changes in "llvm/IR/TargetExtType.h"?

jsilvanus added inline comments.Apr 6 2023, 6:48 AM
llvm/lib/IR/LLVMContext.cpp
397

Maybe add an assert that TypeClass is not yet registered?

nhaehnle updated this revision to Diff 521310.May 11 2023, 7:37 AM

Major update, adding IR printing, parsing and bitcode writing/reading of the type info.

Also adds the ability to register a verifier for target types, which is used by aarch64.svcount.

nhaehnle updated this revision to Diff 521314.May 11 2023, 7:51 AM

Remove two TODOs that have been done.

arsenm added a subscriber: arsenm.May 12 2023, 9:14 AM
arsenm added inline comments.
llvm/docs/LangRef.rst
3802–3803 ↗(On Diff #521314)

I don't think the LangRef should refer to doxygen for things like this and should explicitly list the properties

llvm/lib/AsmParser/LLParser.cpp
3259 ↗(On Diff #521314)

Missing assembler tests for the various error cases

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
3992 ↗(On Diff #521314)

Use tuple unbinding syntax instead of .first/.second

4009 ↗(On Diff #521314)

Seems like it should be report_fatal_error

nhaehnle marked 3 inline comments as done.May 15 2023, 1:13 AM
nhaehnle added inline comments.
llvm/docs/LangRef.rst
3802–3803 ↗(On Diff #521314)

That text was pre-existing, but I'll replace it.

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
4009 ↗(On Diff #521314)

This is the bitcode *writer*, so getting here is always a bug in the code, not a bug in any input (extending sdata::Value without updating this code here).

nhaehnle updated this revision to Diff 522078.May 15 2023, 1:14 AM
nhaehnle marked an inline comment as done.
  • explicitly list properties in LangRef
  • add negative parser/assembler tests
  • use tuple binding syntax
Matt added a subscriber: Matt.May 15 2023, 2:09 PM
Flakebi added inline comments.
llvm/test/Assembler/invalid-typeinfo.ll
51 ↗(On Diff #522078)

What happens when adding a target type with the same name twice?

nhaehnle marked 2 inline comments as done.Jun 20 2023, 4:51 AM
nhaehnle added inline comments.
llvm/test/Assembler/invalid-typeinfo.ll
51 ↗(On Diff #522078)

This should be accepted as long as there's no conflict in the type info. I'm adding a test.

nhaehnle updated this revision to Diff 532901.Jun 20 2023, 6:51 AM
nhaehnle marked an inline comment as done.

Rebase, layout type defaults to void

A question inline, looks good to me otherwise.

llvm/docs/BitCodeFormat.rst
568 ↗(On Diff #532901)

Is the XXXXYXXXXX string intended or accidental?

nhaehnle marked an inline comment as done.Jun 20 2023, 8:43 AM
nhaehnle added inline comments.
llvm/docs/BitCodeFormat.rst
568 ↗(On Diff #532901)

Whoops :) That is very much accidental. I meant to update that line but then managed to miss it on my final scan.

nhaehnle updated this revision to Diff 532954.Jun 20 2023, 8:44 AM
nhaehnle marked an inline comment as done.

Missed a spot...

Flakebi accepted this revision.Jun 20 2023, 8:48 AM

Thanks! LGTM

This revision is now accepted and ready to land.Jun 20 2023, 8:48 AM
nikic added a subscriber: nikic.Jun 20 2023, 12:10 PM

Could you please add a test where a target type is first used and later type info for it is provided? I'd like to make sure this is an error.

nikic added inline comments.Jun 20 2023, 12:37 PM
llvm/include/llvm/Bitcode/LLVMBitCodes.h
738 ↗(On Diff #532954)

What is SDATA_SYMBOL used for, in the context of this patch? It doesn't look like the corresponding IR handling in D150370 supports this.

nhaehnle updated this revision to Diff 533019.Jun 20 2023, 12:46 PM

Could you please add a test where a target type is first used and later type info for it is provided? I'd like to make sure this is an error.

Added the conflicting-typeinfos3.ll subtest which cover this (if I understood you correctly).

nhaehnle added inline comments.Jun 20 2023, 12:47 PM
llvm/include/llvm/Bitcode/LLVMBitCodes.h
738 ↗(On Diff #532954)

It isn't used yet. Symbols-as-values is something that I added when exploring how to put enums into metadata, and I didn't remove it here because it does seem useful.

Could you please add a test where a target type is first used and later type info for it is provided? I'd like to make sure this is an error.

Added the conflicting-typeinfos3.ll subtest which cover this (if I understood you correctly).

Yes, that's what I had in mind. Thanks!

llvm/docs/LangRef.rst
3837 ↗(On Diff #532954)

This could use some clarification on how the properties for target types with different arguments relate, if at all. If you define properties for target("mytype"), do those also apply to target("mytype", i32)?

Do you need to repeat the properties for every combination of arguments in the IR, or is there some inheritance?

llvm/include/llvm/Bitcode/LLVMBitCodes.h
738 ↗(On Diff #532954)

In that case, please drop it until it is used for something (and thus testable).

nhaehnle updated this revision to Diff 533140.Jun 20 2023, 11:12 PM
nhaehnle marked an inline comment as done.
  • Drop symbols-as-values for now
  • LangRef clarifications
llvm/docs/LangRef.rst
3837 ↗(On Diff #532954)

They don't relate and there is no inheritance, every target type is distinct and has its own properties. I did briefly consider the possibility of inheritance, but it seemed to quickly get very complex and not that useful. I'm going to try to clarify that.

nikic added inline comments.
llvm/docs/LangRef.rst
3837 ↗(On Diff #532954)

I see. I think this point could use some input from people who are using target types, in particular @jcranmer-intel.

The way I originally expected this to work is that the target type properties are independent of the type arguments entirely, so that target("mytype") and all target("mytype", Ti, Ni) share the same properties.

I'm not familiar enough with how these types are used to say whether independent properties for each argument combination makes sense or not.

jcranmer-intel added inline comments.Jul 5 2023, 8:23 AM
llvm/docs/LangRef.rst
3837 ↗(On Diff #532954)

All of the use cases I have had have the property that properties are based entirely on the string property. Indeed, one property that could be useful to check is the number of integer/type arguments!

That said, there are some cases where varying type parameters might change the properties of the type--suppose something like target("complex", float) versus target("complex", double) or target("decfp", 64) versus target("decfp", 32). Although for many of those types, I do question if they should be target extension types in the first place.

llvm/include/llvm/IR/TargetExtType.h
51–59 ↗(On Diff #533500)

I question the need for these to be function pointers rather than instance properties. (The verifier function does need to be its own function).

It seems this comes down to this question: Should the type properties be per type name or should they be per type?

But so far, it seems that nobody has very strong feelings in either direction. Personally, I'd err on the side of flexibility (hence the current shape of this change), but I don't feel strongly about it. How do we decide such things? :)

llvm/docs/LangRef.rst
3837 ↗(On Diff #532954)

There's a question of how strongly you feel about this and how it would affect the runtime representation.

If we did change the properties to be per type name instead of per type, we should probably also store them per type name.

llvm/include/llvm/IR/TargetExtType.h
51–59 ↗(On Diff #533500)

This is basically the same question as before. If we decide that these properties should be fixed on a per-type-name basis, then yes, we can just make them member variables here. Otherwise, they need to be fallbacks.

nhaehnle added inline comments.Jul 11 2023, 4:45 AM
llvm/include/llvm/IR/TargetExtType.h
51–59 ↗(On Diff #533500)

s/fallbacks/callbacks/