This is an archive of the discontinued LLVM Phabricator instance.

Implement _ExtInt as an extended int type specifier.
ClosedPublic

Authored by erichkeane on Feb 4 2020, 7:08 AM.

Details

Summary

Introduction/Motivation:
LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax.
Integers of non-power-of-two aren't particularly interesting or useful
on most hardware, so much so that no language in Clang has been
motivated to expose it before.

However, in the case of FPGA hardware normal integer types where the
full bitwidth isn't used, is extremely wasteful and has severe
performance/space concerns. Because of this, Intel has introduced this
functionality in the High Level Synthesis compiler[0]
under the name "Arbitrary Precision Integer" (ap_int for short). This
has been extremely useful and effective for our users, permitting them
to optimize their storage and operation space on an architecture where
both can be extremely expensive.

We are proposing upstreaming a more palatable version of this to the
community, in the form of this proposal and accompanying patch. We are
proposing the syntax _ExtInt(N). We intend to propose this to the WG14
committee[1], and the underscore-capital seems like the active direction
for a WG14 paper's acceptance. An alternative that Richard Smith
suggested on the initial review was __int(N), however we believe that
is much less acceptable by WG14. We considered _Int, however _Int is
used as an identifier in libstdc++ and there is no good way to fall
back to an identifier (since _Int(5) is indistinguishable from an
unnamed initializer of a template type named _Int).

[0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html)
[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf

Diff Detail

Event Timeline

erichkeane created this revision.Feb 4 2020, 7:08 AM
rsmith added a subscriber: rsmith.Feb 5 2020, 6:27 PM

Have you considered how this would interact with our other language extensions? Can we form a vector of _ExtInt(N)? A _Complex _ExtInt(N)? I expect for some of that to just fall out of the implementation, but we should have documentation and test coverage either way.

I don't see any test coverage for _Atomic(_ExtInt(N)), which may not fall out from the other work because (if memory serves) atomic lowering doesn't go through the normal ConvertTypeForMem path and instead deals with padding itself.

I would like to see your documentation cover the calling convention used when passing/returning these types by value.

clang/lib/AST/ItaniumMangle.cpp
3472

This is not a valid vendor-extension mangling. There are two choices here, per the current scheme:

  1. mangle as a type, using (lowercase) u followed by a source-name, such as u9_ExtInt17 / u10_UExtInt17
  2. mangle as a type qualifier, using (capital) U followed by a source-name and optional template-args, such as U7_ExtIntILi17EEi / U7_ExtIntILi17EEj

Neither of these gives a particularly nice demangling. If WG14 seems likely to accept the proposal, you should register a proper mangling as part of the Itanium ABI.

keryell added a subscriber: keryell.Feb 6 2020, 7:12 AM
erichkeane marked an inline comment as done.Feb 6 2020, 7:34 AM

Have you considered how this would interact with our other language extensions? Can we form a vector of _ExtInt(N)? A _Complex _ExtInt(N)? I expect for some of that to just fall out of the implementation, but we should have documentation and test coverage either way.

I don't see any test coverage for _Atomic(_ExtInt(N)), which may not fall out from the other work because (if memory serves) atomic lowering doesn't go through the normal ConvertTypeForMem path and instead deals with padding itself.

I would like to see your documentation cover the calling convention used when passing/returning these types by value.

We can do a std::vector or any of the standard containers with no problems.

Extended-vector types don't really make sense for non-powers-of-two (plus have some odd gotchas when it comes to vectors of i1 for example), so I've added a test that shows that this is rejected.

_Complex _ExtInt(N) is rejected when parsing _Complex. It doesn't seem to make sense to me to support them, so I've added a test that shows it is invalid. Do you disagree?

_Atomic seems to be broken (atomic memory access size mus be byte-sized), but I'll continue to work on it to update this patch further.

I've also asked a coworker to better describe the calling convention so that we can add that text to the language-extensions.

Thanks!

clang/lib/AST/ItaniumMangle.cpp
3472

Well shucks, I was so close :) Looks like I'm just missing the last E/i from the second one.

The second seems consistently changable with the one below (for a dependent version), so I think I'm stuck with that. It DOES show up a little weird unfortunately, but at least now it demangles.

I'll ask about an official mangling mechanism for ItaniumABI.

erichkeane updated this revision to Diff 242901.Feb 6 2020, 7:37 AM

Go through @rsmith 's comments. Current opens:

1- Getting an official mangling. This will likely need to wait until WG14 has seen the paper, in the meantime, use @rsmith suggested version.

2- _Atomic _ExtInt creates invalid IR. Patch to fix this is WIP.

3- Document Calling Convention in Language Extensions. Text is WIP.

4- Determine whether implicit conversions between these types should be acceptable. Pending discussion on CFE-dev.

martong resigned from this revision.Feb 6 2020, 7:42 AM

Deal with _Atomic. There isn't really a sensible way to do _Atomics on all platforms for these types, so this patch now limits to the LLVM atomic instruction limits, which is powers-of-2 >=8.

Extended-vector types don't really make sense for non-powers-of-two (plus have some odd gotchas when it comes to vectors of i1 for example), so I've added a test that shows that this is rejected.

OK, that seems fine to me.

_Complex _ExtInt(N) is rejected when parsing _Complex. It doesn't seem to make sense to me to support them, so I've added a test that shows it is invalid. Do you disagree?

We allow _Complex T for integral types T in general, so this seems inconsistent. Are there problems supporting this? Is it just a parser limitation or something deeper? If there's some reason it doesn't naturally work (as there is for at least non-power-of-two-sized integers in vectors) then rejecting it seems fine.

We allow _Complex T for integral types T in general, so this seems inconsistent. Are there problems supporting this? Is it just a parser limitation or something deeper? If there's some reason it doesn't naturally work (as there is for at least non-power-of-two-sized integers in vectors) then rejecting it seems fine.

At the moment it doesn't work because of parsing. I'm unaware of there is a deeper limitation, but I'll look at it. I was unaware that we allow integral types for _Complex, I'd looked it up on cppreference and only saw the FP types so I was basically just confused as to what your intent was. I'll look closer now that I've done the _Atomic limitations.

At the moment it doesn't work because of parsing. I'm unaware of there is a deeper limitation, but I'll look at it. I was unaware that we allow integral types for _Complex, I'd looked it up on cppreference and only saw the FP types so I was basically just confused as to what your intent was.

Nice! Telecommunication applications love complex numbers of weird integers on FPGA. Of course, it is outside of any common CPU ABI but useful for HLS tool-chains based on Clang. Useful also for SYCL extensions for FPGA.

Add _Complex support for these types. As it turns out, they 'just work'! I added a check to the same place we check 'int' in the declaration specifiers, and the IR that gets generated is eminently logical.

Fixed test that expected _Complex _ExtInt to fail.

Remove conversions between _ExtInt until WG14 decides what should be done with them.

Ka-Ka added a subscriber: Ka-Ka.Feb 18 2020, 5:47 AM
erichkeane added a subscriber: lattner.

Update language extensions to document calling convention impact.

This patch should now be completely up to date based on the comments here and from cfe-dev, particularly those by @lattner

Ping! I believe this should be ready for review and does everything requested in the CFEDev conversation.

Added a couple inline comments

clang/docs/ReleaseNotes.rst
62

^useage^usage^ -- please also correct spelling in the code

clang/include/clang/AST/ASTContext.h
1215

Is this comment correct? "with the specified underlying type"

erichkeane marked 2 inline comments as done.Feb 27 2020, 7:53 AM
erichkeane added inline comments.
clang/include/clang/AST/ASTContext.h
1215

probably should be 'signedness', will update.

ABataev added inline comments.
clang/include/clang/AST/Type.h
6196

Use /// style

6197

final class

6225

final class

clang/include/clang/AST/TypeLoc.h
2453–2457

final classes?

clang/lib/AST/ASTContext.cpp
4020

auto *

4038

auto *

5879–5881

No need for braces here.

clang/lib/AST/Type.cpp
2117–2119

Enclose substatement in braces to follow the coding style used for other substatements.

erichkeane marked 8 inline comments as done.

Changes requested by @ABataev

keryell added inline comments.Feb 27 2020, 10:44 AM
clang/lib/AST/ASTContext.cpp
4020

Why no just auto without a * everywhere?

erichkeane marked an inline comment as done.Feb 27 2020, 10:45 AM
erichkeane added inline comments.
clang/lib/AST/ASTContext.cpp
4020

The LLVM coding standard requires it.

aaron.ballman added inline comments.Feb 28 2020, 8:54 AM
clang/include/clang/AST/Type.h
6199–6200

Would it be a bad idea to use a bit-field here to smash these fields together? (I don't want to limit the number of bits we can support, but I think it's nice to keep type objects small.)

6206

We might as well add isSigned() as well and convert some of the !isUnsigned() calls to use it.

6228–6229

Same question here -- we could use a PointerIntPair.

6236

Similar here.

clang/include/clang/Basic/DiagnosticSemaKinds.td
10380

I think we should say "bits" in here to clarify what units the size is measured in.

10382

Same here.

clang/lib/AST/ASTContext.cpp
9290–9293

castAs<> because we already know the expected types?

clang/lib/CodeGen/CGExprScalar.cpp
3705

auto *

clang/lib/Sema/SemaDecl.cpp
14558

this -> This

clang/lib/Sema/SemaTemplateDeduction.cpp
2113

Did you mean to use dyn_cast<> here instead?

2130

Did you mean to use dyn_cast<> here instead?

clang/lib/Sema/SemaType.cpp
2168

Comment doesn't match the code.

2197–2198

Formatting looks slightly off here (the indentation for the second trailing argument looks funky, but maybe that's just Phab).

clang/test/CodeGen/ext-int.cpp
7

I'd appreciate some codegen tests that show how this type works with variadic functions in C and demonstrate that you can pass these values and pull them back out via va_arg.

clang/test/SemaCXX/ext-int.cpp
104

I'd also like a test like:

constexpr int func() { return 42; }
_ExtInt(func()) ThisShouldWorkRight;
207–208

I'd like to see some similar tests for alignof

253

These tests should be in a parsing test rather than in sema.

267

Some additional sema/codegen tests I'd like to see:

  • That this new type works with typeid in C++ (that two types with the same sign and width match, and distinct types don't)
  • The behavior of explicit casts to this type and from this type in both C and C++
  • C test that this behaves properly in a _Generic selection expression with other _ExtInt bit-width and sign types
  • C test that we can properly create VLAs of _ExtInt type
  • C and C++ tests that offsetof behaves properly on structures with these types
erichkeane marked 21 inline comments as done.

Thanks @aaron.ballman for the review! I think this is everything. Note that I disallowed attribute 'mode' since I don't see how it makes sense with these types.

erichkeane added inline comments.Feb 28 2020, 2:10 PM
clang/lib/Sema/SemaType.cpp
2197–2198

Strange... Doesn't show up that way in VIM. I had to copy/paste this review from a file on a linux share out of Notepad++ into the editor because the patch size is larger than the 8MB Phab max, so I can imagine a space or two got lost in that translation :)

My workspace copy is correct however.

erichkeane marked an inline comment as done.Feb 28 2020, 2:10 PM

Ping! My understanding is that WG14 gave a very positive response to this, so I'd love to get this into clang so that we can get more user interaction.

erichkeane edited reviewers, added: eandrews; removed: elizabethandrews.Apr 16 2020, 11:34 AM
mibintc accepted this revision.Apr 16 2020, 12:18 PM
This revision is now accepted and ready to land.Apr 16 2020, 12:18 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 17 2020, 7:33 AM

This change breaks the lldb build with

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {

I would fix it, but I don't know the lldb code well enough.

I am reverting this change shortly.

This change breaks the lldb build with

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {

I would fix it, but I don't know the lldb code well enough.

I am reverting this change shortly.

I think I know how to fix the LLDB code, can you hold off on reverting and let me take a look?

Reverted in a4b88c044980337bb14390be654fe76864aa60ec. Happy to approve an updated change.

This change breaks the lldb build with

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {
        ^

lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch]

switch (qual_type->getTypeClass()) {

I would fix it, but I don't know the lldb code well enough.

I am reverting this change shortly.

Also, can you share the buildbot that failed here? I'd like to keep an eye on it after my fix.

Already relanded here: 5f0903e9bec9

@saugustine I checked the buildbots, every one with LLDB in its name stayed green, and I cannot see the warning in any of their STDIO. Can you point out how I can proactively keep an eye on this? The lldb build is broken out-of-the-box anyway for me due to the LLDB Python Wrapper having a syntax error in the SWIG bindings.

bash-4.2$ swig -copyright

SWIG Version 2.0.10
/workspaces/llvm-project/lldb/bindings/./python/python-typemaps.swig:496: Error: Syntax error in input(3).

I believe this change pushes clang\lib\Sema\SemaTemplateDeduction.cpp over the 16-bit COFF section limit for Windows Debug builds. Could you please resolve it or add /bigobj to the CMakeFile file for MSVC (see clang\lib\CodeGen\CMakeLists.txt). ERROR C1128 Thanks!

I believe this change pushes clang\lib\Sema\SemaTemplateDeduction.cpp over the 16-bit COFF section limit for Windows Debug builds. Could you please resolve it or add /bigobj to the CMakeFile file for MSVC (see clang\lib\CodeGen\CMakeLists.txt). ERROR C1128 Thanks!

Done: https://github.com/llvm/llvm-project/commit/50511a406df4475984e5c51feadada2c92aaf97a

I didn't see that on any of the buildbots, where did you see it?

I didn't see that on any of the buildbots, where did you see it?

Thanks so much! It's funny that you say that because we just had that exact discussion this morning. I saw it in our local builds, and as far as I can tell none of the buildbots are doing a full Debug build in Windows (if there is one can you point me to it?). Maybe something we could add eventually.

An FPGA programmer is hitting this issue from your unit test:

c++
  signed _ExtInt(1) m; // expected-error{{signed _ExtInt must have a bit size of at least 2}}

Why do you not allow a type able to represent {-1, 0}?

An FPGA programmer is hitting this issue from your unit test:

c++
  signed _ExtInt(1) m; // expected-error{{signed _ExtInt must have a bit size of at least 2}}

Why do you not allow a type able to represent {-1, 0}?

Our FPGA team didn't see any value in it, so we didn't propose it to WG14, so we didn't implement it here. I'm sure a follow-up paper to WG14 would be considered, and if it shows promise/positive reception, we'd welcome a patch here.

Why do you not allow a type able to represent {-1, 0}?

Our FPGA team didn't see any value in it, so we didn't propose it to WG14, so we didn't implement it here. I'm sure a follow-up paper to WG14 would be considered, and if it shows promise/positive reception, we'd welcome a patch here.

It comes from an FPGA tool programmer who knows about HLS *and* modern C++, so probably a very rare specie. :-)
The typical FPGA programmer and tool writer is often more focused on C, so knows barely nothing about generic programming.
In the meantime we can probably use C++20 concepts, a C++ wrapping type and some specialization for the 1bit case to handle this negative cases or just have a SYCL-specific fork...
Probably simpler to have a fork than trying to convince the C committee or FPGA people not understanding C++.

Why do you not allow a type able to represent {-1, 0}?

Our FPGA team didn't see any value in it, so we didn't propose it to WG14, so we didn't implement it here. I'm sure a follow-up paper to WG14 would be considered, and if it shows promise/positive reception, we'd welcome a patch here.

It comes from an FPGA tool programmer who knows about HLS *and* modern C++, so probably a very rare specie. :-)
The typical FPGA programmer and tool writer is often more focused on C, so knows barely nothing about generic programming.
In the meantime we can probably use C++20 concepts, a C++ wrapping type and some specialization for the 1bit case to handle this negative cases or just have a SYCL-specific fork...
Probably simpler to have a fork than trying to convince the C committee or FPGA people not understanding C++.

Our FPGA team DOES know HLS and (most of) modern C++, they are the implementers of https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html.

That said, I can see potential value (though the SFINAE to get the unsigned vs signed seems easy enough), but would suggest either the WG14 paper, or to suggest it to the authors of the WG14 paper and see what they say. Feel free to include me in the chain.

An FPGA programmer is hitting this issue from your unit test:

c++
  signed _ExtInt(1) m; // expected-error{{signed _ExtInt must have a bit size of at least 2}}

Why do you not allow a type able to represent {-1, 0}?

At the time this feature was first being designed for WG14, I believe ones complement and sign magnitude integer representations were still allowed, so a signed 1-bit _ExtInt made no sense. That's no longer the case and so WG14 may be more interested in hearing about this if you have a compelling use case.

It comes from an FPGA tool programmer who knows about HLS *and* modern C++, so probably a very rare specie. :-)
The typical FPGA programmer and tool writer is often more focused on C, so knows barely nothing about generic programming.

Our FPGA team DOES know HLS and (most of) modern C++, they are the implementers of https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html.

"Uses untimed ANSI C++ as the golden design source": is this pre ISO C++98? :-)
Perhaps you should put ISO C++ with some version number there.

Kidding aside, the HLS world comes historically from the C world and thus solves problem in the #pragma and __attribute__. And there is some value in this for customers not wanting to dive into C++ and just having minimal change to the code besides these #pragma. But a the same time it is done with a lot of atrocities on the type system when extending this to C++...

That said, I can see potential value (though the SFINAE to get the unsigned vs signed seems easy enough), but would suggest either the WG14 paper, or to suggest it to the authors of the WG14 paper and see what they say. Feel free to include me in the chain.

I am not following at all the WG14 papers. I have enough pain just running after the C++ papers... :-( But I might have a look on this one...