This is an archive of the discontinued LLVM Phabricator instance.

[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
ClosedPublic

Authored by khchen on Jan 19 2021, 6:55 PM.

Details

Summary

Demonstrate how to generate vadd/vfadd intrinsic functions
(explicitly api).

  1. add -gen-riscv-vector-builtins for clang builtins.
  2. add -gen-riscv-vector-builtin-codegen for clang codegen.
  3. add -gen-riscv-vector-header for riscv_vector.h. It also generates

ifdef directives with extension checking, base on D94403.

  1. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.

Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface

  1. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need them
for whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
khchen updated this revision to Diff 318447.Jan 22 2021, 12:56 AM
khchen marked 2 inline comments as done.
  1. do not need to manually define new op in gen-rvv-tests.py.
  2. do not need to manually add new op define in ALL marco.
HsiangKai added inline comments.Jan 24 2021, 9:45 PM
clang/include/clang/Basic/riscv_vector.td
129

mangled_suffix

148

"has a maskedoff operand"

150

Propose to use HasMaskedOffOperand.

162

EMUL according to specification.

192

mangled_suffix

khchen updated this revision to Diff 319000.Jan 25 2021, 7:14 AM
khchen marked 7 inline comments as done.
  1. address @HsiangKai's comments
  2. remove test generator to make td simpler.
  3. remove MangledSuffix, it should be MangledName
clang/include/clang/Basic/riscv_vector.td
162

Here ELMUL means exponental LMUL

khchen edited the summary of this revision. (Show Details)Jan 25 2021, 7:16 AM
jrtc27 added inline comments.Jan 25 2021, 7:30 AM
clang/include/clang/Basic/riscv_vector.td
162

Log2LMUL would be a clearer name and more consistent with how things like this are named elsewhere in LLVM

jrtc27 added inline comments.Jan 25 2021, 7:57 AM
clang/test/CodeGen/RISCV/riscv-rvv-intrinsics-generic/vadd.c
7–8 ↗(On Diff #319000)

This is poor style, but should be unnecessary per my comment below

10 ↗(On Diff #319000)

Asm checks are discouraged in Clang. If you want to check for Clang warnings, use -verify, and in this case you want // expected-no-diagnostics.

38 ↗(On Diff #319000)

Delete

clang/utils/TableGen/RISCVVEmitter.cpp
37

Surely being uninitialised is an error and we expect to always have it specified in the source?

49–56

These are poor names; many of them don't sound like bools, and are some of them not mutually exclusive? If so, an enum would be better.

56

Capitalisation.

57

Capital W.

58

VScale. Or just drop the V and use Scale?

88–96

though at that point the wrappers are a bit unnecessary as they only seem to be used once.

110

Camel-case?

122

Using an enum class for a bitmask doesn't make much sense.

295

Please make sure there's at least 1 space around things

309

or 1? Clearer to move this into the switch below IMO.

316

isPowerOf2_32 from llvm/Support/MathExtras.h

381

Move into the !Float above

387

Move into the !Float above

513

Capital T

694

Decl?

715

This is odd, why not OS << " return " + getName() + "("; or OS << " return " << getName() << "(";

778

Personally not a fan of omitting the curly braces in cases like this, it's non-obvious and more error-prone than when you don't have nesting.

khchen updated this revision to Diff 319488.Jan 27 2021, 12:56 AM
khchen marked 21 inline comments as done.
  1. address @jrtc27's comments. I really appreciate your help very much.
  2. use downstream test generator and move all tests to rvv-intrinsics-generic and rvv-intrinsics.
khchen added inline comments.Jan 27 2021, 1:37 AM
clang/test/CodeGen/RISCV/riscv-rvv-intrinsics-generic/vadd.c
10 ↗(On Diff #319000)

RVV is the scalable vector type similar to SVE, so I added this check.
please see https://reviews.llvm.org/D82943.

clang/utils/TableGen/RISCVVEmitter.cpp
49–56

Those variables are used to descript the property of RVVType, I think maybe rename as IsXXX could become more clear.
ps. I implement RVVType similar to SveType did.
Do you mean only mutually exclusive property should be represented in an enum?

309

This checks illegal type float8_t .

Jim added inline comments.Feb 8 2021, 10:14 PM
clang/include/clang/Basic/BuiltinsRISCV.def
2

BUILTIN is undefined here. I am not sure how to add new builtins for other extension.

clang/include/clang/Basic/riscv_vector.td
189

Add space before "_mask"

191

Add space before "_"

Jim added inline comments.Feb 8 2021, 10:59 PM
llvm/docs/CommandGuide/tblgen.rst
141

It is typo fix. Could you fix it in a separate patch?

khchen updated this revision to Diff 322297.Feb 8 2021, 11:18 PM
khchen marked 3 inline comments as done.
  1. address Jim's comment.
  2. remove suffix _vl according by https://github.com/riscv/rvv-intrinsic-doc/pull/64
khchen added inline comments.Feb 8 2021, 11:19 PM
clang/include/clang/Basic/BuiltinsRISCV.def
2

You could reference https://reviews.llvm.org/D93446 to see who define the BUILTIN .
I think how to support other extension depends on its implementation, it could be including other .inc or adding new builtins with BUILTIN macro (https://github.com/llvm-project/clang/blob/master/include/clang/Basic/Builtins.def#L110)

khchen marked an inline comment as done.Feb 8 2021, 11:48 PM
Jim added inline comments.Feb 9 2021, 12:39 AM
clang/include/clang/Basic/BuiltinsRISCV.def
2

If a new BUILTIN is added here, I met BUILTIN macro undefined error.

khchen updated this revision to Diff 324197.Feb 16 2021, 11:18 PM

remove float32_t and float64_t to avoid collisions with other project.

Jim added a reviewer: Jim.Feb 17 2021, 12:06 AM
craig.topper added inline comments.Feb 23 2021, 5:08 PM
clang/include/clang/Basic/riscv_vector.td
149

Isn't mask the first operand maskedoff the second operand?

clang/utils/TableGen/RISCVVEmitter.cpp
158

Return by const reference or use StringRef.

176

Why not just use std::string in the one place this is used?

863

Might be better to just sort the vector by the IR name and then walk the vector looking for the boundaries where the name changes from the previous iteration.

900

I think this is something like

while (!Prototypes.empty()) {
 auto Idx = Prototypes.find_first_of(Primaries);
 assert(Idx != StringRef::npos);
 ProtoSeq.push_back(Prototypes.slice(0, Idx+1).str());
 Prototypes = Prototypes.drop_front(Idx+1);
}

Which might be easier to understand.

971

Why does this need to explicitly create an Optional? Shouldn't we just be able to return &(It->second)?

991

Could we maybe just sort the original vector by extension and just loop over it. Keep track of the extensions from previous iteration and emit the guard when this iteration doesn't match the previous iteration.

1013

This is only called in one place which immediate loops over and prints the contents. Could we just pass the raw_ostream in here and do the printing directly?

1016

Even if we don't print directly. This can be a vector of StringRef. These are all string literals.

1017

I don't understand this. It says D implies F but we're checking for F. So that seems like F implies D.

craig.topper added inline comments.Feb 23 2021, 8:10 PM
clang/utils/TableGen/RISCVVEmitter.cpp
568

Can we do Transformer = Transformer.drop_back() right before this loop. That take_front code is harder to think about.

khchen updated this revision to Diff 326104.Feb 24 2021, 8:22 AM
khchen marked 10 inline comments as done.
  1. Rebase
  2. Address Craig's comments.
  3. Change the operand orders of builtin to the same order of IR intrinsics.
clang/utils/TableGen/RISCVVEmitter.cpp
900

Thanks, it's clear.

1017

Remove extension implying rule.
I thought we don't need to consider implying rule because it's clang's responsibility.
The original thinking was when predecessor "only" defines __riscv_d, and floating instruction also need to supported. But in fact, when enabling __riscv_d predecessor will also define __riscv_f.

craig.topper added inline comments.Feb 24 2021, 3:22 PM
clang/utils/TableGen/RISCVVEmitter.cpp
711

Can't we just print OutputType->type_str() and " " to OS separately? We shouldn't need to concat them into a Twine first.

715

Why is this > 1 and not >= 1 or !CTypeOrder.empty()?

721

Replace the + operators with <<

723

Same here, why is this >1 and not >=1 or !CTypeOrder.empty()?

750

Looks like other headers use 2 underscores at the beginning of their include guard.

758

Can we just #endif here instead of the #else? If the error is emitted the preprocessor should stop and not process the rest of the file. Then we don't need to close it at the bottom of the file.

768

I'd recommend calling this printType. dump made me think it was printing for debug like the dump() functions found in many LLVM classes.

800

I think we only need to check __riscv_f here?

837

I don't think this StringRef constructor call is needed.

850

We probably want 2 newlines on the end of this so there will be a blank line before the first RISCVV_BUILTIN.

874

Can we remember the PrevIRName StringRef instead?

1028

I think you can use ListSeparator for this. It keeps track of the separator string and whether the first item has been printed. It's most often used with loops, but it should work here. I think there are many examples uses in llvm/utils/TableGen

craig.topper added inline comments.Feb 24 2021, 3:22 PM
clang/utils/TableGen/RISCVVEmitter.cpp
71

These method names should use CamelCase and start with "get"

103

These should use CamelCase per llvm coding style.

Might be better named init*Str instead of compute. compute makes me think they are going to return something, but that might just be me.

268

You can initialize the bools to false with " = false" where they are declared in the class body then you don't need to mention them all here. Similar for Scale and ElementBitWidth.

271

Why is ElementBitwidth default ~0. Wouldn't 0 also be an invalid value?

334

You drop the else since the if above returned.

619

I don't think we need to go through Twine here. We should be able to call str() directly on first.

622

!Suffix.empty()

641

CTypeOrder.resize(InputTypes.size());
std::iota(CTypeOrder.begin(), CTypeOrder.end(), 0);

884

Defs.back()->emitCodeGenSwitchBody(OS);

khchen updated this revision to Diff 326300.Feb 24 2021, 11:41 PM
khchen marked 21 inline comments as done.
  1. Rename Dump to Print.
  2. Address Craig's comments, thanks for your patient.
clang/utils/TableGen/RISCVVEmitter.cpp
271

I followed SVE did, but you are right, 0 is better.

715

Thanks, it's bug when I refactor code from output input vector to input only vector...

758

Good point. Thanks. I should think more when I copied code from SVE.

874

I don't think so. We need PrevDef to emitCodeGenSwitchBody.

1028

thanks, it's more elegant.

craig.topper added inline comments.Feb 24 2021, 11:57 PM
clang/utils/TableGen/RISCVVEmitter.cpp
874

Of course we do. Sorry about that.

khchen updated this revision to Diff 326309.Feb 25 2021, 12:14 AM

refine comment

craig.topper added inline comments.Feb 25 2021, 8:51 AM
clang/utils/TableGen/RISCVVEmitter.cpp
679

We don't need Twine here

685

Replace + with << and drop utostr

692

Drop Twine

694

Drop Twine

701

Drop Twine

711

Replace + with <<

714

Drop Twine

790

This comment looks incomplete

873

Should we sink the printing of break; into emitCodeGenSwitchBody so its not repeated after the other call to emitCodeGenSwitchBody?

khchen updated this revision to Diff 326585.Feb 25 2021, 7:52 PM
khchen marked 10 inline comments as done.
  1. address Craig's comments.
  2. use ListSeparator in some code snippet.
craig.topper accepted this revision.Feb 26 2021, 11:04 AM

LGTM

clang/utils/TableGen/RISCVVEmitter.cpp
909

Buitlin->Builtin

This revision is now accepted and ready to land.Feb 26 2021, 11:04 AM
jrtc27 added inline comments.Feb 26 2021, 11:07 AM
clang/include/clang/Basic/riscv_vector.td
57

I just find it really obfuscates things when we have all these magic character sequences.

67

Then fix the way you define these? This is just bad design IMO.

craig.topper added inline comments.Feb 26 2021, 11:23 AM
clang/include/clang/Basic/riscv_vector.td
67

For each signature there is effectively a single key type that is a vector. The type transformer is a list of rules for how to derive all of the other operands from that one key type. Conceptually similar to LLVMScalarOrSameVectorWidth or LLVMHalfElementsVectorType in Intrinsics.td. Some types are fixed and don't vary by the key type. Like the size_t vl operand or a store intrinsic returning void. There is no separate place to put a base type.

jrtc27 added inline comments.Feb 26 2021, 11:24 AM
clang/utils/TableGen/RISCVVEmitter.cpp
117

Zfh

242

Please be consistent and use Log2 rather than Exp

260–261

That's not how multiplication works. This is exponentiation. Multiplication would be Log2LMul + log2(RHS). Please don't abuse operators like this.

326

This really needs to be an enum not a bunch of mutually-exclusive booleans, which I though I suggested in the past?

451–452

Please try and avoid wrapping code across lines

568

Or would it be better as a pop_back in the switch above?

625

Blank line

686

This is missing indentation?

871

Needs indentation?

craig.topper requested changes to this revision.Feb 26 2021, 11:27 AM

Dropping my approval pending @jrtc27 comments.

This revision now requires changes to proceed.Feb 26 2021, 11:27 AM
craig.topper added inline comments.Feb 26 2021, 11:29 AM
clang/utils/TableGen/RISCVVEmitter.cpp
568

I don't think StringRef has a pop_back.

jrtc27 added inline comments.Feb 26 2021, 11:31 AM
clang/utils/TableGen/RISCVVEmitter.cpp
568

Right, and it's an immutable data structure anyway so you'd need to have to deal with two return values. Never mind then.

craig.topper added inline comments.Feb 26 2021, 11:34 AM
clang/utils/TableGen/RISCVVEmitter.cpp
260–261

This seems like it must be broken, but since we don't do widening or narrowing in this patch we didn't notice?

jrtc27 added inline comments.Feb 26 2021, 11:46 AM
clang/include/clang/Basic/riscv_vector.td
67

Oh I see, I hadn't appreciated that TypeRange got split up and each character was a whole separate intrinsic, I had misinterpreted it as it being a list of the types of arguments, i.e. an N-character string for an (N-1)-argument (plus return type) intrinsic that you could then use as a base and apply transformations too (e.g. "if" for a float-to-int intrinsic, with "vv" etc giving you a vectorised versions) and so was confused as to why the void/size_t/ptrdiff_t/uint8_t/bool-ness couldn't just be pushed into the TypeRange and the corresponding transforms left as "e". But, how _would_ you define vector float-to-int (and back) conversions with this scheme then?

On a related note, I feel one way to make this less obfuscated is change v/w/q/o to be v1/v2/v4/v8 (maybe with the 1 being optional, don't really care), and is also more extensible in future rather than ending up with yet more alphabet soup, though it does change the parsing from being a list of characters to a list of strings.

craig.topper added inline comments.Feb 26 2021, 12:07 PM
clang/include/clang/Basic/riscv_vector.td
67

I think these transforms are used for the float-to-int and int-to-float.

I: given a vector type, compute the vector type with integer type
elements of the same width
F: given a vector type, compute the vector type with floating-point type
elements of the same width

The float and integer types for conversion are always the same size or one lmul up or one lmul down. So combining I and F with v or w should cover it.

khchen updated this revision to Diff 326980.Feb 28 2021, 9:50 AM
khchen marked 17 inline comments as done.
  1. address @jrtc27's suggestions, thanks.
  2. fix several bugs.
clang/include/clang/Basic/riscv_vector.td
57

Sorry, what do you mean?

67

Yes, thanks for Craig's comments, I and U are used to implement conversion.

void/size_t/ptrdiff_t/uint8_t are not related to basic type so it's why they are no transformed from transforms left as e.

On a related note, I feel one way to make this less obfuscated is change v/w/q/o to be v1/v2/v4/v8 (maybe with the 1 being optional, don't really care), and is also more extensible in future rather than ending up with yet more alphabet soup, though it does change the parsing from being a list of characters to a list of strings.

In the downstream we define a "complex" transformer which is not included in this patch. It uses a string and encode additional information for some special type, like indexed operand of indexed load/store (its type is using EEW encoded in the instruction with EMUL=(EEW/SEW)*LMUL).
I have considered to use list of string, but personally I still prefer to use a list of characters because it's not often to use "complex" transformer and overall looking at the riscv_vector.td I feel current prototype is more readable and similar to intrinsic Builtins.
Could we postpone the 'list of char' or 'list of string' decision in the future indexed load/store patch?

clang/utils/TableGen/RISCVVEmitter.cpp
260–261

Yes, thanks for point out. In my original plan is fixing that in followup patches.
I also add more bug fixes into this patch.

871

I think switch case statement in *_cg.inc does not need indentation like arm_cde_builtin_cg.inc did.

jrtc27 added inline comments.Feb 28 2021, 10:46 AM
clang/utils/TableGen/RISCVVEmitter.cpp
51

Signed

59–63

This isn't expressive enough for the grammar you defined. PCPCec is supposed to give const i8 * const i8 *, whereas this will interpret it as const i8 *. Given such types are presumably not needed you need to tighten the rules of your grammar.

128

No underscores in names

174

No underscores in names

260–261

Probably worth adding an an assert(isPowerOf2_32(RHS)); too

355–357

Combine the two switch statements

420

Combine this with the switch

442

Combine this with the switch

453–468

This should be able to be tidied up so there's only one switch

628

This looks wrong

khchen updated this revision to Diff 327363.Mar 1 2021, 11:37 PM
khchen marked 11 inline comments as done.

address @jrtc27's comments, thanks!

craig.topper added a comment.EditedMar 2 2021, 8:40 AM
This comment has been deleted.
clang/utils/TableGen/RISCVVEmitter.cpp
59–63

@jrtc, are you asking for RVVType::applyModifier to verify that that C doesn't appear twice for example?

craig.topper added inline comments.Mar 3 2021, 4:31 PM
clang/utils/TableGen/RISCVVEmitter.cpp
59–63

Oops I meant to write @jrtc27 above. Are you asking for RVVType::applyModifier to verify that that C doesn't appear twice for example?

khchen updated this revision to Diff 328533.Mar 5 2021, 7:50 AM

rebase.

LGTM. @jrtc27 are you ok with this?

khchen added a comment.Mar 9 2021, 8:40 AM

@jrtc27, please advise if there is anything more should to be changed, thanks.

craig.topper accepted this revision.Mar 10 2021, 1:35 PM

LGTM. I think we can address any remaining issues post-commit. I'd like to see us start adding the intrinsics that use this.

This revision is now accepted and ready to land.Mar 10 2021, 1:35 PM
jrtc27 added inline comments.Mar 10 2021, 1:38 PM
clang/utils/TableGen/RISCVVEmitter.cpp
59–63

That P and C don't appear twice, and that C appears in the "right" order wrt P (i.e. it's always PC never CP).

This revision was landed with ongoing or failed builds.Mar 10 2021, 6:44 PM
This revision was automatically updated to reflect the committed changes.
clang/test/CodeGen/RISCV/rvv-intrinsics/vfadd.c