Page MenuHomePhabricator

[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics.
Needs RevisionPublic

Authored by khchen on Jan 19 2021, 6:55 PM.

Details

Summary

Demonstrate how to generate vadd/vfadd intrinsic functions
(explicitly api).

  1. add -gen-riscv-vector-builtins for clang builtins.
  2. add -gen-riscv-vector-builtin-codegen for clang codegen.
  3. add -gen-riscv-vector-header for riscv_vector.h. It also generates

ifdef directives with extension checking, base on D94403.

  1. add -gen-riscv-vector-generic-header for riscv_vector_generic.h.

Generate overloading version Header for generic api.
https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface

  1. update tblgen doc for riscv related options.

riscv_vector.td also defines some unused type transformers for vadd,
because I think it could demonstrate how tranfer type work and we need them
for whole intrinsic functions implementation in the future.

Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com>
Co-Authored-by: Zakk Chen <zakk.chen@sifive.com>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Jim added inline comments.Tue, Feb 9, 12:39 AM
clang/include/clang/Basic/BuiltinsRISCV.def
15

If a new BUILTIN is added here, I met BUILTIN macro undefined error.

khchen updated this revision to Diff 324197.Tue, Feb 16, 11:18 PM

remove float32_t and float64_t to avoid collisions with other project.

Jim added a reviewer: Jim.Wed, Feb 17, 12:06 AM
craig.topper added inline comments.Tue, Feb 23, 5:08 PM
clang/include/clang/Basic/riscv_vector.td
149

Isn't mask the first operand maskedoff the second operand?

clang/utils/TableGen/RISCVVEmitter.cpp
158

Return by const reference or use StringRef.

176

Why not just use std::string in the one place this is used?

863

Might be better to just sort the vector by the IR name and then walk the vector looking for the boundaries where the name changes from the previous iteration.

900

I think this is something like

while (!Prototypes.empty()) {
 auto Idx = Prototypes.find_first_of(Primaries);
 assert(Idx != StringRef::npos);
 ProtoSeq.push_back(Prototypes.slice(0, Idx+1).str());
 Prototypes = Prototypes.drop_front(Idx+1);
}

Which might be easier to understand.

971

Why does this need to explicitly create an Optional? Shouldn't we just be able to return &(It->second)?

991

Could we maybe just sort the original vector by extension and just loop over it. Keep track of the extensions from previous iteration and emit the guard when this iteration doesn't match the previous iteration.

1013

This is only called in one place which immediate loops over and prints the contents. Could we just pass the raw_ostream in here and do the printing directly?

1016

Even if we don't print directly. This can be a vector of StringRef. These are all string literals.

1017

I don't understand this. It says D implies F but we're checking for F. So that seems like F implies D.

craig.topper added inline comments.Tue, Feb 23, 8:10 PM
clang/utils/TableGen/RISCVVEmitter.cpp
568

Can we do Transformer = Transformer.drop_back() right before this loop. That take_front code is harder to think about.

khchen updated this revision to Diff 326104.Wed, Feb 24, 8:22 AM
khchen marked 10 inline comments as done.
  1. Rebase
  2. Address Craig's comments.
  3. Change the operand orders of builtin to the same order of IR intrinsics.
clang/utils/TableGen/RISCVVEmitter.cpp
900

Thanks, it's clear.

1017

Remove extension implying rule.
I thought we don't need to consider implying rule because it's clang's responsibility.
The original thinking was when predecessor "only" defines __riscv_d, and floating instruction also need to supported. But in fact, when enabling __riscv_d predecessor will also define __riscv_f.

craig.topper added inline comments.Wed, Feb 24, 3:22 PM
clang/utils/TableGen/RISCVVEmitter.cpp
71

These method names should use CamelCase and start with "get"

103

These should use CamelCase per llvm coding style.

Might be better named init*Str instead of compute. compute makes me think they are going to return something, but that might just be me.

268

You can initialize the bools to false with " = false" where they are declared in the class body then you don't need to mention them all here. Similar for Scale and ElementBitWidth.

271

Why is ElementBitwidth default ~0. Wouldn't 0 also be an invalid value?

334

You drop the else since the if above returned.

619

I don't think we need to go through Twine here. We should be able to call str() directly on first.

622

!Suffix.empty()

641

CTypeOrder.resize(InputTypes.size());
std::iota(CTypeOrder.begin(), CTypeOrder.end(), 0);

711

Can't we just print OutputType->type_str() and " " to OS separately? We shouldn't need to concat them into a Twine first.

715

Why is this > 1 and not >= 1 or !CTypeOrder.empty()?

721

Replace the + operators with <<

723

Same here, why is this >1 and not >=1 or !CTypeOrder.empty()?

750

Looks like other headers use 2 underscores at the beginning of their include guard.

758

Can we just #endif here instead of the #else? If the error is emitted the preprocessor should stop and not process the rest of the file. Then we don't need to close it at the bottom of the file.

768

I'd recommend calling this printType. dump made me think it was printing for debug like the dump() functions found in many LLVM classes.

800

I think we only need to check __riscv_f here?

837

I don't think this StringRef constructor call is needed.

850

We probably want 2 newlines on the end of this so there will be a blank line before the first RISCVV_BUILTIN.

874

Can we remember the PrevIRName StringRef instead?

884

Defs.back()->emitCodeGenSwitchBody(OS);

1028

I think you can use ListSeparator for this. It keeps track of the separator string and whether the first item has been printed. It's most often used with loops, but it should work here. I think there are many examples uses in llvm/utils/TableGen

khchen updated this revision to Diff 326300.Wed, Feb 24, 11:41 PM
khchen marked 21 inline comments as done.
  1. Rename Dump to Print.
  2. Address Craig's comments, thanks for your patient.
clang/utils/TableGen/RISCVVEmitter.cpp
271

I followed SVE did, but you are right, 0 is better.

715

Thanks, it's bug when I refactor code from output input vector to input only vector...

758

Good point. Thanks. I should think more when I copied code from SVE.

874

I don't think so. We need PrevDef to emitCodeGenSwitchBody.

1028

thanks, it's more elegant.

craig.topper added inline comments.Wed, Feb 24, 11:57 PM
clang/utils/TableGen/RISCVVEmitter.cpp
874

Of course we do. Sorry about that.

khchen updated this revision to Diff 326309.Thu, Feb 25, 12:14 AM

refine comment

craig.topper added inline comments.Thu, Feb 25, 8:51 AM
clang/utils/TableGen/RISCVVEmitter.cpp
679

We don't need Twine here

685

Replace + with << and drop utostr

692

Drop Twine

694

Drop Twine

701

Drop Twine

711

Replace + with <<

714

Drop Twine

790

This comment looks incomplete

873

Should we sink the printing of break; into emitCodeGenSwitchBody so its not repeated after the other call to emitCodeGenSwitchBody?

khchen updated this revision to Diff 326585.Thu, Feb 25, 7:52 PM
khchen marked 10 inline comments as done.
  1. address Craig's comments.
  2. use ListSeparator in some code snippet.
craig.topper accepted this revision.Fri, Feb 26, 11:04 AM

LGTM

clang/utils/TableGen/RISCVVEmitter.cpp
908

Buitlin->Builtin

This revision is now accepted and ready to land.Fri, Feb 26, 11:04 AM
jrtc27 added inline comments.Fri, Feb 26, 11:07 AM
clang/include/clang/Basic/riscv_vector.td
57

I just find it really obfuscates things when we have all these magic character sequences.

67

Then fix the way you define these? This is just bad design IMO.

craig.topper added inline comments.Fri, Feb 26, 11:23 AM
clang/include/clang/Basic/riscv_vector.td
67

For each signature there is effectively a single key type that is a vector. The type transformer is a list of rules for how to derive all of the other operands from that one key type. Conceptually similar to LLVMScalarOrSameVectorWidth or LLVMHalfElementsVectorType in Intrinsics.td. Some types are fixed and don't vary by the key type. Like the size_t vl operand or a store intrinsic returning void. There is no separate place to put a base type.

jrtc27 added inline comments.Fri, Feb 26, 11:24 AM
clang/utils/TableGen/RISCVVEmitter.cpp
116

Zfh

241

Please be consistent and use Log2 rather than Exp

259–260

That's not how multiplication works. This is exponentiation. Multiplication would be Log2LMul + log2(RHS). Please don't abuse operators like this.

325

This really needs to be an enum not a bunch of mutually-exclusive booleans, which I though I suggested in the past?

450–451

Please try and avoid wrapping code across lines

568

Or would it be better as a pop_back in the switch above?

624

Blank line

685

This is missing indentation?

870

Needs indentation?

craig.topper requested changes to this revision.Fri, Feb 26, 11:27 AM

Dropping my approval pending @jrtc27 comments.

This revision now requires changes to proceed.Fri, Feb 26, 11:27 AM
craig.topper added inline comments.Fri, Feb 26, 11:29 AM
clang/utils/TableGen/RISCVVEmitter.cpp
568

I don't think StringRef has a pop_back.

jrtc27 added inline comments.Fri, Feb 26, 11:31 AM
clang/utils/TableGen/RISCVVEmitter.cpp
568

Right, and it's an immutable data structure anyway so you'd need to have to deal with two return values. Never mind then.

craig.topper added inline comments.Fri, Feb 26, 11:34 AM
clang/utils/TableGen/RISCVVEmitter.cpp
259–260

This seems like it must be broken, but since we don't do widening or narrowing in this patch we didn't notice?

jrtc27 added inline comments.Fri, Feb 26, 11:46 AM
clang/include/clang/Basic/riscv_vector.td
67

Oh I see, I hadn't appreciated that TypeRange got split up and each character was a whole separate intrinsic, I had misinterpreted it as it being a list of the types of arguments, i.e. an N-character string for an (N-1)-argument (plus return type) intrinsic that you could then use as a base and apply transformations too (e.g. "if" for a float-to-int intrinsic, with "vv" etc giving you a vectorised versions) and so was confused as to why the void/size_t/ptrdiff_t/uint8_t/bool-ness couldn't just be pushed into the TypeRange and the corresponding transforms left as "e". But, how _would_ you define vector float-to-int (and back) conversions with this scheme then?

On a related note, I feel one way to make this less obfuscated is change v/w/q/o to be v1/v2/v4/v8 (maybe with the 1 being optional, don't really care), and is also more extensible in future rather than ending up with yet more alphabet soup, though it does change the parsing from being a list of characters to a list of strings.

craig.topper added inline comments.Fri, Feb 26, 12:07 PM
clang/include/clang/Basic/riscv_vector.td
67

I think these transforms are used for the float-to-int and int-to-float.

I: given a vector type, compute the vector type with integer type
elements of the same width
F: given a vector type, compute the vector type with floating-point type
elements of the same width

The float and integer types for conversion are always the same size or one lmul up or one lmul down. So combining I and F with v or w should cover it.