This is an archive of the discontinued LLVM Phabricator instance.

[SVFS] Vector Function ABI name demangler.
ClosedPublic

Authored by fpetrogalli on Aug 9 2019, 12:41 PM.

Details

Summary

This patch implements the demangling functionality as described in the
Vector Function ABI. This patch will be used to implement the
SearchVectorFunctionSystem(SVFS) as described in the RFC:

http://lists.llvm.org/pipermail/llvm-dev/2019-June/133484.html

A fuzzer is added to test the demangling utility.

Diff Detail

Event Timeline

aranisumedh created this revision.Aug 9 2019, 12:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 9 2019, 12:41 PM

2 comments:

  • It will be a good idea to add a fuzzer target for VFABI::getVectorName() specifically from the get go.
  • This is likely a dubious question, but is this mangling compatible with other implementations, e.g. GCC?

@lebedev.ri

  1. Could you elaborate about the fuzzer target? The VFABI::getVectorName() extracts the custom redirection to a user-specified function that can be described using #pragma declare variant
  2. If I understood you correctly, yes, this name mangling format is as specified in the ABI. It is also supported by other implementations such as GCC. The following link is a demonstration of the same which uses GCC compiler. [https://godbolt.org/z/PrGLn0]

@lebedev.ri

  1. Could you elaborate about the fuzzer target?

All the parsing code is always extremely prone to subtle bugs.
At least it should not crash on them.
That would be easy to ensure via a fuzz target.

The VFABI::getVectorName() extracts the custom redirection to a user-specified function that can be described using #pragma declare variant

  1. If I understood you correctly, yes, this name mangling format is as specified in the ABI. It is also supported by other implementations such as GCC. The following link is a demonstration of the same which uses GCC compiler. [https://godbolt.org/z/PrGLn0]

Great, just checking.

All the parsing code is always extremely prone to subtle bugs.
At least it should not crash on them.
That would be easy to ensure via a fuzz target.

Sorry for being naive here, but what is a fuzz target? I have also added several test cases to test for known corner cases.

All the parsing code is always extremely prone to subtle bugs.
At least it should not crash on them.
That would be easy to ensure via a fuzz target.

Sorry for being naive here, but what is a fuzz target?

Oh, i should have clarified.
See e.g. llvm/tools/llvm-dwarfdump/fuzzer/llvm-dwarfdump-fuzzer.cpp, llvm/tools/*-fuzzer.

I have also added several test cases to test for known corner cases.

llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

All these are currently in llvm namespace, would it be better to narrow it and put them into VFABI?

aranisumedh added inline comments.Aug 12 2019, 9:54 AM
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

Someone correct me if I am wrong.

This module is intended to be extensible to other vectorization as well (as stated in the RFC). I am assuming that it can be extended to other ParameterKind as well.

Whereas the name demangling for now is as stated in the vector function ABI and hence a part of the VFABI namespace.

Thanks for splitting this functionality off into a separate patch!
I added some comments inline, but my main points are:

  • that it would be better to parse the mangled string incrementally, rather than extracting each feature from the string individually.
  • if you change this patch to parse incrementally, interfaces such as getIsMasked or getISA should probably be private methods to VectorFunctionShape, and be renamed to parseIsMasked and parseISA.
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
150

I don't think we want to expose any of these interfaces directly, as they are only relevant to the parsing of the mangled name vis-a-vis the construction of the VectorFunctionShape object. After it has been parsed, the VectorFunctionShape object can be queried directly for these properties.

If you make them private methods to VectorFunctionShape, then a constructor of VectorFunctionShape can take StringRef MangledName and parse the string to fill out the object, calling all these interfaces under the hood. For this reason, I think you'll want to move demangleName and getParameters from D66025 to this patch as well.

I'd also recommend renaming these functions to use 'parse' instead of 'get', and instead of returning the value that they have parsed directly, you can wrap it in a Optional<>, so you can check whether an issue occurred while parsing the mangled name.

llvm/lib/Analysis/SearchVectorFunctionSystem.cpp
53

If you implement these methods as 'parse' methods, you can simplify the code by just updating the StringRef for MangledName. e.g.

// note that MangledName is now a reference, so when the value
// is parsed it is sliced off from the name.
Optional<bool> parseMask(StringRef &MangledName) { 
  bool Mask;
  switch (MangledName[0]) {
  case 'N':
    Mask = false;
    break;
  case 'M' :
    Mask = true;
    break;
  default:
    // parse failure, return nothing (caller can choose to
    // error or recover). MangledName is not updated.
    return {};
  }
  MangledName = MangledName.drop_front(1);
  return {Mask};
}

for parseVF, you can do something along the lines of:

Optional<unsigned> parseVF(StringRef &MangledName) {
  // Get the string containing the VF
  StringRef VFString = MangledName.take_until([](char c) { return ::isdigit(c); });

  unsigned VF;
  if (VFString.consumeInteger(10, VF))
    return {}; // Could not parse as integer

  // Successfully parsed, so update MangledName.
  MangledName.drop_front(VFString.size());
  return {VF};
}
llvm/unittests/Analysis/SearchVectorFunctionSystemTest.cpp
1 ↗(On Diff #214419)

Do you want to rename this test to something like VectorFunctionABITest.cpp ?

jdoerfert added inline comments.Aug 12 2019, 1:04 PM
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

I'm with @lebedev.ri, ParameterKind and ParamType are such generic names that one can easily see errors in the future of someone "seeing" that definition but not the intended one. What is the problem with wrapping this in the VFABI namespace even if we use it in other contexts?

aranisumedh marked an inline comment as done.Aug 13 2019, 8:58 AM
aranisumedh added inline comments.
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

I see the point you're trying to make. I'll wrap the ParameterKind and ParamType in the same namespace as the rest.

150

I'd also recommend renaming these functions to use 'parse' instead of 'get',

Will do.

and instead of returning the value that they have parsed directly, you can wrap it in a Optional<>, so you can check whether an issue occurred while parsing the mangled name.

I have used asserts for error handling. Also, vector function ABI doesn't allow any fields to be optional. Any mismatch is a failure.

aranisumedh updated this revision to Diff 216773.EditedAug 22 2019, 11:31 PM

Thank you all for valuable comments. I have made the changes as requested so far.

Still trying to figure out the *-fuzzer part. Will add support for same soon.

aranisumedh marked 6 inline comments as done.Aug 22 2019, 11:33 PM

Had a minor bug in the previous patch. Updated it.

Minor Update. Cosmetic changes.

fpetrogalli commandeered this revision.Aug 28 2019, 1:54 PM
fpetrogalli added a reviewer: aranisumedh.

Hi all,

thank you for the reviews so far.

I am picking up this change-set to continue working on it. Thank you @aranisumedh for your work!

Question: what is the fuzzer supposed to do? Generate random strings to be parsed by VABI::getVectorName?

Francesco

fpetrogalli marked an inline comment as done.Aug 28 2019, 2:02 PM
fpetrogalli added inline comments.
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

@lebedev.ri, @jdoerfert I agree that the names ParameterKind and ParamType are too generic to be exposed in the llvm namespace, but I think that moving them in the VABI namespace might not be the right thing to do. These enum and structure are supposed to support other types of parameters of vector function that might not be OpenMP specific or directly related to a vector function ABI specification. This is one of the goals of the RFC submitted for the SVFS.

Therefore I suggest to move them back to the llvm namespace, with names that will make it clar that we are talking about function vectorization.

How about the following?

ParameterKind -> VectorFunctionParameterKind or VFParameterKind
ParamType -> VectorFunctionParameter or VFParameter (we might want get rid of the Type token too)

Does that make sense to you?

fpetrogalli retitled this revision from Name Demangling as specified in the Vector Function ABI to [SVFS] Search vector function system..
fpetrogalli edited the summary of this revision. (Show Details)

I have uploaded a new version with the fuzzer requested by @lebedev.ri .

fpetrogalli retitled this revision from [SVFS] Search vector function system. to [SVFS] Vector Function ABI demangler..Aug 30 2019, 11:33 AM
fpetrogalli retitled this revision from [SVFS] Vector Function ABI demangler. to [SVFS] Vector Function ABI name demangler..

Cosmetic changes to a comment.

jdoerfert added inline comments.Sep 3 2019, 10:05 AM
llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

How about the following?

ParameterKind -> VectorFunctionParameterKind or VFParameterKind
ParamType -> VectorFunctionParameter or VFParameter (we might want get rid of the Type > token too)

Does that make sense to you?

That is fine with me (both the VectorFunction or the VF version). It would be nice and consistent to put VF (or VectorFunction) in front of all these names (ParameterKind, ISAKind, ParamType) as done for VectorFunctionShape.

Sorry, meant to reply, but lost track.

llvm/include/llvm/Analysis/SearchVectorFunctionSystem.h
30

I think that would be good-enough.

141

files should end with newlines

llvm/tools/vfabi-demangle-fuzzer/vfabi-demangler-fuzzer.cpp
20

Question: does getVectorName() run *all* of the parsing code in SearchVectorFunctionSystem.cpp?

24–27

Yes, this should be ok i suppose.

I have updated the code according to the last round of reviews. I have removed pieces of code that were not needed anymore for testing.

The parser now is split from the VFParameter class (the old ParamType class).

fpetrogalli marked 6 inline comments as done.Sep 5 2019, 3:24 PM

Hi @sdesmalen:

it would be better to parse the mangled string incrementally, rather than extracting each feature from the string individually.

I think the parser now does what you ask for the "variable length" part of the mangled string, which is the one that holds the <vlen> and the <parameters>.

Francesco

llvm/tools/vfabi-demangle-fuzzer/vfabi-demangler-fuzzer.cpp
20

Good catch. Now it does.

fpetrogalli marked an inline comment as done.

I have done the following updates:

  1. Added demangling and tests for uniform parameters ("u").
  2. Added more docs, with references to the Vector Function ABI specs for x86 and AArch64.
  3. Use default constructors for the structs introduced in the patch.
  4. Added a test that checks that the demangling functions for the parameters is the same for the vector extensions of both AArch64 and x86.

Not a review, but some thoughts nonetheless.

llvm/include/llvm/Analysis/VectorUtils.h
64–66 ↗(On Diff #219420)

have you considered std::tie() ? Not sure if it would work/help here.

71 ↗(On Diff #219420)

Documentation commends have 3x /

75–82 ↗(On Diff #219420)

VectorVariant

But, i think this function should return llvm::Optional<Some Struct>, or error_or maybe

104 ↗(On Diff #219420)

This is unreachable, right? Else i think this might need better error propagation strategy.

164 ↗(On Diff #219420)

same, and elsewhere.

fpetrogalli marked 5 inline comments as done.
fpetrogalli added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
75–82 ↗(On Diff #219420)

I have changed the signature to use a single input parameter. The method is still returning a bool.

I have used optional in the static method that creates the VFShape instance.

104 ↗(On Diff #219420)

I have replace the output type with Optional<VFParamKind> and moved the llvm_unreachable at the end of the function.

164 ↗(On Diff #219420)

Done. The static method now returns an optional value.

In this new patch I have:

  1. Fixed a build failure of the fuzzer (I forgot to apply the interface changes of the last diff).
  2. Add ASSERT_DEATH testing for invalid names.
  3. Add testing to make sure <scalarname> is always present in the string.
  4. Add testing to make sure that <vectorname> is present when the opening ( is parsed.
  5. Add testing to make sure that the string terminates with ) when <vectorname> is present.

I have deferred the handling of the mask token "M" to the VFParameter struct, introducing a new VFPAramKind that handles global function predication. This should be more in line with the aim of this implementation, which is to make VFShape independent from the Vector Function ABI. In particular, there are use cases that @simoll cares about where predication can be attached to single parameters, and not to the whole function.

Tests to make sure the Global Predicate is added correct;ly has been added for all the targets supported in this patch (x86 and aarch64).

I'm fine with this (one small nit below).

@simoll, @lebedev.ri, any more comments or can this go in?

llvm/lib/Analysis/VFABIDemangling.cpp
413 ↗(On Diff #219773)

Why is this an Optional? It seems to always return a proper VFParamKind if it returns at all.

Looks ok.
(Note that i'm only looking at code, not correctness of demangling here.)

llvm/include/llvm/Analysis/VectorUtils.h
63 ↗(On Diff #219773)

There's ongoing effort to migrate to llvm::Alignment (i don't recall the name)

94 ↗(On Diff #219773)

I'm not sure why this should not also return Optional<ParsedData>.

simoll requested changes to this revision.Sep 12 2019, 5:58 AM

One minor thing: VFABI parsing should succeed for any well-formed AVFBI vector function name (beyond what's listed in the VFISAKind enum). This would open up the functionality for external users.

llvm/include/llvm/Analysis/VectorUtils.h
50 ↗(On Diff #219773)

How about renaming to ISA_Unknown ?

llvm/lib/Analysis/VFABIDemangling.cpp
302 ↗(On Diff #219773)

Why not carry on with parsing a well-formed AFABI String if the ABI is unknown?
This would enable support for VFABIs not listed here.

This revision now requires changes to proceed.Sep 12 2019, 5:58 AM
fpetrogalli marked 10 inline comments as done.

I have address all comments. Thank you for your feedback @jdoerfert @simoll @lebedev.ri !

Additionally, I have renames the enums values of VFISAKind from VFISAKind::ISA_* to VFISAKind::*.

llvm/lib/Analysis/VFABIDemangling.cpp
413 ↗(On Diff #219773)

yeah, sorry,Optional is an unnecessary complication here.

simoll accepted this revision.Sep 12 2019, 11:50 AM

LGTM. Thanks!

This revision is now accepted and ready to land.Sep 12 2019, 11:50 AM

Thanks for all the changes @fpetrogalli, this patch is really taking shape! Just added a few more drive-by comments.

llvm/include/llvm/Analysis/VectorUtils.h
94 ↗(On Diff #219773)

nit: Can we rename this main function to something like tryDemangleForVFABI ?

116 ↗(On Diff #219940)

Given that ScalarName and VectorName are not part of the equality-comparison, does that imply they should not be part of VFShape ?

llvm/lib/Analysis/VFABIDemangling.cpp
30 ↗(On Diff #219940)

Why is it named getISA which starts parsing at position4 rather than tryParseISA which starts parsing at position 0?

314 ↗(On Diff #219940)

Can we make the dropping of these substrings part of getMask and getISA (on success) ?

356 ↗(On Diff #219940)

nit: this comment seems unnecessary (the code says it all)

fpetrogalli marked 4 inline comments as done.Sep 12 2019, 2:22 PM
fpetrogalli added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
116 ↗(On Diff #219940)

I see. How about I refactor this as follow?

struct VFShape {
  unsigned VF;     // Vectorization factor.
  bool IsScalable; // True if the function is a scalable function.
  VFISAKind ISA;   // Instruction Set Architecture.
  SmallVector<VFParameter, 8> Parameters; // List of parameter informations.
};

struct VFInfo {
  VFShape Shape;
  StringRef ScalarName;                   // Scalar Function Name.
  StringRef VectorName; // Vector Function Name associated to this VFShape.
;

I think this makes more sense, I just want to double check with you before applying the changes.

HI @sdesmalen ,

I have updated the patch according to your feedback. In particular, I have modified all the parsing method to use the ParseRet enum instead of booleans. It looks better now, thanks.

Once outstanding work is the comparison operator of VFShape. I have proposed a solution in a comment, let me know what you think, I will apply it if you agree on the approach. BTW: good catch. The VFShape should carry information only about the shape, not about the names of the functions.

One more nit: while working on the SVFS, I have noticed that the llvm_unreachable I have added at the end of the static method to create FShape out of the parsed data was quite annoying: it got fired every time the CI I was trying to vectorize didn't have the name with the mangled attribute. hence, I have removed the assertion, and converted the ASSERT_DEATH tests into EXPECT_FALSE tests in TEST(VectorFunctionABITests, OnlyValidNames). This should not be controversial (I am happy to be told I am wrong if that's the case).

Thank you,

Francesco

I have updated the patch according to your feedback. In particular, I have modified all the parsing method to use the ParseRet enum instead of booleans. It looks better now, thanks.

Agreed, this looks good now, thanks!

Once outstanding work is the comparison operator of VFShape. I have proposed a solution in a comment, let me know what you think, I will apply it if you agree on the approach. BTW: good catch. The VFShape should carry information only about the shape, not about the names of the functions.

Yes, that seems like a good suggestion.

One more nit: while working on the SVFS, I have noticed that the llvm_unreachable I have added at the end of the static method to create FShape out of the parsed data was quite annoying: it got fired every time the CI I was trying to vectorize didn't have the name with the mangled attribute. hence, I have removed the assertion, and converted the ASSERT_DEATH tests into EXPECT_FALSE tests in TEST(VectorFunctionABITests, OnlyValidNames). This should not be controversial (I am happy to be told I am wrong if that's the case).

That seems like the better choice given how you've shaped the interface that returns a Optional<VFShape>, but I don't really understand why this would break CI.

fpetrogalli marked an inline comment as done.Sep 13 2019, 6:51 AM

One more nit: while working on the SVFS, I have noticed that the llvm_unreachable I have added at the end of the static method to create FShape out of the parsed data was quite annoying: it got fired every time the CI I was trying to vectorize didn't have the name with the mangled attribute. hence, I have removed the assertion, and converted the ASSERT_DEATH tests into EXPECT_FALSE tests in TEST(VectorFunctionABITests, OnlyValidNames). This should not be controversial (I am happy to be told I am wrong if that's the case).

That seems like the better choice given how you've shaped the interface that returns a Optional<VFShape>, but I don't really understand why this would break CI.

The way I am building the VFShape is as follows:

const StringRef AttrString =
        CI->getCalledFunction()->getFnAttribute("vector-function-abi-variant").getValueAsString();
      SmallVector<StringRef, 8> ListOfStrings;
      AttrString.split(ListOfStrings, ",");
SmallVector<VFShape, 8> Ret;
      for (auto MangledName : ListOfStrings) {
	auto Shape = VFShape::getFromVFABI(MangledName);
        if (Shape.hasValue() && Shape.getValue().ScalarName == ScalarName)
          Ret.push_back(Shape.getValue());
      }

Most of the functions don't have the attribute, which result in list of string consisting of a single element, which is an empty string, therefore the parser fails and the exception is raised.

Maybe there is a better way to do it, but given the Optional<VFShape> return, it doesn't make sense to raise the exception on failure.

I have split the implementation of VFShape into VFShape and VFInfo, as agreed.

I have also found a couple of bugs thanks to the fuzzer (that was a great suggestion @lebedev.ri !). I have added the correspondent unit tests to prevent regressions. The bugs were around invalid mangled name strings and the parsing of the align token.

Regards,

Francesco

fpetrogalli marked an inline comment as done.Sep 13 2019, 9:24 AM

I added the Unknown ISA handling in the test ISAIndependentMangling, and updated the comments in the fuzzer.

sdesmalen added inline comments.Sep 16 2019, 8:15 AM
llvm/include/llvm/Analysis/VectorUtils.h
153 ↗(On Diff #220128)

Can't tryDemangleForVFABI return a Optional<VFInfo> directly? (rather than construct it here from the ParsedData)
This also makes me question the need for ParsedData.

fpetrogalli marked an inline comment as done.Sep 16 2019, 8:36 PM
fpetrogalli added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
153 ↗(On Diff #220128)

I could, but I think it is not the right thing to do. I expect the the`VFInfo` struct to grow beyond handling Vector Function ABI and OpenMP information (it is one of the requirements). When this happens, all we need to do is just to to change this line adding a proper ParsedData -> VFInfo constructor instead of having to adapt the parser to the new VFInfo class.

sdesmalen added inline comments.Sep 17 2019, 12:45 AM
llvm/include/llvm/Analysis/VectorUtils.h
153 ↗(On Diff #220128)

If VFInfo turns into a class that supports a superset of the VFABI, the parser could construct a VFInfo object that populates less fields through having multiple constructors for VFInfo, or through a constructor with default arguments for the fields that are not applicable.
In the end there needs to be code somewhere that fills in the VFInfo object. Having a layer in the middle that just passes the information around offers little value, unless that layer adds some new functionality. Without such functionality, I think it makes more sense to return VFInfo directly.

I have removed the ParsedData struct as requested by @sdesmalen , my reasoning for keeping it was wrong.

I have updated the tests and the fuzzer accordingly.

fpetrogalli marked 3 inline comments as done.Sep 17 2019, 8:34 AM
fpetrogalli added inline comments.
llvm/include/llvm/Analysis/VectorUtils.h
153 ↗(On Diff #220128)

I agree with you, thank you.

sdesmalen added inline comments.Sep 17 2019, 8:53 AM
llvm/include/llvm/Analysis/VectorUtils.h
92 ↗(On Diff #220511)

Sorry, just spotting this now. Instead of having struct VFInfo; here, can you just move the definition of VFInfo here?

136 ↗(On Diff #220511)

nit: this code is still commented out.

fpetrogalli marked an inline comment as done.
fpetrogalli marked 2 inline comments as done.

Restore the function name to tryParseForVFABI.

sdesmalen accepted this revision.Sep 18 2019, 12:26 AM

Thanks @fpetrogalli, LGTM!

This revision was automatically updated to reflect the committed changes.