Page MenuHomePhabricator

[RFC] Re-implementing -fveclib with OpenMP
Needs ReviewPublic

Authored by fpetrogalli on Nov 11 2018, 7:31 PM.

Details

Summary

This RFC encompass the proposal of replacing the current TargetLibraryInfo (TLI) based implementation of the command line -fveclib with an OpenMP based one.

Diff Detail

Event Timeline

fpetrogalli created this revision.Nov 11 2018, 7:31 PM
hfinkel added inline comments.
RFC-fveclib.md
12

RC -> RFC

24

Why is this a question? Isn't this exactly the same infastructure we're using for OpenMP delcare simd, in which case it does exactly this.

It also enables other frontends and languages to add things these things as relevant for their own runtime libraries, etc.

30

meta-data -> metadata

(here and below)

225

TODO?

236

analisys -> analysis

270

This needs to be in the reserved namespace. Maybe something like:

_CLANG_USE_SVML
fpetrogalli marked 6 inline comments as done.
fpetrogalli added inline comments.
RFC-fveclib.md
225

Done, I have also changed the name to the correct one of the TLI (getVectorizedFunction).

hsaito added a subscriber: hsaito.Nov 16 2018, 12:02 PM

Also, in order for us to be able to adopt this mechanism, we need to work on LV's code gen. Currently, parts of it rely on having one consistent VF across the entire loop (i.e., loop-level VF or "serialize"). That's the reason why https://reviews.llvm.org/D53035 is generating illegal call first and then legalize. We need to add this work item as a dependency (to at least some veclib, e.g., SVML).

RFC-fveclib.md
41–42

I think we need to come up with a mechanism to enable the veclib mapping without fully enabling -fopenmp-simd. Decorating the veclib stuff with an extra "veclib" clause added to omp declare simd
and create a mode to process just those would do, and I'm open to other suggestions.

Rationale: auto-vectorization of loops and slp vectorization, using veclib, is really a separate issue from OpenMP SIMD.

hfinkel added inline comments.Nov 16 2018, 12:05 PM
RFC-fveclib.md
41–42

I agree with this. We should add a clang pragma that we always honor independent of the OpenMP settings.

hsaito added inline comments.Nov 16 2018, 12:17 PM
RFC-fveclib.md
39–40

I think we would want to have a mechanism to include the custom header file from a non-default location, say, for some non-compiler vendor, supplying a new veclib implementation. Something like -fveclib-include=<full-path>/filename.h. By default, it'll be <clang's include dir>/{SVML, Accelerate, ...}.h for those that are shipped with the compiler.

fpetrogalli added inline comments.Nov 16 2018, 1:14 PM
RFC-fveclib.md
39–40

Let me see if I got your proposal right.

  1. We add in <clang>/lib/Headers a math.h plus one header file for each of the libraries we need to expose to the vectorizer, as follows:
#include_once <math.h>

#ifdef _CLANG_USE_ACCELERATE
#include "Accelerate.h"
#endif

#ifdef _CLANG_USE_SVML
#include "SVML.h"
#endif
  1. The header Accelerate.h and SVML.h would add the declare simd directive to the declarations.
  1. The option -fveclib-include=<full-path>/filename.h trigger the generation of the metadata, by triggering the mechanism described in this RFC to generate the vector-variant metadata. This is useful as it allows auto-vectorization of external function calls for libraries that do not relate to math.h.
  1. The option -fveclib=X is a wrapper that (a) selects the correct section in <clang>/lib/Headersmath.h with a macro (e.g. _CLANG_USE_X), and moreover triggers the generation of the metadata described in this RFC.

Under the assumption I interpreted your suggestion right, I think this is a very useful change, as it fully realizes the improvement described in the introduction:

Enable support for a developer's own vector libraries without
requiring changes to the compiler.

41–42

I agree too, thank you for raising this. Could we do the same by restricting -fopenmp-simd to something like -fopenmpo-declare-simd? This would process only the declare simd directive and its associated declare variant for simd.

I think that this would:

  1. Make out internal header files and the one the libraries would have to provide look more standard (they would be able to use them with any other compiler, in principle).
  2. Avoid adding parser code to duplicate the new directive.

Does it make sense?

hsaito added inline comments.Nov 16 2018, 2:47 PM
RFC-fveclib.md
39–40

What I wrote was below, but I think what you wrote should also work. Either way is fine with me. If I understand your interpretation correctly, -fveclib and -fveclib-include should be mutually exclusive.

-fveclib=X w/o explicit -fveclib-include defaults to the behavior you wrote. Works only for the ones shipped with the compiler.

-fveclib=X w/ explicit -fveclib-include should pick up the specified header file (from anywhere in the accessible file system).

41–42

Not quite the same thing, because auto-vectorization can still pick up vector functions outside of veclib.

I think the include file can be written with a macro like below and still compatible with other OpenMP compilers. I don't know how the clang FE folks likes/hates parsing this when -fopenmp/-fopenmp-simd isn't thrown, however.

#ifdef clang

#define CLANG_VECLIB clang_veclib

#else

#define CLANG_VECLIB

#endif

snip snip snip

#pragma omp declare simd CLANG_VECLIB

hfinkel added inline comments.Nov 16 2018, 7:56 PM
RFC-fveclib.md
39–40

Why do we need a separate flag for this? Isn't it the same as using '-include <full-path>/filename.h'?

41–42

Avoid adding parser code to duplicate the new directive.

I think that this is exactly what we should do, however. We can refactor things so that we have a '#pragma clang declare simd', or similar, that we always accept (even if we ignore the '#pragma omp declare simd' spelling when OpenMP is not enabled).

An alternative might be to enable OpenMP simd parsing by default?

270

To update my own suggestion, we might use: _CLANG_USE_VECLIB_X (but we should be consistent throughout the document - below you have LIBRARY_X).

@hsaito , @hfinkel

I would avoid having a clang-specific pragma. By just using the OpenMP one, we provide the library vendors with a mechanism that is based on a standard and therefore can be used by any compiler.

Specifically:

An alternative might be to enable OpenMP simd parsing by default?

Yes, I thisnk this is doable, to have #pragma omp declare simd and #pragma omp declare variant (only the simd case) enabled by default.

Are you happy for me to modify the RFC with this?

Francesco

@hsaito , @hfinkel

I would avoid having a clang-specific pragma. By just using the OpenMP one, we provide the library vendors with a mechanism that is based on a standard and therefore can be used by any compiler.

Specifically:

An alternative might be to enable OpenMP simd parsing by default?

Yes, I thisnk this is doable, to have #pragma omp declare simd and #pragma omp declare variant (only the simd case) enabled by default.

Are you happy for me to modify the RFC with this?

I'm fine with that.

Francesco

An alternative might be to enable OpenMP simd parsing by default?

Yes, I thisnk this is doable, to have #pragma omp declare simd and #pragma omp declare variant (only the simd case) enabled by default.

Are you happy for me to modify the RFC with this?

I'm fine with that.

I'm not totally aligned here, but I'm fine proceeding with the RFC revision along this line. I'll wait for more voices to be raised (or not) on the issue. Please keep a note saying that there is a concern raised and those who care should voice their concerns.

An alternative might be to enable OpenMP simd parsing by default?

Yes, I thisnk this is doable, to have #pragma omp declare simd and #pragma omp declare variant (only the simd case) enabled by default.

Are you happy for me to modify the RFC with this?

I'm fine with that.

I'm not totally aligned here, but I'm fine proceeding with the RFC revision along this line. I'll wait for more voices to be raised (or not) on the issue. Please keep a note saying that there is a concern raised and those who care should voice their concerns.

So that I understand, what's your preference?

An alternative might be to enable OpenMP simd parsing by default?

Yes, I thisnk this is doable, to have #pragma omp declare simd and #pragma omp declare variant (only the simd case) enabled by default.

Are you happy for me to modify the RFC with this?

I'm fine with that.

I'm not totally aligned here, but I'm fine proceeding with the RFC revision along this line. I'll wait for more voices to be raised (or not) on the issue. Please keep a note saying that there is a concern raised and those who care should voice their concerns.

So that I understand, what's your preference?

I still would like to see the ability to differentiate quasi-intrinsic veclib functions versus user-written vector functions and enable veclib w/o enabling user-written vector functions, even if -fopenmp-simd becomes default. Some of our ICC customers take perf data with and w/o openmp simd (and proprietary simd pragma) and they expect SVML to be used for both cases. I don't have much preferences in how to accomplish it. For example, if we'd like to keep the header file contents totally standard based, we could use something like a command line flag -fclang-vector-intrinsic=file1,file2,..,fileN to indicate that "declare simd" in those header file is "quasi-intrinsic". We could silently do this for the veclibs shipped with clang (e.g., check if "declare simd" is coming from clang include directory) ---- I'm fine with this ----, but that's a bit unfair to the 3rd party veclibs.

I still would like to see the ability to differentiate quasi-intrinsic veclib functions versus user-written vector functions and enable veclib w/o enabling user-written vector functions, even if -fopenmp-simd becomes default. Some of our ICC customers take perf data with and w/o openmp simd (and proprietary simd pragma) and they expect SVML to be used for both cases. I don't have much preferences in how to accomplish it. For example, if we'd like to keep the header file contents totally standard based, we could use something like a command line flag -fclang-vector-intrinsic=file1,file2,..,fileN to indicate that "declare simd" in those header file is "quasi-intrinsic". We could silently do this for the veclibs shipped with clang (e.g., check if "declare simd" is coming from clang include directory) ---- I'm fine with this ----, but that's a bit unfair to the 3rd party veclibs.

I wonder if there is a way to do this programmatically. Maybe we could use clang/ModuleMap to tell clang to parse (and load) math.h and the compiler provided headers with the extra feature -fopenmp-simd?

If not, we can go with the #pragma clang declare simd directive, with clang being ifdef guarded to become omp when _OPENMP is detected.

https://reviews.llvm.org/D54412

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Sigh, ignore this, it was an error.

If not, we can go with the #pragma clang declare simd directive, with clang being ifdef guarded to become omp when _OPENMP is detected.

Under the circumstances, this seems like a reasonable strategy to me.

If not, we can go with the #pragma clang declare simd directive, with clang being ifdef guarded to become omp when _OPENMP is detected.

Under the circumstances, this seems like a reasonable strategy to me.

Works for me.

@hfinkel , @hsaito ,

I think i have addressed all your comments. Please let me know if I have missed anything.

I have adapted the RFC to your suggestions, in particular the need to keep auto-vectorization and SLP vectorization separated from OpenMP SIMD. As suggested, I have introduced a new #pragma, that is equivalent to the OpenMP ones. I decide to call it #pragma veclib instead of #pragma clang to avoid making it clang-specific. The reason for doing this is that gcc has a -mveclibabi flag that is similar to the -fveclib in clang, so I thought it would add value for those library vendors that wanted to use the header file mechanism also in gcc, if gcc will ever decide to go via the same route we might take in llvm.

fpetrogalli marked 8 inline comments as done.Nov 25 2018, 8:26 PM
fpetrogalli marked an inline comment as done.Nov 25 2018, 8:29 PM
fpetrogalli added inline comments.
RFC-fveclib.md
39–40
Why do we need a separate flag for this? Isn't it the same as using '-include <full-path>/filename.h'?

Am I correct thinking that we want to have new separate flag because the #pragma directives are not parsed by default?

hsaito added inline comments.Nov 26 2018, 4:40 PM
RFC-fveclib.md
39–40

Am I correct thinking that we want to have new separate flag because the #pragma directives are not >parsed by default?

If we'd like to enable/disable #pragma veclib parsing, yes. Else no (i.e., enable parsing by default. and no way to turn it off, other than #pragma veclib becoming #pragma omp if _OPENMP is seen before parsing the header file). If a programmer doesn't want veclib, don't bother including the header file. That usage model works for me, I think.

I think the summary of discussion here would be:

  1. No veclib support ==> do not use -include.
  2. veclib (shipped with compiler) -include filename.h (look under system include dir by default)
  3. veclib (3rd party) -include <path>/filename.h

veclib w/o _OPENMP seen: #pragma veclib declare simd clang should parse this w/o a flag.
veclib w/ _OPENMP seen: #pragma omp declare simd
clang should parse this as part of OpenMP support.
using something like
#ifndef _OPENMP
#define VECLIB_PRAGMA veclib
#else
#define VECLIB_PRAGMA omp
#endif

hsaito added inline comments.Nov 26 2018, 4:50 PM
RFC-fveclib.md
43–44

-include

46–49

Retire -fveclib flag.

51

I think this subsection, explaining current situation, doesn't need any changes.

128

standard

139

I have a reservation about this simple substitution, but if other people think this will work, then I'm fine.

hsaito added inline comments.Nov 26 2018, 5:17 PM
RFC-fveclib.md
208

(SVFS)

210

SVFS

237

This isn't specific to veclib, but we need to think about what to do when there are more than one candidate that "fuzzy matches" the calling context. We should provide a standard "scoring system" that returns the "best match", but that scoring system may not be "one size fits all".

For veclib, we probably could take some short cut --- but if we want to standardize with OpenMP declare simd, we should think about multiple fuzzy matches.

fpetrogalli marked an inline comment as done.Nov 26 2018, 7:35 PM
fpetrogalli added inline comments.
RFC-fveclib.md
39–40

Hum - I was thinking something slightly different. I am not really keen in removing the -fveclib option, as it makes very clear to the user what to expect.

Let me iterate once again on a high level view of what I am trying to do here.

The LLVM view.

  1. The backend is informed about the availability of vector functions via the vector-varian attribute, with non standard name redirection.
  2. The standard names of the vector variant is generated according to the mangling rule of a vector function ABI based on OpenMP.

The vector-variant attribute is all the vectorizer needs to know to be able to vectorize the function.

The clang view.

  1. clang generates the vector-variant attribute when a function is decorated with #pragma omp declare simd, and with #pragma omp declare variant simd

The -fveclib view.

  1. We want to retain auto-vectorization of functions even when OpenMP is not enabled in the compiler. 2. This is done by adding the functions of a library in the system header files shipped with clang, and by cloning the #pragma omp declare simd and #pragma omp declare variant simd directive into the "veclib" ones.
  2. Parsing of the veclib directive is enabled only with the -fveclib command line option, that select the proper header file section shipped by the compiler (if we are not happy with math.h, we could ship the compiler with a veclib.h that is automatically included when -fveclib is used)

The user libraries

  1. The option -fveclib-include= activate the parsing of the veclib pragma and loads the list from a user provided header file.

Summary:
0. No option -> nothing happens

  1. -fveclib=X becomes -fparse-veclib -D__CLANG_ENABLE_LIBRARY_X -lX -include=<path>/to/lib/Headers/veclib.h. The header veclib.h is shipped with the compiler.
  2. -fveclib-include=/path/to/my/header.h becomes -fparse-veclib -include=/path/to/my/header.h. The user has to provide the correct linker flangs.
  3. -fopenmp[-simd] no vectorization happens other then for those functions that are marked with OpenMP declare simd. The header veclib.h is not loaded so nothing gets vectorized under the hood
  4. -fopenmp[-simd] -fveclib=X or -fopenmp[-simd] -fveclib-include=/path/to/my/header.h behave like in 1. and 2.

Item 1. 2. 3. and 4. use the same mechanism of vector-variant. The only difference is that for the -fveclib based option we replace the openMP directive with the veclib one to make sure we can keep the two functionality separated (vectorize OpenMP delcare simd _without_ also enabling -fveclib).

@hsaito , there are another couple of reasons for which I'd prefer to keep the -fveclib option.

First of all, by using -fveclib=X we inform the compiler that we are also requesting to link against library X. Given that clang can be used to link object files that have been compiled with -fveclib, I think that we would prevent linking errors by retaining -fveclib.

Second, the way I see -fveclib=X is for vector functions of something that is already available in math.h. Libraries like SVML and Accelerate contain only the vector version of the functions that are provided separately by libm or other scalar math library that implement math.h. For use provided library, this is not granted, as the compiler will expect to find both scalar and vector version "somewhere". Therefore I'd rather have a separate -fveclib-include option for external library, which would make explicit the fact that the user needs to provide at link time whatever library is needed to resolve the symbols generated by the compiler (scalar and vector).

Finally, -fveclib=X make it more easy for a Fortran front-end to pick up the list of vector functions associated to the library X, as I believe it doesn't make sense to use an -include option in a Fortran context.

hsaito added inline comments.Nov 28 2018, 2:00 PM
RFC-fveclib.md
39–40

This is fine with me. Aligns enough with my earlier comments. @hfinkel was asking why we need two separate flags and that's why my summary was purely based on -include.

43–44

Taking my earlier comment back.

46–49

Taking my earlier comment back.

fpetrogalli marked 12 inline comments as done.Nov 29 2018, 12:46 PM
fpetrogalli added inline comments.
RFC-fveclib.md
237

I have added the following paragraph (at the end of this section) to address this:

Notice that to fully support OpenMP vectorization we need to think
about a fuzzy matching mechanism that is able to select a candidate in
the calling context. However, this is not needed for -fveclib
because the scalar-to-vector mappings of `-fveclib` are such that
for every scalar function there is only one possible vector function
associated. Therefore, extending this behavior to a generic one is an
aspect of the implementation that will be treated in a separate RFC
about the vectorization pass.

fpetrogalli marked an inline comment as done.
fpetrogalli marked an inline comment as done.
hsaito added inline comments.Nov 29 2018, 1:16 PM
RFC-fveclib.md
237

Thanks!

fpetrogalli marked 4 inline comments as done.Nov 29 2018, 9:40 PM

I feel like pointing out that this RFC did *NOT* go to the lists, which makes zero sense, since it's RFC.
Usually, it is recommended to close the old review and submit a new one, making sure that it *does* go to the lists..

RKSimon added a subscriber: RKSimon.Dec 4 2018, 1:58 AM
smaslov added a subscriber: smaslov.Dec 6 2018, 2:48 PM

I feel like pointing out that this RFC did *NOT* go to the lists, which makes zero sense, since it's RFC.

Hi - sorry about that - I thought that sending it out on llvm-dev and cfe-dev was enough.

Usually, it is recommended to close the old review and submit a new one, making sure that it *does* go to the lists..

I see. I am happy to create a new one if that is a requirement for the RFC submission. One question: how do I make sure it goes to the lists? I thought I did it and I want to avoid making the same error.

Kind regards,

Francesco

I feel like pointing out that this RFC did *NOT* go to the lists, which makes zero sense, since it's RFC.

Hi - sorry about that - I thought that sending it out on llvm-dev and cfe-dev was enough.

No.

Usually, it is recommended to close the old review and submit a new one, making sure that it *does* go to the lists..

I see. I am happy to create a new one if that is a requirement for the RFC submission. One question:

how do I make sure it goes to the lists? I thought I did it and I want to avoid making the same error.

By setting correct "repo" field (best) and/or adding the appropriate -commits list as the subscriber.

Kind regards,

Francesco

I see. I am happy to create a new one if that is a requirement for the RFC submission. One question:

how do I make sure it goes to the lists? I thought I did it and I want to avoid making the same error.

By setting correct "repo" field (best) and/or adding the appropriate -commits list as the subscriber.

Sure. Can you specify exactly which are the repo fields I have to set? Sorry, it is not cleat to me from the changes you have done here.

Moreover, I can see though that you have subscribed llvm-commits. Is that the only one I need to add to the reviewers field? No cfe-commits list?

I see. I am happy to create a new one if that is a requirement for the RFC submission. One question:

how do I make sure it goes to the lists? I thought I did it and I want to avoid making the same error.

By setting correct "repo" field (best) and/or adding the appropriate -commits list as the subscriber.

Sure. Can you specify exactly which are the repo fields I have to set? Sorry, it is not cleat to me from the changes you have done here.

Moreover, I can see though that you have subscribed llvm-commits. Is that the only one I need to add to the reviewers field? No cfe-commits list?

RFCs should be sent to llvm-dev (or cfe-dev, etc.) and that can't be done from Phabricator.

rengolin added a subscriber: rengolin.

Just to clarify - to me it is not clear whether I am required to create a new review on phabricator to be able to refer to this proposal as an RFC, and if so, which fields have to be set, and with which values, to make sure that the proposal is recognized as an official RFC by the LLVM community.

Is there a document that explains how to submit an RFC?

Is there a document that explains how to submit an RFC?

Hi Francesco,

It's not that complex for a document. The text you have here would be more than fine, and the discussion that happened is exactly what we look for, but Phab reviews only copy llvm-commits, not llvm-dev, and that is the main problem.

Since there's a lot of high quality discussions here, I suggest you post the same content, but with a summary of the feedback you had, and the changes in design you had to do because of that.

This would avoid Hal and Hideki from having to say it all over again, and will get more people into the right point in the conversation.

cheers,
--renato

Is there a document that explains how to submit an RFC?

Hi Francesco,

It's not that complex for a document. The text you have here would be more than fine, and the discussion that happened is exactly what we look for, but Phab reviews only copy llvm-commits, not llvm-dev, and that is the main problem.

I see! Now it is clear to me - I thought people were asking me to create a new review on phabricator of the same document, and move the discussion there. What was really being asked was to submit the actual text of the proposal in the body of the email announcing the RFC... I will do it.

Since there's a lot of high quality discussions here, I suggest you post the same content, but with a summary of the feedback you had, and the changes in design you had to do because of that.

This would avoid Hal and Hideki from having to say it all over again, and will get more people into the right point in the conversation.

I would like to avoid writing a summary, and take care of clarifying what is being asked by pointing them at specific comments in the review if the question has already been answered in phabricator.

cheers,
--renato

Hello again - I have posted the RFC on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2018-December/128426.html

Regards,

Francesco

simoll added a subscriber: simoll.Jun 3 2019, 2:21 AM