Page MenuHomePhabricator

[OpenMP][OMPContext] Move SIMD alignment calculation to LLVM Frontend
ClosedPublic

Authored by domada on Nov 22 2022, 6:26 AM.

Details

Summary

Currently default simd alignment is specified by Clang specific TargetInfo class. This class cannot be reused for LLVM Flang. If we move the default alignment field into TargetMachine class then we can create TargetMachine objects and query them to find SIMD alignment.

Scope of changes:

  1. Added information about maximal allowed SIMD alignment to TargetMachine classes.
  2. Added TargetMachineCache to store created TargetMachine items and query them to find the alignment
  3. Removed getSimdDefaultAlign function from TargetInfo class.

Diff Detail

Event Timeline

domada created this revision.Nov 22 2022, 6:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 22 2022, 6:26 AM
domada requested review of this revision.Nov 22 2022, 6:26 AM
Herald added a project: Restricted Project. · View Herald Transcript

Looking at the existing code before the patch:
  / Return default simd alignment for the given target. Generally, this
  
/ value is type-specific, but this alignment can be used for most of the
  /// types for the given target.
  unsigned getSimdDefaultAlign() const { return SimdDefaultAlign; }
It is very imprecise. Which types? The smallest alignment for all vector and element types?.

If we are moving this to LLVM, can we use the data layout to infer what the default alignment is? Hopefully we can figure out what the criteria are and make the function work for any target, or maybe error out if it doesn't apply.

The idea is good, we should properly ask the target. That said, don't we need to also provide the type?

clang/lib/AST/ASTContext.cpp
2476

Why .str() ?

llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
504

Documentation please.

llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
3092
3100

No llvm:: this is llvm. Do we really want to create a target machine every time? Are the features different? Can we cache the machine or is it cheap to create?

Yes, we can extend calculations of the default alignment. My primary goal was to move code from clang into llvm without modification of the code logic.

AFAIK the default simd alignment corresponds to the maximal width of the simd instruction for given target.

Yes, we can extend calculations of the default alignment. My primary goal was to move code from clang into llvm without modification of the code logic.

AFAIK the default simd alignment corresponds to the maximal width of the simd instruction for given target.

So maybe the function should be called getMaximumVectorTypeAlignment or something like that, and query the target with some relevant types to find the maximum, alternatively implement this at a lower level inside DataLayout, and not have to provide a type.

jmmartinez added inline comments.
clang/lib/AST/ASTContext.cpp
2468–2477

Here you're copying the entire TargetFeatures vector. Could you use a const & ?

domada updated this revision to Diff 478500.Nov 29 2022, 3:42 AM
domada retitled this revision from [WIP][OMPIRBuilder] Add OpenMPDefaultSimdAlignment field to TargetMachine class to [OMPIRBuilder] Add OpenMPDefaultSimdAlignment field to TargetMachine class.
domada edited the summary of this revision. (Show Details)
  1. Refactored OMPIRBuilder method which calculates SIMD alignment. Added optional parameter Type *AlignedValueTy which can be used to define optimal alignment. If this parameter is not specified then, we return the maximal SIMD alignment for given target.
  2. Clang still uses suboptimal value because there is no link between clang TargetInfo and LLVM-Type. More information about this decision can be found in previous review which introduced original getSimdDefaultAlign function: https://reviews.llvm.org/D10597?id=28205#inline-85925
  3. TargetInfo class contains method: getMaxVectorAlign . This method returns maximal vector alignment for given target. The only difference between original: getMaxVectorAlign and getSimdDefaultAlign is that getSimdDefaultAlign returns 0 alignment for ARM target. I don't know if it is ok, that we assume 0 alignment for ARM targets.
  4. Removed getSimdDefaultAlign function.
domada marked 4 inline comments as done.Nov 29 2022, 3:46 AM
domada added inline comments.
clang/lib/AST/ASTContext.cpp
2476

Because createTargetMachine function requires target triple as the string.

If we implemented DataLayout::getMaxPreferredVectorTypeAlign something like this:

Align DataLayout::getMaxPreferredVectorTypeAlign() {
  Align max(1);
  for (auto &E : Alignments)
    if (E.AlignType == VECTOR_ALIGN && E.PrefAlign > max)
      max = E.PrefAlign;
  return max;
}

Then we could find the maximum vector type alignment, This would return Align(1) by default.
In the current implementation we return 0 by default, minimum alignment should reasonably be 1 by default, unless 0 means "unknown"?

FWIW, align 1 and 0 should be the same and both are trivial.

domada added a comment.Dec 5 2022, 3:25 AM

@jsjodin thanks for your remark. Let me add some more information to this patch.

Currently clang emits information about preferred simd alignment only for X86, WebAssembly and PPC targets.

If we generate code for one of these targets then we expand #pragma omp simd aligned(aligned_var) into call void @llvm.assume(i1 true) [ "align"(ptr aligned_var, i64 alignment_size) ] LLVM IR instruction . For other architectures (for example ARM) simd alignment is always set to 0 and the alignment assumption is not generated. I am not sure if it is ok, that we do not generate alignment assumptions for other architectures. That's why I raised this question during the last Flang-OpenMP meetings and I did not receive a clear answer.

Your code looks elegant but in my opinion it will generate different code for ARM architecture where the vector alignment is set to 64 ( https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets/ARM.cpp#L311 ), but simd default alignment is set to 0.

@jsjodin thanks for your remark. Let me add some more information to this patch.

Currently clang emits information about preferred simd alignment only for X86, WebAssembly and PPC targets.

[snip]

Your code looks elegant but in my opinion it will generate different code for ARM architecture where the vector alignment is set to 64 ( https://github.com/llvm-mirror/clang/blob/master/lib/Basic/Targets/ARM.cpp#L311 ), but simd default alignment is set to 0.

I wasn't exactly proposing to eliminate the TargetMachine function. We would need a default implementation in that returns 0, and override it in X86, WebAssembly and PPC to call the DataLayout function. The only difference is that we don't re-encode the sizes. It will only make a difference in the future e.g. if newer x86 targets have larger max alignment.

domada added a comment.Dec 9 2022, 9:09 AM

I think we cannot not rely on DataLayout to find the maximal SIMD alignment.

Developer can specify that given functions are targeted to different subtargets (source: https://clang.llvm.org/docs/AttributeReference.html#target ). That's why information about target subtype is stored in function attribute not in the data layout string which is common for the whole Module.

I think we cannot not rely on DataLayout to find the maximal SIMD alignment.

Developer can specify that given functions are targeted to different subtargets (source: https://clang.llvm.org/docs/AttributeReference.html#target ). That's why information about target subtype is stored in function attribute not in the data layout string which is common for the whole Module.

The subtarget can only "improve" our information, right? We need to involve the target, provide it with the type, the datalayout, and the function attributes to make a decision. No?

Hi Johannes,
I was referring to Jan's idea. He proposes to move part of the functionality to the DataLayout class. IMO this option is not the best approach, because DataLayout object does not contain information about subtarget and we cannot determine if SIMD extensions (avx options) are enabled.

My current patch uses information about subtarget and type of the aligned value to determine the default alignment. If the type of the aligned value is provided I ask DataLayout about preferred alignment (OMPIRBuilder.cpp line 3113).. If this information is not provided I base only on subtarget information.

jdoerfert added inline comments.Dec 12 2022, 8:19 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
3096

No static variables here, please.
These should be members of the class or a helper class.
Should we have a map of these to the respective TargetMachine?

domada updated this revision to Diff 483157.Dec 15 2022, 6:35 AM
domada retitled this revision from [OMPIRBuilder] Add OpenMPDefaultSimdAlignment field to TargetMachine class to [OpenMP][OMPContext] Move SIMD alignment calculation to LLVM Frontend.
domada edited the summary of this revision. (Show Details)

Applied review remarks.

Scope of changes:

  1. Removed static function for finding the default alignment from OMPIRBuilder
  2. Added new helper class for storing cached TargetMachine objects
domada added inline comments.Dec 15 2022, 6:54 AM
llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
3096

I added a new helper class TargetMachineCache for storing TargetMachine items.

Please let me know if it is ok, that this class is located in OMPContext.h.

Storing TargetMachine objects required modification of ASTContext class.

I think this is fine, with two nits. @jsjodin Please finish the review as you see fit.

clang/include/clang/AST/ASTContext.h
460 ↗(On Diff #483157)

Move it into the OpenMPIRBuilder please.

llvm/lib/Frontend/OpenMP/OMPContext.cpp
587 ↗(On Diff #483157)

You lookup Key in Targets 3 times. Maybe we can reduce that a little. Can the pointer ever be nullptr (the one returned by createTargetMachine?

domada added inline comments.Dec 16 2022, 3:56 AM
clang/include/clang/AST/ASTContext.h
460 ↗(On Diff #483157)

We cannot move it into the OpenMPIRBuilder because OpenMPIRBuilder is not created inside ASTContext. OpenMPIRBuilder is created inside clang LLVM IR codegen part (clang/lib/CodeGen/CGOpenMPRuntime) and I prefer not to modify this part.

That's why I initially proposed static function inside OpenMPIRBuilder. I think that we need to choose one option:

  • Less efficient static function inside OpenMPIRBuilder which creates TargetMachine items every time (advantage: no binding to LLVM objects inside clang ASTContext)
  • TargetMachineCache object inside ASTContext and more efficient function for default alignment calculation

Personally I opt for the first option. IMO OpenMP pragma: omp simd aligned(A) is not very common and we can spend more time to determine the default alignment.

domada added inline comments.Dec 16 2022, 4:33 AM
llvm/lib/Frontend/OpenMP/OMPContext.cpp
587 ↗(On Diff #483157)

createTargetMachine can return nullptr. Test scenario:

  1. build clang without support of one target arch (for example build only x86 target: `cmake "-DLLVM_TARGETS_TO_BUILD="X86;")
  2. try to generate LLVM IR and specify target which hasn't been mentioned in cmake configuration: (for example: clang --target=arm -S -emit-llvm -fopenmp test.c)

We can reduce number of lookups in the best scenario (given Key has been inserted earlier):

// find the corresponding target
 auto  TargetsIter = Targets.find(Key);
//if not found - insert it and update TargetsIter 
if (TargetsIter==Targets.end()) {
    // some initialization code
   Targets[Key] = TheTarget->createTargetMachine(..);
   TargetsIter = Targets.find(Key);
}
//check if TargetMachine is not nullptr:
if (!TargetsIter->getValue())
   return 0;
jsjodin added inline comments.Dec 23 2022, 7:13 AM
llvm/include/llvm/Frontend/OpenMP/OMPContext.h
233 ↗(On Diff #483157)

We can remove the AlignedValueTy parameter since it is not used and we determined that the DataLayout in the module is incomplete anyway.

jsjodin added inline comments.Dec 23 2022, 7:17 AM
llvm/include/llvm/Target/TargetMachine.h
212

This will be clearer if we put OpenMP in the function name as well getOpenMPMaximalSimdAlignment or something like that.

domada updated this revision to Diff 486533.Thu, Jan 5, 5:51 AM
domada edited the summary of this revision. (Show Details)

Scope of changes:

  1. Refactored createTargetMachine function.

Reuse existing code.

  1. Removed caching.

The existing comment about caching in createTargetMachine clearly indicates implementation obstacles. In my opinion we are allowed to create TargetMachine objects on demand because simd aligned pragma with default alignment is used rarely (very few usage in OpenMP benchmarks).

domada updated this revision to Diff 486565.Thu, Jan 5, 7:24 AM

Rename OMPIRBuilder function to query SIMD alignment

This revision is now accepted and ready to land.Fri, Jan 13, 6:29 AM
This revision was landed with ongoing or failed builds.Fri, Jan 13, 12:24 PM
This revision was automatically updated to reflect the committed changes.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFri, Jan 13, 12:24 PM