This is an archive of the discontinued LLVM Phabricator instance.

Make optimize llvm common to both gpu-to-hsaco/cubin
ClosedPublic

Authored by vinayaka-bandishti on May 26 2023, 4:33 AM.

Details

Summary

Before serializing, optimizations on llvm were only called on path to
hsaco, and not cubin. Define opt-level for gpu-to-cubin pass as well,
and move call to optimize llvm to a common place.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald Transcript
vinayaka-bandishti requested review of this revision.May 26 2023, 4:33 AM
mlir/lib/Dialect/GPU/CMakeLists.txt
127

I am a little skeptical about adding this. Please let me know if there is a better way to do this.

The update looks good but the layering needs to be fixed.

mlir/lib/Dialect/GPU/CMakeLists.txt
127

This can't be right. GPUTransforms can't depend on the ExecutionEngine. You'll need to move the OptUtils to its own library.

bondhugula added inline comments.May 26 2023, 5:32 AM
mlir/lib/Dialect/GPU/CMakeLists.txt
127

You can create an MLIRLLVMOptUtils with the needed functions shared between SerializeToBlob and ExecutionEngine.

bondhugula added inline comments.May 26 2023, 7:20 PM
mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp
323

Since this was removed, a header include would need to be updated/trimmed?

mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp
323

The declaration was in the same file. I have removed it.

vinayaka-bandishti marked 2 inline comments as done.May 29 2023, 10:09 AM
bondhugula added inline comments.May 29 2023, 8:21 PM
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
136

... with optLevel (default level 2).

mlir/lib/Dialect/GPU/CMakeLists.txt
127

Missing dep on MLIRExecutionEngineUtils - please see below.

mlir/lib/Dialect/GPU/Transforms/SerializeToBlob.cpp
16

This is again an issue I believe. OptUtils.h/cpp is part of MLIRExecutionEngineUtils. If you include this file, you'll need to depend on MLIRExecutionEngineUtils.

Without it and in this form, the build/link will simply fail with shared libs turned on.

Address comments, layering fix

vinayaka-bandishti marked 3 inline comments as done.May 29 2023, 9:37 PM
vinayaka-bandishti added inline comments.
mlir/lib/Dialect/GPU/CMakeLists.txt
12

I think, we can remove this IPO dep. Perhaps in a different change.

127

This dep is already present.

ftynse added inline comments.May 31 2023, 2:12 AM
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
115

I wouldn't have the default level be 2. This is surprising for both "classical compiler" users that would expect it to be 0 (as in clang/gcc) and "ml performance-oriented compiler" that would expect it ot be 3 or the maximum level. I see that this seems to have been the case for HSACO before, can we track down the commit that introduced that and ask the author for their rationale?

140

Ditto. Especially dangerous because CUBIN defaults to 2 and HSACO doesn't have a default.

mlir/lib/Dialect/GPU/Transforms/SerializeToBlob.cpp
109

Ultra-nit: don't start diagnostic messages from a letter.

vinayaka-bandishti marked an inline comment as done.Jun 1 2023, 10:46 PM
vinayaka-bandishti added a subscriber: krzysz00.
vinayaka-bandishti added inline comments.
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
115

Sure.
Commit introduced default opt level = 2 for hsaco is https://github.com/llvm/llvm-project/commit/bd22554af06e1f16dc9ff12eac8987f0ceebe8c1,
reviewed at https://reviews.llvm.org/D114113

@krzysz00 Can you please check this comment on rationale behind opt level?

Address comment on diagnostics style

mlir/lib/Dialect/GPU/Transforms/SerializeToBlob.cpp
109

You meant capital letters?

bondhugula added inline comments.Jun 2 2023, 12:16 AM
mlir/lib/Dialect/GPU/Transforms/SerializeToBlob.cpp
114

This method is in the MLIRExecutionEngineUtils library. The dependency on LLVMTarget itself won't help.

bondhugula accepted this revision.Jun 2 2023, 12:22 AM
bondhugula added inline comments.
mlir/include/mlir/Dialect/GPU/Transforms/Passes.h
115

As for HSACO, if we want to change the default opt level, it should be done in another revision. This revision is on adding the optimizer on the serialize-to-cubin path. The commit shouldn't surprise users by changing the existing default.

This revision is now accepted and ready to land.Jun 2 2023, 12:22 AM
bondhugula added inline comments.Jun 2 2023, 12:39 AM
mlir/lib/Dialect/GPU/Transforms/SerializeToBlob.cpp
109

Diagnostic messages shouldn't end with a new line. It is appropriately suffixed with more things. You are also missing a space after "level". The message should start in lowercase, I believe.

Resolve comments on diagnostics

vinayaka-bandishti marked 2 inline comments as done.Jun 4 2023, 9:40 PM

A note about the default opt level.

Before:
HSACO path had default opt level 2. CUBIN path had no(dummy) call to optimizeLlvm - effectively opt level was 0.

Since we make the default opt level common to both paths in this patch,
if we keep it at 2, cubin users will see the change/difference.
if we make it 0, hsaco users will see the change/difference.

bondhugula accepted this revision.EditedJun 4 2023, 9:58 PM

A note about the default opt level.

Before:
HSACO path had default opt level 2. CUBIN path had no(dummy) call to optimizeLlvm - effectively opt level was 0.

Since we make the default opt level common to both paths in this patch,
if we keep it at 2, cubin users will see the change/difference.
if we make it 0, hsaco users will see the change/difference.

The commit title and purpose already captures the fact that the cubin path will now have optimization. So it's fine this way since HSACO path's behavior is unchanged. We can change both to O3 in a subsequent commit if that's natural for the amd/hsaco path as well.

This revision was automatically updated to reflect the committed changes.