Page MenuHomePhabricator

[libomptarget][nvptx] Undef, internal shared variables
AbandonedPublic

Authored by JonChesterfield on Oct 22 2020, 4:22 PM.

Details

Summary

[libomptarget][nvptx] Undef, internal shared variables

Shared variables on nvptx, and LDS on amdgcn, are uninitialized at
the start of kernel execution. Therefore create the variables with
undef instead of zeros, motivated in part by the amdgcn back end
rejecting LDS+initializer.

Common is zero initialized, which seems incompatible with shared. Thus
change them to internal, following the direction of
https://reviews.llvm.org/rG7b3eabdcd215

WIP, other tests need to be updated if direction is good

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2020, 4:22 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
JonChesterfield requested review of this revision.Oct 22 2020, 4:22 PM

The nvptx back end accepts common + zero + shared, but not common + undef + shared. I think weak_odr is conceptually right here, but given the warning that nvlink doesn't support weak symbols, internal also seems fine. Can someone see an advantage to weak over internal? It could be arch specific at the risk of a lot of test duplication.

The nvptx back end accepts common + zero + shared, but not common + undef + shared. I think weak_odr is conceptually right here, but given the warning that nvlink doesn't support weak symbols, internal also seems fine. Can someone see an advantage to weak over internal? It could be arch specific at the risk of a lot of test duplication.

IIRC, it supports weak symbols, but does not support weak symbols of different sizes.

The nvptx back end accepts common + zero + shared, but not common + undef + shared. I think weak_odr is conceptually right here, but given the warning that nvlink doesn't support weak symbols, internal also seems fine. Can someone see an advantage to weak over internal? It could be arch specific at the risk of a lot of test duplication.

IIRC, it supports weak symbols, but does not support weak symbols of different sizes.

That seems a reasonable restriction. Linkers sometimes pick the first weak symbol they see. Comdat might mean pick the biggest one, but that's probably not a good thing to rely on.

clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
4794–4795

Perhaps weak_any + undef?

Could use internal for symbols that may vary in size and weak_any for those that don't.

ABataev added inline comments.Oct 23 2020, 6:23 AM
clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
2858–2859

"Internalization" is not the best option, it increases mem pressure. Common linkage is a better choice, allows to "squash" the same objects, defined in different units. Make it arch dependable, maybe?
For NVPTX zero initialization is not a problem, it is resolved when PTX is generated.

4794–4795

Yeah, it is a good idea, I think.

jdoerfert added inline comments.Oct 23 2020, 8:34 AM
clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
2858–2859

FWIW, if we do not depend on the zero initialization, we should go with undef.

ABataev added inline comments.Oct 23 2020, 8:43 AM
clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
2858–2859

Sure.

  • prefer weak, update tests
JonChesterfield abandoned this revision.EditedOct 27 2020, 9:48 AM

The diff doesn't look right here. I can't tell if that's a quirk of the phab gui or indicates a bad merge, recreated as D90248