scott.linder (Scott Linder)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 15 2018, 8:31 AM (39 w, 2 d)

Recent Activity

Fri, Oct 12

scott.linder accepted D53221: AMDGPU: Generate .amdgcn_target for object code v3.

LGTM, thanks for fixing this

Fri, Oct 12, 2:56 PM
scott.linder added a comment to D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

Beyond constructors/destructors I believe an API which we implement through access to dynamic symbols for global variable is clGetProgramBuildInfo(..., CL_PROGRAM_BUILD_GLOBAL_VARIABLE_TOTAL_SIZE, ...) which is defined as "The total amount of storage, in bytes, used by program variables in the global address space."

Fri, Oct 12, 1:08 PM
scott.linder added a comment to D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

The rationale is that -fvisibility only affects the default, and already does not apply in many cases. For example, see the rest of the conditions above the fvisibility check in getLVForNamespaceScopeDecl: when Var->getStorageClass() == SC_Static the linkage is (usually) internal and the visibility is default.

Visibility is meaningless for an internal-linkage declaration.

I think one question is whether OpenCL language semantics allow us to make these visibility determinations; I am going off of the APIs available to access symbols.

-fvisibility is not governed by the OpenCL specification; it's a Clang / GCC extension, and we get to determine its semantics, which generally override the rules from the language model we're implementing.

I'm a little surprised that symbol visibility is even meaningful when compiling GPU code. Can you give some more background on that?

To the extent that visibility is meaningful, it does seem to make sense for kernel to imply default visibility in the same way an attribute would, since it's an unambiguous marker that the function is intended to be called from outside the DSO. It's less obvious to me that the same is true for global variables; is there no distinction in OpenCL between global variables that can be accessed by the host and global variables that are internal to the device-side computation?

This makes more sense, thank you for the explanation. Ignore my last post, it makes sense that visibility is defaulted and ignored when a symbol is assigned internal linkage.

I am still not confident about globals; maybe @b-sumner has more insight? We have APIs for accessing global variable symbols form the host, but this may be AMD-specific, not general to OpenCL.

One thing I can think of the global variables is their initialization. If their initialization must be done by the runtime, then they have to be visible to the runtime.

Fri, Oct 12, 12:54 PM
scott.linder updated subscribers of D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

The rationale is that -fvisibility only affects the default, and already does not apply in many cases. For example, see the rest of the conditions above the fvisibility check in getLVForNamespaceScopeDecl: when Var->getStorageClass() == SC_Static the linkage is (usually) internal and the visibility is default.

Visibility is meaningless for an internal-linkage declaration.

I think one question is whether OpenCL language semantics allow us to make these visibility determinations; I am going off of the APIs available to access symbols.

-fvisibility is not governed by the OpenCL specification; it's a Clang / GCC extension, and we get to determine its semantics, which generally override the rules from the language model we're implementing.

I'm a little surprised that symbol visibility is even meaningful when compiling GPU code. Can you give some more background on that?

To the extent that visibility is meaningful, it does seem to make sense for kernel to imply default visibility in the same way an attribute would, since it's an unambiguous marker that the function is intended to be called from outside the DSO. It's less obvious to me that the same is true for global variables; is there no distinction in OpenCL between global variables that can be accessed by the host and global variables that are internal to the device-side computation?

Fri, Oct 12, 12:31 PM
scott.linder added a comment to D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

That's my understanding, at least.

Fri, Oct 12, 12:28 PM
scott.linder added a comment to D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

The rationale is that -fvisibility only affects the default, and already does not apply in many cases. For example, see the rest of the conditions above the fvisibility check in getLVForNamespaceScopeDecl: when Var->getStorageClass() == SC_Static the linkage is (usually) internal and the visibility is default. In cases where individual symbols need unique visibility an __attribute__ can be used.

Fri, Oct 12, 11:33 AM
scott.linder updated the diff for D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.

Address feedback

Fri, Oct 12, 9:02 AM

Thu, Oct 11

scott.linder created D53153: [OpenCL] Mark namespace scope variables and kernel functions with default visibility.
Thu, Oct 11, 11:12 AM

Wed, Oct 10

scott.linder updated the diff for D48144: [Support] Teach YAMLIO about polymorphic types.

Rebase and remove redundant qualifiers

Wed, Oct 10, 11:28 AM
scott.linder closed D53088: [Support] Remove redundant qualifiers in YAMLTraits (NFC).

rL344166

Wed, Oct 10, 11:23 AM
scott.linder committed rL344166: [Support] Remove redundant qualifiers in YAMLTraits (NFC).
[Support] Remove redundant qualifiers in YAMLTraits (NFC)
Wed, Oct 10, 11:16 AM
scott.linder added a comment to D48144: [Support] Teach YAMLIO about polymorphic types.

Cleanup patch is at https://reviews.llvm.org/D53088

Wed, Oct 10, 9:46 AM
scott.linder created D53088: [Support] Remove redundant qualifiers in YAMLTraits (NFC).
Wed, Oct 10, 9:43 AM
scott.linder added a comment to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

Ping

Wed, Oct 10, 9:39 AM
scott.linder committed rL344153: Relax trivial cast requirements in CallPromotionUtils.
Relax trivial cast requirements in CallPromotionUtils
Wed, Oct 10, 9:37 AM
scott.linder closed D52792: Relax trivial cast requirements in CallPromotionUtils.
Wed, Oct 10, 9:37 AM

Tue, Oct 9

scott.linder updated the diff for D48144: [Support] Teach YAMLIO about polymorphic types.

Address feedback

Tue, Oct 9, 12:39 PM
scott.linder added a comment to D48144: [Support] Teach YAMLIO about polymorphic types.

I can submit a separate patch to clean up things like making the empty definitions into declarations, removing redundant public: qualifiers, and removing redundant references to this, but I don't want to mix those changes into this patch.

Tue, Oct 9, 12:34 PM

Mon, Oct 8

scott.linder updated the diff for D52894: Update CallSite docs and add a new function (NFC).

Remove update of dyn_cast in WebAssemblyFixFunctionBitcasts which does not use CallSite

Mon, Oct 8, 1:01 PM
scott.linder added a comment to D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions.

I will update the patch to modify the HIP toolchain and to add tests for global variables.

Mon, Oct 8, 12:06 PM
scott.linder committed rL343992: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.
[AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions
Mon, Oct 8, 11:50 AM
scott.linder closed D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.
Mon, Oct 8, 11:50 AM
scott.linder updated the diff for D52894: Update CallSite docs and add a new function (NFC).

Address feedback

Mon, Oct 8, 10:13 AM

Thu, Oct 4

scott.linder accepted D52897: AMDGPU: Rename isAmdCodeObjectV2 -> isAmdHsaOrMesa.

LGTM

Thu, Oct 4, 11:47 AM
scott.linder added a dependency for D52824: [AMDGPU] Implemented MCELFNoteDisassembler for PAL metadata note: D52823: [Disassembler] MCELFNoteDisassembler abstraction.
Thu, Oct 4, 10:12 AM
scott.linder added a dependent revision for D52823: [Disassembler] MCELFNoteDisassembler abstraction: D52824: [AMDGPU] Implemented MCELFNoteDisassembler for PAL metadata note.
Thu, Oct 4, 10:12 AM
scott.linder added a dependent revision for D52821: [Disassembler][llvm-readobj] ELF note dumper abstraction: D52822: [llvm-readobj][AMDGPU] Moved AMDGPU-specific note record dumping into target.
Thu, Oct 4, 10:11 AM
scott.linder added a dependency for D52822: [llvm-readobj][AMDGPU] Moved AMDGPU-specific note record dumping into target: D52821: [Disassembler][llvm-readobj] ELF note dumper abstraction.
Thu, Oct 4, 10:11 AM
scott.linder created D52894: Update CallSite docs and add a new function (NFC).
Thu, Oct 4, 10:06 AM
scott.linder added a comment to D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

Ping

Thu, Oct 4, 9:50 AM
scott.linder updated the diff for D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions.

Update docs

Thu, Oct 4, 9:30 AM
scott.linder added a comment to D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions.

I don't know who else to add as a reviewer; Sam, is there someone else outside of AMD that would be interested in reviewing this?

Thu, Oct 4, 9:24 AM
scott.linder created D52891: [AMDGPU] Add -fvisibility-amdgpu-non-kernel-functions.
Thu, Oct 4, 9:20 AM

Wed, Oct 3

scott.linder updated the diff for D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

Address feedback

Wed, Oct 3, 9:02 AM
scott.linder added inline comments to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.
Wed, Oct 3, 8:58 AM
scott.linder updated the diff for D52792: Relax trivial cast requirements in CallPromotionUtils.

Remove -debug flag from test

Wed, Oct 3, 8:57 AM
scott.linder updated the diff for D52792: Relax trivial cast requirements in CallPromotionUtils.

Add test using IndirectCallPromotion

Wed, Oct 3, 8:55 AM

Tue, Oct 2

scott.linder added a dependent revision for D52792: Relax trivial cast requirements in CallPromotionUtils: D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.
Tue, Oct 2, 11:57 AM
scott.linder added a comment to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

Thanks, Sam. I think the requirements of the two passes are different enough that combining them is not particularly useful.

Tue, Oct 2, 11:57 AM
scott.linder added a dependency for D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls: D52792: Relax trivial cast requirements in CallPromotionUtils.
Tue, Oct 2, 11:57 AM
scott.linder updated the diff for D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

Move CallPromotionUtils changes to new review (https://reviews.llvm.org/D52792)

Tue, Oct 2, 11:57 AM
scott.linder updated the diff for D52792: Relax trivial cast requirements in CallPromotionUtils.

Sorry, moved a little fast and grabbed the wrong patch at first.

Tue, Oct 2, 11:25 AM
scott.linder created D52792: Relax trivial cast requirements in CallPromotionUtils.
Tue, Oct 2, 11:22 AM
scott.linder updated subscribers of D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

After reading the WebAsm code more closely it seems like our goals are different. They seem to support indirect calls, but only if certain properties of the functions match (namely number of arguments and return type). For example, the existing pass does not transform anything (e.g. leaves a bitcast function call present) if the bitcast only modifies argument types in a trivial way. For AMDGPU we need to always eliminate the bitcast, and we don't necessarily need to be generating wrapper functions.

Tue, Oct 2, 9:45 AM
scott.linder added a comment to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

The WebAsm code essentially does the same thing, but also creates wrapper functions when the number of arguments differ, or when the return value differs (it seems like this could just be a cast as well). I will move the WebAsm pass to lib/Transform and use it in both backends. I am not sure who from WebAsm to ask to take a look once I'm done?

Tue, Oct 2, 7:59 AM

Mon, Oct 1

scott.linder added a comment to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

@mssimpso Are you OK with reviewing the changes to CallPromotionUtils.cpp in this patch? I wasn't sure who the correct code owner was.

I'm not sure there is a code owner, but the CallPromotionUtils part looks good to me.

Mon, Oct 1, 1:17 PM
scott.linder added a comment to D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.

@mssimpso Are you OK with reviewing the changes to CallPromotionUtils.cpp in this patch? I wasn't sure who the correct code owner was.

Mon, Oct 1, 12:43 PM
scott.linder requested changes to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

This patch requires changes to CallGraph, and doesn't address inlining through bitcasts. I think https://reviews.llvm.org/D52741 can supercede it, and if we still want to make those changes to CallGraph they can be done independently.

Mon, Oct 1, 12:06 PM
scott.linder created D52741: [AMDGPU] Add an AMDGPU pass to promote bitcast calls.
Mon, Oct 1, 12:05 PM

Mon, Sep 24

scott.linder added a comment to D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

Ping

Mon, Sep 24, 9:23 AM
scott.linder added a comment to D48144: [Support] Teach YAMLIO about polymorphic types.

Ping

Mon, Sep 24, 9:18 AM

Sep 14 2018

scott.linder updated the diff for D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

Address feedback: make MDT optional, add braces, run clang-format.

Sep 14 2018, 11:50 AM

Sep 12 2018

scott.linder added a comment to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

It looks like to make this change we need to update CallGraph as well; it currently just uses .getCalledFunction(), so the graph doesn't have the edge created by the bitcasted call.

Sep 12 2018, 9:33 AM

Sep 11 2018

scott.linder added a reviewer for D51954: AMDGPU: Print all kernel descriptor directives (including the ones with default values): arsenm.

LGTM, but adding Matt because I think he was the one to originally request eliding defaulted fields. From what Tony described I agree we can't do this generally because of options which depend on attributes, e.g. -mattr=+/-xnack changes the "default".

Sep 11 2018, 3:48 PM
scott.linder added a comment to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

Seems to be related to the order; this succeeds:

Sep 11 2018, 3:34 PM
scott.linder added a comment to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

I still see asserts related to bitcasted function calls, but I haven't gotten to the bottom of why; an example:

Sep 11 2018, 3:09 PM

Sep 10 2018

scott.linder added a reviewer for D48144: [Support] Teach YAMLIO about polymorphic types: zturner.
Sep 10 2018, 10:41 AM
scott.linder updated the diff for D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

Update patch to include all changes since the original patch was reverted.

Sep 10 2018, 10:34 AM
scott.linder added a comment to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

Sorry; I just checked again and llc had not been rebulit for some reason. I deleted it and ran the build again and that example does compile fine.

Sep 10 2018, 8:09 AM

Sep 7 2018

scott.linder added a comment to D51795: AMDGPU: Don't error on calls to constexpr casts of functions.

I still don't understand exactly when a call is indirect, but when an argument is cast this still seems to fail:

Sep 7 2018, 1:25 PM

Sep 6 2018

scott.linder committed rL341589: Revert r341413.
Revert r341413
Sep 6 2018, 2:40 PM
scott.linder updated the diff for D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

Update existing tests RUN lines with -verify-machine-dom-info. This replicates what the expensive tests enabled.

Sep 6 2018, 12:37 PM
scott.linder added a comment to D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.

The regression was caught by the expensive_checks build, if that is sufficient for testing? I am actually not sure how to write a lit test for this, because opt -analyze will just re-calculate the DT; I don't know how to just print the DT after the legalize pass runs.

Sep 6 2018, 12:34 PM
scott.linder committed rL341574: Move init code in AArch64SelectionDAGTest to SetUpTestCase (NFC).
Move init code in AArch64SelectionDAGTest to SetUpTestCase (NFC)
Sep 6 2018, 11:44 AM
scott.linder added inline comments to D50659: [CodeGen] Fix remaining zext() assertions in SelectionDAG.
Sep 6 2018, 11:44 AM
scott.linder added reviewers for D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree: rampitec, b-sumner.
Sep 6 2018, 11:28 AM
scott.linder added inline comments to D50659: [CodeGen] Fix remaining zext() assertions in SelectionDAG.
Sep 6 2018, 11:19 AM
scott.linder added a comment to rL341413: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.

Review for fix of that regression is at https://reviews.llvm.org/D51742

Sep 6 2018, 11:13 AM
scott.linder created D51742: [AMDGPU] Fix regression with not maintaining MachineDominatorTree.
Sep 6 2018, 11:13 AM

Sep 4 2018

scott.linder added a comment to D48144: [Support] Teach YAMLIO about polymorphic types.

Ping

Sep 4 2018, 2:58 PM
scott.linder committed rL341413: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.
[AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions
Sep 4 2018, 2:52 PM
scott.linder closed D50982: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.
Sep 4 2018, 2:52 PM
scott.linder accepted D51592: AMDGPU: Fix DAG divergence not reporting flat loads.

LGTM, thanks!

Sep 4 2018, 11:21 AM
scott.linder added a comment to D50982: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.

Ping

Sep 4 2018, 11:01 AM
scott.linder committed rL341380: [CodeGen] Fix remaining zext() assertions in SelectionDAG.
[CodeGen] Fix remaining zext() assertions in SelectionDAG
Sep 4 2018, 9:34 AM
scott.linder closed D50659: [CodeGen] Fix remaining zext() assertions in SelectionDAG.
Sep 4 2018, 9:34 AM
scott.linder added inline comments to D51592: AMDGPU: Fix DAG divergence not reporting flat loads.
Sep 4 2018, 8:25 AM

Aug 31 2018

scott.linder updated the diff for D50659: [CodeGen] Fix remaining zext() assertions in SelectionDAG.

Update test name to be clear it only runs when aarch64 is available.

Aug 31 2018, 12:08 PM
scott.linder added inline comments to D50659: [CodeGen] Fix remaining zext() assertions in SelectionDAG.
Aug 31 2018, 9:38 AM
scott.linder updated the diff for D48175: [BinaryFormat] Add MsgPackTypes.

Added test for shared ownership. I've run it locally with ASan and UBSan enabled.

Aug 31 2018, 9:10 AM

Aug 30 2018

scott.linder added a comment to D48175: [BinaryFormat] Add MsgPackTypes.

The use I have in mind is actually an external component which depends on LLVM. We currently implement reading MessagePack metadata there, but when we land the changes to generate it in LLVM it would be nice to avoid duplicating the code there.

Aug 30 2018, 12:28 PM

Aug 29 2018

scott.linder added a reviewer for D51434: [HIP] Add -fvisibility hidden option to clang: arsenm.

+Matt to confirm, but our executable format is a shared object and we eventually want to support preemptible symbols through the PLT. We already generate GOT entries for globals.

Aug 29 2018, 11:11 AM

Aug 28 2018

scott.linder updated the diff for D48175: [BinaryFormat] Add MsgPackTypes.

I've updated the patch to just support shared_ptr; I think the more generic option is right for now.

Aug 28 2018, 2:43 PM
scott.linder updated the diff for D48179: [AMDGPU] Emit MessagePack HSA Metadata for v3 code object.

Address feedback. There was no particular reason for not writing these inline, other than to prevent typos. With the verifier and tests we have I don't think that is an issue, and I agree these being grepable is worthwhile.

Aug 28 2018, 12:45 PM
scott.linder updated the diff for D50982: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.

Use MRI.clearKillFlags

Aug 28 2018, 8:29 AM
scott.linder updated the diff for D48175: [BinaryFormat] Add MsgPackTypes.

Add virtual destructor to base class and clean up tests.

Aug 28 2018, 8:10 AM
scott.linder added inline comments to D48175: [BinaryFormat] Add MsgPackTypes.
Aug 28 2018, 8:10 AM
scott.linder updated the diff for D48144: [Support] Teach YAMLIO about polymorphic types.

Default the test case destructor rather than give it an empty body.

Aug 28 2018, 7:49 AM

Aug 27 2018

scott.linder updated the diff for D48144: [Support] Teach YAMLIO about polymorphic types.

In cleaning up this patch I noticed our YAML parser does not handle an explicit document containing only a block-level plain scalar, e.g.:

Aug 27 2018, 1:40 PM

Aug 24 2018

scott.linder updated the diff for D50982: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.

Clear killed flag on uses, add test at -O0

Aug 24 2018, 1:55 PM
scott.linder added a dependent revision for D48144: [Support] Teach YAMLIO about polymorphic types: D51223: Update tests for new YAMLIO polymorphic traits.
Aug 24 2018, 10:14 AM
scott.linder added a dependency for D51223: Update tests for new YAMLIO polymorphic traits: D48144: [Support] Teach YAMLIO about polymorphic types.
Aug 24 2018, 10:14 AM
scott.linder created D51223: Update tests for new YAMLIO polymorphic traits.
Aug 24 2018, 10:14 AM
scott.linder updated the diff for D48144: [Support] Teach YAMLIO about polymorphic types.

Rebase and ping

Aug 24 2018, 10:11 AM
scott.linder closed D44429: [BinaryFormat] MessagePack reader/writer.

Modular build and UB fixes (r340507) are committed.

Aug 24 2018, 9:24 AM

Aug 23 2018

scott.linder added a comment to D50982: [AMDGPU] Legalize VGPR Rsrc operands for MUBUF instructions.

I think even in the simplest case my code is wrong. The $soffset operand to the BUFFER_LOAD remains killed but the def is not moved into the LoopBB. I am essentially transforming:

Aug 23 2018, 1:59 PM

Aug 22 2018

scott.linder committed rL340507: Fix undefined behavior in r340457.
Fix undefined behavior in r340457
Aug 22 2018, 7:52 PM
scott.linder updated subscribers of D44429: [BinaryFormat] MessagePack reader/writer.

Thank you both! I will try to keep modules in mind in the future.

Aug 22 2018, 7:05 PM
scott.linder committed rL340457: [BinaryFormat] Add MessagePack reader/writer.
[BinaryFormat] Add MessagePack reader/writer
Aug 22 2018, 2:43 PM
scott.linder closed D44429: [BinaryFormat] MessagePack reader/writer.
Aug 22 2018, 2:43 PM