
yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (280 w, 5 d)

Recent Activity

Yesterday

yaxunl committed rG5a3023a91c0e: [HIP] Return non-zero value for invalid target ID (authored by yaxunl).
[HIP] Return non-zero value for invalid target ID
Mon, Sep 28, 8:17 PM
yaxunl committed rG187658b8a611: Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024" (authored by yaxunl).
Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Mon, Sep 28, 8:01 PM
yaxunl committed rG10eb3bf2d430: Skip -fPIE for AMDGPU and HIP toolchain (authored by yaxunl).
Skip -fPIE for AMDGPU and HIP toolchain
Mon, Sep 28, 7:25 PM
yaxunl closed D88425: Skip -fPIE for AMDGPU and HIP toolchain.
Mon, Sep 28, 7:25 PM · Restricted Project
yaxunl updated the diff for D88370: Emit predefined macro for wavefront size for amdgcn.

revised by Matt's comments.

Mon, Sep 28, 1:05 PM
yaxunl added inline comments to D88425: Skip -fPIE for AMDGPU and HIP toolchain.
Mon, Sep 28, 10:36 AM · Restricted Project
yaxunl updated the diff for D88370: Emit predefined macro for wavefront size for amdgcn.

capitalize macro

Mon, Sep 28, 9:47 AM
yaxunl added inline comments to D88370: Emit predefined macro for wavefront size for amdgcn.
Mon, Sep 28, 9:42 AM
yaxunl added inline comments to D88377: Diagnose invalid target ID for AMDGPU toolchain for assembler.
Mon, Sep 28, 9:23 AM
yaxunl updated the diff for D88377: Diagnose invalid target ID for AMDGPU toolchain for assembler.

update patch with full context

Mon, Sep 28, 9:21 AM
yaxunl requested review of D88425: Skip -fPIE for AMDGPU and HIP toolchain.
Mon, Sep 28, 9:11 AM · Restricted Project

Sun, Sep 27

yaxunl requested review of D88377: Diagnose invalid target ID for AMDGPU toolchain for assembler.
Sun, Sep 27, 6:53 AM

Sat, Sep 26

yaxunl updated the diff for D88370: Emit predefined macro for wavefront size for amdgcn.

fix typo

Sat, Sep 26, 8:34 PM
yaxunl requested review of D88370: Emit predefined macro for wavefront size for amdgcn.
Sat, Sep 26, 7:41 PM
yaxunl added inline comments to D88345: [CUDA] Allow local `static const {__constant__, __device__}` variables..
Sat, Sep 26, 4:38 AM · Restricted Project

Fri, Sep 25

yaxunl added reviewers for D88303: [clang][codegen] Remove the insertion of `correctly-rounded-divide-sqrt-fp-math` fn-attr.: Anastasia, bader.
Fri, Sep 25, 7:40 AM · Restricted Project

Thu, Sep 24

yaxunl committed rGe39da8ab6a28: Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device… (authored by yaxunl).
Recommit "[CUDA][HIP] Defer overloading resolution diagnostics for host device…
Thu, Sep 24, 5:45 AM

Wed, Sep 23

yaxunl committed rG8e780a1653e6: Recommit [NFC] Refactor DiagnosticBuilder and PartialDiagnostic (authored by yaxunl).
Recommit [NFC] Refactor DiagnosticBuilder and PartialDiagnostic
Wed, Sep 23, 1:56 PM
yaxunl accepted D87947: [AMDGPU] Make ds fp atomics overloadable.
Wed, Sep 23, 11:37 AM · Restricted Project, Restricted Project
yaxunl committed rGe90343ada3bd: Fix regressioin in test dwp-separate-debug-file.cpp (authored by yaxunl).
Fix regressioin in test dwp-separate-debug-file.cpp
Wed, Sep 23, 8:50 AM
yaxunl committed rGe6d50b4f22dc: recommit [HIP] Fix -gsplit-dwarf option (authored by yaxunl).
recommit [HIP] Fix -gsplit-dwarf option
Wed, Sep 23, 8:22 AM
yaxunl committed rG301e23305d03: [CUDA][HIP] Fix static device var used by host code only (authored by yaxunl).
[CUDA][HIP] Fix static device var used by host code only
Wed, Sep 23, 5:20 AM
yaxunl closed D88115: [CUDA][HIP] Fix static device var used by host code only.
Wed, Sep 23, 5:20 AM · Restricted Project

Tue, Sep 22

yaxunl requested review of D88115: [CUDA][HIP] Fix static device var used by host code only.
Tue, Sep 22, 1:57 PM · Restricted Project

Sat, Sep 19

yaxunl added a reverting change for rGe50465ecefc9: [HIP] Fix -gsplit-dwarf option: rG2819cea2ef8a: Revert "[HIP] Fix -gsplit-dwarf option".
Sat, Sep 19, 7:18 AM
yaxunl committed rG2819cea2ef8a: Revert "[HIP] Fix -gsplit-dwarf option" (authored by yaxunl).
Revert "[HIP] Fix -gsplit-dwarf option"
Sat, Sep 19, 7:18 AM
yaxunl added a reverting change for D87791: [CUDA][HIP] Fix -gsplit-dwarf option: rG2819cea2ef8a: Revert "[HIP] Fix -gsplit-dwarf option".
Sat, Sep 19, 7:18 AM · Restricted Project
yaxunl committed rGe50465ecefc9: [HIP] Fix -gsplit-dwarf option (authored by yaxunl).
[HIP] Fix -gsplit-dwarf option
Sat, Sep 19, 7:08 AM
yaxunl closed D87791: [CUDA][HIP] Fix -gsplit-dwarf option.
Sat, Sep 19, 7:07 AM · Restricted Project

Fri, Sep 18

yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
In D84362#2283045, @tra wrote:

The fix is for the change in D84364. It has no effect on the change in this review. Are you sure the issue you saw is due to change in this review instead of change in D84364?

Pretty sure. We've bisected the failure specifically to commit ee5519d323571c4a9a7d92cb817023c9b95334cd, before D84364 landed.

Fri, Sep 18, 2:55 PM · Restricted Project
yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
In D84362#2282965, @tra wrote:

I have a fix for the issue reported in D84364. Would you like to try? Thanks.

I can try it on the internal test that crashed with the patch. I've reopened this review and will pick up the diff once you update it.

Fri, Sep 18, 2:32 PM · Restricted Project
yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
In D84362#2279845, @tra wrote:
In D84362#2279688, @tra wrote:

Apparently this patch triggers compiler crashes on some of our code. I'll try to create a reproducer, but it would be great to revert the patch for now.

It's likely the same issue as the one reported in D84364, but we probably triggered it independently.

Fri, Sep 18, 2:02 PM · Restricted Project
yaxunl added a comment to D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.

Not a known issue - no, but MSan doesn't play nice with uninstrumented libraries (including things like libcxx), so it can be tricky to ensure your build is properly sanitized, which is why I'd recommend the build script :).

Fri, Sep 18, 1:58 PM · Restricted Project
yaxunl added a comment to D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.

Looks like this patch broke the MSan buildbots, PTAL (repro instructions https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild):

http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/46239/steps/check-clang%20msan/logs/stdio

FAIL: Clang :: SemaCUDA/deferred-oeverload.cu (11308 of 26387)
******************** TEST 'Clang :: SemaCUDA/deferred-oeverload.cu' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/clang -cc1 -internal-isystem /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/lib/clang/12.0.0/include -nostdsysteminc -fcuda-is-device -fsyntax-only -verify=dev,com /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/SemaCUDA/deferred-oeverload.cu    -std=c++11 -fgpu-defer-diag
: 'RUN: at line 3';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/clang -cc1 -internal-isystem /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/lib/clang/12.0.0/include -nostdsysteminc -fsyntax-only -verify=host,com /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/test/SemaCUDA/deferred-oeverload.cu    -std=c++11 -fgpu-defer-diag
--
Exit Code: 77

Command Output (stderr):
--
==41680==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xddfbf73 in clang::OverloadCandidateSet::CompleteCandidates(clang::Sema&, clang::OverloadCandidateDisplayKind, llvm::ArrayRef<clang::Expr*>, clang::SourceLocation, llvm::function_ref<bool (clang::OverloadCandidate&)>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaOverload.cpp:11484:9
    #1 0xde4e7ca in clang::OverloadCandidateSet::NoteCandidates(std::__1::pair<clang::SourceLocation, clang::PartialDiagnostic>, clang::Sema&, clang::OverloadCandidateDisplayKind, llvm::ArrayRef<clang::Expr*>, llvm::StringRef, clang::SourceLocation, llvm::function_ref<bool (clang::OverloadCandidate&)>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaOverload.cpp:11529:9
    #2 0xde6316f in FinishOverloadedCallExpr(clang::Sema&, clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef<clang::Expr*>, clang::SourceLocation, clang::Expr*, clang::OverloadCandidateSet*, clang::OverloadCandidate**, clang::OverloadingResult, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaOverload.cpp:12959:19
    #3 0xde6296b in clang::Sema::BuildOverloadedCallExpr(clang::Scope*, clang::Expr*, clang::UnresolvedLookupExpr*, clang::SourceLocation, llvm::MutableArrayRef<clang::Expr*>, clang::SourceLocation, clang::Expr*, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaOverload.cpp:13032:10
    #4 0xd4e126b in clang::Sema::BuildCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef<clang::Expr*>, clang::SourceLocation, clang::Expr*, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaExpr.cpp:6378:16
    #5 0xd53e073 in clang::Sema::ActOnCallExpr(clang::Scope*, clang::Expr*, clang::SourceLocation, llvm::MutableArrayRef<clang::Expr*>, clang::SourceLocation, clang::Expr*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaExpr.cpp:6275:7
    #6 0xcaa957f in clang::Parser::ParsePostfixExpressionSuffix(clang::ActionResult<clang::Expr*, true>) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseExpr.cpp:2066:23
    #7 0xcaaf03e in clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, bool&, clang::Parser::TypeCastState, bool, bool*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseExpr.cpp:1811:9
    #8 0xcaa5f3e in clang::Parser::ParseCastExpression(clang::Parser::CastParseKind, bool, clang::Parser::TypeCastState, bool, bool*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseExpr.cpp:681:20
    #9 0xcaa15f3 in clang::Parser::ParseAssignmentExpression(clang::Parser::TypeCastState) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseExpr.cpp:173:20
    #10 0xcaa13ad in clang::Parser::ParseExpression(clang::Parser::TypeCastState) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseExpr.cpp:124:18
    #11 0xcbc8288 in clang::Parser::ParseExprStatement(clang::Parser::ParsedStmtContext) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:446:19
    #12 0xcbc2183 in clang::Parser::ParseStatementOrDeclarationAfterAttributes(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*, clang::Parser::ParsedAttributesWithRange&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:234:12
    #13 0xcbc119d in clang::Parser::ParseStatementOrDeclaration(llvm::SmallVector<clang::Stmt*, 32u>&, clang::Parser::ParsedStmtContext, clang::SourceLocation*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:106:20
    #14 0xcbdd830 in clang::Parser::ParseCompoundStatementBody(bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:1098:11
    #15 0xcbe0b67 in clang::Parser::ParseFunctionStatementBody(clang::Decl*, clang::Parser::ParseScope&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseStmt.cpp:2259:21
    #16 0xc9c26d4 in clang::Parser::ParseFunctionDefinition(clang::ParsingDeclarator&, clang::Parser::ParsedTemplateInfo const&, clang::Parser::LateParsedAttrList*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:1375:10
    #17 0xca17b5b in clang::Parser::ParseDeclGroup(clang::ParsingDeclSpec&, clang::DeclaratorContext, clang::SourceLocation*, clang::Parser::ForRangeInit*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseDecl.cpp:1924:27
    #18 0xc9bf6ee in clang::Parser::ParseDeclOrFunctionDefInternal(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec&, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:1135:10
    #19 0xc9bde93 in clang::Parser::ParseDeclarationOrFunctionDefinition(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*, clang::AccessSpecifier) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:1151:12
    #20 0xc9bb56b in clang::Parser::ParseExternalDeclaration(clang::Parser::ParsedAttributesWithRange&, clang::ParsingDeclSpec*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:971:12
    #21 0xc9b5d18 in clang::Parser::ParseTopLevelDecl(clang::OpaquePtr<clang::DeclGroupRef>&, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/Parser.cpp:716:12
    #22 0xc9a49cf in clang::ParseAST(clang::Sema&, bool, bool) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Parse/ParseAST.cpp:158:20
    #23 0x8de7de0 in clang::FrontendAction::Execute() /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/FrontendAction.cpp:950:8
    #24 0x8ce176d in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:984:33
    #25 0x905ccb9 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278:25
    #26 0xb9bc9d in cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/cc1_main.cpp:240:15
    #27 0xb93aa6 in ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:330:12
    #28 0xb928ad in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/tools/driver/driver.cpp:407:12
    #29 0x7f3d1d0dc09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
    #30 0xb15209 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/clang-12+0xb15209)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/clang/lib/Sema/SemaOverload.cpp:11484:9 in clang::OverloadCandidateSet::CompleteCandidates(clang::Sema&, clang::OverloadCandidateDisplayKind, llvm::ArrayRef<clang::Expr*>, clang::SourceLocation, llvm::function_ref<bool (clang::OverloadCandidate&)>)
Exiting

--

********************
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 
********************
Failed Tests (1):
  Clang :: SemaCUDA/deferred-oeverload.cu
Fri, Sep 18, 11:06 AM · Restricted Project
yaxunl added a comment to D87858: [hip] Add HIP scope atomic ops..
In D87858#2280429, @jfb wrote:

Please provide documentation in this patch.

Fri, Sep 18, 9:32 AM · Restricted Project
yaxunl added reviewers for D87858: [hip] Add HIP scope atomic ops.: tra, rjmccall.
Fri, Sep 18, 9:27 AM · Restricted Project

Thu, Sep 17

yaxunl added a comment to D87791: [CUDA][HIP] Fix -gsplit-dwarf option.
In D87791#2279885, @tra wrote:

It is requested by our debugger team, so it should work with amdgpu.

Is the naming scheme for GPU-side DWO files dictated by debugger? If that's the case, it may be worth adding a comment about that.

LGTM.

Thu, Sep 17, 11:40 AM · Restricted Project
yaxunl added a comment to D87791: [CUDA][HIP] Fix -gsplit-dwarf option.
In D87791#2279821, @tra wrote:

Therefore in either case there is no need to rename the intermediate .o files since they are temporary files which have unique names.

The .dwo files are not temporary files. They are supposed to be shipped with .o files for debugging info.

Ack.
BTW, is split-dwarf useful for AMD GPUs on device side? I don't think we can currently utilize DWO files on device side with CUDA at all. Come to think of it, it's probably going to break GPU-side debugging, as CUDA can only deal with DWARF info embedded in the GPU binary.
If it does not work for AMD GPUs, perhaps we should just disable it for GPUs.

Thu, Sep 17, 11:12 AM · Restricted Project
yaxunl added a comment to D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.

Thu, Sep 17, 11:02 AM · Restricted Project
yaxunl added a reverting change for rG40df06cdafc0: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions: rG772bd8a7d99b: Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device….
Thu, Sep 17, 10:57 AM
yaxunl added a reverting change for rG7f1f89ec8d99: Fix build failure in clangd: rG772bd8a7d99b: Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device….
Thu, Sep 17, 10:57 AM
yaxunl committed rG772bd8a7d99b: Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device… (authored by yaxunl).
Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device…
Thu, Sep 17, 10:57 AM
yaxunl committed rG829d14ee0a6a: Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic" (authored by yaxunl).
Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic"
Thu, Sep 17, 10:57 AM
yaxunl added a reverting change for rGee5519d32357: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic: rG829d14ee0a6a: Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic".
Thu, Sep 17, 10:57 AM
yaxunl added a reverting change for D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions: rG772bd8a7d99b: Revert "[CUDA][HIP] Defer overloading resolution diagnostics for host device….
Thu, Sep 17, 10:57 AM · Restricted Project
yaxunl added a reverting change for D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic: rG829d14ee0a6a: Revert "[NFC] Refactor DiagnosticBuilder and PartialDiagnostic".
Thu, Sep 17, 10:57 AM · Restricted Project
yaxunl committed rG7f1f89ec8d99: Fix build failure in clangd (authored by yaxunl).
Fix build failure in clangd
Thu, Sep 17, 8:51 AM
yaxunl committed rG40df06cdafc0: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions (authored by yaxunl).
[CUDA][HIP] Defer overloading resolution diagnostics for host device functions
Thu, Sep 17, 8:32 AM
yaxunl closed D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.
Thu, Sep 17, 8:32 AM · Restricted Project

Wed, Sep 16

yaxunl added a comment to D87791: [CUDA][HIP] Fix -gsplit-dwarf option.
In D87791#2277887, @tra wrote:

Is this naming scheme the same as the one used for .o files? We may want to keep them in sync.

Other than that, LGTM.

Wed, Sep 16, 7:54 PM · Restricted Project
yaxunl committed rGee5519d32357: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic (authored by yaxunl).
[NFC] Refactor DiagnosticBuilder and PartialDiagnostic
Wed, Sep 16, 2:36 PM
yaxunl closed D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
Wed, Sep 16, 2:36 PM · Restricted Project
yaxunl requested review of D87791: [CUDA][HIP] Fix -gsplit-dwarf option.
Wed, Sep 16, 1:27 PM · Restricted Project
yaxunl updated the diff for D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.

Revised by Artem's comments. Extracted the common code of DiagnosticBuilder and PartialDiagnostic and removed redundant code.

Wed, Sep 16, 1:10 PM · Restricted Project

Tue, Sep 15

yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
In D84362#2274884, @tra wrote:

There are use patterns expecting PartialDiagnosticInst << X << Y to continue to be a PartialDiagnostic&, e.g.

PartialDiagnosticAt PDAt(SourceLoc, PartialDiagnosticInst << X << Y);

However if we derive PartialDiagnostic and DiagnosticBuilder from a base class DiagnosticBuilderBase which implements the << operators, PartialDiagnosticInst << X << Y will become a DiagnosticBuilderBase&, then we can no longer write the above code.

That's one reason I use templates to implement << operators.

Do we want to sacrifice this convenience?

I don't think we have to.
AFAICT, virtually all such patterns (and there are only 20-ish of them in total) are used with EmitFormatDiagnostic(S.PDiag()) which could be adapted to accept DiagnosticBuilderBase and Sema::PDiag() changed to return PartialDiagnosticBuilder with no loss of convenience.

Tue, Sep 15, 12:28 PM · Restricted Project
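
The return-type concern in the D84362 comment above can be sketched in a few lines. This is a minimal illustration, not clang's code: BuilderBase, PartialDiag, PartialDiagAt, appendViaBase, makeExample and selfCheck are all hypothetical stand-ins. It shows that a streaming operator written once against the base class returns a base reference, whereas a template constrained to derived types keeps the derived type across a chain, so the result can still be passed where the derived type is required.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for illustration only; these are NOT clang's classes.
struct BuilderBase {
  std::string Text;
  void addArg(const std::string &S) { Text += S; }
};
struct PartialDiag : BuilderBase {};

// Variant A: streaming implemented once on the base class.
// Every call yields BuilderBase&, so the derived type is lost.
inline BuilderBase &appendViaBase(BuilderBase &B, const std::string &S) {
  B.addArg(S);
  return B;
}

// Variant B: a template constrained to types derived from BuilderBase.
// The static type of the left operand is preserved across the chain.
template <typename DiagT,
          typename = typename std::enable_if<
              std::is_base_of<BuilderBase, DiagT>::value>::type>
DiagT &operator<<(DiagT &D, const std::string &S) {
  D.addArg(S);
  return D;
}

// A consumer that needs a PartialDiag specifically
// (in the spirit of PartialDiagnosticAt in the comment above).
struct PartialDiagAt {
  int Loc;
  PartialDiag PD;
  PartialDiagAt(int L, const PartialDiag &P) : Loc(L), PD(P) {}
};

inline PartialDiagAt makeExample() {
  PartialDiag PD;
  // Compiles because (PD << "X" << "Y") is still a PartialDiag&.
  return PartialDiagAt(42, PD << "X" << "Y");
  // By contrast, PartialDiagAt(42, appendViaBase(PD, "X")) would not compile:
  // BuilderBase& does not convert implicitly to const PartialDiag&.
}

inline void selfCheck() {
  PartialDiagAt At = makeExample();
  assert(At.Loc == 42 && At.PD.Text == "XY");
}
```

The alternative suggested in the reply, changing the consumers to accept the base type, would make Variant A sufficient; which trade-off the actual patch took is not shown here.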
yaxunl added a comment to D87321: Fix -gz=zlib options for linker.

This broke the PPC LLD bot and the failure has been ignored for 4 days. I believe it should be fixed with 3bc3983f229.

Tue, Sep 15, 11:40 AM · Restricted Project
yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.
In D84362#2271585, @tra wrote:

So, the idea here is to do some sort of duck-typing and allow DiagBuilder to work with both DiagnosticBuilder and PartialDiagnostic.

What bothers me is that, unlike Diagnostic, PartialDiagnostic seems to be commingling the functionality of the builder with that of the diagnostic itself. I'm not sure if that's by design or just happened to be that way.

I think a better approach would be to refactor PartialDiagnostic and separate the builder functionality. That should make it possible to create a common diagnostic builder base class with Partial/Full diagnostic deriving their own builder, if needed.

That said, I'm not that familiar with the diags. Perhaps @rtrieu @aaron.ballman would have better ideas.

I'm similarly a bit uncomfortable with adding the SFINAE magic to make this work instead of making a base class that will work for either kind of diagnostic builder. I'm adding @rsmith to see if he has opinions as well.

Tue, Sep 15, 11:07 AM · Restricted Project

Fri, Sep 11

yaxunl committed rGee13ae030e21: Fix test hip-gz-options.hip (authored by yaxunl).
Fix test hip-gz-options.hip
Fri, Sep 11, 2:58 PM
yaxunl committed rGccb4124a4172: Fix -gz=zlib options for linker (authored by yaxunl).
Fix -gz=zlib options for linker
Fri, Sep 11, 2:13 PM
yaxunl closed D87321: Fix -gz=zlib options for linker.
Fri, Sep 11, 2:13 PM · Restricted Project
yaxunl added inline comments to D87321: Fix -gz=zlib options for linker.
Fri, Sep 11, 12:56 PM · Restricted Project
yaxunl added inline comments to D87321: Fix -gz=zlib options for linker.
Fri, Sep 11, 12:45 PM · Restricted Project
yaxunl updated the diff for D87321: Fix -gz=zlib options for linker.

fix tests

Fri, Sep 11, 12:45 PM · Restricted Project
yaxunl added inline comments to D87321: Fix -gz=zlib options for linker.
Fri, Sep 11, 11:17 AM · Restricted Project
yaxunl updated the diff for D87321: Fix -gz=zlib options for linker.

fix tests

Fri, Sep 11, 11:17 AM · Restricted Project
yaxunl added inline comments to D87321: Fix -gz=zlib options for linker.
Fri, Sep 11, 8:32 AM · Restricted Project
yaxunl updated the diff for D87321: Fix -gz=zlib options for linker.

revised by Fangrui's comments.

Fri, Sep 11, 8:27 AM · Restricted Project

Thu, Sep 10

yaxunl updated the diff for D84822: Add documentation for target ID and ClangOffloadBundlerFormat.

revised by Tony's and Konstantin's comments

Thu, Sep 10, 1:37 PM · Restricted Project
yaxunl committed rG4934127e627d: Diable sanitizer options for amdgpu (authored by yaxunl).
Diable sanitizer options for amdgpu
Thu, Sep 10, 12:42 PM
yaxunl closed D87461: Disable sanitizer options for AMDGPU.
Thu, Sep 10, 12:41 PM · Restricted Project
yaxunl added a comment to D87325: [HIP] Add -emit-pch option to clang driver.

need tests for --cuda-host-only and default host/device compilation, and tests for C/C++

Thu, Sep 10, 11:50 AM
yaxunl added reviewers for D87325: [HIP] Add -emit-pch option to clang driver: rsmith, tra.
Thu, Sep 10, 11:50 AM
yaxunl requested review of D87461: Disable sanitizer options for AMDGPU.
Thu, Sep 10, 8:54 AM · Restricted Project

Tue, Sep 8

yaxunl committed rG041da0d828e3: [HIP] Add gfx1031 and gfx1030 (authored by yaxunl).
[HIP] Add gfx1031 and gfx1030
Tue, Sep 8, 1:39 PM
yaxunl closed D87324: [HIP] Add gfx1030 and gfx1031.
Tue, Sep 8, 1:39 PM · Restricted Project
yaxunl requested review of D87324: [HIP] Add gfx1030 and gfx1031.
Tue, Sep 8, 1:07 PM · Restricted Project
yaxunl requested review of D87321: Fix -gz=zlib options for linker.
Tue, Sep 8, 12:05 PM · Restricted Project
yaxunl added a comment to D84362: [NFC] Refactor DiagnosticBuilder and PartialDiagnostic.

ping. this is needed by https://reviews.llvm.org/D84364

Tue, Sep 8, 8:42 AM · Restricted Project

Thu, Sep 3

yaxunl added a comment to D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.
In D84364#2255572, @tra wrote:

LGTM.

Nice!

To sum it up -- the patch introduces the -fgpu-defer-diag flag, which allows deferring overload resolution diagnostics if the overload set includes candidates from both sides.

We may be deferring cases when we don't have to (e.g. df()->()callee2() should've errored out right away, even during host compilation, as there's no way it could ever be valid), but this approach is a good starting point. It affects only the interesting subset of diags, is not enabled by default, and can be refined further, if necessary.

Thu, Sep 3, 7:32 PM · Restricted Project
yaxunl added a comment to D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.
In D84364#2201336, @tra wrote:

I added a Deferrable bit to the diagnostics which can be specified in td files. This can be added to individual diagnostic defs or added to a bunch of diagnostic defs all together.

This field is used to control whether a diagnostic message can be deferred.

This may be a case of "too much, but not enough". It will be unnecessary for most of the diagnostics we have. Overload resolution is likely to be the primary beneficiary, inline asm and exceptions may be two other classes, but I can't think of anything else ATM.
At the same time it may not be enough, because we also need to take into account where and when a particular diagnostic is emitted. I.e. the same diagnostic may need to be postponed when we emit it from CUDA code, yet we may want to *not* postpone it if it's in code which has nothing to do with CUDA. E.g. C++ code which has an overloading-related error in an inline function which would not be codegen'ed. I would expect such an error to be reported as it would be if the same function was compiled in plain C++ mode.

Thu, Sep 3, 2:53 PM · Restricted Project
yaxunl updated the diff for D84364: [CUDA][HIP] Defer overloading resolution diagnostics for host device functions.

Defer overload resolution diags only if there are wrong-sided candidates.

Thu, Sep 3, 2:44 PM · Restricted Project
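
As a toy model of the policy discussed in this thread (a per-diagnostic Deferrable bit, an opt-in flag, and deferral only when the overload set contains wrong-sided candidates), consider the sketch below. It is not Clang's implementation: Side, Candidate, isWrongSided and shouldDefer are hypothetical names, and real overload candidates carry far more state.

```cpp
#include <vector>

// Hypothetical classification of a function by where it may run.
enum class Side { Host, Device, HostDevice };

struct Candidate {
  Side side;
};

// A candidate is "wrong-sided" if it cannot be called from the side
// that is currently being compiled.
inline bool isWrongSided(const Candidate &c, bool compilingForDevice) {
  if (c.side == Side::HostDevice)
    return false;
  return compilingForDevice ? c.side == Side::Host : c.side == Side::Device;
}

// Defer only when the feature is enabled, the diagnostic is marked
// deferrable, and at least one candidate is wrong-sided.
inline bool shouldDefer(bool deferralEnabled, bool diagIsDeferrable,
                        const std::vector<Candidate> &candidates,
                        bool compilingForDevice) {
  if (!deferralEnabled || !diagIsDeferrable)
    return false;
  for (const Candidate &c : candidates)
    if (isWrongSided(c, compilingForDevice))
      return true;
  return false;
}
```

Under this model, an error whose candidates are all callable from the current side is reported immediately, which matches the intent of not postponing diagnostics that have nothing to do with the host/device split.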

Wed, Sep 2

yaxunl added a reverting change for rG04abbb3a7818: [HIP] Change default --gpu-max-threads-per-block value to 1024: rG62dbb7e54c65: Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024".
Wed, Sep 2, 1:13 PM
yaxunl committed rG62dbb7e54c65: Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" (authored by yaxunl).
Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024"
Wed, Sep 2, 1:13 PM
yaxunl added a reverting change for D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024: rG62dbb7e54c65: Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024".
Wed, Sep 2, 1:13 PM · Restricted Project
yaxunl added a comment to D84822: Add documentation for target ID and ClangOffloadBundlerFormat.

revised

Wed, Sep 2, 5:16 AM · Restricted Project
yaxunl updated the diff for D84822: Add documentation for target ID and ClangOffloadBundlerFormat.

revised by Tony's comments.

Wed, Sep 2, 5:16 AM · Restricted Project

Aug 26 2020

yaxunl retitled D86376: [HIP] Simplify kernel launching from [HIP] Improve kernel launching latency to [HIP] Simplify kernel launching.
Aug 26 2020, 9:29 AM
yaxunl added a comment to D86376: [HIP] Simplify kernel launching.
In D86376#2236704, @tra wrote:
Aug 26 2020, 9:29 AM

Aug 25 2020

yaxunl added a comment to D86376: [HIP] Simplify kernel launching.
In D86376#2234824, @tra wrote:

This patch appears to be somewhere in the gray area to me. My prior experience with CUDA suggests that it will make little to no difference. On the other hand, AMD GPUs may be different enough to prove me wrong. Without specific evidence, I still can't tell what's the case here.

Sorry, the overhead due to __hipPushConfigure/__hipPopConfigure is about 60 us. The typical kernel launch latency is about 500 us, therefore the improvement is around 10%.

60 *micro seconds* to store/load something from memory? It does not sound right. 0.5 millisecond per kernel launch is also suspiciously high.
For CUDA it's ~5us (https://www.hpcs.cs.tsukuba.ac.jp/icpp2019/data/posters/Poster17-abst.pdf). If it does indeed take 60 microseconds to push/pop a O(cacheline) worth of launch config data, the implementation may be doing something wrong. We're talking about O(100) syscalls and that's way too much work for something that simple. What do those calls do?

Can you confirm that the units are indeed microseconds and not nanoseconds?

My previous measurements did not include a warm-up, so they picked up one-time overhead from device initialization and loading of the device binary. With warm-up, the call of __hipPushCallConfigure/__hipPopCallConfigure takes about 19 us. Based on the trace from rocprofile, the time spent inside these functions can be ignored. Most of the time is spent making the calls. These functions live in a shared library, which may be the reason why they take such a long time. Making them always_inline may get rid of the overhead; however, that would require exposing internal data structures.

Aug 25 2020, 9:16 AM
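
The warm-up effect described above can be controlled for with a small timing harness. The following is a minimal sketch, assuming a generic callable stands in for the operation being measured; averageMicros is a hypothetical helper, not part of HIP or any profiler, and it measures host wall-clock time only.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: returns the average wall-clock cost of fn() in
// microseconds over `iters` timed calls, after `warmup` untimed calls.
template <typename Fn>
double averageMicros(Fn &&fn, std::size_t warmup, std::size_t iters) {
  // Untimed calls absorb one-time costs such as initialization and
  // lazy loading, so they do not inflate the average.
  for (std::size_t i = 0; i < warmup; ++i)
    fn();
  if (iters == 0)
    return 0.0;
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iters; ++i)
    fn();
  auto stop = std::chrono::steady_clock::now();
  std::chrono::duration<double, std::micro> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(iters);
}
```

Comparing the result with warmup set to 0 against a run with a few warm-up iterations gives a rough idea of how much of a measured latency is one-time overhead.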

Aug 24 2020

yaxunl added a comment to D86376: [HIP] Simplify kernel launching.
In D86376#2234547, @tra wrote:

I'm OK with how the patch is implemented.
I'm still on the fence regarding whether it should be implemented.

`hipPushConfiguration/hipPopConfiguration` and kernel stub can cause 40 ns overhead, whereas we have requests to squeeze any overhead in kernel launching latency.

That's about the same as 1 cache miss. I'm willing to bet that it will be lost in the noise. Are there any real world benchmarks where it makes a difference?
Are those requests driven by a specific use case? Not all requests (even well intentioned ones) are worth implementing.
This patch appears to be somewhere in the gray area to me. My prior experience with CUDA suggests that it will make little to no difference. On the other hand, AMD GPUs may be different enough to prove me wrong. Without specific evidence, I still can't tell what's the case here.

Aug 24 2020, 2:51 PM
yaxunl added a comment to D86376: [HIP] Simplify kernel launching.
In D86376#2234259, @tra wrote:

How much does this inlining buy you in practice? I.e. what's a typical launch latency before/after the patch? For CUDA, config push/pop is negligible compared to the cost of actually launching the kernel on the GPU. It is measurable if the launch is asynchronous, but queueing kernels fast does not help all that much in the long run -- you eventually have to run those kernels on the GPU, so in most cases you just spend a bit more time idling while waiting for the queued kernels to finish. To be beneficial, you'll need a finely balanced CPU/GPU workload and that's rather hard to achieve. Not to the point where the minor savings here would be meaningful. I would assume the situation on AMD GPUs is not that different.

Aug 24 2020, 12:34 PM

Aug 21 2020

yaxunl updated the summary of D86376: [HIP] Simplify kernel launching.
Aug 21 2020, 3:10 PM
yaxunl requested review of D86376: [HIP] Simplify kernel launching.
Aug 21 2020, 3:03 PM

Aug 19 2020

yaxunl requested review of D86217: rename sram-ecc as sramecc in clang.
Aug 19 2020, 8:06 AM · Restricted Project
yaxunl added inline comments to D84822: Add documentation for target ID and ClangOffloadBundlerFormat.
Aug 19 2020, 6:56 AM · Restricted Project
yaxunl updated the diff for D84822: Add documentation for target ID and ClangOffloadBundlerFormat.

revised by Tony's comments.

Aug 19 2020, 6:55 AM · Restricted Project
yaxunl added a comment to D84822: Add documentation for target ID and ClangOffloadBundlerFormat.

How does one opt out of this scheme?

OpenMP already has a convention for finding code objects in the host binary, used by amdgpu and other targets, which I don't think matches the above.

Where's the corresponding implementation? I think it's the llc part I need to read to understand whether this proposal can be used outside of HIP and amd's OpenCL.

Aug 19 2020, 6:29 AM · Restricted Project

Aug 18 2020

yaxunl committed rGa11ab6e04c19: Fix test hip-target-id.hip (authored by yaxunl).
Fix test hip-target-id.hip
Aug 18 2020, 9:42 PM
yaxunl committed rG7546b29e7616: [HIP] Support target id by --offload-arch (authored by yaxunl).
[HIP] Support target id by --offload-arch
Aug 18 2020, 8:44 PM
yaxunl closed D60620: [HIP] Support target id by --offload-arch.
Aug 18 2020, 8:44 PM · Restricted Project, Restricted Project