Page MenuHomePhabricator

[RISCV] Unify the arch string parsing logic to RISCVISAInfo.
ClosedPublic

Authored by kito-cheng on Jun 29 2021, 8:26 PM.

Details

Summary

How many place you need to modify when implementing a new extension for RISC-V?

At least 7 places as I know:

  • Add new SubtargetFeature at RISCV.td
  • -march parser in RISCV.cpp
  • RISCVTargetInfo::initFeatureMap@RISCV.cpp for handling feature vector.
  • RISCVTargetInfo::getTargetDefines@RISCV.cpp for pre-define marco.
  • Arch string parser for ELF attribute in RISCVAsmParser.cpp
  • ELF attribute emittion in RISCVAsmParser.cpp, and make sure it's in canonical order...
  • ELF attribute emittion in RISCVTargetStreamer.cpp, and again, must in canonical order...

And now, this patch provide an unified infrastructure for handling (almost)
everything of RISC-V arch string.

After this patch, you only need to update 2 places for implement an extension
for RISC-V:

  • Add new SubtargetFeature at RISCV.td, hmmm, it's hard to avoid.
  • Add new entry to SupportedExtensionInfos@RISCVISAInfo.cpp.

Most codes are come from existing -march parser, but with few new feature/bug
fixes:

  • Accept version for -march, e.g. -march=rv32i2p0.
  • Reject version info with p but without minor version number like rv32i2p.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kito-cheng edited the summary of this revision. (Show Details)Aug 18 2021, 8:06 PM
asb added a comment.Aug 19 2021, 5:42 AM

I'm getting a build error (building with clang 12.0.1):

FAILED: lib/Support/CMakeFiles/LLVMSupport.dir/RISCVISAInfo.cpp.o 
/usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/asb/llvm-project/build/default/lib/Support -I/home/asb/llvm-project/llvm/lib/Support -I/home/asb/llvm-project/build/default/include -I/home/asb/llvm-project/llvm/include -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -Werror=global-constructors -g -fPIC -gsplit-dwarf -std=c++14  -fno-exceptions -fno-rtti -MD -MT lib/Support/CMakeFiles/LLVMSupport.dir/RISCVISAInfo.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/RISCVISAInfo.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/RISCVISAInfo.cpp.o -c /home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp
/home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp:399:10: warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]
  return std::move(ISAInfo);
         ^
/home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp:399:10: note: remove std::move call here
  return std::move(ISAInfo);
         ^~~~~~~~~~       ~
/home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp:478:12: error: call to deleted constructor of 'llvm::Error'
    return E;
           ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:186:3: note: 'Error' has been explicitly marked deleted here
  Error(const Error &Other) = delete;
  ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:493:18: note: passing argument to parameter 'Err' here
  Expected(Error Err)
                 ^
/home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp:520:14: error: call to deleted constructor of 'llvm::Error'
      return E;
             ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:186:3: note: 'Error' has been explicitly marked deleted here
  Error(const Error &Other) = delete;
  ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:493:18: note: passing argument to parameter 'Err' here
  Expected(Error Err)
                 ^
/home/asb/llvm-project/llvm/lib/Support/RISCVISAInfo.cpp:642:14: error: call to deleted constructor of 'llvm::Error'
      return E;
             ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:186:3: note: 'Error' has been explicitly marked deleted here
  Error(const Error &Other) = delete;
  ^
/home/asb/llvm-project/llvm/include/llvm/Support/Error.h:493:18: note: passing argument to parameter 'Err' here
  Expected(Error Err)
                 ^
1 warning and 3 errors generated.
[3/3711] Building CXX object tools/llvm-config/CMakeFiles/llvm-config.dir/llvm-config.cpp.o
ninja: build stopped: subcommand failed.

Changes:

  • Fix build warning and build error with clang

@asb Thanks for report that, there is no warning and build error when I build with GCC 7 (which is default compiler in Ubuntu 18.04), but I can reproduce that with clang 13, seems like I should switch my default compiler :P

eopXD added a subscriber: eopXD.Aug 25 2021, 1:57 AM
eopXD added inline comments.
llvm/lib/Support/RISCVISAInfo.cpp
269

For naming consistency with parseArchString, maybe rename MArch to Arch?

Changes:

  • Remove unused argument MArch for getExtensionVersion.
kito-cheng marked an inline comment as done.Aug 29 2021, 8:30 PM
kito-cheng added inline comments.
llvm/lib/Support/RISCVISAInfo.cpp
269

It turns out the argument isn't used anywhere...so I remove that.

Jim added inline comments.Aug 30 2021, 8:49 PM
llvm/lib/Support/RISCVISAInfo.cpp
388

Does this need to check it is invalid if XLen is 64?

395

Why not get the version of i from SupportedExtensions?

kito-cheng marked an inline comment as done.

Changes:

  • Check feature combination is valid.
  • Change return type of parseFeatures to llvm::Expected<std::unique_ptr<RISCVISAInfo>>.
  • Set default extension version in getExtensionVersion.
kito-cheng marked 2 inline comments as done.Sep 2 2021, 12:32 AM
kito-cheng added inline comments.
llvm/lib/Support/RISCVISAInfo.cpp
388

Add check, and parseFeatures might return Error now.

395

Updated :)

Jim added inline comments.Sep 2 2021, 10:11 PM
llvm/lib/Support/RISCVISAInfo.cpp
417

Could this checking put before ISAInfo->addExtension...

kito-cheng marked 2 inline comments as done.

Changes:

  • Address Jim's comment.
kito-cheng marked an inline comment as done.Sep 8 2021, 10:59 PM

I was trying to put this patch through its paces but it no longer applies. Can you please rebase it? It seems this is nearly there.

Seems like conflict with D108187, will update after testing :)

Changes:

  • Rebase to main.

Changes:

  • Rebase to main
  • Remove b and zbproposedc.
craig.topper added inline comments.Oct 11 2021, 11:39 AM
clang/lib/Driver/ToolChains/Arch/RISCV.cpp
213–215

Please add an inline comment for what "true" represents.

216

Should we have an llvm_unreachable here for unknown values of XLen?

225

Drop else after return

227

here too

232

Here too

llvm/include/llvm/Support/RISCVISAInfo.h
44

I'm not sure the comparison to StringMap is helpful here. StringMap is an unordered map that stores strings in the same allocation as the value. This std::map is ordered and doesn't store the strings in the same allocation.

llvm/lib/Support/RISCVISAInfo.cpp
287

Do MajorStr and MinorStr need to be std::strings or can they be StringRefs? Looks like they are either empty or a subset of In. So there shouldn't be a lifetime issue?

353

Should this message say which extension the error applies to?

389

Use

auto ISAInfo = std::make_unique<RISCVISAInfo>()
391

Can we make XLen part of the RISCVISAInfo constructor? Seems like we know that early enough.

441

Can we delay constructing this until after the first 2 error checks? Then we could pass XLen to the constructor.

710

Would this better as

Arch << "rv" << XLen;

That makes it future proof to a hypothetical rv128 someday.

kito-cheng marked 11 inline comments as done.

Changes:

  • Address @craig.topper's comment
  • Add XLen to constructor of RISCVISAInfo.
llvm/lib/Support/RISCVISAInfo.cpp
389

std::make_unique<RISCVISAInfo>() require a public constructor, but I would prefer keep that in private.

craig.topper accepted this revision.Oct 13 2021, 9:52 AM

LGTM other than that comment about removing 'b' from AllStdExts.

llvm/lib/Support/RISCVISAInfo.cpp
40

'b' shouldn't be in this list anymore?

389

I agree with keeping the constructor private. I didn't realize make_unique had that restriction.

This revision is now accepted and ready to land.Oct 13 2021, 9:52 AM
khchen added inline comments.Oct 14 2021, 9:22 AM
llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
2095–2096

maybe we could have a NFC patch later to add

// Emit the arch string if needed.
if (!IsIntegerValue)
  getTargetStreamer().emitTextAttribute(Tag, ISAInfo->toString());

here to reuse ISAInfo, and remove 2098~2117 code.

kito-cheng added inline comments.Oct 17 2021, 1:05 AM
llvm/lib/Support/RISCVISAInfo.cpp
40

I would prefer to keep that for sync with ISA manual, like other unsupported extension like J, T and `N, we remove B from SupportedExtensions, so keep this should be harmless.

llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
2095–2096

Good point, thanks :)
Let improve that by following patch.

This revision was landed with ongoing or failed builds.Oct 17 2021, 1:26 AM
This revision was automatically updated to reflect the committed changes.

Committed with one minor update for version of zba/zbb/zbc/zbs.

kito-cheng added inline comments.Oct 17 2021, 7:21 AM
llvm/lib/Support/RISCVISAInfo.cpp
40

Oh, and it also used for compute the canonical order for z* extensions.

akuegel added inline comments.
llvm/include/llvm/Support/RISCVISAInfo.h
15

Would it be possible to avoid the usage of ArgList in a target that belongs to Support?
This introduces a dependency cycle, since Option already depends on Support.

This broke the modules build:

While building module 'LLVM_Utils' imported from lvm/lib/Support/TargetParser.cpp:14:
In file included from <module-includes>:209:
llvm/include/llvm/Support/RISCVISAInfo.h:15:10: fatal error: could not build module 'LLVM_Option'
#include "llvm/Option/ArgList.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
llvm/lib/Support/TargetParser.cpp:14:10: fatal error: could not build module 'LLVM_Utils'
#include "llvm/Support/TargetParser.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Support headers can't include Option headers (as Option depends on Support already, so that would be a cyclic dependency). I believe folks here are in the US time zone, so I went ahead and replaced the header include with a forward decl to unbreak the bots (see de4d2f80b75e2a1e4b0ac5c25e20f20839633688 )

FWIW, I think there is a better place than Support for this new class. Support is a dependency of nearly every single LLVM component, but this class seems to be used by a handful of files. Could we maybe move this to some other/new part of LLVM (and make the few components that need it depend on that)?

This broke the modules build:

While building module 'LLVM_Utils' imported from lvm/lib/Support/TargetParser.cpp:14:
In file included from <module-includes>:209:
llvm/include/llvm/Support/RISCVISAInfo.h:15:10: fatal error: could not build module 'LLVM_Option'
#include "llvm/Option/ArgList.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
llvm/lib/Support/TargetParser.cpp:14:10: fatal error: could not build module 'LLVM_Utils'
#include "llvm/Support/TargetParser.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Support headers can't include Option headers (as Option depends on Support already, so that would be a cyclic dependency). I believe folks here are in the US time zone, so I went ahead and replaced the header include with a forward decl to unbreak the bots (see de4d2f80b75e2a1e4b0ac5c25e20f20839633688 )

FWIW, I think there is a better place than Support for this new class. Support is a dependency of nearly every single LLVM component, but this class seems to be used by a handful of files. Could we maybe move this to some other/new part of LLVM (and make the few components that need it depend on that)?

Thanks for the header fix. I think we also need to fix the library circular dependency that still exists. The Support library should not use anything from the Option library. Maybe we can partition the function some way that moves the MakeArgStr back into the clang driver code.

I don't know if we have a better library for this new class. I think Support is the only common library between the clang driver and LLVM's MC layer. I supposed we could create a new library, but that might be a hard sell for one file.

This broke the modules build:

While building module 'LLVM_Utils' imported from lvm/lib/Support/TargetParser.cpp:14:
In file included from <module-includes>:209:
llvm/include/llvm/Support/RISCVISAInfo.h:15:10: fatal error: could not build module 'LLVM_Option'
#include "llvm/Option/ArgList.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
llvm/lib/Support/TargetParser.cpp:14:10: fatal error: could not build module 'LLVM_Utils'
#include "llvm/Support/TargetParser.h"
 ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Support headers can't include Option headers (as Option depends on Support already, so that would be a cyclic dependency). I believe folks here are in the US time zone, so I went ahead and replaced the header include with a forward decl to unbreak the bots (see de4d2f80b75e2a1e4b0ac5c25e20f20839633688 )

FWIW, I think there is a better place than Support for this new class. Support is a dependency of nearly every single LLVM component, but this class seems to be used by a handful of files. Could we maybe move this to some other/new part of LLVM (and make the few components that need it depend on that)?

Thanks for the header fix. I think we also need to fix the library circular dependency that still exists. The Support library should not use anything from the Option library. Maybe we can partition the function some way that moves the MakeArgStr back into the clang driver code.

I don't know if we have a better library for this new class. I think Support is the only common library between the clang driver and LLVM's MC layer. I supposed we could create a new library, but that might be a hard sell for one file.

IIRC there are some other target-specific classes in Support for the same reason. Maybe we can make something such as TargetSupport or SharedTarget?

jrtc27 added a comment.EditedOct 18 2021, 12:49 PM

Two options come to mind if we really need to be outputting a StringRef list:

  1. (the simpler option) Pass in a Twine -> const char * lambda that the caller hooks up to Args.MakeArgString
  2. (probably the nicer option) Invent our own MakeArgString that allocates from storage owned by RISCVISAInfo itself; lots of ways you can do that

Adding a new library just so we can use MakeArgString seems overkill, even if there are other things that perhaps belong in something a bit less shared than Support.

Two options come to mind if we really need to be outputting a StringRef list:

  1. (the simpler option) Pass in a Twine -> const char * lambda that the caller hooks up to Args.MakeArgString

Good idea. I'll post a patch for that shortly.

  1. (probably the nicer option) Invent our own MakeArgString that allocates from storage owned by RISCVISAInfo itself; lots of ways you can do that

I don't think the RISCVISAInfo objects lives long enough for that. It only lives for the duration of the function that calls toFeatures, but the Features vector is returned from that function.

Adding a new library just so we can use MakeArgString seems overkill, even if there are other things that perhaps belong in something a bit less shared than Support.

jrtc27 added a comment.EditedOct 18 2021, 1:18 PM

Two options come to mind if we really need to be outputting a StringRef list:

  1. (the simpler option) Pass in a Twine -> const char * lambda that the caller hooks up to Args.MakeArgString

Good idea. I'll post a patch for that shortly.

  1. (probably the nicer option) Invent our own MakeArgString that allocates from storage owned by RISCVISAInfo itself; lots of ways you can do that

I don't think the RISCVISAInfo objects lives long enough for that. It only lives for the duration of the function that calls toFeatures, but the Features vector is returned from that function.

Hm, I see. We could also just have a global cache of heap-allocated copies of these strings, I guess; after all it's not all that different from the tables of strings we already have, just lazily built up at run time (can even store them in a lazily-initialised field of each RISCVExtensionInfo). Or we just stick to the slightly-ugly lambda approach, unless anyone else has bright ideas for a cleaner interface and implementation.

Two options come to mind if we really need to be outputting a StringRef list:

  1. (the simpler option) Pass in a Twine -> const char * lambda that the caller hooks up to Args.MakeArgString

Good idea. I'll post a patch for that shortly.

  1. (probably the nicer option) Invent our own MakeArgString that allocates from storage owned by RISCVISAInfo itself; lots of ways you can do that

I don't think the RISCVISAInfo objects lives long enough for that. It only lives for the duration of the function that calls toFeatures, but the Features vector is returned from that function.

Hm, I see. We could also just have a global cache of heap-allocated copies of these strings, I guess; after all it's not all that different from the tables of strings we already have, just lazily built up at run time (can even store them in a lazily-initialised field of each RISCVExtensionInfo). Or we just stick to the slightly-ugly lambda approach, unless anyone else has bright ideas for a cleaner interface and implementation.

Lambda approach here https://reviews.llvm.org/D112032

zixuan-wu added inline comments.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.cpp
107

If SubtargetFeature is enum value like ARM.td does, Feature.Value would be a uninitialization value, so I think there should be if condition like Feature.Value < RISCV::NumSubtargetFeatures, or there is a out-of-range visit.

def ProcXXX :
  SubtargetFeature<"", "RISCVProcFamily",
                   "XXX", "XXX processors">;

Hi, @kito-cheng as your this patch unify the extension handing in one same place by new infra, are you going to support the multiple extension version in next step?