This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Refactor the RISCV ISA extension info and target features to support multiple extension version
Needs ReviewPublic

Authored by zixuan-wu on Dec 17 2021, 1:02 AM.

Details

Summary

As RISC-V spec supports multiple extension version, refactor the SubtargetFeatures and RISCVISAInfo to support it.

RISCVISAInfo can parse arch string like -march with extension version, but related SubtargetFeatures don't contain version info in llvm side. For example, when enable 'm' extension with passing -mattr=+m, only a boolean flag to indicate that 'm' extension is enable. After this patch, '+m' would be default -mattr to represent default extension version. You can add new extension version by adding new SubtargetFeature like +m2p1.

It also handles arch string with version number of attribute section in obj file, including assemble and dis-assemble process.

Diff Detail

Event Timeline

zixuan-wu created this revision.Dec 17 2021, 1:02 AM
zixuan-wu requested review of this revision.Dec 17 2021, 1:02 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 17 2021, 1:02 AM
zixuan-wu edited the summary of this revision. (Show Details)Dec 17 2021, 1:38 AM

enable 'm' extension with passing mattr=+m After this patch, it would be -mattr=+m2p0.

It's not obvious to me that support for extension versions should mean or has to mean that we always explicitly specify the version. Why can't we keep supporting the option mattr=+m, which would be mapped to mattr=+m,+m2p0, or whatever the current default m version happens to be?

zixuan-wu added a comment.EditedDec 19 2021, 6:32 PM

enable 'm' extension with passing mattr=+m After this patch, it would be -mattr=+m2p0.

It's not obvious to me that support for extension versions should mean or has to mean that we always explicitly specify the version. Why can't we keep supporting the option mattr=+m, which would be mapped to mattr=+m,+m2p0, or whatever the current default m version happens to be?

OK. I think default version mattr is acceptable, and it also supports extension version functionality that we can use mattr=+m2p0. Then, testcases change would be clean and simple.

zixuan-wu edited the summary of this revision. (Show Details)

Address comments.

Do not bring back V draft 0.7. It is gone, it will never be supported again by LLVM under that name. The standard extension namespace is reserved for ratified extensions and development snapshots only, not old drafts vendors decided to ship. For those, non-standard extension names are needed.

I think this would benefit from increased test coverage, namely to show that the mattr command-line options are properly handled. Some possible ideas:

  • Tests with the correct extension versions (maybe add a test file that exercises the version for all extensions).
  • Tests that show an error message with unsupported versions.
  • A test that shows that something like mattr=+m,+m2p1 is allowed (or not).

Nit: fix the lint / no new line warnings.

zixuan-wu added a comment.EditedDec 22 2021, 6:30 PM

Do not bring back V draft 0.7. It is gone, it will never be supported again by LLVM under that name. The standard extension namespace is reserved for ratified extensions and development snapshots only, not old drafts vendors decided to ship. For those, non-standard extension names are needed.

I understood only ratified extension is accepted. Even for ratified extension, the version is going to evolve in the future. Anyway, no matter whether standard or non-standard extension, version support is needed. So this patch is for those. v0p7 is just used to go through the codepath to support multiple version and test as now there is no extension that is not the default version. I will change the test way.

craig.topper added inline comments.Dec 23 2021, 8:20 AM
llvm/include/llvm/Support/RISCVISAInfo.h
33

Use !(*this == Version)

llvm/lib/Support/RISCVISAInfo.cpp
154–156

This name was already bad and getting worse. It doesn't return a bool so shouldn't start with is. It should be called something like getExperimentalExtensionVersions.

341

Use !isExperimentalExtension(ExtName).empty()

404

!empty

481

Please fix this clang-format issue

llvm/lib/Target/RISCV/RISCVSubtarget.cpp
56 ↗(On Diff #395803)

Add a reset method?

91 ↗(On Diff #395803)

Why do we need to initialize things now but didn't before?

llvm/lib/Target/RISCV/RISCVSubtarget.h
38–58

Can we have a default constructor for RISCVExtensionVersion?

105–123

Add a helper method to RISCVExtensionVersion to check for a null version?

zixuan-wu added inline comments.Dec 24 2021, 1:48 AM
llvm/include/llvm/Support/RISCVISAInfo.h
33

Good taste.

llvm/lib/Target/RISCV/RISCVSubtarget.cpp
91 ↗(On Diff #395803)

After the constructor is workable to initialize, no need for initializeEnvironment. So remove it.

zixuan-wu updated this revision to Diff 396147.Dec 24 2021, 1:56 AM
zixuan-wu edited the summary of this revision. (Show Details)

Address all comments.

zixuan-wu updated this revision to Diff 396148.Dec 24 2021, 1:59 AM
jrtc27 added a comment.Jan 5 2022, 8:29 PM

I'm unconvinced about landing something like this until there's an actual use case in the tree. How do we know this will actually work the way we want it to if there's nothing proving it? It's still unclear to me how exactly this is going to be represented in the target features, but also with RISC-V extensions not being changed once ratified any more (changes mean new extensions entirely, not new versions) I don't know whether this is actually needed or we can just deal with the couple of existing cases more simply.

llvm/lib/Support/RISCVISAInfo.cpp
48

Don't do this

but also with RISC-V extensions not being changed once ratified any more (changes mean new extensions entirely, not new versions)

I don't think so. Or why is there version in RISC-V spec?
And not only for standard extension, but also it's needed in custom extension.

BTW, it's been supported to parse version of -march in clang side.

zixuan-wu added inline comments.Jan 5 2022, 9:45 PM
llvm/lib/Support/RISCVISAInfo.cpp
48

This nit will be removed before commit

jrtc27 added a comment.EditedJan 5 2022, 9:54 PM

but also with RISC-V extensions not being changed once ratified any more (changes mean new extensions entirely, not new versions)

I don't think so. Or why is there version in RISC-V spec?

That was added years ago before there was any plan/policy for ratifying new extensions beyond the initial GC set. https://docs.google.com/presentation/d/1nQ5uFb39KA6gvUi5SReWfIQSiRN7hp6z7ZPfctE4mKk/edit#slide=id.p1 is the current lifecycle; note that only errata can be fixed after ratification, everything else requires a new extension (see the blue arrow and brown box below it). As far as I can tell, for new extensions it serves no purpose other than to distinguish draft specs from ratified ones.

And not only for standard extension, but also it's needed in custom extension.

Vendor extensions are going to be enough of a support pain in Clang. I sincerely hope they don't make life worse by defining multiple versions, rather than doing it properly and defining new extensions every time they add things. It's not just for the toolchain's benefit; it also improves forwards compatibility for kernels/loaders, as they won't know about new versions, but may know about existing versions, so if they support "Xvendorbase" then software that wants to take advantage of "Xvendorbase" and "Xvendornew" can still use "Xvendorbase", but if the kernel/loader only supports "Xvendor1p0" and software wants to use "Xvendor2p0" then it can't do anything, it'd need to be more careful and also have an "Xvendor1p0" implementation. Extending rather than redefining comes with real benefits.

BTW, it's been supported to parse version of -march in clang side.

It parses and checks the version, but it only allows the version of the extension Clang currently implements. Parsing the version is a hard requirement since it can be part of a valid arch string. Supporting multiple versions is not.

jrtc27 added inline comments.Jan 5 2022, 9:55 PM
llvm/lib/Support/RISCVISAInfo.cpp
61

This one too?..

zixuan-wu added a comment.EditedJan 5 2022, 10:33 PM

but also with RISC-V extensions not being changed once ratified any more (changes mean new extensions entirely, not new versions)

I don't think so. Or why is there version in RISC-V spec?

That was added years ago before there was any plan/policy for ratifying new extensions beyond the initial GC set. https://docs.google.com/presentation/d/1nQ5uFb39KA6gvUi5SReWfIQSiRN7hp6z7ZPfctE4mKk/edit#slide=id.p1 is the current lifecycle; note that only errata can be fixed after ratification, everything else requires a new extension (see the blue arrow and brown box below it). As far as I can tell, for new extensions it serves no purpose other than to distinguish draft specs from ratified ones.

And not only for standard extension, but also it's needed in custom extension.

Vendor extensions are going to be enough of a support pain in Clang. I sincerely hope they don't make life worse by defining multiple versions, rather than doing it properly and defining new extensions every time they add things. It's not just for the toolchain's benefit; it also improves forwards compatibility for kernels/loaders, as they won't know about new versions, but may know about existing versions, so if they support "Xvendorbase" then software that wants to take advantage of "Xvendorbase" and "Xvendornew" can still use "Xvendorbase", but if the kernel/loader only supports "Xvendor1p0" and software wants to use "Xvendor2p0" then it can't do anything, it'd need to be more careful and also have an "Xvendor1p0" implementation. Extending rather than redefining comes with real benefits.

BTW, it's been supported to parse version of -march in clang side.

It parses and checks the version, but it only allows the version of the extension Clang currently implements. Parsing the version is a hard requirement since it can be part of a valid arch string. Supporting multiple versions is not.

If I don't understand wrong, all you want to say is that the extension version is just for indication, and not for functionality? So the RV spec does not require compiler to support multi-version.

Anybody else has more comments about support multi-version extension? Or it has been decided already by RISC-V foundation?

khchen added a subscriber: khchen.Jan 24 2022, 8:11 AM

I think this would benefit from increased test coverage, namely to show that the mattr command-line options are properly handled. Some possible ideas:

  • Tests with the correct extension versions (maybe add a test file that exercises the version for all extensions).
  • Tests that show an error message with unsupported versions.
  • A test that shows that something like mattr=+m,+m2p1 is allowed (or not).

Nit: fix the lint / no new line warnings.

I agree with @luismarques, we need increased test coverage.

BTW, I think this patch need to work with D113237 to show the ability on supporting multiple extension version.

Anybody else has more comments about support multi-version extension? Or it has been decided already by RISC-V foundation?

IMPO, supporting multi version for ratified extensions in compiler does make sense to me.
GCC already did it, but I'm not sure is there any related discussion before when gcc decided to support multi version extension (or misa-spec).

clang/test/Driver/riscv-arch-version.c
7

I'm thinking do we also need to have -target-feature +m2p0 here since we are going to support multiple extension version?

llvm/lib/Target/RISCV/RISCV.td
14

nit: It will be good to have a comment to talk about what's default version come from?
I guess the default is -misa-spec=2.2, right?

llvm/test/CodeGen/RISCV/attributes-version.ll
3

For example, we need to have test for llc -mattr=+m,+m2p0, and invalid test for mattr=+m,+m1p9