This is an archive of the discontinued LLVM Phabricator instance.

Differential D84519

[llvm-objdump][AMDGPU] Detect CPU string
Closed, Public

Authored by rochauha on Jul 24 2020, 5:06 AM.

Details

Reviewers

scott.linder
t-tye
espindola
jhenderson
MaskRay
echristo

Commits

rGe760e85680d6: [llvm-objdump][AMDGPU] Detect CPU string

Summary

AMDGPU ISA isn't backwards compatible and hence -mcpu must always be specified during disassembly.
However, the AMDGPU target CPU is stored in e_flags in the ELF object.

This patch allows targets to implement CPU string detection, and also implements it for AMDGPU by looking at e_flags.
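To make the mechanism in the summary concrete, here is a minimal standalone sketch of reading a CPU name out of e_flags. It is not the patch's code: `cpuNameFromEFlags` and the `k`-prefixed constants are hypothetical names, the constant values are illustrative copies that should be checked against llvm/BinaryFormat/ELF.h, and `std::optional` stands in for `llvm::Optional`.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Illustrative copies of a few ELF constants (assumed values; the
// authoritative definitions live in llvm/BinaryFormat/ELF.h).
constexpr std::uint32_t kEfAmdgpuMach = 0x0ff; // mask selecting the MACH field
constexpr std::uint32_t kMachGfx803 = 0x02a;
constexpr std::uint32_t kMachGfx900 = 0x02c;
constexpr std::uint32_t kMachGfx906 = 0x02f;

// Hypothetical helper: returns the CPU name encoded in e_flags, or nullopt
// when the MACH field holds a value this sketch does not know about.
std::optional<std::string_view> cpuNameFromEFlags(std::uint32_t EFlags) {
  // Bitwise AND isolates the MACH field; other flag bits are ignored.
  switch (EFlags & kEfAmdgpuMach) {
  case kMachGfx803:
    return "gfx803";
  case kMachGfx900:
    return "gfx900";
  case kMachGfx906:
    return "gfx906";
  default:
    return std::nullopt;
  }
}
```

The point of the sketch is only that the CPU name is a pure function of the masked flags, so a disassembler can derive it without the user passing -mcpu.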

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rochauha created this revision. Jul 24 2020, 5:06 AM

Herald added a reviewer: espindola. Jul 24 2020, 5:06 AM

Herald added a reviewer: jhenderson.

Herald added a reviewer: MaskRay.

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, kerbowa, rupprecht and 9 others.

Harbormaster failed remote builds in B65541: Diff 280413! Jul 24 2020, 7:00 AM

scott.linder added inline comments. Jul 24 2020, 8:07 AM

llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
10	Would it make sense to have tests at various optimization levels? Generally we seem to prefer writing tests at the default opt level unless the test is explicitly checking something at a certain opt level.
llvm/tools/llvm-objdump/llvm-objdump.cpp
2156	Between the lint and not knowing off-hand what `getHeader` returns, I'd probably just not use auto here. I think the full type can be as simple as `ELF64LE::Ehdr`? I would also prefer dropping `The`, unless it is there to avoid colliding with another name?
2160–2194	Can this just be `StringRef ArchName = AMDGPUTargetStreamer::getArchNameFromElfMach(e_flags & EF_AMDGPU_MACH); return ArchName.empty() ? None : ArchName`? If there is an issue with making that function visible here, I would propose just moving the implementation; it seems like a pretty general thing.
2174	I would sink this under the condition for `MCPU.empty()` if you only need it in that case.

Some changes based on review by @scott.linder.

llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
10	I don't have any strong opinion regarding this. The kernel is empty anyways. I chose `-O0` with the hope that the binary size would be slightly bigger than otherwise.
llvm/tools/llvm-objdump/llvm-objdump.cpp
2160–2194	From what I understand, TargetStreamer is used to emit binaries. But here we are going the other way round, we need to infer the details from the binary header. Is it possible to move `getArchNameFromElfMach()` to lib/Support/TargetParser.*? It would certainly make this patch cleaner. And the code would also be used across AMDGPUTargetStreamer and llvm-objdump.

Harbormaster failed remote builds in B65687: Diff 280667! Jul 25 2020, 5:04 AM

I am a bit worried that this patch adds more target-specific code to llvm-objdump. Can we step back and consider whether precise CPU selection is really necessary? For example, PowerPC needs different ISA levels, but it simply uses -mcpu=future.

If different AMDGPU CPUs are incompatible, should such detection be moved to somewhere in lib/Target/AMDGPU/?

I'd really like all of the cpu specific bits here to not reside in the objdump binary. Ideally it should be in include/ and lib/ somewhere.

This revision now requires changes to proceed. Jul 26 2020, 10:42 PM

In D84519#2174866, @MaskRay wrote:

I am a bit worried that this patch adds more target-specific code to llvm-objdump.

Agreed, I have looked at making the majority of llvm-objdump (and essentially everything under tools/) into a library at one point, but the biggest initial hurdle is the CommandLine parsing library. I think ideally all of the target-relevant parts would be in target-overridden implementations in some library, all of the llvm-objdump behavior parts (like the core symbol-to-symbol loop) would be in an llvm-objdump library which depends on the libraries with target-specific behaviors, and the tools/ directory would contain ~10 line main functions.

Can we step back and consider whether precise CPU selection is really necessary? For example, PowerPC needs different ISA levels, but it simply uses -mcpu=future.

I am a bit lost on what you mean; is there a significant cost to providing precise CPU selection? It doesn't make sense to require the user to e.g. run the dumper to inspect the binary in order to come up with the command-line to run the dumper in order to inspect the binary.

If different AMDGPU CPUs are incompatible, should such detection be moved to somewhere in lib/Target/AMDGPU/?

Yes, I think it should be done in a generic way. All other targets which don't need or want it would just get the default (existing) behavior.

-mcpu=pwr10 decodes more instructions than -mcpu=pwr9.
-mcpu=pwr9 decodes more instructions than -mcpu=pwr8.

I think GNU objdump just decodes all instructions it recognizes, not expecting a -mcpu option. It behaves as if the default recognized ISA level includes everything.

Ah, that makes sense. For AMDGCN your earlier point (and Eric's point in email, which doesn't seem to have made it here) concerning incompatibility applies: we don't have forward or backward compatibility between ISA generations. Essentially any change to mcpu (and some features, IIUC) can make two ISAs fundamentally incompatible. I think we have attempted to approximate a "default" ISA before for things like disassembly and failed.

For example, in one minor revision the interpretation of an opcode was changed, so we have to change the name in the disassembly based on the individual feature.

I'd like to point out that this patch correctly detects the CPU string by looking at the binary. The CPU string needs to be determined beforehand and passed on to the SubtargetInfo constructor.

To move the CPU string detection to lib/Target/AMDGPU, the target would also need to look at the binary. I think one would have to pass an ObjectFile * to the MCSubtargetInfo constructor. This would allow targets to do what is needed in their respective constructors. I'd like to hear thoughts and opinions on this.

rochauha retitled this revision from [llvm-objdump][AMDGPU] Detect subtarget to [llvm-objdump][AMDGPU] Detect CPU string. Aug 5 2020, 2:23 AM

Maybe lib/Target/* is not the right place for the code then? If the dependency is on Object then where would the logical place for the code to live be?

I think "anywhere but llvm-objdump.cpp" is the concern, but I don't know where the right library is for target-specific logic related to Objects.

To me, this doesn't look like 'real' target-specific work. For example, the target triple is detected within llvm-objdump itself using the getTarget() function. I'd like to understand: what is different about CPU string detection that it needs to reside somewhere else?

However, one place that I can think of is Object/ELFObjectFile.h. There are functions like getFileFormatName() and getArch(). Perhaps it might be reasonable to have a getCPUNameIfPossible() too?

This seems to be related to the push of the cpu-id (aka. target-id) into other places as well: e.g., D84822, D80750, probably more.
As I mentioned in those reviews already, this doesn't strike me as an "AMDGPU" feature at all.
We should start the discussions around missing functionality instead of adding AMDGPU workarounds in all these places...
One example is an IR module level target cpu and/or target feature, similar to target triple.

Also, the commit doesn't have a message that explains what is going on.

I agree, but can we at least fix the ICE in llvm-objdump in the short-term in terms of the reality we are currently living in? I didn't look into this case closely before commenting last, but immediately above the diff in disassembleObject there is a call to ObjectFile::getFeatures, which is a pure virtual function implemented in the various concrete object file format implementations with reference to specific targets. For example, ELFObjectFileBase::getFeatures is implemented as:

SubtargetFeatures ELFObjectFileBase::getFeatures() const {
  switch (getEMachine()) {
  case ELF::EM_MIPS:
    return getMIPSFeatures();
  case ELF::EM_ARM:
    return getARMFeatures();
  case ELF::EM_RISCV:
    return getRISCVFeatures();
  default:
    return SubtargetFeatures();
  }
}

Which is just the same code Ronak is adding here, hoisted into the Object library. Again, this is not AMDGPU specific, but somewhere it has to be implemented in terms of AMDGPU (because we define the ABI of our code objects).
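The same dispatch shape, applied to CPU names rather than features, can be sketched in standalone C++. This is an illustration under assumptions, not the landed code: `ObjectFileSketch` and `ElfObjectSketch` are hypothetical stand-ins for `ObjectFile` and `ELFObjectFileBase`, the numeric constants are illustrative copies to be checked against llvm/BinaryFormat/ELF.h, and unknown MACH values yield an empty optional here purely to keep the example total.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Illustrative copies of ELF constants (assumed values; see ELF.h).
constexpr std::uint16_t kEmArm = 40;
constexpr std::uint16_t kEmAmdgpu = 224;
constexpr std::uint32_t kEfAmdgpuMach = 0x0ff;
constexpr std::uint32_t kMachGfx900 = 0x02c;

// Generic interface: formats and targets that cannot detect a CPU simply
// inherit the default and report "nothing detected".
class ObjectFileSketch {
public:
  virtual ~ObjectFileSketch() = default;
  virtual std::optional<std::string_view> tryGetCPUName() const {
    return std::nullopt;
  }
};

// ELF implementation: dispatch on the machine type, as getFeatures() does.
class ElfObjectSketch : public ObjectFileSketch {
public:
  ElfObjectSketch(std::uint16_t EMachine, std::uint32_t EFlags)
      : Machine(EMachine), Flags(EFlags) {}

  std::optional<std::string_view> tryGetCPUName() const override {
    switch (Machine) {
    case kEmAmdgpu:
      return getAMDGPUCPUName();
    default:
      return std::nullopt;
    }
  }

private:
  // Target-specific helper; only meaningful for AMDGPU objects.
  std::optional<std::string_view> getAMDGPUCPUName() const {
    assert(Machine == kEmAmdgpu);
    switch (Flags & kEfAmdgpuMach) {
    case kMachGfx900:
      return "gfx900";
    default:
      return std::nullopt; // sketch-only choice for unknown values
    }
  }

  std::uint16_t Machine;
  std::uint32_t Flags;
};
```

A caller holding only the base interface does not need to know which target, if any, supplied the answer; that is the property the thread is asking for when it wants the logic out of llvm-objdump.cpp.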

I worry that we will just have to live with broken tools for months while we try to fix up core parts of LLVM, and all the while we could make a small, well-defined change consistent with the existing logic in the Object library to get to something useful. The "precedent" for this sort of code in Object is then just https://reviews.llvm.org/D21125. If we come along soon after and implement a vastly superior, internally-consistent model of all of these things and update Object, we will have to have updated e.g. getFeatures anyway, so there is no loss.

Pulled CPU string detection logic into ELFObjectFileBase.
Added details in commit message.

Herald added a subscriber: hiraditya. Aug 6 2020, 11:29 PM

Harbormaster completed remote builds in B67418: Diff 283818. Aug 7 2020, 12:57 AM

scott.linder added inline comments. Aug 7 2020, 2:04 PM

llvm/include/llvm/Object/ObjectFile.h
330	Can you document this, noting that it returns an empty `StringRef` when no CPU can be detected? I am also curious why you dropped the `Optional`, it seemed like a nicer interface that made the documentation of the "empty" case part of the code. I suppose in C++ there is no way to elide the extra storage for the `Optional`, but I would still think it is worth using. I would also prefer to drop `Target` from the name, as `getArch`/`getFeatures` do not mention `{Sub}Target`.
llvm/lib/Object/ELFObjectFile.cpp
369	I would prefer to just name this following the style guide, I don't think this is a case where breaking it helps readability.
llvm/tools/llvm-objdump/llvm-objdump.cpp
2174	I don't think we need to even mention "AMDGPU" here, and the bit about an empty string being returned in the "failure" case should just be in the doc-comment on `getTargetCPUName`; I would instead just indicate the call is fallible by adding "try" somewhere. All in all, I'd just replace this whole comment with `// If MCPU Isn't specified, try to detect it.`
2176	I think using the ternary here actually makes this longer/harder to read compared to an `if` (for example, you need the explicit call to `.str()` rather than getting the `std::string(StringRef)` constructor implicitly).

[Don't forget the commit message, also here in the phab review please]

rochauha added inline comments. Aug 8 2020, 12:50 AM

llvm/include/llvm/Object/ObjectFile.h
330	Since `getArch` / `getFeatures` were not using `Optional`, I thought maybe the convention should be maintained. But I do agree that `Optional` is the more self documenting interface in a way.

Changes based on review by @scott.linder

rochauha edited the summary of this revision. Aug 8 2020, 12:53 AM

rochauha edited the summary of this revision.

rochauha marked 3 inline comments as done. Aug 8 2020, 12:56 AM

rochauha added inline comments.

llvm/include/llvm/Object/ObjectFile.h
330	Changed to using `Optional`.
llvm/tools/llvm-objdump/llvm-objdump.cpp
2174	Done.

Harbormaster completed remote builds in B67585: Diff 284125. Aug 8 2020, 1:49 AM

Looking good to me! Some small nits, and it would be good to hear back on what the general consensus on adding this to Object is.

llvm/lib/Object/ELFObjectFile.cpp
368	Can you add `assert(getEMachine() == ELF::EM_AMDGPU);`?
372	I think there should be an implicit `StringRef(const char *)` constructor, so you can drop the explicit call here, same for the rest of the returns here.
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
11	shouldn't these `specify.txt` and `detect.txt` include `%t` somewhere in their path? I.e. `%t-specify.txt` and `%t-detect.txt`?
llvm/tools/llvm-objdump/llvm-objdump.cpp
2176–2177	Small nit, but I think this reads better as: if (MCPU.empty()) MCPU = Obj->tryGetCPUName().getValueOr(""); IMO the ternary operator just makes things more complicated, and you aren't exposing the intermediate type of `tryGetCPUName` anyway if you are using `auto`, so we don't lose anything.
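The `if` form suggested above can be sketched in standalone C++. The names here are assumptions for illustration: `tryGetCPUName(bool)` is a hypothetical stand-in for `Obj->tryGetCPUName()`, `resolveMCPU` is not a function in llvm-objdump, and `std::optional::value_or` plays the role of `Optional::getValueOr`.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for Obj->tryGetCPUName(): yields a name only when
// the object file is assumed to encode one.
std::optional<std::string_view> tryGetCPUName(bool Detectable) {
  if (Detectable)
    return "gfx900";
  return std::nullopt;
}

// An explicit -mcpu value always wins; detection is only a fallback, and a
// failed detection leaves MCPU empty so the existing default applies.
std::string resolveMCPU(std::string MCPU, bool Detectable) {
  if (MCPU.empty())
    MCPU = tryGetCPUName(Detectable).value_or("");
  return MCPU;
}
```

Compared with a ternary, the assignment converts the string view to `std::string` implicitly, and the intermediate optional type never needs to be named.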

jhenderson added inline comments. Aug 13 2020, 12:22 AM

llvm/include/llvm/Object/ELFObjectFile.h
91	I know it's not the way the `get*Features` methods are written, but perhaps this should be a private method?

scott.linder added inline comments. Aug 13 2020, 9:03 AM

llvm/include/llvm/Object/ELFObjectFile.h
91	I was thinking the same, but as you mention the convention here seems to be that these are public. If we think that is just an oversight maybe we can just follow up with a patch to hide all of them?

jhenderson added inline comments. Aug 14 2020, 12:09 AM

llvm/include/llvm/Object/ELFObjectFile.h
91	That would also be fine by me.

Changes based on comments by @scott.linder.

rochauha marked 4 inline comments as done. Aug 14 2020, 11:47 PM

Harbormaster completed remote builds in B68507: Diff 285817. Aug 15 2020, 12:25 AM

MaskRay added inline comments. Aug 15 2020, 5:40 PM

llvm/tools/llvm-objdump/llvm-objdump.cpp
2176	https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements I think the code is self-explaining. There is no need for the comment.

Removed comment and braces.

rochauha marked an inline comment as done. Aug 15 2020, 10:05 PM

Harbormaster completed remote builds in B68545: Diff 285879. Aug 15 2020, 10:36 PM

LGTM, and can you please open a follow-up review once this is committed to hide all of the Object::get<Target><Stuff> methods?

MaskRay added inline comments. Aug 17 2020, 10:40 AM

llvm/lib/Object/ELFObjectFile.cpp
382	Why is the comment not aligned with `case`?
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
2	In these binary utility directories, we usually place RUN/CHECK lines before the text.
6	In these binary utility directories, we use `;;` for non-RUN-non-CHECK comments. The different comment marker makes comments stand out.

rochauha added inline comments. Aug 17 2020, 10:40 PM

llvm/lib/Object/ELFObjectFile.cpp
382	This is the result after running clang-format.
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
2	Since the function is pretty small, I thought it would be helpful to have it first. Do you think the layout should be changed?
6	I looked at a few tests in the binary utility tests but didn't find this pattern. Do you suggest that this should be changed in this test?

In D84519#2221836, @scott.linder wrote:

LGTM, and can you please open a follow-up review once this is committed to hide all of the Object::get<Target><Stuff> methods?

Yes, I intend to do this.

jhenderson added inline comments. Aug 18 2020, 2:47 AM

llvm/lib/Object/ELFObjectFile.cpp
382	Sounds like a clang-format bug that should be reported? I guess the problem is distinguishing comments for the case lines from those at the end of the previous case. Still, with the blank line before I think it's pretty clear that in this case the comment applies to the case.
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
6	A lot of the older tests were written before we adopted the double-comment marker style, and haven't been updated. Newer tests should be written with the new style.

This revision was not accepted when it landed; it landed in state Needs Review. Aug 18 2020, 5:14 AM

Closed by commit rGe760e85680d6: [llvm-objdump][AMDGPU] Detect CPU string (authored by rochauha).

This revision was automatically updated to reflect the committed changes.

rochauha added a commit: rGe760e85680d6: [llvm-objdump][AMDGPU] Detect CPU string.

I have manually aligned the comment to the case statement before landing the patch. I think some investigation for clang-format may be necessary.

rochauha added a child revision: D86136: [ELF] Hide target specific methods as private. Aug 18 2020, 5:47 AM

Revision Contents

Path                                                    Size
llvm/include/llvm/Object/ELFObjectFile.h                4 lines
llvm/include/llvm/Object/ObjectFile.h                   1 line
llvm/lib/Object/ELFObjectFile.cpp                       111 lines
llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll    83 lines
llvm/tools/llvm-objdump/llvm-objdump.cpp                4 lines

Diff 286256

llvm/include/llvm/Object/ELFObjectFile.h

 public:
   SubtargetFeatures getFeatures() const override;

   SubtargetFeatures getMIPSFeatures() const;

   SubtargetFeatures getARMFeatures() const;

   SubtargetFeatures getRISCVFeatures() const;

+  Optional<StringRef> tryGetCPUName() const override;
+
+  StringRef getAMDGPUCPUName() const;
+
   void setARMSubArch(Triple &TheTriple) const override;

   virtual uint16_t getEType() const = 0;

   virtual uint16_t getEMachine() const = 0;

   std::vector<std::pair<Optional<DataRefImpl>, uint64_t>>
   getPltAddresses() const;

llvm/include/llvm/Object/ObjectFile.h

 public:

   /// The number of bytes used to represent an address in this object
   /// file format.
   virtual uint8_t getBytesInAddress() const = 0;

   virtual StringRef getFileFormatName() const = 0;
   virtual Triple::ArchType getArch() const = 0;
   virtual SubtargetFeatures getFeatures() const = 0;
+  virtual Optional<StringRef> tryGetCPUName() const { return None; };
   virtual void setARMSubArch(Triple &TheTriple) const { }
   virtual Expected<uint64_t> getStartAddress() const {
     return errorCodeToError(object_error::parse_failed);
   };

   /// Create a triple from the data in this object file.
   Triple makeTriple() const;

llvm/lib/Object/ELFObjectFile.cpp

Show First 20 Lines • Show All 349 Lines • ▼ Show 20 Lines	case ELF::EM_ARM:
return getARMFeatures();		return getARMFeatures();
case ELF::EM_RISCV:		case ELF::EM_RISCV:
return getRISCVFeatures();		return getRISCVFeatures();
default:		default:
return SubtargetFeatures();		return SubtargetFeatures();
}		}
}		}

		Optional<StringRef> ELFObjectFileBase::tryGetCPUName() const {
		switch (getEMachine()) {
		case ELF::EM_AMDGPU:
		return getAMDGPUCPUName();
		default:
		return None;
		}
		}

		StringRef ELFObjectFileBase::getAMDGPUCPUName() const {
		assert(getEMachine() == ELF::EM_AMDGPU);
		scott.linderUnsubmitted Done Reply Inline Actions Can you add `assert(getEMachine() == ELF::EM_AMDGPU)`; ? scott.linder: Can you add `assert(getEMachine() == ELF::EM_AMDGPU)`; ?
		unsigned CPU = getPlatformFlags() & ELF::EF_AMDGPU_MACH;
		scott.linderUnsubmitted Done Reply Inline Actions I would prefer to just name this following the style guide, I don't think this is a case where breaking it helps readability. scott.linder: I would prefer to just name this following the style guide, I don't think this is a case where…

		switch (CPU) {
		// Radeon HD 2000/3000 Series (R600).
		scott.linderUnsubmitted Done Reply Inline Actions I think there should be an implicit `StringRef(const char )` constructor, so you can drop the explicit call here, same for the rest of the returns here. scott.linder:* I think there should be an implicit `StringRef(const char *)` constructor, so you can drop the…
		case ELF::EF_AMDGPU_MACH_R600_R600:
		return "r600";
		case ELF::EF_AMDGPU_MACH_R600_R630:
		return "r630";
		case ELF::EF_AMDGPU_MACH_R600_RS880:
		return "rs880";
		case ELF::EF_AMDGPU_MACH_R600_RV670:
		return "rv670";

		// Radeon HD 4000 Series (R700).
		MaskRayUnsubmitted Not Done Reply Inline Actions Why is the comment not aligned with `case` ? MaskRay: Why is the comment not aligned with `case `?
		rochauhaAuthorUnsubmitted Not Done Reply Inline Actions This is the result after running clang-format. rochauha: This is the result after running clang-format.
		jhendersonUnsubmitted Not Done Reply Inline Actions Sounds like a clang-format bug that should be reported? I guess the problem is distinguishing comments for the case lines from those at the end of the previous case. Still, with the blank line before I think it's pretty clear that in this case the comment applies to the case. jhenderson: Sounds like a clang-format bug that should be reported? I guess the problem is distinguishing…
		case ELF::EF_AMDGPU_MACH_R600_RV710:
		return "rv710";
		case ELF::EF_AMDGPU_MACH_R600_RV730:
		return "rv730";
		case ELF::EF_AMDGPU_MACH_R600_RV770:
		return "rv770";

		// Radeon HD 5000 Series (Evergreen).
		case ELF::EF_AMDGPU_MACH_R600_CEDAR:
		return "cedar";
		case ELF::EF_AMDGPU_MACH_R600_CYPRESS:
		return "cypress";
		case ELF::EF_AMDGPU_MACH_R600_JUNIPER:
		return "juniper";
		case ELF::EF_AMDGPU_MACH_R600_REDWOOD:
		return "redwood";
		case ELF::EF_AMDGPU_MACH_R600_SUMO:
		return "sumo";

		// Radeon HD 6000 Series (Northern Islands).
		case ELF::EF_AMDGPU_MACH_R600_BARTS:
		return "barts";
		case ELF::EF_AMDGPU_MACH_R600_CAICOS:
		return "caicos";
		case ELF::EF_AMDGPU_MACH_R600_CAYMAN:
		return "cayman";
		case ELF::EF_AMDGPU_MACH_R600_TURKS:
		return "turks";

		// AMDGCN GFX6.
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX600:
		return "gfx600";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX601:
		return "gfx601";

		// AMDGCN GFX7.
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX700:
		return "gfx700";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX701:
		return "gfx701";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX702:
		return "gfx702";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX703:
		return "gfx703";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX704:
		return "gfx704";

		// AMDGCN GFX8.
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX801:
		return "gfx801";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX802:
		return "gfx802";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX803:
		return "gfx803";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX810:
		return "gfx810";

		// AMDGCN GFX9.
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX900:
		return "gfx900";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX902:
		return "gfx902";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX904:
		return "gfx904";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX906:
		return "gfx906";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX908:
		return "gfx908";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX909:
		return "gfx909";

		// AMDGCN GFX10.
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1010:
		return "gfx1010";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1011:
		return "gfx1011";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1012:
		return "gfx1012";
		case ELF::EF_AMDGPU_MACH_AMDGCN_GFX1030:
		return "gfx1030";

		default:
		llvm_unreachable("Unknown EF_AMDGPU_MACH value");
		}
		}

		// FIXME Encode from a tablegen description or target parser.
		void ELFObjectFileBase::setARMSubArch(Triple &TheTriple) const {
		if (TheTriple.getSubArch() != Triple::NoSubArch)
		return;

		ARMAttributeParser Attributes;
		if (Error E = getBuildAttributes(Attributes)) {
		// TODO Propagate Error.

llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll

This file was added.

				define amdgpu_kernel void @test_kernel() {
				ret void
				MaskRay (Not Done): In these binary utility directories, we usually place RUN/CHECK lines before the text.
				rochauha (Author, Done): Since the function is pretty small, I thought it would be helpful to have it first. Do you think the layout should be changed?
				}

				; Test subtarget detection. Disassembly is only supported for GFX8 and beyond.
				;
				MaskRay (Not Done): In these binary utility directories, we use `;;` for non-RUN-non-CHECK comments. The different comment marker makes comments stand out.
				rochauha (Author, Not Done): I looked at a few tests in the binary utility tests but didn't find this pattern. Do you suggest that this should be changed in this test?
				jhenderson (Not Done): A lot of the older tests were written before we adopted the double-comment marker style, and haven't been updated. Newer tests should be written with the new style.
				; ----------------------------------GFX10--------------------------------------
				;
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1030 %t.o > %t-specify.txt
				scott.linder (Not Done): Would it make sense to have tests at various optimization levels? Generally we seem to prefer writing tests at the default opt level unless the test is explicitly checking something at a certain opt level.
				rochauha (Author, Done): I don't have a strong opinion on this. The kernel is empty anyway. I chose `-O0` hoping that the binary would be slightly bigger than otherwise.
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				scott.linder (Done): Shouldn't these `specify.txt` and `detect.txt` include `%t` somewhere in their path? I.e. `%t-specify.txt` and `%t-detect.txt`?
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1012 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1012 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1011 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1011 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx1010 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt


				; ----------------------------------GFX9---------------------------------------
				;
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx909 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx909 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx908 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx908 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx906 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx906 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx904 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx904 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx902 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx902 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx900 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt


				; ----------------------------------GFX8---------------------------------------
				;
				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx810 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx810 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx803 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx802 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx802 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

				; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx801 -filetype=obj -O0 -o %t.o %s
				; RUN: llvm-objdump -D --arch-name=amdgcn --mcpu=gfx801 %t.o > %t-specify.txt
				; RUN: llvm-objdump -D %t.o > %t-detect.txt
				; RUN: diff %t-specify.txt %t-detect.txt

llvm/tools/llvm-objdump/llvm-objdump.cpp

		for (StringRef Sym : MissingDisasmSymbolSet.keys())
		reportWarning("failed to disassemble missing symbol " + Sym, FileName);
		}

		static void disassembleObject(const ObjectFile *Obj, bool InlineRelocs) {
		const Target *TheTarget = getTarget(Obj);

		// Package up features to be passed to target/subtarget
		SubtargetFeatures Features = Obj->getFeatures();
		if (!MAttrs.empty())
		scott.linder (Done): Between the lint and not knowing off-hand what `getHeader` returns, I'd probably just not use auto here. I think the full type can be as simple as `ELF64LE::Ehdr`? I would also prefer dropping `The`, unless it is there to avoid colliding with another name?
		for (unsigned I = 0; I != MAttrs.size(); ++I)
		Features.AddFeature(MAttrs[I]);

		std::unique_ptr<const MCRegisterInfo> MRI(
		TheTarget->createMCRegInfo(TripleName));
		if (!MRI)
		reportError(Obj->getFileName(),
		"no register info for target " + TripleName);

		// Set up disassembler.
		MCTargetOptions MCOptions;
		std::unique_ptr<const MCAsmInfo> AsmInfo(
		TheTarget->createMCAsmInfo(*MRI, TripleName, MCOptions));
		if (!AsmInfo)
		reportError(Obj->getFileName(),
		"no assembly info for target " + TripleName);

		if (MCPU.empty())
		scott.linder (Done): I would sink this under the condition for `MCPU.empty()` if you only need it in that case.
		scott.linder (Done): I don't think we need to even mention "AMDGPU" here, and the bit about an empty string being returned in the "failure" case should just be in the doc-comment on `getTargetCPUName`; I would instead just indicate the call is fallible by adding "try" somewhere. All in all, I'd just replace this whole comment with `// If MCPU isn't specified, try to detect it.`
		rochauha (Author, Done): Done.
		MCPU = Obj->tryGetCPUName().getValueOr("").str();

		scott.linder (Not Done): I think using the ternary here actually makes this longer/harder to read compared to an `if` (for example, you need the explicit call to `.str()` rather than getting the `std::string(StringRef)` constructor implicitly).
		MaskRay (Done): https://llvm.org/docs/CodingStandards.html#don-t-use-braces-on-simple-single-statement-bodies-of-if-else-loop-statements I think the code is self-explanatory. There is no need for the comment.
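
To make the `if` versus ternary point concrete, here is a minimal standalone sketch. It substitutes `std::optional<std::string_view>` for LLVM's `Optional<StringRef>`, and `FakeObject` and `resolveCPU` are hypothetical stand-ins for `ObjectFile` and the relevant lines of `disassembleObject`; only the shape of the control flow is meant to carry over.

```cpp
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for an object file that may or may not know its CPU.
struct FakeObject {
  std::optional<std::string_view> CPU;
  std::optional<std::string_view> tryGetCPUName() const { return CPU; }
};

// Keeps a user-supplied CPU name; otherwise tries detection and falls back
// to the empty string when detection yields nothing.
inline std::string resolveCPU(std::string MCPU, const FakeObject &Obj) {
  if (MCPU.empty())
    MCPU = std::string(Obj.tryGetCPUName().value_or(""));
  return MCPU;
}
```

With the `if` form, a user-supplied name is never touched, and the detection call only happens when it is needed.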
		std::unique_ptr<const MCSubtargetInfo> STI(
		scott.linder (Done): Small nit, but I think this reads better as `if (MCPU.empty()) MCPU = Obj->tryGetCPUName().getValueOr("");`. IMO the ternary operator just makes things more complicated, and you aren't exposing the intermediate type of `tryGetCPUName` anyway if you are using `auto`, so we don't lose anything.
		TheTarget->createMCSubtargetInfo(TripleName, MCPU, Features.getString()));
		if (!STI)
		reportError(Obj->getFileName(),
		"no subtarget info for target " + TripleName);
		std::unique_ptr<const MCInstrInfo> MII(TheTarget->createMCInstrInfo());
		if (!MII)
		reportError(Obj->getFileName(),
		"no instruction info for target " + TripleName);
		MCObjectFileInfo MOFI;
		MCContext Ctx(AsmInfo.get(), MRI.get(), &MOFI);
		// FIXME: for now initialize MCObjectFileInfo with default values
		MOFI.InitMCObjectFileInfo(Triple(TripleName), false, Ctx);

		std::unique_ptr<MCDisassembler> DisAsm(
		TheTarget->createMCDisassembler(*STI, Ctx));
		if (!DisAsm)
		reportError(Obj->getFileName(), "no disassembler for target " + TripleName);
		scott.linder (Not Done): Can this just be `StringRef ArchName = AMDGPUTargetStreamer::getArchNameFromElfMach(e_flags & EF_AMDGPU_MACH); return ArchName.empty() ? None : ArchName`? If there is an issue with making that function visible here I would propose just moving the implementation, it seems like a pretty general thing.
		rochauha (Author, Not Done): From what I understand, TargetStreamer is used to emit binaries. But here we are going the other way round: we need to infer the details from the binary header. Is it possible to move `getArchNameFromElfMach()` to lib/Support/TargetParser.? It would certainly make this patch cleaner, and the code would also be shared between AMDGPUTargetStreamer and llvm-objdump.
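
The lookup discussed in this thread can be sketched outside LLVM as follows. This is a minimal, standalone illustration, not the patch's code: `cpuNameFromEFlags`, the `sketch` namespace, and the three-entry table are hypothetical, and the numeric constants are assumed to mirror `EF_AMDGPU_MACH` and the `EF_AMDGPU_MACH_AMDGCN_*` values in `llvm/BinaryFormat/ELF.h`; check that header before relying on them. The machine field is extracted from `e_flags` with a bitwise `&`.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

namespace sketch {
// Assumed values; the authoritative definitions live in
// llvm/BinaryFormat/ELF.h.
constexpr std::uint32_t EF_AMDGPU_MACH = 0x0ff;
constexpr std::uint32_t MACH_GFX900 = 0x02c;
constexpr std::uint32_t MACH_GFX906 = 0x02f;
constexpr std::uint32_t MACH_GFX1030 = 0x036;

// Hypothetical helper: returns the CPU name encoded in e_flags, or
// std::nullopt when the machine field is not in this (partial) table.
inline std::optional<std::string_view>
cpuNameFromEFlags(std::uint32_t EFlags) {
  switch (EFlags & EF_AMDGPU_MACH) { // bitwise mask, not logical &&
  case MACH_GFX900:
    return "gfx900";
  case MACH_GFX906:
    return "gfx906";
  case MACH_GFX1030:
    return "gfx1030";
  default:
    return std::nullopt;
  }
}
} // namespace sketch
```

Returning an empty optional for an unrecognised value is a choice made for this sketch; the switch in the patch as shown above ends in `llvm_unreachable` instead.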

		// If we have an ARM object file, we need a second disassembler, because
		// ARM CPUs have two different instruction sets: ARM mode, and Thumb mode.
		// We use mapping symbols to switch between the two assemblers, where
		// appropriate.
		std::unique_ptr<MCDisassembler> SecondaryDisAsm;
		std::unique_ptr<const MCSubtargetInfo> SecondarySTI;
		if (isArmElf(Obj) && !STI->checkFeatures("+mclass")) {