This is an archive of the discontinued LLVM Phabricator instance.


Differential D89071

[SystemZ/z/OS] Add GOFFObjectFile class and details of GOFF file format
Abandoned, Public

Authored by yusra.syeda on Oct 8 2020, 1:56 PM.


Details

Reviewers

Kai
uweigand
hubert.reinterpretcast
jhenderson
MaskRay
kpn

Summary

This patch details the GOFF file format and implements the GOFFObjectFile class.
This patch uses https://reviews.llvm.org/D88741

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

This updates the function createGOFFObjectFile to use Error instead of std::error_code as described in https://llvm.org/docs/ProgrammersManual.html#fallible-constructors

yusra.syeda marked an inline comment as done.Oct 28 2020, 12:45 PM

Harbormaster completed remote builds in B76797: Diff 301393.Oct 28 2020, 3:27 PM

This update cleans up GOFF.h by replacing various set/get functions with templated functions.
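
For readers without the diff at hand, a templated accessor of this kind could look roughly like the sketch below. This is not the code from the patch: `readBE` and its signature are hypothetical, and the sketch assumes multi-byte fields are stored big-endian.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper, not taken from the patch: read an unsigned integer
// field of type T stored big-endian (an assumption) at ByteIndex in a record.
template <typename T>
T readBE(const uint8_t *Record, size_t ByteIndex) {
  static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer");
  T Value = 0;
  // Accumulate one byte at a time, most significant byte first.
  for (size_t I = 0; I < sizeof(T); ++I)
    Value = static_cast<T>((Value << 8) | Record[ByteIndex + I]);
  return Value;
}
```

One template replaces a family of per-width getters, and the field width is visible at the call site, e.g. `readBE<uint16_t>(Record, 3)`.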

yusra.syeda marked 2 inline comments as done.Oct 30 2020, 2:26 PM

yusra.syeda added inline comments.

llvm/lib/Object/GOFFObjectFile.cpp
50–51	The DataExtractor class doesn't seem to be helpful. Its best use is when the data is read sequentially, which is not the case with GOFF.

yusra.syeda marked an inline comment as done.Oct 30 2020, 2:26 PM

Harbormaster completed remote builds in B77096: Diff 302016.Oct 30 2020, 3:04 PM

jhenderson added inline comments.Nov 2 2020, 12:48 AM

llvm/lib/Object/GOFFObjectFile.cpp
50–51	You can use `DataExtractor` with offsets, rather than a `Cursor`, if the read is jumping around.

kpn added a subscriber: kpn.Nov 2 2020, 6:35 AM

kpn added inline comments.

llvm/lib/Object/GOFFObjectFile.cpp
50–51	Does this mean the ability to read RECFM=VB GOFF datasets is explicitly being designed to be impossible?

jhenderson added inline comments.Nov 3 2020, 12:42 AM

llvm/lib/Object/GOFFObjectFile.cpp
50–51	I'm not sure if this comment is being directed at me or @yusra.syeda. If at me, I don't really understand the question as I don't know the file format.

This update cleans up reinterpret_cast<> statements in GOFFObjectFile related to EsdPtrs, TextPtrs, and RldPtrs

Harbormaster completed remote builds in B77428: Diff 302624.Nov 3 2020, 11:04 AM

yusra.syeda added inline comments.Nov 3 2020, 1:12 PM

llvm/lib/Object/GOFFObjectFile.cpp
50–51	The first goal is to get the compiler running in USS, which only supports fixed block length of 80.

@jhenderson the comments you left have been addressed. Are there any other suggestions you have?

Sorry for the delay. This isn't high on my priorities, and I don't have the knowledge to properly review the file format details.

You will need some form of testing, possibly unit testing before this is ready to be put in. Also, please make sure to run clang-format on all new code, so that it conforms to LLVM style guidelines.

llvm/include/llvm/BinaryFormat/GOFF.h
6–7	This doesn't look to me to be the current license header. Please update to match the current version.
41	If this is supposed to be a byte, shouldn't it just be `constexpr uint8_t PTVPrefix = 0x03`?
llvm/include/llvm/Object/GOFF.h
4	This file is also using an outdated license.
106
164
llvm/include/llvm/Object/GOFFObjectFile.h
4	Update license.
35	Delete redundant blank line.
52	Isn't all of this "GOFF specific"? The class is `GOFFObjectFile`...
54	Why is this returning a `std::error_code` not an `Error`?
llvm/lib/Object/GOFFObjectFile.cpp
48	As stated before, avoid using `error_code` if possible. That includes functions like `errorCodeToError` which just convert from it. Instead, prefer functions like `createStringError`, which allow you to give more contextual error information to the user.
83
87–88
117–118
132
137	This sounds like it should be an error?
153	These sorts of comments don't add anything, in my opinion. Just delete them (the function names describe things sufficiently).
158	Add a blank line between functions.
162	Why convert between the two when you could just use `uint8_t` throughout?
165
166	What limits this to being specifically `uint16_t` in size?
168
171
181–182	What happens if the data is truncated?
263	Do you mean specifically " " means local? What about "", " " (i.e. 0, 2 spaces) etc?
283–288	Could this assertion fire if somebody wrote garbage in their object file's symbol type field? If so, it should be an error, not an assertion. (Use Error/Expected for malformed input and assertions for coder errors within LLVM).
297–300	Same question as above - should this be an error in case of malformed input?
339	Is this really unreachable? What happens if there is no symbol section in the file?
357	Blank line between functions. Same goes below.
377
380–381	Use range-based for loop here and below.
388
397	Same comment as earlier. Why not stick to `uint8_t` everywhere?
398
405
417	Add blank line between functions here and below.
453

OK, I'll bite. I do know GOFF, having implemented it in a shipping, commercial compiler before. Give me some time to take a closer look.

llvm/lib/Object/GOFFObjectFile.cpp
166	I believe the maximum record length even with continuations is 32KB. I don't know if saving two bytes of stack is worth making people reading the code do a double take, though. A check to make sure this 32KB limit is not exceeded is needed.
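
A minimal sketch of the kind of check being asked for, assuming the 32KB figure quoted above is the right limit. `totalLength` and `MaxTotalLength` are hypothetical names, not identifiers from the patch.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Limit taken from the review comment (32KB); treat it as an assumption.
constexpr uint32_t MaxTotalLength = 32 * 1024;

// Hypothetical helper: sum the data lengths of a record and its
// continuations, returning std::nullopt if the total exceeds the limit.
std::optional<uint16_t> totalLength(const std::vector<uint16_t> &PartLengths) {
  uint32_t Total = 0; // Wider than the parts, so the sum cannot wrap.
  for (uint16_t Len : PartLengths) {
    Total += Len;
    if (Total > MaxTotalLength)
      return std::nullopt;
  }
  return static_cast<uint16_t>(Total);
}
```

Accumulating in a wider type and narrowing only after the check keeps a `uint16_t` result safe without relying on wraparound.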

kpn added inline comments.Nov 11 2020, 9:38 AM

llvm/lib/Object/GOFFObjectFile.cpp
50–51	Hmm, @yusra.syeda, your response is mostly correct, but not entirely. It's true that the Unix-style filesystem (IBM calls it the "Hierarchical File System", with the first implementation of it being called the "Hierarchical File System" and the second "z/FS") has no record support because Unix doesn't support records in files. Thus the 80-byte requirement on GOFF record sizes. This is true. But there's no requirement that a program started under USS only access the Unix-style filesystem. There's no requirement that a program started under TSO or in batch only access traditional MVS datasets. Indeed, JCL even has support for Unix paths in DD statements, and TSO probably does as well (but I don't have my book handy). So the compiler "running in USS" does _not_ mean that we are restricted to 80-byte GOFF records. Granted, disambiguation of Unix paths and MVS dataset names is a problem, but still. I understand why you would want to leave variable sized records for implementation later. Not implementing support for MVS datasets up front is one thing. Designing your code to make it difficult if not impossible to add later is quite different. For example, random access to variable size record datasets is painful. Are you at least looking ahead to adding RECFM=V support later?

Address some formatting comments

Harbormaster completed remote builds in B78679: Diff 304950.Nov 12 2020, 1:23 PM

yusra.syeda marked 25 inline comments as done.Nov 12 2020, 1:23 PM

jhenderson added inline comments.Nov 12 2020, 11:59 PM

llvm/lib/Object/GOFFObjectFile.cpp
87	(reminder - comments must end in a full stop)
342	When I said the following to `SymbolRef` above: "These sorts of comments don't add anything, in my opinion. Just delete them (the function names describe things sufficiently)." That wasn't referring to just the one comment. Please delete all comments of this sort.
398	Please address both edits suggested in the previous comment, not just the second one.
405	Ditto.

yusra.syeda added inline comments.Nov 13 2020, 9:38 AM

llvm/lib/Object/GOFFObjectFile.cpp
50–51	@kpn we don't have plans on adding support for variable length GOFF records. The XL compiler supports only 80 byte records and we don't plan to add support further than what exists in the XL compiler.

Address formatting comments

yusra.syeda marked 7 inline comments as done.Nov 13 2020, 1:29 PM

Harbormaster completed remote builds in B78812: Diff 305249.Nov 13 2020, 1:30 PM

kpn added inline comments.Nov 13 2020, 2:35 PM

llvm/include/llvm/BinaryFormat/GOFF.h
17–18	I can't find any document with Google that describes a GOFF-specific TIS, and Google also has trouble with "GOFF64" and "GOFF-64". Those version numbers I guess refer to that missing document. Can you instead reference the exact book from IBM's "z/OS Internet Library" that describes GOFF? Include the full name, the edition number, and the year please. Then it's easy to look up the GOFF spec.
llvm/include/llvm/BinaryFormat/GOFFAda.def
10	Please expand "ADA" here at the top. This appears to be for "Extended Attributes"?
llvm/include/llvm/Object/GOFF.h
138	Missing module properties field.
157	I don't see any support for "Text Encoding". These fields are at bytes 16-21. Maybe a comment if you don't plan on ever implementing it?
224	Is this right? Bit 41.3 in "Program Management" is the "Removable Class Flag", which matches the code above this. Bit 41.4-6 are marked reserved, and bit 41.7 is unnamed but if set means "Reserve 16 bytes at beginning of class. MRG class ED records only." So it looks like the code is writing to the wrong part of byte 41. The code that reads from that byte also appears incorrect. Shouldn't it be (41, 4, 3)?
228	This is the "Associated data ID". It's confusing having both "Ada" and "ADA", but I don't think they're related?
267	No COMMON flag?
488	I assume RLD continuation records and relocation compression are coming later.
531	For completeness this is good. But please don't ever use it. When the Binder abends I can't tell you how useful it is to use the Unix "dd" command to slice up GOFF until the abend goes away. That's how I've had to shoot down a number of bugs in emitting GOFF. But that technique doesn't work if the END card's record count field is used.
llvm/lib/Object/GOFFObjectFile.cpp
263	It's " " that's special. My employer's compiler uses the same symbol name because the Binder translates it into a private symbol name that uses characters. In this way multiple private symbols can be disambiguated in a listing after a link.
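
To make the "(41, 4, 3)" notation in the comment on line 224 concrete, here is a small sketch. It assumes the triple means (byte index, starting bit, bit count) and that bit 0 is the most significant bit of the byte, as in the "41.3" style references above. `getBits` is a hypothetical helper, not the accessor from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extract Length bits from Record[ByteIndex], starting
// at BitOffset, where bit 0 is the most significant bit (an assumption).
inline uint8_t getBits(const uint8_t *Record, size_t ByteIndex,
                       unsigned BitOffset, unsigned Length) {
  assert(Length >= 1 && BitOffset + Length <= 8 && "field must fit in a byte");
  unsigned Shift = 8 - BitOffset - Length; // Distance from the low end.
  unsigned Mask = (1u << Length) - 1;
  return static_cast<uint8_t>((Record[ByteIndex] >> Shift) & Mask);
}
```

Under these assumptions, `getBits(Record, 41, 3, 1)` reads bit 41.3, `getBits(Record, 41, 4, 3)` reads bits 41.4-6, and `getBits(Record, 41, 7, 1)` reads bit 41.7.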

kpn added a reviewer: kpn.Nov 13 2020, 2:36 PM

Change errorCodeToError to createStringError in GOFFObjectFile constructor

yusra.syeda marked an inline comment as not done.Nov 17 2020, 8:36 AM

yusra.syeda added inline comments.

llvm/lib/Object/GOFFObjectFile.cpp
166	Thanks, I will add that check.
339	Yes, this should be unreachable and is added as an extra safety check.

Harbormaster completed remote builds in B79133: Diff 305807.Nov 17 2020, 9:28 AM

Clean up some cast statements from const uint8_t * to const char *

yusra.syeda marked 2 inline comments as done.Nov 17 2020, 1:56 PM

Harbormaster completed remote builds in B79185: Diff 305901.Nov 17 2020, 2:09 PM

jhenderson added inline comments.Nov 18 2020, 1:12 AM

llvm/lib/Object/GOFFObjectFile.cpp
47	Be more verbose with your error messages, so that they provide more useful context. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages for full details.

MaskRay added inline comments.Nov 18 2020, 5:45 PM

llvm/lib/Object/GOFFObjectFile.cpp
323	https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop several instances in the file do not obey the rule.
338	llvm_unreachable does not need a return
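
For reference, the loop shape the linked coding standard asks for is sketched below on a plain `std::vector`. `countNonZero` is a made-up example function, not code from the patch.

```cpp
#include <cstddef>
#include <vector>

// Count the non-zero entries, evaluating end() once rather than on every
// iteration.
inline size_t countNonZero(const std::vector<int> &Values) {
  size_t Count = 0;
  for (auto I = Values.begin(), E = Values.end(); I != E; ++I)
    if (*I != 0)
      ++Count;
  return Count;
}
```

Where the loop body does not mutate the container, a range-based `for` loop satisfies the same rule with less code.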

Add check for ESD name length field size
Update loops to comply with LLVM coding standard:
https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop

yusra.syeda marked 3 inline comments as done.Nov 23 2020, 11:40 AM

Harbormaster completed remote builds in B79835: Diff 307140.Nov 23 2020, 11:51 AM

Remove return after llvm_unreachable statement

yusra.syeda marked 2 inline comments as done.Nov 23 2020, 11:58 AM

Harbormaster completed remote builds in B79841: Diff 307153.Nov 23 2020, 12:07 PM

Remove setERSymbolType and getERSymbolType functions

yusra.syeda marked an inline comment as done.Nov 23 2020, 3:13 PM

yusra.syeda added inline comments.

llvm/include/llvm/Object/GOFF.h
224	The setERSymbolType and getERSymbolType functions have been removed from this patch as they are not required.

yusra.syeda marked an inline comment as done.Nov 23 2020, 3:14 PM

Harbormaster completed remote builds in B79875: Diff 307207.Nov 23 2020, 3:15 PM

yusra.syeda marked 4 inline comments as done.Nov 24 2020, 12:50 PM

yusra.syeda added inline comments.

llvm/include/llvm/Object/GOFF.h
138	This field will be added in a future patch.
157	This will also be added in a future patch.
267	Same with this field.
488	Yes, these will be coming in a future patch.

yusra.syeda marked 4 inline comments as done.Nov 24 2020, 12:50 PM

Added unit test GOFFObjectFileTest.cpp, and added GOFF case in TestFileMagic.cpp
Addressed more review comments

yusra.syeda marked 9 inline comments as done.Dec 9 2020, 8:29 AM

Harbormaster completed remote builds in B81646: Diff 310546.Dec 9 2020, 9:07 AM

jhenderson added inline comments.Dec 10 2020, 1:02 AM

llvm/include/llvm/Object/GOFFObjectFile.h
51	Why is `getSymbolName` still returning a `std::error_code` and not an `Error`?
llvm/lib/Object/GOFFObjectFile.cpp
48	What is the right size? What is the size that the object file actually is? Please include context in the message. Also, no trailing full stop in error messages. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages.
180	(or "beginning") begin = verb, beginning/start = nouns
291	What has actually been specified though? Which symbol (if known)?
304	What type is it actually? Also, clang-format.
339	You didn't answer my question: "What happens if there is no symbol section in the file?"
llvm/unittests/Object/GOFFObjectFileTest.cpp
76	clang-format this file

Return Error instead of error_code for function getSymbolName
Address other review comments

Harbormaster completed remote builds in B82317: Diff 311659.Dec 14 2020, 11:49 AM

yusra.syeda marked 5 inline comments as done.Dec 14 2020, 12:49 PM

yusra.syeda added inline comments.

llvm/include/llvm/Object/GOFFObjectFile.h
51	Seems like I missed this. It's been updated.
llvm/lib/Object/GOFFObjectFile.cpp
291	These are the only options for the symbolType in the ESDSymbolType enum. If it's not one of these then the type is invalid.
304	These are the only options in the ESDExecutable enum, so if it's not one of these then the type is invalid.
339	This should be unreachable and it should be an error if there is no symbol section. I updated the llvm_unreachable statement to return an Error instead.

yusra.syeda marked 2 inline comments as done.Dec 14 2020, 12:49 PM

jhenderson added inline comments.Dec 15 2020, 12:10 AM

llvm/include/llvm/Object/GOFFObjectFile.h
51	Sorry, I should have spotted this earlier. It would be a more common style for this function's signature to return an `Expected` rather than take the `Res` as an argument: `Expected<StringRef> getSymbolName(SymbolRef Symbol) const;` This would match, e.g. `getSectionName` or `getSectionContents` as good examples.
llvm/lib/Object/GOFFObjectFile.cpp
47	Please remember to clang-format your changes.
49	`getBufferSize()` returns `size_t` not `int`. The correct format specifier for `size_t` is `%zu`.
202	Here, you'd want `Error::success()`, but actually, if you switch to using `Expected`, you'd return `*NameOrErr` directly.
291	I don't think you understood what I meant - when I asked this and the similar question below regarding executable type, I meant you should include the additional context, i.e. which symbol had an invalid type (i.e. the index or possibly name) and what that invalid type was, e.g. "symbol 42 has unknown type 0x12". It's important to do this properly because the user's input object might be corrupted in some way, and the code needs to make it easier for that user to find the problem.
304	Same as above - give more useful context in this message, e.g. "executable has unknown type 0x1111". Also, I think it's more common to omit the `std::` prefix from `std::errc` values (LLVM has its own version of this set, which partly parallels the std one). Please take a look at removing that prefix from all these `std::errc` instances.
344	I think it would be slightly clearer to say `no symbol section found`. "unable to get" sounds like there was an actual problem retrieving the section (e.g. some part of the section data was invalid), whereas "no ... found" is clear that it's simply not there.
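
The two requests above (return the value or an error instead of filling an out-parameter, and put the offending symbol and value in the message) can be sketched without LLVM headers. Here `std::variant` is only a stand-in for `llvm::Expected`, and `parseSymbolType` is a hypothetical function, not one from the patch. The enum values are the ones shown in the diff.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <variant>

enum ESDSymbolType : uint8_t {
  ESD_ST_SectionDefinition = 0,
  ESD_ST_ElementDefinition = 1,
  ESD_ST_LabelDefinition = 2,
  ESD_ST_PartReference = 3,
  ESD_ST_ExternalReference = 4,
};

// Stand-in for llvm::Expected<ESDSymbolType>: holds either the parsed value
// or an error message.
using SymbolTypeOrError = std::variant<ESDSymbolType, std::string>;

// Hypothetical function: validate a raw type byte read from an ESD record.
// Malformed input yields an error with context rather than an assertion.
SymbolTypeOrError parseSymbolType(uint32_t EsdId, uint8_t Raw) {
  if (Raw > ESD_ST_ExternalReference) {
    std::ostringstream OS;
    OS << "symbol " << EsdId << " has unknown type 0x" << std::hex
       << static_cast<unsigned>(Raw);
    return OS.str();
  }
  return static_cast<ESDSymbolType>(Raw);
}
```

The caller receives either a usable value or a message that names the symbol and the bad value, which is the context a user needs to locate corruption in an input file.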

Change return type of getSymbolName function from Error to Expected<StringRef>
Also remove StringRef parameter passed by reference to the function
Update error messages to be more descriptive

Harbormaster completed remote builds in B82511: Diff 311981.Dec 15 2020, 11:57 AM

Fix typo

Harbormaster completed remote builds in B82512: Diff 311982.Dec 15 2020, 12:00 PM

yusra.syeda marked 5 inline comments as done.Dec 15 2020, 12:01 PM

yusra.syeda added inline comments.

llvm/lib/Object/GOFFObjectFile.cpp
291	Thanks, done.

yusra.syeda marked an inline comment as done.Dec 15 2020, 12:02 PM

jhenderson added inline comments.Dec 16 2020, 1:38 AM

llvm/include/llvm/BinaryFormat/GOFF.h
54	Does the GOFF spec specify the size of symbol types and other things like that? If so, I'd use the explicit size for these fields. For example, if symbol type is guaranteed to be a single byte, you might adopt my inline edit. The same goes for each other enum below.
llvm/lib/Object/GOFFObjectFile.cpp
137	I think this comment was referring to the "unhandled" bits below. It's been marked as done, but I don't see any response. Could you clarify more why this isn't a hard error and instead such things are being ignored?
200–204	Actually, better yet, I think you can simplify this down to a single line as suggested in the edit.
293–294	Rather than casting `EsdId`, use the correct print format specifier.
309–310	Same as above.

Update error statement, clean up getSymbolName function, add size to enum

Harbormaster completed remote builds in B83001: Diff 312867.Dec 18 2020, 11:50 AM

yusra.syeda marked 5 inline comments as done.Dec 18 2020, 11:53 AM

yusra.syeda added inline comments.

llvm/include/llvm/BinaryFormat/GOFF.h
54	I've added the sizes where applicable.
llvm/lib/Object/GOFFObjectFile.cpp
137	These should not be errors. The GOFF reader ignores the normally less important things.

yusra.syeda marked 2 inline comments as done.Dec 18 2020, 11:53 AM

Apply clang format suggestion

Harbormaster completed remote builds in B83003: Diff 312870.Dec 18 2020, 12:01 PM

I've not reviewed the testing yet, but my immediate thought is that there needs to be a lot more, handling all the different code paths.

llvm/include/llvm/BinaryFormat/GOFF.h
110	Is the field that holds this not a fixed size type? If it is, you could use `uint8_t`/`uint16_t` etc as appropriate here to match.
llvm/lib/Object/GOFFObjectFile.cpp
293–294	Now that `SymbolType` is defined to be a `uint8_t`, you should use that explicitly, i.e. something like `0x%02" PRIX8"` (though you could probably simplify and omit the "02" bit, since this is an error message, and getting a fixed width field isn't really necessary).
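
A short sketch of the format-macro approach suggested above, using plain `snprintf` so it stands alone. `describeUnknownType` is a hypothetical helper, and the message wording follows the earlier example in this thread rather than the patch itself.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: build an "unknown type" message using the
// fixed-width format macros from <cinttypes> instead of casting arguments.
std::string describeUnknownType(uint32_t EsdId, uint8_t SymbolType) {
  char Buffer[64]; // Large enough for the longest possible message.
  std::snprintf(Buffer, sizeof(Buffer),
                "symbol %" PRIu32 " has unknown type 0x%02" PRIX8, EsdId,
                SymbolType);
  return Buffer;
}
```

The macros expand to the right conversion for each fixed-width type, so the format string stays correct if the platform's underlying types differ.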

Fix error message formatting

yusra.syeda marked an inline comment as done.Jan 8 2021, 1:43 PM

yusra.syeda added inline comments.

llvm/include/llvm/BinaryFormat/GOFF.h
110	This field is 3 bits in size.

Harbormaster completed remote builds in B84529: Diff 315505.Jan 8 2021, 2:09 PM

jhenderson added inline comments.Jan 11 2021, 12:06 AM

llvm/include/llvm/BinaryFormat/GOFF.h
110	Right, okay. I'd still consider using `uint8_t`, as that is the smallest type that can be used here, I believe. Same probably applies elsewhere. This will allow easier print formatting.

Added size to ESDExecutable enum
Reformatted unit tests

Harbormaster completed remote builds in B91426: Diff 327254.Mar 1 2021, 1:02 PM

In D89071#2486161, @jhenderson wrote:

I've not reviewed the testing yet, but my immediate thought is that there needs to be a lot more, handling all the different code paths.

@jhenderson can you please review the testing? Currently the tests construct a GOFF object with a valid-sized record (80 bytes) and one with an invalid-sized record (not 80 bytes), and obtain the symbol name from the ESD record. Testing for relocations will be added in a future patch.

The majority of your code is still untested as far as I can see. There appear to be three test cases you have so far:

An invalid size for a GOFF object.
A valid size for a GOFF object.
That getSymbolName returns the name of a single symbol in the symbol table.

What about all the rest of the functionality that is included in this patch, including, but certainly not limited to, the following?

More than one symbol in the symbol table.
Other properties of symbols.
The various properties of records.
Relocations.
And so on...

For each bit of code you have written, consider whether a test would fail if that bit of code was broken in some way, or didn't exist. If no test fails, then that code needs a new test case of some form. There may also be other cases where testing is appropriate, e.g. where two separate aspects of the same system interact in some way, although those are harder to judge.

llvm/lib/Object/GOFFObjectFile.cpp
65–66	I don't see a test case involving a continuation record. You should have one followed by a non-continuation record, as otherwise this aspect is not tested.
llvm/unittests/Object/GOFFObjectFileTest.cpp
21	This and the other functions below are only used in one place, if I'm not mistaken. As such, just inline them - splitting them off makes it harder to follow what the individual tests are doing, since you have to jump around the file.
44	The valid size can be any multiple of 80 bytes. I'd recommend a second test-case that uses a size of something other than 80 bytes, e.g. 160 bytes. What about 0 bytes? That probably needs a specific test case, as that is a multiple of 80...
53	According to the code, it needs to be a multiple of 80 bytes, so this comment isn't quite correct (it implies 160 is not a valid size).
54	Test the edge cases e.g. 79 and/or 81.
59	Rather than `Failed()`, use `FailedWithMessage()`, so that you can check the error message output.
74	This is only used once - just inline it.
75	I suspect, given the name, that the `SymbolRef` type is very small already (in the same manner as `StringRef`), and there's no real benefit in making this a `const &`.
80–81	You should just be able to do `EXPECT_EQ(SymbolName, "Hello");` here.
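
The size cases discussed in these comments (0, 79, 80, 81, 160) can be illustrated with a standalone sketch. `checkObjectSize` is a hypothetical function and its message wording is invented. Only the 80-byte record length comes from the patch.

```cpp
#include <cstddef>
#include <string>

constexpr size_t RecordLength = 80;

// Hypothetical check: returns an empty string when Size is a whole number of
// 80-byte records, otherwise an error message that includes the actual size.
std::string checkObjectSize(size_t Size) {
  if (Size % RecordLength == 0)
    return "";
  return "object file size " + std::to_string(Size) +
         " is not a multiple of " + std::to_string(RecordLength) + " bytes";
}
```

Note that this sketch accepts a size of 0, since 0 is a multiple of 80. Whether an empty file should be rejected separately is exactly the open question raised in the comment on line 44.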

In D89071#2610685, @jhenderson wrote:

The majority of your code is still untested as far as I can see. There appear to be three test cases you have so far:

An invalid size for a GOFF object.

A valid size for a GOFF object.

That getSymbolName returns the name of a single symbol in the symbol table.

What about all the rest of the functionality that is included in this patch, including, but certainly not limited to, the following?

More than one symbol in the symbol table.

Other properties of symbols.

The various properties of records.

Relocations.

And so on...

For each bit of code you have written, consider whether a test would fail if that bit of code was broken in some way, or didn't exist. If no test fails, then that code needs a new test case of some form. There may also be other cases where testing is appropriate, e.g. where two separate aspects of the same system interact in some way, although those are harder to judge.

Here's an idea: I found, at least when running in batch, that the Binder (linker) will link an object consisting of nothing more than a HDR card, then ESD cards followed by an END card (meaning, just symbols). It will also link an object consisting of HDR, ESD, TXT, and END cards with zero relocations. Does it work that way when not running in batch? Because if it does then it might make sense to split this ticket up into a new ticket with just support for HDR+ESD+END cards.

That would make this patch smaller, and it would reduce the amount of tests that need to be written to get some initial GOFF support into the tree. The tests that @jhenderson requested would still be needed, but you'd only need the ones that were relevant to the smaller amount of code in the new ticket. A new ticket should refer back to this ticket because this ticket shows the direction you are going, and it has a bunch of comments that should probably be left for posterity. Later tickets can build on this foundation.

The LLVM community tends to prefer smaller patches over larger ones. Typically, anyway.

It's an idea. Thoughts?

In D89071#2614095, @kpn wrote:

Here's an idea: I found, at least when running in batch, that the Binder (linker) will link an object consisting of nothing more than a HDR card, then ESD cards followed by an END card (meaning, just symbols). It will also link an object consisting of HDR, ESD, TXT, and END cards with zero relocations. Does it work that way when not running in batch? Because if it does then it might make sense to split this ticket up into a new ticket with just support for HDR+ESD+END cards.

That would make this patch smaller, and it would reduce the amount of tests that need to be written to get some initial GOFF support into the tree. The tests that @jhenderson requested would still be needed, but you'd only need the ones that were relevant to the smaller amount of code in the new ticket. A new ticket should refer back to this ticket because this ticket shows the direction you are going, and it has a bunch of comments that should probably be left for posterity. Later tickets can build on this foundation.

The LLVM community tends to prefer smaller patches over larger ones. Typically, anyway.

It's an idea. Thoughts?

I agree this would be a quicker way to get some initial GOFF support in. I can break down this patch to support HDR + ESD + END records, and add the relevant tests mentioned by @jhenderson. I'll add support for the remaining records incrementally.

yusra.syeda mentioned this in D98437: [SystemZ][z/OS] Add GOFFObjectFile class support for HDR, ESD and END records.Mar 11 2021, 10:15 AM

I created a smaller patch supporting just the HDR, ESD and END records here: https://reviews.llvm.org/D98437. Please continue the review in the new patch.

yusra.syeda abandoned this revision.Jun 4 2021, 8:53 AM

Revision Contents

Path                              Size
llvm/
  include/llvm/
    BinaryFormat/
      GOFF.h                      220 lines
      GOFFAda.def                 41 lines
      Magic.h                     1 line
    Object/
      Binary.h                    5 lines
      GOFF.h                      543 lines
      GOFFObjectFile.h            167 lines
      ObjectFile.h                3 lines
  lib/
    BinaryFormat/
      Magic.cpp                   5 lines
    Object/
      Binary.cpp                  1 line
      CMakeLists.txt              1 line
      GOFFObjectFile.cpp          490 lines
      ObjectFile.cpp              5 lines
      SymbolicFile.cpp            1 line
  unittests/
    BinaryFormat/
      TestFileMagic.cpp           2 lines
    Object/
      CMakeLists.txt              1 line
      GOFFObjectFileTest.cpp      91 lines

Diff 327254

llvm/include/llvm/BinaryFormat/GOFF.h

This file was added.

//===-- llvm/BinaryFormat/GOFF.h - GOFF definitions --------------*- C++-*-===//

jhenderson (Unsubmitted, Done):

You need to shorten this comment line by deleting the extra dashes to bring it within the column width (compare other header files to contrast how they work).

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//===----------------------------------------------------------------------===//

jhenderson (Unsubmitted, Done):

This doesn't look to me to be the current license header. Please update to match the current version.

// This header contains common, non-processor-specific data structures and
// constants for the GOFF file format.
// GOFF specifics can be found in MVS Program Management: Advanced Facilities
// The details of the GOFF32 bits in this file are largely based on the Tool
// Interface Standard (TIS) Executable and Linking Format (GOFF) Specification
// Version 1.2, May 1995. The GOFF64 specifics are based on GOFF-64 Object File
// Format Version 1.5, Draft 2, May 1998 as well as OpenBSD header files.
//===----------------------------------------------------------------------===//

jhenderson (Unsubmitted, Done):

Looks like the comment here needs reflowing?

kpn (Unsubmitted, Done):

I can't find any document with Google that describes a GOFF-specific TIS, and Google also has trouble with "GOFF64" and "GOFF-64". Those version numbers I guess refer to that missing document.

Can you instead reference the exact book from IBM's "z/OS Internet Library" that describes GOFF? Include the full name, the edition number, and the year please. Then it's easy to look up the GOFF spec.

#ifndef LLVM_BINARYFORMAT_GOFF_H
#define LLVM_BINARYFORMAT_GOFF_H

#include "llvm/Support/DataTypes.h"

namespace llvm {
namespace GOFF {

constexpr uint8_t RecordLength = 80;
const uint8_t RecordPrefixLength = 3;
const uint8_t PayloadLength = 77;

/// \brief Maximum data length before starting a new card for RLD and TXT data.
///
/// The maximum number of bytes that can be included in an RLD or TXT record and
/// their continuations is a SIGNED 16 bit int despite what the spec says. The
/// number of bytes we allow ourselves to attach to a card is thus arbitrarily
/// limited to 16K bytes.

MaskRay (Unsubmitted, Done):

  /// limited to 16K bytes.
- const uint16_t MaxDataLength = 16 * 1024;
+ constexpr uint16_t MaxDataLength = 16 * 1024;
  // Prefix byte on every record. This indicates GOFF format.

jhenderson (Unsubmitted, Done):

  const uint16_t MaxDataLength = 16 * 1024;
- // Prefix byte on every record. This indicates GOFF format.
+ // Prefix byte on every record. This indicates GOFF format.
  enum { PTVPrefix = 0x03 };

jhenderson (Unsubmitted, Done):

If this is supposed to be a byte, shouldn't it just be constexpr uint8_t PTVPrefix = 0x03?

constexpr uint8_t PTVPrefix = 0x03;

enum RecordType {
  RT_ESD = 0,
  RT_TXT = 1,
  RT_RLD = 2,
  RT_LEN = 3,
  RT_END = 4,
  RT_HDR = 15,
};

enum ESDSymbolType : uint8_t {
  ESD_ST_SectionDefinition = 0,

jhenderson (Unsubmitted, Done):

  RT_HDR = 15,
  };
- enum ESDSymbolType {
+ enum ESDSymbolType : uint8_t {
  ESD_ST_SectionDefinition = 0,

Does the GOFF spec specify the size of symbol types and other things like that? If so, I'd use the explicit size for these fields. For example, if symbol type is guaranteed to be a single byte, you might adopt my inline edit. The same goes for each other enum below.

yusra.syedaAuthorUnsubmitted

Done

I've added the sizes where applicable.

ESD_ST_ElementDefinition = 1,

ESD_ST_LabelDefinition = 2,

ESD_ST_PartReference = 3,

ESD_ST_ExternalReference = 4,

};

enum ESDNameSpaceId : uint8_t {

ESD_NS_ProgramManagementBinder = 0,

ESD_NS_NormalName = 1,

ESD_NS_PseudoRegister = 2,

ESD_NS_Parts = 3

};

enum ESDReserveQwords {

ESD_RQ_0 = 0,

ESD_RQ_1 = 1,

ESD_RQ_2 = 2,

ESD_RQ_3 = 3

};

enum ESDAmode : uint8_t {

ESD_AMODE_None = 0,

ESD_AMODE_24 = 1,

ESD_AMODE_31 = 2,

ESD_AMODE_ANY = 3,

ESD_AMODE_64 = 4,

ESD_AMODE_MIN = 16,

};

enum ESDRmode : uint8_t {

ESD_RMODE_None = 0,

ESD_RMODE_24 = 1,

ESD_RMODE_31 = 3,

ESD_RMODE_64 = 4,

};

enum ESDTextStyle {

ESD_TS_ByteOriented = 0,

ESD_TS_Structured = 1,

ESD_TS_Unstructured = 2,

};

enum ESDBindingAlgorithm {

ESD_BA_Concatenate = 0,

ESD_BA_Merge = 1,

};

enum ESDTaskingBehavior {

ESD_TA_Unspecified = 0,

ESD_TA_NonReus = 1,

ESD_TA_Reus = 2,

ESD_TA_Rent = 3,

};

enum ESDExecutable : uint8_t {

ESD_EXE_Unspecified = 0,

jhendersonUnsubmitted

Not Done

Is the field that holds this not a fixed size type? If it is, you could use uint8_t/uint16_t etc. as appropriate here to match.

yusra.syedaAuthorUnsubmitted

Done

This field is 3 bits in size.

jhendersonUnsubmitted

Not Done

Right, okay. I'd still consider using uint8_t, as that is the smallest type that can be used here, I believe. Same probably applies elsewhere. This will allow easier print formatting.
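
A small sketch of what the suggestion could look like for a 3-bit field, assuming the field sits in the three low-order bits of its byte (this matches the `setBits(63, 5, 3, ...)` call in the patch, where bit 0 is the most significant bit). The helpers `decodeExecutable` and `formatExecutable` are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

namespace sketch {
// Enum values copied from the patch; uint8_t is the smallest fixed-width
// type that can hold a 3-bit field.
enum ESDExecutable : std::uint8_t {
  ESD_EXE_Unspecified = 0,
  ESD_EXE_DATA = 1,
  ESD_EXE_CODE = 2,
};
static_assert(sizeof(ESDExecutable) == 1, "enum occupies a single byte");

// Hypothetical helper: extracts the 3-bit field from the low bits of a byte.
inline ESDExecutable decodeExecutable(std::uint8_t Byte) {
  return static_cast<ESDExecutable>(Byte & 0x07);
}

// Hypothetical helper: formats the value as a decimal number. The cast to
// unsigned makes sure the stream prints a number, not a character.
inline std::string formatExecutable(ESDExecutable E) {
  std::ostringstream OS;
  OS << static_cast<unsigned>(E);
  return OS.str();
}
} // namespace sketch
```

Fixing the underlying type keeps the enum's size predictable even though the on-disk field is narrower than a byte.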

ESD_EXE_DATA = 1,

ESD_EXE_CODE = 2,

};

enum ESDDuplicateSymbolSeverity {

ESD_DSS_NoWarning = 0,

ESD_DSS_Warning = 1,

ESD_DSS_Error = 2,

ESD_DSS_Reserved = 3,

};

enum ESDBindingStrength {

ESD_BST_Strong = 0,

ESD_BST_Weak = 1,

};

enum ESDLoadingBehavior {

ESD_LB_Initial = 0,

ESD_LB_Deferred = 1,

ESD_LB_NoLoad = 2,

ESD_LB_Reserved = 3,

};

enum ESDBindingScope {

ESD_BSC_Unspecified = 0,

ESD_BSC_Section = 1,

ESD_BSC_Module = 2,

ESD_BSC_Library = 3,

ESD_BSC_ImportExport = 4,

};

enum ESDLinkageType { ESD_LT_OS = 0, ESD_LT_XPLink = 1 };

enum ESDAlignment {

ESD_ALIGN_Byte = 0,

ESD_ALIGN_Halfword = 1,

ESD_ALIGN_Fullword = 2,

ESD_ALIGN_Doubleword = 3,

ESD_ALIGN_Quadword = 4,

ESD_ALIGN_32byte = 5,

ESD_ALIGN_64byte = 6,

ESD_ALIGN_128byte = 7,

ESD_ALIGN_256byte = 8,

ESD_ALIGN_512byte = 9,

ESD_ALIGN_1024byte = 10,

ESD_ALIGN_2Kpage = 11,

ESD_ALIGN_4Kpage = 12,

};

enum TXTRecordStyle : uint8_t {

TXT_RS_Byte = 0,

TXT_RS_Structured = 1,

TXT_RS_Unstructured = 2,

};

enum RLDReferenceType {

RLD_RT_RAddress = 0,

RLD_RT_ROffset = 1,

RLD_RT_RLength = 2,

RLD_RT_RRelativeImmediate = 6,

RLD_RT_RTypeConstant = 7,

RLD_RT_RLongDisplacement = 9,

};

enum RLDReferentType {

jhendersonUnsubmitted

Done

TXT_RS_Unstructured = 2,

};

// RLDRelocationType is internal use only and these values not put

- // in GOFF format

+ // in GOFF format.

enum RLDRelocationType {

I'm not quite sure what this comment is saying. When it says "internal use only" do you mean not defined in any spec, and used only as a helper within the format processing? If so, the enum values should be styled after LLVM coding standard i.e. "UpperCamelCase". If not, you need to clarify what is meant by "internal use only".

yusra.syedaAuthorUnsubmitted

Done

This is used only as a helper, but is not needed in this patch. It has been removed.

RLD_RO_Label = 0,

RLD_RO_Element = 1,

RLD_RO_Class = 2,

RLD_RO_Part = 3,

};

enum RLDAction {

RLD_ACT_Multiply = 4,

RLD_ACT_Div4Quotient = 6,

RLD_ACT_Div4Remainder = 7,

RLD_ACT_And = 8,

RLD_ACT_Or = 9,

RLD_ACT_Xor = 10,

RLD_ACT_Move = 16

};

enum RLDFetchStore { RLD_FS_Fetch = 0, RLD_FS_Store = 1 };

enum ENDEntryPointRequest {

END_EPR_None = 0,

END_EPR_EsdidOffset = 1,

END_EPR_ExternalName = 2,

END_EPR_Reserved = 3,

};

// \brief Represent the different types of objects that can be referenced from

// the associated data area (ADA) of a compilation unit.

enum ADASlotKind {

#define ADASLOT(SLOT, MO, V) SLOT = V,

#include "llvm/BinaryFormat/GOFFAda.def"

};

// \brief Subsections of the primary C_CODE section in the object file.

enum SubsectionKind {

SK_ReadOnly = 1,

SK_PPA1 = 2,

SK_JumpTable = 3,

SK_PPA2 = 4,

};

} // end namespace GOFF

} // end namespace llvm

#endif // LLVM_BINARYFORMAT_GOFF_H

llvm/include/llvm/BinaryFormat/GOFFAda.def

This file was added.

				//===-- GOFFAda.def ---------------------------------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file centralizes the ADA (Associated Data Area) definitions
				// used by GOFF and XPLINK.
				kpn (inline comment, Done): Please expand "ADA" here at the top. This appears to be for "Extended Attributes"?
				//
				//===----------------------------------------------------------------------===//

				#ifndef ADASLOT
				#define ADASLOT(SLOT, MO, V)
				#endif

				/// ADASLOT(SLOT, MO, V)
				///
				/// \param SLOT - ADA slot kind in file
				/// \param MO - Corresponding machine operand kind
				/// \param V - Enumeration value

				// The address of a non-function (data) symbol.
				ADASLOT(ADA_DataSymbolAddr, MO_ADA_DATA_SYMBOL_ADDR, 1)
				// The address of a function descriptor.
				ADASLOT(ADA_IndirectFuncDesc, MO_ADA_INDIRECT_FUNC_DESC, 2)
				// A function descriptor. Contains the address of the ADA for a function and
				// the address of the function.
				ADASLOT(ADA_DirectFuncDesc, MO_ADA_DIRECT_FUNC_DESC, 3)
				// The address of the ADA for a function. Part of a function descriptor.
				ADASLOT(ADA_DirectFuncDescADA, MO_ADA_DIRECT_FUNC_DESC_ADA, 4)
				// The address of a function. Part of a function descriptor.
				ADASLOT(ADA_DirectFuncDescAddr, MO_ADA_DIRECT_FUNC_DESC_EPA, 5)
				// The address of a data symbol's handle, that is, the address of the symbol
				// referenced. This is how external data is referenced.
				ADASLOT(ADA_DataSymbolHandle, MO_ADA_DATA_SYMBOL_ADDR_VIA_HANDLE, 6)
				// Access ADA entry containing pointer to internal data symbol.
				ADASLOT(ADA_InternalDataSymbolAddr, MO_ADA_INTERNAL_DATA_SYMBOL_ADDR, 7)

				#undef ADASLOT
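
The `.def` file above follows the X-macro pattern: each consumer defines `ADASLOT` before including the file and gets one expansion per slot. The sketch below shows the idea in a single translation unit. Because a standalone example cannot include the real `.def` file, it substitutes a hypothetical list macro `SKETCH_ADA_SLOTS` carrying two of the entries; the `slotName` helper is likewise an assumption and not part of the patch.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for GOFFAda.def: applies X to each (SLOT, MO, V) entry.
// Only two of the entries from the patch are reproduced here.
#define SKETCH_ADA_SLOTS(X)                                                    \
  X(ADA_DataSymbolAddr, MO_ADA_DATA_SYMBOL_ADDR, 1)                            \
  X(ADA_IndirectFuncDesc, MO_ADA_INDIRECT_FUNC_DESC, 2)

namespace sketch {
// First expansion: build the enum, as GOFF.h does with ADASlotKind.
enum ADASlotKind {
#define ADASLOT(SLOT, MO, V) SLOT = V,
  SKETCH_ADA_SLOTS(ADASLOT)
#undef ADASLOT
};

// Second expansion of the same list: map each enumerator to its name.
inline const char *slotName(ADASlotKind K) {
  switch (K) {
#define ADASLOT(SLOT, MO, V)                                                   \
  case SLOT:                                                                   \
    return #SLOT;
    SKETCH_ADA_SLOTS(ADASLOT)
#undef ADASLOT
  }
  return "unknown";
}
} // namespace sketch
```

Keeping the list in one place means the enum and any derived tables cannot drift apart when a slot is added.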

llvm/include/llvm/BinaryFormat/Magic.h

//===- llvm/BinaryFormat/Magic.h - File magic identification ----- C++ --===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

(35 lines not shown)
enum Impl {
coff_object, ///< COFF object file
coff_import_library, ///< COFF import library
pecoff_executable, ///< PECOFF executable file
windows_resource, ///< Windows compiled resource file (.res)
xcoff_object_32, ///< 32-bit XCOFF object file
xcoff_object_64, ///< 64-bit XCOFF object file
wasm_object, ///< WebAssembly Object file
pdb, ///< Windows PDB debug info file
+ goff_object, ///< GOFF object file
tapi_file, ///< Text-based Dynamic Library Stub file
};

bool is_object() const { return V != unknown; }

file_magic() = default;
file_magic(Impl V) : V(V) {}
operator Impl() const { return V; }
(18 lines not shown)

llvm/include/llvm/Object/Binary.h

//===- Binary.h - A generic binary file -------------------------- C++ --===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
(53 lines not shown)
enum {
ID_ELF64L, // ELF 64-bit, little endian
ID_ELF64B, // ELF 64-bit, big endian

ID_MachO32L, // MachO 32-bit, little endian
ID_MachO32B, // MachO 32-bit, big endian
ID_MachO64L, // MachO 64-bit, little endian
ID_MachO64B, // MachO 64-bit, big endian

+ ID_GOFF,
ID_Wasm,

ID_EndObjects
};

static inline unsigned int getELFType(bool isLE, bool is64Bits) {
if (isLE)
return is64Bits ? ID_ELF64L : ID_ELF32L;
(56 lines not shown)
public:
bool isCOFFImportFile() const {
return TypeID == ID_COFFImportFile;
}

bool isIR() const {
return TypeID == ID_IR;
}

+ bool isGOFF() const { return TypeID == ID_GOFF; }
+
bool isMinidump() const { return TypeID == ID_Minidump; }

bool isTapiFile() const { return TypeID == ID_TapiFile; }

bool isLittleEndian() const {
return !(TypeID == ID_ELF32B || TypeID == ID_ELF64B ||
TypeID == ID_MachO32B || TypeID == ID_MachO64B);
}

bool isWinRes() const { return TypeID == ID_WinRes; }

Triple::ObjectFormatType getTripleObjectFormat() const {
if (isCOFF())
return Triple::COFF;
if (isMachO())
return Triple::MachO;
if (isELF())
return Triple::ELF;
+ if (isGOFF())
+ return Triple::GOFF;
return Triple::UnknownObjectFormat;
}

static Error checkOffset(MemoryBufferRef M, uintptr_t Addr,
const uint64_t Size) {
if (Addr + Size < Addr || Addr + Size < Size ||
Addr + Size > uintptr_t(M.getBufferEnd()) ||
Addr < uintptr_t(M.getBufferStart())) {
(71 lines not shown)

llvm/include/llvm/Object/GOFF.h

This file was added.

//===- GOFF.h - GOFF object file implementation -----------------*- C++ -*-===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

jhendersonUnsubmitted

Done

This file is also using an outdated license.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// This file declares the GOFFObjectFile class.

// Record classes and derivatives are also declared and implemented.

//===----------------------------------------------------------------------===//

#ifndef LLVM_OBJECT_GOFF_H

#define LLVM_OBJECT_GOFF_H

#include "llvm/ADT/SmallVector.h"

#include "llvm/BinaryFormat/GOFF.h"

#include "llvm/Support/Debug.h"

#include "llvm/Support/Endian.h"

#include "llvm/Support/raw_ostream.h"

namespace llvm {

struct GOFFRelocationEntry {

const uint32_t REsdId;

const uint32_t PEsdId;

const uint64_t POffset;

const GOFF::RLDReferenceType ReferenceType;

const GOFF::RLDReferentType ReferentType;

const GOFF::RLDAction Action;

const bool NoFetchTarget;

const uint32_t TargetLength;

GOFFRelocationEntry(uint32_t REsdId, uint32_t PEsdId, uint64_t POffset,

GOFF::RLDReferenceType ReferenceType,

jhendersonUnsubmitted

Done

Please follow LLVM coding standards and drop the trailing _. You don't need it.

GOFF::RLDReferentType ReferentType,

GOFF::RLDAction Action, bool NoFetchTarget,

uint32_t TargetLength)

: REsdId(REsdId), PEsdId(PEsdId), POffset(POffset),

ReferenceType(ReferenceType), ReferentType(ReferentType),

Action(Action), NoFetchTarget(NoFetchTarget),

TargetLength(TargetLength) {}

};

namespace object {

/// \brief Represents a GOFF physical record.

///

/// Specifies protected member functions to manipulate the record. These should

/// be called from deriving classes to change values as that record specifies.

class Record {

SmallVector<char, GOFF::RecordLength> Bytes;

public:

Record() : Bytes(GOFF::RecordLength, (char)0x00) {

set<uint8_t>(0, GOFF::PTVPrefix);

};

// Set PTV fields common to all records.

void setRecordType(GOFF::RecordType RecordType) {

jhendersonUnsubmitted

Done

set8(0, GOFF::PTVPrefix);

};

- // Set PTV fields common to all records

+ // Set PTV fields common to all records.

void setRecordType(GOFF::RecordType RecordType) {

Here and throughout, please end your comments with a trailing full stop to conform to the LLVM coding standards.

setBits(1, 0, 4, RecordType);

}

void setContinuation(bool Continuation) {

uint8_t Value = Continuation ? 1 : 0;

setBits(1, 6, 1, Value);

}

void setContinued(bool Continued) {

uint8_t Value = Continued ? 1 : 0;

setBits(1, 7, 1, Value);

}

static bool isContinued(const uint8_t *Record) {

uint8_t IsContinued;

getBits(Record, 1, 7, 1, IsContinued);

return IsContinued;

}

static bool isContinuation(const uint8_t *Record) {

uint8_t IsContinuation;

getBits(Record, 1, 6, 1, IsContinuation);

return IsContinuation;

}
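
The accessors above imply a layout for the first two bytes of every record: byte 0 holds the prefix 0x03, the high nibble of byte 1 holds the record type, and the two low-order bits of byte 1 hold the continuation and continued flags (bit indices 6 and 7 when counting from the most significant bit). A minimal decoding sketch under those assumptions follows; `PTVInfo` and `decodePTV` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Decoded view of the first two bytes of a record.
struct PTVInfo {
  bool IsGOFF;             // Byte 0 equals the 0x03 prefix.
  std::uint8_t RecordType; // Bits 0-3 of byte 1 (MSB-first numbering).
  bool IsContinuation;     // Bit 6 of byte 1, i.e. mask 0x02.
  bool IsContinued;        // Bit 7 of byte 1, i.e. mask 0x01.
};

constexpr PTVInfo decodePTV(std::uint8_t Byte0, std::uint8_t Byte1) {
  return PTVInfo{Byte0 == 0x03, static_cast<std::uint8_t>(Byte1 >> 4),
                 (Byte1 & 0x02) != 0, (Byte1 & 0x01) != 0};
}

// RT_HDR is 15 in the patch, so a header record has 0xF in the high nibble.
static_assert(decodePTV(0x03, 0xF0).RecordType == 15, "HDR record type");
static_assert(!decodePTV(0x00, 0xF0).IsGOFF, "wrong prefix is rejected");
} // namespace sketch
```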

const SmallVectorImpl<char> &getBytes() const { return Bytes; }

protected:

/// \brief Set bit field of specified byte.

///

/// Used to pack bit fields into one byte. Fields are packed LEFT TO RIGHT.

/// Bit index zero is the MOST SIGNIFICANT BIT of the byte.

///

/// \param ByteIndex index of byte the field is in.

/// \param BitIndex index of first bit of field.

/// \param Length length of bit field.

/// \param Value new value of bit field.

void setBits(uint8_t ByteIndex, uint8_t BitIndex, uint8_t Length,

uint8_t Value) {

assert(ByteIndex < GOFF::RecordLength && "Byte index out of bounds!");

assert(BitIndex < 8 && "Bit index out of bounds!");

assert(Length + BitIndex <= 8 && "Bit length too long!");

uint8_t Mask = ((1 << Length) - 1) << (8 - BitIndex - Length);

jhendersonUnsubmitted

Done

set<uint8_t>(ByteIndex, (PrevValue & ~Mask) | Value);

}

- // Set byte of record

+ // Set byte of record.

template <class T> void set(uint8_t ByteIndex, T Value) {

Value = Value << (8 - BitIndex - Length);

assert((Value & Mask) == Value && "Bits set outside of range!");

jhendersonUnsubmitted

Done

You state that this accounts for endianness. What happens if the host endianness is different to the target endianness? I suspect your values will be in the wrong order then.
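
One way to make the byte order independent of the host is to assemble and split values with shifts instead of copying their in-memory representation. The sketch below illustrates that technique with two hypothetical helpers, `writeBE32` and `readBE32`; it is not the code from the patch, which relies on `support::endian::write` and `support::endian::read` with `support::big` for the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {
// Stores Value most significant byte first, regardless of host endianness.
inline void writeBE32(std::uint8_t *Dst, std::uint32_t Value) {
  for (std::size_t I = 0; I < 4; ++I)
    Dst[I] = static_cast<std::uint8_t>(Value >> (8 * (3 - I)));
}

// Reassembles a value that was stored most significant byte first.
inline std::uint32_t readBE32(const std::uint8_t *Src) {
  std::uint32_t Value = 0;
  for (std::size_t I = 0; I < 4; ++I)
    Value = (Value << 8) | Src[I];
  return Value;
}
} // namespace sketch
```

Because the shifts operate on the value rather than on its storage, the same bytes are produced on little-endian and big-endian hosts.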

uint8_t PrevValue;

get<uint8_t>(reinterpret_cast<const uint8_t *>(Bytes.data()), ByteIndex,

PrevValue);

set<uint8_t>(ByteIndex, (PrevValue & ~Mask) | Value);

jhendersonUnsubmitted

Done

As you're using this value for an operation in memory, you probably want it to be a size_t, rather than a signed integer. Same goes elsewhere below.

}

// Set byte of record.

template <class T> void set(uint8_t ByteIndex, T Value) {

assert(ByteIndex + sizeof(T) - 1 < GOFF::RecordLength &&

"Byte index out of bounds!");

support::endian::write<T, support::big, support::unaligned>(

&Bytes[ByteIndex], Value);

}

static void getBits(const uint8_t *Bytes, uint8_t ByteIndex, uint8_t BitIndex,

uint8_t Length, uint8_t &Value) {

assert(ByteIndex < GOFF::RecordLength && "Byte index out of bounds!");

assert(BitIndex < 8 && "Bit index out of bounds!");

assert(Length + BitIndex <= 8 && "Bit length too long!");

jhendersonUnsubmitted

Done

This comment should be deleted - it provides no useful value.

get<uint8_t>(Bytes, ByteIndex, Value);

uint8_t Mask = ((1 << Length) - 1) << (8 - BitIndex - Length);

Value = (Value & Mask) >> (8 - BitIndex - Length);

}
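
The mask arithmetic shared by `setBits` and `getBits` can be checked in isolation. The sketch below restates it as free `constexpr` functions operating on a single byte, with bit index 0 as the most significant bit as the doc comment describes. The names `fieldMask`, `setField`, and `getField` are hypothetical; unlike the patch, `setField` masks an out-of-range value instead of asserting.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Mask covering Length bits starting at BitIndex, counted from the MSB.
constexpr std::uint8_t fieldMask(unsigned BitIndex, unsigned Length) {
  return static_cast<std::uint8_t>(((1u << Length) - 1u)
                                   << (8u - BitIndex - Length));
}

// Returns Byte with the selected field replaced by Value.
constexpr std::uint8_t setField(std::uint8_t Byte, unsigned BitIndex,
                                unsigned Length, std::uint8_t Value) {
  return static_cast<std::uint8_t>(
      (Byte & ~fieldMask(BitIndex, Length)) |
      ((Value << (8u - BitIndex - Length)) & fieldMask(BitIndex, Length)));
}

// Returns the selected field, shifted down to the low-order bits.
constexpr std::uint8_t getField(std::uint8_t Byte, unsigned BitIndex,
                                unsigned Length) {
  return static_cast<std::uint8_t>((Byte & fieldMask(BitIndex, Length)) >>
                                   (8u - BitIndex - Length));
}

static_assert(fieldMask(0, 4) == 0xF0, "first four bits are the high nibble");
static_assert(fieldMask(5, 3) == 0x07, "bits 5-7 are the three low bits");
static_assert(setField(0x00, 0, 4, 15) == 0xF0, "record type 15 in byte 1");
static_assert(getField(0xF0, 0, 4) == 15, "round trip of the high nibble");
} // namespace sketch
```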

template <class T>

static void get(const uint8_t *Bytes, uint8_t ByteIndex, T &Value) {

assert(ByteIndex + sizeof(T) - 1 < GOFF::RecordLength &&

"Byte index out of bounds!");

kpnUnsubmitted

Done

Missing module properties field.

yusra.syedaAuthorUnsubmitted

Done

This field will be added in a future patch.

Value = support::endian::read<T, support::big, support::unaligned>(

&Bytes[ByteIndex]);

}

jhendersonUnsubmitted

Done

I've not looked too hard, but I wonder if some of this code could be removed by using the DataExtractor class? It provides operations for extracting values from a stream, given an index, and also handles endianness conversion.

};

class HDRRecord : public Record {

public:

HDRRecord() : Record() {

setRecordType(GOFF::RT_HDR);

setArchitectureLevel(1);

}

void setArchitectureLevel(uint32_t Level) { set<uint32_t>(48, Level); }

};

class TXTRecord : public Record {

public:

/// \brief Maximum length of data; any more must go in continuation.

static const uint8_t TXTMaxDataLength = 56;

kpnUnsubmitted

Done

I don't see any support for "Text Encoding". These fields are at bytes 16-21. Maybe a comment if you don't plan on ever implementing it?

yusra.syedaAuthorUnsubmitted

Done

This will also be added in a future patch.

public:

TXTRecord() : Record() { setRecordType(GOFF::RT_TXT); }

void setRecordStyle(GOFF::TXTRecordStyle Style) { setBits(3, 4, 4, Style); }

void setElementEsdId(uint32_t EsdId) { set<uint32_t>(4, EsdId); }

jhendersonUnsubmitted

Done

assert(Length <= TXTMaxDataLength && "Data too long for TXT Record");

- for (int I = 0; I < Length; ++I)

+ for (size_t I = 0; I < Length; ++I)

set<uint8_t>(24 + I, Data[I]);

void setOffset(uint32_t Offset) { set<uint32_t>(12, Offset); }

void setDataLength(uint16_t Length) { set<uint16_t>(22, Length); }

void setData(const SmallVectorImpl<char>::const_iterator &Data,

uint8_t Length) {

assert(Length <= TXTMaxDataLength && "Data too long for TXT Record");

for (size_t I = 0; I < Length; ++I)

set<uint8_t>(24 + I, Data[I]);

}
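
The TXT data area starts at byte 24 and holds at most 56 bytes, which together fill the 80-byte record. The sketch below shows the same copy loop with a `size_t` index and a run-time length check in place of the assertion. The function `setTXTData` and the use of `std::array` are assumptions made for a self-contained example, not the patch's interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {
constexpr std::size_t RecordLength = 80;
constexpr std::size_t TXTDataOffset = 24;
constexpr std::size_t TXTMaxDataLength = 56;
static_assert(TXTDataOffset + TXTMaxDataLength == RecordLength,
              "TXT data area ends exactly at the end of the record");

// Copies Length bytes into the TXT data area. Returns false, leaving the
// record untouched, if the data would not fit.
inline bool setTXTData(std::array<std::uint8_t, RecordLength> &Record,
                       const std::uint8_t *Data, std::size_t Length) {
  if (Length > TXTMaxDataLength)
    return false;
  for (std::size_t I = 0; I < Length; ++I)
    Record[TXTDataOffset + I] = Data[I];
  return true;
}
} // namespace sketch
```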

// Get routines.

static void getElementEsdId(const uint8_t *Record, uint32_t &EsdId) {

get<uint32_t>(Record, 4, EsdId);

}

static void getOffset(const uint8_t *Record, uint32_t &Offset) {

get<uint32_t>(Record, 12, Offset);

}

static void getDataLength(const uint8_t *Record, uint16_t &Length) {

get<uint16_t>(Record, 22, Length);

}

};

class ESDRecord : public Record {

public:

/// \brief Number of bytes for name; any more must go in continuation.

/// This is the number of bytes that can fit into the data field of an ESD

/// record.

static const uint8_t ESDMaxNameLength = 8;

/// \brief Maximum name length for ESD records and continuations.

/// This is the number of bytes that can fit into the data field of an ESD

/// record AND following continuations. This is limited fundamentally by the

/// 16 bit SIGNED length field.

static const uint16_t MaxNameLength = 32 * 1024;
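
As a rough illustration of how these limits interact, the sketch below computes how many continuation records a name of a given length would need, assuming the first 8 bytes go into the ESD record and each continuation record carries up to 77 bytes (the `ContinuationMaxDataLength` defined later in this file). The helper `continuationRecordsNeeded` is hypothetical. The last assertion records a plain arithmetic fact: 32 * 1024 is 32768, one more than the largest value a signed 16-bit field can hold.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

namespace sketch {
constexpr std::size_t ESDMaxNameLength = 8;
constexpr std::size_t ContinuationMaxDataLength = 77;

// Number of continuation records needed for a name of NameLength bytes.
constexpr std::size_t continuationRecordsNeeded(std::size_t NameLength) {
  return NameLength <= ESDMaxNameLength
             ? 0
             : (NameLength - ESDMaxNameLength + ContinuationMaxDataLength - 1) /
                   ContinuationMaxDataLength;
}

static_assert(continuationRecordsNeeded(8) == 0, "fits in the ESD record");
static_assert(continuationRecordsNeeded(9) == 1, "one byte spills over");
static_assert(continuationRecordsNeeded(85) == 1, "8 + 77 bytes");
static_assert(continuationRecordsNeeded(86) == 2, "one byte more");
static_assert(32 * 1024 > std::numeric_limits<std::int16_t>::max(),
              "32768 does not fit in a signed 16-bit field");
} // namespace sketch
```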

public:

ESDRecord() : Record() { setRecordType(GOFF::RT_ESD); }

void setSymbolType(GOFF::ESDSymbolType SymbolType) {

set<uint8_t>(3, SymbolType);

}

void setEsdId(uint32_t EsdId) { set<uint32_t>(4, EsdId); }

void setParentEsdId(uint32_t EsdId) { set<uint32_t>(8, EsdId); }

void setOffset(uint32_t Offset) { set<uint32_t>(16, Offset); }

void setLength(uint32_t Length) { set<uint32_t>(24, Length); }

void setNameSpaceId(GOFF::ESDNameSpaceId Id) { set<uint8_t>(40, Id); }

void setFillBytePresent(bool Present) { setBits(41, 0, 1, Present); }

void setNameMangled(bool Mangled) { setBits(41, 1, 1, Mangled); }

kpnUnsubmitted

Done

Is this right? Bit 41.3 in "Program Management" is the "Removable Class Flag", which matches the code above this. Bit 41.4-6 are marked reserved, and bit 41.7 is unnamed but if set means "Reserve 16 bytes at beginning of class. MRG class ED records only."

So it looks like the code is writing to the wrong part of byte 41. The code that reads from that byte also appears incorrect. Shouldn't it be (41, 4, 3)?

yusra.syedaAuthorUnsubmitted

Done

The setERSymbolType and getERSymbolType functions have been removed from this patch as they are not required.

void setRenamable(bool Renamable) { setBits(41, 2, 1, Renamable); }

void setRemovable(bool Removable) { setBits(41, 3, 1, Removable); }

kpnUnsubmitted

Not Done

This is the "Associated data ID". It's confusing having both "Ada" and "ADA", but I don't think they're related?

void setReserveQwords(GOFF::ESDReserveQwords Reserve) {

setBits(41, 5, 3, Reserve);

}

void setFillByteValue(uint8_t Fill) { set<uint8_t>(42, Fill); }

void setAdaEsdId(uint32_t EsdId) { set<uint32_t>(44, EsdId); }

void setSortPriority(uint32_t Priority) { set<uint32_t>(48, Priority); }

void setAmode(GOFF::ESDAmode Amode) { set<uint8_t>(60, Amode); }

void setRmode(GOFF::ESDRmode Rmode) { set<uint8_t>(61, Rmode); }

void setTextStyle(GOFF::ESDTextStyle Style) { setBits(62, 0, 4, Style); }

void setBindingAlgorithm(GOFF::ESDBindingAlgorithm Algorithm) {

setBits(62, 4, 4, Algorithm);

}

void setTaskingBehavior(GOFF::ESDTaskingBehavior TaskingBehavior) {

setBits(63, 0, 3, TaskingBehavior);

}

void setReadOnly(bool ReadOnly) {

uint8_t Value = ReadOnly ? 1 : 0;

setBits(63, 4, 1, Value);

}

void setExecutable(GOFF::ESDExecutable Executable) {

setBits(63, 5, 3, Executable);

}

void setDuplicateSeverity(GOFF::ESDDuplicateSymbolSeverity DSS) {

setBits(64, 2, 2, DSS);

}

void setBindingStrength(GOFF::ESDBindingStrength Strength) {

kpnUnsubmitted

Done

No COMMON flag?

yusra.syedaAuthorUnsubmitted

Done

Same with this field.

setBits(64, 4, 4, Strength);

}

void setLoadingBehavior(GOFF::ESDLoadingBehavior Behavior) {

setBits(65, 0, 2, Behavior);

}

void setIndirectReference(bool Indirect) {

uint8_t Value = Indirect ? 1 : 0;

setBits(65, 3, 1, Value);

}

void setBindingScope(GOFF::ESDBindingScope Scope) {

setBits(65, 4, 4, Scope);

}

void setLinkageType(GOFF::ESDLinkageType Type) { setBits(66, 2, 1, Type); }

void setAlignment(GOFF::ESDAlignment Alignment) {

setBits(66, 3, 5, Alignment);

}

void setNameLength(uint16_t Length) { set<uint16_t>(70, Length); }

void setName(const StringRef::const_iterator &Data, uint8_t Length) {

assert(Length <= ESDMaxNameLength && "Data too long for ESD Record");

for (int I = 0; I < Length; ++I)

set<uint8_t>(72 + I, Data[I]);

}

// ESD Get routines.

static void getSymbolType(const uint8_t *Record,

GOFF::ESDSymbolType &SymbolType) {

uint8_t Value;

get<uint8_t>(Record, 3, Value);

SymbolType = (GOFF::ESDSymbolType)Value;

}

static void getEsdId(const uint8_t *Record, uint32_t &EsdId) {

get<uint32_t>(Record, 4, EsdId);

}

static void getParentEsdId(const uint8_t *Record, uint32_t &EsdId) {

get<uint32_t>(Record, 8, EsdId);

}

static void getOffset(const uint8_t *Record, uint32_t &Offset) {

get<uint32_t>(Record, 16, Offset);

}

static void getLength(const uint8_t *Record, uint32_t &Length) {

get<uint32_t>(Record, 24, Length);

}

static void getNameSpaceId(const uint8_t *Record, GOFF::ESDNameSpaceId &Id) {

uint8_t Value;

get<uint8_t>(Record, 40, Value);

Id = (GOFF::ESDNameSpaceId)Value;

}

static void getFillBytePresent(const uint8_t *Record, bool &Present) {

uint8_t Value;

getBits(Record, 41, 0, 1, Value);

Present = (bool)Value;

}

static void getNameMangled(const uint8_t *Record, bool &Mangled) {

uint8_t Value;

getBits(Record, 41, 1, 1, Value);

Mangled = (bool)Value;

}

static void getRenamable(const uint8_t *Record, bool &Renamable) {

uint8_t Value;

getBits(Record, 41, 2, 1, Value);

Renamable = (bool)Value;

}

static void getRemovable(const uint8_t *Record, bool &Removable) {

uint8_t Value;

getBits(Record, 41, 3, 1, Value);

Removable = (bool)Value;

}

static void getFillByteValue(const uint8_t *Record, uint8_t &Fill) {

get<uint8_t>(Record, 42, Fill);

}

static void getAdaEsdId(const uint8_t *Record, uint32_t &EsdId) {

get<uint32_t>(Record, 44, EsdId);

}

static void getSortPriority(const uint8_t *Record, uint32_t &Priority) {

get<uint32_t>(Record, 48, Priority);

}

static void getAmode(const uint8_t *Record, GOFF::ESDAmode &Amode) {

uint8_t Value;

get<uint8_t>(Record, 60, Value);

Amode = (GOFF::ESDAmode)Value;

}

static void getRmode(const uint8_t *Record, GOFF::ESDRmode &Rmode) {

uint8_t Value;

get<uint8_t>(Record, 61, Value);

Rmode = (GOFF::ESDRmode)Value;

}

static void getTextStyle(const uint8_t *Record, GOFF::ESDTextStyle &Style) {

uint8_t Value;

getBits(Record, 62, 0, 4, Value);

Style = (GOFF::ESDTextStyle)Value;

}

static void getBindingAlgorithm(const uint8_t *Record,

GOFF::ESDBindingAlgorithm &Algorithm) {

uint8_t Value;

getBits(Record, 62, 4, 4, Value);

Algorithm = (GOFF::ESDBindingAlgorithm)Value;

}

static void getTaskingBehavior(const uint8_t *Record,

GOFF::ESDTaskingBehavior &TaskingBehavior) {

uint8_t Value;

getBits(Record, 63, 0, 3, Value);

TaskingBehavior = (GOFF::ESDTaskingBehavior)Value;

}

static void getReadOnly(const uint8_t *Record, bool &ReadOnly) {

uint8_t Value;

getBits(Record, 63, 4, 1, Value);

ReadOnly = (bool)Value;

}

static void getExecutable(const uint8_t *Record,

GOFF::ESDExecutable &Executable) {

uint8_t Value;

getBits(Record, 63, 5, 3, Value);

Executable = (GOFF::ESDExecutable)Value;

}

static void getDuplicateSeverity(const uint8_t *Record,

GOFF::ESDDuplicateSymbolSeverity &DSS) {

uint8_t Value;

getBits(Record, 64, 2, 2, Value);

DSS = (GOFF::ESDDuplicateSymbolSeverity)Value;

}

static void getBindingStrength(const uint8_t *Record,

GOFF::ESDBindingStrength &Strength) {

uint8_t Value;

getBits(Record, 64, 4, 4, Value);

Strength = (GOFF::ESDBindingStrength)Value;

}

static void getLoadingBehavior(const uint8_t *Record,

GOFF::ESDLoadingBehavior &Behavior) {

uint8_t Value;

getBits(Record, 65, 0, 2, Value);

Behavior = (GOFF::ESDLoadingBehavior)Value;

}

static void getIndirectReference(const uint8_t *Record, bool &Indirect) {

uint8_t Value;

getBits(Record, 65, 3, 1, Value);

Indirect = (bool)Value;

}

static void getBindingScope(const uint8_t *Record,

GOFF::ESDBindingScope &Scope) {

uint8_t Value;

getBits(Record, 65, 4, 4, Value);

Scope = (GOFF::ESDBindingScope)Value;

}

static void getLinkageType(const uint8_t *Record,

GOFF::ESDLinkageType &Type) {

uint8_t Value;

getBits(Record, 66, 2, 1, Value);

Type = (GOFF::ESDLinkageType)Value;

}

static void getAlignment(const uint8_t *Record,

GOFF::ESDAlignment &Alignment) {

uint8_t Value;

getBits(Record, 66, 3, 5, Value);

Alignment = (GOFF::ESDAlignment)Value;

}

static uint16_t getNameLength(const uint8_t *Record) {

uint16_t Length;

get<uint16_t>(Record, 70, Length);

return Length;

}

};

class RLDRecord : public Record {

public:

/// \brief Maximum length of data; any more must go in another continuation.

static const uint8_t RLDMaxDataLength = 74;

/// \brief Length in bytes of one full relocation entry. We don't pack EsdIds.

static const uint8_t DataEntryLength = 20;

public:

RLDRecord() : Record() { setRecordType(GOFF::RT_RLD); }

void setLength(uint16_t Length) { set<uint16_t>(4, Length); }

void setData(const GOFFRelocationEntry &Ent, uint8_t Index) {

assert(Index < 2 && "Can only store two entries per RLD Record");

uint8_t Offset = 6 + DataEntryLength * Index;

setBits(Offset + 1, 0, 4, Ent.ReferenceType);

setBits(Offset + 1, 4, 4, Ent.ReferentType);

setBits(Offset + 2, 0, 7, Ent.Action);

if (Ent.NoFetchTarget)

setBits(Offset + 2, 7, 1, 1);

else

kpnUnsubmitted

Done

I assume RLD continuation records and relocation compression are coming later.

yusra.syedaAuthorUnsubmitted

Done

Yes, these will be coming in a future patch.

setBits(Offset + 2, 7, 1, 0);

set<uint8_t>(Offset + 4, Ent.TargetLength);

set<uint32_t>(Offset + 8, Ent.REsdId);

set<uint32_t>(Offset + 12, Ent.PEsdId);

set<uint32_t>(Offset + 16, Ent.POffset);

}

// RLD Get routines.

static void getLength(const uint8_t *Record, uint16_t &Length) {

get<uint16_t>(Record, 4, Length);

}

};

class ContinuationRecord : public Record {

public:

/// \brief Maximum length of data; any more must go in another continuation.

static const uint8_t ContinuationMaxDataLength = 77;

public:

ContinuationRecord(GOFF::RecordType Type) : Record() {

setRecordType(Type);

setContinuation(true);

}

void setData(const char *Data, uint8_t Length) {

assert(Length <= ContinuationMaxDataLength &&

"Data too long for Continuation Record");

for (int I = 0; I < Length; ++I)

set<uint8_t>(3 + I, Data[I]);

}

};

class ENDRecord : public Record {

public:

ENDRecord() : Record() {

setRecordType(GOFF::RT_END);

setEntryPointRequestType(GOFF::END_EPR_None); // TODO Always None for now.

}

void setEntryPointRequestType(GOFF::ENDEntryPointRequest Eprt) {

setBits(3, 6, 2, Eprt);

}

kpnUnsubmitted

Done

For completeness this is good. But please don't ever use it.

When the Binder abends I can't tell you how useful it is to use the Unix "dd" command to slice up GOFF until the abend goes away. That's how I've had to shoot down a number of bugs in emitting GOFF. But that technique doesn't work if the END card's record count field is used.

void setEntryAmode(GOFF::ESDAmode Amode) { set<uint8_t>(4, Amode); }

void setEntryEsdId(uint32_t EsdId) { set<uint32_t>(12, EsdId); }

jhendersonUnsubmitted

Done

I'm not sure this is a useful comment. The implementation could change in all sorts of ways and break this function. I don't think it's useful to call out two specific possibilities.

void setRecordCount(uint32_t RecordCount) { set<uint32_t>(8, RecordCount); }

};

} // end namespace object

} // end namespace llvm

#endif

llvm/include/llvm/Object/GOFFObjectFile.h

This file was added.

				//===- GOFF.h - GOFF object file implementation ------------------ C++ --===//
				Lint: clang-format not found in user's PATH; not linting file.
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				jhenderson (Done): Update license.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file declares the GOFFObjectFile class.
				// Record classes and derivatives are also declared and implemented.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_OBJECT_GOFFOBJECTFILE_H
				#define LLVM_OBJECT_GOFFOBJECTFILE_H

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/IndexedMap.h"
				#include "llvm/ADT/Triple.h"
				#include "llvm/BinaryFormat/GOFF.h"
				#include "llvm/MC/SubtargetFeature.h"
				#include "llvm/Object/ObjectFile.h"
				#include "llvm/Support/CharSet.h"
				#include "llvm/Support/Debug.h"
				#include "llvm/Support/Endian.h"
				#include "llvm/Support/raw_ostream.h"

				namespace llvm {

				namespace object {

				class GOFFObjectFile : public ObjectFile {
				IndexedMap<const uint8_t *> EsdPtrs; // Indexed by EsdId.
				SmallVector<const uint8_t *, 256> TextPtrs;
				jhenderson (Done): This class is massive. Does it really need to be all implemented up front? For example, could you limit it to the bare minimum needed for the `ObjectFile` interface, and expand on it gradually in follow-up patches as you add things that need more information?
				SmallVector<const uint8_t *, 256> RldPtrs;
				jhenderson (Done): Delete redundant blank line.

				mutable DenseMap<uint32_t, std::string> EsdNames;

				typedef DataRefImpl SectionEntryImpl;
				// (EDID, 0) code, r/o data section
				// (EDID,PRID) r/w data section
				SmallVector<SectionEntryImpl, 256> SectionList;
				mutable DenseMap<uint32_t, std::string> SectionData;

				SmallVector<uint8_t, GOFF::RecordLength> RelocationData;

				CharSetConverter Converter;

				public:
				Expected<StringRef> getSymbolName(SymbolRef Symbol) const;

				jhenderson (Done): Why is `getSymbolName` still returning a `std::error_code` and not an `Error`?
				jhenderson (Done): Sorry, I should have spotted this earlier. It would be a more common style for this function's signature to return an `Expected` rather than take the `Res` as an argument: `Expected<StringRef> getSymbolName(SymbolRef Symbol) const;` This would match, e.g. `getSectionName` or `getSectionContents` as good examples.
				yusra.syeda (Author, Done): Seems like I missed this. It's been updated.
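The difference between the two signature styles discussed in this thread can be sketched without LLVM. This is a minimal illustration only: `ExpectedLike`, `SymbolTable`, and `getSymbolNameSketch` are hypothetical names, and `std::variant` stands in for `llvm::Expected`, which has a different interface.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_map>
#include <variant>

// Hypothetical stand-in for llvm::Expected<T>: holds either a value or an
// error message. It is not the real LLVM class.
template <typename T> using ExpectedLike = std::variant<T, std::string>;

// Hypothetical symbol table keyed by ESD id, used only for this sketch.
using SymbolTable = std::unordered_map<unsigned, std::string>;

// Return-by-value style: the result and the failure travel together, so the
// caller has no separate out-parameter to forget about.
inline ExpectedLike<std::string_view>
getSymbolNameSketch(const SymbolTable &Table, unsigned EsdId) {
  auto It = Table.find(EsdId);
  if (It == Table.end())
    return std::string("no ESD record with id ") + std::to_string(EsdId);
  return std::string_view(It->second);
}
```

A caller inspects which alternative is held before using the name, which is roughly the role that checking an `Expected` plays in LLVM code.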
				static GOFF::RLDAction getRLDAction(uint64_t RelocationType); // +,-,...
				jhenderson (Done): Why are you using `protected` at all in this class? As far as I can see, there are no sub-classes, so `protected` is no different to `private`, and you should probably prefer the latter.
				jhenderson (Done): Isn't all of this "GOFF specific"? The class is `GOFFObjectFile`...
				static GOFF::RLDFetchStore getRLDFetchStore(uint64_t RelocationType);
				static uint16_t getRLDBitLength(uint64_t RelocationType);
				jhenderson (Done): Why is this returning a `std::error_code` not an `Error`?
				static uint8_t getRLDBitOffset(uint64_t RelocationType);
				static bool
				getRLDSymbolIsIndirect(uint64_t RelocationType); // Function descriptor ref.
				static bool getRLDIsWeak(uint64_t RelocationType); // Weak ref.
				static bool
				getRLDIsCodeAddressReference(uint64_t RelocationType); // Function code ref.

				GOFFObjectFile(MemoryBufferRef Object, Error &Err);
				static inline bool classof(const Binary *V) { return V->isGOFF(); }
				section_iterator section_begin() const override;
				section_iterator section_end() const override;

				uint8_t getBytesInAddress() const override { return 8; }

				StringRef getFileFormatName() const override { return "GOFF-SystemZ"; }

				Triple::ArchType getArch() const override { return Triple::systemz; }

				SubtargetFeatures getFeatures() const override { return SubtargetFeatures(); }

				bool isRelocatableObject() const override { return true; }

				void moveSymbolNext(DataRefImpl &Symb) const override;
				basic_symbol_iterator symbol_begin() const override;
				basic_symbol_iterator symbol_end() const override;

				private:
				// SymbolRef.
				Expected<StringRef> getSymbolName(DataRefImpl Symb) const override;
				Expected<uint64_t> getSymbolAddress(DataRefImpl Symb) const override;
				uint64_t getSymbolValueImpl(DataRefImpl Symb) const override;
				uint64_t getCommonSymbolSizeImpl(DataRefImpl Symb) const override;
				Expected<uint32_t> getSymbolFlags(DataRefImpl Symb) const override;
				Expected<SymbolRef::Type> getSymbolType(DataRefImpl Symb) const override;
				Expected<section_iterator> getSymbolSection(DataRefImpl Symb) const override;

				const uint8_t *getSymbolEsdRecord(DataRefImpl Symb) const;
				bool isSymbolUnresolved(DataRefImpl Symb) const;
				bool isSymbolIndirect(DataRefImpl Symb) const;

				// SectionRef.
				void moveSectionNext(DataRefImpl &Sec) const override{};
				virtual Expected<StringRef> getSectionName(DataRefImpl Sec) const override {
				return StringRef();
				}
				uint64_t getSectionAddress(DataRefImpl Sec) const override { return 0; }
				uint64_t getSectionSize(DataRefImpl Sec) const override { return 0; }
				virtual Expected<ArrayRef<uint8_t>>
				getSectionContents(DataRefImpl Sec) const override {
				jhenderson (Done): Don't use `typedef struct`. This isn't C code - use `struct RelocationIteratorState { ... };`
				return ArrayRef<uint8_t>();
				jhenderson (Done): I'm not sure what's with the weird comment style. What does it add above normal comments?
				}
				uint64_t getSectionIndex(DataRefImpl Sec) const override { return 0; }
				uint64_t getSectionAlignment(DataRefImpl Sec) const override { return 0; }
				bool isSectionCompressed(DataRefImpl Sec) const override { return false; }
				bool isSectionText(DataRefImpl Sec) const override { return false; }
				bool isSectionData(DataRefImpl Sec) const override { return false; }
				bool isSectionBSS(DataRefImpl Sec) const override { return false; }
				bool isSectionVirtual(DataRefImpl Sec) const override { return false; }
				relocation_iterator section_rel_begin(DataRefImpl Sec) const override {
				return relocation_iterator(RelocationRef(Sec, this));
				}
				relocation_iterator section_rel_end(DataRefImpl Sec) const override {
				return relocation_iterator(RelocationRef(Sec, this));
				}

				const uint8_t *getSectionEdEsdRecord(DataRefImpl &Sec) const;
				const uint8_t *getSectionPrEsdRecord(DataRefImpl &Sec) const;
				const uint8_t *getSectionEdEsdRecord(uint32_t SectionIndex) const;
				const uint8_t *getSectionPrEsdRecord(uint32_t SectionIndex) const;

				struct RelocationIteratorState {
				jhenderson (Done): Don't add `_t` to type names.
				// Common output.
				DataRefImpl Sec; // Section containing relocation.
				uint64_t PosOffset; // Offset of relocation in Sec.
				DataRefImpl RefSymb; // Symbol referred to.
				uint64_t RelocationType; // Type of relocation.

				// Fields used when processing relocations from object file.
				uint32_t RelocationDataOffset;
				uint16_t CurrentRldSize;

				uint32_t SectionDefId;
				jhenderson (Done): Try to avoid repeatedly switching between `private`/`public` etc - put all the private stuff together and all the `public` stuff together. `public` stuff tends to be first, since that's the public API, although it's not that important.

				// Optional RLD fields whose value is inherited from a previous RLD item.
				uint32_t OReferenceId;
				uint32_t OPositionId;
				uint64_t OOffset;

				// Fields when processing relocations for manufactured function descriptor
				// section.
				uint32_t FunctionDescriptorIndex;
				bool DoingFuncSymb;
				};

				// RelocationRef.
				void moveRelocationNext(DataRefImpl &Rel) const override{};
				uint64_t getRelocationOffset(DataRefImpl Rel) const override { return 0; }
				symbol_iterator getRelocationSymbol(DataRefImpl Rel) const override {
				RelocationIteratorState *RIS = (RelocationIteratorState *)Rel.p;
				return basic_symbol_iterator(SymbolRef(RIS->RefSymb, this));
				}
				uint64_t getRelocationType(DataRefImpl Rel) const override { return 0; }
				void getRelocationTypeName(DataRefImpl Rel,
				SmallVectorImpl<char> &Result) const override{};

				void getRelocationData();
				};

				} // namespace object

				} // namespace llvm

				#endif

llvm/include/llvm/Object/ObjectFile.h

//===- ObjectFile.h - File format independent object file -------- C++ --===//		//===- ObjectFile.h - File format independent object file -------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 360 Lines • ▼ Show 20 Lines	public:
static Expected<std::unique_ptr<ObjectFile>>		static Expected<std::unique_ptr<ObjectFile>>
createELFObjectFile(MemoryBufferRef Object);		createELFObjectFile(MemoryBufferRef Object);

static Expected<std::unique_ptr<MachOObjectFile>>		static Expected<std::unique_ptr<MachOObjectFile>>
createMachOObjectFile(MemoryBufferRef Object,		createMachOObjectFile(MemoryBufferRef Object,
uint32_t UniversalCputype = 0,		uint32_t UniversalCputype = 0,
uint32_t UniversalIndex = 0);		uint32_t UniversalIndex = 0);

		static Expected<std::unique_ptr<ObjectFile>>
		createGOFFObjectFile(MemoryBufferRef Object);

static Expected<std::unique_ptr<WasmObjectFile>>		static Expected<std::unique_ptr<WasmObjectFile>>
createWasmObjectFile(MemoryBufferRef Object);		createWasmObjectFile(MemoryBufferRef Object);
};		};

// Inline function definitions.		// Inline function definitions.
inline SymbolRef::SymbolRef(DataRefImpl SymbolP, const ObjectFile *Owner)		inline SymbolRef::SymbolRef(DataRefImpl SymbolP, const ObjectFile *Owner)
: BasicSymbolRef(SymbolP, Owner) {}		: BasicSymbolRef(SymbolP, Owner) {}

▲ Show 20 Lines • Show All 207 Lines • Show Last 20 Lines

llvm/lib/BinaryFormat/Magic.cpp

//===- llvm/BinaryFormat/Magic.cpp - File magic identification --- C++ --===//		//===- llvm/BinaryFormat/Magic.cpp - File magic identification --- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	file_magic llvm::identify_magic(StringRef Magic) {
case 'B':		case 'B':
if (startswith(Magic, "BC\xC0\xDE"))		if (startswith(Magic, "BC\xC0\xDE"))
return file_magic::bitcode;		return file_magic::bitcode;
break;		break;
case '!':		case '!':
if (startswith(Magic, "!<arch>\n") || startswith(Magic, "!<thin>\n"))		if (startswith(Magic, "!<arch>\n") || startswith(Magic, "!<thin>\n"))
return file_magic::archive;		return file_magic::archive;
break;		break;
		case 0x03:
		if (startswith(Magic, "\x03\xF0\x00"))
		return file_magic::goff_object;
		break;
case '\177':		case '\177':
if (startswith(Magic, "\177ELF") && Magic.size() >= 18) {		if (startswith(Magic, "\177ELF") && Magic.size() >= 18) {
bool Data2MSB = Magic[5] == 2;		bool Data2MSB = Magic[5] == 2;
unsigned high = Data2MSB ? 16 : 17;		unsigned high = Data2MSB ? 16 : 17;
unsigned low = Data2MSB ? 17 : 16;		unsigned low = Data2MSB ? 17 : 16;
if (Magic[high] == 0) {		if (Magic[high] == 0) {
switch (Magic[low]) {		switch (Magic[low]) {
default:		default:
▲ Show 20 Lines • Show All 140 Lines • Show Last 20 Lines

llvm/lib/Object/Binary.cpp

//===- Binary.cpp - A generic binary file ---------------------------------===//		//===- Binary.cpp - A generic binary file ---------------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	Expected<std::unique_ptr<Binary>> object::createBinary(MemoryBufferRef Buffer,
case file_magic::macho_dynamically_linked_shared_lib:		case file_magic::macho_dynamically_linked_shared_lib:
case file_magic::macho_dynamic_linker:		case file_magic::macho_dynamic_linker:
case file_magic::macho_bundle:		case file_magic::macho_bundle:
case file_magic::macho_dynamically_linked_shared_lib_stub:		case file_magic::macho_dynamically_linked_shared_lib_stub:
case file_magic::macho_dsym_companion:		case file_magic::macho_dsym_companion:
case file_magic::macho_kext_bundle:		case file_magic::macho_kext_bundle:
case file_magic::coff_object:		case file_magic::coff_object:
case file_magic::coff_import_library:		case file_magic::coff_import_library:
		case file_magic::goff_object:
case file_magic::pecoff_executable:		case file_magic::pecoff_executable:
case file_magic::bitcode:		case file_magic::bitcode:
case file_magic::xcoff_object_32:		case file_magic::xcoff_object_32:
case file_magic::xcoff_object_64:		case file_magic::xcoff_object_64:
case file_magic::wasm_object:		case file_magic::wasm_object:
return ObjectFile::createSymbolicFile(Buffer, Type, Context);		return ObjectFile::createSymbolicFile(Buffer, Type, Context);
case file_magic::macho_universal_binary:		case file_magic::macho_universal_binary:
return MachOUniversalBinary::create(Buffer);		return MachOUniversalBinary::create(Buffer);
Show All 34 Lines

llvm/lib/Object/CMakeLists.txt

	add_llvm_component_library(LLVMObject			add_llvm_component_library(LLVMObject
	Archive.cpp			Archive.cpp
	ArchiveWriter.cpp			ArchiveWriter.cpp
	Binary.cpp			Binary.cpp
	COFFImportFile.cpp			COFFImportFile.cpp
	COFFModuleDefinition.cpp			COFFModuleDefinition.cpp
	COFFObjectFile.cpp			COFFObjectFile.cpp
	Decompressor.cpp			Decompressor.cpp
	ELF.cpp			ELF.cpp
	ELFObjectFile.cpp			ELFObjectFile.cpp
	Error.cpp			Error.cpp
				GOFFObjectFile.cpp
	IRObjectFile.cpp			IRObjectFile.cpp
	IRSymtab.cpp			IRSymtab.cpp
	MachOObjectFile.cpp			MachOObjectFile.cpp
	MachOUniversal.cpp			MachOUniversal.cpp
	Minidump.cpp			Minidump.cpp
	ModuleSymbolTable.cpp			ModuleSymbolTable.cpp
	Object.cpp			Object.cpp
	ObjectFile.cpp			ObjectFile.cpp
	Show All 19 Lines

llvm/lib/Object/GOFFObjectFile.cpp

This file was added.

//===- GOFFObjectFile.cpp - GOFF object file implementation -----*- C++ -*-===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// Implementation of the GOFFObjectFile class.

//===----------------------------------------------------------------------===//

#include "llvm/Object/GOFFObjectFile.h"

#include "llvm/BinaryFormat/GOFF.h"

#include "llvm/Object/GOFF.h"

#include "llvm/Support/Debug.h"

#include "llvm/Support/Errc.h"

#include "llvm/Support/raw_ostream.h"

#ifndef DEBUG_TYPE

#define DEBUG_TYPE "goff"

#endif

using namespace llvm;

using namespace object;

#define RLD_INDIRECT_FLAG 0x400000000

#define RLD_CODE_ADDR_FLAG 0x200000000

#define RLD_WEAK_FLAG 0x100000000

Expected<std::unique_ptr<ObjectFile>>

ObjectFile::createGOFFObjectFile(MemoryBufferRef Object) {

Error Err = Error::success();

std::unique_ptr<GOFFObjectFile> Ret(new GOFFObjectFile(Object, Err));

if (Err)

return std::move(Err);

return std::move(Ret);

jhendersonUnsubmitted

Done

Use the fallible constructor idiom here, as described in the LLVM Programmer's Manual. In particular, avoid using std::error_code in new code - prefer just using Expected and Error.
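The shape of the idiom referred to here can be sketched without LLVM: a private constructor reports failure through an error out-parameter, and a static factory only hands back an object when no error was set. This is a sketch under stated assumptions, not the patch's code: `RecordFileSketch`, `ErrorLike`, and `SketchRecordLength` are hypothetical names, and a `std::string` stands in for `llvm::Error`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Stand-in for llvm::Error in this sketch: an empty string means success.
using ErrorLike = std::string;

// Fixed GOFF record size of 80 bytes, as stated in the review.
constexpr std::size_t SketchRecordLength = 80;

class RecordFileSketch {
  std::size_t NumRecords = 0;

  // Private constructor: reports failure through Err instead of throwing.
  RecordFileSketch(std::size_t BufferSize, ErrorLike &Err) {
    if (BufferSize % SketchRecordLength != 0) {
      Err = "buffer size " + std::to_string(BufferSize) +
            " is not a multiple of " + std::to_string(SketchRecordLength);
      return;
    }
    NumRecords = BufferSize / SketchRecordLength;
  }

public:
  // Named constructor: callers see either a valid object or an error.
  static std::unique_ptr<RecordFileSketch> create(std::size_t BufferSize,
                                                  ErrorLike &Err) {
    Err.clear();
    std::unique_ptr<RecordFileSketch> Ret(
        new RecordFileSketch(BufferSize, Err));
    if (!Err.empty())
      return nullptr; // Discard the partially constructed object.
    return Ret;
  }

  std::size_t getNumRecords() const { return NumRecords; }
};
```

In LLVM itself the factory would return an `Expected` and the constructor would take an `Error &`; the sketch only mirrors the control flow.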

}

GOFFObjectFile::GOFFObjectFile(MemoryBufferRef Object, Error &Err)

: ObjectFile(Binary::ID_GOFF, Object),

Converter(cantFail(errorOrToExpected(CharSetConverter::create(

CharSetConverter::CP_IBM1047, CharSetConverter::CP_UTF8)))) {

ErrorAsOutParameter ErrAsOutParam(&Err);

// Object file isn't the right size, bail out early.

if ((Object.getBufferSize() % GOFF::RecordLength) != 0) {

Err = createStringError(

jhendersonUnsubmitted

Done

Be more verbose with your error messages, so that they provide more useful context. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages for full details.

jhendersonUnsubmitted

Done

Please remember to clang-format your changes.

object_error::unexpected_eof,

jhendersonUnsubmitted

Done

SectionEntryImpl DummySection;

- SectionList.emplace_back(DummySection); // dummy entry at index 0

+ SectionList.emplace_back(DummySection); // Dummy entry at index 0.

const uint8_t *End = reinterpret_cast<const uint8_t *>(Data.getBufferEnd());

Comments should start with an upper-case letter.

jhendersonUnsubmitted

Done

As stated before, avoid using error_code if possible. That includes functions like errorCodeToError which just convert from it. Instead, prefer functions like createStringError, which allow you to give more contextual error information to the user.

jhendersonUnsubmitted

Done

Err = createStringError(object_error::unexpected_eof,

- "object file is not the right size.");

+ "object file is not the right size");

return;

What is the right size? What is the size that the object file actually is? Please include context in the message. Also, no trailing full stop in error messages. See https://llvm.org/docs/CodingStandards.html#error-and-warning-messages.

"object file is not the right size. Must be a multiple "

jhendersonUnsubmitted

Done

"object file is not the right size. Must be a multiple "

- "of 80 bytes, but is %d bytes",

+ "of 80 bytes, but is %z bytes",

Object.getBufferSize());

getBufferSize() returns size_t not int. The correct format specifier is %z for size_t.
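One caveat on the specifier: in standard C and C++ `printf`-style formatting, `z` is a length modifier rather than a complete conversion, so a `size_t` argument is normally written as `%zu`. The sketch below assumes the error message ends up in a `printf`-style formatter; `describeBadSize` is a hypothetical helper, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats the size-mismatch message with a printf-style specifier.
// "%zu" pairs the z length modifier with the unsigned conversion, which is
// the standard way to print a size_t.
inline std::string describeBadSize(std::size_t ActualSize) {
  char Buffer[128];
  std::snprintf(Buffer, sizeof(Buffer),
                "object file is not the right size: must be a multiple of 80 "
                "bytes, but is %zu bytes",
                ActualSize);
  return Buffer;
}
```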

"of 80 bytes, but is %z bytes",

Object.getBufferSize());

jhendersonUnsubmitted

Done

As noted earlier - it might be better to use the DataExtractor and Cursor class to make parsing easier.

yusra.syedaAuthorUnsubmitted

Done

The DataExtractor class doesn't seem to be helpful. Its best use is when the data is read sequentially, which is not the case with GOFF.

jhendersonUnsubmitted

Done

You can use DataExtractor with offsets, rather than a Cursor, if the read is jumping around.
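Reading at explicit offsets, as suggested here, might look like the following. It is a standalone stand-in rather than `DataExtractor`'s actual API: `readBigEndian` is a hypothetical helper, and it assumes GOFF fields are stored big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: read a big-endian unsigned integer of type T starting
// at ByteOffset within a record. No cursor is kept, so reads can jump around.
template <typename T>
T readBigEndian(const std::uint8_t *Record, std::size_t ByteOffset) {
  static_assert(std::is_unsigned<T>::value, "expects an unsigned integer type");
  T Value = 0;
  for (std::size_t I = 0; I < sizeof(T); ++I)
    Value = static_cast<T>((Value << 8) | Record[ByteOffset + I]);
  return Value;
}
```

For example, `readBigEndian<std::uint16_t>(Record, 70)` would fetch the two-byte name length that `getNameLength` reads at offset 70 in the patch.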

kpnUnsubmitted

Done

Does this mean the ability to read RECFM=VB GOFF datasets is explicitly being designed to be impossible?

jhendersonUnsubmitted

Done

I'm not sure if this comment is being directed at me or @yusra.syeda. If at me, I don't really understand the question as I don't know the file format.

yusra.syedaAuthorUnsubmitted

Done

The first goal is to get the compiler running in USS, which only supports fixed block length of 80.

kpnUnsubmitted

Done

Hmm, @yusra.syeda, your response is mostly correct, but not entirely.

It's true that the Unix-style filesystem (IBM calls it the "Hierarchical File System", with the first implementation of it being called the "Hierarchical File System" and the second "z/FS") has no record support because Unix doesn't support records in files. Thus the 80-byte requirement on GOFF record sizes. This is true.

But there's no requirement that a program started under USS only access the Unix-style filesystem. There's no requirement that a program started under TSO or in batch only access traditional MVS datasets. Indeed, JCL even has support for Unix paths in DD statements, and TSO probably does as well (but I don't have my book handy).

So the compiler "running in USS" does _not_ mean that we are restricted to 80-byte GOFF records. Granted, disambiguation of Unix paths and MVS dataset names is a problem, but still.

I understand why you would want to leave variable sized records for implementation later. Not implementing support for MVS datasets up front is one thing. Designing your code to make it difficult if not impossible to add later is quite different. For example, random access to variable size record datasets is painful.

Are you at least looking ahead to adding RECFM=V support later?

yusra.syedaAuthorUnsubmitted

Done

@kpn we don't have plans on adding support for variable length GOFF records. The XL compiler supports only 80 byte records and we don't plan to add support further than what exists in the XL compiler.

return;

}

SectionEntryImpl DummySection;

SectionList.emplace_back(DummySection); // Dummy entry at index 0.

const uint8_t *End = reinterpret_cast<const uint8_t *>(Data.getBufferEnd());

for (const uint8_t *I = base(); I < End; I += GOFF::RecordLength) {

uint8_t RecordType = (I[1] & 0xF0) >> 4;

bool IsContinuation = I[1] & 0x02;

// Don't parse continuation records, they've already been handled by

// a previous record parse call.

if (IsContinuation)

continue;

jhendersonUnsubmitted

Not Done

I don't see a test case involving a continuation record. You should have one followed by a non-continuation record, as otherwise this aspect is not tested.
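A test along the requested lines could build a small buffer with a record, its continuation, and a following non-continuation record, then check that only the two primary records are seen. The sketch below is independent of the patch's classes: `collectPrimaryRecordTypes`, `appendRecord`, and `GoffRecordLength` are hypothetical names, and the type values used in any test are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t GoffRecordLength = 80;

// Returns the record type of every non-continuation record, in file order.
// Byte 1 of each record holds the type in its high nibble and the
// continuation flag in bit 0x02, mirroring the loop under review.
inline std::vector<std::uint8_t>
collectPrimaryRecordTypes(const std::vector<std::uint8_t> &Buffer) {
  assert(Buffer.size() % GoffRecordLength == 0 && "whole records only");
  std::vector<std::uint8_t> Types;
  for (std::size_t Off = 0; Off < Buffer.size(); Off += GoffRecordLength) {
    const std::uint8_t Flags = Buffer[Off + 1];
    if (Flags & 0x02) // Continuation of the previous record: skip it.
      continue;
    Types.push_back(static_cast<std::uint8_t>((Flags & 0xF0) >> 4));
  }
  return Types;
}

// Appends one zero-filled record with the given type and continuation flag.
inline void appendRecord(std::vector<std::uint8_t> &Buffer, std::uint8_t Type,
                         bool IsContinuation) {
  std::vector<std::uint8_t> Rec(GoffRecordLength, 0);
  Rec[0] = 0x03; // Record prefix byte, matching the magic check in Magic.cpp.
  Rec[1] = static_cast<std::uint8_t>((Type << 4) |
                                     (IsContinuation ? 0x02 : 0x00));
  Buffer.insert(Buffer.end(), Rec.begin(), Rec.end());
}
```

Appending a primary record, a continuation, and then a second primary record should yield exactly two entries from `collectPrimaryRecordTypes`, which exercises the `continue` path followed by normal parsing.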

for (size_t J = 0; J < GOFF::RecordLength; ++J) {

const uint8_t *P = I + J;

if (J % 8 == 0)

LLVM_DEBUG(dbgs() << " ");

LLVM_DEBUG(dbgs() << format("%02hhX", *P));

}

switch (RecordType) {

case GOFF::RT_ESD: {

// Save ESD record.

uint32_t EsdId;

ESDRecord::getEsdId(I, EsdId);

EsdPtrs.grow(EsdId);

EsdPtrs[EsdId] = I;

// Determine and save the "sections" in GOFF.

jhendersonUnsubmitted

Done

// case (1): (ED,child PR)

- // - where the PR must be have non-zero length.

+ // - where the PR must have non-zero length.

// case (2a) (ED,0)

// A section is saved as a tuple of the form

// case (1): (ED,child PR)

// - where the PR must have non-zero length.

// case (2a) (ED,0)

jhendersonUnsubmitted

Done

// - where the ED is zero length but

- // contains a label (LD)

+ // contains a label (LD).

GOFF::ESDSymbolType SymbolType;

(reminder - comments must end in a full stop)

// - where the ED is of non-zero length.

jhendersonUnsubmitted

Done

// case (2b) (ED,0)

// - where the ED is zero length but

- // but contains a label (LD)

+ // contains a label (LD)

GOFF::ESDSymbolType SymbolType;

// case (2b) (ED,0)

// - where the ED is zero length but

// contains a label (LD).

GOFF::ESDSymbolType SymbolType;

ESDRecord::getSymbolType(I, SymbolType);

SectionEntryImpl Section;

uint32_t Length;

ESDRecord::getLength(I, Length);

if (SymbolType == GOFF::ESD_ST_ElementDefinition) {

// case (2a)

if (Length != 0) {

Section.d.a = EsdId;

SectionList.emplace_back(Section);

}

} else if (SymbolType == GOFF::ESD_ST_PartReference) {

// case (1)

if (Length != 0) {

uint32_t SymEdId;

ESDRecord::getParentEsdId(I, SymEdId);

Section.d.a = SymEdId;

Section.d.b = EsdId;

SectionList.emplace_back(Section);

}

} else if (SymbolType == GOFF::ESD_ST_LabelDefinition) {

// case (2b)

uint32_t SymEdId;

ESDRecord::getParentEsdId(I, SymEdId);

const uint8_t *SymEdRecord = EsdPtrs[SymEdId];

uint32_t EdLength;

ESDRecord::getLength(SymEdRecord, EdLength);

jhendersonUnsubmitted

Done

if (!EdLength) { // [ EDID, PRID ]

- // LD child of a zero length parent ED

- // Add the section ED which was previously ignored

+ // LD child of a zero length parent ED.

+ // Add the section ED which was previously ignored.

Section.d.a = SymEdId;

if (!EdLength) { // [ EDID, PRID ]

// LD child of a zero length parent ED.

// Add the section ED which was previously ignored.

Section.d.a = SymEdId;

SectionList.emplace_back(Section);

}

LLVM_DEBUG(dbgs() << " -- ESD " << EsdId << "\n");

break;

}

case GOFF::RT_TXT:

// Save TXT records.

TextPtrs.emplace_back(I);

LLVM_DEBUG(dbgs() << " -- TXT\n");

jhendersonUnsubmitted

Done

case GOFF::RT_RLD:

- // Save RLD records

+ // Save RLD records.

RldPtrs.emplace_back(I);

break;

case GOFF::RT_RLD:

// Save RLD records.

RldPtrs.emplace_back(I);

LLVM_DEBUG(dbgs() << " -- RLD\n");

jhendersonUnsubmitted

Done

This sounds like it should be an error?

jhendersonUnsubmitted

Done

I think this comment was referring to the "unhandled" bits below. It's been marked as done, but I don't see any response. Could you clarify more why this isn't a hard error and instead such things are being ignored?

yusra.syedaAuthorUnsubmitted

Done

These should not be errors. The GOFF reader ignores the normally less important things.

break;

case GOFF::RT_LEN:

LLVM_DEBUG(dbgs() << " -- LEN (GOFF record type) unhandled\n");

break;

case GOFF::RT_END:

LLVM_DEBUG(dbgs() << " -- END (GOFF record type) unhandled\n");

break;

case GOFF::RT_HDR:

LLVM_DEBUG(dbgs() << " -- HDR (GOFF record type) unhandled\n");

break;

default:

llvm_unreachable("Unknown record type");

}

getRelocationData();

jhendersonUnsubmitted

Done

These sorts of comments don't add anything, in my opinion. Just delete them (the function names describe things sufficiently).

}

const uint8_t *GOFFObjectFile::getSymbolEsdRecord(DataRefImpl Symb) const {

const uint8_t *EsdRecord = EsdPtrs[Symb.d.a];

return EsdRecord;

jhendersonUnsubmitted

Done

Add a blank line between functions.

}

Expected<StringRef> GOFFObjectFile::getSymbolName(DataRefImpl Symb) const {

if (EsdNames.count(Symb.d.a))

jhendersonUnsubmitted

Done

Why convert between the two when you could just use uint8_t throughout?

return EsdNames[Symb.d.a];

const uint8_t *Record = getSymbolEsdRecord(Symb);

jhendersonUnsubmitted

Done

// TODO: This could probably be changed to extract the length of the symbol

- // name and then grab only that many characters but for now this works fine

+ // name and then grab only that many characters but for now this works fine.

uint16_t Continuations = 0;

uint16_t SymbolNameLength = ESDRecord::getNameLength(Record);

jhendersonUnsubmitted

Done

What limits this to being specifically uint16_t in size?

kpnUnsubmitted

Done

I believe the maximum record length even with continuations is 32 KB. I don't know if saving two bytes of stack is worth making people reading the code do a double take, though.

A check to make sure this 32KB limit is not exceeded is needed.
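A minimal sketch of the kind of check discussed here, written in plain C++ rather than against LLVM's `Error` API. The 32 KB figure is taken from kpn's comment and is treated as an assumption; `AssumedMaxNameLength` and `checkNameLength` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumption: the limit is 32 KB, as suggested in the review comment.
constexpr std::uint32_t AssumedMaxNameLength = 32 * 1024;

// Returns true if the length is acceptable. Otherwise returns false and
// fills Err with a message that names both the offending value and the limit.
bool checkNameLength(std::uint32_t NameLength, std::string &Err) {
  if (NameLength <= AssumedMaxNameLength)
    return true;
  Err = "symbol name length " + std::to_string(NameLength) +
        " exceeds maximum " + std::to_string(AssumedMaxNameLength);
  return false;
}
```

Reading the length into a type wider than `uint16_t` before validating it keeps the check meaningful, since a 16-bit value could never exceed the limit by much.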

yusra.syedaAuthorUnsubmitted

Done

Thanks, I will add that check.

assert(SymbolNameLength <= ESDRecord::MaxNameLength);

jhendersonUnsubmitted

Done

while (true) {

- // Is the record continued in the next record

+ // Is the record continued in the next record?

const char *ContinuationByte =

SmallString<256> SymbolName;

// First record.

jhendersonUnsubmitted

Done

Record + (Continuations * GOFF::RecordLength) + 1;

- bool IsContinued = *ContinuationByte & 0x01;

+ const bool IsContinued = *ContinuationByte & 0x01;

if (IsContinued)

const uint8_t *Slice =

Record + (GOFF::RecordLength - ESDRecord::ESDMaxNameLength);

size_t SliceLength =

std::min(SymbolNameLength, (uint16_t)ESDRecord::ESDMaxNameLength);

SymbolName.append(Slice, Slice + SliceLength);

SymbolNameLength -= SliceLength;

Slice += SliceLength;

// Continuation records.

jhendersonUnsubmitted

Done

SymbolNameLength -= SliceLength, Slice += GOFF::PayloadLength) {

- // Slice points to the begin of the new record.

+ // Slice points to the start of the new record.

// Check that this block is a Continuation.

(or "beginning")

begin = verb, beginning/start = nouns

for (; SymbolNameLength > 0;

SymbolNameLength -= SliceLength, Slice += GOFF::PayloadLength) {

jhendersonUnsubmitted

Done

SmallString<256> SymbolName(Record + 72, Record + GOFF::RecordLength);

// This assumes that we always emit 80 byte records even if the rest of the

- // data in a record is nulls

+ // data in a record is nulls.

for (uint16_t I = 0; I < Continuations; ++I) {

What happens if the data is truncated?
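One way to make truncation a reported error rather than an out-of-bounds read is to check every slice against the buffer size before copying. The sketch below is plain C++, not the patch's code: the 80-byte record length and the name offset of 72 come from the code under review, while the 3-byte continuation prefix, `readName`, and its parameters are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t RecordLength = 80;    // From the patch: 80-byte records.
constexpr std::size_t FirstNameOffset = 72; // From the patch: `Record + 72`.
constexpr std::size_t PrefixLength = 3;     // Assumption for this sketch.

// Gathers NameLength bytes starting in the record at RecordOffset and
// continuing into the following records. Returns false and sets Err instead
// of reading past the end of Buf.
bool readName(const std::vector<std::uint8_t> &Buf, std::size_t RecordOffset,
              std::size_t NameLength, std::string &Name, std::string &Err) {
  Name.clear();
  std::size_t Pos = RecordOffset + FirstNameOffset;
  std::size_t Avail = RecordLength - FirstNameOffset;
  while (true) {
    const std::size_t Take = std::min(NameLength, Avail);
    // Bounds check before touching the data.
    if (Pos > Buf.size() || Take > Buf.size() - Pos) {
      Err = "symbol name is truncated";
      return false;
    }
    Name.append(reinterpret_cast<const char *>(Buf.data()) + Pos, Take);
    NameLength -= Take;
    if (NameLength == 0)
      return true;
    // Move to the payload of the next (continuation) record.
    RecordOffset += RecordLength;
    Pos = RecordOffset + PrefixLength;
    Avail = RecordLength - PrefixLength;
  }
}
```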

// Slice points to the start of the new record.

// Check that this block is a Continuation.

assert(Record::isContinuation(Slice) && "Continuation bit must be set");

SliceLength = std::min(SymbolNameLength, (uint16_t)GOFF::PayloadLength);

Slice += GOFF::RecordPrefixLength;

SymbolName.append(Slice, Slice + SliceLength);

}

SmallString<256> SymbolNameConverted;

if (auto EC = Converter.convert(SymbolName, SymbolNameConverted))

return errorCodeToError(EC);

EsdNames[Symb.d.a].assign(SymbolNameConverted.c_str());

return EsdNames[Symb.d.a];

}

Expected<StringRef> GOFFObjectFile::getSymbolName(SymbolRef Symbol) const {

return getSymbolName(Symbol.getRawDataRefImpl());

}

jhendersonUnsubmitted

Done

Here, you'd want Error::success(), but actually, if you switch to using Expected, you'd return *NameOrErr directly.

Expected<uint64_t> GOFFObjectFile::getSymbolAddress(DataRefImpl Symb) const {

uint32_t Offset;

jhendersonUnsubmitted

Done

Expected<StringRef> GOFFObjectFile::getSymbolName(SymbolRef Symbol) const {

- Expected<StringRef> NameOrErr = getSymbolName(Symbol.getRawDataRefImpl());

- if (NameOrErr) {

- return *NameOrErr;

- }

- return NameOrErr.takeError();

+ return getSymbolName(Symbol.getRawDataRefImpl());

}

Expected<uint64_t> GOFFObjectFile::getSymbolAddress(DataRefImpl Symb) const {

Actually, better yet, I think you can simplify this down to a single line as suggested in the edit.

const uint8_t *EsdRecord = getSymbolEsdRecord(Symb);

ESDRecord::getOffset(EsdRecord, Offset);

return static_cast<uint64_t>(Offset);

}

uint64_t GOFFObjectFile::getSymbolValueImpl(DataRefImpl Symb) const {

uint32_t Offset;

const uint8_t *EsdRecord = getSymbolEsdRecord(Symb);

ESDRecord::getOffset(EsdRecord, Offset);

return static_cast<uint64_t>(Offset);

}

uint64_t GOFFObjectFile::getCommonSymbolSizeImpl(DataRefImpl Symb) const {

return 0;

}

bool GOFFObjectFile::isSymbolUnresolved(DataRefImpl Symb) const {

const uint8_t *Record = getSymbolEsdRecord(Symb);

GOFF::ESDSymbolType SymbolType;

ESDRecord::getSymbolType(Record, SymbolType);

if (SymbolType == GOFF::ESD_ST_ExternalReference)

return true;

if (SymbolType == GOFF::ESD_ST_PartReference) {

uint32_t Length;

ESDRecord::getLength(Record, Length);

if (Length == 0)

return true;

}

return false;

}

bool GOFFObjectFile::isSymbolIndirect(DataRefImpl Symb) const {

const uint8_t *Record = getSymbolEsdRecord(Symb);

bool Indirect;

ESDRecord::getIndirectReference(Record, Indirect);

return Indirect;

}

Expected<uint32_t> GOFFObjectFile::getSymbolFlags(DataRefImpl Symb) const {

uint32_t Flags = 0;

if (isSymbolUnresolved(Symb))

Flags |= SymbolRef::SF_Undefined;

const uint8_t *Record = getSymbolEsdRecord(Symb);

GOFF::ESDBindingStrength BindingStrength;

ESDRecord::getBindingStrength(Record, BindingStrength);

if (BindingStrength == GOFF::ESD_BST_Weak)

Flags |= SymbolRef::SF_Weak;

GOFF::ESDBindingScope BindingScope;

ESDRecord::getBindingScope(Record, BindingScope);

if (BindingScope != GOFF::ESD_BSC_Section) {

Expected<StringRef> Name = getSymbolName(Symb);

if (Name && *Name != " ") { // Blank name is local.

Flags |= SymbolRef::SF_Global;

if (BindingScope == GOFF::ESD_BSC_ImportExport)

jhendersonUnsubmitted

Done

Expected<StringRef> Name = getSymbolName(Symb);

- if (Name && !Name->equals(" ")) { // blank name is local

+ if (Name && *Name != " ") { // Blank name is local.

Flags |= SymbolRef::SF_Global;

Do you mean specifically " " means local? What about "" or "  " (i.e. zero or two spaces), etc.?

kpnUnsubmitted

Done

It's " " that's special. My employer's compiler uses the same symbol name because the Binder translates it into a private symbol name that uses characters. In this way multiple private symbols can be disambiguated in a listing after a link.
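The rule kpn describes can be stated as a one-line predicate. This is only an illustrative sketch in plain C++; `isPrivateBlankName` is a hypothetical helper, and the claim that exactly one space is special is taken from the comment above.

```cpp
#include <cassert>
#include <string>

// Per the review comment, only a name consisting of exactly one space is
// treated as the special "blank" (local) name. Empty or longer all-space
// names are ordinary names under this rule.
bool isPrivateBlankName(const std::string &Name) { return Name == " "; }
```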

Flags |= SymbolRef::SF_Exported;

MaskRayUnsubmitted

Done

Prefer `==`/`!=` to `equals`.

else if (!(Flags & SymbolRef::SF_Undefined))

Flags |= SymbolRef::SF_Hidden;

}

return Flags;

}

Expected<SymbolRef::Type>

GOFFObjectFile::getSymbolType(DataRefImpl Symb) const {

const uint8_t *Record = getSymbolEsdRecord(Symb);

GOFF::ESDSymbolType SymbolType;

ESDRecord::getSymbolType(Record, SymbolType);

GOFF::ESDExecutable Executable;

ESDRecord::getExecutable(Record, Executable);

if (SymbolType != GOFF::ESD_ST_SectionDefinition &&

SymbolType != GOFF::ESD_ST_ElementDefinition &&

SymbolType != GOFF::ESD_ST_LabelDefinition &&

SymbolType != GOFF::ESD_ST_PartReference &&

SymbolType != GOFF::ESD_ST_ExternalReference) {

uint32_t EsdId;

ESDRecord::getEsdId(Record, EsdId);

return createStringError(llvm::errc::invalid_argument,

jhendersonUnsubmitted

Done

Could this assertion fire if somebody wrote garbage in their object file's symbol type field? If so, it should be an error, not an assertion. (Use Error/Expected for malformed input and assertions for coder errors within LLVM).

"ESD record %" PRIu32

" has invalid symbol type 0x%" PRIX8,

EsdId, SymbolType);

jhendersonUnsubmitted

Done

std::errc::invalid_argument,

- "symbolType must be SectionDef/ElemDef/LabelDef/PartRef/ExtRef.");

+ "symbolType must be SectionDef/ElemDef/LabelDef/PartRef/ExtRef");

switch (SymbolType) {

What has actually been specified though? Which symbol (if known)?

yusra.syedaAuthorUnsubmitted

Done

These are the only options for the symbolType in the ESDSymbolType enum. If it's not one of these then the type is invalid.

jhendersonUnsubmitted

Done

I don't think you understood what I meant - when I asked this and the similar question below regarding executable type, I meant you should include the additional context, i.e. which symbol had an invalid type (i.e. the index or possibly name) and what that invalid type was, e.g. "symbol 42 has unknown type 0x12". It's important to do this properly because the user's input object might be corrupted in some way, and the code needs to make it easier for that user to find the problem.
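A small sketch of an error message that carries this context, using the fixed-width format macros from `<cinttypes>`. It is plain C++ rather than LLVM's `createStringError`; `checkSymbolType` and `isKnownSymbolType` are hypothetical helpers, and the set of valid type values (0 through 4) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: valid symbol types are the values 0..4.
bool isKnownSymbolType(std::uint8_t Type) { return Type <= 4; }

// Malformed input is reported through Err, not through an assertion, and the
// message says which record was bad and what value it actually held.
bool checkSymbolType(std::uint32_t EsdId, std::uint8_t Type, std::string &Err) {
  if (isKnownSymbolType(Type))
    return true;
  char Buf[96];
  std::snprintf(Buf, sizeof(Buf),
                "ESD record %" PRIu32 " has invalid symbol type 0x%" PRIX8,
                EsdId, Type);
  Err = Buf;
  return false;
}
```

Using `PRIu32` and `PRIX8` avoids casting the arguments to match a hand-picked specifier such as `%lu`.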

yusra.syedaAuthorUnsubmitted

Done

Thanks, done.

}

switch (SymbolType) {

case GOFF::ESD_ST_SectionDefinition:

jhendersonUnsubmitted

Done

return createStringError(llvm::errc::invalid_argument,

- "ESD record %lu has invalid symbol type %02X",

- (unsigned long)EsdId, SymbolType);

+ "ESD record %" PRIu32 " has invalid symbol type %02X",

+ EsdId, SymbolType);

}

switch (SymbolType) {

Rather than casting EsdId, use the correct print format specifier.

jhendersonUnsubmitted

Done

Now that SymbolType is defined to be a uint8_t, you should use that explicitly, i.e. something like "0x%02" PRIX8 (though you could probably simplify and omit the "02" bit, since this is an error message, and getting a fixed-width field isn't really necessary).

case GOFF::ESD_ST_ElementDefinition:

return SymbolRef::ST_Other;

case GOFF::ESD_ST_LabelDefinition:

case GOFF::ESD_ST_PartReference:

case GOFF::ESD_ST_ExternalReference:

if (Executable != GOFF::ESD_EXE_CODE && Executable != GOFF::ESD_EXE_DATA &&

jhendersonUnsubmitted

Done

Same question as above - should this be an error in case of malformed input?

Executable != GOFF::ESD_EXE_Unspecified) {

uint32_t EsdId;

ESDRecord::getEsdId(Record, EsdId);

return createStringError(llvm::errc::invalid_argument,

jhendersonUnsubmitted

Done

return createStringError(std::errc::invalid_argument,

- "executable must be CODE/DATA/Unspecified.");

+ "executable must be CODE/DATA/Unspecified");

switch (Executable) {

What type is it actually?

Also, clang-format.

yusra.syedaAuthorUnsubmitted

Done

These are the only options in the ESDExecutable enum, so if it's not one of these then the type is invalid.

jhendersonUnsubmitted

Done

Same as above - give more useful context in this message, e.g. "executable has unknown type 0x1111".

Also, I think it's more common to omit the std:: prefix from std::errc values (LLVM has its own version of this set, which partly parallels the std one). Please take a look at removing that prefix from all these std::errc instances.

"ESD record %" PRIu32

" has unknown Executable type 0x%02X",

EsdId, Executable);

}

switch (Executable) {

case GOFF::ESD_EXE_CODE:

jhendersonUnsubmitted

Done

Same as above.

return SymbolRef::ST_Function;

case GOFF::ESD_EXE_DATA:

return SymbolRef::ST_Data;

case GOFF::ESD_EXE_Unspecified:

return SymbolRef::ST_Unknown;

}

Expected<section_iterator>

GOFFObjectFile::getSymbolSection(DataRefImpl Symb) const {

DataRefImpl Sec;

MaskRayUnsubmitted

Done

https://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time-through-a-loop

Several instances in the file do not obey the rule.
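For readers unfamiliar with the linked rule, the pattern it asks for looks roughly like the following. This is a generic illustration in plain C++; `countNonNull` is a hypothetical function, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the non-null entries. The container size is evaluated once, in the
// loop's init clause, instead of on every iteration.
std::size_t countNonNull(const std::vector<const int *> &Ptrs) {
  std::size_t Count = 0;
  for (std::size_t I = 0, E = Ptrs.size(); I != E; ++I)
    if (Ptrs[I])
      ++Count;
  return Count;
}
```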

if (isSymbolUnresolved(Symb))

return section_iterator(SectionRef(Sec, this));

const uint8_t *SymEsdRecord = EsdPtrs[Symb.d.a];

uint32_t SymEdId;

ESDRecord::getParentEsdId(SymEsdRecord, SymEdId);

const uint8_t *SymEdRecord = EsdPtrs[SymEdId];

for (size_t I = 0, E = SectionList.size(); I < E; ++I) {

bool Found;

const uint8_t *SectionPrRecord = getSectionPrEsdRecord(I);

if (SectionPrRecord) {

Found = SymEsdRecord == SectionPrRecord;

} else {

const uint8_t *SectionEdRecord = getSectionEdEsdRecord(I);

MaskRayUnsubmitted

Done

llvm_unreachable does not need a return

Found = SymEdRecord == SectionEdRecord;

jhendersonUnsubmitted

Done

Is this really unreachable? What happens if there is no symbol section in the file?

yusra.syedaAuthorUnsubmitted

Done

Yes, this should be unreachable and is added as an extra safety check.

jhendersonUnsubmitted

Done

You didn't answer my question: "What happens if there is no symbol section in the file?"

yusra.syedaAuthorUnsubmitted

Done

This should be unreachable and it should be an error if there is no symbol section. I updated the llvm_unreachable statement to return an Error instead.
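The shape of that change, a lookup that reports "not found" to the caller instead of treating it as unreachable, can be sketched as follows. This is plain C++ under stated assumptions: `findSectionIndex` and the flat list of section ED ids are hypothetical simplifications of the patch's `SectionList`, and the message text follows the wording suggested later in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Searches for the section whose ED id matches SymEdId. A missing section can
// be caused by the input file, so it is reported as an error and not asserted.
bool findSectionIndex(const std::vector<std::uint32_t> &SectionEdIds,
                      std::uint32_t SymEdId, std::size_t &Index,
                      std::string &Err) {
  for (std::size_t I = 0, E = SectionEdIds.size(); I != E; ++I) {
    if (SectionEdIds[I] == SymEdId) {
      Index = I;
      return true;
    }
  }
  Err = "no symbol section found";
  return false;
}
```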

}

if (Found) {

jhendersonUnsubmitted

Done

When I said the following to SymbolRef above:

These sorts of comments don't add anything, in my opinion. Just delete them (the function names describe things sufficiently).

That wasn't referring to just the one comment. Please delete all of these sorts of comments.

Sec.d.a = I;

return section_iterator(SectionRef(Sec, this));

jhendersonUnsubmitted

Done

I think it would be slightly clearer to say `no symbol section found`. "unable to get" sounds like there was an actual problem retrieving the section (e.g. some part of the section data was invalid), whereas "no ... found" is clear that it's simply not there.

}

return createStringError(llvm::errc::invalid_argument,

"no symbol section found");

}

const uint8_t *GOFFObjectFile::getSectionEdEsdRecord(DataRefImpl &Sec) const {

SectionEntryImpl EsdIds = SectionList[Sec.d.a];

const uint8_t *EsdRecord = EsdPtrs[EsdIds.d.a];

return EsdRecord;

}

const uint8_t *GOFFObjectFile::getSectionPrEsdRecord(DataRefImpl &Sec) const {

jhendersonUnsubmitted

Done

Blank line between functions. Same goes below.

SectionEntryImpl EsdIds = SectionList[Sec.d.a];

const uint8_t *EsdRecord = nullptr;

if (EsdIds.d.b)

EsdRecord = EsdPtrs[EsdIds.d.b];

return EsdRecord;

}

const uint8_t *

GOFFObjectFile::getSectionEdEsdRecord(uint32_t SectionIndex) const {

DataRefImpl Sec;

Sec.d.a = SectionIndex;

const uint8_t *EsdRecord = getSectionEdEsdRecord(Sec);

return EsdRecord;

}

const uint8_t *

GOFFObjectFile::getSectionPrEsdRecord(uint32_t SectionIndex) const {

DataRefImpl Sec;

Sec.d.a = SectionIndex;

const uint8_t *EsdRecord = getSectionPrEsdRecord(Sec);

jhendersonUnsubmitted

Done

return;

- // Calculate total length of relocation items from all records

+ // Calculate total length of relocation items from all records.

uint32_t RelocationDataSize = 0;

return EsdRecord;

}

void GOFFObjectFile::getRelocationData() {

jhendersonUnsubmitted

Done

uint16_t RldLengthField;

- for (uint32_t I = 0; I < RldPtrs.size(); I++) {

- const uint8_t *RldRecord = RldPtrs[I];

+ for (const uint8_t *RldRecord : RldPtrs) {

RLDRecord::getLength(RldRecord, RldLengthField);

Use range-based for loop here and below.
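A standalone sketch of the suggested loop shape, summing a length field over a list of record pointers. It is plain C++ and makes assumptions that are not confirmed by the patch: `readLength` is a hypothetical stand-in for `RLDRecord::getLength`, and it assumes the length is a big-endian 16-bit value at bytes 4 and 5 of the record.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RLDRecord::getLength.
// Assumption: big-endian 16-bit length stored at bytes 4..5.
std::uint16_t readLength(const std::uint8_t *Record) {
  return static_cast<std::uint16_t>((Record[4] << 8) | Record[5]);
}

// The range-based loop names the element directly, so no index variable or
// repeated size() call is needed.
std::uint32_t totalLength(const std::vector<const std::uint8_t *> &Records) {
  std::uint32_t Total = 0;
  for (const std::uint8_t *Record : Records)
    Total += readLength(Record);
  return Total;
}
```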

if (RelocationData.size())

return;

// Calculate total length of relocation items from all records.

uint32_t RelocationDataSize = 0;

uint16_t RldLengthField;

for (const uint8_t *RldRecord : RldPtrs) {

jhendersonUnsubmitted

Done

RelocationData.reserve(RelocationDataSize);

- // Populate RelocationData with relocation items from all records

+ // Populate RelocationData with relocation items from all records.

for (uint32_t I = 0; I < RldPtrs.size(); I++) {

RLDRecord::getLength(RldRecord, RldLengthField);

RelocationDataSize += RldLengthField;

}

RelocationData.reserve(RelocationDataSize);

// Populate RelocationData with relocation items from all records.

for (const uint8_t *RldRecord : RldPtrs) {

RLDRecord::getLength(RldRecord, RldLengthField);

jhendersonUnsubmitted

Done

Same comment as earlier. Why not stick to uint8_t everywhere?

uint16_t Remainder = RldLengthField;

jhendersonUnsubmitted

Done

const char *ChrPtr = reinterpret_cast<const char *>(RldRecord);

- RelocationData.append(ChrPtr + 6, AppendLength); // copy from initial record

+ RelocationData.append(ChrPtr + 6, AppendLength); // Copy from initial record.

Remainder -= AppendLength;

jhendersonUnsubmitted

Done

Please address both edits suggested in the previous comment, not just the second one.

uint32_t AppendLength = Remainder < (GOFF::RecordLength - 6)

? Remainder

: (GOFF::RecordLength - 6);

const uint8_t *DataPtr = RldRecord;

RelocationData.append(*(DataPtr + 6),

AppendLength); // Copy from initial record.

jhendersonUnsubmitted

Done

: (GOFF::RecordLength - 3);

- RelocationData.append(ChrPtr + 3, AppendLength); // copy from continuation

+ RelocationData.append(ChrPtr + 3, AppendLength); // Copy from continuation.

Remainder -= AppendLength;

jhendersonUnsubmitted

Done

Ditto.

Remainder -= AppendLength;

DataPtr += GOFF::RecordLength;

while (Remainder > 0) {

AppendLength = Remainder < (GOFF::RecordLength - 3)

? Remainder

: (GOFF::RecordLength - 3);

RelocationData.append(*(DataPtr + 3),

AppendLength); // Copy from continuation.

Remainder -= AppendLength;

DataPtr += GOFF::RecordLength;

}

jhendersonUnsubmitted

Done

Add blank line between functions here and below.

}

// Utility routines which extract details about the relocation from

// the relocation type.

GOFF::RLDFetchStore GOFFObjectFile::getRLDFetchStore(uint64_t RelocationType) {

return (GOFF::RLDFetchStore)(RelocationType & 0x00000001);

}

GOFF::RLDAction GOFFObjectFile::getRLDAction(uint64_t RelocationType) {

return (GOFF::RLDAction)((RelocationType & 0x000000FE) >> 1);

}

uint8_t GOFFObjectFile::getRLDBitOffset(uint64_t RelocationType) {

return ((RelocationType & 0x00070000) >> 16);

}

uint16_t GOFFObjectFile::getRLDBitLength(uint64_t RelocationType) {

return ((RelocationType & 0xFFE00000) >> 21);

MaskRayUnsubmitted

Done

Don't define a variable which is immediately used on the next line and not used in other places.
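As an illustration of this guideline, bit-field helpers can return the masked expression directly instead of naming a temporary first. The masks and shifts below are copied from the code under review; the free-function names `rldBitLength` and `rldBitOffset` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bits 21..31 of the relocation type, returned without an intermediate
// variable.
std::uint16_t rldBitLength(std::uint64_t RelocationType) {
  return static_cast<std::uint16_t>((RelocationType & 0xFFE00000) >> 21);
}

// Bits 16..18 of the relocation type.
std::uint8_t rldBitOffset(std::uint64_t RelocationType) {
  return static_cast<std::uint8_t>((RelocationType & 0x00070000) >> 16);
}
```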

}

bool GOFFObjectFile::getRLDIsCodeAddressReference(uint64_t RelocationType) {

return (RelocationType & RLD_CODE_ADDR_FLAG) != 0;

}

bool GOFFObjectFile::getRLDSymbolIsIndirect(uint64_t RelocationType) {

return (RelocationType & RLD_INDIRECT_FLAG) != 0;

}

bool GOFFObjectFile::getRLDIsWeak(uint64_t RelocationType) {

return (RelocationType & RLD_WEAK_FLAG) != 0;

}

section_iterator GOFFObjectFile::section_begin() const {

DataRefImpl Sec;

moveSectionNext(Sec);

return section_iterator(SectionRef(Sec, this));

jhendersonUnsubmitted

Done

ESDRecord::getSymbolType(EsdRecord, SymbolType);

- // Skip EDs - i.e. section symbols

+ // Skip EDs - i.e. section symbols.

bool IgnoreSpecialGOFFSymbols = true;

}

section_iterator GOFFObjectFile::section_end() const {

DataRefImpl Sec;

return section_iterator(SectionRef(Sec, this));

}

void GOFFObjectFile::moveSymbolNext(DataRefImpl &Symb) const {

for (uint32_t I = Symb.d.a + 1, E = EsdPtrs.size(); I < E; ++I) {

if (EsdPtrs[I]) {

const uint8_t *EsdRecord = EsdPtrs[I];

GOFF::ESDSymbolType SymbolType;

ESDRecord::getSymbolType(EsdRecord, SymbolType);

// Skip EDs - i.e. section symbols.

bool IgnoreSpecialGOFFSymbols = true;

bool SkipSymbol = ((SymbolType == GOFF::ESD_ST_ElementDefinition) ||

(SymbolType == GOFF::ESD_ST_SectionDefinition)) &&

IgnoreSpecialGOFFSymbols;

if (!SkipSymbol) {

Symb.d.a = I;

return;

}

Symb.d.a = 0;

}

basic_symbol_iterator GOFFObjectFile::symbol_begin() const {

DataRefImpl Symb;

moveSymbolNext(Symb);

return basic_symbol_iterator(SymbolRef(Symb, this));

}

basic_symbol_iterator GOFFObjectFile::symbol_end() const {

DataRefImpl Symb;

return basic_symbol_iterator(SymbolRef(Symb, this));

}

llvm/lib/Object/ObjectFile.cpp

//===- ObjectFile.cpp - File format independent object file ---------------===//		//===- ObjectFile.cpp - File format independent object file ---------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 112 Lines • ▼ Show 20 Lines	Triple ObjectFile::makeTriple() const {
} else if (isCOFF()) {		} else if (isCOFF()) {
const auto COFFObj = cast<COFFObjectFile>(this);		const auto COFFObj = cast<COFFObjectFile>(this);
if (COFFObj->getArch() == Triple::thumb)		if (COFFObj->getArch() == Triple::thumb)
TheTriple.setTriple("thumbv7-windows");		TheTriple.setTriple("thumbv7-windows");
} else if (isXCOFF()) {		} else if (isXCOFF()) {
// XCOFF implies AIX.		// XCOFF implies AIX.
TheTriple.setOS(Triple::AIX);		TheTriple.setOS(Triple::AIX);
TheTriple.setObjectFormat(Triple::XCOFF);		TheTriple.setObjectFormat(Triple::XCOFF);
		} else if (isGOFF()) {
		TheTriple.setOS(Triple::ZOS);
		TheTriple.setObjectFormat(Triple::GOFF);
}		}

return TheTriple;		return TheTriple;
}		}

Expected<std::unique_ptr<ObjectFile>>		Expected<std::unique_ptr<ObjectFile>>
ObjectFile::createObjectFile(MemoryBufferRef Object, file_magic Type) {		ObjectFile::createObjectFile(MemoryBufferRef Object, file_magic Type) {
StringRef Data = Object.getBuffer();		StringRef Data = Object.getBuffer();
Show All 25 Lines	ObjectFile::createObjectFile(MemoryBufferRef Object, file_magic Type) {
case file_magic::macho_preload_executable:		case file_magic::macho_preload_executable:
case file_magic::macho_dynamically_linked_shared_lib:		case file_magic::macho_dynamically_linked_shared_lib:
case file_magic::macho_dynamic_linker:		case file_magic::macho_dynamic_linker:
case file_magic::macho_bundle:		case file_magic::macho_bundle:
case file_magic::macho_dynamically_linked_shared_lib_stub:		case file_magic::macho_dynamically_linked_shared_lib_stub:
case file_magic::macho_dsym_companion:		case file_magic::macho_dsym_companion:
case file_magic::macho_kext_bundle:		case file_magic::macho_kext_bundle:
return createMachOObjectFile(Object);		return createMachOObjectFile(Object);
		case file_magic::goff_object:
		return createGOFFObjectFile(Object);
case file_magic::coff_object:		case file_magic::coff_object:
case file_magic::coff_import_library:		case file_magic::coff_import_library:
case file_magic::pecoff_executable:		case file_magic::pecoff_executable:
return createCOFFObjectFile(Object);		return createCOFFObjectFile(Object);
case file_magic::xcoff_object_32:		case file_magic::xcoff_object_32:
return createXCOFFObjectFile(Object, Binary::ID_XCOFF32);		return createXCOFFObjectFile(Object, Binary::ID_XCOFF32);
case file_magic::xcoff_object_64:		case file_magic::xcoff_object_64:
return createXCOFFObjectFile(Object, Binary::ID_XCOFF64);		return createXCOFFObjectFile(Object, Binary::ID_XCOFF64);
Show All 22 Lines

llvm/lib/Object/SymbolicFile.cpp

//===- SymbolicFile.cpp - Interface that only provides symbols ------------===//		//===- SymbolicFile.cpp - Interface that only provides symbols ------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines	SymbolicFile::createSymbolicFile(MemoryBufferRef Object, file_magic Type,
case file_magic::macho_preload_executable:		case file_magic::macho_preload_executable:
case file_magic::macho_dynamically_linked_shared_lib:		case file_magic::macho_dynamically_linked_shared_lib:
case file_magic::macho_dynamic_linker:		case file_magic::macho_dynamic_linker:
case file_magic::macho_bundle:		case file_magic::macho_bundle:
case file_magic::macho_dynamically_linked_shared_lib_stub:		case file_magic::macho_dynamically_linked_shared_lib_stub:
case file_magic::macho_dsym_companion:		case file_magic::macho_dsym_companion:
case file_magic::macho_kext_bundle:		case file_magic::macho_kext_bundle:
case file_magic::pecoff_executable:		case file_magic::pecoff_executable:
		case file_magic::goff_object:
case file_magic::xcoff_object_32:		case file_magic::xcoff_object_32:
case file_magic::xcoff_object_64:		case file_magic::xcoff_object_64:
case file_magic::wasm_object:		case file_magic::wasm_object:
return ObjectFile::createObjectFile(Object, Type);		return ObjectFile::createObjectFile(Object, Type);
case file_magic::coff_import_library:		case file_magic::coff_import_library:
return std::unique_ptr<SymbolicFile>(new COFFImportFile(Object));		return std::unique_ptr<SymbolicFile>(new COFFImportFile(Object));
case file_magic::elf_relocatable:		case file_magic::elf_relocatable:
case file_magic::macho_object:		case file_magic::macho_object:
▲ Show 20 Lines • Show All 53 Lines • Show Last 20 Lines

llvm/unittests/BinaryFormat/TestFileMagic.cpp

	//===- llvm/unittest/BinaryFormat/TestFileMagic.cpp - File magic tests ----===//			//===- llvm/unittest/BinaryFormat/TestFileMagic.cpp - File magic tests ----===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	Show All 40 Lines
	const char bitcode[] = "\xde\xc0\x17\x0b";			const char bitcode[] = "\xde\xc0\x17\x0b";
	const char coff_object[] = "\x00\x00......";			const char coff_object[] = "\x00\x00......";
	const char coff_bigobj[] =			const char coff_bigobj[] =
	"\x00\x00\xff\xff\x00\x02......"			"\x00\x00\xff\xff\x00\x02......"
	"\xc7\xa1\xba\xd1\xee\xba\xa9\x4b\xaf\x20\xfa\xf6\x6a\xa4\xdc\xb8";			"\xc7\xa1\xba\xd1\xee\xba\xa9\x4b\xaf\x20\xfa\xf6\x6a\xa4\xdc\xb8";
	const char coff_import_library[] = "\x00\x00\xff\xff....";			const char coff_import_library[] = "\x00\x00\xff\xff....";
	const char elf_relocatable[] = {0x7f, 'E', 'L', 'F', 1, 2, 1, 0, 0,			const char elf_relocatable[] = {0x7f, 'E', 'L', 'F', 1, 2, 1, 0, 0,
	0, 0, 0, 0, 0, 0, 0, 0, 1};			0, 0, 0, 0, 0, 0, 0, 0, 1};
				const char goff_object[] = "\x03\xF0\x00";
	const char macho_universal_binary[] = "\xca\xfe\xba\xbe...\x00";			const char macho_universal_binary[] = "\xca\xfe\xba\xbe...\x00";
	const char macho_object[] =			const char macho_object[] =
	"\xfe\xed\xfa\xce........\x00\x00\x00\x01............";			"\xfe\xed\xfa\xce........\x00\x00\x00\x01............";
	const char macho_executable[] =			const char macho_executable[] =
	"\xfe\xed\xfa\xce........\x00\x00\x00\x02............";			"\xfe\xed\xfa\xce........\x00\x00\x00\x02............";
	const char macho_fixed_virtual_memory_shared_lib[] =			const char macho_fixed_virtual_memory_shared_lib[] =
	"\xfe\xed\xfa\xce........\x00\x00\x00\x03............";			"\xfe\xed\xfa\xce........\x00\x00\x00\x03............";
	const char macho_core[] =			const char macho_core[] =
	Show All 30 Lines
	#define DEFINE(magic) {#magic, magic, sizeof(magic), file_magic::magic}			#define DEFINE(magic) {#magic, magic, sizeof(magic), file_magic::magic}
	DEFINE(archive),			DEFINE(archive),
	DEFINE(bitcode),			DEFINE(bitcode),
	DEFINE(coff_object),			DEFINE(coff_object),
	{"coff_bigobj", coff_bigobj, sizeof(coff_bigobj),			{"coff_bigobj", coff_bigobj, sizeof(coff_bigobj),
	file_magic::coff_object},			file_magic::coff_object},
	DEFINE(coff_import_library),			DEFINE(coff_import_library),
	DEFINE(elf_relocatable),			DEFINE(elf_relocatable),
				DEFINE(goff_object),
	DEFINE(macho_universal_binary),			DEFINE(macho_universal_binary),
	DEFINE(macho_object),			DEFINE(macho_object),
	DEFINE(macho_executable),			DEFINE(macho_executable),
	DEFINE(macho_fixed_virtual_memory_shared_lib),			DEFINE(macho_fixed_virtual_memory_shared_lib),
	DEFINE(macho_core),			DEFINE(macho_core),
	DEFINE(macho_preload_executable),			DEFINE(macho_preload_executable),
	DEFINE(macho_dynamically_linked_shared_lib),			DEFINE(macho_dynamically_linked_shared_lib),
	DEFINE(macho_dynamic_linker),			DEFINE(macho_dynamic_linker),
	Show All 29 Lines

llvm/unittests/Object/CMakeLists.txt

	set(LLVM_LINK_COMPONENTS			set(LLVM_LINK_COMPONENTS
	BinaryFormat			BinaryFormat
	Object			Object
	)			)

	add_llvm_unittest(ObjectTests			add_llvm_unittest(ObjectTests
	ArchiveTest.cpp			ArchiveTest.cpp
	ELFObjectFileTest.cpp			ELFObjectFileTest.cpp
	ELFTypesTest.cpp			ELFTypesTest.cpp
	ELFTest.cpp			ELFTest.cpp
				GOFFObjectFileTest.cpp
	MinidumpTest.cpp			MinidumpTest.cpp
	ObjectFileTest.cpp			ObjectFileTest.cpp
	SymbolSizeTest.cpp			SymbolSizeTest.cpp
	SymbolicFileTest.cpp			SymbolicFileTest.cpp
	XCOFFObjectFileTest.cpp			XCOFFObjectFileTest.cpp
	)			)

	target_link_libraries(ObjectTests PRIVATE LLVMTestingSupport)			target_link_libraries(ObjectTests PRIVATE LLVMTestingSupport)

llvm/unittests/Object/GOFFObjectFileTest.cpp

This file was added.

				//===- GOFFObjectFileTest.cpp - Tests for GOFFObjectFile ------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Object/GOFFObjectFile.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include "llvm/Testing/Support/Error.h"
				#include "gtest/gtest.h"

				using namespace llvm;
				using namespace llvm::object;
				using namespace llvm::GOFF;

				namespace {
				char GOFFData[GOFF::RecordLength * 3] = {0x00};

				void initializeGOFFData() {
				jhenderson (Not Done): This and the other functions below are only used in one place, if I'm not mistaken. As such, just inline them - splitting them off makes it harder to follow what the individual tests are doing, since you have to jump around the file.
				// HDR record.
				GOFFData[0] = 0x03;
				GOFFData[1] = 0xF0;

				// ESD record.
				GOFFData[GOFF::RecordLength] = 0x03;
				GOFFData[GOFF::RecordLength + 3] = 0x02;
				GOFFData[GOFF::RecordLength + 7] = 0x01;
				GOFFData[GOFF::RecordLength + 11] = 0x01;
				GOFFData[GOFF::RecordLength + 71] = 0x05; // Size of symbol name.
				GOFFData[GOFF::RecordLength + 72] = 0xC8; // Symbol name is Hello.
				GOFFData[GOFF::RecordLength + 73] = 0x85;
				GOFFData[GOFF::RecordLength + 74] = 0x93;
				GOFFData[GOFF::RecordLength + 75] = 0x93;
				GOFFData[GOFF::RecordLength + 76] = 0x96;

				// END record.
				GOFFData[GOFF::RecordLength * 2] = 0x03;
				GOFFData[GOFF::RecordLength * 2 + 1] = 0x40;
				}

				void constructValidGOFF() {
				StringRef ValidSize(GOFFData, 80);
				jhenderson (Not Done): The valid size can be any multiple of 80 bytes. I'd recommend a second test-case that uses a size of something other than 80 bytes, e.g. 160 bytes. What about 0 bytes? That probably needs a specific test case, as that is a multiple of 80...
				Expected<std::unique_ptr<ObjectFile>> GOFFObjOrErr =
				object::ObjectFile::createGOFFObjectFile(
				MemoryBufferRef(ValidSize, "dummyGOFF"));

				ASSERT_THAT_EXPECTED(GOFFObjOrErr, Succeeded());
				}

				void constructInvalidGOFF() {
				// Construct GOFFObject with record of length != 80.
				jhenderson (Not Done): According to the code, it needs to be a multiple of 80 bytes, so this comment isn't quite correct (it implies 160 is not a valid size).
				StringRef InvalidData(GOFFData, 70);
				jhenderson (Not Done): Test the edge cases e.g. 79 and/or 81.
				Expected<std::unique_ptr<ObjectFile>> GOFFObjOrErr =
				object::ObjectFile::createGOFFObjectFile(
				MemoryBufferRef(InvalidData, "dummyGOFF"));

				ASSERT_THAT_EXPECTED(GOFFObjOrErr, Failed());
				jhenderson (Not Done): Rather than `Failed()`, use `FailedWithMessage()`, so that you can check the error message output.
				}

				void getSymbolName() {
				initializeGOFFData();
				StringRef Data(GOFFData, GOFF::RecordLength * 3);

				Expected<std::unique_ptr<ObjectFile>> GOFFObjOrErr =
				object::ObjectFile::createGOFFObjectFile(
				MemoryBufferRef(Data, "dummyGOFF"));

				ASSERT_THAT_EXPECTED(GOFFObjOrErr, Succeeded());

				GOFFObjectFile *GOFFObj = dyn_cast<GOFFObjectFile>((*GOFFObjOrErr).get());

				auto Symbols = GOFFObj->symbols();
				jhenderson (Not Done): This is only used once - just inline it.
				for (const SymbolRef &Symbol : Symbols) {
				jhenderson (Not Done): I suspect, given the name, that the `SymbolRef` type is very small already (in the same manner as `StringRef`), and there's no real benefit in making this a `const &`.
				Expected<StringRef> SymbolNameOrErr = GOFFObj->getSymbolName(Symbol);
				jhenderson (Done): clang-format this file
				ASSERT_THAT_EXPECTED(SymbolNameOrErr, Succeeded());
				StringRef SymbolName = SymbolNameOrErr.get();

				ASSERT_EQ(SymbolName.empty(), false);
				EXPECT_STREQ(SymbolName.data(), "Hello");
				jhenderson (Not Done): You should just be able to do `EXPECT_EQ(SymbolName, "Hello");` here.
				}
				}

				} // namespace

				TEST(GOFFObjectFileTest, ConstructValidGOFFObject) { constructValidGOFF(); }

				TEST(GOFFObjectFileTest, ConstructGOFFObjectInvalid) { constructInvalidGOFF(); }

				TEST(GOFFObjectFileTest, GetSymbolName) { getSymbolName(); }
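
Editorial sketch (not part of the patch): the test data above relies on two facts, namely that a buffer is only accepted when its size is a multiple of the 80-byte record length, and that the symbol name bytes 0xC8 0x85 0x93 0x93 0x96 spell "Hello" in EBCDIC. The standalone C++ below illustrates both without depending on LLVM. The names `RecordLength`, `isWholeNumberOfRecords`, `ebcdicLetterToAscii` and `decodeEbcdicLetters` are hypothetical helpers introduced here for illustration; they are assumptions and do not come from the patch. The size helper is pure arithmetic, so it reports 0 as a multiple of 80. Whether the real `createGOFFObjectFile` accepts an empty buffer is the open question raised in the review and is not answered here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: mirrors GOFF::RecordLength from the patch (80-byte records).
constexpr std::size_t RecordLength = 80;

// Hypothetical helper: arithmetic check only. It says nothing about whether
// the real object file constructor accepts a zero-length buffer.
constexpr bool isWholeNumberOfRecords(std::size_t Size) {
  return Size % RecordLength == 0;
}

// Hypothetical helper: map an EBCDIC letter (A-Z, a-z) to ASCII.
// Any other byte is returned as '?'. EBCDIC letters sit in three
// non-contiguous runs per case, hence the separate ranges.
char ebcdicLetterToAscii(std::uint8_t C) {
  if (C >= 0x81 && C <= 0x89) return static_cast<char>('a' + (C - 0x81));
  if (C >= 0x91 && C <= 0x99) return static_cast<char>('j' + (C - 0x91));
  if (C >= 0xA2 && C <= 0xA9) return static_cast<char>('s' + (C - 0xA2));
  if (C >= 0xC1 && C <= 0xC9) return static_cast<char>('A' + (C - 0xC1));
  if (C >= 0xD1 && C <= 0xD9) return static_cast<char>('J' + (C - 0xD1));
  if (C >= 0xE2 && C <= 0xE9) return static_cast<char>('S' + (C - 0xE2));
  return '?';
}

// Hypothetical helper: decode Len EBCDIC bytes starting at Data.
std::string decodeEbcdicLetters(const std::uint8_t *Data, std::size_t Len) {
  std::string Out;
  Out.reserve(Len);
  for (std::size_t I = 0; I < Len; ++I)
    Out.push_back(ebcdicLetterToAscii(Data[I]));
  return Out;
}
```

With these helpers, sizes 80 and 160 pass the check while 70, 79 and 81 do not, which matches the edge cases suggested in the inline comments, and decoding the five name bytes written by `initializeGOFFData()` yields the string the `GetSymbolName` test expects.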


Diff 327254
