Page MenuHomePhabricator

[lld] Enable a watermark of loadable sections to be generated and placed in a note section
AbandonedPublic

Authored by chrisjackson on Aug 19 2019, 9:56 AM.

Details

Summary

Add a '--watermark' flag to lld that enables an xxhash of loadable sections to be placed in a note section, 'note.llvm.watermark'. Then we can determine if any loadable sections have been modified since the ELF was linked.

We have selected xxhash to minimise the overhead of calculating the watermark. The functionality already provided for the GNU build Id has been reused where possible. The hash value provided by the watermark must be unaffected by stripping of debug data or symbols. As buildId hashes the entire ELF, it is not suitable.

By ensuring loadable sections have not changed since link-time, we can have confidence that they are compliant with the system ABI. This helps to ensure that changes to system software will not unexpectedly cause the ELF to execute incorrectly. If additional tooling is being used to modify the ELF this would indicate functionality that is lacking in our toolchain and is desired by users.

Diff Detail

Event Timeline

chrisjackson created this revision.Aug 19 2019, 9:56 AM
davidb added a subscriber: davidb.Aug 19 2019, 10:09 AM
chrisjackson edited the summary of this revision. (Show Details)Aug 20 2019, 2:37 AM
chrisjackson added a reviewer: ruiu.
ruiu added a comment.Aug 20 2019, 2:56 AM

IIRC lld's --build-id={md5,sha1} was slow at first but after we made them a tree-hash to utilize mutli-cores, its cost became negligible. Have you considered taking that approach? This watermark hash is probably not a thing that people attack by the collision attack, so it might not have to be a cryptographically-safe hash, but still cryptographically-safe hash function has nice properties compared to non-crypto ones.

I suggest adding a dumper to llvm-readobj and then adding a test to test/ELF/partition-notes.s

lld/ELF/Driver.cpp
969

See args.hasFlag above.

lld/ELF/SyntheticSections.cpp
339

Delete lld::elf::.

344

5 -> 8, otherwise it is incorrect to use watermarkBuf = buf + 20;

351

Delete lld::elf::

Delete llvm::

lld/ELF/Writer.cpp
624

Backfill .note.llvm.watermark section content. This is similar to .note.gnu.build-id.

2764

inline the only use of the variable

2767

Delete parts

2771

if (first >= last) might be better (WriteHash asserts first < last though I haven't found a case where first can be equal to last)

lld/test/ELF/watermark.s
2

generated placed?

6

-triple=x86_64 %s -o %t.o (this is generic, not Linux specific). Use .o for object files.

15

Consider llvm-readelf -S. Its output is concise.

16

llvm-readelf -x .note.llvm.watermark (Prefer llvm-readelf -x over llvm-objdump -s`)

24

[048C]

MaskRay added a comment.EditedAug 20 2019, 8:45 AM

Another problem is that .note.gnu.build-id is SHF_ALLOC and included in a PT_LOAD segment. When computing watermark, you probably don't want to include its contents. So you should compute build-id and watermark first before you write.

By ensuring loadable sections have not changed since link-time, we can have confidence that they are compliant with the system ABI. This helps to ensure that changes to system software will not unexpectedly cause the ELF to execute incorrectly.

I think I want to hear a bit more about the motivation and how you'd use this feature. If this is used to measure ABI compliance, isn't it too strict? Different optimizations, adding/deleting .note* sections, shuffling .dynsym contents, linker -O1/-O2 (affecting SHF_MERGE sections), etc will change the watermark. Basically you can only do non-PT_LOAD-influencing things like upgrading linkers (lld has an embedded string in .comment), --discard-locals, --strip-debug, etc. I don't see how those things can help you share resources between two links. Those things can also be easily done with a post-processing tool (llvm-objcopy?) Basically, linkers do a lossy transformation. If a feature can be implemented without being affected by the lossy transformation, we should think carefully if it is really the business of the linker.

grimar added a subscriber: grimar.Aug 20 2019, 11:57 PM

Another problem is that .note.gnu.build-id is SHF_ALLOC and included in a PT_LOAD segment. When computing watermark, you probably don't want to include its contents. So you should compute build-id and watermark first before you write.

What target are you looking at? I thought generally note sections aren't allocated and are usually safe to remove. If they were included in a segment, then that would be a PT_NOTE segment that isn't nested in a PT_LOAD segment.

Another problem is that .note.gnu.build-id is SHF_ALLOC and included in a PT_LOAD segment. When computing watermark, you probably don't want to include its contents. So you should compute build-id and watermark first before you write.

What target are you looking at? I thought generally note sections aren't allocated and are usually safe to remove. If they were included in a segment, then that would be a PT_NOTE segment that isn't nested in a PT_LOAD segment.

Actually nevermind. I've just found a few example of scripts placing the build-id into text so the value can be accessed at runtime.

MaskRay added a comment.EditedAug 21 2019, 4:47 AM

Another problem is that .note.gnu.build-id is SHF_ALLOC and included in a PT_LOAD segment. When computing watermark, you probably don't want to include its contents. So you should compute build-id and watermark first before you write.

What target are you looking at? I thought generally note sections aren't allocated and are usually safe to remove. If they were included in a segment, then that would be a PT_NOTE segment that isn't nested in a PT_LOAD segment.

Actually nevermind. I've just found a few example of scripts placing the build-id into text so the value can be accessed at runtime.

Unless you mark a SHF_ALLOC output section as NOLOAD or discard it, it should be included in at least one PT_LOAD segment. ELF spec says: "SHF_ALLOC - The section occupies memory during process execution."
SHT_NOTE sections are usually SHF_ALLOC. This is because 1) many are inspected at runtime 2) many are expected to be dumped to core (this applies to .note.ABI-tag .note.gnu.build-id .note.tag ...). Non-SHF_ALLOC SHT_NOTE section exist in the wild, but they are rare, e.g. GHC uses .debug-ghc-link-info (compiler/main/SysTools/ExtraObj.hs). (May be more common with https://fedoraproject.org/wiki/Toolchain/Watermark .gnu.build.attributes) The computation of .note.llvm.watermark cannot depend on the contents of .note.gnu.build-id so I suggested computing both before writing the contents.

I hope some of you can answer my motivation/justification question above. Whether or not this change is justified to be made into LLD, I think you'll need a llvm-readobj change to dump the note. Also, as I noted above, the llvm-readobj change will help testing the linker feature. You can add the include/llvm/BinaryFormat/ELF.h change to that llvm-readobj patch. It may be worth a llvm-objcopy change, too. One justification is that this watermark can be used to verify llvm-objcopy does not break things (does not alter SHF_ALLOC sections).

MaskRay added a subscriber: peter.smith.

My first reaction is that this seems to be quite a bit of a platform specific feature to build into the linker, it could also make the platform dependent on LLD if this didn't also get into binutils or other ELF linkers. An alternative approach which I believe has been used in other platforms before (for example https://people.freebsd.org/~tmm/elfcksum.c) is to reserve some empty space in the binary that an external tool can post-process to write any checksum/hash etc that you want into it. This is not as convenient but would be compatible with other linkers and not require a LLVM specific extension to ELF.

If this were to go in I think you'd also need to update:

Thanks for the detailed explanation, mostly tally's up with what I've just read. One thing

Non-SHF_ALLOC SHT_NOTE section exist in the wild, but they are rare

I don't think this case should be considered rare (I actually thought the opposite case was rare...). The ELF spec describes SHT_NOTE section to not have any defining attributes (SHF_ALLOC), and most embedded/memory critical systems won't want notes in memory, so I think this is very much target dependent.

Whether or not this change is justified to be made into LLD, I think you'll need a llvm-readobj change to dump the note.

Namely, please add another case here: https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-readobj/ELFDumper.cpp#L4495 (and similarly for LLVMStyle::printNotes). I imagine you can repurpose the getGNUBuildId method.

It may be worth a llvm-objcopy change, too. One justification is that this watermark can be used to verify llvm-objcopy does not break things (does not alter SHF_ALLOC sections).

Interesting idea. What patch are you suggesting here?

lld/ELF/SyntheticSections.cpp
344

http://www.sco.com/developers/gabi/1998-04-29/ch5.pheader.html#note_section
I think 5 is the correct value for namesz; padding exists in the note but is not included in the value of namesz

345

Content -> Descriptor

edd added a comment.Aug 22 2019, 8:51 AM

Hi all,

Chris is out of the office until the 28th. I'm sure he'll do what he can to address the technical concerns and gaps raised so far upon his return, but I'll try to field some of the questions regarding intent until then.

I think I want to hear a bit more about the motivation and how you'd use this feature.

We are primarily concerned about the situation where a studio produces a game for PlayStation that depends on an incidental detail of the OS. The scenario we must avoid whenever possible is an update to the OS interfering with the intended operation of a game. Fixing that is a costly process. We put a lot of effort in to making sure our tools do what they can to avoid this situation.

So given an ELF with a watermark and the ability to recalculate the watermark (perhaps with llvm-readelf), we can detect when additional transformations were applied post-link. Such transformations may break some of the invariants that we have been careful to establish and maintain in our supported tooling and workflows. There are many ways in which our users could accidentally introduce fragility, so it isn't water-tight or all-encompassing, but detecting this one situation is nevertheless useful to us as we may decide to explore what transformations were applied to seek (mutual) reassurance or identify a gap in our SDK offering.

IIRC lld's --build-id={md5,sha1} was slow at first but after we made them a tree-hash to utilize mutli-cores, its cost became negligible. Have you considered taking that approach? This watermark hash is probably not a thing that people attack by the collision attack, so it might not have to be a cryptographically-safe hash, but still cryptographically-safe hash function has nice properties compared to non-crypto ones.

We don't really need any kind of cryptographic guarantees as the watermark is not intended to be part of a strict gating process. We probably would have considered crc32 if it was already available, but our experiments have shown that xxHash adds negligible overhead. Actually, md5 or sha1 may also be fine but we had no need to explore them given the existence of xxHash.

What really is important is that we only have the content of PT_LOADs contribute to the watermark. This is because we would like to be able to recalculate the watermark post-link via a tool and get the same value back, even if the ELF has since been stripped of metadata (DWARF, .symtab, etc).

Another problem is that .note.gnu.build-id is SHF_ALLOC and included in a PT_LOAD segment. When computing watermark, you probably don't want to include its contents. So you should compute build-id and watermark first before you write.

We would like to have two PT_NOTEs, one for use by the OS and another for use by tooling. This is achieved by linker scripts. The second PT_NOTE is outside of any PT_LOAD and this is where the watermark would be housed. As described above, a required property of the watermark is that it can be recalculated by an external tool to infer whether or not the loadable parts of the ELF have been modified post-link. By having the watermark outside of any PT_LOAD, it is simpler for the external tool to recalculate. For a similar reason, it would actually be better in our case to have the watermark calculated after the build ID value has been "filled-in", as the build ID is inside a PT_LOAD.

(Adjusting the code to better accommodate other layouts is certainly something worth considering. I'm just explaining how we intend to make use of it).

My first reaction is that this seems to be quite a bit of a platform specific feature to build into the linker, it could also make the platform dependent on LLD if this didn't also get into binutils or other ELF linkers.

This is true, but we have a very similar (although admittedly not identical) feature in our existing proprietary linker and we have a requirement that our customers use the linker supplied with our SDK.

An alternative approach which I believe has been used in other platforms before (for example https://people.freebsd.org/~tmm/elfcksum.c) is to reserve some empty space in the binary that an external tool can post-process to write any checksum/hash etc that you want into it. This is not as convenient but would be compatible with other linkers and not require a LLVM specific extension to ELF.

The existence of the watermark is a requirement on PlayStation. Indeed, we would like to avoid mandating an easily-forgotten post-link step. More to the point, adding the watermark via a post-link step would mean post-link modifications could be made before the watermark is added, which rather defeats the point (that may not have been too clear before - sorry).

Thanks,
Edd

pcc added a subscriber: pcc.Aug 23 2019, 12:31 PM
pcc added inline comments.
lld/ELF/Writer.cpp
2764

I guess the right thing to do in the case of multiple partitions would be to compute a separate hash for each partition. But this can always be changed later since the partitions feature is experimental.

2766

Should this exclude the ELF headers if present in a segment? The header fields e_shoff, e_shnum and e_shstrndx can and likely must be rewritten by strip and other tools.

This update applies suggested source corrections, enables readobj to output the note and adds logic to writeWatermark() to exclude the ELF Header and Program Header Table (PHT) from the watermark. In this diff the PHT can only be excluded if it is the first or last segment. If it is neither, an error is emitted.

This patch introduces a new ELF extension with a new tag: NT_LLVM_WATERMARK. Last times such ELF extensions were made accompanying RFCs were posted:

https://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html
https://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html

I still have some concerns about the extension, especially its generality on the widely used ELF platforms.

We would like to have two PT_NOTEs, one for use by the OS and another for use by tooling. This is achieved by linker scripts. The second PT_NOTE is outside of any PT_LOAD and this is where the watermark would be housed. As described above, a required property of the watermark is that it can be recalculated by an external tool to infer whether or not the loadable parts of the ELF have been modified post-link. By having the watermark outside of any PT_LOAD, it is simpler for the external tool to recalculate. For a similar reason, it would actually be better in our case to have the watermark calculated after the build ID value has been "filled-in", as the build ID is inside a PT_LOAD.

Sorry, I am not following. Do you have a concrete readelf -Sl dump to help my understanding?
By linking a trivial executable with ld.lld a.o --watermark -o a, what I see is:

 PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000118 0x000118 R   0x8
 LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000174 0x000174 R   0x1000
 GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
 NOTE           0x000158 0x0000000000200158 0x0000000000200158 0x00001c 0x00001c R   0x4

Section to Segment mapping:
 Segment Sections...
  00
  01     .note.llvm.watermark
  02
  03     .note.llvm.watermark

.note.llvm.watermark is still included in a PT_LOAD segment. I think you probably meant:

  • .note.gnu.build-id is included in a PT_LOAD and a PT_NOTE
  • .note.llvm.watermark is included in another PT_NOTE

There is also a question why .note.llvm.watermark should be flagged SHF_ALLOC if it is not supposed to be inspected at runtime.

lld/ELF/InputFiles.cpp
986
% ld.lld -r --watermark a.o -o b.o
ld.lld: error: Unable to apply watermark because no PT_LOAD segments were found!                               
ld.lld: ../projects/lld/ELF/Writer.cpp:2690: void WriteHash(std::vector<uint8_t> &, const size_t, const size_t,
size_t, const lld::elf::BuildIdKind): Assertion `first < last' failed.
lld/ELF/Writer.cpp
2778

error: missing 'typename' prior to dependent type name 'ELFType<llvm::support::big, false>::Ehdr'

ruiu added a comment.Sep 9 2019, 10:39 PM

It seems the main reason why you guys wanted to avoid an external tool is that it is too easy to forget to run the tool after link. But that can be fixed easily by writing a shell script as ld.lld which invokes the real ld.lld and a watermarking tool. Or, you could make a change to lld so that, after creating an executable file and before existing, lld invokes a watermarking tool on a file that the linker just created (in this configuration, the watermarking tool can either run in the same process or as a child process). Have you considered that approach? I think it is fine to add this feature directly to lld if it is convenient, but I'd like to explore other possibilities before we make a decision.

Besides that, there are a few technical concerns in this patch as below:

  • lld guarantees that the same build id will be computed only when the resulting output file (except the build id part itself) and the linker version are the same. We didn't guarantee that different versions of lld compute the build id in the same way. Actually, we have tweaked hash functions and the strategy for tree hash several times. This contract seems too weak for your use case -- for your use case, we need to guarantee that the way how we compute a hash value doesn't change over time. So, we need to make sure that the current way of hash computation is something that we want to maintain like forever.

    If watermarking doesn't have to be fast (e.g. users only have to do this for release binaries), consider using a simple non-tree hash function.
  • Maybe you should add a version field or something to the note section so that we can change a hash function or something in the future.
  • The watermark feature is to make sure that the program image loaded to memory hasn't changed since its file is created. In that sense, the hash function seems a bit too fragile. If you move a segment within an ELF file, you'd have to change the file offset field of a program header, but the memory image won't change by doing that, so in the sense of watermarking, I don't think it should be considered a change.

It seems the main reason why you guys wanted to avoid an external tool is that it is too easy to forget to run the tool after link. But that can be fixed easily by writing a shell script as ld.lld which invokes the real ld.lld and a watermarking tool. Or, you could make a change to lld so that, after creating an executable file and before existing, lld invokes a watermarking tool on a file that the linker just created (in this configuration, the watermarking tool can either run in the same process or as a child process). Have you considered that approach? I think it is fine to add this feature directly to lld if it is convenient, but I'd like to explore other possibilities before we make a decision.

Besides that, there are a few technical concerns in this patch as below:

  • lld guarantees that the same build id will be computed only when the resulting output file (except the build id part itself) and the linker version are the same. We didn't guarantee that different versions of lld compute the build id in the same way. Actually, we have tweaked hash functions and the strategy for tree hash several times. This contract seems too weak for your use case -- for your use case, we need to guarantee that the way how we compute a hash value doesn't change over time. So, we need to make sure that the current way of hash computation is something that we want to maintain like forever.

    If watermarking doesn't have to be fast (e.g. users only have to do this for release binaries), consider using a simple non-tree hash function.
  • Maybe you should add a version field or something to the note section so that we can change a hash function or something in the future.
  • The watermark feature is to make sure that the program image loaded to memory hasn't changed since its file is created. In that sense, the hash function seems a bit too fragile. If you move a segment within an ELF file, you'd have to change the file offset field of a program header, but the memory image won't change by doing that, so in the sense of watermarking, I don't think it should be considered a change.

Unfortunately the shell available is not particularly capable and because performance is critical we are reluctant to pay the penalty for invocation of an external tool in this way. However, we are going to experiment with invoking a process with lld directly and examine the performance differences.

Of course we would like to maintain the watermark between versions of lld, so we will decouple the buildId and watermark functionality. The inclusion of a version in the note is a good idea should changes be needed in future.

Indeed the order of segments within the ELF file could change without affecting the image loaded to memory. We will modify the watermark computation so that it is agnostic to this ordering.

chrisjackson added a reviewer: bd1976llvm.

@ruiu observed that the watermark computation was reliant on the segment ordering in the ELF file. This ordering can change without affecting the loadable image. Therefore, we now apply an ordering based on the segment's virtual address when calculating the watermark.

@MaskRay our linker scripts prevent the watermark note section from being present in PT_LOAD segments. Logic has been added to prevent the section from being used in the watermark computation if it overlaps with a PT_LOAD. SHF_ALLOC has also been removed.

The readobj functionality for the new note section has been moved to D70316.

We carried out extensive tests using an external process to compute the watermark but unfortunately the penalty for accessing the file even when in the system file cache was greater than we are willing to pay.

ruiu added a comment.Nov 19 2019, 8:57 PM

I think this feature is worth more attention. There may be someone who wants to start using this once it's landed, and I'd make sure that we satisfy their needs. Do you mind if I ask you to start a thread on llvm-dev to propose this feature? I think that even a comment like "we'll use this" is valuable. tlike "

lld/ELF/Writer.cpp
2755

Do you think you can move the new code to watermark.{cpp,h} and add file comment to explain (1) what this is and (2) how watermark is computed?

This patch introduces a new ELF extension with a new tag: NT_LLVM_WATERMARK. Last times such ELF extensions were made accompanying RFCs were posted:

https://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html
https://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html

I still have some concerns about the extension, especially its generality on the widely used ELF platforms.

We would like to have two PT_NOTEs, one for use by the OS and another for use by tooling. This is achieved by linker scripts. The second PT_NOTE is outside of any PT_LOAD and this is where the watermark would be housed. As described above, a required property of the watermark is that it can be recalculated by an external tool to infer whether or not the loadable parts of the ELF have been modified post-link. By having the watermark outside of any PT_LOAD, it is simpler for the external tool to recalculate. For a similar reason, it would actually be better in our case to have the watermark calculated after the build ID value has been "filled-in", as the build ID is inside a PT_LOAD.

Sorry, I am not following. Do you have a concrete readelf -Sl dump to help my understanding?
By linking a trivial executable with ld.lld a.o --watermark -o a, what I see is:

 PHDR           0x000040 0x0000000000200040 0x0000000000200040 0x000118 0x000118 R   0x8
 LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x000174 0x000174 R   0x1000
 GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0
 NOTE           0x000158 0x0000000000200158 0x0000000000200158 0x00001c 0x00001c R   0x4

Section to Segment mapping:
 Segment Sections...
  00
  01     .note.llvm.watermark
  02
  03     .note.llvm.watermark

.note.llvm.watermark is still included in a PT_LOAD segment. I think you probably meant:

  • .note.gnu.build-id is included in a PT_LOAD and a PT_NOTE
  • .note.llvm.watermark is included in another PT_NOTE

There is also a question why .note.llvm.watermark should be flagged SHF_ALLOC if it is not supposed to be inspected at runtime.

I'll provide an readelf -Sl dump that will illustrate the note to segment mapping.

I think this feature is worth more attention. There may be someone who wants to start using this once it's landed, and I'd make sure that we satisfy their needs. Do you mind if I ask you to start a thread on llvm-dev to propose this feature? I think that even a comment like "we'll use this" is valuable. tlike "

Of course, I will post to llvm-dev.

chrisjackson marked an inline comment as done.Nov 20 2019, 8:37 AM
chrisjackson added inline comments.
lld/ELF/Writer.cpp
2755

I have begun work on a revision with the watermarking code in a separate library so that it can be shared with the utility that checks the watermark.

aganea added a subscriber: aganea.Nov 21 2019, 5:28 AM

I sympathize with the requirement to tell whether a binary has been edited after the link step. E.g. one could then raise an error from the loader.

Writing 8 bytes to a known location in the binary can't achieve that. Whatever post link modification is performed will recalculate and update the hash in the binary. If I understand correctly, the plan is to enhance a llvm binary utility to conveniently perform this updating, or at least to provide the 8 bytes to be written into the known location by any other tool.

So I see the cost - lld and other tools get more complicated - and I see the requirement - but I can't see how the proposed change meets the requirement.

I sympathize with the requirement to tell whether a binary has been edited after the link step. E.g. one could then raise an error from the loader.

Writing 8 bytes to a known location in the binary can't achieve that. Whatever post link modification is performed will recalculate and update the hash in the binary. If I understand correctly, the plan is to enhance a llvm binary utility to conveniently perform this updating, or at least to provide the 8 bytes to be written into the known location by any other tool.

So I see the cost - lld and other tools get more complicated - and I see the requirement - but I can't see how the proposed change meets the requirement.

A post-link modification could recalculate and update the hash, but this would only occur in a deliberate attempt to subvert the watermark mechanism. The watermark is not intended to detect all, e.g. nefarious, post-link modifications. It is not a security feature.

JonChesterfield added a comment.EditedDec 3 2019, 12:45 AM

A post-link modification could recalculate and update the hash, but this would only occur in a deliberate attempt to subvert the watermark mechanism

I think it follows that this patch only detects accidental modifications to the binary that occur after linking. That seems to put it in the realm of network transmission errors, disk bit rot, optical media errors and so forth.

In which case, why only guard a subset of the binary, instead of computing a sha256 of all the compiled artifacts and checking that at install/network copy time? Then there is again no linker patch required.

Unless this is intended to catch people who deliberately change the binary, but lack the skills to then update the hash, which is surely vanishingly few people. Fewer when provided with convenient tools to recalculate the hash.

@chrisjackson You replied via email, so there is no record on Phabricator. I am attaching your response below.

A post-link modification could recalculate and update the hash, but this would only occur in a deliberate attempt to subvert the watermark mechanism

I think it follows that this patch only detects accidental modifications to the binary that occur after linking. That seems to put it in the realm of network transmission errors, disk bit rot, optical media errors and so forth.

In which case, why only guard a subset of the binary, instead of computing a sha256 of all the compiled artifacts and checking that at install/network copy time? Then there is again no linker patch required.

Unless this is intended to catch people who deliberately change the binary, but lack the skills to then update the hash, which is surely vanishingly few people. Fewer when provided with convenient tools to recalculate the hash.

@chrisjackson wrote:
The watermark is intended to detect changes in the loadable image of the binary, not all of the ELF file e.g. ignore debug data. As you've stated, it is there to detect post-link modifications to the loadable segments.

https://lists.llvm.org/pipermail/llvm-dev/2019-November/137319.html

The whole point of the watermark is to show that no post-link modifications have been made, and if the watermark itself is added post-link, it does not achieve this aim: someone could either deliberately or accidentally add a step prior to the watermarking happening.

I am still confused. What I infer from the sentence is that strip/llvm-strip is still allowed. To make .note.llvm.watermark survive strip/llvm-strip, you place it into a PT_NOTE segment. So post-link modification is still possible, then why can't you use another tool to compute the watermark and append a section? In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

A post-link modification could recalculate and update the hash, but this would only occur in a deliberate attempt to subvert the watermark mechanism

I think it follows that this patch only detects accidental modifications to the binary that occur after linking. That seems to put it in the realm of network transmission errors, disk bit rot, optical media errors and so forth.

In which case, why only guard a subset of the binary, instead of computing a sha256 of all the compiled artifacts and checking that at install/network copy time? Then there is again no linker patch required.

Unless this is intended to catch people who deliberately change the binary, but lack the skills to then update the hash, which is surely vanishingly few people. Fewer when provided with convenient tools to recalculate the hash.

The watermark is intended to detect changes in the loadable image of the binary, not all of the ELF file e.g. ignore debug data. As you've stated, it is there to detect post-link modifications to the loadable segments.

The whole point of the watermark is to show that no post-link modifications have been made, and if the watermark itself is added post-link, it does not achieve this aim: someone could either deliberately or accidentally add a step prior to the watermarking happening.

I am still confused. What I infer from the sentence is that strip/llvm-strip is still allowed. To make .note.llvm.watermark survive strip/llvm-strip, you place it into a PT_NOTE segment. So post-link modification is still possible, then why can't you use another tool to compute the watermark and append a section? In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

(For clarity, @chrisjackson and I are on the same team, and I've been helping him with this). The problem with any post-link external tool used to create the watermark is that it doesn't prevent something happening between the link step and the watermarking step. For example, this would not be detected: 1) Do link; 2) Make a modification to the .data section; 3) Run the watermark tool.

Stripping (and other things that don't affect the loadable image) is allowed, because it doesn't affect the loadable image. The aim of the watermark is to detect loadable data changes.

Build times should include the watermarking process, since that is part of creating a release build (just as e.g. llvm-strip etc should be too). Thus, saying that an external tool will not slow down build times is incorrect.

In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

I think you are referring to adding this to llvm-objcopy, yes? I think my above point should address this.

Watermark functionality now placed in separate source and header in the object library. This is used by Writer and SyntheticSections.

The whole point of the watermark is to show that no post-link modifications have been made, and if the watermark itself is added post-link, it does not achieve this aim: someone could either deliberately or accidentally add a step prior to the watermarking happening.

I am still confused. What I infer from the sentence is that strip/llvm-strip is still allowed. To make .note.llvm.watermark survive strip/llvm-strip, you place it into a PT_NOTE segment. So post-link modification is still possible, then why can't you use another tool to compute the watermark and append a section? In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

(For clarity, @chrisjackson and I are on the same team, and I've been helping him with this). The problem with any post-link external tool used to create the watermark is that it doesn't prevent something happening between the link step and the watermarking step. For example, this would not be detected: 1) Do link; 2) Make a modification to the .data section; 3) Run the watermark tool.

Stripping (and other things that don't affect the loadable image) is allowed, because it doesn't affect the loadable image. The aim of the watermark is to detect loadable data changes.

As I understand it, the scenario is:

  1. Do link; 2) Run the watermark tool to append .note.llvm.watermark; 3) Release SDK; 4) Downstream vendors modify .data and ship to end users; 5) End users verify that .note.llvm.watermark does not match computed watermark of loadable contents.

The build process before 3) are all controlled. The process should ensure there is no modification to .data between 1) and 2). How do you guarantee a linker side feature can prevent modification? How can you prevent the following:

  1. Do link and generate .note.llvm.watermark in one step 1.5) Modify .data 2) Run the watermark tool to update .note.llvm.watermark

Build times should include the watermarking process, since that is part of creating a release build (just as e.g. llvm-strip etc should be too). Thus, saying that an external tool will not slow down build times is incorrect.

The watermark tool can append .note.llvm.watermark to the executable. It just has to rewrite the section header table at the end of the ELF (usually a few hundred bytes). This is not slower than a built-in linker feature.

In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

I think you are referring to adding this to llvm-objcopy, yes? I think my above point should address this.

The whole point of the watermark is to show that no post-link modifications have been made, and if the watermark itself is added post-link, it does not achieve this aim: someone could either deliberately or accidentally add a step prior to the watermarking happening.

I am still confused. What I infer from the sentence is that strip/llvm-strip is still allowed. To make .note.llvm.watermark survive strip/llvm-strip, you place it into a PT_NOTE segment. So post-link modification is still possible, then why can't you use another tool to compute the watermark and append a section? In my comment, there are some other questions that are not answered. I have suggested an approach that will not slow down the whole build time.

(For clarity, @chrisjackson and I are on the same team, and I've been helping him with this). The problem with any post-link external tool used to create the watermark is that it doesn't prevent something happening between the link step and the watermarking step. For example, this would not be detected: 1) Do link; 2) Make a modification to the .data section; 3) Run the watermark tool.

Stripping (and other things that don't affect the loadable image) is allowed, because it doesn't affect the loadable image. The aim of the watermark is to detect loadable data changes.

As I understand it, the scenario is:

  1. Do link; 2) Run the watermark tool to append .note.llvm.watermark; 3) Release SDK; 4) Downstream vendors modify .data and ship to end users; 5) End users verify that .note.llvm.watermark does not match computed watermark of loadable contents.

The build process before 3) are all controlled. The process should ensure there is no modification to .data between 1) and 2). How do you guarantee a linker side feature can prevent modification? How can you prevent the following:

  1. Do link and generate .note.llvm.watermark in one step 1.5) Modify .data 2) Run the watermark tool to update .note.llvm.watermark

This is a motivation to not have an external tool on the watermarking. That being said, as @chrisjackson has said on more than one occasion, this isn't intended to be a security feature so we are not attempting to detect a malicious attacker. It could be possible for downstream LLD producers to add a local salt to the watermark to make it more secure, should they so choose, I suppose.

Build times should include the watermarking process, since that is part of creating a release build (just as e.g. llvm-strip etc should be too). Thus, saying that an external tool will not slow down build times is incorrect.

The watermark tool can append .note.llvm.watermark to the executable. It just has to rewrite the section header table at the end of the ELF (usually a few hundred bytes). This is not slower than a built-in linker feature.

It really doesn't matter which is slower: if an external watermarking tool is not part of the linker, things can be done post-link and therefore the thing being watermarked is not the output of the linker. The main goal of this is to show that there have been no changes to the loadable part of the linker output, whether controlled by the user of the linker or not.

I'm not sure if you've missed how the process is expected to go. Here's a recap:

  1. User calls linker with --watermark specified, producing output.elf with a watermark.
  2. Optionally a user might choose to do modifications that don't affect non-loadable data, e.g. llvm-objcopy --strip-all output.elf etc.
  3. A user (probably a different user) runs a tool to validate that the watermark remains correct. If not, they can report it to the producer of output.elf.

Any attempt to modify e.g. .data between 1 and 3, either accidentally or deliberately, will be detected, unless the user explicitly tries to defeat the watermarking, which as mentioned is not something this feature on its own tries to detect. Having an external tool that does the watermarking post-link (i.e. after step 1) would allow users to make the modifications before watermarking, which is explicitly what the watermarking is trying to detect (and therefore wouldn't in this case).

Modified extractSegments() in watermark.h.

ruiu added a comment.Dec 10 2019, 10:33 PM

As I understand it, the scenario is:

  1. Do link; 2) Run the watermark tool to append .note.llvm.watermark; 3) Release SDK; 4) Downstream vendors modify .data and ship to end users; 5) End users verify that .note.llvm.watermark does not match computed watermark of loadable contents.

The build process before 3) are all controlled. The process should ensure there is no modification to .data between 1) and 2). How do you guarantee a linker side feature can prevent modification? How can you prevent the following:

  1. Do link and generate .note.llvm.watermark in one step 1.5) Modify .data 2) Run the watermark tool to update .note.llvm.watermark

This is a motivation to not have an external tool on the watermarking. That being said, as @chrisjackson has said on more than one occasion, this isn't intended to be a security feature so we are not attempting to detect a malicious attacker. It could be possible for downstream LLD producers to add a local salt to the watermark to make it more secure, should they so choose, I suppose.

This is actually going to be an interesting problem. Do your users make post-link modifications to executables by intention or by accident? If it's intentional, you are raising a bar of an arms race, and they'll catch up by adding --update-watermark option or something to their tool, so that they'll update a watermark when a binary is modified, which nullifies the point of this change.

As I understand it, the scenario is:

  1. Do link; 2) Run the watermark tool to append .note.llvm.watermark; 3) Release SDK; 4) Downstream vendors modify .data and ship to end users; 5) End users verify that .note.llvm.watermark does not match computed watermark of loadable contents.

The build process before 3) are all controlled. The process should ensure there is no modification to .data between 1) and 2). How do you guarantee a linker side feature can prevent modification? How can you prevent the following:

  1. Do link and generate .note.llvm.watermark in one step 1.5) Modify .data 2) Run the watermark tool to update .note.llvm.watermark

This is a motivation to not have an external tool on the watermarking. That being said, as @chrisjackson has said on more than one occasion, this isn't intended to be a security feature so we are not attempting to detect a malicious attacker. It could be possible for downstream LLD producers to add a local salt to the watermark to make it more secure, should they so choose, I suppose.

This is actually going to be an interesting problem. Do your users make post-link modifications to executables by intention or by accident? If it's intentional, you are raising a bar of an arms race, and they'll catch up by adding --update-watermark option or something to their tool, so that they'll update a watermark when a binary is modified, which nullifies the point of this change.

The watermark is intended as a safety measure for several scenarios post-link.

  1. A user deliberately modifies a loadable segment but is unaware that they shouldn't.
  2. A user accidentally modifies a loadable segment.
  3. A tool somewhere in the build system has unexpected behaviour that modifies a loadable segment.

If a user intentionally modifies a loadable segment and updates the watermark, then this is nefarious behaviour that the watermark is not intended to prevent.

ruiu added a comment.Dec 12 2019, 12:25 AM

Sorry for asking too many questions, but how do you verify that a watermark matches the contents? Are you creating a new command?

Sorry for asking too many questions, but how do you verify that a watermark matches the contents? Are you creating a new command?

@chrisjackson has proposed it as an option to add to llvm-readobj (see D70316).

Partially guarding against a user accidentally or incompetently modifying a binary isn't sufficiently useful to justify adding code to lld in my opinion.

In the spirit of (belated) full disclosure, I'm following this patch because I recognise a similar feature from a proprietary linker and it makes me sad to see it replicated.

However, I haven't contributed any code to lld, so will bow out at this point. It's not my call.

Hello @ruiu and @MaskRay , can I help with any more technical queries concerning this proposal? Also with respect to the associated D70316 that enables printing of watermark note sections and computing of watermarks in readobj?

MaskRay added a comment.EditedJan 10 2020, 11:31 AM

Apologies for the late reply.

http://lists.llvm.org/pipermail/llvm-dev/2019-November/137108.html does not evoke many responses. I take the lack of responses on the RFC and questions from @JonChesterfield and @ruiu as people are still doubting whether this feature will be generic enough to benefit the community, rather than a feature used in a very specific scenario, which can be easily replaced by another tool. I have also asked some other folks but haven't receive positive reaction yet. I concur with what @ruiu said. I feel this just raises the bar of an arm race, which may not be necessary in the first place if the process can be improved (why can the build system modify contents after linking and before a binary manipulation tool like llvm-objcopy?).

I think this feature, as it stands, is not quite justified as a linker feature. Adding to llvm-objcopy is probably fine but I hope we can aim for something bigger. The statement "compliant with the system ABI" is pretty vague. Its defined meaning here is: "if we are byte identical, then we are ABI compliant." This is obviously too strong and does not reflect the real fact. If we want to make sure external symbols (part of ABI) do not change, we can use something like interface shared objects. Fedora is doing something more generic. https://fedoraproject.org/wiki/Toolchain/Watermark I wish if we want to add an LLVM watermark, we can make it more generally useful.

Apologies for the late reply.

http://lists.llvm.org/pipermail/llvm-dev/2019-November/137108.html does not evoke many responses. I take the lack of responses on the RFC and questions from @JonChesterfield and @ruiu as people are still doubting whether this feature will be generic enough to benefit the community, rather than a feature used in a very specific scenario, which can be easily replaced by another tool.

While current applicability of this feature does appear to be limited due to the lack of responses, I don't think the proposed feature is alone in its limited userbase. For example, I'm not sure how widespread the use of ELF partitioning is, which is much more intrusive (To be clear, I'm not suggesting partitioning should not be part of lld). The feature must be part of the linker as an external tool does not prevent a modification post-link (see @edd's earlier comment), thus defeating the purpose of the watermark.

It was my understanding that I had the code owner's blessing for the watermarking feature but of course this may have changed.

I have also asked some other folks but haven't receive positive reaction yet. I concur with what @ruiu said. I feel this just raises the bar of an arm race, which may not be necessary in the first place if the process can be improved (why can the build system modify contents after linking and before a binary manipulation tool like llvm-objcopy?).

There cannot be an arms race as this is not a security feature. I think perhaps this feature is best thought of as build-id for PT_LOAD segments. It provides a tool for detecting build systems that act in this questionable manner. Why the build systems behave this way would be determined and corrected afterwards. Would a change in name to something like 'loadable-build-id' cause less confusion?

I think this feature, as it stands, is not quite justified as a linker feature. Adding to llvm-objcopy is probably fine but I hope we can aim for something bigger.
The statement "compliant with the system ABI" is pretty vague. Its defined meaning here is: "if we are byte identical, then we are ABI compliant." This is obviously too strong and does not reflect the real fact. If we want to make sure external symbols (part of ABI) do not change, we can use something like interface shared objects. Fedora is doing something more generic. https://fedoraproject.org/wiki/Toolchain/Watermark I wish if we want to add an LLVM watermark, we can make it more generally useful.

While this is a very strong statement, it is true. We don't think that the watermark feature is a perfect ABI compliance checking utility (indeed perhaps referring to the ABI was a mistake on my part), but it does enforce the requirements @edd highlighted.

It would seem that there just isn't sufficient interest in this, so I'm going to mark this revision as abandoned if there are no further comments.

chrisjackson abandoned this revision.Feb 19 2020, 1:39 AM