This is an archive of the discontinued LLVM Phabricator instance.

[lld] [mach-o] make lld respect alignment constraints more
ClosedPublic

Authored by t.p.northover on Oct 15 2014, 4:16 PM.

Details

Reviewers
t.p.northover
Summary

Hi all,

At the moment, lld is very lax about casting random pointers to ones with ABI alignment requirements, and even dereferencing them afterwards. This is not good, and we spend a lot of our time telling other developers not to do that (and that it's their fault their program now crashes), so we should probably get it right ourselves. Preferably *before* it bites us.

One big part of this (but not the only one) is the interface presented by the endian utility functions "readN" and "writeN", which effectively forces the creation of dodgy pointers. This patch attempts to fix them.

I also change the functions to have a more consistent interface (all use uintN_t) and change the "swap" parameter ("are we compiling for an architecture that's different to our own?") into an absolute "isBig" ("are we compiling for a big-endian target?").

The interfaces are still ugly, and if anyone has a better suggestion I'd love to hear it (this patch catches all uses, so it would be a good time to refactor it properly if there's a better way). Ideas I've considered so far:

  • read<uint32_t>(...). Slightly nicer implementation, but even more verbose usage. Might be worth it if structs could be accommodated.
  • "typedef ulittle32_t utarget32_t" in ArchHandler or similar. Would require making *everything* a template.
  • virtual functions in ArchHandler. Would eliminate passing isBig around, but may not always have an ArchHandler.

How does it look?

Tim.

Diff Detail

Event Timeline

t.p.northover retitled this revision from to [lld] [mach-o] make lld respect alignment constraints more.
t.p.northover updated this object.
t.p.northover edited the test plan for this revision. (Show Details)
t.p.northover added a subscriber: Unknown Object (MLST).

How come in the ArchHandler_x86_64 you use:

write32(loc, value, false);

but inside

How come in the ArchHandler_x86_64 you use:

write32(loc, value, false);

but inside ArchHandler_arm64 you use:

write32(loc, value, _isBig);

Seems like each arch has a fixed endianness. Given that,

write32(loc, value, false);

could be optimized to:

writeLittle32(loc, value); // or use little32_t directly?

Have you looked at the codegen of this verse the old 'swap' parameter. Just to make sure the compiler really is optimizing what you think it should be?

Good point. After thinking about it over night, I'm leaning towards directly using ulittle and ubig within each ArchHandler. I'll see how it looks.

t.p.northover edited the test plan for this revision. (Show Details)

I definitely think it's neater with the explicit ulittle32_t in the individual archhandlers; shame we can't do something similar in the generic code.

I also checked speed on check-lld. The difference (if there was one) seemed to be well within the noise limits.

Does it look better to you too?

Cheers.

Tim.

The ArchHandlers look much better!

lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp
266–267

This is sad that the casts are needed. Perhaps, read32() can be overloaded to take a uint32_t* so all these casts can be removed.

t.p.northover added inline comments.Oct 17 2014, 11:47 AM
lib/ReaderWriter/MachO/MachONormalizedFileBinaryReader.cpp
266–267

I was trying to avoid it (even creating improperly aligned pointers is very dodgy, anything making that easier is a worry), but there are a lot of casts here. I'll make them templated and put an assertion about pointer alignment in there.

t.p.northover edited the test plan for this revision. (Show Details)

How about this? I think all the readN calls are cast-free now.

t.p.northover accepted this revision.Oct 27 2014, 3:59 PM
t.p.northover added a reviewer: t.p.northover.

Thanks Nick. Should be r220730.

This revision is now accepted and ready to land.Oct 27 2014, 3:59 PM
t.p.northover closed this revision.Oct 27 2014, 4:00 PM