This is an archive of the discontinued LLVM Phabricator instance.

Update ObjectFileELF to detect ELF triple based on ELF notes and the ELF header.
ClosedPublic

Authored by tfiala on Jun 25 2014, 5:12 PM.

Details

Reviewers
emaste
Summary

This change modifies how ObjectFileELF determines the architecture of an ELF file. It no longer assumes the elf file's vendor and OS is the same as the current host. It is one step in supporting running lldb across MacOSX/Linux/FreeBSD/NetBSD.

The current ELF note parsing added supports checking for a FreeBSD, Linux, and NetBSD ABI note. If found, the OS is set appropriately. The previously-existing GNU UUID note parsing code has been incorporated into the broader note-parsing code provided here.

Diff Detail

Event Timeline

tfiala updated this revision to Diff 10864.Jun 25 2014, 5:12 PM
tfiala retitled this revision from to Update ObjectFileELF to detect ELF triple based on ELF notes and the ELF header..
tfiala updated this object.
tfiala edited the test plan for this revision. (Show Details)
tfiala added a reviewer: emaste.
tfiala added a subscriber: Unknown Object (MLST).

Hang on - just caught a couple member variables that are not in the initializer list. Will add a new diff in a few minutes with all members properly in the initializer list.

tfiala edited the test plan for this revision. (Show Details)Jun 25 2014, 5:29 PM
tfiala updated this revision to Diff 10866.Jun 25 2014, 5:33 PM

There were quite a few members not in the constructor initializer list. Added those there, ran the tests on Linux, all clean.

Ed - the second (complete) diff is fine to look at now.

emaste added inline comments.Jun 25 2014, 6:35 PM
source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1075

n_name is a std::string and the length (namesz) should be handled in the note parser so you should be able to just do note.n_name == LLDB_NT_OWNER_FREEBSD, no?

Also for the purpose of this parsing we'll generally have just one case in each of the switch statements - maybe it'd be more concise to just check note.n_type == LLDB_NT_FREEBSD_ABI_TAG in the if() as well?

emaste edited edge metadata.Jun 25 2014, 7:50 PM

Hmm, I just tried w/ a core file I have around, and it seems we correctly get the triple but then fail:

main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' module OSABI: ELFOSABI_FREEBSD
main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' set ELF module OS type from ELF header OSABI.
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=1
main-thread ObjectFileELF::RefineModuleDetailsFromNote detected FreeBSD, min version constant 1100021
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=2
main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' module set to triple: x86_64-unknown-freebsd (architecture x86_64)
main-thread 0x80926b600 Module::Module((x86_64) '/bin/sleep')
main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' module OSABI: ELFOSABI_FREEBSD
main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' set ELF module OS type from ELF header OSABI.
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=1
main-thread ObjectFileELF::RefineModuleDetailsFromNote detected FreeBSD, min version constant 1100021
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=2
main-thread ObjectFileELF::GetModuleSpecifications file '/bin/sleep' module set to triple: x86_64-unknown-freebsd (architecture x86_64)
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=1
main-thread ObjectFileELF::RefineModuleDetailsFromNote detected FreeBSD, min version constant 1100021
main-thread ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=2
main-thread Target::Target created with architecture x86_64 (x86_64-unknown-freebsd)
main-thread 0x80926c000 Module::Module((x86_64) 'libc.so.7')
main-thread 0x80926c000 Module::~Module((unknown) '')
main-thread 0x80926c000 Module::Module((x86_64) '/data/emaste/src/llvm/build/sleep.core')
main-thread ObjectFileELF::GetModuleSpecifications file '/data/emaste/src/llvm/build/sleep.core' module OSABI: ELFOSABI_FREEBSD
main-thread ObjectFileELF::GetModuleSpecifications file '/data/emaste/src/llvm/build/sleep.core' set ELF module OS type from ELF header OSABI.
main-thread ObjectFileELF::GetModuleSpecifications file '/data/emaste/src/llvm/build/sleep.core' module set to triple: x86_64-unknown-freebsd (architecture x86_64)
main-thread 0x80926c000 Module::~Module((x86_64) '/data/emaste/src/llvm/build/sleep.core')
main-thread Went to stop the private state thread, but it was already invalid.
error: Unable to find process plug-in for core file '/data/emaste/src/llvm/build/sleep.core'

Looking into it...

tfiala added a comment.EditedJun 26 2014, 7:45 AM

error: Unable to find process plug-in for core file '/data/emaste/src/llvm/build/sleep.core'
Looking into it...

Hey Ed,

I'm not sure how you were doing the entire setup above, but here's what I need to do on the MacOSX side when debugging a Linux exe (over llgs):

// Enable some logging
log enable -T -f /tmp/xcode_gdbremote.log gdb-remote packets process
log enable -T -f /tmp/xcode_module.log lldb module

// Syncing over files from my Mac box to my Linux box
platform shell rsync -av /Users/tfiala/play/loop_test tfiala2.mtv.corp.google.com:play

// This obviously only works in the llgs branch for now, but: fire up llgs
platform shell ssh tfiala2.mtv.corp.google.com 'lldb/macosx.sync/git-s02-macpro/build-debug/bin/lldb-gdbserver --log-file /tmp/llgs.log --lldb-command "log enable -T -f /tmp/llgs.log lldb process host" --lldb-command "log enable -T -f /tmp/llgs_packets.log gdb-remote packets" tfiala-macpro.mtv.corp.google.com:1234' &

// This may be the critical part you're missing - se the platform for whatever ELF object files you were trying to load. The --sysroot part shows where I have my linux root filesystem mapped locally on the MacOSX side so that things like libc.so can be found.
platform select --sysroot /Volumes/tfiala2.mtv.corp.google.com remote-linux

// And then create the target for the given architecture
target create --arch x86_64-pc-linux /Users/tfiala/play/loop_test/bin/linux-x86_64/main --remote-file /usr/local/google/home/tfiala/play/loop_test/bin/linux-x86_64/main

// GDB remote in.
gdb-remote tfiala2.mtv.corp.google.com:1234

// Set a breakpoint.
b main.cpp:10

// And go.
run

Those LLDB commands represent how I'm able to successfully run a Linux x86_64 Linux ELF (2.6.26 ABI IIRC, from Ubuntu 12.04 LTS) on the llgs side and somewhat play with it on the MacOSX side.

It's possible there are little bits that the llgs branch has in it that are making this more possible. I did not try running from FreeBSD to a Linux llgs side.

These ObjectFileELF changes were necessary to get to the point where I could do a "(lldb) file /some/elf/from/other/platform" to show up right with an "(lldb) image list -t -f" command.

source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1075

Also for the purpose of this parsing we'll generally have just one case in each of the switch statements - maybe it'd be more concise to just check note.n_type == LLDB_NT_FREEBSD_ABI_TAG in the if() as well?

Sure, I think it's only GNU that is using multiple ones. I can change that for the others.

n_name is a std::string and the length (namesz) should be handled in the note parser so you should be able to just do note.n_name == LLDB_NT_OWNER_FREEBSD, no?

Yes, but I was looking to avoid the string compare if the length wasn't equal. That would stop needing to get the string length of the constant. Overall this is likely very minor compared to the cost of scanning the entire contents though, so probably worth the simplification.

I'm not sure how you were doing the entire setup above

This wasn't using llgs at all, just the local core file support

bin/lldb /bin/sleep -c sleep.core

Last night I discovered that the triple is x86_64-unknown-freebsd with the patch, and x86_64-unknown-freebsd10.0 without. The latter matches the platform triple reported by platform status.

With the patch, in ModuleList::GetSharedModule the module_sp->GetObjectFile() fails.

source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1075

Yes, but I was looking to avoid the string compare if the length wasn't equal. That would stop needing to get the string length of the constant. Overall this is likely very minor compared to the cost of scanning the entire contents though, so probably worth the simplification.

Oh, good point. I was probably reading something else into it from my previous investigations into ELF notes and having to deal with a special case for the Linux core note n_namesz.

Last night I discovered that the triple is x86_64-unknown-freebsd with the patch, and x86_64-unknown-freebsd10.0 without. The latter matches the platform triple reported by platform status.

Interesting. I'll need to get that to correspond.

I'm adjusting the triple with llvm::Triple::setOS(). I don't see a llvm::Triple::OSType for FreeBSD10.0? (i.e. is it not possible to build that triple with the llvm code? Shouldn't it be? Maybe the OS version somehow needs to get appended on that - looking at the details of Triple now...)

Hey Ed - how do I interpret the payload 32-bit value that comes with the OS ABI part?

Also - I have a cleaned up patch coming soon that adjusts for some other comments you had.

Hey Ed - how do I interpret the payload 32-bit value that comes with the OS ABI part?

Specifically, for the FREEBSD ABI note.

Hey Ed,

It looks like the FreeBSD ABI note contains a release date? If that's right, any idea on the best way to convert that to a FreeBSD version number? We could use a table of release dates and find an exact match or, if not a match, the first release date that comes before the given date.

I'll wait for you to tell me more about that note. There might be a better way to get the version number on FreeBSD for elf. Seems a little crazy to parse it out of a release date.

BTW - I was looking here:
http://lists.freebsd.org/pipermail/freebsd-current/2009-January/001939.html

to try to figure out what it was. The comments make it look like an os release date. When interpreted as a decimal integer, it almost looks like it's encoding something more useful like a 10 with some other (maj/min/dot info?) shifted by a few factors of 10.

tfiala added a comment.EditedJun 26 2014, 9:23 AM

I've got a patch coming up that will support this, Ed. You'll want to adjust (or tell me what to adjust) that does this:
tfiala-macpro:lldb tfiala$ DerivedData/lldb/Build/Products/Debug/lldb
(lldb) file /tmp/ls-freebsd
Current executable set to '/tmp/ls-freebsd' (x86_64).
(lldb) image list -b -t
[ 0] ls-freebsd x86_64-unknown-freebsd10.0
(lldb)

tfiala updated this revision to Diff 10887.Jun 26 2014, 9:26 AM
tfiala edited edge metadata.

Updated for comments from Ed Maste re: string usage, case usage on single case, and freebsd OS name setting.

Requires a change to really parse out the FreeBSD version from the FreeBSD ABI ELF note payload.

emaste added inline comments.Jun 26 2014, 9:48 AM
source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
210

I think this should be fine (and will test in a moment):
version_major = version_info / 100000;
version_minor = (version_info / 1000) % 100;

But something's odd with LLDB's FreeBSD platform triples - it reports -freebsd10.0 on my FreeBSD 11-current laptop, and -freebsd9.1 on my FreeBSD 9.2 desktop.

It's called osreldate but is actually the version information - for
example, on my desktop running FreeBSD 9.2-STABLE branch:
+-+-+-+-+-+-+-+

90 25 0 9

+-+-+-+-+-+-+-+
Maj Min Rxx

Ok cool - that gives me everything I need. So for me down below, it would be Maj = 10, Min = 0, Rxx is 510.

<lldb.driver.main-thread> ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=1
<lldb.driver.main-thread> ObjectFileELF::RefineModuleDetailsFromNote detected FreeBSD, min version constant 1000510

I'll update the patch for that.

Also for the purpose of this parsing we'll generally have just one case in each of the switch statements - maybe it'd be more concise to just check note.n_type == LLDB_NT_FREEBSD_ABI_TAG in the if() as well?

Sure, I think it's only GNU that is using multiple ones. I can change that for the others.

Yeah - I was just hoping to lessen the bloat a bit for future
additions -- I suspect later ones will be (n_name == OpenBSD && n_type

OPENBSD_ABI_TAG), n_name == DragonFly && n_type

DRAGONFLY_ABI_TAG), etc.. all following the same pattern.

tfiala updated this revision to Diff 10895.Jun 26 2014, 10:13 AM

With this patch, I get the following loading a FreeBSD 10.0.510 elf object file:

tfiala-macpro:lldb tfiala$ DerivedData/lldb/Build/Products/Debug/lldb
(lldb) log enable lldb modules
(lldb) file /tmp/ls-freebsd
...
<lldb.driver.main-thread> ObjectFileELF::RefineModuleDetailsFromNote parsing note name='FreeBSD', type=1
<lldb.driver.main-thread> ObjectFileELF::RefineModuleDetailsFromNote detected FreeBSD 10.0.510
...
Current executable set to '/tmp/ls-freebsd' (x86_64).
(lldb) image list -b -t
[ 0] ls-freebsd x86_64-unknown-freebsd10.0
(lldb)

It turns out this still fails in some cases, as it seems somewhere in
ModuleList::GetSharedModule's call of module_sp->GetObjectFile there's
an exact match on the platform triple string. On FreeBSD the platform
triple ends up as e.g. "x86_64-unknown-freebsd9.1" (with the version
appended), and the version is detected at cmake run time.

I'm not sure what the right fix is.

a) Add code to Host::GetArchitecture() to strip the version off
(returning x86_64-unknown-freebsd), and going back to the earlier
version of the FreeBSD ABI note handling.
b) Change the match to compare only the Arch and OS.
c) Something else?

It turns out this still fails in some cases, as it seems somewhere in
ModuleList::GetSharedModule's call of module_sp->GetObjectFile there's
an exact match on the platform triple string. On FreeBSD the platform
triple ends up as e.g. "x86_64-unknown-freebsd9.1" (with the version
appended), and the version is detected at cmake run time.

So with the latest patch, it isn't always computing x86_64-unknown-freebsd9.1 in your case for the ObjectFileELF object file? It would be great to find out where the non-versioned x86_64-unknown-freebsd was coming from in that case.

So with the latest patch, it isn't always computing x86_64-unknown-freebsd9.1 in your case for the ObjectFileELF object file? It would be great to find out where the non-versioned x86_64-unknown-freebsd was coming from in that case.

With the patch it correctly reports "x86_64-unknown-freebsd9.2." The
problem is that the triple reported by the platform plugin is
"x86_64-unknown-freebsd9.1" and it doesn't match.

Ok, thanks for explaining.

I'd be in favor of eliminating the dependence on the version number of the OS. If there is a reason that becomes important, I suspect there will be other parts we'll need to address as well since the version check might need to be something like: OS version must be >= some min ABI version to do something. So the version number seems too specific to have in the triple.

tfiala added a comment.EditedJun 26 2014, 12:49 PM

I'm going to remove this part from the llgs branch and not focus on cross-architecture debugging for the initial llgs upstream. This clearly needs a little more time to work out a situation that doesn't break FreeBSD. I was primarily working on this since it was a necessary part to enabling cross Linux/MacOSX lldb/llgs to work (Linux x86_64 llgs with MacOSX lldb). But that element is not critical for getting llgs Linux x86_64 upstreamed.

emaste added inline comments.Jun 26 2014, 2:52 PM
source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2565

After a lengthy debugging session it turns out this is the change that breaks core file loading on FreeBSD. The core file has no sections (the core data and thread / register notes are in program segments), so ParseSectionHeaders() returns 0.

After a lengthy debugging session it turns out this is the change that breaks core file loading on FreeBSD. The core file has no sections (the core data and thread / register notes are in program segments), so ParseSectionHeaders() returns 0.

Oh dang. I originally had a comment on that in the first version of the patch to draw attention to whether that was valid or not to have no sections. I ultimately removed the comment.

Okay - so I can easily just call the method. Of course, if there are no sections, it also won't pick up the notes either. Is this a case where its creating a module on the fly?

tfiala updated this revision to Diff 10907.Jun 26 2014, 3:51 PM

Hey Ed,

This patch fixes the ParseHeaders() with a zero return value so that it does not fail out the GetArchitecture() call.

Let me know if that gets you further.

We'll still need to figure out how to adjust the code to not require a FreeBSD version number (or adjust how the version number is set properly across the board).

Thanks Todd - this works on FreeBSD now.

I must have been mistaken in part of my earlier analysis; based on my test now and Greg's description of the triple comparison it looks like it is actually done by matching the Arch and OS, not the string.

emaste accepted this revision.Jun 26 2014, 6:44 PM
emaste edited edge metadata.
emaste added inline comments.
source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1101

Should we validate that the descsz is as expected in here too? E.g. (note.n_name == LLDB_NT_OWNER_FREEBSD && note.n_type == LLDB_NT_FREEBSD_ABI_TAG && note.n_descsz == 4)

This revision is now accepted and ready to land.Jun 26 2014, 6:44 PM

I'll go ahead and adjust for the size check in the FreeBSD ABI tag and do the same for NetBSD.

Thanks for testing, Ed! Glad to get this in.

source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1101

I could do that. It will fail either way if it's less than the expected size (i.e. if we go down the path of processing a note with an owner of LLDB_NT_OWNER_FREEBSD and a note type of LLDB_NT_FREEBSD_ABI_TAG) since the read will fail.

It might be worth putting a warning in the code if the size doesn't match expectations - that way we'd at least be able to track down what was breaking if we found the note but it didn't match size expectations. (i.e. effectively a change in the note definition or a broken code emitter).

Thanks Todd, I'm very happy to see this go in!

source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
1101

I was thinking of the other case. if the size recorded in the file is larger than the data. We'd then partially consume the payload and then try to read a subsequent ELF note from the wrong location, no?

tfiala closed this revision.Jun 27 2014, 10:02 AM

This went in here, with minor tweaks to test note size and a test to verify that Linux and FreeBSD x86_64 image architecture/triple types are reported correctly:

svn commit
Sending source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
Sending source/Plugins/ObjectFile/ELF/ObjectFileELF.h
Adding test/functionalities/object-file
Adding test/functionalities/object-file/TestImageListMultiArchitecture.py
Adding (bin) test/functionalities/object-file/ls-freebsd-10.0-x86_64
Adding (bin) test/functionalities/object-file/sleep-ubuntu-14.04-x86_64
Transmitting file data .....
Committed revision 211907.