Page MenuHomePhabricator

Explicitly set entry point arch when it's thumb
ClosedPublic

Authored by aadsm on Sep 25 2019, 7:11 PM.

Details

Summary

I found a case where the main android binary (app_process32) had thumb code at its entry point but no entry in the symbol table indicating this. This made lldb set a 4 byte breakpoint at that address (we default to arm code) instead of a 2 byte one (like we should for thumb).
The big deal with this is that the expression evaluator uses the entry point as a way to know when a JITed expression has finished executing by putting a breakpoint there. Because of this, evaluating expressions on certain android devices (Google Pixel something) made the process crash.
This was fixed by checking this specific situation when we parse the symbol table and add an artificial symbol for this 2 byte range and indicating that it's arm thumb.

I created 2 unit tests for this, one to check that now we know that the entry point is arm thumb, and the other to make sure we didn't change the behaviour for arm code.

I also run the following on the command line with the app_process32 where I found the issue:
Before:

(lldb) dis -s 0x1640 -e 0x1644
app_process32[0x1640]: .long  0xf0004668                ; unknown opcode

After:

(lldb) dis -s 0x1640 -e 0x1644
app_process32`:
app_process32[0x1640] <+0>: mov    r0, sp
app_process32[0x1642]:      andeq  r0, r0, r0

Diff Detail

Repository
rL LLVM

Event Timeline

aadsm created this revision.Sep 25 2019, 7:11 PM

This sounds like a reasonable thing to do, but it seems to me we make that more generic. Having a symbol for the entry point would help in other cases too, and we already go through a lot of trouble to track down various symbol addresses (plt parsing, eh_frame parsing, ...), so this would fit the general pattern there. So, I think we should just remove the if(arm) check and always create a "synthetic" symbol for the entry point if there is no symbol at that address already.

For the test, what would you say to writing that as a lit test instead (testing the address class deduction via the disassemble command you mentioned)? The yaml is actually fairly readable as is, but given that you felt the need to include the commands used to produce that input, it might be better to actually produce that file via the commands as a part of the test (substituting llvm-mc for as and ld.lld for linker).

lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2728 ↗(On Diff #221881)

GetNextSyntheticSymbolName().GetCString()

2737–2738 ↗(On Diff #221881)

This is pretty arbitrary. Would it be possible to just set size_is_valid=false, and let the usual symbol size deduction logic figure out something reasonable here?

aadsm marked an inline comment as done.Sep 26 2019, 5:48 PM

For the test, what would you say to writing that as a lit test instead (testing the address class deduction via the disassemble command you mentioned)?

I was actually keen on this since lit is the only type of test I haven't used yet but then thought that it wouldn't really test my change directly (just indirectly). I know I put that as a test in my summary but it was more like a sanity check. The real test here is checking the address class (which is what is changed in the code). There are different things in lldb that uses that like setting a breakpoint or disassembling instructions, that's why I don't feel that testing the consequence is the ideal test. What do you think?

The yaml is actually fairly readable as is, but given that you felt the need to include the commands used to produce that input, it might be better to actually produce that file via the commands as a part of the test (substituting llvm-mc for as and ld.lld for linker).

I just put it there for completion sake, I always like to have the source of things when I didn't do it by hand. In this case I prefer to have the yaml because I'm not sure if in all scenarios that we run the test we'll be able to assemble arm assembly into an object?

lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2737–2738 ↗(On Diff #221881)

To be honest I'm not sure how the size_is_valid = false is used as I've only seen it being used when going through the EH frame FDE entries.

Setting the size to 0 is problematic though, when the symbol is added to the symtab its size will automatically span to the next function symbol boundary. I think this can be dangerous because the symtab for the object file might not have all function boundaries defined and in the event that we have mixed arm/thumb code in it, it will incorrectly mark arm code as thumb. This is why I wanted to be conservative here.

aadsm updated this revision to Diff 222063.Sep 26 2019, 5:50 PM

Use GetNextSyntheticSymbolName() to generate the symbol name

For the test, what would you say to writing that as a lit test instead (testing the address class deduction via the disassemble command you mentioned)?

I was actually keen on this since lit is the only type of test I haven't used yet but then thought that it wouldn't really test my change directly (just indirectly). I know I put that as a test in my summary but it was more like a sanity check. The real test here is checking the address class (which is what is changed in the code). There are different things in lldb that uses that like setting a breakpoint or disassembling instructions, that's why I don't feel that testing the consequence is the ideal test. What do you think?

Yeah, lit tests are often indirect to some degree. I think that's one of their main drawbacks (the lack of interactivity being the other). This needs to be balanced with their advantages. In this particular, case I think using the lit approach would be fine (though I haven't looked at the code to see how exactly this information is plumbed), as both disassembly and breakpoint should use the same source for determining the instruction set, but I am fine also fine with keeping the test like this.

The yaml is actually fairly readable as is, but given that you felt the need to include the commands used to produce that input, it might be better to actually produce that file via the commands as a part of the test (substituting llvm-mc for as and ld.lld for linker).

I just put it there for completion sake, I always like to have the source of things when I didn't do it by hand. In this case I prefer to have the yaml because I'm not sure if in all scenarios that we run the test we'll be able to assemble arm assembly into an object?

The things you need are the arm target being built, and having lld checked out (so the test would need an annotation like REQUIRES: lld, arm). Not everyone has both of these things enabled (particularly having lld is less common), but they are things that one _can_ enable no matter what is his host or target architecture, so I am not worried by that. And as tests like these become more common, I expect more people will start having these enabled by default.

lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2737–2738 ↗(On Diff #221881)

That's true, but in the case we don't have all function symbols, we're down to guessing anyway. For all you know, treating the rest of code as thumb might actually be the right thing to do. (In fact it will almost certainly be correct at least for the next few instructions, since the entry point function is unlikely to be just one instruction long.)

The FDE parsing code actually does exactly the same thing you do here. It searches for possible function entry points, and creates symbols for those locations, if we don't have them already. Since it doesn't know the size of the those symbols it just sets size_is_valid=false, and lets it be auto-computed.

That's why I'm pushing to make this consistent with the FDE code, as adding this additional symbol actually improves the accuracy of the sizes computed for the FDE symbols. Likewise, if you set size_is_valid=false here, then any additional symbol-searching heuristics we introduce will improve the accuracy of the entry symbol size.

Not everyone has both of these things enabled (particularly having lld is less common), but they are things that one _can_ enable no matter what is his host or target architecture, so I am not worried by that. And as tests like these become more common, I expect more people will start having these enabled by default.

Will the tests running on the build bots have lld, arm enabled?

aadsm added a comment.Sep 27 2019, 4:40 PM

(substituting llvm-mc for as and ld.lld for linker).

I was not able to figure out how to generate the object file with llvm-mc correctly. I've tried a few variations of the triple (e.g.: armv7-eabi) but I never end up with the same code that I have in the assembly: # RUN: llvm-mc -triple=thumbv7 %s -filetype=obj -o %t.o. Any idea here?

Not everyone has both of these things enabled (particularly having lld is less common), but they are things that one _can_ enable no matter what is his host or target architecture, so I am not worried by that. And as tests like these become more common, I expect more people will start having these enabled by default.

Will the tests running on the build bots have lld, arm enabled?

All of the current "buildbot" bots have it. We even have an "experimental" arm buildbot running on arm hardware, but that one is red all the time :(. "Greendragon" don't have lld iirc, but one day I might try to convince them to enable it too. :)

(substituting llvm-mc for as and ld.lld for linker).

I was not able to figure out how to generate the object file with llvm-mc correctly. I've tried a few variations of the triple (e.g.: armv7-eabi) but I never end up with the same code that I have in the assembly: # RUN: llvm-mc -triple=thumbv7 %s -filetype=obj -o %t.o. Any idea here?

Changing the input to:

movs r0, #42
movs r7, #1
svc #0

seems to do the trick to me (this how objdump disassembles the gnu as produced binary). I guess that's because "mov" does not name a specific instruction and as and llvm-mc choose to interpret it differently. I think a case could be made for changing llvm-mc to pick the other (shorter) form by default, but that's probably not relevant now.

aadsm updated this revision to Diff 222490.Sep 30 2019, 1:43 PM

Add lit test to check dissassembly

Thanks for adding the lit test.

lldb/lit/SymbolFile/dissassemble-entry-point.s
6 ↗(On Diff #222490)

Can you delete this? I'm pretty sure the nested quoting is going to cause problems when running the test on windows... If you need it, you can use regexes {{.*}} to match the filename portion in the check lines below.

12 ↗(On Diff #222490)

I don't think this has any effect, as it differs case from the symbol below, so you should be able to just delete it.

lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2719 ↗(On Diff #222490)

You've removed the architecture check, but this check here means that the code only kicks in for entry points that happen to be at an odd address. That is definitely not right. If we're going to go through with making this stuff unconditional (and I still think we should) then this condition needs to go too. The only thing that needs to check for arm/thumb is the m_address_class_map[opcode_addr] = AddressClass::eCodeAlternateISA; line down below.

Also, the long comment above needs to be redone to reflect the new reality. I'd just give a short comment that we're creating the entry point symbol if there isn't one, and that getting the address class right is important for expression evaluation on arm.

aadsm updated this revision to Diff 222680.Oct 1 2019, 1:10 PM

Update tests and fix logic to correctly add the symtab entry.

labath added a comment.Oct 2 2019, 5:51 AM

Thanks. The code looks fine now, but it doesn't look like you've addressed my comments in the test (or at least, they didn't make it into the uploaded version). Also, I don't think the long comment before the code in ObjectFileELF really reflects what the code does right now.

aadsm added a comment.Oct 2 2019, 8:16 AM

I actually just completely missed the first comment on the .s test and forgot about the comment 😇.

aadsm updated this revision to Diff 222848.Oct 2 2019, 9:33 AM

Update comment and test

labath accepted this revision.Oct 2 2019, 10:56 AM
labath added inline comments.
lldb/lit/SymbolFile/dissassemble-entry-point.s
10 ↗(On Diff #222848)

This .global _Start is also unneeded, as it does not even match the actual _start symbol below.

This revision is now accepted and ready to land.Oct 2 2019, 10:56 AM
aadsm updated this revision to Diff 222967.Oct 2 2019, 10:24 PM

Remove .global _Start

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2019, 5:09 PM
aadsm added a comment.Oct 3 2019, 6:58 PM

I ended reverting this because of SymbolFile/Breakpad/symtab.test. It seems that in this test we add symbols after the synthetic entry point symbol is added creating confusion.
I think I'll need to change the code that adds symbols in Symtab to explicitly check if we're overlapping the synthetic entry point and reduce the entry point or remove it in the case it overlaps at the beginning.

labath added a comment.Oct 4 2019, 2:32 AM

Yea, I see what you mean. I wouldn't spend too much time on fixing that though. The ability for SymbolFiles to add symtab entries is a fairly new thing. Not all issues with it have been ironed out, and I don't think it's up to you to fix them. The same kind of conflict could have happened with any other synthetic symbol (e.g. the eh_frame stuff) being added by the object file, and you were just unlucky that I wrote that particular test to check for the entry point symbol. I think you can just tweak the test it doesn't trigger this issue (e.g. you can just delete the entry point from the yamlized elf file being used as input).

lldb/trunk/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
2714

When playing around with the test failure, I realized that you're creating this entry point symbol even if the entry point is empty/zero. I think you should only create it if is matching a known section (and maybe also only if this object is an executable file (not a library)). In fact, probably GetEntryPointAddress should check that, and not return anything in this case.

2719

This actually shouldn't be needed if you saved the original Address object returned from GetEntryPointAddress() as the section should already be present there.

aadsm added a comment.Oct 4 2019, 8:35 AM

Ok. Initially I thought that with the entry point we were making big assumptions there but after reading the parse unwind symbols I guess it's really no big difference.
I guess my main concern is that the user can no longer create symbols within the span of the entry point symbol, even if they happen to know better (it's not even possible to manually remove symbols). But like you said, the same is true with the other synthetic symbols..