This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Let DataExtractor deal with two-byte addresses
ClosedPublic

Authored by aykevl on Feb 4 2020, 8:00 AM.

Details

Summary

AVR usually uses two byte addresses. By making DataExtractor deal with this, it is possible to load AVR binaries that don't have debug info associated with them.


This is a more focused version of D73961. With this, I can load an avr-gcc compiled binary and connect to a gdb-remote as long as it doesn't include any instructions that cannot yet be disassembled (see D73911 and D73958).

Diff Detail

Event Timeline

aykevl created this revision.Feb 4 2020, 8:00 AM
aykevl updated this revision to Diff 242340.Feb 4 2020, 8:01 AM

Added a unit test.

I think it would also be a good idea to test this by loading an AVR binary and connecting to a debugger (thus triggering this code), but I don't know how to add such a test. Suggestions would be appreciated.

labath added a comment.Feb 4 2020, 3:38 PM

I think this is a much better approach than the other do-in-all-together patch -- we should take this one step at a time.

I do have one question though. Will the DataExtractor actually do something reasonable for non-power-of-2 sizes (5,6,7) ? If yes, then great -- if not, we should keep the assert as 2 || 4 || 8.

I agree that more tests would be useful, but I would not consider that as a requirement for this patch. If you want, you can submit extra tests in separate patches.

shafik added a subscriber: shafik.Feb 4 2020, 10:10 PM

Since we are doing the same test all over m_addr_size >= 1 && m_addr_size <= 8 can we just make it a function and avoid the repetition and potential erroneous updating later on that does not fix them all?

andwar added a subscriber: andwar.Feb 5 2020, 10:53 AM
aykevl updated this revision to Diff 242694.EditedFeb 5 2020, 11:01 AM
aykevl set the repository for this revision to rG LLVM Github Monorepo.

I do have one question though. Will the DataExtractor actually do something reasonable for non-power-of-2 sizes (5,6,7) ? If yes, then great -- if not, we should keep the assert as 2 || 4 || 8.

I checked, and the only places it is actually used is in GetAddress, GetAddress_unchecked, and GetPointer. All those end up at GetMaxU64 (or the unchecked variant) which seems to take care of odd pointer lengths. I have added a test case to make sure this will keep working.

Since we are doing the same test all over m_addr_size >= 1 && m_addr_size <= 8 can we just make it a function and avoid the repetition and potential erroneous updating later on that does not fix them all?

Right now there are asserts both when constructing/copying(?) the object (5 asserts) and at the place where m_addr_size is used (3 asserts). I would propose picking one place, such as where it is used. That would reduce the number of asserts to 3 and keep them close together. What do you think?

Sidenote: GetAddress and GetPointer are the same function but with a different name. Even the doc comment right before is almost identical. I tried tracing back where they came from and they can both be traced back to the initial LLDB checin from Apple. Maybe this is worth deduplicating eventually.

labath added a comment.Feb 5 2020, 2:36 PM

I do have one question though. Will the DataExtractor actually do something reasonable for non-power-of-2 sizes (5,6,7) ? If yes, then great -- if not, we should keep the assert as 2 || 4 || 8.

I checked, and the only places it is actually used is in GetAddress, GetAddress_unchecked, and GetPointer. All those end up at GetMaxU64 (or the unchecked variant) which seems to take care of odd pointer lengths. I have added a test case to make sure this will keep working.

Awesome, thanks.

Since we are doing the same test all over m_addr_size >= 1 && m_addr_size <= 8 can we just make it a function and avoid the repetition and potential erroneous updating later on that does not fix them all?

Right now there are asserts both when constructing/copying(?) the object (5 asserts) and at the place where m_addr_size is used (3 asserts). I would propose picking one place, such as where it is used. That would reduce the number of asserts to 3 and keep them close together. What do you think?

I agree only one of those two places should be enough. My idea was to make the constructors delegate to one another (if necessary by creating a private uber-constructor that everybody can delegate to). Then we could have only one assert.

But if that's hard for some reason, your proposal seems fine too.

Sidenote: GetAddress and GetPointer are the same function but with a different name. Even the doc comment right before is almost identical. I tried tracing back where they came from and they can both be traced back to the initial LLDB checin from Apple. Maybe this is worth deduplicating eventually.

Indeed, I think I'll do that when I find a bit of time.

aykevl added a comment.Feb 6 2020, 3:43 AM

Right now there are asserts both when constructing/copying(?) the object (5 asserts) and at the place where m_addr_size is used (3 asserts). I would propose picking one place, such as where it is used. That would reduce the number of asserts to 3 and keep them close together. What do you think?

I agree only one of those two places should be enough. My idea was to make the constructors delegate to one another (if necessary by creating a private uber-constructor that everybody can delegate to). Then we could have only one assert.

Should I do that in the same patch, or in a different one? Before or after this patch? It seems like doing the constructor merging in this patch would make it more complicated than necessary.

labath added a comment.Feb 6 2020, 8:33 AM

Right now there are asserts both when constructing/copying(?) the object (5 asserts) and at the place where m_addr_size is used (3 asserts). I would propose picking one place, such as where it is used. That would reduce the number of asserts to 3 and keep them close together. What do you think?

I agree only one of those two places should be enough. My idea was to make the constructors delegate to one another (if necessary by creating a private uber-constructor that everybody can delegate to). Then we could have only one assert.

Should I do that in the same patch, or in a different one? Before or after this patch? It seems like doing the constructor merging in this patch would make it more complicated than necessary.

My hope is that this will be a relatively simple thing, in which case it would be ok to put it into this patch. But I am fine with doing it separately as well...

This looks good to me from an AVR point of view. I don't have very much experience with LLDB however.

aykevl updated this revision to Diff 246443.Feb 25 2020, 7:00 AM

Rebase on master (after GetPointer was removed).

I agree only one of those two places should be enough. My idea was to make the constructors delegate to one another (if necessary by creating a private uber-constructor that everybody can delegate to). Then we could have only one assert.

But if that's hard for some reason, your proposal seems fine too.

Honestly I don't feel very comfortable changing the constructors. I'm not very experienced with C++ and I'm afraid I mess something up. I could remove them from all constructors, reducing the number of asserts from 7 to 3?

labath accepted this revision.Feb 25 2020, 7:11 AM

I think this is fine. There's no need to hold this up over the assert duplication issue, as this does not make the situation any worse.

Thanks for the patch.

This revision is now accepted and ready to land.Feb 25 2020, 7:11 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the review!