This is an archive of the discontinued LLVM Phabricator instance.

Allow generic arm ArchSpec to merge with specific arm ArchSpec; allow Cortex M0-7's to always force thumb mode
ClosedPublic

Authored by jasonmolenda on Oct 8 2015, 7:46 PM.

Details

Summary

This patch is to address a few issues I came up with when working on an armcc-generated ELF binary being run on a Cortex-M4 core with lldb. The first problem is that the elf binary comes in as a generic "arm" triple ("arm--") and the second is that we don't see to get any arm/thumb hints about the functions in this binary.

This is a Cortex-M4 ("armv7em") so the instructions are always thumb.

When lldb calls the instruction unwinder (FuncUnwinders::GetUnwindAssemblyProfiler) it gets the object file's arch and merges in the Target's arch via ArchSpec::MergeFrom. The object file has "arm--", the target has "armv7em-unknown-unknown". The changes to MergeFrom allow us to replace the "arm" cpu part of the triple with the Target's "armv7em".

I wanted to centralize the "Cortex M0-M7 are always thumb" special knowledge a little better - we already had two examples where we were checking against specific core types (armv6m, armv7m, armv7em) in the disassembler. So I added a method to ArchSpec to check that. It's a very arm specific method name, but it seems a bit of an arm specific problem that we're trying to identify here.

Down in the instruction emulation I use the ArchSpec method to make sure we don't try to emulate anything as arm because we're missing the arm/thumb hints from the ObjectFile.

I imagine at some point I'll want to do something in SysV-Arm ABI plugin for the DefaultUnwindPlan to say that if this core is always running thumb instructions, the frame pointer is r7 for instance. That default UnwindPlan which uses the arm r11 as a frame pointer is a problem for the unwinder's behavior in a real mixed arm/thumb environment - we sidestep it inside Apple be using r7 as the frame pointer for both types of code.

Tamas, I put you as the reviewer because this is closest to what you work on regularly. Please let me know what you think when you have a chance.

Diff Detail

Repository
rL LLVM

Event Timeline

jasonmolenda retitled this revision from to Allow generic arm ArchSpec to merge with specific arm ArchSpec; allow Cortex M0-7's to always force thumb mode.
jasonmolenda updated this object.
jasonmolenda added a reviewer: tberghammer.
jasonmolenda set the repository for this revision to rL LLVM.
jasonmolenda added subscribers: lldb-commits, clayborg.

Hi Jason,

This has nothing to do with your patch per se, but we have accurate target descriptions in LLVM, and the parser for all the triples and extra options is now publicly available, so I was wondering if (probably after this go in), you could have a look at using these libraries.

I'm of course assuming you already use other LLVM libraries in LLDB, as it would make sense to use the same target description classes to understand about the target without re-writing everything.

Just a thought.

cheers,
--renato

tberghammer edited edge metadata.Oct 9 2015, 5:30 AM

In general I like the approach you are taking and creating IsAlwaysThumbInstructions sounds like a good idea, but I would implement it a bit differently.

llvm::Triple already contains a SubArch field what is filled in from the first part of the triple in case of arm so I would prefer to use that instead because then we update the architecture information inside the Triple also what is used in many places (I would expect that the information you put into the core field will get lost at some place).

Just for curiosity, how do you debug a Cortex-M4 processor? Do you have PlatformJTAG/ProcessJTAG implementations or do you use debugserver on them?

Hi Renato & Tamas, thanks for the feedback.

I'm trying to rewrite IsAlwaysThumbInstructions() to use the information that llvm already has, as per Renato's suggestion. The MCSubtargetInfo has a getFeatureBits() method which can indicate ARM::FeatureNoARM. I'm still having a little bit of trouble getting it to build, the ARMFeatures.h is in an unusual location and it indirectly pulls in ARMGenRegisterInfo.inc, ARMGenInstrInfo.inc, and ARMGenSubtargetInfo.inc from the build dir - I'll need to update xcode and cmake to add the $(llvm-build-dir)/lib/Target/ARM/ to the include path for these to be found. I might not have time to finish this tonight, we'll see.

I'm not sure how expensive it is to make an llvm::Target and get the MCSubtargetInfo out of that every call; I may need to save something in an ArchSpec ivar. Maybe just the results of the FeatureBits check.

Tamas, in this case the people developing this are using an apple internal JTAG interface. But I was working with a group a while ago doing some work with a Cortex-M0 generic board using the Segger J-Link jtag device. In both cases, the JTAG speaks generic gdb-remote protocol. lldb doesn't have anything JTAG specific - it's using ProcessGDBRemote and the generic "host" platform. There's no DynamicLoad plugin activated - developers in these environments need to specify where their binaries are loaded to lldb.

I'm trying to rewrite IsAlwaysThumbInstructions() to use the information that llvm already has, as per Renato's suggestion. The MCSubtargetInfo has a getFeatureBits() method which can indicate ARM::FeatureNoARM. I'm still having a little bit of trouble getting it to build, the ARMFeatures.h is in an unusual location and it indirectly pulls in ARMGenRegisterInfo.inc, ARMGenInstrInfo.inc, and ARMGenSubtargetInfo.inc from the build dir - I'll need to update xcode and cmake to add the $(llvm-build-dir)/lib/Target/ARM/ to the include path for these to be found. I might not have time to finish this tonight, we'll see.

The target information classes depend on table-generated files, so if you were not building them before, you'll need to do that now. I though LLDB was doing this already, that's why I suggested lightly. But ultimately, I believe you *don't* want to have your own target description anyway, so it's a move that you'd have to do anyway.

I'm not sure how expensive it is to make an llvm::Target and get the MCSubtargetInfo out of that every call; I may need to save something in an ArchSpec ivar. Maybe just the results of the FeatureBits check.

I'm not sure you should be doing that. Those calls normally involve traversing tables and correlating information. Even though that gets optimised pretty well, it's not something that you want to be delaying delicate hardware handling of breakpoints, run states etc.

I'd guess you could have a local cache with only the information you need, and you initialise it from the target description classes once. You may end up with some duplication, but hopefully, it won't be too much and it'll be a lot faster.

--renato

jasonmolenda edited edge metadata.

Quick update of the patch using the llvm target information. Builds and works correctly on macosx with xcodebuild. Doesn't build with cmake yet; I don't have the includes set up correctly for the llvm build-dir lib/Target/ARM directory yet. Also doesn't cache the result of the llvm target features check.

clayborg accepted this revision.Oct 13 2015, 9:53 AM
clayborg added a reviewer: clayborg.
This revision is now accepted and ready to land.Oct 13 2015, 9:53 AM
tberghammer added inline comments.Oct 13 2015, 9:58 AM
source/Core/ArchSpec.cpp
859–870

I would like to see this part of the code to be replaced with something like this so we store the sub-architecture inside the triple and we don't have to special case arm (it is already done in llvm::Triple):

if (GetTriple().getSubArch() == llvm::Triple::NoSubArch)
    GetTriple().setSubArch(other.GetTriple().getSubArch());
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in rL265377.