Page MenuHomePhabricator

Fix arm64 floating point register spill recording in UnwindPlan analysis
ClosedPublic

Authored by jasonmolenda on Oct 20 2016, 9:42 PM.

Details

Summary

When we're creating an UnwindPlan for arm64 functions that spill a floating point register, we don't decode the vector register numbers correctly and we don't have a way to represent the fact that the lower 64-bits of the registers are being saved to the stack (as the AAPCS64 ABI dictates). The first fix is easiest, the second is a bit more involved. The patch I have here may be a bit controversial, I'd like feedback from Greg and Tamas.

RegisterInfos_arm64.h defines the eRegisterKindLLDB register numbers used on Linux, Darwin, and FreeBSD. It also defines the layout of these registers in the register context buffer.

I add the "w" pseudo registers to represent the low 32 bits of the "x" registers, and "s" and "d" registers to represent the low 32/64 bits of the "v" registers. I add definitions for how to find those pseudo registers in the register context in RegisterInfos_arm64.h. It's not possible to specify this in a bi-endian correct way, so I've renamed the table from "g_register_infos_arm64" to "g_register_infos_arm64_le". I used defines to pick the offset of the w, s, and d registers so it will be easy (by duplicating the table) to have a correct definition for big-endian if we add support for a target like that in the future.

I hoisted the register number definitions (the eRegisterKindLLDB register numbers) out of RegisterInfos_arm64.h into a separate file so they aren't co-mingled with the register context definitions. I have them in Utility/ARM64_LLDB_Registers.h currently.

I switched EmulateInstructionARM64 and TestArm64InstEmulation to create / test against unwind plans using the eRegisterKindLLDB register numbers because DWARF doesn't have codepoints defined for the "s" or "d" registers.

I fixed the STP register decoding in EmulateInstructionARM64 for the vector registers.

I added a unit test to test that these spill and restore instructions are included in the UnwindPlan correctly.

Here are some reasons why this patch isn't usable as-is.

First, Greg's objection is along the lines that it is not correct to have a generic eRegisterKindLLDB register table shared by multiple targets. Today all arm64 targets have the same reg set, but it's possible to imagine an arm64 Cortex-M core that might not have the vector registers, for instance. If such a target used a generic table that defines v0, it might try to display that register as zeros or give an error about how that register is unavailable or something.

Second, Tamas is going to notice that there's a fourth arm64 RegisterContext file - RegisterContextPOSIX_arm64 - and a header file already defining eRegisterKindLLDB register numbers for arm64 in lldb-arm64-register-enums.h (similar tables also exist for arm, s390x, x86, and mips with different ones for different targets). I'm duplicating that file. To be honest, I only just noticed that file right before I was writing up this patch and I didn't want to wait an extra day to roll the register definitions into one.

I don't have a strong preference for whether I keep the Utility/ARM64_LLDB_Registers.h or whether I add the w, s, and d register numbers to lldb-arm64-register-enums.h and use that.

Diff Detail

Repository
rL LLVM

Event Timeline

jasonmolenda retitled this revision from to Fix arm64 floating point register spill recording in UnwindPlan analysis.
jasonmolenda updated this object.
jasonmolenda added reviewers: clayborg, tberghammer.
jasonmolenda set the repository for this revision to rL LLVM.
jasonmolenda added a subscriber: lldb-commits.
tberghammer edited edge metadata.Oct 21 2016, 6:19 AM

A few high level comments:

  • I have the feeling you ported much more code over to use the LLDB register numbers then it would be strictly necessary. I am not sure if it is good or bad as it can help us consolidate the confusion around the different register numbering schema but it icnreases the size of this patch by a lot and also adds a lot of new boilerplate code.
  • I would prefer to keep lldb-arm64-register-enums.h over the new file you added as your code contains a lot of logic in ARM64_LLDB_Registers.cpp (e.g. to string conversion) what seems unnecessary to me as the register tables you modified recently already contains all information to get from register number to register name
  • Regarding Greg's concern I think we should have a single table (RegisterInfoInterface) containing all registers what can appear on the target and then the register context should be the one who detects which one is available and then returns the correct one accordingly. We already have functionality like this for x86_64 where RegisterInfos_x86_64.h and lldb-x86-register-enums.h defines all possibly available register and then NativeRegisterContextLinux_x86_64 detects if the AVX and MPX registers are available or not and then report the register list accordingly.

In mid/long term I think we should clean up the full register info handling code to simplify all of this in the following lines (vague ideas):

  • Implement a single copy of RegisterInfoInterface for every architecture what returns the correct list of registers based on the architecture and sub-architecture
  • Define register sets inside the RegisterInfoInterface instead of inside the RegistrerContext... classes
  • Change EmulateInstruction... to use the RegisterInfoInterface to get a RegisterInfo object from the register numbers
  • Change all RegisterContext... to use the information from the RegisterInfoInterface... classes instead of duplicating it into them

Doing all of the cleanup will be a lot of work so I am not asking you to do it or I don't want to wait with this change for that long but wanted to raise it here so we don't start to move in a direction what will make later improvements more difficult

source/Plugins/Instruction/ARM64/EmulateInstructionARM64.cpp
196–197

What do you think about teaching this function to handle both eRegisterKindDWARF and eRegisterKindLLDB? That way you can use both in the emulation code. Also that would me that you don't have to update all usage inside the emulation code.

source/Utility/ARM64_LLDB_Registers.cpp
18

I think this function is completely unnecessary if you use the g_register_infos_arm64_le table (or one of it's wrapper)

Hi Tamas, sorry for not replying earlier, something urgent came up that I needed to look at.

Thanks for the review. I agree with using your existing arm64 register enums in lldb-arm64-register-enums.h. I'd like to remove the enums in RegisterInfos_arm64.h instead of having two copies that could diverge.

You asked about having EmulateInstructionARM64 handle both eRegisterKindDWARF and eRegisterKindLLDB. I'm not sure how that would work - each UnwindPlan must can use only one register numbering scheme. We could use eRegisterKindDWARF if there were no floating point register save/restores - but that seems a bit complicated; we'd conditionally use eRegisterKindLLDB sometimes, eRegisterKindDWARF most of the time.

As for the GetRegisterName() function in ARM64_LLDB_Registers.cpp, we could have EmulateInstructionARM64 with the register file definition (we need to define seven preprocessor symbols before we can include RegisterInfos_arm64.h and then we'll need to write a GetRegisterName() method in EmulateInstructionARM64.cpp. Wouldn't it be clearer to have a lldb-arm64-register-enums.cpp with this method?

I was talking about these register numbering problems with Greg Clayton and he half-jokingly said we should switch from register numbers to using ConstString register names in the UnwindPlans. It's not a bad idea! As long as everyone can agree on the register names for an architecture, anyway.

jasonmolenda edited edge metadata.

I believe this rewrite of the original patch addresses Tamas' feedback (thanks Tamas!) I dropped a lot of the changes in the original; I switch EmulateInstructionARM64 from using DWARF register numbering to use LLDB register numbering and I am using the numbers from lldb-arm64-register-enums.h. I get the register definitions (size, etc) from RegisterInfos_arm64.h.

I also fix the original bug that I was doing this for - to recognize the callee saves of the floating point register spills. And added a unit test to test for that.

Tamas, when you have a minute could you look at this again and let me know what you think? Thanks.

labath added a subscriber: labath.Oct 28 2016, 3:54 AM

Tamas is not in this week. He should be back on monday.

tberghammer accepted this revision.Oct 31 2016, 9:35 AM
tberghammer edited edge metadata.

The patch generally looks good to me. I added a few high level thought about register context but they are clearly out of scope for this change. Also next time please upload your patch with full context as it is much easier to review that way.

Hi Tamas, sorry for not replying earlier, something urgent came up that I needed to look at.

Thanks for the review. I agree with using your existing arm64 register enums in lldb-arm64-register-enums.h. I'd like to remove the enums in RegisterInfos_arm64.h instead of having two copies that could diverge.

You asked about having EmulateInstructionARM64 handle both eRegisterKindDWARF and eRegisterKindLLDB. I'm not sure how that would work - each UnwindPlan must can use only one register numbering scheme. We could use eRegisterKindDWARF if there were no floating point register save/restores - but that seems a bit complicated; we'd conditionally use eRegisterKindLLDB sometimes, eRegisterKindDWARF most of the time.

My idea is based on the fact that every register access goes through the GetRegisterInfo function what gets a "register kind" and a "register number" so if we can teach that function to be able to fetch the correct register info both for eRegisterKindLLDB and for eRegisterKindDWARF then the actual emulate functions can use both. It would mean something like the following, but I don't know what should we put in place of the "...":

if (reg_kind == eRegisterKindLLDB)
  return LLDBTableGetRegisterInfo(reg_num, reg_info);
if (reg_kind == eRegisterKindDwarf)
  return ...

As for the GetRegisterName() function in ARM64_LLDB_Registers.cpp, we could have EmulateInstructionARM64 with the register file definition (we need to define seven preprocessor symbols before we can include RegisterInfos_arm64.h and then we'll need to write a GetRegisterName() method in EmulateInstructionARM64.cpp. Wouldn't it be clearer to have a lldb-arm64-register-enums.cpp with this method?

RegisterInfos_arm64.h already contains a register name and an alternative name so I would prefer to use them as they are already specified but I have no problem with having a GetRegisterName() function if we can have a single copy of it. My main problem with your previous approach was that you are adding a second copy of it (current one is in ARM64_DWARF_Registers.cpp)

Regarding including RegisterInfos_arm64.h I am not too keen on the approach of having a large table and defining a set of macros as it is very verbose and limits the flexibility (e.g. big/little endian issue we found recently) but I haven't come up with a better idea yet.

I was talking about these register numbering problems with Greg Clayton and he half-jokingly said we should switch from register numbers to using ConstString register names in the UnwindPlans. It's not a bad idea! As long as everyone can agree on the register names for an architecture, anyway.

Personally I really don't like ConstString because of the global pool behind it (kind of leaks memory) so I think having some sort of register number is better. I think a betetr approach would be to agree that inside LLDB everything should use eRegisterKindLLDB and we do the register number translation at the communication boundaries (gdb-remote communication, eh_frame parsing, dwarf reading)

This revision is now accepted and ready to land.Oct 31 2016, 9:35 AM
jasonmolenda closed this revision.Oct 31 2016, 6:37 PM

Committed in r285662.