Add basic support, i386 version will be added later.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I had difficulties with this code, as each platform implements it differently.
What is DBG? What should be there?
What is UserArea? What should be there?
This class is one of the parts of the infrastructure needed to access registers for a given platform. These show up on the LLDB side rather than the process monitor side. So it should be LLDB using these to access register state, from any platform (e.g. used by LLDB running on perhaps Linux, debugging a process on NetBSD - this class would be used then).
(There are a set of classes called NativeRegisterContext* that are used in process monitoring directly on target, used by the process monitoring infrastructure as found in lldb-server - these are different and have different requirements).
Since this class needs to exist on any host OS, it will not include system headers to define these structures since, say in the scenario above, Linux wouldn't have the system headers necessary to define these structures for NetBSD. So they need to be defined size-wise correct for any platform that would compile it. (Otherwise it would be tempting to include the platform-specific system headers that already define all this data for you).
What is DBG? What should be there?
These are the debug control registers:
https://en.wikipedia.org/wiki/X86_debug_register
What is UserArea? What should be there?
This is the structure that the OS typically defines. If you trace down the NetBSD existing debugging code headers, there will be something like this in there. You want to mimic what this is as it should be the same memory layout as provided by NetBSD. It almost certainly will look like this, since all these OSes for a given architecture generally have to represent the data however the underlying architecture allows them to expose it.
If you have trouble tracking that down, I can dig around some to find out what header it shows up in.
The important part is that you look to see what NetBSD exposes register-wise for a given process. Even though the architecture is the same across several different versions of this class, it is possible for an OS to tweak the way the CPU starts up and therefore influence which register sets are present. That kind of OS/register interaction gets handled here. So when LLDB is poking around the register file, it knows what it can ask for and how to get it. In addition to live threads modeled in LLDB, I think these instances are also used when looking at post-mortem debugging - the core file-style processes will hand these out to LLDB as well for a given thread. (In general we are asking about registers in the context of a given thread, so one of the ways to trace how these work is by looking at the Thread class and its register context retrieval function).
If all else fails when trying to look up the NetBSD expectations about register access, you can start with the ptrace interface (man ptrace). It should have some way for you to ask for some kind of register-file-offset-based value and a width to retrieve content from the registers for a process being ptraced. The docs there should tell you how to find out what are valid register offsets, which should lead you to the data and expectations for the user area layout. (This is assuming the ptrace mechanism on NetBSD looks anything similar to other Unix variants).
@tfiala thank you for your verbose and quick feedback. It made the things more clear!
While I'm processing your reply -- a question, does it matter whether we define GPR as signed or unsigned integers? Is it platform specific part too? Should I change this to uint64_t?
My pleasure!
While I'm processing your reply -- a question, does it matter whether we define GPR as signed or unsigned integers? Is it platform specific part too? Should I change this to uint64_t?
I don't remember off the top of my head. I'll have a look later when I can get to the code. For starters, though, I would follow whatever the Linux and OS X ones do for this as it shouldn't be too different than that. (My first guess would be we just treat them as unsigned bytes at the raw level, and then adjust them as needed based on what data types we think are specified in the registers, but I could be wrong.)
It looks like UserArea will need to map this structure:
typedef struct { __gregset_t __gregs; __greg_t _mc_tlsbase; __fpregset_t __fpregs; } mcontext_t;
- src/sys/arch/amd64/include/mcontext.h
https://github.com/IIJ-NetBSD/netbsd-src/blob/master/sys/arch/amd64/include/mcontext.h#L61
I'm thinking if I need to customize FPR as well:
/* * Floating point register state * The format of __fpregset_t is that of the fxsave instruction * which requires 16 byte alignment. However the mcontext version * is never directly accessed. */ typedef char __fpregset_t[512] __aligned(8);
https://github.com/IIJ-NetBSD/netbsd-src/blob/master/sys/arch/amd64/include/mcontext.h#L53
Joerg Sonnenberg (I forgot @login) wrote:
I think you want to keep at least the mc_tlsbase field as well.
Access to __thread variables will need it.
This will be handled by new UserArea.
I will research if I can skip the definition of struct DBG, as currently there is no support for debug registers on NetBSD.
This kind of stuff:
struct UserArea { GPR gpr; FPR fpr; DBG dbg; };
Was done on MacOSX systems because we have thread_get_state() and thread_set_state() that take a flavor (enumeration) that specifies which registers we should read. It also takes a pointer to a structure to fill in and the size in bytes of that structure. For this reason, we ended up making a UserArea that looked like above.
What you should be doing for your register context is duplicating the structures that are in the system headers so that they can be compiled on a system without these header files. So, this should be a duplication (don't use the exact same names for the structure types):
typedef struct { __gregset_t __gregs; __greg_t _mc_tlsbase; __fpregset_t __fpregs; } mcontext_t;
Then if you have only one "read all registers" call that fills in all registers, you should make any single register read kick off a single read of all registers and mark all registers as valid. For MacOSX, a register read will trigger a thread_get_state() and the GPR, FPU or DBG register flavor will be used to fill in the register context for the thread and all registers one of GPR, FPU, DBG will be marked as valid. If you have an API on your system that allows each register to be read/written individually, you could do them separately and mark each individual register valid after it is read. So this really depends on your system interface for reading registers. There is usually ptrace() calls for most unix based systems to read registers and you end up reading them in sets.
The other thing to note, is that LLDB expects your registers numbers to start at zero, and never have any register number gaps.
So as long as your changes adhere to what was said above, it is a good start.
Thank you @clayborg for your notes. I'm planning to submit new version in the coming days.
- I was trying to comment out DBG registers (as unsupported by NetBSD) from RegisterInfos_x86_64.h with the following patch:
diff --git a/lldb-git/patches/patch-lldb_source_Plugins_Process_Utility_RegisterInfos__x86__64.h b/lldb-git/patches/patch-lldb_source_Plugins_Process_Utility_RegisterInfos__x86__64.h new file mode 100644 index 0000000..280330e --- /dev/null +++ b/lldb-git/patches/patch-lldb_source_Plugins_Process_Utility_RegisterInfos__x86__64.h @@ -0,0 +1,74 @@ +$NetBSD$ + +--- lldb/source/Plugins/Process/Utility/RegisterInfos_x86_64.h.orig 2015-12-06 02:57:30.000000000 +0000 ++++ lldb/source/Plugins/Process/Utility/RegisterInfos_x86_64.h +@@ -48,7 +48,9 @@ + // Number of bytes needed to represent a YMM register. + #define YMM_SIZE sizeof(YMMReg) + ++#ifndef NO_DEBUGREGS + #define DR_SIZE sizeof(((DBG*)NULL)->dr[0]) ++#endif + + // RegisterKind: EHFrame, DWARF, Generic, Process Plugin, LLDB + +@@ -85,10 +87,12 @@ + { dwarf_##reg##i##h_x86_64, dwarf_##reg##i##h_x86_64, LLDB_INVALID_REGNUM, LLDB_INVALID_REGNUM, lldb_##reg##i##_x86_64 }, \ + NULL, NULL } + ++#ifndef NO_DEBUGREGS + #define DEFINE_DR(reg, i) \ + { #reg#i, NULL, DR_SIZE, DR_OFFSET(i), eEncodingUint, eFormatHex, \ + { LLDB_INVALID_REGNUM, LLDB_INVALID_REGNUM, LLDB_INVALID_REGNUM, \ + LLDB_INVALID_REGNUM, LLDB_INVALID_REGNUM }, NULL, NULL } ++#endif + + #define DEFINE_GPR_PSEUDO_32(reg32, reg64) \ + { #reg32, NULL, 4, GPR_OFFSET(reg64), eEncodingUint, \ +@@ -251,6 +255,7 @@ g_register_infos_x86_64[] = + DEFINE_YMM(ymm, 14), + DEFINE_YMM(ymm, 15), + ++#ifndef NO_DEBUGREGS + // Debug registers for lldb internal use + DEFINE_DR(dr, 0), + DEFINE_DR(dr, 1), +@@ -259,7 +264,8 @@ g_register_infos_x86_64[] = + DEFINE_DR(dr, 4), + DEFINE_DR(dr, 5), + DEFINE_DR(dr, 6), +- DEFINE_DR(dr, 7) ++ DEFINE_DR(dr, 7), ++#endif + }; + + static_assert((sizeof(g_register_infos_x86_64) / sizeof(g_register_infos_x86_64[0])) == k_num_registers_x86_64, +@@ -314,10 +320,12 @@ do { + g_register_infos[lldb_##reg##i##_i386].byte_offset = YMM_OFFSET(i); \ + } while(false); + ++#ifndef NO_DEBUGREGS + #define UPDATE_DR_INFO(reg_index) \ + do { \ + g_register_infos[lldb_dr##reg_index##_i386].byte_offset = DR_OFFSET(reg_index); \ + } while(false); ++#endif + + // Update the register offsets + UPDATE_GPR_INFO(eax, rax); +@@ -400,6 +408,7 @@ do { + UPDATE_YMM_INFO(ymm, 6); + UPDATE_YMM_INFO(ymm, 7); + ++#ifndef NO_DEBUGREGS + UPDATE_DR_INFO(0); + UPDATE_DR_INFO(1); + UPDATE_DR_INFO(2); +@@ -408,6 +417,7 @@ do { + UPDATE_DR_INFO(5); + UPDATE_DR_INFO(6); + UPDATE_DR_INFO(7); ++#endif + + #undef UPDATE_GPR_INFO + #undef UPDATE_GPR_INFO_8H
But I get this assert being triggered:
In file included from /tmp/pkgsrc-tmp/wip/lldb-git/work/lldb/source/Plugins/Process/Utility/RegisterContextNetBSD_x86_64.cpp:59:0: /tmp/pkgsrc-tmp/wip/lldb-git/work/lldb/source/Plugins/Process/Utility/RegisterInfos_x86_64.h:271:1: error: static assertion failed: g_register_infos_x86_64 has wrong number of register infos static_assert((sizeof(g_register_infos_x86_64) / sizeof(g_register_infos_x86_64[0])) == k_num_registers_x86_64, ^
What's the correct approach to address it? Mirror modified RegisterInfos_x86_64.h in RegisterContextNetBSD_x86_64.cpp and removed/altered static_assert?
- I don't understand "marking registers as valid". NetBSD offers ptrace(2) call to with a pair of accessors set or get, one for REG and the other for FPREG.
Thank you in advance.
You usually make a sized array that matches your zero based enumerations for your registers. So the number of items in the g_register_infos_x86_64 structure should be the same as your zero based register number enumerations. I prefer to always make an enum like:
enum RegisterNumbers { eGPRRegisterRAX, eGPRRegisterRBX, eGPRRegisterRCX, eGPRRegisterRDX, k_num_registers };
Then define the RegisterInfo structure as a constant array that sizes itself:
static RegisterInfo g_register_infos_x86_64[] = { ... rax ..., ... rbx ..., ... rcx ..., ... rdx ..., ... rsi ..., };
Note that if we had this definition, that the assert you had would fire because our enumerations in RegisterNumbers has the number of registers as 4, but we have 5 in our g_register_infos_x86_64 structure. I do it this way to ensure that we don't get our enumeration definition off somehow and then access an array item in g_register_infos_x86_64 that is out of bounds.
What's the correct approach to address it? Mirror modified `RegisterInfos_x86_64.h` in `RegisterContextNetBSD_x86_64.cpp` and removed/altered `static_assert`? 2. I don't understand "marking registers as valid". NetBSD offers `ptrace`(2) call to with a pair of accessors set or get, one for `REG` and the other for `FPREG`.
Somehow you need to track which registers are valid and which aren't. If someone asks you to read "rax", you will read all GPR registers. The next time someone asks you to read "rbx", if you have already read the GPR registers, you can just grab the value from your local structure. The register context will be asked to invalidate its registers via the pure virtual function:
virtual void RegisterContext::InvalidateAllRegisters () = 0;
This would make you say "m_gpr_regs_valid = false;" and possible "m_fpu_regs_valid = false;" or something inside your register context to track which registers are valid. We want to avoid multiple ptrace calls to read all registers when they aren't needed. Also, if a registers is written, you will need to invalidate these as needed (if reg comes from GPR registers, then mark all GPRs as invalid, same for FPU reg). The GDB remote servers sometimes can only read single registers, one at a time, so a register context like that needs to track the validity of each register individually. So it really depends on your register context. So just make sure you correctly track when your registers are valid so you can avoid re-reading register sets via ptrace.
What's the correct approach to address it? Mirror modified RegisterInfos_x86_64.h in RegisterContextNetBSD_x86_64.cpp and removed/altered static_assert?
I think that's worth trying. That will let you have a contiguous, consistent register set.
Revamped version.
Changes:
- change type of registers from int64_t to uint64_t,
- remove unsupported debug x86 registers,
- model UserArea after mcontext_t
Patch building tested.
Seems reasonable to start. If there are issues, they'll show up when lldb starts using registers. This will get you started, though.
lldb/trunk/source/Plugins/Process/Utility/RegisterContextNetBSD_x86_64.cpp | ||
---|---|---|
308–309 ↗ | (On Diff #46610) | Unused function? |