Fix breakpoint trap opcode detection for arm linux
Looks good.
One thing to comment on: if you accidentally set an ARM breakpoint in Thumb code, you will hose your program by executing opcode 0x01f0 (the low 16 bits of the ARM trap opcode 0xe7f001f0):
ASR (immediate) (isa = T32, encoding = T2)
Arithmetic Shift Right (immediate)
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,---------------------------------------------------------------.
| 0   0   0 | 0   0 | 0   0   1   1   1 | 1   1   0 | 0   0   0 |
|           |  op   |       imm5        |    Rn     |    Rd     |
`---------------------------------------------------------------'
[12:11] op = 0 (0x0)
[10: 6] imm5 = 7 (0x7)
[ 5: 3] Rn = 6 (0x6)
[ 2: 0] Rd = 0 (0x0)
This is followed by a branch, decoded from the high 16 bits (0xe7f0):
B (isa = T32, encoding = T2)
Branch
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,---------------------------------------------------------------.
| 1   1   1   0   0 | 1   1   1   1   1   1   1   0   0   0   0 |
|                   |                   imm11                   |
`---------------------------------------------------------------'
[10: 0] imm11 = 2032 (0x7f0)
What we do is always try to use a 32-bit ARM instruction whose lower 16 bits would also trigger a Thumb breakpoint. If you look at the ARM opcode you are using:
UDF (isa = A32, encoding = A1)
Permanently Undefined
 31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,-------------------------------------------------------------------------------------------------------------------------------.
| 1   1   1   0   0   1   1   1   1   1   1   1 | 0   0   0   0   0   0   0   0   0   0   0   1 | 1   1   1   1 | 0   0   0   0 |
|                                               |                     imm12                     |               |     imm4      |
`-------------------------------------------------------------------------------------------------------------------------------'
[19: 8] imm12 = 1 (0x1)
[ 3: 0] imm4 = 0 (0x0)
And the thumb breakpoint opcode you are using:
B (isa = T32, encoding = T1)
Permanently Undefined
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,---------------------------------------------------------------.
| 1   1   0   1 | 1   1   1   0 | 0   0   0   0   0   0   0   1 |
|               |     cond      |             imm8              |
`---------------------------------------------------------------'
[11: 8] cond = 14 (0xe)
[ 7: 0] imm8 = 1 (0x1)
You can then play with the ARM instruction and modify imm12 and imm4, for example changing the opcode to 0xe7f0def1:
UDF (isa = A32, encoding = A1)
Permanently Undefined
 31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,-------------------------------------------------------------------------------------------------------------------------------.
| 1   1   1   0   0   1   1   1   1   1   1   1 | 0   0   0   0   1   1   0   1   1   1   1   0 | 1   1   1   1 | 0   0   0   1 |
|                                               |                     imm12                     |               |     imm4      |
`-------------------------------------------------------------------------------------------------------------------------------'
[19: 8] imm12 = 222 (0xde)
[ 3: 0] imm4 = 1 (0x1)
And for Thumb use 0xdef1:
B (isa = T32, encoding = T1)
Permanently Undefined
 15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
,---------------------------------------------------------------.
| 1   1   0   1 | 1   1   1   0 | 1   1   1   1   0   0   0   1 |
|               |     cond      |             imm8              |
`---------------------------------------------------------------'
[11: 8] cond = 14 (0xe)
[ 7: 0] imm8 = 241 (0xf1)
Now you have an ARM opcode that will mostly trigger a thumb breakpoint correctly even if you set it wrong. I say mostly because if you accidentally set the ARM breakpoint in the middle of a 32 bit Thumb instruction things could still go wrong.
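To make the bit-twiddling above concrete, here is a minimal C sketch that builds the two UDF encodings from their immediates and checks what a Thumb-mode fetch would see first. The helper names (arm_udf, thumb_udf, is_thumb_udf, first_thumb_halfword, check_trap_opcodes) are made up for illustration and are not LLDB functions; it also assumes a little-endian target, so the low halfword of the ARM opcode is the first Thumb halfword.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A32 UDF (encoding A1): 1110 0111 1111 | imm12 | 1111 | imm4,
 * where imm12:imm4 together form a 16-bit immediate. */
static uint32_t arm_udf(uint16_t imm16) {
    uint32_t imm12 = ((uint32_t)imm16 >> 4) & 0xFFFu;
    uint32_t imm4 = (uint32_t)imm16 & 0xFu;
    return 0xE7F000F0u | (imm12 << 8) | imm4;
}

/* T32 UDF (encoding T1): 1101 1110 | imm8. */
static uint16_t thumb_udf(uint8_t imm8) {
    return (uint16_t)(0xDE00u | imm8);
}

/* True if a 16-bit Thumb halfword is the permanently undefined encoding. */
static bool is_thumb_udf(uint16_t halfword) {
    return (halfword & 0xFF00u) == 0xDE00u;
}

/* Assumption: little-endian memory, so a Thumb fetch at the breakpoint
 * address sees the low 16 bits of the 32-bit ARM opcode first. */
static uint16_t first_thumb_halfword(uint32_t arm_opcode) {
    return (uint16_t)(arm_opcode & 0xFFFFu);
}

/* Hypothetical self-check using the opcodes discussed in this review. */
static void check_trap_opcodes(void) {
    /* Current opcode: Thumb would see 0x01f0, which is not a trap. */
    assert(arm_udf(0x0010) == 0xE7F001F0u);
    assert(!is_thumb_udf(first_thumb_halfword(0xE7F001F0u)));
    /* Suggested opcode: Thumb would see 0xdef1, which is UDF #0xf1. */
    assert(arm_udf(0x0DE1) == 0xE7F0DEF1u);
    assert(first_thumb_halfword(0xE7F0DEF1u) == thumb_udf(0xF1));
    assert(is_thumb_udf(first_thumb_halfword(0xE7F0DEF1u)));
}
```

Calling check_trap_opcodes() from a test harness should pass silently. Whether the kernel treats 0xe7f0def1 and 0xdef1 as breakpoint traps is a separate question that this sketch does not address.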
On MacOSX we use the actual BKPT instructions for ARM and Thumb that have the immediate values set correctly so the ARM BKPT works for Thumb as well:
static const uint8_t g_arm_breakpoint_opcode[] = { 0x70, 0xBE, 0x20, 0xE1 };
static const uint8_t g_thumb_breakpoint_opcode[] = { 0x70, 0xBE };
But if the opcodes in your patch are what your kernel recognizes, you will want to use what the kernel expects.
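As a rough illustration of why the MacOSX ARM opcode also works in Thumb code, the sketch below decodes the two byte arrays as little-endian values and checks them against the BKPT bit patterns. The helpers (read_le16, read_le32, is_arm_bkpt, is_thumb_bkpt, check_bkpt_opcodes) are hypothetical names for this example, not LLDB API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static const uint8_t g_arm_breakpoint_opcode[] = { 0x70, 0xBE, 0x20, 0xE1 };
static const uint8_t g_thumb_breakpoint_opcode[] = { 0x70, 0xBE };

/* Assemble little-endian bytes into a halfword or word. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* A32 BKPT (encoding A1): cond | 0001 0010 | imm12 | 0111 | imm4. */
static bool is_arm_bkpt(uint32_t word) {
    return (word & 0x0FF000F0u) == 0x01200070u;
}

/* T32 BKPT (encoding T1): 1011 1110 | imm8. */
static bool is_thumb_bkpt(uint16_t halfword) {
    return (halfword & 0xFF00u) == 0xBE00u;
}

/* Hypothetical self-check: the first two bytes of the ARM opcode are
 * exactly the Thumb opcode, so either mode hits a BKPT. */
static void check_bkpt_opcodes(void) {
    assert(read_le32(g_arm_breakpoint_opcode) == 0xE120BE70u);
    assert(is_arm_bkpt(read_le32(g_arm_breakpoint_opcode)));
    assert(read_le16(g_arm_breakpoint_opcode) ==
           read_le16(g_thumb_breakpoint_opcode));
    assert(is_thumb_bkpt(read_le16(g_arm_breakpoint_opcode)));
}
```

The immediates are chosen so that the low halfword 0xbe70 is itself a valid Thumb BKPT, which is the same trick as the 0xe7f0def1 suggestion but using BKPT instead of UDF.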
I almost forgot the main reason that you really want to be using the BKPT instructions: the Thumb IT (if/then/else) instruction...
If you have Thumb code that has an IF-THEN-ELSE (ITE) block:
0x1000: <opcode> ITE<condition>
0x1002: 0xXXXXYYYY # THEN conditional
0x1006: 0xZZZZ # ELSE conditional
Here 0xXXXXYYYY is a 32-bit Thumb instruction at address 0x1002 and 0xZZZZ is any 16-bit Thumb instruction at address 0x1006. Now set a Thumb breakpoint at 0x1002. This is what your code looks like now:
0x1000: <opcode> ITE<condition>
0x1002: 0xde01 # THEN conditional
0x1004: 0xYYYY # ELSE conditional: a completely incorrect 16-bit opcode that is half of the original 32-bit Thumb instruction
0x1006: 0xZZZZ # NOT CONDITIONAL ANYMORE!!!
So you really don't want to be using anything but the BKPT instruction. Why? Because BKPT is special: it will ALWAYS stop, even inside a conditional IT block when the condition doesn't match. So it stops you from hosing your code the way the second listing shows. This also means that LLDB will stop when it shouldn't, but fear not: LLDB already has support for figuring out that it is stopped at a condition that doesn't match, and it will auto-continue and "do the right thing".
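The IT block problem can be modelled with a few lines of C. This is only a toy model under stated assumptions: an IT instruction governs a fixed number of following instructions (two for ITE), and overwriting the first halfword of a 32-bit instruction with a 16-bit opcode makes the core see two 16-bit instructions there. The names it_block_end, is_conditional and check_it_block are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the address just past the last instruction governed by the IT
 * instruction. sizes[] holds the byte size (2 or 4) of each instruction
 * following the IT, starting at first_addr. */
static uint32_t it_block_end(uint32_t first_addr, const uint8_t *sizes,
                             size_t n_insns, size_t it_count) {
    assert(it_count <= n_insns);
    uint32_t addr = first_addr;
    for (size_t i = 0; i < it_count; i++) {
        addr += sizes[i];
    }
    return addr;
}

/* True if the instruction at addr is still inside the IT block. */
static bool is_conditional(uint32_t addr, uint32_t first_addr, uint32_t end) {
    return addr >= first_addr && addr < end;
}

/* Hypothetical self-check using the addresses from the listings above. */
static void check_it_block(void) {
    /* Original code: a 32-bit THEN instruction, then a 16-bit ELSE one. */
    const uint8_t original[] = { 4, 2 };
    /* After writing a 16-bit trap at 0x1002: three 16-bit instructions. */
    const uint8_t patched[] = { 2, 2, 2 };

    uint32_t end_original = it_block_end(0x1002, original, 2, 2);
    uint32_t end_patched = it_block_end(0x1002, patched, 3, 2);

    assert(end_original == 0x1008);
    assert(end_patched == 0x1006);
    /* 0x1006 is conditional in the original code, but not after patching. */
    assert(is_conditional(0x1006, 0x1002, end_original));
    assert(!is_conditional(0x1006, 0x1002, end_patched));
}
```

The model only tracks which addresses the IT covers. It does not model BKPT's special behaviour inside IT blocks, which is what a real fix relies on.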
Thanks for the detailed explanation about the trap opcodes. I just copied these opcodes over from PlatformLinux, but I plan to merge the two functions to avoid the code duplication. I will investigate what type of opcode we can use for breakpoints on Linux (and Android) and update the code accordingly (most likely in another CL).