Fix several issues in ARM single stepping and stack unwinding
- Improve instruction emulation for the following instructions: ADD, B, CMP, LDR, LDRB, SUB, TB
- Fix the prologue restoration code: calculate the correct offset for the new rows and restore the saved register values in addition to the CFA