Implement more missing ARM EABI runtime functions.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
12 | Is this actually portable? Can you rewrite this as: __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ or even: __ARMEB__ | |
13 | Please replace FIXME with a slightly better message. How about something like: #error big endian support not implemented | |
21 | Did you mean the _helper here? | |
28 | Please add a comment that as per the RTABI, this function *must* save/restore the core registers except ip, lr, and CPSR. And a note about lr since its not immediately obvious that it is for compactness. | |
31 | Not a big fan of this helper name. It simply checks for NaNs. | |
33 | Cant you hoist the pop {r0-r3} here? | |
39 | Indicate that this is safe now since we have ruled out NaNs, so cmple can't trap. | |
83 | Why not just swap and tail call? | |
lib/builtins/arm/aeabi_cdcmpeq_helper.c | ||
13 | Don't you need to check the mantissa to determine if it is a NaN? | |
lib/builtins/arm/aeabi_cfcmpeq_helper.c | ||
13 | Why the mask? Don't you need to check if the mantissa to determine if it is a NaN? |
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
20 | Double comment on this line, and the rest of the lines are formatted strangely. | |
46 | These comments don't match the implementation below, so something seems wrong. | |
lib/builtins/arm/aeabi_cdcmpeq_helper.c | ||
19 | Couldn't this be using __builtin_isnan instead of redefining it? | |
lib/builtins/arm/aeabi_cfcmpeq_helper.c | ||
19 | I have the same comment about using __builtin_isnan() here too. |
I agree with Stephen and Saleem on their comments, and I have some additional ones.
Also, avoid using push lr / pop pc, since this behaviour doesn't work across all ARM cores. There's a macro for returning that should be used.
cheers,
--renato
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
20 | you can move the comment up one line. Though, you may want to encode the logic in the example and avoid a comment. | |
31 | me neither. If it's small enough, you can either write in asm or move to a separate static function in this file with a more descriptive name. | |
lib/builtins/arm/aeabi_cdcmpeq_helper.c | ||
19 | or even calling it directly from the asm function? |
Thanks for the comments, I'll fix them up and have an updated patch shortly
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
12 | Fixed | |
13 | Fixed | |
33 | Yep, done | |
39 | Fixed | |
46 | Oops, the comments are incorrect. Fixed | |
83 | I misinterpreted the RTABI doc, I was applying the "3 way status returning functions preserve all registers except ip, lr, cpsr" to the reversed 3 way functions as well. Fixed | |
test/builtins/Unit/arm/call_apsr.S | ||
12 | indentation broken here |
lib/builtins/arm/aeabi_cfcmp.S | ||
---|---|---|
13 | This comment is wrong |
Ugh, I didn't know that ARMv4's pop {pc} doesn't work with switching Thumb mode. Are there any other problems with that?
There are several other functions that use push {lr}, pop {pc}. What do you think about something along the lines of the following?
#if __ARM_ARCH > 4 || !defined(__ARM_ARCH_ISA_THUMB) #define POP_FRAME(registers) \ pop {registers, pc} #else #define POP_FRAME(registers) \ pop {registers} \ pop {lr} \ JMP(lr) #endif
Address most of the comments
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
28 | If I'm reading it correctly, this function is not actually required to preserve the argument registers, only the c?cmple ones are. We just happen to need them after checking for NaNs. I added the comment to c?cmple. | |
31 | Renamed it to __aeabi_cdcmpeq_check_nan | |
lib/builtins/arm/aeabi_cdcmpeq_helper.c | ||
19 | Doesn't __builtin_isnan do overloading via magic and/or generic macros? Is there an actual symbol I can jump to? | |
lib/builtins/arm/aeabi_cfcmpeq_helper.c | ||
13 | Yeah, this is wrong, I replaced it with a __builtin_nan instead |
lib/builtins/arm/aeabi_cdcmpeq_helper.c | ||
---|---|---|
19 | It looks like you resolved the issue here (by calling through with the appropriate C type). Was this just a stale reply? |
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
30 | Yeah, after a second (or third) reading, I agree. This function is exempt from that restriction. Im not sure what the benefit of saving lr here is, as you can just as well as bx lr in one case, and just tail call as you do in the other case. | |
34 | I think that this plus the highlighting in my copy of the RTABI document caused the confusion. Saving behaviour of cdcmple is irrelevant. | |
35 | You hoisted it one instruction too early. You just clobbered the result (r0). | |
45 | Wouldn't this be shorter as: __aeabi_cdcmpeq: push {r0-r3} blx __aeabi_cdcmpeq_check_nan cmp r0, #1 pop {r0-r3} beq __aeabi_cdcmple msr CPSR_f, #APSR_C bx lr | |
74 | It seems you can factor out the setting of the CPSR, and do a single return. Something along the lines of: __aeabi_cdcmple: push {r0-r3} blx __aeabi_dcmplt cmp r0, #1 moveq ip, #0 beq 1f ldm sp, {r0-r3} blx __aeabi_dcmpeq cmp r0, #1 moveq ip, #(APSR_Z | APSR_C) beq 1f mov ip, #APSR_C 1:msr CPSR_f, ip pop {r0-r3} bx lr |
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
30 | The call to the check_nan helper trashes lr | |
35 | Whoops. I'm not sure how I missed that. I guess the tests passed because it's not changing the behavior if the implementation doesn't trap on NaN and you don't pass in a float with a value of 1 in the first word. | |
45 | Yes it would, except {r0-r3, lr} for the push/pop since the call to __aeabi_cdcmpeq_check_nan will trash lr. | |
76 | How about this? __aeabi_cdcmple: push {r0-r4, lr} bl __aeabi_dcmplt cmp r0, #1 moveq r4, #0 beq 1f ldm sp, {r0-r3} bl __aeabi_dcmpeq cmp r0, #1 moveq r4, #(APSR_C | APSR_Z) movne r4, #(APSR_C) 1: msr CPSR_f, r4 pop {r0-r4, pc} |
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
45 | Ah, right. | |
76 | I like the idea of the moveq/movne ... though, Im not sure I understand the purpose of using r4 rather than ip which is already a scratch register. The pop {pc} has the issue that Renato pointed out wrt older chips, (the stack savings are probably not important). |
lib/builtins/arm/aeabi_cdcmp.S | ||
---|---|---|
76 | Oh yeah, using ip is fine here. I originally had orr r4, r4, #APSR_C; orreq r4, r4, #APSR_Z instead of the direct mov. I added a macro for doing either pop {pc} or pop {ip}; bx ip |
Missing file headers on call_apsr.S and call_apr.h. Please use the DEFINE_COMPILERRT_PRIVATE_FUNCTION macros for call_apr_{d,f}. LGTM otherwise.
Is this actually portable? Can you rewrite this as:
or even: