Mainly translate IT block into cmp/branch for functions in comparesf2.S
Details
Diff Detail
Event Timeline
Hi Weiming,
I don't think you can use MSR/MRS like that. They're not available anywhere Thumb1 is available (AFAIK, only M0), so this will only work on M0, not on anything else where __ARM_ARCH_ISA_THUMB == 1 will be true.
Also, MRS/MSR is an additional 4 cycles each, and by the number of uses you have here, this function will be several times slower than it should.
cheers,
--renato
lib/builtins/arm/comparesf2.S | ||
---|---|---|
87 | Why do you use named labels here and not below? I'm fine with either standard, would just be good to be consistent, I think. My personal preference is to use 1f for everything, as it's clear what you want from the local context and the notation is slightly less cluttered. |
Hi Renato,
I agree with your comments on msr/mrs. It's ugly and may only work for armv6m.
The nasty thing is I can't find a way to convert "mvn r0, r1, asr #31" to Thumb1 without clobbering the apsr.
Any ideas? Otherwise, I think we have to just use the C version.
I think the problem here is trying to fit a Thumb1 flow around a Thumb2 one. Maybe if we had a completely separate version in pure Thumb1, and the ifdef just chooses it, we could have a cleaner implementation.
The way I do this is to write in C, compile to assembly and then optimise by hand, add comments, nice labels, etc.
Given the number of alternative we could have (T1/T2/ARM vs, soft/hard), it could be better if we had different files with each one and the main file (this one) defines the function alias depending on the defined pre-processor symbols. Though, this change can be on a separate patch, later.
cheers,
--renato
lib/builtins/arm/comparesf2.S | ||
---|---|---|
87 | I tend to prefer the temporary local labels (e.g. the 1) that you are suggesting as well. However, I think I would rather see a switch to that across all the various T1 versions that have been added recently rather than just this one. | |
154 | Unindent the label please. | |
155 | This would need to change to 2 if you rename the next label. | |
157 | Same here. Can you bump this to 2? Makes it easier to follow | |
178 | We didnt update the condition codes before, did we? | |
179 | Unindent the labels, and please make them monotonically increasing. It makes it easier to spot them when jumping through the code. | |
180 | No need to align the lr with the second column. | |
216 | Unnecessary newline. | |
243 | Did we set the condition flags previously? | |
253 | Hmm, won't this possibly execute both sides of the ITE which the T2 path doesnt do IIRC. Shouldnt we write this as: bhi 1f cmp r3, r1 b 2f 1: movs r0, #1 @ b 2f @ for symmetry 2: |
By tracking the original implementation, I think we can avoid the msr/mrs. Also address some formatting issues.
lib/builtins/arm/comparesf2.S | ||
---|---|---|
243 | thumb1 has to use the "s" suffixed instructions. The flags will be reset by the "cmp" anyway, so won't affect the correctness. |
lib/builtins/arm/comparesf2.S | ||
---|---|---|
253 | The original code is not an "if-else". the "cmpls" will update the flags as well. It's a short-circuited comparison like " r0 = (r2 < 0xff0000000 || r3 < 0xff000000) ? r0 : 1 |
I think that the inline comments are slightly too dense to ready comfortably, especially with the amount of whitespace the rest of the file uses. But thats a minor thing.
I can fix the comments
lib/builtins/arm/comparesf2.S | ||
---|---|---|
194 | do you mean comments like this and line 214 too? |
Yeah, those felt a slight bit dense while I was trying to read through the assembly. (BTW, no need to update the patch if you adjust the comments before committing).
Ugh, I missed the removal of the two lines in __gtsf2 in the !T1 case. Ill adjust it as Im working on adding support for *hf targets.
Why do you use named labels here and not below? I'm fine with either standard, would just be good to be consistent, I think.
My personal preference is to use 1f for everything, as it's clear what you want from the local context and the notation is slightly less cluttered.