Mainly translate IT block into cmp/branch for functions in comparesf2.S
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi Weiming,
I don't think you can use MSR/MRS like that. They're not available anywhere Thumb1 is available (AFAIK, only M0), so this will only work on M0, not on anything else where __ARM_ARCH_ISA_THUMB == 1 will be true.
Also, MRS/MSR is an additional 4 cycles each, and by the number of uses you have here, this function will be several times slower than it should.
cheers,
--renato
| lib/builtins/arm/comparesf2.S | ||
|---|---|---|
| 87 ↗ | (On Diff #82203) | Why do you use named labels here and not below? I'm fine with either standard, would just be good to be consistent, I think. My personal preference is to use 1f for everything, as it's clear what you want from the local context and the notation is slightly less cluttered. | 
Hi Renato,
I agree with your comments on msr/mrs. It's ugly and may only work for armv6m. 
The nasty thing is I can't find a way to convert "mvn r0, r1, asr #31" to Thumb1 without clobbering the apsr.
Any ideas? Otherwise, I think we have to just use the C version.
I think the problem here is trying to fit a Thumb1 flow around a Thumb2 one. Maybe if we had a completely separate version in pure Thumb1, and the ifdef just chooses it, we could have a cleaner implementation.
The way I do this is to write in C, compile to assembly and then optimise by hand, add comments, nice labels, etc.
Given the number of alternative we could have (T1/T2/ARM vs, soft/hard), it could be better if we had different files with each one and the main file (this one) defines the function alias depending on the defined pre-processor symbols. Though, this change can be on a separate patch, later.
cheers,
--renato
| lib/builtins/arm/comparesf2.S | ||
|---|---|---|
| 87 ↗ | (On Diff #82203) | I tend to prefer the temporary local labels (e.g. the 1) that you are suggesting as well. However, I think I would rather see a switch to that across all the various T1 versions that have been added recently rather than just this one. | 
| 154 ↗ | (On Diff #82203) | Unindent the label please. | 
| 155 ↗ | (On Diff #82203) | This would need to change to 2 if you rename the next label. | 
| 157 ↗ | (On Diff #82203) | Same here. Can you bump this to 2? Makes it easier to follow | 
| 178 ↗ | (On Diff #82203) | We didnt update the condition codes before, did we? | 
| 179 ↗ | (On Diff #82203) | Unindent the labels, and please make them monotonically increasing. It makes it easier to spot them when jumping through the code. | 
| 180 ↗ | (On Diff #82203) | No need to align the lr with the second column. | 
| 216 ↗ | (On Diff #82203) | Unnecessary newline. | 
| 243 ↗ | (On Diff #82203) | Did we set the condition flags previously? | 
| 253 ↗ | (On Diff #82203) | Hmm, won't this possibly execute both sides of the ITE which the T2 path doesnt do IIRC. Shouldnt we write this as: bhi 1f cmp r3, r1 b 2f 1: movs r0, #1 @ b 2f @ for symmetry 2: | 
By tracking the original implementation, I think we can avoid the msr/mrs. Also address some formatting issues.
| lib/builtins/arm/comparesf2.S | ||
|---|---|---|
| 243 ↗ | (On Diff #82203) | thumb1 has to use the "s" suffixed instructions. The flags will be reset by the "cmp" anyway, so won't affect the correctness. | 
| lib/builtins/arm/comparesf2.S | ||
|---|---|---|
| 253 ↗ | (On Diff #82203) | The original code is not an "if-else". the "cmpls" will update the flags as well. It's a short-circuited comparison like " r0 = (r2 < 0xff0000000 || r3 < 0xff000000) ? r0 : 1 | 
I think that the inline comments are slightly too dense to ready comfortably, especially with the amount of whitespace the rest of the file uses. But thats a minor thing.
I can fix the comments
| lib/builtins/arm/comparesf2.S | ||
|---|---|---|
| 196 ↗ | (On Diff #82585) | do you mean comments like this and line 214 too? | 
Yeah, those felt a slight bit dense while I was trying to read through the assembly. (BTW, no need to update the patch if you adjust the comments before committing).
Ugh, I missed the removal of the two lines in __gtsf2 in the !T1 case. Ill adjust it as Im working on adding support for *hf targets.