This is an archive of the discontinued LLVM Phabricator instance.

[Builtins] [ARM] Adding Thumb1 support for fcmp
ClosedPublic

Authored by weimingz on Dec 20 2016, 11:35 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

weimingz updated this revision to Diff 82203.Dec 20 2016, 11:35 PM
weimingz retitled this revision from to [Builtins] [ARM] Adding Thumb1 support for fcmp.
weimingz updated this object.
weimingz added reviewers: rengolin, compnerd.
weimingz added a subscriber: llvm-commits.
rengolin edited edge metadata.Dec 22 2016, 7:15 AM

Hi Weiming,

I don't think you can use MSR/MRS like that. They're not available anywhere Thumb1 is available (AFAIK, only M0), so this will only work on M0, not on anything else where __ARM_ARCH_ISA_THUMB == 1 will be true.

Also, MRS/MSR is an additional 4 cycles each, and by the number of uses you have here, this function will be several times slower than it should.

cheers,
--renato

lib/builtins/arm/comparesf2.S
87 ↗(On Diff #82203)

Why do you use named labels here and not below? I'm fine with either standard, would just be good to be consistent, I think.

My personal preference is to use 1f for everything, as it's clear what you want from the local context and the notation is slightly less cluttered.

Hi Renato,

I agree with your comments on msr/mrs. It's ugly and may only work for armv6m.
The nasty thing is I can't find a way to convert "mvn r0, r1, asr #31" to Thumb1 without clobbering the apsr.

Any ideas? Otherwise, I think we have to just use the C version.

I agree with your comments on msr/mrs. It's ugly and may only work for armv6m.
The nasty thing is I can't find a way to convert "mvn r0, r1, asr #31" to Thumb1 without clobbering the apsr.

Any ideas? Otherwise, I think we have to just use the C version.

I think the problem here is trying to fit a Thumb1 flow around a Thumb2 one. Maybe if we had a completely separate version in pure Thumb1, and the ifdef just chooses it, we could have a cleaner implementation.

The way I do this is to write in C, compile to assembly and then optimise by hand, add comments, nice labels, etc.

Given the number of alternative we could have (T1/T2/ARM vs, soft/hard), it could be better if we had different files with each one and the main file (this one) defines the function alias depending on the defined pre-processor symbols. Though, this change can be on a separate patch, later.

cheers,
--renato

compnerd added inline comments.Dec 23 2016, 1:15 PM
lib/builtins/arm/comparesf2.S
87 ↗(On Diff #82203)

I tend to prefer the temporary local labels (e.g. the 1) that you are suggesting as well. However, I think I would rather see a switch to that across all the various T1 versions that have been added recently rather than just this one.

154 ↗(On Diff #82203)

Unindent the label please.

155 ↗(On Diff #82203)

This would need to change to 2 if you rename the next label.

157 ↗(On Diff #82203)

Same here. Can you bump this to 2? Makes it easier to follow

178 ↗(On Diff #82203)

We didnt update the condition codes before, did we?

179 ↗(On Diff #82203)

Unindent the labels, and please make them monotonically increasing. It makes it easier to spot them when jumping through the code.

180 ↗(On Diff #82203)

No need to align the lr with the second column.

216 ↗(On Diff #82203)

Unnecessary newline.

243 ↗(On Diff #82203)

Did we set the condition flags previously?

253 ↗(On Diff #82203)

Hmm, won't this possibly execute both sides of the ITE which the T2 path doesnt do IIRC. Shouldnt we write this as:

  bhi 1f
  cmp r3, r1
  b 2f
1:
  movs r0, #1
  @ b 2f @ for symmetry
2:
weimingz updated this revision to Diff 82584.Dec 28 2016, 12:20 AM
weimingz edited edge metadata.

By tracking the original implementation, I think we can avoid the msr/mrs. Also address some formatting issues.

weimingz added inline comments.Dec 28 2016, 12:23 AM
lib/builtins/arm/comparesf2.S
243 ↗(On Diff #82203)

thumb1 has to use the "s" suffixed instructions. The flags will be reset by the "cmp" anyway, so won't affect the correctness.

weimingz added inline comments.Dec 28 2016, 12:32 AM
lib/builtins/arm/comparesf2.S
253 ↗(On Diff #82203)

The original code is not an "if-else". the "cmpls" will update the flags as well. It's a short-circuited comparison like " r0 = (r2 < 0xff0000000 || r3 < 0xff000000) ? r0 : 1

weimingz updated this revision to Diff 82585.Dec 28 2016, 12:37 AM

Missed a format issue in previous upload. Sorry~

Manually ran the unit test and passed

I'm happy if @compnerd is. :)

compnerd accepted this revision.Jan 6 2017, 9:39 PM
compnerd edited edge metadata.

I think that the inline comments are slightly too dense to ready comfortably, especially with the amount of whitespace the rest of the file uses. But thats a minor thing.

This revision is now accepted and ready to land.Jan 6 2017, 9:39 PM

I can fix the comments

lib/builtins/arm/comparesf2.S
196 ↗(On Diff #82585)

do you mean comments like this and line 214 too?

Yeah, those felt a slight bit dense while I was trying to read through the assembly. (BTW, no need to update the patch if you adjust the comments before committing).

This revision was automatically updated to reflect the committed changes.

Ugh, I missed the removal of the two lines in __gtsf2 in the !T1 case. Ill adjust it as Im working on adding support for *hf targets.

Ugh, I missed them in during clean up in diff 2. Very sorry!