This is an archive of the discontinued LLVM Phabricator instance.

[flang] Support aint/anint for 80/128 bit in lowering
ClosedPublic

Authored by peixin on Jul 18 2022, 9:15 AM.

Details

Summary

For aint/anint, LLVM conversion operations llvm.trunc and llvm.round
can support the edge case of aint(-0.) and anint(-0.). The output is -0.
and it is the same of gfortran and classic flang, while the output
of ifort is 0.. The real(10)/real(16) is not supported before. Support
it and remove the runtime functions for aint/anint.

For nint, gfortran, ifort, and LLVM Flang using llvm.lround have
different results when the magnitude of argument is more than the max of
result value range. So delay its support in lowering after more
investigations.

Diff Detail

Event Timeline

peixin created this revision.Jul 18 2022, 9:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2022, 9:15 AM
peixin requested review of this revision.Jul 18 2022, 9:15 AM
vzakhari added inline comments.
flang/lib/Lower/IntrinsicCall.cpp
1190

Can you please add the same entries here?

peixin updated this revision to Diff 445559.Jul 18 2022, 10:18 AM

Add the same entries for mathOperations.

Add the same entries for mathOperations.

Thanks! Ideally, there should be a LIT test for F80 and the same additions in flang/test/Lower/late-math-lowering.f90. Otherwise looks good to me.

peixin updated this revision to Diff 445780.Jul 19 2022, 4:59 AM
peixin retitled this revision from [flang] Support aint/anint for 128 bit in lowering to [flang] Support aint/anint for 80/128 bit in lowering.
peixin edited the summary of this revision. (Show Details)

Thanks @vzakhari for pointing out where the tests for mathOperations sit. Add the test cases for F80 and math operations.

vzakhari accepted this revision.Jul 19 2022, 9:25 AM

Thanks for the changes! LGTM.

This revision is now accepted and ready to land.Jul 19 2022, 9:25 AM
peixin updated this revision to Diff 447037.Jul 22 2022, 11:11 PM

Rebase before land.

This revision was automatically updated to reflect the committed changes.

Hi @peixin, we are seeing some bugs in our tests after this patch. Basically, on the platform we are testing (x86-64 gnu linux) LLVM lowers llvm.round.f128 to some roundl calls expecting it will take fp128 arguments into xmm0 register. This is not the case of our roundl implementation that expects long doubles.

Here is a reproducer:

  real(16) :: x, y
  y = 0.6
  call bar(x, y)
  print *, x ! expects 1., we get 0.6
contains

  subroutine bar(x, y)
    real(16) :: x, y
    x = anint(y)
  end subroutine
end

The LLVM IR for bar looks like:

define void @_QPbar(ptr %0, ptr %1)  {
  %3 = load fp128, ptr %1, align 16
  %4 = call fp128 @llvm.round.f128(fp128 %3)
  store fp128 %4, ptr %0, align 16
  ret void
}
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn           
declare fp128 @llvm.round.f128(fp128) #0                                                       

attributes #0 = { nocallback nofree nosync nounwind readnone speculatable willreturn }

This looks good to me, but llc translate this into:

_QPbar:                                 # @_QPbar
        push    rbx
        mov     rbx, rdi
        movaps  xmm0, xmmword ptr [rsi]
        call    roundl@PLT
        movaps  xmmword ptr [rbx], xmm0
        pop     rbx
        ret

And that is wrong, at least on our platform.

Is the small test working for you after your change ?
Do you know if llvm.round.f128 should only be emitted if there is native support for fp128 on the platform, or if there is a way to change the way LLVM translates llvm.round.f128 ?

Hi @peixin, we are seeing some bugs in our tests after this patch. Basically, on the platform we are testing (x86-64 gnu linux) LLVM lowers llvm.round.f128 to some roundl calls expecting it will take fp128 arguments into xmm0 register. This is not the case of our roundl implementation that expects long doubles.

Here is a reproducer:

  real(16) :: x, y
  y = 0.6
  call bar(x, y)
  print *, x ! expects 1., we get 0.6
contains

  subroutine bar(x, y)
    real(16) :: x, y
    x = anint(y)
  end subroutine
end

The LLVM IR for bar looks like:

define void @_QPbar(ptr %0, ptr %1)  {
  %3 = load fp128, ptr %1, align 16
  %4 = call fp128 @llvm.round.f128(fp128 %3)
  store fp128 %4, ptr %0, align 16
  ret void
}
; Function Attrs: nocallback nofree nosync nounwind readnone speculatable willreturn           
declare fp128 @llvm.round.f128(fp128) #0                                                       

attributes #0 = { nocallback nofree nosync nounwind readnone speculatable willreturn }

This looks good to me, but llc translate this into:

_QPbar:                                 # @_QPbar
        push    rbx
        mov     rbx, rdi
        movaps  xmm0, xmmword ptr [rsi]
        call    roundl@PLT
        movaps  xmmword ptr [rbx], xmm0
        pop     rbx
        ret

And that is wrong, at least on our platform.

Is the small test working for you after your change ?
Do you know if llvm.round.f128 should only be emitted if there is native support for fp128 on the platform, or if there is a way to change the way LLVM translates llvm.round.f128 ?

Thanks @jeanPerier . @sscalpone have submitted one test case to me. This patch turns one TODO into one execution error. I was testing the CI locally.

I created one issue https://github.com/llvm/llvm-project/issues/56720 for this. The problem is on X86_64. llvm.round and llvm.trunc works well on aarch64.

Do you want I revert this patch, or use https://reviews.llvm.org/D130556 to skip the current execution error and support it later? I need to check if the runtime can support aint/anint for real(16) later. Currently, the lowering does not support it using the runtime.