This is an archive of the discontinued LLVM Phabricator instance.

Lower mask intrinsics
ClosedPublic

Authored by tarunprabhu on Jul 7 2022, 8:31 AM.

Details

Summary

This patch lowers the maskl and maskr intrinsics from F2008.

The first argument to the intrinsic is I which is the number of bits on the left/right to be set in the result.

If I == 0, no bits will be set in the result

If I == BIT_SIZE(KIND), all bits will be set in the result

If I < 0 or I > BIT_SIZE(KIND), the result will always be 0.

where KIND is optionally provided as an argument to MASK* and is the default integer kind if left unspecified.

Diff Detail

Event Timeline

tarunprabhu created this revision.Jul 7 2022, 8:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2022, 8:31 AM
tarunprabhu requested review of this revision.Jul 7 2022, 8:31 AM

Thanks for working on this.

Where do you get the "If I < 0 or I > BIT_SIZE(KIND), the result will always be 0." requirement ? My understanding from the standard it is non conforming if I < 0 or I > BIT_SIZE(KIND), and so there is no specified behavior.

I checked with a few compilers, but it seems only xlf does this, and ifort for the <0 case, otherwise, gfortran, nag and ifort (for the I > BIT_SIZE(KIND) case) returns the same but non zero results.
I also noticed that f18 semantics folds the I > BIT_SIZE(KIND) to -1 (all other compilers, except xlf, raise a compile time error in these cases).

Anyway, I am raising this because I do not know if the extra ops are worth it. Is it really safer to return zero in these case, or is it an extension some program expects ? ON the other hand, I can understand that zero mask are probably more likely to generate results that a programmer would quickly question, and I am not sure what is the kind of perf hit adding these instructions would cause. A better (future, no need to do that here), would probably to add a way to have runtime checks with a fatal error for this kind of situation under some specific flags.

Please do not abandon the behavior of your patch right away though, I am not expert enough in the usages of MASKL./MASKR to have a strong opinion.

@vdonaldson or @klausler, do you think f18 need to deal with the "If I < 0 or I > BIT_SIZE(KIND)" case in a special way at runtime ?

flang/lib/Lower/IntrinsicCall.cpp
3332

I am surprised you have to do that here. the KIND argument, if present, must already be represented in the resultType that semantics selected for the intrinsic. So I think you should be able to remove this case (otherwise, this would indicate some bug with resultType).

Where do you get the "If I < 0 or I > BIT_SIZE(KIND), the result will always be 0." requirement ? My understanding from the standard it is non conforming if I < 0 or I > BIT_SIZE(KIND), and so there is no specified behavior.

The same point was raised by @vdonaldson on another patch (https://reviews.llvm.org/D129316) which makes the same assumption. If the experts can weigh in on that patch, I'll apply similar changes to this one.

flang/lib/Lower/IntrinsicCall.cpp
3332

Duh! I didn't check to see if the resultType was already correct. If it is as expected, I'll remove the check.

I haven't tested the mask intrinsic functions, but I did note on x86 & power that all compilers return zero from SHIFTR(x, count) when "count" is a negative value not known at compilation time -- except nvfortran which lacks SHIFTR.

The general rule should be to follow precedents for nonconforming behavior f they are common and there isn't much of a performance penalty, since we have to make it easy to port to f18. We don't know whether there are users that depend on some behavior or not in their existing codes. Another way to put it is that a code that compiles and runs the same way on multiple existing compilers is portable by definition, whether it's conforming or not, and if it doesn't work with f18 we should at least have a good intentional reason for it.

The only fixed requirement for intrinsic implementation is that conformant calls must generate correct results. Any violation of that requirement is a bug.

For nonconformant calls, performance should almost always take precedence. An implementation should rarely to never include code to get specific nonconformant results if that penalizes conformant call performance.

There are a half a dozen or so extant compilers that can be checked for comparison, starting with gfortran and classic flang. If compilers disagree on nonconformant results, that is usually an indication that conformant call implementations differ. If compilers agree on nonconformant calls, which often happens, it is likely because they all have the same obvious implementation.

That does leave some wiggle room for nonconformant calls. There could, for example, be multiple implementations that generate correct conformant call results but different nonconformant results with identical instruction cost. Function genIbits for example, has a comment to that effect.

For comparison, here is a test program that generates both conformant and nonconformant ibits calls for default integer kind, and distinguishes between the two in the output. If compiled with multiple compilers, you can verify that results for conformant calls agree across compilers. (If not, there is a bug somewhere in new code or another compiler.) You can also specifically compare nonconformant results to inform implementation decisions. Copy and tweak for other argument kinds or other intrinsics.

  integer, parameter :: v(17) = [-1000, -130, -129, -128, -127, -126, -1, 0, 1, 5, 10, 14, 31, 32, 33, 666, 99999999]
  integer, parameter :: x(18) = [-33333333, -1111, -2, -1, 0, 1, 2, 3, 14, 15, 16, 17, 30, 31, 32, 33, 7777, 8888888]
  character ok

  do i = lbound(v,1), ubound(v,1)
    do j = lbound(x,1), ubound(x,1)
      print*
      do k = lbound(x,1), ubound(x,1)
        ok = ' '
        ii = v(i)
        jj = x(j)
        kk = x(k)
        if (jj < 0) ok = '!'
        if (kk < 0) ok = '!'
        if (jj + kk > 32) ok = '!'
        print "(A, '   ', 3I12,'     =',I12)", ok, ii, jj, kk, ibits(ii, jj, kk)
      enddo
    enddo
  enddo
end

What do you suggest I do with this patch and the other concerning SHIFT intrinsics?

All have very straightforward implementations using one of LLVM's shl, lshr and ashr instructions, especially if bounds checks are not carried out. According to the documentation, those instructions will produce poison values if called with out-of-bounds operands. Some very basic testing seems to produce some non-determinstic result on my machine (Linux, x86-64) in the case of dynamic poison values.

I do not have access to multiple Fortran compilers to test this, so I cannot attempt to match the behavior of other compilers.

NOTES:

  1. Values for the operands for these intrinsics considered out-of-bounds by the Fortran standard would also be considered out-of-bounds by LLVM.
  2. Out-of-bounds values that are statically known result in a compile-time error in Flang (and also gfortran).

I do not have access to multiple Fortran compilers to test this, so I cannot attempt to match the behavior of other compilers.

You can test ifort, ifx, and gfortran results on godbolt.org, it now runs the programs in the bottom right window (link below).

This shows that they do not give unanimous results in the MASKR/MASKL non-confromant cases. Only ifx and xlf always return zeros. gfortran does not. So I think you can go for the simpler implementation here. Please add a small comment in the code that there is no common behavior of other compilers on the non conformant case so they are not dealt with in a special way.

For your other patch about the SHIFT's intrinsic, applying @klausler's rule, I think you should return zero in the non conformant cases since that seems to be a common behavior of the other compilers.

See for instance this maskr/maskl example on godbolt.org

Removed the checks for I < 0 and I >= BIT_SIZE(KIND).

Removed the need to check the kind argument explicitly.

tarunprabhu marked an inline comment as done.Jul 13 2022, 8:24 AM

@jeanPerier

There is no test for the KIND=16 case. This is because adding results in a segfault which is probably related to this issue.

I think it would be good to include that test for completeness. Should I add a comment saying that the test should be added when that issue is resolved? Or is there another way of flagging this?

jeanPerier accepted this revision.Jul 19 2022, 2:10 AM

Thanks for answering all the comments.

I think it would be good to include that test for completeness. Should I add a comment saying that the test should be added when that issue is resolved? Or is there another way of flagging this?

A comment pointing to the github issue sounds good to me.

This revision is now accepted and ready to land.Jul 19 2022, 2:10 AM

Added RUN: %flang_fc1 to tests

Added comment indicating the need for tests for kind=16 conditional on 56446 being resolved.

This revision was automatically updated to reflect the committed changes.

@tarunprabhu , this is causing NAG tests to fail. Do you have access to them?

@PeteSteinfeld, No. I do not have access to the NAG tests.

Hi @tarunprabhu, it seems the MASKL(0) and MASKR(0) cases are not correct. See https://github.com/llvm/llvm-project/issues/56706.