As Fortran 2018 16.9.163, the reshape is the only intrinsic which
requires the shape argument to be rank-one integer array and the SIZE
of it to be one constant expression. The current expression lowering
converts the shape expression with slice in intrinsic into one box value
with the box element type of unknown extent. However, the genReshape
requires the box element type to be constant size. So, convert the box
value into one with box element type of sequence of 1 x constant. This
corner case is found in cam4 in SPEC 2017
https://github.com/llvm/llvm-project/issues/56140.
Details
Diff Detail
Event Timeline
Thanks for finding this bug and proposing a fix.
The issue is indeed that reshape lowering is relying on the FIR type of SHAPE to deduce the result rank, and this is a bit fragile because we do lose the shape info here and there. We should probably fix this in a more robust way: either enforcing that all expr that have constant shape are correctly typed. Or not relying on this.
The former (getting the types right) is probably what we should aim at.
The nicer workaround for your specific example would be to update reduceRank to build a constant shape sequence type when possible here: https://github.com/llvm/llvm-project/blob/dc97886fa36df582ed8957d13e4e51321507dc18/flang/lib/Lower/ConvertExpr.cpp#L6032.
You can maybe use fir::factory::getIntIfConstant and add a constant version of fir::factory::genExtentFromTriplet() and use it there, this will be more generic than this rebase specific code and might benefit more FIR code.
I'll try to think if we can enforce getting the type right in expression lowering.
The former (getting the types right) is probably what we should aim at.
The nicer workaround for your specific example would be to update reduceRank to build a constant shape sequence type when possible here: https://github.com/llvm/llvm-project/blob/dc97886fa36df582ed8957d13e4e51321507dc18/flang/lib/Lower/ConvertExpr.cpp#L6032.
I have tried this before getting this patch, but it doesn't work only changing reduceRank. After changing reduceRank, the FIR would be like the following:
%44 = fir.embox %1(%42) [%43] : (!fir.ref<!fir.array<2xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.box<!fir.array<2xi32>>
The execution result for the test case in this patch would be 0 0 1 2 if you change it into one main program. The expected result is 1 2 3 4. I had thought the reduceRank is not designed like what I think. Then I go through all the intrinsics, and find out this case is only one having this restriction, so I declare it as corner case.
With this patch, the FIR would be like the following:
%44 = fir.embox %1(%42) [%43] : (!fir.ref<!fir.array<2xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.box<!fir.array<?xi32>> %45 = fir.convert %44 : (!fir.box<!fir.array<?xi32>>) -> !fir.box<!fir.array<2xi32>>
You can maybe use fir::factory::getIntIfConstant and add a constant version of fir::factory::genExtentFromTriplet() and use it there, this will be more generic than this rebase specific code and might benefit more FIR code.
Now confirmed by you (Thanks @jeanPerier for that) that it's better to have the more generic method. I will dig more about codegen. It seems current codegen does not support the embox operation with slice into the box with constant size. There are 2 possible solutions:
- either support the embox with slice into constant size in codegen.
- or emit fir::convert operation after fir::embox for the expr with constant shape.
That is weird, it works for me (prints 1 2 3 4).
I attached my FIR dump of your example, you can see the: %47 = fir.embox %1(%45) [%46] : (!fir.ref<!fir.array<4xi32>>, !fir.shape<1>, !fir.slice<1>) -> !fir.box<!fir.array<2xi32>>
Can you check if you are getting different FIR or if you get bad results with my FIR dump when using tco + llc ?
I tried it again after rebase, and the problem disappear. Unfortunately, I didn't keep the previous commit before rebase and previous temporary FIR files are overwritten :(. Here is what I tried:
diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h index ec9da6b71457..49c0b90550bd 100644 --- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h +++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h @@ -542,6 +542,10 @@ mlir::Value createZeroValue(fir::FirOpBuilder &builder, mlir::Location loc, /// Unwrap integer constant from an mlir::Value. llvm::Optional<std::int64_t> getIntIfConstant(mlir::Value value); +/// Get the integer constants of triplet and compute the extent. +llvm::Optional<std::int64_t> +getExtentFromTriplet(mlir::Value lb, mlir::Value ub, mlir::Value stride); + /// Generate max(\p value, 0) where \p value is a scalar integer. mlir::Value genMaxWithZero(fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value value); diff --git a/flang/lib/Lower/ConvertExpr.cpp b/flang/lib/Lower/ConvertExpr.cpp index 20e7c33df5a6..215979f2fb64 100644 --- a/flang/lib/Lower/ConvertExpr.cpp +++ b/flang/lib/Lower/ConvertExpr.cpp @@ -6027,9 +6027,14 @@ private: mlir::Operation::operand_range triples = slOp.getTriples(); fir::SequenceType::Shape shape; // reduce the rank for each invariant dimension - for (unsigned i = 1, end = triples.size(); i < end; i += 3) - if (!mlir::isa_and_nonnull<fir::UndefOp>(triples[i].getDefiningOp())) + for (unsigned i = 1, end = triples.size(); i < end; i += 3) { + if (auto extent = fir::factory::getExtentFromTriplet( + triples[i - 1], triples[i], triples[i + 1])) + shape.push_back(*extent); + else if (!mlir::isa_and_nonnull<fir::UndefOp>( + triples[i].getDefiningOp())) shape.push_back(fir::SequenceType::getUnknownExtent()); + } return fir::SequenceType::get(shape, seqTy.getEleTy()); } // not sliced, so no change in rank diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp index f12d54399960..095bcb577dd3 100644 --- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp +++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp @@ -1227,6 +1227,21 @@ llvm::Optional<std::int64_t> fir::factory::getIntIfConstant(mlir::Value value) { return {}; } +llvm::Optional<std::int64_t> +fir::factory::getExtentFromTriplet(mlir::Value lb, mlir::Value ub, + mlir::Value stride) { + auto lbOp = mlir::dyn_cast<fir::ConvertOp>(lb.getDefiningOp()); + auto ubOp = mlir::dyn_cast<fir::ConvertOp>(ub.getDefiningOp()); + auto strideOp = mlir::dyn_cast<fir::ConvertOp>(stride.getDefiningOp()); + if (lbOp && ubOp && strideOp) + if (auto lbInt = fir::factory::getIntIfConstant(lbOp.getValue())) + if (auto ubInt = fir::factory::getIntIfConstant(ubOp.getValue())) + if (auto strideInt = + fir::factory::getIntIfConstant(strideOp.getValue())) + return 1 + (ubInt.value() - lbInt.value()) / strideInt.value(); + return {}; +} + mlir::Value fir::factory::genMaxWithZero(fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value value) {
There are 9 failed tests on aarch64:
Failed Tests (9): Flang :: Lower/Intrinsics/size.f90 Flang :: Lower/array-expression-slice-1.f90 Flang :: Lower/array-temp.f90 Flang :: Lower/call-by-value-attr.f90 Flang :: Lower/derived-allocatable-components.f90 Flang :: Lower/io-item-list.f90 Flang :: Lower/pointer-assignments.f90 Flang :: Lower/pointer-initial-target.f90 Flang :: Lower/structure-constructors.f90
I tested the execution results for Lower/Intrinsics/size.f90 and it is correct and the same of gfortran. Fixing FileCheck for it will be fine.
Even if all the failed tests passed the execution. I am still worried if this patch will crash some other code. Can you help evaluate if this is the right direction?
Even if all the failed tests passed the execution. I am still worried if this patch will crash some other code
fir.embox to a constant array shape ought to work. If this patch turns out to crash some code, then it will be a bug in fir.embox and we should fix it then if such crashes occur. So I would go that way.
> + auto lbOp = mlir::dyn_cast<fir::ConvertOp>(lb.getDefiningOp()); > + auto ubOp = mlir::dyn_cast<fir::ConvertOp>(ub.getDefiningOp()); > + auto strideOp = mlir::dyn_cast<fir::ConvertOp>(stride.getDefiningOp()); > + if (lbOp && ubOp && strideOp) > + if (auto lbInt = fir::factory::getIntIfConstant(lbOp.getValue()))
It is a bit brutal: skipping the convert might be wrong in the general case if they are narrowing the value. Here, they really should not be narrowing, but I am not a fan of setting too much precedent of silently ignoring the Convert to get the constant arguments. Do you know if we could instead leverage MLIR folding here to get the constant value argument if it is indeed constant ?
It is a bit brutal: skipping the convert might be wrong in the general case if they are narrowing the value.
I know it is kind of brutal, but I don't see any possibility of failure according to current sliceOp design. The triple will all be index type(https://github.com/llvm/llvm-project/blob/1ba7f5218ccdc0b4fd836cb4f0d647866f793c87/flang/include/flang/Optimizer/Dialect/FIROps.td#L1843-L1867)? If the triplet is constant size, the bounds and stride will be folded into constant converted into index type.
Do you know if we could instead leverage MLIR folding here to get the constant value argument if it is indeed constant ?
I am not an expert in MLIR. I take a look at MLIR (grep constant-related code) and do not find any helper function or analysis to handle this case.
- Use the new approach.
- Fix getExtentFromTriplet to make it more generic.
- Fix the test cases.
Doing this so that it is easier to review and discuss.
Thanks for the update, preserving more constant shape information in the FIR types looks better to me.
I think a long term solution could be to use evaluate::GetShape or lower::converType on the array sub-expressions to get and preserve as much information as possible in the FIR types and avoid trying to get that information back looking at the slice mlir operands, but your patch looks like a good and simple start to improve the sub-expression FIR types.
flang/lib/Optimizer/Builder/FIRBuilder.cpp | ||
---|---|---|
1244 ↗ | (On Diff #441056) | You should add std::max(0, ..) to cope with reversed bounds. Since this is compile time, I think it is probably safer to catch strideInt.value()=0 case. I am not sure if the front-end can catch all cases. Returning nullopt in that case might be safer to avoid crashes (better to hit a verifier error). |
flang/test/Lower/Intrinsics/reshape.f90 | ||
117 | Pretty much most of the FIR above this is related to the v2d(2,2) and v2(:, :) = and not really to the reshape. Could you remove the checks for things that do not matter here (you could maybe do call some_proc(reshape(tmp, dims(i:2)) to have a Fortran code that is mostly about the reshape expression evaluation) |
Agree. The reduceRank is kind of hard to understand. And the extent should be obtained as early as possible when analyzing the expression, rather than deducing it from the sliceOp using mlir::Value.
flang/lib/Optimizer/Builder/FIRBuilder.cpp | ||
---|---|---|
1244 ↗ | (On Diff #441056) | Fixed. |
flang/test/Lower/Intrinsics/reshape.f90 | ||
117 | Good point. Fixed. |
llvm::StringRef