Extending to f32 first (as is done for f16) results in better generated code for RISC-V (and affects no other in-tree tests). Additionally, performing the FP_EXTEND first seems equally justified for bf16 and for f16.
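For context on why the extend itself is cheap and lossless: bf16 uses the same sign and exponent layout as the upper 16 bits of an IEEE-754 binary32 value, so widening is a 16-bit left shift of the raw bits. The sketch below illustrates only that property. It is not taken from this patch, and the helper name `bf16_bits_to_f32` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of the patch): widen raw bf16 bits to f32.
// bf16 is the high half of a binary32 encoding, so the conversion is exact:
// place the 16 bits in the upper half and zero-fill the low mantissa bits.
float bf16_bits_to_f32(std::uint16_t bits) {
    std::uint32_t widened = static_cast<std::uint32_t>(bits) << 16;
    float result;
    // memcpy is the well-defined way to reinterpret the bit pattern.
    std::memcpy(&result, &widened, sizeof(result));
    return result;
}
```

Because no rounding occurs in this direction, any operation applied after the extend sees exactly the value the bf16 input represented.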
Unit Tests
Time | Test |
---|---|
60,040 ms | x64 debian > MLIR.Examples/standalone::test.toy |