This is an archive of the discontinued LLVM Phabricator instance.

[llvm][RISCV][IR] Zext flag in IR for RISC-V
Needs ReviewPublic

Authored by karouzakisp on Jul 27 2023, 8:56 AM.

Details

Summary

Added zext flag in IR named was_sext for the target
RISC-V that converts back sext instructions from zext instructions
that were originated as a sext instruction. This conversion
happens in the backend of RISC-V in RISCVCodeGenPrepare.

Diff Detail

Event Timeline

karouzakisp created this revision.Jul 27 2023, 8:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 8:56 AM
karouzakisp requested review of this revision.Jul 27 2023, 8:56 AM
nikic added a comment.Jul 27 2023, 9:05 AM

Missing LangRef change.

was_sext is the wrong name for this flag, because it's not meaningful in terms of operational semantics of the IR. You'll want to call this something like nneg or nonneg, with the semantics that if the argument is negative, the return value is poison. This allows refining zext nneg %x to sext %x.

Generally I'm on board with adding such a flag.

Patch needs a real title

Please upload patches with full context using -U99999 per the directions here https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

llvm/lib/AsmParser/LLParser.cpp
3747
if (Opc == Instruction::ZExt && EatIfPresent(lltok::kw_was_sext))
  WasSext = true;
6355

Unrelated change?

6382

Can this be moved into parseCast?

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
1472

This TODO needs to removed, or needs to say what needs to be done

4874

Is this reformatting existing code that you aren't changing?

4893

TODO has no description

4921

Join conditions with && to reduce nesting

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
105

This TODO is in the middle of a sentence

llvm/lib/IR/Instructions.cpp
3761 ↗(On Diff #544796)

This is adding curly braces to code you didn't otherwise change. Please remove

karouzakisp retitled this revision from [llvm][IR] to Zext flag in IR for RISC-V.Jul 27 2023, 9:46 AM

Added -U999999 in
git show
to make the reviewing possible via the web interface

karouzakisp set the repository for this revision to rG LLVM Github Monorepo.Jul 27 2023, 10:04 AM
karouzakisp retitled this revision from Zext flag in IR for RISC-V to [llvm][RISCV][IR] Zext flag in IR for RISC-V.Jul 27 2023, 10:09 AM

Missing changes to Operator::hasPoisonGeneratingFlags and Instruction::copyIRFlags

karouzakisp marked 8 inline comments as done.

Changed flag's name from `was_sext to nneg`. Updated the LangRef and all the required changes such as Instruction.h Operator.h.

craig.topper added inline comments.Aug 28 2023, 10:22 PM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
3228

Extra blank line

4921

Extra blank line

6411

Not sure what this is?

nikic added a comment.Aug 29 2023, 2:14 AM

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

llvm/lib/Bitcode/flags.ll
31

This test should not be in the lib/ directory.

llvm/lib/IR/Instructions.cpp
3768 ↗(On Diff #554043)

Spurious white space changes.

Removed redudant whitespaces.

Removed more redudant white spaces.

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

"
Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

llvm/lib/AsmParser/LLParser.cpp
6355

Yes, I reverted it back as it was.

6382

It can be moved but I wrote this, as all the other flags was written in a similar way. Also if we move it to parseCast we must refactor parseCast with one additional optional argument.

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
4874

Probably
git clang-format
reformatted it.
Should I reformat it back?

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

"
Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

I don't think we need to mention RISC-V. I've seen cases where the middle end needs this information. This issue https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561 was independent of RISC-V.

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

"
Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

I don't think we need to mention RISC-V. I've seen cases where the middle end needs this information. This issue https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561 was independent of RISC-V.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

"
Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

I don't think we need to mention RISC-V. I've seen cases where the middle end needs this information. This issue https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561 was independent of RISC-V.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

This is still missing LangRef changes.

I think you should also drop everything here that is related to supporting zext constant expressions. These will be removed in the near future, and I don't think it makes sense to add nneg support for them just to drop them again.

"
Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

I don't think we need to mention RISC-V. I've seen cases where the middle end needs this information. This issue https://discourse.llvm.org/t/aggressive-conversion-of-sext-to-zext-blocks-indvarsimplify/61561 was independent of RISC-V.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

Ping.

I don't see tests for constantexpr.

llvm/include/llvm/IR/Instruction.h
372

"sext flag" -> "nneg flag"

383

"Determine whether the nneg flag is set."

llvm/lib/AsmParser/LLLexer.cpp
567

I think this list should try to be the same order as LLVMToken.h where you have this after inbounds

llvm/lib/AsmParser/LLParser.cpp
6381

We can just do return parseCast I think?

7181

Space after if

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
3216

Don't change the formatting of this line

3224

Extra space after &&

4889

Don't change the formatting of this line

llvm/lib/IR/Instruction.cpp
385

Space after if

414

space after if

llvm/test/Bitcode/constantsTest.3.2.ll
124

The only change to this file seems to be adding a new line. I don't think we should touch this file in this patch.

Updated clang-format and fixed minor errors.

I don't see tests for constantexpr.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2.

When zero extending from i1, the result will always be either 0 or 1.

The nneg flag means the value to be zero extended is non negative.
So we can safely convert the zext to a sext.
This applies only for the RISC-V target because sext i32 to i64 can be a no op for RV64
and it is always cheaper than performing a zero extension .

Example:
%X = zext i32 257 to i64 ; yields i64:257
%Y = zext i1 true to i32 ; yields i32:1
%Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
%H = zext nneg 8 i32 to i64 ; yields to i64:8

ConstantExpr tests::

define i64 @zext_plain_ce() {
; CHECK: ret zext nneg i32 (i32 ptrtoint (ptr @addr to i32), i32 130) to i64

ret zext nneg i32 (i32 ptrtoint (ptr @addr to i32), i32 130) to i64

}

nikic added a comment.Sep 23 2023, 1:02 PM

The semantics need to be a change to llvm/docs/LangRef.rst in this patch, not a comment.

As I said before, I don't think it makes sense to add nneg support to zext constant expressions, which are slated for removal.

llvm/include/llvm/IR/Operator.h
349

Missing comment text?

llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
105

Not done?

mamai added a subscriber: mamai.Sep 28 2023, 6:14 AM

The semantics need to be a change to llvm/docs/LangRef.rst in this patch, not a comment.

As I said before, I don't think it makes sense to add nneg support to zext constant expressions, which are slated for removal.

Refactoring NonNeg Operator and removing all zext constant expressions leads to 3 test failing due to the inheritance graph:
NonNeg Operator is Operator and Operator is User and ConstantExpr is a Constant and Constant is User too.
So the User class must be refactored too to not support ConstantExpr in general.

The semantics need to be a change to llvm/docs/LangRef.rst in this patch, not a comment.

As I said before, I don't think it makes sense to add nneg support to zext constant expressions, which are slated for removal.

Refactoring NonNeg Operator and removing all zext constant expressions leads to 3 test failing due to the inheritance graph:
NonNeg Operator is Operator and Operator is User and ConstantExpr is a Constant and Constant is User too.
So the User class must be refactored too to not support ConstantExpr in general.

Yes, Operator is for cases that support Instruction and ConstantExpr. You would change NonNegOperator -> NonNegInstruction/NonNegInst (if you need it at all, it would only have the classof check for zext and nothing else).

The semantics need to be a change to llvm/docs/LangRef.rst in this patch, not a comment.

As I said before, I don't think it makes sense to add nneg support to zext constant expressions, which are slated for removal.

Refactoring NonNeg Operator and removing all zext constant expressions leads to 3 test failing due to the inheritance graph:
NonNeg Operator is Operator and Operator is User and ConstantExpr is a Constant and Constant is User too.
So the User class must be refactored too to not support ConstantExpr in general.

Yes, Operator is for cases that support Instruction and ConstantExpr. You would change NonNegOperator -> NonNegInstruction/NonNegInst (if you need it at all, it would only have the classof check for zext and nothing else).

Yes I have done that and still the 3 tests fail, inheriting from User directly, instead of Operator.
and I get --> " decltype(auto) llvm::cast(From*) [with To = llvm::NonNegOperator; From = const llvm::Operator]: Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed. "

Basically I used operator to set/get the flag similarly to all the other flags but now those support ConstantExpr.
and I think that's the problem.

craig.topper added inline comments.Sep 30 2023, 1:13 PM
llvm/lib/Bitcode/Reader/BitcodeReader.cpp
4874

This should be reformatted back the way it was.

nikic added a comment.Oct 2 2023, 6:02 AM

I've put up https://github.com/llvm/llvm-project/pull/67982 based on this code, without the constant expression support and some cleanup.