This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Extract and store new vl of vleff/vlsegff iff new_vl output pointer isn't null
AbandonedPublic

Authored by pcwang-thead on May 26 2022, 5:52 AM.

Details

Summary

Store to null will be changed to unreachable, so all instructions
after a vleff/vlsegff intrinsic call with a null new_vl output pointer
will be deleted and will cause runtime errors. With this patch, if new_vl
output pointer is null, we won't extract and store the new vl.

We will generate IR like the following pseudocode:

if (new_vl_ptr != null) {
  *new_vl_ptr = new_vl;
}
// do something.

If new_vl output pointer is known to be null or not null, the code can be
optimized and there is no impact on performance.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 5:52 AM
pcwang-thead requested review of this revision.May 26 2022, 5:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 5:52 AM
craig.topper added a comment.EditedMay 26 2022, 7:08 PM

What if it isn't known to be null at compile time but gets optimized to it. Are we trying to make passing a null pointer defined behavior for these intrinsics?

  • Generate IRs to compare destination and null. Only extract and store new vl iff destination is not null.
  • Add a test to show optimized code.
pcwang-thead edited the summary of this revision. (Show Details)May 27 2022, 1:19 AM

What if it isn't known to be null at compile time but gets optimized to it.

Thanks, I haven't thought it clearly and I just wanted to solve found bugs. :-)
I changed it to generate IRs to do the check. It is the same as GCC implementation now I think.
https://github.com/riscv-collab/riscv-gcc/blob/riscv-gcc-10.1-rvv-dev/gcc/config/riscv/riscv_vector.h#L289

if (new_vl)								\
    {									\
    if (__riscv_xlen == 32)						\
      *new_vl = __builtin_riscv_vreadvlsi ();				\
    else								\
      *new_vl = __builtin_riscv_vreadvldi ();				\
    }

Are we trying to make passing a null pointer defined behavior for these intrinsics?

Yes, but only for vleff instructions, since it has two outputs actually. And this behavior is compatible with GCC.
If necessary, I will propose it to rvv-intrinsic-doc.

craig.topper added inline comments.Jun 7 2022, 9:04 AM
clang/include/clang/Basic/riscv_vector.td
650

I don't think we usually use capital letters in basic block names.

651

newvl.isNull doesn't make sense since isNotNull eventually jumps to it.

Maybe use "store_newvl" and "store_newvl_end"?

651

I changed my mind. How about "newvl_store" and "newvl_end"?

khchen added a comment.Jun 7 2022, 7:06 PM

Store to null will be changed to unreachable, so all instructions after vleff intrinsic call will be deleted and it causes runtime errors. If destination to store is null, we won't extract and store the new vl.

Yes, but only for vleff instructions, since it has two outputs actually. And this behavior is compatible with GCC. If necessary, I will propose it to rvv-intrinsic-doc.

Compiling with -O0, I didn't see this behavior, so are you trying to make the optimized code behavior is compatible with GCC?

In addition, it seems this behavior also exist in scalar code, do we also need to make the scalar result is compatible with GCC?
ex.

int foo (int j){
    int *k = nullptr;
    *k = 10;
    return 100;
}

compiling with -O3, llvm generates empty function but gcc generates

foo(int):
        sw      zero,0(zero)
        ebreak

https://godbolt.org/z/46vGrzs49

Please correct me if I misunderstand something, thanks.

pcwang-thead edited the summary of this revision. (Show Details)

Rename BasicBlocks.

pcwang-thead added a comment.EditedJun 8 2022, 3:08 AM

Store to null will be changed to unreachable, so all instructions after vleff intrinsic call will be deleted and it causes runtime errors. If destination to store is null, we won't extract and store the new vl.

Yes, but only for vleff instructions, since it has two outputs actually. And this behavior is compatible with GCC. If necessary, I will propose it to rvv-intrinsic-doc.

Compiling with -O0, I didn't see this behavior, so are you trying to make the optimized code behavior is compatible with GCC?
In addition, it seems this behavior also exist in scalar code, do we also need to make the scalar result is compatible with GCC?

Store to nullptr is UB and it seems that LLVM and GCC have different behaviors. As shown in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Utils/Local.cpp#L2317, LLVM (simplifycfg in particular) will turn stores to nullptr into unreachable. When compiling with -O0, no optimization is done, so we can't see this behavior.
What I am doing in this patch is not making store to nullptr a defined behavior, but changing the way to store the new vl in vleff intrinsics (only do it if destination to store is not null).
Before:

new_vl = vleff(...)
*dst = new_vl;

Now:

new_vl = vleff(...)
if (dst) {
  *dst = new_vl;
}

So that we won't generate store to nullptr if we pass a nullptr to vleff intrinsics. This behavior is compatible with GCC.

As for the difference when compiling with -O3, it is because GCC handles this UB (store to nullptr) in other way. But as you can see, sw zero,0(zero) will cause interruption (runtime errors). I don't know if we should make LLVM and GCC with the same behavior to handle stores to null, which is beyond this patch.

ex.

int foo (int j){
    int *k = nullptr;
    *k = 10;
    return 100;
}

compiling with -O3, llvm generates empty function but gcc generates

foo(int):
        sw      zero,0(zero)
        ebreak

https://godbolt.org/z/46vGrzs49

Please correct me if I misunderstand something, thanks.

khchen added a comment.Jun 8 2022, 7:11 AM

IMO, if I'm an user, I would not expected intrinsic function will generate the condition code to impact the performance, maybe we need to raise a issue in rvv-intrinsic-doc.

maybe another way is adding a note in intrinsic document to address that the vl could not be a null pointer.

How about the segment load? Does it make sense to add null pointer checking for all argument v0~vN?

Add a test to show optimizaion of store to destination known to be not null.

pcwang-thead edited the summary of this revision. (Show Details)Jun 8 2022, 10:51 PM

IMO, if I'm an user, I would not expected intrinsic function will generate the condition code to impact the performance, maybe we need to raise a issue in rvv-intrinsic-doc.
maybe another way is adding a note in intrinsic document to address that the vl could not be a null pointer.

I understand your concern about impact on performance and I added a test to show that the comparison and branch instruction can be optimized if destination is known to be not null (which is common in most scenarios using vleff).
For example:

size_t new_vl;
v=vleff(..., &new_vl, vl);
//usage of new_vl.

When doing instsimplify, LLVM knows new_vl won't be null. Comparison is evaluated to be true and branch instruction will be changed to unconditional branch (which will be simplified by simplifycfg).
So I think there is no impact on performance for most scenarios. For some cases that we don't know whether destination is null or not (like using destination passed by argument in vleff.c, but I think it is not normal usage and IPO may help), it is a potential risk that we may end in program crashing if we don't do null pointer checking.

How about the segment load? Does it make sense to add null pointer checking for all argument v0~vN?

That's a good point and I have thought it before. Here is my point but we can have more discussion.
I think we don't need null pointer checking for non-tuple segment loads. One of the reason is that we have said (but not documented?) that pointers can't be null in https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/95. Besides, I think we should treat the output of segment loads as whole structure though we have non-tuple intrinsics for now. The reason why we add null pointer checking for vleff is because the outputs of vleff are of two different types and we just want to ignore the output new vl.

pcwang-thead edited the summary of this revision. (Show Details)

Add null pointer checking for vlsegff.

pcwang-thead retitled this revision from [RISCV] Extract and store new vl of vleff iff destination isn't null to [RISCV] Extract and store new vl of vleff/vlsegff iff destination isn't null.Jun 9 2022, 1:58 AM
pcwang-thead edited the summary of this revision. (Show Details)

Update test rvv-intrinsics-overloaded/vlsegff.c.

Ping. Any comments?

reames requested changes to this revision.Jun 20 2022, 8:27 AM
reames added a subscriber: reames.

Despite the comments above, the purpose of this patch remains unclear.

Per the draft spec, the relevant wording is:
"These instructions execute as a regular load except that they will only take a trap caused by a synchronous exception
on element 0. If element 0 raises an exception, vl is not modied, and the trap is taken. If an element > 0 raises an
exception, the corresponding trap is not taken, and the vector length vl is reduced to the index of the element that would
have raised an exception."

Working through the scenario in this patch with the destination being null, the expected result is for a trap to be generated (provided null is unmapped of course), and VL not to be modified. In order for this change to make any difference in runtime behavior, the value passed must be null (or otherwise guaranteed to fault). It seems very odd to me that we are modifying code which only runs after an instruction which is guaranteed to fault. Is there an assumed runtime here which is e.g. restarting execution?

Presumably, the actual IR instruction returns the unmodified VL in the faulting first access case. (If it doesn't, that's a bug we should fix.)

The last point, and it's a critical one, is that the outparam for new_vl does not have to be null if dest is. Given that, I think the entire prior discussion on motivation here is off base. Unless you can point to something in the intrinsic docs which says *explicitly* that the new_vl param to the intrinsic can be null when the dest is known to fault, I think we should strongly reject this patch. Even if you can, I think we should first ask if that's a bug in the intrinsic spec.

This revision now requires changes to proceed.Jun 20 2022, 8:27 AM

Could you please purpose this implement in rvv-intrinsc-doc first?
I think this feature need to have discussion because store to nullptr is UB but we are making it as defined behavior only for these intrinsics.
Personally I like they have consistent behavior and in document side we just make a note for users that vl should not be a null pointer.

Despite the comments above, the purpose of this patch remains unclear.

Per the draft spec, the relevant wording is:
"These instructions execute as a regular load except that they will only take a trap caused by a synchronous exception
on element 0. If element 0 raises an exception, vl is not modied, and the trap is taken. If an element > 0 raises an
exception, the corresponding trap is not taken, and the vector length vl is reduced to the index of the element that would
have raised an exception."

Working through the scenario in this patch with the destination being null, the expected result is for a trap to be generated (provided null is unmapped of course), and VL not to be modified. In order for this change to make any difference in runtime behavior, the value passed must be null (or otherwise guaranteed to fault). It seems very odd to me that we are modifying code which only runs after an instruction which is guaranteed to fault. Is there an assumed runtime here which is e.g. restarting execution?

dst in the patch description is not the pointer being loaded, it's the pointer of where to store the new_vl. That is only thing being checked for null in this patch.

Could you please purpose this implement in rvv-intrinsc-doc first?
I think this feature need to have discussion because store to nullptr is UB but we are making it as defined behavior only for these intrinsics.
Personally I like they have consistent behavior and in document side we just make a note for users that vl should not be a null pointer.

https://github.com/riscv-non-isa/rvv-intrinsic-doc/issues/153

Despite the comments above, the purpose of this patch remains unclear.

Per the draft spec, the relevant wording is:
"These instructions execute as a regular load except that they will only take a trap caused by a synchronous exception
on element 0. If element 0 raises an exception, vl is not modied, and the trap is taken. If an element > 0 raises an
exception, the corresponding trap is not taken, and the vector length vl is reduced to the index of the element that would
have raised an exception."

Working through the scenario in this patch with the destination being null, the expected result is for a trap to be generated (provided null is unmapped of course), and VL not to be modified. In order for this change to make any difference in runtime behavior, the value passed must be null (or otherwise guaranteed to fault). It seems very odd to me that we are modifying code which only runs after an instruction which is guaranteed to fault. Is there an assumed runtime here which is e.g. restarting execution?

dst in the patch description is not the pointer being loaded, it's the pointer of where to store the new_vl. That is only thing being checked for null in this patch.

Thanks for your kind explanation! @craig.topper
I did this change just because previous GCC implementation have done null pointer checking. Some of our codebases crashed when we switched to LLVM because new_vl is set to NULL to ignore output new vl. : ) @reames

dst in the patch description is not the pointer being loaded, it's the pointer of where to store the new_vl. That is only thing being checked for null in this patch.

I agree this is a possible interpretation, but it's also inconsistent with some of the other review discussion above.

@pcwang-thead Please consider clarifying the patch description a blocking item for future review or potential approval.

dst in the patch description is not the pointer being loaded, it's the pointer of where to store the new_vl. That is only thing being checked for null in this patch.

I agree this is a possible interpretation, but it's also inconsistent with some of the other review discussion above.

@pcwang-thead Please consider clarifying the patch description a blocking item for future review or potential approval.

I don't believe it was meant to be inconsistent, but I agree it could be confusing. All of the prior discussion is talking about the C intrinsic interface which has two pointers, the pointer to load from, and the outparam pointer for new_vl. All mentions of a pointer being null were only ever meant to refer to the outparam pointer.

craig.topper retitled this revision from [RISCV] Extract and store new vl of vleff/vlsegff iff destination isn't null to [RISCV] Extract and store new vl of vleff/vlsegff iff new_vl output pointer isn't null.Jun 21 2022, 11:22 AM
craig.topper edited the summary of this revision. (Show Details)

I have attempted to revise the title and description. Please take a look @pcwang-thead and @reames

craig.topper edited the summary of this revision. (Show Details)Jun 21 2022, 11:24 AM
craig.topper edited the summary of this revision. (Show Details)
craig.topper edited the summary of this revision. (Show Details)
craig.topper edited the summary of this revision. (Show Details)

@craig.topper Much clearer, thank you.

Ok, with the revised description, let me start anew in my response.

I don't have any particular problem with this change. I do think it would be good to explicitly update the docs to indicate whether the param can be null or not, but given a) gcc has allowed it, and b) the performance impact should be minimal, I would see no concerns with just documenting the current gcc state and implementing it in clang. That seems like the path of least resistance here and is a reasonable outcome.

I'm not worried about perf impact from the null check as idiomatic usage is likely to involve taking the address of a local variable which will be trivially non-null.

pcwang-thead edited the summary of this revision. (Show Details)Jun 22 2022, 12:31 AM
pcwang-thead abandoned this revision.Jun 23 2022, 2:04 AM

After discussion, we decide to not change anything right now.