This is an archive of the discontinued LLVM Phabricator instance.

[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
ClosedPublic

Authored by hyeongyukim on Jun 29 2021, 9:37 PM.

Details

Summary

Turning on enable_noundef_analysis flag allows better codegen by removing freeze instructions.
I modified clang by renaming enable_noundef_analysis flag to disable-noundef-analysis and turning it off by default.

Test updates are made as a separate patch: D108453

Diff Detail

Event Timeline

hyeongyukim created this revision.Jun 29 2021, 9:37 PM
hyeongyukim requested review of this revision.Jun 29 2021, 9:37 PM
hyeongyukim edited the summary of this revision. (Show Details)

A large amount of testing will be affected, and I will update it soon.

Changed disable_noundef_args flag to enable_noundef_args to enable emitting noundef attributes on IR call arguments and return values by default.
Changed test codes will be attached at another PR.

hyeongyukim retitled this revision from [Clang/Test]: Enable enable_noundef_analysis as default to [Clang/Test]: Enable enable_noundef_analysis by default.Aug 20 2021, 3:01 AM
hyeongyukim edited the summary of this revision. (Show Details)
aqjune updated this revision to Diff 368033.Aug 22 2021, 10:38 PM

Rename disable-noundef-args to disable-noundef-analysis

aqjune retitled this revision from [Clang/Test]: Enable enable_noundef_analysis by default to [Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default.
aqjune edited the summary of this revision. (Show Details)

Hello all,

I'm a messenger of @hyeongyukim's two patches that allow clang to turn on noundef analysis by default.
The noundef analysis flag was added by D81678 in the past. Its goal is to mark arguments and return values in C/C++ as noundef if legal.
Attaching noundef is beneficial because it allows quite a few optimizations that are unsound w.r.t. undef or poison (they are usually guarded with isGuaranteedNotToBeUndefOrPoison check).
Since the fact that arg/ret values are noundef is derived from the source language (C/C++)'s specification, the information is permanently lost unless explicitly attached by clang.

Previously, the flag was not activated by default because it required a lot of tests to be updated, possibly raising conflicts with downstream patches (discussed in this thread: D82317)
To handle the conflict, I'd like to update requested tests to simply adding -disable-noundef-analysis at // RUN: %clang_cc1 ... rather than fixing the whole text.
I'll talk about this more in the second patch (D108453).

kda added a subscriber: kda.Sep 1 2021, 3:14 PM
nlopes added a comment.Sep 7 2021, 2:37 AM

(repost my message from llvm-dev)

I can add one thought regarding why this direction makes sense and why it doesn't constrain optimizations.

Traditionally we don't want to mark too many things as UB as it restricts code movement and thus limits optimizations. That's why we have poison for e.g. signed arithmetic overflow rather than using UB as allowed by the C standard.
For function calls, however, optimizers are already super constrained: in general we cannot move them around because they may not return, they may throw an exception, they may modify memory, do a syscall, and so on. So adding another restriction to function calls isn't a big deal; optimizers don't touch them that much anyway.

This proposal adds more UB, as it turns undef/poison into UB on function calls, but that doesn't limit optimizations. So it seems like a good tradeoff: we learn some values can't be undef/poison "for free". Plus that UB can be dropped if needed; dropping noundef is legal! So there are really no downsides.
That's why I believe this is a good direction for clang.

From the users perspective, hopefully the sanitizers can already flag related issues so hopefully this won't lead to hard-to-debug UB bugs.

I did some experiments to confirm the benefits of adding a noundef attribute to the function parameter. (test result link)

One of the most significant advantages of this patch is that it can improve performance by removing unnecessary Freeze.
To check whether the performance is improved, first, I checked how much performance regression occurred on SPECrate2017 when we fixed the miscompilation problem of LoopUnswitch through the Freeze (D106041.)
Then, I applied this patch to see if the unnecessary Freeze was removed and performance was improved.
From the experimental results, the following facts were found.

  • Fixing only LoopUnswitch(D106041) reduces runtime performance and increases object size.
  • Freeze added that fixing the LoopUnswitch interferes with other optimizations, increasing the number of instructions. (There is still a problem with LoopUnswitch due to these performance problems)
  • Applying this patch improves runtime and makes the object size almost the same as the LLVM trunk.
  • Knowing that Function arguments are not noundef/poison, some freezes were removed, and performance and binary size have been improved.

The second sheet compares the performance of the LLVM trunk and this patch, but there is no noticeable change.
In the last sheet, about 2,000 LLVM test-suites were compiled, and the object size was compared. In most cases, the object size was improved.

I think many advantages can be obtained by applying this patch.

@hyeongyukim, thank you for the summary. This looks like a great change, and a net positive to me. The test churn in the other patch is unfortunate, but IMHO unavoidable.

If no one has any further objections,
LGTM

aqjune added a comment.Oct 7 2021, 6:16 PM

Thank you!

I'll send a mail to cfe-dev + llvm-dev to notify the change (and gather any possible concern if exists).

If there is no objection in a week, let's land these patches.

Thank you!
In the meantime, I will rebase this patch and resolve conflicts.

There has been no concern in a week, so I'd like to land this patch and D108453 in this weekend.
I'll carefully watch the buildbots and address failures if any.
@eugenis could you accept this patch and D108453 please, if you don't mind?

eugenis accepted this revision.Oct 15 2021, 9:54 AM
This revision is now accepted and ready to land.Oct 15 2021, 9:54 AM

I've bisected a crash in generated code down to this commit. The code that crashes is clean when run in ubsan. The misbehaviour happens across 4 tested architectures (i686, x86_64, armv7, aarch64).

The misbehaviour occurs in this preprocessed source, https://martin.st/temp/seek-preproc.c, compiled with clang -target x86_64-linux-gnu -c -O3 seek-preproc.c.

The whole failing testcase can be reproduced (on e.g. linux) with these steps:

git clone git://source.ffmpeg.org/ffmpeg
cd ffmpeg
./configure --cc=clang
make -j4 fate-seek-acodec-flac

(The failing object file is libavformat/seek.o.)

I can see that @ff_seek_frame_binary is the only affected function.
It introduces llvm.assume as well as !nonnull at a few places and folds null pointer checks.
Still investigating..

It seems the original code has a use of an uninitialized variable.
Line 4420 at seek-preproc.c (function ff_seek_frame_binary):

 int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit; // pos_min and pos_max are self-assigned.
...
if (sti->index_entries) {
   ...
}
// pos_min and pos_max are used as arguments below
pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

https://gist.github.com/aqjune/3bd0ea19bbc12b4744843c0c070e994c

If the branch is not taken, pos_min and pos_max are read while they are still uninitialized.

I guess the variables are self-assigned to avoid warnings?

It seems the original code has a use of an uninitialized variable.
Line 4420 at seek-preproc.c (function ff_seek_frame_binary):

 int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit; // pos_min and pos_max are self-assigned.
...
if (sti->index_entries) {
   ...
}
// pos_min and pos_max are used as arguments below
pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

https://gist.github.com/aqjune/3bd0ea19bbc12b4744843c0c070e994c

If the branch is not taken, pos_min and pos_max are read while they are still uninitialized.

I guess the variables are self-assigned to avoid warnings?

Yes, I believe so. If the branch is not taken, pos_min and pos_max are undefined when entering ff_gen_search. (I would assume that their value isn't ever used within ff_gen_search in that case.) But regardless of that, in this case, the generated code crashes around this line, https://gist.github.com/aqjune/3bd0ea19bbc12b4744843c0c070e994c#file-ff_seek_frame_binary-c-L39, before entering ff_gen_search - and within that branch, those variables are properly set before they're used.

Yes, I believe so. If the branch is not taken, pos_min and pos_max are undefined when entering ff_gen_search. (I would assume that their value isn't ever used within ff_gen_search in that case.) But regardless of that, in this case, the generated code crashes around this line, https://gist.github.com/aqjune/3bd0ea19bbc12b4744843c0c070e994c#file-ff_seek_frame_binary-c-L39, before entering ff_gen_search - and within that branch, those variables are properly set before they're used.

I see, the crash is happening at the line because SimplifyCFG removes the sti->index_entries null pointer check (which seems valid to me).
If stl->index_entries was null, it would lead to uses of uninitialized variables as function arguments, which is UB.
SimplifyCFG relies on the information and removes the stl->index_entries null check.

int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit; // pos_min and pos_max are undef
...
if (sti->index_entries) {
   ... (dereference sti->index_entries+index)
   ... (initialize pos_min and pos_max)
}
// If sti->index_entries was NULL, UB must happen at the call below because undef is passed to ff_gen_search's noundef arg.
pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

SimplifyCFG optimizes this to the code below:

int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit;
...
if (true) { // Changed to true!
   ... (dereference sti->index_entries+index) // This can crash.
   ... (initialize pos_min and pos_max)
}

pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

I think the solution is to initialize pos_min and pos_max to some value.
If sti->index_entries is null, they are never used inside ff_gen_search anyway, so initializing it into any value (e.g. 0) will work.

I see, the crash is happening at the line because SimplifyCFG removes the sti->index_entries null pointer check (which seems valid to me).
If stl->index_entries was null, it would lead to uses of uninitialized variables as function arguments, which is UB.
SimplifyCFG relies on the information and removes the stl->index_entries null check.

Thanks! That explains it. Although the actual crash is due to yet another removed condition:

int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit; // pos_min and pos_max are undef
...
if (sti->index_entries) {
   index = av_index_search_timestamp();
   if (index >= 0) {
       ... (dereference sti->index_entries+index)
       ... (initialize pos_min and pos_max)
   }
}
// If sti->index_entries was NULL, UB must happen at the call below because undef is passed to ff_gen_search's noundef arg.
pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

Both of the nested if conditions are removed:

int64_t pos_min=pos_min, pos_max=pos_max, pos, pos_limit;
...
if (true) { // Changed to true!
   index = av_index_search_timestamp();
   if (true) { // Also changed to true
       ... (dereference sti->index_entries+index) // This can crash with index = -1
       ... (initialize pos_min and pos_max)
   }
}
pos = ff_gen_search(s, stream_index, target_ts, pos_min, pos_max, pos_limit,
                     ts_min, ts_max, flags, &ts, avif->read_timestamp);

I think the solution is to initialize pos_min and pos_max to some value.
If sti->index_entries is null, they are never used inside ff_gen_search anyway, so initializing it into any value (e.g. 0) will work.

Yes, any value should be fine (and I guess 0 is the easiest to optimize for the compiler). Just as background - this is in a codebase that really tries to avoid default variable initializations unless they're proven to be necessary. But here they clearly are due to the UB rules of the language.

This change is causing a lot of failures in the address sanitiser tests on the 2-stage AArch64 buildbots. For example: https://lab.llvm.org/buildbot/#/builders/179/builds/1326

I can reproduce the failures on another AArch64 machine, they only happen with a 2-stage build, the first stage tests pass fine. I've verified that reverting this (and the related test patches) does make the failures go away.

Is there enough detail in the buildbot logs for you to investigate this, or is there any extra stuff (logs, intermediate files) I can send you to help? Would it be OK to revert this change until these failures are fixed?

This change caused libclc to fail to build on old ubuntu (using llvm-spir 10.0.0-1)
See:
https://bugs.llvm.org/show_bug.cgi?id=52200

aqjune added a comment.EditedOct 18 2021, 7:02 AM

I will revert this patch, since its fix needs some time for me to investigate.
I have a colleague who has access to an AArch64 server, so I can give it a try by myself.

I checked the reason for failure in address sanitizer tests on the 2-stage aarch64 buildbots.
The buildbot failure was occured because the internal_clone function of the compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp file is being compiled incorrectly.
The internal_clone function is a simple function that calls the clone system call of Linux. Its original return value should be the PID of the newly created process, but the actual returned value is 220 (which is the __NR_clone value.)

The aarch64 assembly changed by this patch is as follows.

// before
84: d2801b88  mov x8, #0xdc                   // #0xdc(220): system call number of clone
88: d4000001  svc #0x0                        // system call
...
a4: a9434ff4  ldp x20, x19, [sp, #48]
a8: a94257f6  ldp x22, x21, [sp, #32]
ac: a9415ff8  ldp x24, x23, [sp, #16]
b0: a8c467fe  ldp x30, x25, [sp], #64
b4: d65f03c0  ret

=========================
// after
88: d2801b88  mov x8, #0xdc                   // #0xdc(220): system call number of clone
8c: d4000001  svc #0x0                        // system call
...
a8: a9434ff4  ldp x20, x19, [sp, #48]
ac: aa0803e0  mov x0, x8                      // return value(x0) was overwritten by 0xdc(220)
b0: a94257f6  ldp x22, x21, [sp, #32]
b4: a9415ff8  ldp x24, x23, [sp, #16]
b8: a8c467fe  ldp x30, x25, [sp], #64
bc: d65f03c0  ret

Does anyone know why the internal_clone function of aarch64 is affected by this patch?

aqjune added a comment.Nov 1 2021, 8:53 PM

I am not familiar with inline assembly, but it seems the output variable (%0) is not updated because it does not appear in the code.

1382   __asm__ __volatile__(
1383                        "mov x0,x2\n" /* flags  */
1384                        "mov x2,x4\n" /* ptid  */
1385                        "mov x3,x5\n" /* tls  */
1386                        "mov x4,x6\n" /* ctid  */
1387                        "mov x8,%9\n" /* clone  */
1388 
1389                        "svc 0x0\n"
1390 
1391                        /* if (%r0 != 0)
1392                         *   return %r0;
1393                         */
1394                        "cmp x0, #0\n"
1395                        "bne 1f\n"
1396 
1397                        /* In the child, now. Call "fn(arg)". */
1398                        "ldp x1, x0, [sp], #16\n"
1399                        "blr x1\n"
1400 
1401                        /* Call _exit(%r0).  */
1402                        "mov x8, %10\n"
1403                        "svc 0x0\n"
1404                      "1:\n"
1405 
1406                        : "=r" (res)
1407                        : "i"(-EINVAL),
1408                          "r"(__fn), "r"(__stack), "r"(__flags), "r"(__arg),
1409                          "r"(__ptid), "r"(__tls), "r"(__ctid),
1410                          "i"(__NR_clone), "i"(__NR_exit)
1411                        : "x30", "memory");

Should %0 be updated?

You are absolutely right. X86 variant uses an "=a" constraint (rax register), others pin the output variable to a specific register with asm declaration. It appears we've missed it in Aarch64.

Could you check if asm("x0") in the declaration of res helps?

aqjune added a comment.Nov 3 2021, 1:40 AM

You are absolutely right. X86 variant uses an "=a" constraint (rax register), others pin the output variable to a specific register with asm declaration. It appears we've missed it in Aarch64.

Could you check if asm("x0") in the declaration of res helps?

Thank you, It worked!
Only a couple of failures in ASAN left, and we are investigating them.

diff --git a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
index ea3e5bdbc754..826c6d36e1b1 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_linux.cpp
@@ -1360,7 +1360,7 @@ uptr internal_clone(int (*fn)(void *), void *child_stack, int flags, void *arg,
 #elif defined(__aarch64__)
 uptr internal_clone(int (*fn)(void *), void *child_stack, int flags, void *arg,
                     int *parent_tidptr, void *newtls, int *child_tidptr) {
-  long long res;
+  register long long res __asm__("x0");
   if (!fn || !child_stack)
     return -EINVAL;
   CHECK_EQ(0, (uptr)child_stack % 16);

After modifying internal_clone like this, the problem disappeared.
Is it okay to commit this change by myself?

-  long long res;
+  register long long res __asm__("x0");

Is it okay to commit this change by myself?

Yes, go ahead!

Either this or D108453 (which were committed together!) caused this assert according to my git-bisect: https://godbolt.org/z/4rqYKfW7K

NOTE that this fails in a lit-test for me, clang CodeGen/ifunc.c (though my downstream seems to run the verifier even with -emit-llvm, so you might need to just swap it to an -emit-obj to get this to repro).

If you cannot fix this quickly, let me know and I can revert it.

IR for this looks like:

[ekeane1@scsel-clx-24 llvm]$ /localdisk2/ekeane1/workspaces/xmain-web/builds/xmainefi2linux_debug/llvm/bin/clang -cc1 -internal-isystem /localdisk2/ekeane1/workspaces/xmain-web/builds/xmainefi2linux_debug/llvm/lib/clang/14.0.0/include -nostdsysteminc -triple i386-unknown-linux-gnu -emit-llvm -o - /localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c -disable-llvm-passes
; ModuleID = '/localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c'
source_filename = "/localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c"
target datalayout = "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"

@global = global i32 0, align 4

@foo = ifunc i32 (i32), i32 (i32)* ()* @foo_ifunc
@goo = ifunc void (), bitcast (i8* ()* @goo_ifunc to void ()* ()*)

; Function Attrs: noinline nounwind optnone
define internal i32 (i32)* @foo_ifunc() #0 {
entry:
  %0 = load i32, i32* @global, align 4
  %tobool = icmp ne i32 %0, 0
  %1 = zext i1 %tobool to i64
  %cond = select i1 %tobool, i32 (i32)* @f1, i32 (i32)* @f2
  ret i32 (i32)* %cond
}

; Function Attrs: noinline nounwind optnone
define dso_local i32 @bar() #0 {
entry:
  %call = call i32 @foo(i32 noundef 1)
  ret i32 %call
}

; Function Attrs: noinline nounwind optnone
define dso_local void @bar2() #0 {
entry:
  call void @goo()
  ret void
}

; Function Attrs: noinline nounwind optnone
define dso_local i8* @goo_ifunc() #0 {
entry:
  ret i8* null
}

; Function Attrs: noinline nounwind optnone
define internal i32 @f1(i32 noundef %i) #0 {
entry:
  %i.addr = alloca i32, align 4
  store i32 %i, i32* %i.addr, align 4
  %0 = load i32, i32* %i.addr, align 4
  %add = add nsw i32 %0, 1
  ret i32 %add
}

; Function Attrs: noinline nounwind optnone
define internal i32 @f2(i32 noundef %i) #0 {
entry:
  %i.addr = alloca i32, align 4
  store i32 %i, i32* %i.addr, align 4
  %0 = load i32, i32* %i.addr, align 4
  %add = add nsw i32 %0, 2
  ret i32 %add
}

attributes #0 = { noinline nounwind optnone "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+x87" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"NumRegisterParameters", i32 0}
!1 = !{i32 1, !"wchar_size", i32 4}
!2 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}

Either this or D108453 (which were committed together!) caused this assert according to my git-bisect: https://godbolt.org/z/4rqYKfW7K

NOTE that this fails in a lit-test for me, clang CodeGen/ifunc.c (though my downstream seems to run the verifier even with -emit-llvm, so you might need to just swap it to an -emit-obj to get this to repro).

If you cannot fix this quickly, let me know and I can revert it.

IR for this looks like:

[ekeane1@scsel-clx-24 llvm]$ /localdisk2/ekeane1/workspaces/xmain-web/builds/xmainefi2linux_debug/llvm/bin/clang -cc1 -internal-isystem /localdisk2/ekeane1/workspaces/xmain-web/builds/xmainefi2linux_debug/llvm/lib/clang/14.0.0/include -nostdsysteminc -triple i386-unknown-linux-gnu -emit-llvm -o - /localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c -disable-llvm-passes
; ModuleID = '/localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c'
source_filename = "/localdisk2/ekeane1/workspaces/xmain-web/llvm/clang/test/CodeGen/ifunc.c"
target datalayout = "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-linux-gnu"

@global = global i32 0, align 4

@foo = ifunc i32 (i32), i32 (i32)* ()* @foo_ifunc
@goo = ifunc void (), bitcast (i8* ()* @goo_ifunc to void ()* ()*)

; Function Attrs: noinline nounwind optnone
define internal i32 (i32)* @foo_ifunc() #0 {
entry:
  %0 = load i32, i32* @global, align 4
  %tobool = icmp ne i32 %0, 0
  %1 = zext i1 %tobool to i64
  %cond = select i1 %tobool, i32 (i32)* @f1, i32 (i32)* @f2
  ret i32 (i32)* %cond
}

; Function Attrs: noinline nounwind optnone
define dso_local i32 @bar() #0 {
entry:
  %call = call i32 @foo(i32 noundef 1)
  ret i32 %call
}

; Function Attrs: noinline nounwind optnone
define dso_local void @bar2() #0 {
entry:
  call void @goo()
  ret void
}

; Function Attrs: noinline nounwind optnone
define dso_local i8* @goo_ifunc() #0 {
entry:
  ret i8* null
}

; Function Attrs: noinline nounwind optnone
define internal i32 @f1(i32 noundef %i) #0 {
entry:
  %i.addr = alloca i32, align 4
  store i32 %i, i32* %i.addr, align 4
  %0 = load i32, i32* %i.addr, align 4
  %add = add nsw i32 %0, 1
  ret i32 %add
}

; Function Attrs: noinline nounwind optnone
define internal i32 @f2(i32 noundef %i) #0 {
entry:
  %i.addr = alloca i32, align 4
  store i32 %i, i32* %i.addr, align 4
  %0 = load i32, i32* %i.addr, align 4
  %add = add nsw i32 %0, 2
  ret i32 %add
}

attributes #0 = { noinline nounwind optnone "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+x87" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"NumRegisterParameters", i32 0}
!1 = !{i32 1, !"wchar_size", i32 4}
!2 = !{!"Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.x.0.YYYYMMDD)"}

Hmm.. I'll revert it.

Hmm. I wonder if that's related to the problem uncovered by the verifier in https://reviews.llvm.org/D113352.

Prior to the latest revert (fd9b099906c61e46574d1ea2d99b973321fe1d21), the Linux kernel's binary verifier (objtool) points out an issue when building with ThinLTO and -fsanitize=integer-divide-by-zero. I have no idea if this is an issue with the tool or this series. A simplified reproducer:

$ cat ravb_main.i
int ravb_set_gti_ndev_rate;
unsigned int ravb_set_gti_ndev_inc;
void ravb_set_gti_ndev() {
  ravb_set_gti_ndev_inc = 1000000000;
  ravb_set_gti_ndev_inc = ravb_set_gti_ndev_inc / ravb_set_gti_ndev_rate;
  if (ravb_set_gti_ndev_inc)
    _dev_err(ravb_set_gti_ndev_inc);
}

$ clang -std=gnu89 -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o ravb_main.o ravb_main.i

$ llvm-ar cDPrsT ravb.o ravb_main.o

$ ld.lld -m elf_x86_64 -r -o ravb.lto.o --whole-archive ravb.o

$ ./objtool orc generate --no-fp --no-unreachable --retpoline --uaccess --mcount --module ravb.lto.o
ravb.lto.o: warning: objtool: .text.ravb_set_gti_ndev: unexpected end of section

With LLVM 13.0.0, there is no warning with those commands. The original and reduced .i file, interestingness test, and static objtool binary are available here.

Either this or D108453 (which were committed together!) caused this assert according to my git-bisect: https://godbolt.org/z/4rqYKfW7K

NOTE that this fails in a lit-test for me, clang CodeGen/ifunc.c (though my downstream seems to run the verifier even with -emit-llvm, so you might need to just swap it to an -emit-obj to get this to repro).

The lit-test failure of CodeGen/ifunc.c was not directly related to this patch.
emitIFuncDefinition was creating an incorrect function attribute.
It added the noundef attribute to the function even though there are no parameters (foo_ifunc function of ifunc.c), and it was fixed a few days ago.

The patch that solved this problem is D113352.

The emitIFuncDefinition fucntion incorrectly passes the GlobalDecl of the IFunc itself to the call to GetOrCreateLLVMFunction for creating the resolver, which causes it to be created with a wrong attribute list, which fails Verifier::visitFunction with "Attribute after last parameter!". You'll note that unlike the relationship between aliases and their aliasees, the resolver and the ifunc have different types - the resolver takes no parameters.

Either this or D108453 (which were committed together!) caused this assert according to my git-bisect: https://godbolt.org/z/4rqYKfW7K

NOTE that this fails in a lit-test for me, clang CodeGen/ifunc.c (though my downstream seems to run the verifier even with -emit-llvm, so you might need to just swap it to an -emit-obj to get this to repro).

The lit-test failure of CodeGen/ifunc.c was not directly related to this patch.
emitIFuncDefinition was creating an incorrect function attribute.
It added the noundef attribute to the function even though there are no parameters (foo_ifunc function of ifunc.c), and it was fixed a few days ago.

The patch that solved this problem is D113352.

The emitIFuncDefinition fucntion incorrectly passes the GlobalDecl of the IFunc itself to the call to GetOrCreateLLVMFunction for creating the resolver, which causes it to be created with a wrong attribute list, which fails Verifier::visitFunction with "Attribute after last parameter!". You'll note that unlike the relationship between aliases and their aliasees, the resolver and the ifunc have different types - the resolver takes no parameters.

Thanks for the update! Sorry for the late response, i was on vacation the last two weeks.

Prior to the latest revert (fd9b099906c61e46574d1ea2d99b973321fe1d21), the Linux kernel's binary verifier (objtool) points out an issue when building with ThinLTO and -fsanitize=integer-divide-by-zero. I have no idea if this is an issue with the tool or this series. A simplified reproducer:

$ cat ravb_main.i
int ravb_set_gti_ndev_rate;
unsigned int ravb_set_gti_ndev_inc;
void ravb_set_gti_ndev() {
  ravb_set_gti_ndev_inc = 1000000000;
  ravb_set_gti_ndev_inc = ravb_set_gti_ndev_inc / ravb_set_gti_ndev_rate;
  if (ravb_set_gti_ndev_inc)
    _dev_err(ravb_set_gti_ndev_inc);
}

$ clang -std=gnu89 -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o ravb_main.o ravb_main.i

$ llvm-ar cDPrsT ravb.o ravb_main.o

$ ld.lld -m elf_x86_64 -r -o ravb.lto.o --whole-archive ravb.o

$ ./objtool orc generate --no-fp --no-unreachable --retpoline --uaccess --mcount --module ravb.lto.o
ravb.lto.o: warning: objtool: .text.ravb_set_gti_ndev: unexpected end of section

With LLVM 13.0.0, there is no warning with those commands. The original and reduced .i file, interestingness test, and static objtool binary are available here.

I checked the reason why Objtool makes a warning.

cont.thread:                                      ; preds = %entry
  tail call void @__ubsan_handle_divrem_overflow(i8* bitcast ({ { [12 x i8]*, i32, i32 }, { i16, i16, [15 x i8] }* }* @1 to i8*), i64 1000000000, i64 0) #3, !nosanitize !8
  br label %if.then

cont:                                             ; preds = %entry
  %div = udiv i32 1000000000, %0
  store i32 %div, i32* @ravb_set_gti_ndev_inc, align 4, !tbaa !4
  %tobool.not = icmp ugt i32 %0, 1000000000
  br i1 %tobool.not, label %if.end, label %if.then

if.then:                                          ; preds = %cont.thread, %cont
  %div3 = phi i32 [ poison, %cont.thread ], [ %div, %cont ]
  %call = tail call i32 (i32, ...) bitcast (i32 (...)* @_dev_err to i32 (i32, ...)*)(i32 noundef %div3) #3
  br label %if.end

This IR code is an IR that has not passed the optimization pass completely.
This code calculates the division only if ravb_set_gti_ndev_rate is non-zero and it calls ubsan_handle_divrem_overflow function to handle UB if ravb_set_gti_ndev_rate is zero.
So far, there is no warning. But a warning occurs when this code passes the SimpleCFG optimization.

cont.thread:                                      ; preds = %entry
  tail call void @__ubsan_handle_divrem_overflow(i8* bitcast ({ { [12 x i8]*, i32, i32 }, { i16, i16, [15 x i8] }* }* @1 to i8*), i64 1000000000, i64 0) #3, !nosanitize !8
  unreachable

cont:                                             ; preds = %entry
  %div = udiv i32 1000000000, %0
  store i32 %div, i32* @ravb_set_gti_ndev_inc, align 4, !tbaa !4
  %tobool.not = icmp ugt i32 %0, 1000000000
  br i1 %tobool.not, label %if.end, label %if.then

if.then:                                          ; preds = %cont
  %call = tail call i32 (i32, ...) bitcast (i32 (...)* @_dev_err to i32 (i32, ...)*)(i32 noundef %div) #3
  br label %if.end

if.end:                                           ; preds = %if.then, %cont
  ret void
}

After it passes the SimplyCFG, the br instruction was changed to the unreachable instruction in cont.thread block.

This patch added noundef to the parameter of the _dev_err function, making the %div3 unable to be Poison.
It is impossible to jump from the cont.thread block to if.then block, so br instruction was changed to unreachable instruction.
It would be nice to remove the unreachable block, but the above IR is not wrong because it is UB when ravb_set_gti_ndev_rate is zero.

There seems to be no existing problem in clang, and I think we can bypass this warning by adding a code that checks whether the gravb_set_gti_ndev_rate is zero or not as follows.

int ravb_set_gti_ndev_rate;
unsigned int ravb_set_gti_ndev_inc;
void ravb_set_gti_ndev() {
  ravb_set_gti_ndev_inc = 1000000000;
  ravb_set_gti_ndev_inc = ravb_set_gti_ndev_inc / ravb_set_gti_ndev_rate;
  if (ravb_set_gti_ndev_rate != 0)
    if (ravb_set_gti_ndev_inc)
      _dev_err(ravb_set_gti_ndev_inc);
}

@nathanchance How about changing the existing test code as above?

@hyeongyukim I am currently offline for the evening but it seems like my reduction might have been too aggressive. It looks like this code comes from ravb_set_gti() (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/renesas/ravb_main.c?h=v5.16-rc7#n2480), which checks that rate is not zero before using it as a divisor. I will see if I can get a reproducer without any undefined behavior such as this.

@hyeongyukim I am currently offline for the evening but it seems like my reduction might have been too aggressive. It looks like this code comes from ravb_set_gti() (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/renesas/ravb_main.c?h=v5.16-rc7#n2480), which checks that rate is not zero before using it as a divisor. I will see if I can get a reproducer without any undefined behavior such as this.

Okay! Thank you for your response.

@hyeongyukim I hand reduced a couple of the translation units that I see issues in so apologies if they are a little verbose.

The full set of warnings:

drivers/net/ethernet/renesas/ravb.lto.o: warning: objtool: .text.ravb_set_gti: unexpected end of section
drivers/net/ethernet/wiznet/w5100.lto.o: warning: objtool: .text.w5100_tx_skb: unexpected end of section
drivers/net/ethernet/wiznet/w5100.lto.o: warning: objtool: .text.w5100_readbuf: unexpected end of section
drivers/video/fbdev/intelfb/intelfb.lto.o: warning: objtool: .text.intelfbhw_validate_mode: unexpected end of section

which individually can be viewed at the following links:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/renesas/ravb_main.c?h=v5.16-rc7#n2480
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/wiznet/w5100.c?h=v5.16-rc7#n798
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/wiznet/w5100.c?h=v5.16-rc7#n513
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/video/fbdev/intelfb/intelfbhw.c?h=v5.16-rc7#n313

I have a reproducer for the first two, as that is all I had time for; if you would like them for the other two, I can get those for you tomorrow.

https://github.com/nathanchance/creduce-files/tree/3699d2b725a305c0f02a7672447b30578115ba51/D105169-ravb_main

$ clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o ravb_main.o ravb_main.i
ravb_main.i:68:5: warning: expression result unused [-Wunused-value]
    __rem;
    ^~~~~
1 warning generated.

$ ld.lld -m elf_x86_64 -r -o ravb.lto.o --whole-archive ravb_main.o

$ ./objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount ravb.lto.o
ravb.lto.o: warning: objtool: .text.ravb_set_gti: unexpected end of section

This one can also be reproduced without LTO:

$ clang -O2 -fsanitize=integer-divide-by-zero -c -o ravb_main.o ravb_main.i

$ ./objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount ravb_main.o
ravb_main.o: warning: objtool: .text: unexpected end of section

As far as I can tell, that function already has a check to avoid dividing by zero.

https://github.com/nathanchance/creduce-files/tree/3699d2b725a305c0f02a7672447b30578115ba51/D105169-w5100

$ clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o w5100.o w5100.i

$ ld.lld -m elf_x86_64 -r -o w5100.lto.o --whole-archive w5100.o

$ w5100.lto.o: warning: objtool: .text.w5100_tx_skb: unexpected end of section

Without LTO, another warning is emitted, with a similar root cause I believe:

$ clang -O2 -fsanitize=integer-divide-by-zero -c -o w5100.o w5100.i

$ ./objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount w5100.o
w5100.o: warning: objtool: w5100_tx_skb() falls through to next function w5100_probe()

In theory, w5100_writebuf() should probably have a check for s0_tx_buf_size being zero but the input is completely controlled by the kernel, as can be seen in w5100_probe() (s0_tx_buf_size cannot be zero). Most if not all kernel maintainers are not going to accept patches adding checks that are superfluous.

Great. I'll check it out.

I have a reproducer for the first two, as that is all I had time for; if you would like them for the other two, I can get those for you tomorrow.

@nathanchance I think the other two can be reproduced without difficulty. If the reproduction fails, I will request it :)

@nathanchance

I tried to reproduce the last warning (intelfbhw_validate_mode), but I failed to produce it.
I think my reproducer is correct, but it does not make any warning.
Can you tell me which part was wrong?

clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o intelfb.o intelfb.i
ld.lld -m elf_x86_64 -r -o intelfb.lto.o --whole-archive intelfb.o
objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount intelfb.lto.o

I use these commands, and I attached the intelfb.i file.

@nathanchance

I tried to reproduce the last warning (intelfbhw_validate_mode), but I failed to produce it.
I think my reproducer is correct, but it does not make any warning.
Can you tell me which part was wrong?

clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -c -o intelfb.o intelfb.i
ld.lld -m elf_x86_64 -r -o intelfb.lto.o --whole-archive intelfb.o
objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount intelfb.lto.o

I use these commands, and I attached the intelfb.i file.

It looks like this particular case also needs -fsanitize-coverage=trace-pc (which comes from CONFIG_KCOV). Once I add that with your reduced reproducer, I see the initial warning.

$ clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -fsanitize-coverage=trace-pc -c -o intelfb.{o,i}

$ ld.lld -m elf_x86_64 -r -o intelfb.lto.o --whole-archive intelfb.o

$ ./objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount intelfb.lto.o
intelfb.lto.o: warning: objtool: .text.intelfbhw_validate_mode: unexpected end of section

@nathanchance
You were right. I succeeded in reproducing the warning with -fsanitize-coverage=trace-pc
However, this problem seems to be fixed in the latest version. (I used 57a551a)

@nathanchance Hi, I analyzed all four warnings.

Warning #1: can be handled by x86-backend. filled issue #53118
Warning #2: bug in the kernel, fixed in the next version.
Warning #3: same reason with #2
Warning #4: It was not reproduced in the latest clang.

As you see, one of this patch's advantages is that it exposed some bugs in the kernel!
All issues are either identified or will be fixed soon. Is it okay to recommit this patch?

@nathanchance Hi, I analyzed all four warnings.

Thanks a lot for doing this!

Warning #1: can be handled by x86-backend. filled issue #53118

This is the warning in drivers/net/ethernet/wiznet/w5100.lto.o if I am reading the issue, right? I would have thought that would be warnings #2 and #3.

Warning #2: bug in the kernel, fixed in the next version.
Warning #3: same reason with #2

I think this refers to drivers/net/ethernet/renesas/ravb.lto.o, which is indeed fixed by https://git.kernel.org/linus/d9f31aeaa1e5aefa68130878af3c3513d41c1e2d.

Warning #4: It was not reproduced in the latest clang.

I can still see this with clang @1441ffe6a6da90e9c4800914eb0005b1087aa5d2 and Linux @ 112450df61b7373529b0fe4c122ad13b89d80a8a built with KCFLAGS="-Xclang -enable-noundef-analysis". Your small reproducer has it as well:

$ clang --version | head -1
ClangBuiltLinux clang version 14.0.0 (https://github.com/llvm/llvm-project 1441ffe6a6da90e9c4800914eb0005b1087aa5d2)

$ clang -O2 -flto=thin -fsanitize=integer-divide-by-zero -fsanitize-coverage=trace-pc -Xclang -enable-noundef-analysis -c -o intelfb.{o,i}

$ ld.lld -m elf_x86_64 -r -o intelfb.lto.o --whole-archive intelfb.o

$ ./objtool orc generate --module --no-fp --no-unreachable --uaccess --mcount intelfb.lto.o
intelfb.lto.o: warning: objtool: .text.intelfbhw_validate_mode: unexpected end of section

As you see, one of this patch's advantages is that it exposed some bugs in the kernel!
All issues are either identified or will be fixed soon. Is it okay to recommit this patch?

I just went to go hammer on this option to how many warnings I see on the next version of Linux ("linux-next") [1] with allmodconfig (which in turn enables CONFIG_UBSAN) + CONFIG_LTO_CLANG_THIN=y . I think I see a lot more warnings there versus Linus' tree as the kernel no longer uses -fsanitize=object-size there, which potentially hides some of these?

$ echo "CONFIG_GCOV_KERNEL=n
CONFIG_KASAN=n
CONFIG_LTO_CLANG_THIN=y
CONFIG_WERROR=n" >allmod.config

$ make -skj"$(nproc)" KCFLAGS="-Xclang -enable-noundef-analysis" KCONFIG_ALLCONFIG=1 LLVM=1 allmodconfig all
...
drivers/video/fbdev/omap2/omapfb/dss/omapdss.lto.o: warning: objtool: .text.dsi_cio_timings: unexpected end of section
drivers/soc/qcom/qcom_rpmh.lto.o: warning: objtool: .text.rpmh_rsc_write_ctrl_data: unexpected end of section
net/tipc/tipc.lto.o: warning: objtool: .text.tipc_nl_compat_link_stat_dump: unexpected end of section
drivers/scsi/mpi3mr/mpi3mr.lto.o: warning: objtool: .text.mpi3mr_op_request_post: unexpected end of section
drivers/md/bcache/bcache.lto.o: warning: objtool: .text.bch_cache_show: unexpected end of section
drivers/scsi/dc395x.lto.o: warning: objtool: .text.msgin_phase0: unexpected end of section
...
fs/ocfs2/cluster/ocfs2_nodemanager.lto.o: warning: objtool: .text.o2hb_setup_one_bio: unexpected end of section
drivers/video/fbdev/intelfb/intelfb.lto.o: warning: objtool: .text.intelfbhw_validate_mode: unexpected end of section
drivers/net/ethernet/wiznet/w5100.lto.o: warning: objtool: .text.w5100_tx_skb: unexpected end of section
drivers/net/ethernet/wiznet/w5100.lto.o: warning: objtool: .text.w5100_rx_skb: unexpected end of section
drivers/gpu/drm/amd/amdgpu/amdgpu.lto.o: warning: objtool: .text.dqm_debugfs_hqds: unexpected end of section
...

Without KCFLAGS="-Xclang -enable-noundef-analysis", I only see the mpi3mr_op_request_post warning, so that is probably unrelated to this (probably some recent change regressed this, I'll see if I can hunt that down).

Without ThinLTO, I see:

$ echo CONFIG_WERROR=n >allmod.config

$ make -skj"$(nproc)" KCFLAGS="-Xclang -enable-noundef-analysis" KCONFIG_ALLCONFIG=1 LLVM=1 allmodconfig all
...
drivers/soc/qcom/rpmh-rsc.o: warning: objtool: rpmh_rsc_write_ctrl_data() falls through to next function trace_raw_output_rpmh_tx_done()
drivers/gpu/drm/radeon/sumo_dpm.o: warning: objtool: sumo_dpm_set_power_state() falls through to next function sumo_dpm_post_set_power_state()
...
drivers/scsi/mpi3mr/mpi3mr_fw.o: warning: objtool: mpi3mr_op_request_post() falls through to next function mpi3mr_check_rh_fault_ioc()
drivers/net/ethernet/wiznet/w5100.o: warning: objtool: w5100_tx_skb() falls through to next function w5100_get_drvinfo()
drivers/net/ethernet/wiznet/w5100.o: warning: objtool: w5100_rx_skb() falls through to next function w5100_mmio_probe()
...

Some of the same warnings, which likely have the same root cause as above but no LTO might make this easier to look into. Without KCFLAGS="-Xclang -enable-noundef-analysis", same as above, I only see the mpi3mr_op_request_post warning.

I can reduce all of these down for you and/or I can start an email thread with the objtool maintainers to see if there is a way to fix or avoid these warnings on the objtool side and include you in that discussion, if LLVM is not really doing anything wrong here. I am by no means an expert in this area and I don't want to delay this anymore but I want to avoid regressing our builds, as objtool regularly helps us spot compiler bugs. Perhaps @nickdesaulniers has some other thoughts?

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/

I can reduce all of these down for you and/or I can start an email thread with the objtool maintainers to see if there is a way to fix or avoid these warnings on the objtool side and include you in that discussion, if LLVM is not really doing anything wrong here. I am by no means an expert in this area and I don't want to delay this anymore but I want to avoid regressing our builds, as objtool regularly helps us spot compiler bugs. Perhaps @nickdesaulniers has some other thoughts?

LLVM is not doing anything wrong here. The issue is that once we have UB, LLVM is not required to lay out the functions nicely. For example, issue #53118 is just a "nice to fix", it's not a bug.
On the other hand, I understand that fixing objtool is likely impossible if we consider this UB thing as assembly doesn't have enough information to distinguish between a compiler bug and a UB case. I just don't know how frequent and/or useful these warnings are.

I don't think we should hold this patch anymore as it's not wrong and the side-effects are understood. We can and should try to reduce those kernel warnings to zero, but we cannot put all that burden on this patch's author.

I can reduce all of these down for you and/or I can start an email thread with the objtool maintainers to see if there is a way to fix or avoid these warnings on the objtool side and include you in that discussion, if LLVM is not really doing anything wrong here. I am by no means an expert in this area and I don't want to delay this anymore but I want to avoid regressing our builds, as objtool regularly helps us spot compiler bugs. Perhaps @nickdesaulniers has some other thoughts?

LLVM is not doing anything wrong here. The issue is that once we have UB, LLVM is not required to lay out the functions nicely. For example, issue #53118 is just a "nice to fix", it's not a bug.
On the other hand, I understand that fixing objtool is likely impossible if we consider this UB thing as assembly doesn't have enough information to distinguish between a compiler bug and a UB case. I just don't know how frequent and/or useful these warnings are.

I don't think we should hold this patch anymore as it's not wrong and the side-effects are understood. We can and should try to reduce those kernel warnings to zero, but we cannot put all that burden on this patch's author.

Fair enough. We will just try to deal with this change on the kernel side in one way or another.

This appears to have triggered some buildbot failures:

https://lab.llvm.org/buildbot/#/builders/5/builds/17461/steps/10/logs/stdio

e.g. in path-mappings.test:

"message": "Unknown argument: '-enable-noundef-analysis'",

This flag isn't being added by clangd code, I'm not sure where it's coming from... maybe somewhere in driver?

Ah, this seems to be coming from the buildbot configuration.

https://github.com/llvm/llvm-zorg/blob/38ab06456f7302b4eab53e5b6f1e2eaf4f127132/zorg/buildbot/builders/sanitizers/buildbot_functions.sh#L140-L149

@vitalybuka looks like sanitizer-buildbot7 might need a config update for this change?

ab added a subscriber: ab.Jan 18 2022, 2:02 PM
ab added inline comments.
clang/lib/CodeGen/CGCall.cpp
2535

Hmm, if I'm reading this right, this overwrites the nonnull dereferenceable align attributes separately computed for this around l2335, right? (or inalloca and sret before that)

It sounds like an ancient bug, that's only exposed because noundef ends up triggering this logic much more often?

Many of our downstream tests hit this. The hacked up patch seems to do the job; ideally we'd feed the previously-computed attrs when constructing the AttrBuilder (which would also fix the padding argument), but we'd need to match up the IR args first. Maybe that's fine as a special-case for arg 0 though

nlopes added inline comments.Jan 18 2022, 2:34 PM
clang/lib/CodeGen/CGCall.cpp
2535

nice catch! It's an ancient bug where arg 0 is overwritten.

DavidSpickett added a subscriber: DavidSpickett.EditedJan 19 2022, 7:04 AM

Hi, this patch has caused a gcc test suite failure on our SVE bots: https://lab.llvm.org/buildbot/#/builders/184/builds/1941
(test source is https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c)

I've only just been made aware of the failure so we will spend some time to find out the reason before asking that anything be changed. Could be something specific to SVE in which case we'll handle it.

Hi, this patch has caused a gcc test suite failure on our SVE bots: https://lab.llvm.org/buildbot/#/builders/184/builds/1941
(test source is https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c)

I've only just been made aware of the failure so we will spend some time to find out the reason before asking that anything be changed. Could be something specific to SVE in which case we'll handle it.

Thanks!
That test has UB in the printf class as it reads a non-initialized variable. Can you check if changing line 6 to 'int e = 0;' solves the problem?

fhahn added a subscriber: fhahn.Jan 19 2022, 12:04 PM
fhahn added inline comments.
clang/lib/CodeGen/CGCall.cpp
2535

Is anybody looking into a fix or should we revert the patch until this can be properly addressed?

nlopes added inline comments.Jan 20 2022, 1:54 AM
clang/lib/CodeGen/CGCall.cpp
2535

I would recommend against reverting this patch because of all the followup patches. There were quite a few commits afterwards plus fixes to buildbot configurations, so it's a non-trivial overhead to revert everything.
I was assuming @ab would fix it as he already root-caused the bug..

fhahn added inline comments.Jan 20 2022, 2:17 AM
clang/lib/CodeGen/CGCall.cpp
2535

FWIW it seems a bit unfortunate that there are some clear regressions when the tests got update, e.g. https://github.com/llvm/llvm-project/commit/1b1c8d83d3567a60280291c0adb95d1d60335509#diff-7e80d593f26f6f6fb24765c6a169884d7350685d565ee970b0a7b9abaf0fb205L355

I'll work with @ab to fix this though rather than reverting, because another revert would cause even more conflicts for us downstream.

sammccall added inline comments.Jan 20 2022, 2:47 AM
clang/lib/CodeGen/CGCall.cpp
2535

FWIW it seems a bit unfortunate that there are some clear regressions when the tests got update, e.g. https://github.com/llvm/llvm-project/commit/1b1c8d83d3567a60280291c0adb95d1d60335509#diff-7e80d593f26f6f6fb24765c6a169884d7350685d565ee970b0a7b9abaf0fb205L355

I'll work with @ab to fix this though rather than reverting, because another revert would cause even more conflicts for us downstream.

Just a reminder that the 14 release is cut soon (1 feb: https://llvm.discourse.group/t/llvm-14-0-0-release-schedule/5846)

I don't know this code; if a clean fix is ready soon and unlikely to have a ripple effect then great. But it does seem risky to be treating such configuration changes as "too big to fail" at this point in the release cycle.

Hi, this patch has caused a gcc test suite failure on our SVE bots: https://lab.llvm.org/buildbot/#/builders/184/builds/1941
(test source is https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c)

I've only just been made aware of the failure so we will spend some time to find out the reason before asking that anything be changed. Could be something specific to SVE in which case we'll handle it.

Thanks!
That test has UB in the printf class as it reads a non-initialized variable. Can you check if changing line 6 to 'int e = 0;' solves the problem?

I can confirm that fixes the issue.

Hi, this patch has caused a gcc test suite failure on our SVE bots: https://lab.llvm.org/buildbot/#/builders/184/builds/1941
(test source is https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/pr79286.c)

I've only just been made aware of the failure so we will spend some time to find out the reason before asking that anything be changed. Could be something specific to SVE in which case we'll handle it.

Thanks!
That test has UB in the printf class as it reads a non-initialized variable. Can you check if changing line 6 to 'int e = 0;' solves the problem?

I can confirm that fixes the issue.

Thank you! I've committed the fix.

nlopes added inline comments.Jan 20 2022, 4:43 AM
clang/lib/CodeGen/CGCall.cpp
2535

Thank you Florian!

fhahn added inline comments.Jan 20 2022, 5:49 AM
clang/lib/CodeGen/CGCall.cpp
2535

should be fixed by 67aa314bcee7

MaskRay added a subscriber: MaskRay.EditedFeb 11 2022, 11:22 AM

It may not be worth changing now, but I want to mention: it's more conventional to have a BoolOption which adds -[no-]noundef-analysis. Since both positive and negative forms exist. When we make the default switch, existing users don't need to change the option. After the option becomes quite stable and the workaround is deemed not useful, we can remove the CC1 option.

It may not be worth changing now, but I want to mention: it's more conventional to have a BoolOption which adds -[no-]noundef-analysis. Since both positive and negative forms exist. When we make the default switch, existing users don't need to change the option. After the option becomes quite stable and the workaround is deemed not useful, we can remove the CC1 option.

+1 to this (changing the name and the default at the same time makes migrations a bit more difficult - if the default is changed without renaming (by having both positive and negative flag names) then users can adopt their current default explicitly with no change ahead of picking up the patch that changes the default) & also this flag seems to have no tests? Could you (@hyeongyukim ) add some frontend test coverage for the flag - and yeah, maybe consider giving it a name that has both explicit on/off names, as @MaskRay suggested? (I think that's useful even after the default switch - since a user might want to override a previous argument on the command line, etc)

It may not be worth changing now, but I want to mention: it's more conventional to have a BoolOption which adds -[no-]noundef-analysis. Since both positive and negative forms exist. When we make the default switch, existing users don't need to change the option. After the option becomes quite stable and the workaround is deemed not useful, we can remove the CC1 option.

+1 to this (changing the name and the default at the same time makes migrations a bit more difficult - if the default is changed without renaming (by having both positive and negative flag names) then users can adopt their current default explicitly with no change ahead of picking up the patch that changes the default) & also this flag seems to have no tests? Could you (@hyeongyukim ) add some frontend test coverage for the flag - and yeah, maybe consider giving it a name that has both explicit on/off names, as @MaskRay suggested? (I think that's useful even after the default switch - since a user might want to override a previous argument on the command line, etc)

Sure, I'll add some tests.
By the way, is it right to change the flag's name to -[no-]enable-noundef-analysis? or would it be better to use -[no-]noundef-analysis as @MaskRay suggested?
I prefer to use -[no-]enable-noundef-analysis to maintain backward compatibility.

It may not be worth changing now, but I want to mention: it's more conventional to have a BoolOption which adds -[no-]noundef-analysis. Since both positive and negative forms exist. When we make the default switch, existing users don't need to change the option. After the option becomes quite stable and the workaround is deemed not useful, we can remove the CC1 option.

+1 to this (changing the name and the default at the same time makes migrations a bit more difficult - if the default is changed without renaming (by having both positive and negative flag names) then users can adopt their current default explicitly with no change ahead of picking up the patch that changes the default) & also this flag seems to have no tests? Could you (@hyeongyukim ) add some frontend test coverage for the flag - and yeah, maybe consider giving it a name that has both explicit on/off names, as @MaskRay suggested? (I think that's useful even after the default switch - since a user might want to override a previous argument on the command line, etc)

Sure, I'll add some tests.
By the way, is it right to change the flag's name to -[no-]enable-noundef-analysis? or would it be better to use -[no-]noundef-analysis as @MaskRay suggested?
I prefer to use -[no-]enable-noundef-analysis to maintain backward compatibility.

For driver and CC1 options, the convention of boolean options is to use [no-], not use enable- or disable-.
That said, -[no-]enable-noundef-analysis sounds good to me since no-noundef-analysis may be odd and -enable-noundef-analysis maintains backward compatibility.

Thank you for your clarification. I'll change it.

Herald added a project: Restricted Project. · View Herald TranscriptOct 9 2022, 4:20 PM