This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Diagnose jumps into statement expressions
ClosedPublic

Authored by cor3ntin on Jul 7 2023, 2:24 AM.

Details

Summary

Such jumps are not allowed by GCC and allowing them
can lead to situations where we jumps into unevaluated
statements.

Fixes #63682

Diff Detail

Event Timeline

cor3ntin created this revision.Jul 7 2023, 2:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2023, 2:24 AM
cor3ntin requested review of this revision.Jul 7 2023, 2:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 7 2023, 2:24 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
cor3ntin added reviewers: aaron.ballman, rsmith, Restricted Project.Jul 7 2023, 2:26 AM
shafik added a subscriber: shafik.Jul 7 2023, 12:11 PM
shafik added inline comments.
clang/docs/ReleaseNotes.rst
642

Maybe remark this conforms with gcc behavior.

clang/lib/Sema/JumpDiagnostics.cpp
478

I guess we don't have a variable that stands or no-diagnostic.

clang/lib/Sema/SemaExpr.cpp
16512

I see why we need this but I am not sure how someone would know that need to do this in addition to the change in BuildScopeInformation(...)

clang/test/Sema/scope-check.c
251

I am sorry for this example:

int main() {
   sizeof(({goto label;}), 0);
   return 3;
  ({label:1;});  
}

see godbolt: https://godbolt.org/z/aeb6TbsoG

Note gcc's behavior Vs clangs.

cor3ntin updated this revision to Diff 538249.Jul 7 2023, 1:44 PM

Address Shafik's feedback.

cor3ntin updated this revision to Diff 538250.Jul 7 2023, 1:47 PM
cor3ntin marked an inline comment as done.

Missed a comment

cor3ntin updated this revision to Diff 538252.Jul 7 2023, 1:48 PM
cor3ntin marked an inline comment as done.

Fix test

cor3ntin marked 2 inline comments as done.Jul 7 2023, 1:50 PM
cor3ntin added inline comments.
clang/docs/ReleaseNotes.rst
642

I added something. should we link to GCC documentation?

clang/lib/Sema/JumpDiagnostics.cpp
478

nope, we use 0 all over the place in that file

clang/lib/Sema/SemaExpr.cpp
16512

I added a comment.
I may have spent some time trying to figure out that i needed that line :)

clang/test/Sema/scope-check.c
251

I added it. It's gnarly :)

aaron.ballman accepted this revision.Jul 11 2023, 11:56 AM

LGTM though there might be a minor typo with a test comment.

clang/test/Sema/asm-goto.cpp
59
This revision is now accepted and ready to land.Jul 11 2023, 11:56 AM
This revision was landed with ongoing or failed builds.Jul 11 2023, 12:41 PM
This revision was automatically updated to reflect the committed changes.
cor3ntin marked 2 inline comments as done.

Thanks for the review :D

This patch breaks building the Linux kernel for PowerPC, which uses asm goto with a local label in a statement expression for the WARN macro. It seems like something with the scoping is going wrong.

I have two reproducers. The first is manually reduced from the kernel sources:

struct rb_node {
	unsigned long __rb_parent_color;
	struct rb_node *rb_right;
	struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));

struct rb_root {
	struct rb_node *rb_node;
};
struct kernfs_elem_dir {
	struct rb_root children;
};

struct kernfs_node {
	struct kernfs_elem_dir dir;
	unsigned short		flags;
};

enum kernfs_node_flag {
	KERNFS_ACTIVATED	= 0x0010,
	KERNFS_NS		= 0x0020,
	KERNFS_HAS_SEQ_SHOW	= 0x0040,
	KERNFS_HAS_MMAP		= 0x0080,
	KERNFS_LOCKDEP		= 0x0100,
	KERNFS_HIDDEN		= 0x0200,
	KERNFS_SUICIDAL		= 0x0400,
	KERNFS_SUICIDED		= 0x0800,
	KERNFS_EMPTY_DIR	= 0x1000,
	KERNFS_HAS_RELEASE	= 0x2000,
	KERNFS_REMOVING		= 0x4000,
};

enum kernfs_node_type {
	KERNFS_DIR		= 0x0001,
	KERNFS_FILE		= 0x0002,
	KERNFS_LINK		= 0x0004,
};

#define KERNFS_TYPE_MASK		0x000f

struct bug_entry {
	signed int bug_addr_disp;
	signed int file_disp;
	unsigned short line;
	unsigned short flags;
};

static enum kernfs_node_type kernfs_type(struct kernfs_node *kn)
{
	return kn->flags & KERNFS_TYPE_MASK;
}

void kernfs_enable_ns(struct kernfs_node *kn)
{
	({
		int __ret_warn_on = !!(kernfs_type(kn) != KERNFS_DIR);
		if (__builtin_expect(!!(__ret_warn_on), 0))
			do {
				__label__ __label_warn_on;
				asm goto("1:	"
					 "twi 31, 0, 0"
					 "\n"
					 ".section __ex_table,\"a\";"
					 " "
					 ".balign 4;"
					 " "
					 ".long (1b) - . ;"
					 " "
					 ".long (%l[__label_warn_on]) - . ;"
					 " "
					 ".previous"
					 " "
					 ".section __bug_table,\"aw\"\n"
					 "2:	.4byte 1b - .\n"
					 "	.4byte %0 - .\n"
					 "	.short %1, %2\n"
					 ".org 2b+%3\n"
					 ".previous\n"
					 :
					 : "i"("include/linux/kernfs.h"),
					   "i"(378),
					   "i"((1 << 0) |
					       ((1 << 1) | ((9) << 8))),
					   "i"(sizeof(struct bug_entry))
					 :
					 : __label_warn_on);
				do {
				} while (0);
				__builtin_unreachable();
__label_warn_on:
				break;
			} while (0);
		__builtin_expect(!!(__ret_warn_on), 0);
	});
	({
		int __ret_warn_on = !!(!(
			({
				do {
					__attribute__((
						__noreturn__)) extern void
					__compiletime_assert_244(void)
						__attribute__((__error__(
							"Unsupported access size for {READ,WRITE}_ONCE().")));
					if (!((sizeof((&kn->dir.children)
							      ->rb_node) ==
						       sizeof(char) ||
					       sizeof((&kn->dir.children)
							      ->rb_node) ==
						       sizeof(short) ||
					       sizeof((&kn->dir.children)
							      ->rb_node) ==
						       sizeof(int) ||
					       sizeof((&kn->dir.children)
							      ->rb_node) ==
						       sizeof(long)) ||
					      sizeof((&kn->dir.children)
							     ->rb_node) ==
						      sizeof(long long)))
						__compiletime_assert_244();
				} while (0);
				(*(const volatile typeof(_Generic(
					((&kn->dir.children)->rb_node),
					char: (char)0,
					unsigned char: (unsigned char)0,
					signed char: (signed char)0,
					unsigned short: (unsigned short)0,
					signed short: (signed short)0,
					unsigned int: (unsigned int)0,
					signed int: (signed int)0,
					unsigned long: (unsigned long)0,
					signed long: (signed long)0,
					unsigned long long: (
						unsigned long long)0,
					signed long long: (signed long long)0,
					default: ((&kn->dir.children)->rb_node)))
					   *)&((&kn->dir.children)->rb_node));
			}) == ((void *)0)));
		if (__builtin_expect(!!(__ret_warn_on), 0))
			do {
				__label__ __label_warn_on;
				asm goto("1:	"
					 "twi 31, 0, 0"
					 "\n"
					 ".section __ex_table,\"a\";"
					 " "
					 ".balign 4;"
					 " "
					 ".long (1b) - . ;"
					 " "
					 ".long (%l[__label_warn_on]) - . ;"
					 " "
					 ".previous"
					 " "
					 ".section __bug_table,\"aw\"\n"
					 "2:	.4byte 1b - .\n"
					 "	.4byte %0 - .\n"
					 "	.short %1, %2\n"
					 ".org 2b+%3\n"
					 ".previous\n"
					 :
					 : "i"("include/linux/kernfs.h"),
					   "i"(379),
					   "i"((1 << 0) |
					       ((1 << 1) | ((9) << 8))),
					   "i"(sizeof(struct bug_entry))
					 :
					 : __label_warn_on);
				do {
				} while (0);
				__builtin_unreachable();
__label_warn_on:
				break;
			} while (0);
		__builtin_expect(!!(__ret_warn_on), 0);
	});
	kn->flags |= KERNFS_NS;
}
$ ../install/llvm-good/bin/clang --target=powerpc64le-linux-gnu -c -o /dev/null test.c
test.c:93:3: warning: ignoring return value of function declared with const attribute [-Wunused-value]
   93 |                 __builtin_expect(!!(__ret_warn_on), 0);
      |                 ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
test.c:174:3: warning: ignoring return value of function declared with const attribute [-Wunused-value]
  174 |                 __builtin_expect(!!(__ret_warn_on), 0);
      |                 ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

$ ../install/llvm-bad/bin/clang --target=powerpc64le-linux-gnu -c -o /dev/null test.c
test.c:93:3: warning: ignoring return value of function declared with const attribute [-Wunused-value]
   93 |                 __builtin_expect(!!(__ret_warn_on), 0);
      |                 ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
test.c:174:3: warning: ignoring return value of function declared with const attribute [-Wunused-value]
  174 |                 __builtin_expect(!!(__ret_warn_on), 0);
      |                 ^~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~
test.c:60:5: error: cannot jump from this asm goto statement to one of its possible targets
   60 |                                 asm goto("1:    "
      |                                 ^
test.c:171:1: note: possible target of asm goto statement
  171 | __label_warn_on:
      | ^
test.c:95:2: note: jump enters a statement expression
   95 |         ({
      |         ^
test.c:141:5: error: cannot jump from this asm goto statement to one of its possible targets
  141 |                                 asm goto("1:    "
      |                                 ^
test.c:90:1: note: possible target of asm goto statement
   90 | __label_warn_on:
      | ^
test.c:55:2: note: jump enters a statement expression
   55 |         ({
      |         ^
2 warnings and 2 errors generated.

The second is a simpler one from cvise, which should convey the same idea.

void kernfs_enable_ns() {
  ({
    __label__ __label_warn_on;
    asm goto("" : : : : __label_warn_on);
__label_warn_on:;
  });

  ({
    __label__ __label_warn_on;
    asm goto("" : : : : __label_warn_on);
__label_warn_on:;
  });
}
$ ../install/llvm-good/bin/clang -c -o /dev/null asm-offsets.i

$ ../install/llvm-bad/bin/clang -c -o /dev/null asm-offsets.i
asm-offsets.i:4:5: error: cannot jump from this asm goto statement to one of its possible targets
    4 |     asm goto("" : : : : __label_warn_on);
      |     ^
asm-offsets.i:11:1: note: possible target of asm goto statement
   11 | __label_warn_on:;
      | ^
asm-offsets.i:8:3: note: jump enters a statement expression
    8 |   ({
      |   ^
asm-offsets.i:10:5: error: cannot jump from this asm goto statement to one of its possible targets
   10 |     asm goto("" : : : : __label_warn_on);
      |     ^
asm-offsets.i:5:1: note: possible target of asm goto statement
    5 | __label_warn_on:;
      | ^
asm-offsets.i:2:3: note: jump enters a statement expression
    2 |   ({
      |   ^
2 errors generated.

GCC 13.1.0 has no issues with either of these test cases.

cor3ntin reopened this revision.Jul 12 2023, 11:01 AM

@nathanchance Thanks for letting me know. I did revert the patch and plan to investigate later this week

This revision is now accepted and ready to land.Jul 12 2023, 11:01 AM

The issue is that VerifyIndirectOrAsmJumps does not consider that label may be local label at all, I'm not exactly sure how to improve that.
The fact it works currently seems kind of brittle, what prevent incorrect jumps to be ill-formed is that we looked if a label is defined in the same scope as the __local__ label introduction, but for the purpose of checking jumps this is not helping at all, not sure if there is a way to break the current implementation in funny ways

The issue is that VerifyIndirectOrAsmJumps does not consider that label may be local label at all, I'm not exactly sure how to improve that.
The fact it works currently seems kind of brittle, what prevent incorrect jumps to be ill-formed is that we looked if a label is defined in the same scope as the __local__ label introduction, but for the purpose of checking jumps this is not helping at all, not sure if there is a way to break the current implementation in funny ways

That is likely the same reason that we see a similar error when introducing a variable with __attribute__((__cleanup__(...))), which involves the same PowerPC asm goto code: https://github.com/ClangBuiltLinux/linux/issues/1886

I suspect this is the same issue but our CI pointed out another instance of this error in some x86 code (https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2SW8BY7moEweMJC6DJpzidlGYt8/build.log), which does not appear to use local labels with the same name, but rather unique label names. cvise spits out:

void emulator_cmpxchg_emulated() {
  int __ret = ({
    asm goto(" .pushsection \"__ex_table\",\"a\"\n"
             " .popsection\n"
             :
             :
             :
             : efaultu16);
    __ret;
  });
efaultu16:
  ({
    asm goto(" .pushsection \"__ex_table\",\"a\"\n"
             " .popsection\n"
             :
             :
             :
             : efaultu32);
  efaultu32:;
  });
}
$ clang -fsyntax-only x86.i
x86.i:3:5: error: cannot jump from this asm goto statement to one of its possible targets
    3 |     asm goto(" .pushsection \"__ex_table\",\"a\"\n"
      |     ^
x86.i:19:3: note: possible target of asm goto statement
   19 |   efaultu32:;
      |   ^
x86.i:12:3: note: jump enters a statement expression
   12 |   ({
      |   ^
1 error generated.

which certainly seems bogus to me.

FYI, this can be relanded after https://reviews.llvm.org/D155342 is merged

FYI, this can be relanded after https://reviews.llvm.org/D155342 is merged

D155342 is all merged up

cor3ntin updated this revision to Diff 542813.Jul 21 2023, 2:13 AM

Rebase + add test

This revision was landed with ongoing or failed builds.Jul 21 2023, 6:08 AM
This revision was automatically updated to reflect the committed changes.

The error added by this patch triggers on some recently added code to the Linux kernel's -next tree, which I failed to test above due to its generally unstable nature:

drivers/gpu/drm/tests/drm_exec_test.c:41:3: error: cannot jump from this indirect goto statement to one of its possible targets
   41 |                 drm_exec_retry_on_contention(&exec);
      |                 ^
include/drm/drm_exec.h:96:4: note: expanded from macro 'drm_exec_retry_on_contention'
   96 |                         goto *__drm_exec_retry_ptr;             \
      |                         ^
drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: possible target of indirect goto statement
   39 |         drm_exec_until_all_locked(&exec) {
      |         ^
include/drm/drm_exec.h:79:33: note: expanded from macro 'drm_exec_until_all_locked'
   79 |                 __label__ __drm_exec_retry;                     \
      |                                                                 ^
drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: jump enters a statement expression

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/include/drm/drm_exec.h
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/tests/drm_exec_test.c

A standalone reproducer from the kernel sources:

struct drm_exec {};
int drm_exec_cleanup(struct drm_exec *);
void drm_exec_fini(struct drm_exec *);
void drm_exec_init(struct drm_exec *);
int drm_exec_is_contended(struct drm_exec *);
void drm_exec_lock_obj(struct drm_exec *);

void test_lock(void)
{
        struct drm_exec exec;

        drm_exec_init(&exec);
        for (void *__drm_exec_retry_ptr; ({
                     __label__ __drm_exec_retry;
     __drm_exec_retry:
                     __drm_exec_retry_ptr = &&__drm_exec_retry;
                     (void)__drm_exec_retry_ptr;
                     drm_exec_cleanup(&exec);
             });) {
                drm_exec_lock_obj(&exec);
                do {
                        if (__builtin_expect(!!(drm_exec_is_contended(&exec)),
                                             0))
                                goto *__drm_exec_retry_ptr;
                } while (0);
        }
        drm_exec_fini(&exec);
}
$ clang -fsyntax-only test.c
test.c:24:5: error: cannot jump from this indirect goto statement to one of its possible targets
   24 |                                 goto *__drm_exec_retry_ptr;
      |                                 ^
test.c:15:6: note: possible target of indirect goto statement
   15 |      __drm_exec_retry:
      |      ^
test.c:13:35: note: jump enters a statement expression
   13 |         for (void *__drm_exec_retry_ptr; ({
      |                                          ^
1 error generated.

I am not sure this is a problem with the patch, since if I am reading GCC's documentation on statement expressions correctly (https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html), the construct the kernel is using with labels as values to jump into a statement expression with a computed goto is undefined behavior, but I figured I would bring it up just in case there is a problem.

Jumping into a statement expression with goto or using a switch statement outside the statement expression with a case or default label inside the statement expression is not permitted. Jumping into a statement expression with a computed goto (see Labels as Values) has undefined behavior.

If this is a problem on the kernel side, I can bring this up to them. GCC 13.1.0 at least does not error on this but if it is documented as UB, the construct should still be avoided.

The error added by this patch triggers on some recently added code to the Linux kernel's -next tree, which I failed to test above due to its generally unstable nature:

drivers/gpu/drm/tests/drm_exec_test.c:41:3: error: cannot jump from this indirect goto statement to one of its possible targets
   41 |                 drm_exec_retry_on_contention(&exec);
      |                 ^
include/drm/drm_exec.h:96:4: note: expanded from macro 'drm_exec_retry_on_contention'
   96 |                         goto *__drm_exec_retry_ptr;             \
      |                         ^
drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: possible target of indirect goto statement
   39 |         drm_exec_until_all_locked(&exec) {
      |         ^
include/drm/drm_exec.h:79:33: note: expanded from macro 'drm_exec_until_all_locked'
   79 |                 __label__ __drm_exec_retry;                     \
      |                                                                 ^
drivers/gpu/drm/tests/drm_exec_test.c:39:2: note: jump enters a statement expression

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/include/drm/drm_exec.h
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/tests/drm_exec_test.c

A standalone reproducer from the kernel sources:

struct drm_exec {};
int drm_exec_cleanup(struct drm_exec *);
void drm_exec_fini(struct drm_exec *);
void drm_exec_init(struct drm_exec *);
int drm_exec_is_contended(struct drm_exec *);
void drm_exec_lock_obj(struct drm_exec *);

void test_lock(void)
{
        struct drm_exec exec;

        drm_exec_init(&exec);
        for (void *__drm_exec_retry_ptr; ({
                     __label__ __drm_exec_retry;
     __drm_exec_retry:
                     __drm_exec_retry_ptr = &&__drm_exec_retry;
                     (void)__drm_exec_retry_ptr;
                     drm_exec_cleanup(&exec);
             });) {
                drm_exec_lock_obj(&exec);
                do {
                        if (__builtin_expect(!!(drm_exec_is_contended(&exec)),
                                             0))
                                goto *__drm_exec_retry_ptr;
                } while (0);
        }
        drm_exec_fini(&exec);
}
$ clang -fsyntax-only test.c
test.c:24:5: error: cannot jump from this indirect goto statement to one of its possible targets
   24 |                                 goto *__drm_exec_retry_ptr;
      |                                 ^
test.c:15:6: note: possible target of indirect goto statement
   15 |      __drm_exec_retry:
      |      ^
test.c:13:35: note: jump enters a statement expression
   13 |         for (void *__drm_exec_retry_ptr; ({
      |                                          ^
1 error generated.

I am not sure this is a problem with the patch, since if I am reading GCC's documentation on statement expressions correctly (https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html), the construct the kernel is using with labels as values to jump into a statement expression with a computed goto is undefined behavior, but I figured I would bring it up just in case there is a problem.

Jumping into a statement expression with goto or using a switch statement outside the statement expression with a case or default label inside the statement expression is not permitted. Jumping into a statement expression with a computed goto (see Labels as Values) has undefined behavior.

If this is a problem on the kernel side, I can bring this up to them. GCC 13.1.0 at least does not error on this but if it is documented as UB, the construct should still be avoided.

At least GCC documentation says this is UB. It's interesting GCC does not diagnose.
I'd be reluctant to revert the patch at this point as it ensure we don't jump into unevaluated statements like sizeof, etc.
Can you let me know what the linux folks think? Thanks a lot!

Can you let me know what the linux folks think? Thanks a lot!

Sure thing, you should be able to follow along on the web as well:

https://lore.kernel.org/20230722220637.GA138486@dev-arch.thelio-3990X/