This is an archive of the discontinued LLVM Phabricator instance.

[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts
ClosedPublic

Authored by pratlucas on May 11 2020, 9:18 AM.

Details

Summary

During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly
assuming all the pointers from which loads were being generated for vld1
intrinsics were aligned according to the intrinsics result type, causing
alignment faults on the code generated by the backend.

This patch updates vld1 intrinsics' CodeGen to properly capture the
correct load alignment based on the type of the pointer provided as
input for the intrinsic.

Diff Detail

Event Timeline

pratlucas created this revision.May 11 2020, 9:18 AM
This revision is now accepted and ready to land.May 20 2020, 3:52 AM
This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.May 26 2020, 3:02 PM
clang/lib/CodeGen/CGBuiltin.cpp
10360

IgnoreParenCasts() here seems really dubious. You could easily end up with a type that's not even a pointer.

Can we just set the alignment to 1 and leave it at that? It's not like it really matters for NEON.

plotfi added a subscriber: plotfi.May 26 2020, 3:34 PM

@efriedma I am also seeing a similar crash in the llvm-test-suite for llvm-test-suite/SingleSource/UnitTests/Vector/NEON/simple.c

I will try and reduce the case.

After reducing the test case it looks the same as the bugzilla filing, running clang --target=aarch64-unknown-linux-gnu -c on the following does it:

typedef signed char int8_t;
typedef __attribute__((neon_vector_type(16))) int8_t int8x16_t;
typedef struct int8x16x2_t {
  int8x16_t val[2];
} int8x16x2_t;

int8x16_t f() {
  int8_t d[16];
  return __extension__ ({ int8x16_t __ret; __ret = (int8x16_t) __builtin_neon_vld1q_v(d, 32); __ret; });
}

Reduced crash case even further:

void f() {
  signed char d[16];
  __builtin_neon_vld1q_v(d, 32);
}

@pratlucas please reland once the bugzilla issue is resolved and the testsuite builds for aarch64. Thanks @echristo

pratlucas reopened this revision.May 27 2020, 5:41 AM
This revision is now accepted and ready to land.May 27 2020, 5:41 AM
pratlucas updated this revision to Diff 266504.May 27 2020, 5:45 AM

Hi @efriedma and @plotf,

Thank you the reduced test and for reverting the original patch.
I've updated it to avoid the issue while still capturing the proper argument alignment.
Do you mind taking a look before a reland it?

I'm not completely happy with using EmitPointerWithAlignment here... but I guess it's the same thing we do for 32-bit ARM, so it must be mostly usable in practice. I'd like to see some tests in clang/test/CodeGen/aarch64-neon-intrinsics.c showing what happens if you pass, for example, a void* pointer to vld1_u32.

clang/lib/CodeGen/CGBuiltin.cpp
10359

You can't call EmitPointerWithAlignment() here; we've already emitted the expression. See CodeGenFunction::EmitARMBuiltinExpr for how we handle this on 32-bit ARM.

10360

Might as well just return Builder.CreateLoad(VTy, PtrOp0);

10387

Should we fix vst1 while we're in the area?

pratlucas updated this revision to Diff 267905.Jun 2 2020, 9:25 AM

Addressing review comments and extending tests.

pratlucas marked 5 inline comments as done.Jun 2 2020, 9:29 AM
pratlucas added inline comments.
clang/lib/CodeGen/CGBuiltin.cpp
10360

The bitcast of the pointer for the proper vector type is required prior to the load, so Builder.CreateLoad may not be the best fit here.

pratlucas marked an inline comment as done.Jun 2 2020, 9:29 AM
efriedma accepted this revision.Jun 2 2020, 3:13 PM

LGTM

This revision was automatically updated to reflect the committed changes.