This is an archive of the discontinued LLVM Phabricator instance.

Additional fix for PR14269: Clang crashes when a bit field is used as inline assembler input / output with memory constraint
ClosedPublic

Authored by andreybokhanko on Jun 16 2015, 8:11 AM.

Details

Summary

In his post-review for my fix (http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150608/130464.html) Richard Smith wrote:

What about other weird kinds of lvalues, like the result of __real / __imag, vector indexing, and global register variables? Those have the same problem; CGStmt.cpp blindly calls LValue::getAddress without checking for those cases.

This patch fixes these cases:

like the result of __real / __imag

This already works well -- no errors printed for real (as they can be handled just fine), a error ("invalid lvalue in asm output") is printed for imag.

I can't create a test case that leads either to compile time fail or incorrect code generation.

vector indexing

Fixed

and global register variables

Fixed

Diff Detail

Event Timeline

andreybokhanko retitled this revision from to Additional fix for PR14269: Clang crashes when a bit field is used as inline assembler input / output with memory constraint.
andreybokhanko updated this object.
andreybokhanko edited the test plan for this revision. (Show Details)
andreybokhanko added a subscriber: Unknown Object (MLST).

Pinging Richard!

echristo edited edge metadata.Jul 7 2015, 11:33 AM

This seems to be a bit overconstrained versus what gcc accepts on the testcase:

dzur:~/tmp> gcc -S baz.c |
baz.c: In function ‘test16’: |
baz.c:18:4: error: cannot take address of bit-field ‘field3’ |

: "m" (a.field3)); // expected-error {{reference to a non-addressable value in asm input with a me\|

m |

^                                                                                                  |

baz.c:28:4: error: address of global register variable ‘test16_baz’ requested |

: "m" (test16_baz)); // expected-error {{reference to a non-addressable value in asm input with a \|

m |

^                                                                                                  |

baz.c:16:3: error: memory input 0 is not directly addressable |

__asm__("movl $5, %0"                                                                               |
^                                                                                                   |

baz.c:26:3: error: memory input 0 is not directly addressable |

__asm__("movl $5, %0"                                                                               |
^                                                                                                   |

also, the precision in error messages is nice. :)

-eric

andreybokhanko edited edge metadata.

Error message became more specific (as requested in Eric's comment); check for memory constraint compatibility put into a separate helper function (checkExprMemoryConstraintCompat).

Hi Eric,

Thanks for looking into the patch!

This seems to be a bit overconstrained versus what gcc accepts on the testcase:

There are three cases that GCC accepts and clang (with my fix) doesn't:

  1. Output to a bit-field:
typedef struct test16_foo {
  unsigned int field1 : 1;
  unsigned int field2 : 2;
  unsigned int field3 : 3;
} test16_foo;

void test16()
{
  test16_foo a;

  __asm__("movl $5, %0"
          : "=rm" (a.field2)); // expected-error {{reference to a non-addressable value in asm output with a memory constraint '=rm'}}
}

GCC accepts this, but generates incorrect code, that is not even accepted by assembler:

$ gcc test1.c
test1.c: Assembler messages:
test1.c:11: Error: `%al' not allowed with `movl'
  1. Vector elements:
typedef __attribute__((vector_size(16))) int test16_bar;

int main()
{
  test16_bar b = {1, 2, 3, 4};

  __asm__("movl $5, %0"
          : "=rm" (b[2]));

  return b[2];
}

The problem here is that LLVM IR represents vectors with a specific vector type; you can't get address of a random element inside vector. Specific instructions should be used to get individual vector elements ("extractelement" and "insertelement"), but then again -- they don't provide addresses of elements. GCC simply treats a vector as an array of elements and computes desired address. In theory, this can be done in LLVM IR as well, but I don't think this is the right approach -- we generally can't make any assumptions on how vectors are represented by a target CPU.

Do you agree?

  1. Global register variables:
register int test16_baz asm("rbx");

void test16()
{
  __asm__("movl $5, %0"
          : "=rm" (test16_baz)); // expected-error {{reference to a non-addressable value in asm output with a memory constraint '=rm'}}
}

The constraint here says "register *or* memory". GCC chooses register and compiles the test fine. Clang always chooses memory -- due to the following check at CGStmt.cpp:1874:

// If this is a register output, then make the inline asm return it
// by-value.  If this is a memory result, return the value by-reference.
if (!Info.allowsMemory() && hasScalarEvaluationKind(OutExpr->getType())) {

I have no idea why it checks for "!info.allowsMemory()" and not for "info.allowsRegister", but this is a separate issue, not related to my patch.

When a test is re-written in a way that allows *only* memory, GCC complains as well:

register int test16_baz asm("rbx");

void test16()
{
  __asm__("movl $5, %0"
          : "=m" (test16_baz)); // expected-error {{reference to a non-addressable value in asm output with a memory constraint '=rm'}}
}
$ gcc test3.c
test3.c: In function ?test16?:
test3.c:6:11: error: address of global register variable ?test16_baz? requested
           : "=m" (test16_baz)); // expected-error {{reference to a non-addressable value in asm output with a memory constraint '=rm'}}
           ^
test3.c:5:3: error: invalid lvalue in asm output 0
   __asm__("movl $5, %0"
   ^

also, the precision in error messages is nice. :)

Done!

Please re-review.

Andrey

The problem here is that LLVM IR represents vectors with a specific vector type; you can't get address of a random element inside vector. Specific instructions should be used to get individual vector elements ("extractelement" and "insertelement"), but then again -- they don't provide addresses of elements. GCC simply treats a vector as an array of elements and computes desired address. In theory, this can be done in LLVM IR as well, but I don't think this is the right approach -- we generally can't make any assumptions on how vectors are represented by a target CPU.

Do you agree?

I agree; we don't want to tie frontend functionality to a specific representation of the vectors.

We could support this, but we'd need to do it by:

  1. Creating a local stack variable (alloca)
  2. Extracting the requested vector element and storing it in that stack-allocated memory
  3. Providing the address of the local stack variable to the inline asm
  4. After the inline asm, loading the value from the local stack variable and inserting it back into the vector

I have no opinion on whether or not this is worth implementing.

I agree; we don't want to tie frontend functionality to a specific representation of the vectors.

We could support this, but we'd need to do it by:

  1. Creating a local stack variable (alloca)
  2. Extracting the requested vector element and storing it in that stack-allocated memory
  3. Providing the address of the local stack variable to the inline asm
  4. After the inline asm, loading the value from the local stack variable and inserting it back into the vector

I have no opinion on whether or not this is worth implementing.

Hal, what you suggested means basically creating a new local variable, copying value of a vector element to it and then providing address of this local variable, not original vector element. I'm not sure that preserves semantic of inline assembly's "m" restriction, as it asks for memory address of original variable, not some copy.

Eric, what do you think?

Andrey

I agree; we don't want to tie frontend functionality to a specific representation of the vectors.

We could support this, but we'd need to do it by:

  1. Creating a local stack variable (alloca)
  2. Extracting the requested vector element and storing it in that stack-allocated memory
  3. Providing the address of the local stack variable to the inline asm
  4. After the inline asm, loading the value from the local stack variable and inserting it back into the vector

I have no opinion on whether or not this is worth implementing.

Hal, what you suggested means basically creating a new local variable, copying value of a vector element to it and then providing address of this local variable, not original vector element. I'm not sure that preserves semantic of inline assembly's "m" restriction, as it asks for memory address of original variable, not some copy.

The real question is: Is the difference observable? When I made the suggestion, I did so because I felt the answer was no. But this is not true if you capture the address to use later. Thus, the difference is observable, and I'll vote that we simply not support this case.

Eric, what do you think?

Andrey

echristo accepted this revision.Jul 24 2015, 10:47 AM
echristo edited edge metadata.

I agree; we don't want to tie frontend functionality to a specific representation of the vectors.

We could support this, but we'd need to do it by:

  1. Creating a local stack variable (alloca)
  2. Extracting the requested vector element and storing it in that stack-allocated memory
  3. Providing the address of the local stack variable to the inline asm
  4. After the inline asm, loading the value from the local stack variable and inserting it back into the vector

I have no opinion on whether or not this is worth implementing.

Hal, what you suggested means basically creating a new local variable, copying value of a vector element to it and then providing address of this local variable, not original vector element. I'm not sure that preserves semantic of inline assembly's "m" restriction, as it asks for memory address of original variable, not some copy.

The real question is: Is the difference observable? When I made the suggestion, I did so because I felt the answer was no. But this is not true if you capture the address to use later. Thus, the difference is observable, and I'll vote that we simply not support this case.

Eric, what do you think?

I agree here.

Let's go with this patch right now and do anything else incrementally.

Thanks!

-eric

This revision is now accepted and ready to land.Jul 24 2015, 10:47 AM
This revision was automatically updated to reflect the committed changes.