This is an archive of the discontinued LLVM Phabricator instance.


Differential D48803

Place the BlockAddress type in the address space of the containing function
ClosedPublic

Authored by arichardson on Jun 30 2018, 6:26 AM.

Details

Reviewers

bjope
dylanmckay
theraven
arsenm
jdoerfert

Commits

rGc142c06c19b3: Place the BlockAddress type in the address space of the containing function

Summary

While this should not matter for most architectures (where the program
address space is 0), it is important for CHERI. We use address space 200
for all of our code pointers and without this change we assert in
SelectionDAG handling of BlockAddress nodes.

It is also useful for AVR: previously, programs targeting
AVR that attempted to read their own machine code
via a pointer to a label would instead read from RAM,
using a pointer relative to the start of program flash.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

arichardson created this revision. Jun 30 2018, 6:26 AM

Herald added subscribers: llvm-commits, wdng. Jun 30 2018, 6:26 AM

I don't know much about the BlockAddress concept. The LangRef says things like "always has an i8* type" and "this may be passed around as an opaque pointer sized value". But I guess it would be weird if the size doesn't match the size of pointers in the program address space, so the patch makes sense to me.

I assume that this can't be reproduced for any in-tree target?
If you can't find an in-tree reproducer, then maybe you can describe the problem a little bit more instead. Such as which assert you hit, and maybe a small stack trace. That might help when trying to motivate this patch in the future.

In D48803#1149383, @bjope wrote:

I don't know much about the BlockAddress concept. The LangRef says things like "always has an i8* type" and "this may be passed around as an opaque pointer sized value". But I guess it would be weird if the size doesn't match the size of pointers in the program address space, so the patch makes sense to me.

I assume that this can't be reproduced for any in-tree target?
If you can't find an in-tree reproducer, then maybe you can describe the problem a little bit more instead. Such as which assert you hit, and maybe a small stack trace. That might help when trying to motivate this patch in the future.

Yes I'm not sure I can make a test for this with any of the existing targets. I'll see if I can get something with AVR since that sets program address space to 1.

Nice patch, this looks useful!

Yes I'm not sure I can make a test for this with any of the existing targets. I'll see if I can get something with AVR since that sets program address space to 1.

Here's a test for you that does it.

There's another bug in LLParser that stops nonzero program address spaces from working: if the function referenced in a block address is not known in the first pass of the LLParser (for example, when the blockaddress appears earlier in the IR file than the function definition), the LLParser must insert a forward reference for the function. It does this by creating a new global variable, but it unconditionally leaves the global variable in the default address space of zero.

The diff I have included has a fix for this.

I've also amended the LangRef docs so that they would be accurate under the new patch.
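
The forward-reference fix described above can be pictured with a small, self-contained toy. This is only a sketch: `ToyParser`, `Placeholder`, and `getOrCreateForwardRef` are hypothetical names invented for illustration and are not LLVM types or APIs. It models one thing, namely that the placeholder is created in the program address space rather than in a hard-coded address space 0.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; none of these are LLVM types.
struct Placeholder {
  unsigned AddrSpace; // address space the placeholder global lives in
};

struct ToyParser {
  unsigned ProgramAddrSpace; // assumed to come from the data layout
  // (function name, label name) -> placeholder for a block not yet parsed
  std::map<std::pair<std::string, std::string>, Placeholder> ForwardRefs;

  // Return the placeholder for blockaddress(@Fn, %Label), creating it on
  // first use. New placeholders are put in the program address space
  // instead of always using address space 0.
  const Placeholder &getOrCreateForwardRef(const std::string &Fn,
                                           const std::string &Label) {
    auto Key = std::make_pair(Fn, Label);
    auto It = ForwardRefs.find(Key);
    if (It == ForwardRefs.end())
      It = ForwardRefs.emplace(Key, Placeholder{ProgramAddrSpace}).first;
    return It->second;
  }
};
```

With `ProgramAddrSpace` set to 1 (as in the AVR test), a placeholder requested before its function is seen ends up in address space 1, and a second request for the same function and label reuses it.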

diff --git a/docs/LangRef.rst b/docs/LangRef.rst
index 06e092fb9fc..deac223d1a1 100644
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -3275,7 +3275,16 @@ Addresses of Basic Blocks
 ``blockaddress(@function, %block)``
 
 The '``blockaddress``' constant computes the address of the specified
-basic block in the specified function, and always has an ``i8*`` type.
+basic block in the specified function.
+
+It always has an ``i8 addrspace(P)*`` type, where ``P`` is the program
+memory address space specified in the data layout. For targets that place
+code and data in the same address space (Von-Neumann architectures) a block
+address will have the same address space as data pointers, usually
+``addrspace(0)``. Block addresses on targets that have different data and
+code address spaces (Harvard architectures) will always be in the program
+memory address space specified in the target's data layout.
+
 Taking the address of the entry block is illegal.
 
 This value only has defined behavior when used as an operand to the
diff --git a/lib/AsmParser/LLParser.cpp b/lib/AsmParser/LLParser.cpp
index 5fe1e125d48..6581436c20f 100644
--- a/lib/AsmParser/LLParser.cpp
+++ b/lib/AsmParser/LLParser.cpp
@@ -3154,9 +3154,13 @@ bool LLParser::ParseValID(ValID &ID, PerFunctionState *PFS) {
                                               std::map<ValID, GlobalValue *>()))
               .first->second.insert(std::make_pair(std::move(Label), nullptr))
               .first->second;
-      if (!FwdRef)
+      if (!FwdRef) {
         FwdRef = new GlobalVariable(*M, Type::getInt8Ty(Context), false,
-                                    GlobalValue::InternalLinkage, nullptr, "");
+                                  GlobalValue::InternalLinkage, nullptr, "",
+                                  nullptr, GlobalValue::NotThreadLocal,
+                                  M->getDataLayout().getProgramAddressSpace());
+      }
+
       ID.ConstantVal = FwdRef;
       ID.Kind = ValID::t_Constant;
       return false;
diff --git a/test/CodeGen/AVR/block-address-is-in-progmem-space.ll b/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
new file mode 100644
index 00000000000..8e6e3a71062
--- /dev/null
+++ b/test/CodeGen/AVR/block-address-is-in-progmem-space.ll
@@ -0,0 +1,51 @@
+; RUN: llc -mcpu=atmega328 < %s -march=avr | FileCheck %s
+
+; This test verifies that the pointer to a basic block
+; should always be a pointer in address space 1.
+;
+; If this were not the case, then programs targeting
+; AVR that attempted to read their own machine code
+; via a pointer to a label would actually read from RAM
+; using a pointer relative to the start of program flash.
+;
+; This would cause a load of uninitialized memory, not even
+; touching the program's machine code as otherwise desired.
+
+target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
+
+; CHECK-LABEL: load_with_no_forward_reference
+define i8 @load_with_no_forward_reference(i8 %a, i8 %b) {
+second:
+  ; CHECK:      ldi r30, .Ltmp0+2
+  ; CHECK-NEXT: ldi r31, .Ltmp0+4
+  ; CHECK: lpm r24, Z
+  %bar = load i8, i8 addrspace(1)* blockaddress(@function_with_no_forward_reference, %second)
+  ret i8 %bar
+}
+
+; CHECK-LABEL: load_from_local_label
+define i8 @load_from_local_label(i8 %a, i8 %b) {
+entry:
+  %result1 = add i8 %a, %b
+
+  br label %second
+
+; CHECK-LABEL: .Ltmp1:
+second:
+  ; CHECK:      ldi r30, .Ltmp1+2
+  ; CHECK-NEXT: ldi r31, .Ltmp1+4
+  ; CHECK-NEXT: lpm r24, Z
+  %result2 = load i8, i8 addrspace(1)* blockaddress(@load_from_local_label, %second)
+  ret i8 %result2
+}
+
+; A function with no forward reference, right at the end
+; of the file.
+define i8 @function_with_no_forward_reference(i8 %a, i8 %b) {
+entry:
+  %result = add i8 %a, %b
+  br label %second
+second:
+  ret i8 0
+}
+
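
The test above relies on the `P1` component of the `target datalayout` string to select program address space 1. As a rough illustration of how that component can be read, here is a standalone sketch; `programAddressSpace` is a hypothetical helper written for this note, not the LLVM `DataLayout` parser, and it ignores every other component of the string.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>

// Illustrative helper, not an LLVM API: return the program address space
// named by a "P<n>" component of a data layout string, or 0 if none exists.
unsigned programAddressSpace(const std::string &Layout) {
  std::istringstream In(Layout);
  std::string Spec;
  while (std::getline(In, Spec, '-')) {
    // Only an upper-case 'P' followed by digits selects the program
    // address space; lower-case 'p' describes pointer size and alignment.
    if (Spec.size() < 2 || Spec[0] != 'P')
      continue;
    bool AllDigits = true;
    for (std::size_t I = 1; I < Spec.size(); ++I)
      AllDigits = AllDigits && std::isdigit(static_cast<unsigned char>(Spec[I]));
    if (AllDigits)
      return static_cast<unsigned>(std::stoul(Spec.substr(1)));
  }
  return 0; // no P<n> component: code lives in address space 0
}
```

For the layout string used in the test, this returns 1; for a layout without a `P` component it returns 0.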

dylanmckay requested changes to this revision. Nov 15 2018, 11:05 PM

This revision now requires changes to proceed. Nov 15 2018, 11:05 PM

Rebase on latest master and merge suggestions

Herald added a reviewer: jdoerfert. Sep 1 2020, 11:15 AM

Herald added a project: Restricted Project.

Herald added subscribers: Jim, hiraditya.

clang-format

Harbormaster completed remote builds in B70266: Diff 289229. Sep 1 2020, 11:48 AM

Harbormaster completed remote builds in B70267: Diff 289230. Sep 1 2020, 11:59 AM

Looks good to me. This bakes in the assumption that function pointers and basic block addresses are always in the same address space. That seems reasonable to me but it might be worth documenting in the DataLayout docs about the program address space.

@dylanmckay does this change look good to you now?

arsenm added inline comments. Sep 9 2020, 8:02 AM

llvm/lib/AsmParser/LLParser.cpp
3396	Why wouldn't this come from the parent function? You should be able to mix functions with different address spaces in the same module

bjope added inline comments. Oct 5 2020, 4:42 AM

llvm/lib/AsmParser/LLParser.cpp
3396	(Maybe @arichardson has a different reason, but sharing my point of view here anyway.) While it's possible to annotate calls and function definitions with non-zero program address spaces, I think one needs to be consistent. I don't think we really support multiple program address spaces (is there an actual use case for supporting that?). I'm also not exactly sure what you mean by "parent function". The addrspace in the resulting pointer type needs to match the addrspace of the function referenced in the first argument of the blockaddress. And that function has not been defined yet, since we are inside the "!F" clause. I guess we have to trust the datalayout if the function hasn't been defined yet (or use some kind of forward ref and backtrack to fill in the addrspace to get the correct type later). I wonder if we'd get some kind of type error later if we assume that the datalayout is correct here, and we find a different addrspace when finding the function definition later? (We also have the usual problem that if the datalayout is set by a datalayout definition that comes later in the ll file, we haven't parsed the datalayout yet. But if I remember correctly that is a general problem also for the function definitions etc.)

dylanmckay requested changes to this revision. Oct 27 2020, 4:29 AM

dylanmckay added inline comments.

llvm/lib/AsmParser/LLParser.cpp
3396	> I'm also not exactly sure what you mean by "parent function". The addrspace in the resulting pointer type needs to match the addrspace of the function referenced in the first argument of the blockaddress. And that function has not been defined yet, since we are inside the "!F" clause.

I suspect @arsenm is suggesting that the address space be copied from the `LLParserPerFunctionState* PFS` argument this function has. For example, replace `M->getDataLayout().getProgramAddressSpace()` with `PFS->getFunction()->getFunctionType()->getPointerAddressSpace()` ([`PerFunctionState::getFunction` documentation](https://ldhldh.myds.me:10081/docs/llvm700/classllvm_1_1_l_l_parser_1_1_per_function_state.html#a9b14016c1937d715c8f305742547764e)).

This seems like a better alternative, as it means that if a function did opt to use a different address space from the program address space in the data layout, the blockaddresses within it will use that same address space rather than assuming the one from the datalayout.

This also makes an assumption about the address space of the target function, although I feel the case for that assumption is stronger than the current one of "assume address space from datalayout", as the address space from the parent function could be considered "closer to the source". Indeed, we cannot directly look up the actual address space for this block address in this branch, as this is the case where the actual function has not been defined yet and so the address space information is not available.

Please make this replacement and then the patch should be good to go.

This revision now requires changes to proceed. Oct 27 2020, 4:29 AM

arichardson planned changes to this revision. Oct 27 2020, 4:32 AM

arichardson added inline comments.

llvm/lib/AsmParser/LLParser.cpp
3396	Thanks, will do that and add a test case.

bjope added inline comments. Oct 27 2020, 5:28 AM

llvm/lib/AsmParser/LLParser.cpp
3396	Well, taking the address space from the wrong function does not help for "You should be able to mix functions with different address spaces in the same module.". If heading in that direction then please add some nifty code comment explaining what is going on. When using the datalayout from the module it is easy to understand that we aren't using any function specific information, but if you use the function pointer from the wrong function to derive the address space it might fool someone that it either is the correct function that is being used, or that it is an "unintended bug" rather than a "hacky workaround".

arsenm added inline comments. Oct 29 2020, 1:21 PM

llvm/lib/AsmParser/LLParser.cpp
3396	block address is always in the context of a function, there's no wrong function to choose. You just use the address space from the parent function

Herald added a subscriber: dexonsmith. Oct 29 2020, 1:21 PM

bjope added inline comments. Oct 29 2020, 3:11 PM

llvm/lib/AsmParser/LLParser.cpp
3396	Sure, so then we agree that we don't mix address spaces for program (there can be only one and taking it from any function would be ok). Still, there are at least some lit test cases (for example the AVR/brind.ll test modified in this patch) that have global variables involving blockaddress with forward references to functions not yet defined. In such cases there is no parent function, right? We only got the function referenced in the blockaddress argument, and its definition might not have been parsed yet. Or maybe I've simply misunderstood when we end up in this part of the code (if all the test cases pass I guess things are in order).

arsenm added inline comments. Oct 29 2020, 3:15 PM

llvm/lib/AsmParser/LLParser.cpp
3396	If the function type already exists, its address space is set. It's difficult to go back and rewrite IR types, so I don't think the IR parser would be doing this

Make use of per-function state and add various tests

Harbormaster completed remote builds in B80369: Diff 308089. Nov 27 2020, 11:28 AM

Restore Constants.cpp change that was accidentally dropped while updating this revision

arichardson retitled this revision from "Place the BlockAddress type in the program address space" to "Place the BlockAddress type in the address space of the containing function". Nov 29 2020, 5:05 AM

Update langref

Harbormaster completed remote builds in B80447: Diff 308208. Nov 29 2020, 5:41 AM

Harbormaster completed remote builds in B80448: Diff 308209. Nov 29 2020, 5:45 AM

dstenb added a subscriber: dstenb. Feb 15 2021, 11:19 PM

Hi, @arichardson! What is the status for this patch?

I was waiting for an approval from one of the reviewers since it has changed quite a bit since the initial version.

rebased. @dylanmckay does this look okay now?

Harbormaster completed remote builds in B89524: Diff 324253. Feb 17 2021, 4:13 AM

arsenm added inline comments. Feb 17 2021, 6:22 AM

llvm/docs/LangRef.rst
3748–3749	I think this is overexplaining it. The IR is the same regardless of what the target wants to do. It should match the code address space of the function
llvm/lib/AsmParser/LLParser.cpp
3398	This should not depend on pointee element types
3400–3401	Pointee types do not really exist, the error should not mention them

arichardson added inline comments. Feb 17 2021, 6:30 AM

llvm/docs/LangRef.rst
3748–3749	I believe the following should be sufficient, right?

> It always has an ``i8 addrspace(P)*`` type, where ``P`` is the address space of the function containing ``%block``.

@dylanmckay are you happy with dropping the remaining text?
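
To make the proposed LangRef wording concrete, here is a tiny sketch of how the resulting typed-pointer type would be spelled for a given function address space. `blockAddressTypeName` is a hypothetical helper for illustration only and is not part of LLVM; it assumes the textual IR convention visible elsewhere in this review, where address space 0 is written as plain `i8*`.

```cpp
#include <cassert>
#include <string>

// Illustrative only: spell the typed-pointer type of a blockaddress whose
// containing function lives in address space AS.
std::string blockAddressTypeName(unsigned AS) {
  if (AS == 0)
    return "i8*"; // address space 0 is left implicit in textual IR
  return "i8 addrspace(" + std::to_string(AS) + ")*";
}
```

For a function in address space 1 (as on AVR) this yields `i8 addrspace(1)*`, matching the loads in the AVR test.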

update error message
Simplify langref

llvm/lib/AsmParser/LLParser.cpp
3398	I've kept this check for now (with a TODO) and only updated the error message.

Harbormaster completed remote builds in B89546: Diff 324301. Feb 17 2021, 7:44 AM

ping?

Herald added a subscriber: jrtc27. Mar 8 2021, 2:02 AM

ping?

dexonsmith removed a subscriber: dexonsmith. Mar 24 2021, 12:14 PM

@dylanmckay ping? Would like to get this committed soon, it's only been 1.5 years

@arsenm / @dylanmckay is this okay to commit now?

I will commit this based on the previous approval in 1 month (1st June) unless there are any further comments.

In D48803#2721948, @arichardson wrote:

I will commit this based on the previous approval in 1 month (1st June) unless there are any further comments.

Looks like this wasn't re-approved after the changes were requested.
I would suggest submitting an RFC to llvm-dev.

In D48803#2722120, @lebedev.ri wrote:

In D48803#2721948, @arichardson wrote:

I will commit this based on the previous approval in 1 month (1st June) unless there are any further comments.

Looks like this wasn't re-approved after the changes were requested.
I would suggest submitting an RFC to llvm-dev.

That makes sense since the pinging here has been ignored.

FWIW, we have merged this downstream some time ago. If I recall correctly it was related to adding support for inline asm goto.

So the solution seems to work fine for us (but I thought it would be rude to approve it with @dylanmckay and @arsenm being pinged in person).

arsenm added inline comments. Apr 28 2021, 5:58 AM

llvm/lib/AsmParser/LLParser.cpp
3398	No, just rip out the type check. It's not needed now, and if it is, it's the verifier's responsibility to check it
3409	Typos programm adress
3410–3412	I prefer to not add code that you can't test and should be unreachable

Thanks for the review, will update shortly.

llvm/lib/AsmParser/LLParser.cpp
3398	Will move check to verifier if there isn't one already.
3410–3412	Sounds good, will change to llvm_unreachable("").

Drop check for i8 as suggested by @arsenm

arichardson marked 4 inline comments as done. Apr 28 2021, 10:19 AM

Harbormaster completed remote builds in B101458: Diff 341256. Apr 28 2021, 12:37 PM

My apologies for the couple month latency, almost every time I've checked this patch in the last few years there isn't any new activity. It looks like there's one comment from Matt last month that still needs to be addressed (remove a piece of error handling). Once updated, I'd like to get this approved.

llvm/lib/AsmParser/LLParser.cpp
3398	The error

    + if (!ExpectedTy->isPointerTy())
    +   return error(ID.Loc,
    +                "type of blockaddress must be a pointer and not '" +
    +                    getTypeString(ExpectedTy) + "'");

still exists and hasn't been moved to the verifier or removed from this file.

This revision now requires changes to proceed. May 30 2021, 5:12 AM

I believe all remaining issues were addressed in the latest version.

llvm/lib/AsmParser/LLParser.cpp
3398	The problem was checking the pointee type, which I've removed. We need to check if it's a pointer since otherwise we can't call `ExpectedTy->getPointerAddressSpace()`.
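
The point above (check for a pointer first, then read its address space) can be sketched outside LLVM as follows. Everything here is a toy: `ToyType` and `pickForwardRefAddrSpace` are hypothetical names, and the fallback order (expected type first, then the enclosing function) is an assumption pieced together from this thread, not a copy of the committed code.

```cpp
#include <cassert>

// Toy model of a type: either a pointer in some address space, or not a
// pointer at all. Not an LLVM type.
struct ToyType {
  bool IsPointer;
  unsigned AddrSpace; // only meaningful when IsPointer is true
};

// Pick the address space for a forward-referenced blockaddress placeholder.
// Returns false on error and leaves AS untouched.
//  - If an expected type is known, it must be a pointer; use its address
//    space (asking a non-pointer for one would be meaningless).
//  - Otherwise fall back to the address space of the enclosing function.
//  - With neither available this toy reports failure; the review suggests
//    llvm_unreachable for that case in the real parser.
bool pickForwardRefAddrSpace(const ToyType *ExpectedTy,
                             const unsigned *EnclosingFnAS, unsigned &AS) {
  if (ExpectedTy) {
    if (!ExpectedTy->IsPointer)
      return false; // "type of blockaddress must be a pointer"
    AS = ExpectedTy->AddrSpace;
    return true;
  }
  if (EnclosingFnAS) {
    AS = *EnclosingFnAS;
    return true;
  }
  return false;
}
```

Passing a pointer type in address space 200 selects 200 even when the enclosing function is in address space 1; passing no expected type falls back to the function's address space.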

Thanks for the clarification, makes sense. LGTM

This revision is now accepted and ready to land. May 31 2021, 9:41 AM

This revision was landed with ongoing or failed builds. Jul 2 2021, 4:19 AM

Closed by commit rGc142c06c19b3: Place the BlockAddress type in the address space of the containing function (authored by arichardson).

This revision was automatically updated to reflect the committed changes.

arichardson marked an inline comment as done.

arichardson added a commit: rGc142c06c19b3: Place the BlockAddress type in the address space of the containing function.

Revision Contents

Path                                                          Size
llvm/docs/LangRef.rst                                         11 lines
llvm/lib/AsmParser/LLParser.cpp                               8 lines
llvm/lib/IR/Constants.cpp                                     4 lines
llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll    51 lines
llvm/test/CodeGen/AVR/brind.ll                                8 lines

Diff 289229

llvm/docs/LangRef.rst

[3,731 lines not shown]
 .. _blockaddress:

 Addresses of Basic Blocks
 -------------------------

 ``blockaddress(@function, %block)``

 The '``blockaddress``' constant computes the address of the specified
-basic block in the specified function, and always has an ``i8*`` type.
+basic block in the specified function.
+
+It always has an ``i8 addrspace(P)*`` type, where ``P`` is the program
+memory address space specified in the data layout. For targets that place
+code and data in the same address space (Von-Neumann architectures) a block
+address will have the same address space as data pointers, usually
+``addrspace(0)``. Block addresses on targets that have different data and
+code address spaces (Harvard architectures) will always be in the program
+memory address space specified in the target's data layout.

 Taking the address of the entry block is illegal.

 This value only has defined behavior when used as an operand to the
 ':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`'instruction, or
 for comparisons against null. Pointer equality tests between labels addresses
 results in undefined behavior --- though, again, comparison against null is ok,
 and no label is equal to the null pointer. This may be passed around as an
 opaque pointer sized value as long as the bits are not inspected. This
[17,022 lines not shown]

llvm/lib/AsmParser/LLParser.cpp

 //===-- LLParser.cpp - Parser Class ---------------------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//
 //
[3,373 lines not shown] case lltok::kw_blockaddress: {
 if (!F) {
 // Make a global variable as a placeholder for this reference.
 GlobalValue *&FwdRef =
 ForwardRefBlockAddresses.insert(std::make_pair(
 std::move(Fn),
 std::map<ValID, GlobalValue *>()))
 .first->second.insert(std::make_pair(std::move(Label), nullptr))
 .first->second;
-if (!FwdRef)
+if (!FwdRef) {
 FwdRef = new GlobalVariable(*M, Type::getInt8Ty(Context), false,
-GlobalValue::InternalLinkage, nullptr, "");
+GlobalValue::InternalLinkage, nullptr, "",
+nullptr, GlobalValue::NotThreadLocal,
+M->getDataLayout().getProgramAddressSpace());
+}

 ID.ConstantVal = FwdRef;
 ID.Kind = ValID::t_Constant;
return false;		return false;
}		}

		arsenm: Pointee types do not really exist; the error should not mention them.
// We found the function; now find the basic block. Don't use PFS, since we		// We found the function; now find the basic block. Don't use PFS, since we
// might be inside a constant expression.		// might be inside a constant expression.
BasicBlock *BB;		BasicBlock *BB;
if (BlockAddressPFS && F == &BlockAddressPFS->getFunction()) {		if (BlockAddressPFS && F == &BlockAddressPFS->getFunction()) {
if (Label.Kind == ValID::t_LocalID)		if (Label.Kind == ValID::t_LocalID)
BB = BlockAddressPFS->GetBB(Label.UIntVal, Label.Loc);		BB = BlockAddressPFS->GetBB(Label.UIntVal, Label.Loc);
else		else
BB = BlockAddressPFS->GetBB(Label.StrVal, Label.Loc);		BB = BlockAddressPFS->GetBB(Label.StrVal, Label.Loc);
		arsenm: Typos: programm, adress.
if (!BB)		if (!BB)
return Error(Label.Loc, "referenced value is not a basic block");		return Error(Label.Loc, "referenced value is not a basic block");
} else {		} else {
		arsenm: I prefer to not add code that you can't test and should be unreachable.
		arichardson (author): Sounds good, will change to llvm_unreachable("").
if (Label.Kind == ValID::t_LocalID)		if (Label.Kind == ValID::t_LocalID)
return Error(Label.Loc, "cannot take address of numeric label after "		return Error(Label.Loc, "cannot take address of numeric label after "
"the function is defined");		"the function is defined");
BB = dyn_cast_or_null<BasicBlock>(		BB = dyn_cast_or_null<BasicBlock>(
F->getValueSymbolTable()->lookup(Label.StrVal));		F->getValueSymbolTable()->lookup(Label.StrVal));
if (!BB)		if (!BB)
return Error(Label.Loc, "referenced value is not a basic block");		return Error(Label.Loc, "referenced value is not a basic block");
}		}
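
The review thread above ends with only the pointer check being kept, because an address space can only be read from a pointer type. The following is a minimal sketch of that ordering, not the parser's actual code: `ToyType`, `AddrSpaceResult`, and `addressSpaceOfExpectedType` are hypothetical stand-ins introduced for illustration, and what the parser does with the resulting address space is outside the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a type as the parser sees it (not llvm::Type).
struct ToyType {
  bool IsPointer;
  unsigned AddrSpace; // only meaningful when IsPointer is true
  std::string Name;   // printable form, used in the error message
};

// Hypothetical result record: either an address space or an error message.
struct AddrSpaceResult {
  bool Ok;
  unsigned AddrSpace;
  std::string Error;
};

// Check "is it a pointer?" first, then read the address space.
// The pointee type is never inspected, as requested in the review.
AddrSpaceResult addressSpaceOfExpectedType(const ToyType &Expected) {
  if (!Expected.IsPointer)
    return {false, 0,
            "type of blockaddress must be a pointer and not '" +
                Expected.Name + "'"};
  return {true, Expected.AddrSpace, ""};
}
```

Passing a non-pointer such as `i8` yields the error text quoted in the review, while a pointer in address space 1 yields `Ok` with `AddrSpace == 1`.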

llvm/lib/IR/Constants.cpp

//===-- Constants.cpp - Implement Constant nodes --------------------------===//		//===-- Constants.cpp - Implement Constant nodes --------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
	BlockAddress *BlockAddress::get(Function *F, BasicBlock *BB) {
if (!BA)		if (!BA)
BA = new BlockAddress(F, BB);		BA = new BlockAddress(F, BB);

assert(BA->getFunction() == F && "Basic block moved between functions");		assert(BA->getFunction() == F && "Basic block moved between functions");
return BA;		return BA;
}		}

BlockAddress::BlockAddress(Function *F, BasicBlock *BB)		BlockAddress::BlockAddress(Function *F, BasicBlock *BB)
: Constant(Type::getInt8PtrTy(F->getContext()), Value::BlockAddressVal,		: Constant(Type::getInt8PtrTy(F->getContext(), F->getAddressSpace()),
&Op<0>(), 2) {		Value::BlockAddressVal, &Op<0>(), 2) {
setOperand(0, F);		setOperand(0, F);
setOperand(1, BB);		setOperand(1, BB);
BB->AdjustBlockAddressRefCount(1);		BB->AdjustBlockAddressRefCount(1);
}		}
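
The constructor change above is the core of the patch: the pointer type of a block address now takes its address space from the containing function instead of always using address space 0. Below is a toy model of that difference, assuming hypothetical stand-in types (`ToyFunction`, `ToyPointerType`) rather than the real LLVM classes.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real classes are llvm::Function and
// llvm::PointerType, which carry far more state.
struct ToyFunction {
  unsigned AddrSpace; // address space the function lives in
};

struct ToyPointerType {
  unsigned AddrSpace; // address space of the pointer type
};

// Behaviour before the patch: the type ignores the function entirely
// and always lands in address space 0.
ToyPointerType blockAddressTypeBefore(const ToyFunction &) {
  return {0};
}

// Behaviour after the patch: the type follows the containing function.
ToyPointerType blockAddressTypeAfter(const ToyFunction &F) {
  return {F.AddrSpace};
}
```

For a function in address space 200 (the CHERI case from the summary) the two versions disagree, which is the mismatch the patch removes; for a function in address space 0 they agree, which is why most targets never noticed.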

BlockAddress *BlockAddress::lookup(const BasicBlock *BB) {		BlockAddress *BlockAddress::lookup(const BasicBlock *BB) {
if (!BB->hasAddressTaken())		if (!BB->hasAddressTaken())
return nullptr;		return nullptr;

llvm/test/CodeGen/AVR/block-address-is-in-progmem-space.ll

This file was added.

				; RUN: llc -mcpu=atmega328 < %s -march=avr | FileCheck %s

				; This test verifies that the pointer to a basic block
				; should always be a pointer in address space 1.
				;
				; If this were not the case, then programs targeting
				; AVR that attempted to read their own machine code
				; via a pointer to a label would actually read from RAM
				; using a pointer relative to the start of program flash.
				;
				; This would cause a load of uninitialized memory, not even
				; touching the program's machine code as otherwise desired.

				target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"

				; CHECK-LABEL: load_with_no_forward_reference
				define i8 @load_with_no_forward_reference(i8 %a, i8 %b) {
				second:
				; CHECK: ldi r30, .Ltmp0+2
				; CHECK-NEXT: ldi r31, .Ltmp0+4
				; CHECK: lpm r24, Z
				%bar = load i8, i8 addrspace(1)* blockaddress(@function_with_no_forward_reference, %second)
				ret i8 %bar
				}

				; CHECK-LABEL: load_from_local_label
				define i8 @load_from_local_label(i8 %a, i8 %b) {
				entry:
				%result1 = add i8 %a, %b

				br label %second

				; CHECK-LABEL: .Ltmp1:
				second:
				; CHECK: ldi r30, .Ltmp1+2
				; CHECK-NEXT: ldi r31, .Ltmp1+4
				; CHECK-NEXT: lpm r24, Z
				%result2 = load i8, i8 addrspace(1)* blockaddress(@load_from_local_label, %second)
				ret i8 %result2
				}

				; A function with no forward reference, right at the end
				; of the file.
				define i8 @function_with_no_forward_reference(i8 %a, i8 %b) {
				entry:
				%result = add i8 %a, %b
				br label %second
				second:
				ret i8 0
				}
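
The test above relies on the `P1` component of its `target datalayout` string, which names address space 1 as the program address space; functions defined without an explicit address space are placed there, and so are their block addresses after this patch. As a rough illustration only (LLVM's own data layout parser is more thorough), the hypothetical helper `programAddressSpace` below extracts that component from a layout string and falls back to 0 when it is absent.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper, not an LLVM API: scan the '-'-separated components
// of a data layout string for an upper-case "P<n>" entry and return n.
// Returns 0, the default program address space, when no such entry exists.
// Assumes a well-formed string; "P" followed by a non-number would throw.
unsigned programAddressSpace(const std::string &DataLayout) {
  std::istringstream In(DataLayout);
  std::string Spec;
  while (std::getline(In, Spec, '-')) {
    if (Spec.size() > 1 && Spec[0] == 'P')
      return static_cast<unsigned>(std::stoul(Spec.substr(1)));
  }
  return 0;
}
```

Note that the lower-case `p:16:8` component describes pointer size and alignment and is deliberately not matched by the upper-case test.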

llvm/test/CodeGen/AVR/brind.ll

	; RUN: llc -mattr=sram,eijmpcall < %s -march=avr -verify-machineinstrs | FileCheck %s			; RUN: llc -mattr=sram,eijmpcall < %s -march=avr -verify-machineinstrs | FileCheck %s

	@brind.k = private unnamed_addr constant [2 x i8*] [i8* blockaddress(@brind, %return), i8* blockaddress(@brind, %b)], align 1			@brind.k = private unnamed_addr constant [2 x i8 addrspace(1)*] [i8 addrspace(1)* blockaddress(@brind, %return), i8 addrspace(1)* blockaddress(@brind, %b)], align 1

	define i8 @brind(i8 %p) {			define i8 @brind(i8 %p) {
	; CHECK-LABEL: brind:			; CHECK-LABEL: brind:
	; CHECK: ijmp			; CHECK: ijmp
	entry:			entry:
	%idxprom = sext i8 %p to i16			%idxprom = sext i8 %p to i16
	%arrayidx = getelementptr inbounds [2 x i8*], [2 x i8*]* @brind.k, i16 0, i16 %idxprom			%arrayidx = getelementptr inbounds [2 x i8 addrspace(1)*], [2 x i8 addrspace(1)*]* @brind.k, i16 0, i16 %idxprom
	%s = load i8*, i8** %arrayidx			%s = load i8 addrspace(1)*, i8 addrspace(1)** %arrayidx
	indirectbr i8* %s, [label %return, label %b]			indirectbr i8 addrspace(1)* %s, [label %return, label %b]
	b:			b:
	br label %return			br label %return
	return:			return:
	%retval.0 = phi i8 [ 4, %b ], [ 2, %entry ]			%retval.0 = phi i8 [ 4, %b ], [ 2, %entry ]
	ret i8 %retval.0			ret i8 %retval.0
	}			}