This is an archive of the discontinued LLVM Phabricator instance.

[AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0)
ClosedPublic

Authored by amyk on Jul 18 2023, 7:51 AM.

Details

Summary

This patch utilizes the -maix-small-local-exec-tls option added in
D155544 to produce a faster access sequence for the local-exec TLS
model, where loading from the TOC can be avoided.

The patch either produces an addi/la with a displacement off of r13
(the thread pointer) when the address is calculated, or it produces an
addi/la followed by a load/store when the address is calculated and
used for further accesses.

This patch also optimizes this sequence a bit more where we can remove
the addi/la when the load/store offset is 0. A follow up patch will
be posted to account for when the load/store offset is non-zero, and
currently in these situations we keep the addi/la that precedes the
load/store.

Furthermore, this access sequence is only performed for TLS variables
that are less than ~32KB in size.

Diff Detail

Event Timeline

amyk created this revision.Jul 18 2023, 7:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 18 2023, 7:51 AM
amyk requested review of this revision.Jul 18 2023, 7:51 AM
amyk updated this revision to Diff 541791.Jul 18 2023, 5:05 PM

Update check to be slightly less than 32KB to account for additional overhead.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1566–1567

Fix sentence to make statement based on target and not host.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3394

What does AIXTLSUpperDisplacement represent? It is already given a value of 32K - 8. Should the hard coded subtraction be here, should the value of AIXTLSUpperDisplacement be further adjusted instead, or is there unintentional double adjustment happening?

For reference, I encountered no issues linking the result of assembly that performs la (using the subject access pattern) for the past-the-end address of a 32K - 1 local-exec TLS variable.

3410

Typo fix: "on supported" -> "only supported"

Instead, an addi/la with a displacement followed by a load/store
directly off of r13 (the thread pointer) is produced.

@amyk, the patch description seems overly focused on the load/store cases (as opposed to the "pure" address calculation cases).
Also, the sentence quoted above seems to imply that r13 is used as an operand of the load/store and not the addi/la (which is probably wrong in the cases where the addi/la is kept).

I also didn't see (in my skim through the tests) any case where the address is simply calculated and not used for a further access.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657

This change looks suspicious.

I am getting:

Global must be word-aligned for LD, STD, LWA!
UNREACHABLE executed at /terrannew/hstong/.Lpcoral03/llvmbld/dev/llvm-project/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1523!

With:

clang --target=powerpc64-ibm-aix -maix-small-local-exec-tls -O -c -o /dev/null -xc - \
<<<$'__attribute__((__tls_model__("local-exec"))) __thread struct { int x __attribute__((__packed__)); } a;\nint f() { return a.x; }'
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657

The comment is out-of-date given the number of RequiresMod4Offset checks in the function. It seems the comment is bolstering the legitimacy of outdated code. The check on dyn_cast<ConstantSDNode>(ImmOpnd) further below seems to be a better version of the code here. Perhaps the code here should just be removed?

7725

Note: Looks like better version of the code is here.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1572

Typo?

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
154

Should be constexpr.

3384

The last sentence is redundant with statements made on the 32-bit path.

3392

ICE/assert:

clang: /terrannew/hstong/.Lpcoral03/llvmbld/dev/llvm-project/llvm/include/llvm/IR/DataLayout.h:669: TypeSize llvm::DataLayout::getTypeSizeInBits(Type *) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.

with

clang --target=powerpc64-ibm-aix -c -o /dev/null -xc++ -<<<$'extern __thread struct X x [[gnu::tls_model("local-exec")]];\nvoid *f() { return &x; }'

A similar case with an array of unknown bound does not assert; however, it does generate the small TLS access pattern despite the size being unknown (and potentially huge).

3406–3407

The comment should be clear that there is a 32-bit version that is merely omitted by the implementation here.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
40

The symbol would be a label (no storage class attached in the assembly syntax) if it is an alias or if data sections is disabled. What is the comment trying to say about the form of the symbol?

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1558

Nit: Indentation.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1524–1526

Not a single test failed when I removed this. Perhaps this does not belong in this patch (but a future one)?

1538–1555

Same comment re: the cases other than ADDI8.

1562

For the purposes of this patch, maybe assert up front that we are handling the ADDI8 case.

Sorry, I forgot to submit these comments when I reviewed this.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1559

I'm not sure I follow what the purpose of this function is. It would seem that it checks whether we are lowering the AIX Small TLS instruction and change any PPCIADDI8 to PPC::LA8. However, if neither of those are true, it doesn't do anything.

The issue I have with it is that if this returns false, we break out of the switch and simply call

LowerPPCMachineInstrToMCInst(MI, TmpInst, *this);
EmitToStreamer(*OutStreamer, TmpInst);

which would have happened if we never called this function anyway. So it would seem that we don't need the additions from lines 1541 to 1564. We can simply add PPC::ADDI8 into the switch above, and change it to PPC::LA8 if both of the following conditions are met:
Subtarget->hasAIXSmallLocalExecTLS() && MO.getTargetFlags() & PPCII::MO_TPREL_FLAG

Or maybe I'm missing something.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657

Right, if Base.getOperand(1) is not a constant, we need to check the alignment of the GlobalAddress.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3394

I agree. We shouldn't be doing any arithmetic here on AIXTLSUpperDisplacement. We should set it to what we want it and only perform comparisons.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
46

I suppose it doesn't matter for precedence of & over &&, but I think it is still easier to read this condition if you parenthesize this part as (MO.getTargetFlags() & PPCII::MO_TPREL_FLAG).

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
36

This should be moved down past the early return.

47

This can be TM.getSymbol like the toc-data case. No need for unnecessary differences between the two.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
32–48

I replaced the entire function to unconditionally use AP.getSymbol(GV) for MO_GlobalAddress cases. All LIT and test-suite cases pass in 3-stage bootstrap builds on ppc64le Linux.

This should make sense: Going through the previously-default path to get a symbol using just the name is a loss of information (which was found to be problematic for toc-data and small TLS). I see no reason to keep the lossy path.

I strongly suggest going with a generic common path (which becomes well-tested) versus having special-case paths.

Since the change is not actually NFC and this patch seems to be the thing that we know can test it, I guess it goes in with this patch.

Note: I removed the part that does things with the offset. I am not sure if belongs in this function. The name of the function, GetSymbolFromOperand suggests that perhaps only the symbol itself should be considered. Also, the result of this function gets passed to GetSymbolRef, which already expects to get the offset from the MachineOperand.

shchenz added a subscriber: shchenz.Aug 2 2023, 1:04 AM

Is there any reason that we limit fast tls access to only local-exec mode? Can initial exec mode use the fast access? We are trying to add initial exec mode support in https://reviews.llvm.org/D156292

according to https://www.ibm.com/docs/en/aix/7.2?topic=program-using-thread-local-storage

both 32-bit and 64-bit support faster access sequence for the local-exec TLS model. Do you want to modify the summary Since the patch only support 64-bit.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3394

according to https://www.ibm.com/docs/en/aix/7.2?topic=program-using-thread-local-storage

The local-dynamic and local-exec access methods have a faster code sequence that can be used if the total size of thread-local variables is smaller than 62 KB (I think it is typo here). If the total size of the region is too large, the link-editor will patch the code by generating extra instructions, negating the benefit of using the faster code sequence.

we need to calculate sum of the total size of thread-local variables somewhere and check whether the value is less than 32K.
does the AIXTLSUpperDisplacement want to express sum of the total size of thread-local variables ? if it the calculation is not correct.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3394

There is no intent to calculate the "total size of thread-local variables" here (there is no real point in doing so at a translation-unit level anyway). The documentation you quote is fairly loose about its use of "can". It is also incorrect in stating that the benefits of the faster code sequence are necessarily negated: an additional benefit of the faster code sequence is less use of the TOC.

It is entirely possible to have a variable > 32K allocated into the area accessible with the small TLS access sequence; however, not all bytes of such a variable can be accessed directly. This does not mean that we can't generate small TLS access sequences for such a variable, but the IBM XL compiler went with the limit (and it is less adventurous for us to follow suit on this).

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
96–97

In light of https://reviews.llvm.org/D156292, I believe this should have an assertion added about the TLS model (see llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp change in the aforementioned patch).

amyk edited the summary of this revision. (Show Details)Aug 5 2023, 9:28 AM
nemanjai requested changes to this revision.Aug 6 2023, 2:15 AM

I don't think the last set of comments were addressed yet. Requesting changes to remove it from my queue.

This revision now requires changes to proceed.Aug 6 2023, 2:15 AM
amyk updated this revision to Diff 548061.Aug 7 2023, 10:05 PM
amyk marked 20 inline comments as done.
  • Update comments
  • Add assert for TLS model in PPCMCInstLower.cpp
  • Update GetSymbolFromOperand()
  • Only produce the faster local-exec sequences when the global variable is sized and non-empty (to catch cases like an extern struct or unbounded array)
  • Clean up PPCAsmPrinter.cpp
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1524–1526

This is no longer needed, as Nemanja previously pointed out.

1538–1555

I'm going to remove the other cases since they are not needed.

1559

Yes, you're right. This appears not to be needed. I originally had this because the implementation was more complex than expected, but now it is more simplified so I don't need this.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3392

Thanks for catching this. I've made it so that extra cases like these just simply use the TOC-based local-exec method.

3394

I have set AIXTLSUpperDisplacement to be the value that we want.
It is meant to represent that maximum size of the TLS variable, in which we can produce the small local-exec sequence for.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
32–48

Thanks for the suggestion and for discussing your questions, and concerns with me offline regarding this change.

96–97

I have updated it to add an assert for the local-exec model. Is this what you meant?

Partial re-review comments.

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
1539–1541

The actually important thing is the target flag and the instruction opcode.

1543–1544

Restrict the check to just the target flag (moving the other "condition" to be just an assertion).

amyk updated this revision to Diff 549623.Aug 12 2023, 9:24 AM
amyk marked 2 inline comments as done.
  • Rebase patch.
  • Update comment and condition, as well as adding an assertion in PPCAsmPrinter.cpp.
hubert.reinterpretcast edited the summary of this revision. (Show Details)Aug 12 2023, 6:32 PM

Additional partial re-review comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
151–153

We chose the maximum based on the behaviour of the reference implementation (not from first principles).

154

"Displacement" implies "offset", not size. Maybe AIXSmallTlsPolicySizeLimit?

3382–3394

Since the limit is not derived from first principles, phrase as a policy choice. Additionally, start off by mentioning the option (so the comments make sense in the context of the earlier comment about the TOC-based path).

3395–3399

Check HasAIXSmallLocalExecTLS first before setting up for more checks.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
96–97

Yes.

amyk updated this revision to Diff 549708.Aug 13 2023, 9:07 AM
amyk marked 4 inline comments as done.

Update PPCISelLowering.cpp to:

  • Reword comments
  • Change AIXTLSUpperDisplacement to AIXSmallTlsPolicySizeLimit
  • Move additional checks on the global variable only if the non-TOC based local-exec access method is enabled
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657

If we are to keep the condition, this needs a comment for why HasAIXSmallLocalExecTLS is a sufficient guard to limit the behaviour to just the new AIX "small TLS" accesses. The logic would be that the option is only valid for AIX and, on AIX, a ADDI/ADDI8 node with a global address for the immediate operand would necessarily be for "small TLS" (although possibly for the local-dynamic TLS model with future developments).

However: The optimization (with the checks) is generally sound, so maybe there should be no such check at all (except to limit the scope of the current change because some tests need updating). Yes, there can be downstream code that may have issues with generating the global address as an immediate operand for the load/store instruction, but changing the downstream code seems to be the solution for those cases.

If we keep the check, there should be a TODO comment to remove the artificial limitation.

7656–7657

Aside from the checks that limit the global address (as opposed to constant) immediate operand handling to "small TLS", this code does the same thing as the code for the toc-data cases (except that which operand is the immediate operand is different between the two cases).

The code can be made common (using ImmOpnd) in the non-ReplaceFlags path at line 7717.

Note: isa followed by cast/dyn_cast is an anti-pattern.

7656–7657

The comment is already wrong because the non-ReplaceFlags path can handle non-zero load/store offsets in combination with an original constant immediate operand that is not divisible by 4. Additionally, the other cases now include the toc-data cases (which does not necessarily provide aligned addresses).

I would rather have no comment than a comment that is actively wrong. Further below, I recommend a FIXME comment.

7664

Please add a FIXME comment that this does not account for cases where the offset from the load/store causes the combined immediate operand to be sufficiently aligned of the instruction encoding.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3392

Sorry for the extra churn: with the new name, we should use <= here (and adjust the value to 32751).

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
97

asserts print the spelling of the expression, so the \n is not needed.

amyk marked 6 inline comments as done.Aug 17 2023, 10:14 PM

@hubert.reinterpretcast I was going to update the patch, and then I thought about your comments a bit more.
Just wanted to double check on some suggestions before I misunderstand anything and update the patch if that's OK.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657

This has been accounted for in PPCISelLowering.cpp.

7656–7657
  • I think there are possibilities that HasAIXSmallLocalExecTLS may not be necessary here. Just wanted to check, when you say "artificial limitation", do you mean that HasAIXSmallLocalExecTLS check?
  • Are you suggesting that this else if should be removed as we have ImmOpnd on 7677, and that it goes into the the code in 7717? Or am I misunderstanding? Just figured I'd double check because as you mentioned, this is checking the operand of the ADDI, whereas I believe the below accounts for the cases where we're dealing with the offset from the load/store and combined immediate operand.
  • Are you also suggesting that the initial condition should not be if (isa<GlobalAddressSDNode>(BaseOp1) but rather a dyn_cast of a GlobalAddressSDNode?
llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7656–7657
  • I think there are possibilities that HasAIXSmallLocalExecTLS may not be necessary here. Just wanted to check, when you say "artificial limitation", do you mean that HasAIXSmallLocalExecTLS check?

Yes. Removing it does enable the optimization in more cases (e.g., TLS on Linux) though. It may make sense to pre-commit this change first.

  • Are you suggesting that this else if should be removed as we have ImmOpnd on 7677, and that it goes into the the code in 7717? Or am I misunderstanding? Just figured I'd double check because as you mentioned, this is checking the operand of the ADDI, whereas I believe the below accounts for the cases where we're dealing with the offset from the load/store and combined immediate operand.

No. This comment is about the non-else if part: That code can be removed (some adjustment would be needed to the else if part here, but there are multiple choices for how much to do) and the substantially similar code near 7649 can be moved into the code at 7717 (with usage of ImmOpnd). That way, we have one less copy of the logic.

  • Are you also suggesting that the initial condition should not be if (isa<GlobalAddressSDNode>(BaseOp1) but rather a dyn_cast of a GlobalAddressSDNode?

Yes, and that is what the code at 7649 does.

7664

For the else if part here, you may want to remove it in favour of the logic at line 7723.

amyk updated this revision to Diff 554179.Aug 28 2023, 10:16 PM
amyk marked an inline comment as done.

Update PPCISelDAGToDAG.cpp as per review comments.

amyk added inline comments.Aug 28 2023, 10:18 PM
llvm/test/CodeGen/PowerPC/ppc64-nonfunc-calls.ll
40

@nemanjai I think remember before that you mentioned that perhaps it is not necessary to check if the Subtarget.hasAIXSmallLocalExecTLS(); flag in PPCISelDAGToDAG.cpp - and that checking if the global variable is aligned should be sufficient.

Through the recent comments on this patch, I think it makes sense that we don’t need to check for this flag. This would mean the change that I am doing apply to Linux, as well. Furthermore, I also updated PPCISelDAGToDAG.cpp to remove a portion that seemed like it had an equivalent portion later on in the file.

After doing some functional testing, I didn’t see any issues, with the exception of one Linux test case affected by the change I am making. This test case was introduced awhile back in 87deb0b8e3f14855e6c4652993be86684c2dd429. I believe in here we’re bitcasting a TLS global and we’re treating this as a call through a function pointer. The node we’re dealing with in the test case is a GlobalAddressSDNode and it is aligned/passes my alignment check, so it ends up getting folded.

Before my patch, the TLS relocation is on the addis/addi as expected, and the subsequent loads load off of r3:

# %bb.0:                                # %entry
	mflr 0
	stdu 1, -112(1)
	addis 3, 13, tls_something@tprel@ha
	std 0, 128(1)
	addi 3, 3, tls_something@tprel@l
	std 2, 40(1)
	ld 4, 0(3)
	ld 11, 16(3)
	ld 2, 8(3)
	mtctr 4
	bctrl
	ld 2, 40(1)
	addi 1, 1, 112
	ld 0, 16(1)
	mtlr 0
	blr

And after my patch, the ld with the zero offset gets affected, and none of the others do (which makes sense because we have non-zero offsets).

# %bb.0:                                # %entry
	mflr 0
	stdu 1, -112(1)
	addis 3, 13, tls_something@tprel@ha
	std 0, 128(1)
	addi 4, 3, tls_something@tprel@l
	ld 3, tls_something@tprel@l(3)
	std 2, 40(1)
	ld 11, 16(4)
	ld 2, 8(4)
	mtctr 3
	bctrl
	ld 2, 40(1)
	addi 1, 1, 112
	ld 0, 16(1)
	mtlr 0
	blr

In any case, I figured I’d ask if you have any thoughts/opinions on this since you requested changes and also had the original thought of not checking for hasAIXSmallLocalExecTLS() in the beginning.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7708–7709

Suggested wording change.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3392

This comment has not been addressed.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
97

This comment has not been addressed.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
100

Suggested wording change.

llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
7664

Please add a FIXME comment that this does not account for cases where the offset from the load/store causes the combined immediate operand to be sufficiently aligned of the instruction encoding.

My comment above is incorrect. The offset from the load/store was already aligned for the instruction encoding, so the combination will only remain aligned if the offset from the addi was also a multiple of 4.

Anyhow, the check here was redundant anyway (and has been removed). The change just happens to have no functional effect.

amyk updated this revision to Diff 555411.Sep 1 2023, 9:20 AM
amyk marked 11 inline comments as done.

Reword comments, and address Hubert's previous comments that were previously missed.

amyk added inline comments.Sep 1 2023, 9:23 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3392

Ah, I am very sorry. I thought I had addressed it but it turns out that I did not.
I have addressed it now.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
97

Sorry, addressed now.

100

@hubert.reinterpretcast After rebasing the patch, initial-exec also uses the same flag and code path. Thus, I don't think the assert() is appropriate here anymore.

I've updated this to just a check for local-exec and just setting the RefKind in this condition, which I think should be OK since it won't get set in the initial-exec case.

diff --git a/llvm/lib/Target/PowerPC/PPCMCInstLower.cpp b/llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
index 92a1311e57a8..e2bb28b397c0 100644
--- a/llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
+++ b/llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
@@ -96,9 +96,8 @@ static MCOperand GetSymbolRef(const MachineOperand &MO, const MCSymbol *Symbol,
   else if (MO.getTargetFlags() == PPCII::MO_TPREL_FLAG) {
     assert(MO.isGlobal() && "Only expecting a global MachineOperand here!");
     TLSModel::Model Model = TM.getTLSModel(MO.getGlobal());
-    assert(Model == TLSModel::LocalExec &&
-           "Only local-exec accesses are handled!");
-    RefKind = MCSymbolRefExpr::VK_PPC_AIX_TLSLE;
+    if (Model == TLSModel::LocalExec)
+      RefKind = MCSymbolRefExpr::VK_PPC_AIX_TLSLE;
   }

   const MachineInstr *MI = MO.getParent();

I was wondering if you had any thoughts or objections to this.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3392

No problem; thanks for addressing.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
100

Considering that this change to set the RefKind was not needed for the base local-exec code model enablement, there is no reason to expect that initial-exec code actually gets here (and if it does, we should want to inspect the result of what this function does).

In other words, this code only became necessary for local-exec when we started applying relocations on the instruction operand, and since initial-exec TLS relocations on such operands are not allowed, the assert should stay. Actually, the old wording is correct with this rationale (so let's go with the old wording).

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
3389

Condition needs update with rebase.

amyk updated this revision to Diff 555600.Sep 2 2023, 1:55 PM
amyk marked an inline comment as done.

Address comment of adding an additional TLS local-exec model check into PPCISelLowering.cpp.
I have not yet addressed the comment regarding the assert in PPCMCInstLower.cpp, as this is still being discussed.

LGTM with request to add a comment.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
100

It seems we run this code for TOC reference cases before we replace with a reference to the TOC entry. That may not be ideal, but it works.

We can leave the code as-is (with the if check). A comment would help:
For the local-exec TLS model, we may generate the offset from the TLS base as an immediate operand (instead of using a TOC entry); set the relocation type in case the result is used for purposes other than a TOC reference. In TOC reference cases, this result is discarded.

amyk updated this revision to Diff 555892.Sep 5 2023, 10:24 AM
amyk marked 2 inline comments as done.

Address Hubert's comment of adding an comment to PPCMCInstLower.cpp.

@nemanjai I see you've requested changes from last time. If you're able to take a look again, that would be greatly appreciated.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
100

Thanks for discussing with me offline regarding this. I've added the comment.

Thanks @amyk for adding the comment. Confirming LGTM.

amyk updated this revision to Diff 556058.Sep 6 2023, 11:48 AM

Patch rebased.

nemanjai accepted this revision.Sep 6 2023, 10:55 PM

LGTM. The patch is easy to follow now. Thank you for your patience.

llvm/lib/Target/PowerPC/PPCMCInstLower.cpp
34

Please add a comment here as to why this method is preferred. Perhaps something like:

// Get the symbol from the global, accounting for XCOFF-specific
// intricacies (see TargetLoweringObjectFileXCOFF::getTargetSymbol).
llvm/test/CodeGen/PowerPC/ppc64-nonfunc-calls.ll
40

This seems fine to me.

This revision is now accepted and ready to land.Sep 6 2023, 10:55 PM