This is an archive of the discontinued LLVM Phabricator instance.

[AIX] Handling the label alignment of a global variable with its multiple aliases.
ClosedPublic

Authored by Esme on Apr 28 2022, 8:58 PM.

Details

Summary

This patch handles the case where a variable has multiple aliases.
AIX's assembly directive .set is not usable for aliasing purpose, and using different labels allows AIX to emulate symbol aliases. If there is data emitted between any two labels, meaning they are not aligned, XCOFF will automatically calculate the offset for them.

This patch implements:

  1. Emits the label of the alias just before emitting the value of the sub-element that the alias referred to.
  1. A set of aliases that refers to the same offset should be aligned.
  1. We didn't emit aliasing labels for common and zero-initialized local symbols in PPCAIXAsmPrinter::emitGlobalVariableHelper, but emitted linkage for them in AsmPrinter::emitGlobalAlias, which caused a FAILURE. This patch fixes the bug by blocking emitting linkage for the alias without a label.

Diff Detail

Event Timeline

Esme created this revision.Apr 28 2022, 8:58 PM
Esme requested review of this revision.Apr 28 2022, 8:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2022, 8:58 PM
DiggerLin added inline comments.May 3 2022, 12:59 PM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
2468

need a test case for the code ?

2470

only one line code, do not need { and }

Esme updated this revision to Diff 427278.May 5 2022, 5:00 AM
Esme edited the summary of this revision. (Show Details)

Addressed @DiggerLin's comments, THX.

Esme added a comment.May 18 2022, 6:54 PM

Gentle ping.

shchenz added inline comments.May 22 2022, 7:39 PM
llvm/test/CodeGen/PowerPC/aix-alias.ll
93

These two .vbyte seems regression to me. Now we request more space in .data section than needed.

101

Same like above. We don't need this .vbyte?

Esme updated this revision to Diff 431856.May 24 2022, 6:52 PM

Addresses comments.

shchenz added inline comments.May 31 2022, 12:50 AM
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
2397

Maybe we need an assertion if we don't support other cases. Returning 0 quietly will cause mis-compile?

llvm/test/CodeGen/PowerPC/aix-alias-alignment.ll
40

This is not right as we allocate other space for var2/var3 which should be aliased to _MergedGlobals. _MergedGlobals has 8 bytes allocated in previous two .vbyte.
You can check the symbol table after assembling the assembly, below is what I get on AIX:

[7]	m   0x00000034     .data     1  extern                    foo
[8]	a4  0x0000000c       0    0     SD       DS    0    0
[9]	m   0x00000040     .data     1  unamex                    L.._MergedGlobals
[10]	a4  0x00000010       0    0     SD       RW    0    0
[11]	m   0x00000048     .data     1  unamex                    var1
[12]	a4  0x00000009       0    0     LD       RW    0    0
[13]	m   0x0000004c     .data     1  unamex                    var2
[14]	a4  0x00000009       0    0     LD       RW    0    0
[15]	m   0x0000004c     .data     1  extern                    var3
[16]	a4  0x00000009       0    0     LD       RW    0    0
[17]	m   0x00000050     .data     1  unamex                    L..base
[18]	a4  0x0000000c       0    0     SD       RW    0    0
[19]	m   0x00000058     .data     1  unamex                    var4
[20]	a4  0x00000011       0    0     LD       RW    0    0
[21]	m   0x00000058     .data     1  extern                    var5
[22]	a4  0x00000011       0    0     LD       RW    0    0
[23]	m   0x0000005c     .data     1  unamex                    TOC
[24]	a4  0x00000000       0    0     SD      TC0    0    0
[25]	m   0x0000005c     .data     1  unamex                    L.._MergedGlobals
[26]	a4  0x00000004       0    0     SD       TC    0    0
[27]	m   0x00000060     .data     1  unamex                    L..base
[28]	a4  0x00000004       0    0     SD       TC    0    0

You can see that section _MergedGlobals now is 16 bytes but it should be 8 bytes and var1 are not aligned with _MergedGlobals too.

I think for data-section enabled, the expected assembly generated from compiler should be like:

assembly
	.csect L.._MergedGlobals[RW],2
	.align	2                               # @_MergedGlobals
var1:
	.vbyte	4, 1                            # 0x1
	.lglobl	var1
var2:
var3:
	.vbyte	4, 2                            # 0x2
	.lglobl	var2
	.globl	var3
	.csect L..base[RW],2
	.align	2                               # @base
	.vbyte	4, 1                            # 0x1
var4:
var5:
	.vbyte	4, 2                            # 0x2
	.lglobl	var4
	.globl	var5

With above assembly, I get:

[9]	m   0x00000040     .data     1  unamex                    L.._MergedGlobals
[10]	a4  0x00000008       0    0     SD       RW    0    0
[11]	m   0x00000040     .data     1  unamex                    var1
[12]	a4  0x00000009       0    0     LD       RW    0    0
[13]	m   0x00000044     .data     1  unamex                    var2
[14]	a4  0x00000009       0    0     LD       RW    0    0
[15]	m   0x00000044     .data     1  extern                    var3
[16]	a4  0x00000009       0    0     LD       RW    0    0
[17]	m   0x00000048     .data     1  unamex                    L..base
[18]	a4  0x00000008       0    0     SD       RW    0    0
[19]	m   0x0000004c     .data     1  unamex                    var4
[20]	a4  0x00000011       0    0     LD       RW    0    0
[21]	m   0x0000004c     .data     1  extern                    var5
[22]	a4  0x00000011       0    0     LD       RW    0    0
[23]	m   0x00000050     .data     1  unamex                    TOC
[24]	a4  0x00000000       0    0     SD      TC0    0    0
[25]	m   0x00000050     .data     1  unamex                    L.._MergedGlobals
[26]	a4  0x00000004       0    0     SD       TC    0    0
[27]	m   0x00000054     .data     1  unamex                    L..base
[28]	a4  0x00000004       0    0     SD       TC    0    0
54

For data-section disabled case, the current patch also has .vbyte issue, however, even I changed the assembly like above, I still can not get right symbol table. I changed the assembly like:

        .csect .data[RW],2
        .align  2                               # @_MergedGlobals
L.._MergedGlobals:
var1:
        .vbyte  4, 1                            # 0x1
        .lglobl var1
var2:
var3:
        .vbyte  4, 2                            # 0x2
        .lglobl var2
        .globl  var3
        .align  2                               # @base
L..base:
        .vbyte  4, 1                            # 0x1
var4:
var5:
        .vbyte  4, 2                            # 0x2
        .lglobl var4
        .globl  var5

I get:

[9]	m   0x00000040     .data     1  unamex                    .data
[10]	a4  0x00000010       0    0     SD       RW    0    0
[11]	m   0x00000040     .data     1  unamex                    var1
[12]	a4  0x00000009       0    0     LD       RW    0    0
[13]	m   0x00000044     .data     1  unamex                    var2
[14]	a4  0x00000009       0    0     LD       RW    0    0
[15]	m   0x00000044     .data     1  extern                    var3
[16]	a4  0x00000009       0    0     LD       RW    0    0
[17]	m   0x0000004c     .data     1  unamex                    var4
[18]	a4  0x00000009       0    0     LD       RW    0    0
[19]	m   0x0000004c     .data     1  extern                    var5
[20]	a4  0x00000009       0    0     LD       RW    0    0
[21]	m   0x00000050     .data     1  unamex                    TOC
[22]	a4  0x00000000       0    0     SD      TC0    0    0
[23]	m   0x00000050     .data     1  unamex                    L.._MergedGlobals
[24]	a4  0x00000004       0    0     SD       TC    0    0
[25]	m   0x00000054     .data     1  unamex                    L..base
[26]	a4  0x00000004       0    0     SD       TC    0    0

You can see that, no way to change the content in .data section for var1/var2/var3, we only have TOC entry for L.._MergedGlobals but not associated to any SD/LD symbol in the .data section. If we want to support them, we may need to add TOC entry for var1/var2/var3/var4/var5 or find a way to make assembler generate two LD symbols to same location.

I suggest in this patch, we only handle data-sections enabled case.

jsji resigned from this revision.Jun 2 2022, 7:56 AM
Esme added inline comments.Jun 15 2022, 9:35 AM
llvm/test/CodeGen/PowerPC/aix-alias-alignment.ll
40

Thank you for your comments!
I doubt that AIX's assembly format can support such strict alignment requirements.
As you suggest, element values in a base AliaseeObject are not emitted under its label but its aliases' labels, which would solve the problem in my test case, but consider the following case, where should the third element value (.vbyte 4, 3) be emitted?

;; base has three elements and only the second one has alias.
@base = global <{ i32, i32, i32 }> <{ i32 1, i32 2, i32 3 }>, align 4
@var1 = internal alias i32, getelementptr inbounds (<{ i32, i32 }>, <{ i32, i32 }>* @base, i32 0, i32 1)
@var2 = alias i32, i32* @var1

Consider a more complex example:

base has 4 elements:  { i32 1, i32 2, i32 3, i32 4 }, align 4
var1 is the alias to {i32 1, i32 2}
var2 is the alias to {i32 2, i32 3}

I have no idea what the assembly of this case should be in order to meet the correct values and alignment.

54

The missing linkage info caused the issue, ie. the directive .globl is not emitted for private global variables. I'm not sure whether the behavior is reasonable. However, it's out of this patch's scope and I'll investigate it as follow-up work.
Using @_MergedGlobals = global <{ i32, i32 }> <{ i32 1, i32 2 }>, align 4 emits the directive .globl _MergedGlobals, which fixed the problem.

Esme updated this revision to Diff 439176.Jun 22 2022, 5:21 PM
Esme edited the summary of this revision. (Show Details)

Thank you for your comments. @shchenz
I had a misunderstanding about the behavior of label alignment on AIX.
Please ignore my previous replies. I've addressed your comments in this update.

  1. Emit the label of the alias just before emitting the value of the sub-element that the alias referred to.
  2. Emit the linkage of alias together with the global variable‘s, that is, before labels and values, although there is no impact on the functionality I think this is more appropriate.
  3. Remove the test case that is unnecessary for the current patch.
  4. Check the symbol table after assembling the assembly to verify no space allocation regression.

Thanks. The assembly for PowerPC part looks right now. I have some more comments about the array and vector types.

llvm/include/llvm/CodeGen/AsmPrinter.h
478–482

Nit: do we need to doc this comment? If so, you may need /// instead of //

llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
2369

Nit: we may need another fatal_error if the second operand is not a constant integer expression.

shchenz added inline comments.Jun 28 2022, 2:57 AM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3287

No idea why my comments here is lost in my last post. I will re-post it.

I tested some array and vector cases, I think current patch may not be able to handle them well on AIX. Could you please verify? As you are also handling array here, so do we plan to fix them all? Or we just want to fix ConstantStruct type?

case 1: vector type, ConstantDataSequential

@_MergedGlobals = global <2 x i64> <i64 12, i64 34>, align 4
@var1 = alias i64, getelementptr inbounds (<2 x i64>, <2 x i64>* @_MergedGlobals, i32 0, i32 1)

define void @foo(i64 %a1) {
  store i64 %a1, i64* getelementptr inbounds (<2 x i64>, <2 x i64>* @_MergedGlobals, i32 0, i32 1), align 4
  ret void
}

case 2: array type, ConstantDataSequential:

@_MergedGlobals = global [2 x i64] [i64 12, i64 34], align 4
@var1 = alias i64, getelementptr inbounds ([2 x i64], [2 x i64]* @_MergedGlobals, i32 0, i32 1)

define void @foo(i64 %a1) {
  store i64 %a1, i64* getelementptr inbounds ([2 x i64], [2 x i64]* @_MergedGlobals, i32 0, i32 1), align 4
  ret void
}

case 3: array type, ConstantArray:

%struct.B = type { i64 }
@_MergedGlobals = global [2 x %struct.B] [%struct.B {i64 12}, %struct.B {i64 34}], align 4
@var1 = alias %struct.B, getelementptr inbounds ([2 x %struct.B], [2 x %struct.B]* @_MergedGlobals, i32 0, i32 0)

define void @foo(%struct.B %a1) {
  store %struct.B %a1, %struct.B* getelementptr inbounds ([2 x %struct.B], [2 x %struct.B]* @_MergedGlobals, i32 0, i32 1), align 4
  ret void
}
Esme updated this revision to Diff 441254.Jun 29 2022, 9:28 PM

Addressed comments.

Esme added inline comments.Jun 29 2022, 9:30 PM
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
3287

Thanks for pointing this out!
I have added support for these types in the new revision.

shchenz accepted this revision as: shchenz.Jun 30 2022, 2:05 AM

LGTM. Thanks for patience and this good improvement.

This revision is now accepted and ready to land.Jun 30 2022, 2:05 AM
This revision was landed with ongoing or failed builds.Jul 3 2022, 8:18 PM
This revision was automatically updated to reflect the committed changes.

After reading this patch, I'm confused about how it's supposed to work in general. In particular, what happens if we never call emitGlobalAliasInline() with the correct offset? Do we just throw away the symbol?

After reading this patch, I'm confused about how it's supposed to work in general. In particular, what happens if we never call emitGlobalAliasInline() with the correct offset? Do we just throw away the symbol?

If there is a alias symbol which has an offset can not be recognized by this patch, I think a fatal error will be emitted. See PPCAIXAsmPrinter::getAliasOffset(). @Esme could you confirm that?

Esme added a comment.Jul 6 2022, 11:59 PM

After reading this patch, I'm confused about how it's supposed to work in general. In particular, what happens if we never call emitGlobalAliasInline() with the correct offset? Do we just throw away the symbol?

If there is a alias symbol which has an offset can not be recognized by this patch, I think a fatal error will be emitted. See PPCAIXAsmPrinter::getAliasOffset(). @Esme could you confirm that?

Yes, @shchenz is right. In this patch, we emit the alias label before the sub-element value only if they have the same offset. In general, if the alias's offset doesn't match to any sub-element's offset, we will not emit the label for that alias symbol since we don't know where to emit it. So we must guarantee the offset is valid when we get it in PPCAIXAsmPrinter::getAliasOffset().

I don't think that addresses my concern. I'm specifically concerned about cases like the following:

@ConstVector1 = global <2 x i64> zeroinitializer
@var1 = alias i64, getelementptr inbounds (<2 x i64>, ptr @ConstVector1, i32 0, i32 1)

@ConstVector2 = global <2 x i64> <i64 1, i64 2>
@var2 = alias i64, getelementptr inbounds (i8, ptr @ConstVector2, i32 1)

Currently, we silently throw away the definitions of "var1" and "var2".

Esme added a comment.Jul 7 2022, 11:32 AM

I don't think that addresses my concern. I'm specifically concerned about cases like the following:

@ConstVector1 = global <2 x i64> zeroinitializer
@var1 = alias i64, getelementptr inbounds (<2 x i64>, ptr @ConstVector1, i32 0, i32 1)

@ConstVector2 = global <2 x i64> <i64 1, i64 2>
@var2 = alias i64, getelementptr inbounds (i8, ptr @ConstVector2, i32 1)

Currently, we silently throw away the definitions of "var1" and "var2".

Thanks, this is a really interesting case!
Targets like ppc64le-linux express such case with an expression like .set var2, ConstVector2+1.
However AIX doesn't support the .set directive.
I need to do some investigation into how the AIX assembly expresses the case where the alias offset doesn't refer to any sub-element.