This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Fix the computation of the alignment of split stores.
ClosedPublic

Authored by courbet on Feb 10 2020, 5:33 AM.

Details

Summary

Right now the alignment of the lower half of a store is computed as
align/2, which fails for unaligned stores (align = 1), and is overly
pessimitic for, e.g. a 8 byte store aligned to 4 bytes.
Fixes PR44851

Event Timeline

courbet created this revision.Feb 10 2020, 5:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2020, 5:33 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
courbet updated this revision to Diff 243522.Feb 10 2020, 5:36 AM

remove dplicate test

gchatelet added inline comments.Feb 10 2020, 5:47 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
6872

Can you reuse Alignment here?

lebedev.ri added inline comments.Feb 10 2020, 5:50 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
6873

On a first read, this is very confusing.
When splitting store in half, naturally one half will retain the alignment
of the original wider store, regardless of whether it was over-aligned or not,
while other will require adjustment.

6875

Can the no-longer-pessimism of the computation be demonstrated by the tests?

courbet updated this revision to Diff 243524.Feb 10 2020, 5:57 AM

use opt test to explicitly show align

courbet updated this revision to Diff 243526.Feb 10 2020, 6:01 AM

Add a test with align=2 to show "no-longer-pessimism".

Harbormaster completed remote builds in B46078: Diff 243526.
courbet updated this revision to Diff 243529.Feb 10 2020, 6:02 AM

Improve documentation as suggested during the review.

courbet marked 3 inline comments as done.Feb 10 2020, 6:03 AM

Thanks

lebedev.ri added inline comments.Feb 10 2020, 6:14 AM
llvm/test/CodeGen/X86/split-store-unaligned.ll
31

Hmm.
Could you please add a similar test (but with wide store being 64-bit aligned) with big-endian triple please.

courbet updated this revision to Diff 243538.Feb 10 2020, 6:17 AM

rename test file

courbet updated this revision to Diff 243549.Feb 10 2020, 6:49 AM

add PowerPC test

lebedev.ri accepted this revision.Feb 10 2020, 7:10 AM

LG to me.

llvm/test/CodeGen/PowerPC/split-store-alignment.ll
2–3 ↗(On Diff #243549)

Please unique checklines

; RUN: opt -S -mtriple=powerpc64le -codegenprepare -force-split-store < %s  | FileCheck --check-prefixes=ALL,PPC64LE %s
; RUN: opt -S -mtriple=powerpc64 -codegenprepare -force-split-store < %s  | FileCheck --check-prefixes=ALL,PPC64 %s
101–102 ↗(On Diff #243549)

Okay, i guess there is no big-endian issue after all.

This revision is now accepted and ready to land.Feb 10 2020, 7:10 AM
courbet updated this revision to Diff 243565.Feb 10 2020, 8:00 AM

unique CHECK lines

spatel added inline comments.Feb 10 2020, 8:05 AM
llvm/test/CodeGen/PowerPC/split-store-alignment.ll
101–102 ↗(On Diff #243549)

Wait - why is there no difference between LE and BE here?
We may need to explicitly specify the datalayout on the command-line or in the test file.

spatel requested changes to this revision.Feb 10 2020, 8:12 AM

CGP is not creating an appropriate default datalayout based on triple, so we need to specify it explicitly:

$ opt -S splitstore.ll -mtriple=powerpc64 -data-layout="E" -force-split-store -codegenprepare
; ModuleID = 'splitstore.ll'
source_filename = "splitstore.ll"
target datalayout = "E"
target triple = "powerpc64"

define void @split_store_align8(i32 %x, i64* %p) {
  %z = zext i32 43 to i64
  %s = shl nuw nsw i64 %z, 32
  %z2 = zext i32 %x to i64
  %o = or i64 %s, %z2
  %1 = bitcast i64* %p to i32*
  %2 = getelementptr i32, i32* %1, i32 1
  store i32 %x, i32* %2, align 8
  %3 = bitcast i64* %p to i32*
  store i32 43, i32* %3, align 4
  ret void
}
$ opt -S splitstore.ll -mtriple=powerpc64le -data-layout="e" -force-split-store -codegenprepare
; ModuleID = 'splitstore.ll'
source_filename = "splitstore.ll"
target datalayout = "e"
target triple = "powerpc64le"

define void @split_store_align8(i32 %x, i64* %p) {
  %z = zext i32 43 to i64
  %s = shl nuw nsw i64 %z, 32
  %z2 = zext i32 %x to i64
  %o = or i64 %s, %z2
  %1 = bitcast i64* %p to i32*
  store i32 %x, i32* %1, align 8
  %2 = bitcast i64* %p to i32*
  %3 = getelementptr i32, i32* %2, i32 1
  store i32 43, i32* %3, align 4
  ret void
}
This revision now requires changes to proceed.Feb 10 2020, 8:12 AM
courbet updated this revision to Diff 243750.Feb 10 2020, 11:51 PM

Add target triple in module in tests, which shows the brokenness of alignment
for BE targets. Fit it.

CGP is not creating an appropriate default datalayout based on triple, so we need to specify it explicitly:

Thanks !

lebedev.ri accepted this revision.Feb 11 2020, 12:07 AM
lebedev.ri marked an inline comment as done.

LG for real now :)

llvm/test/CodeGen/PowerPC/split-store-alignment.ll
101–102 ↗(On Diff #243549)

Oops, thanks for catching that.
So there was an issue after all.

spatel accepted this revision.Feb 11 2020, 10:19 AM

LGTM - see inline for a comment nit and some potential test changes.

llvm/lib/CodeGen/CodeGenPrepare.cpp
6873

The suggested "one half" is better than the current "lower half" - eliminate the endian-specific inaccuracy.

llvm/test/CodeGen/PowerPC/split-store-alignment-le.ll
2 ↗(On Diff #243750)

Unless I'm missing some advantage of the way of the tests are arranged currently, I prefer that:

  1. The tests live under llvm/test/Transforms/CodeGenPrepare/<target>. I know there are exceptions, but that is where I first look for IR --> IR tests.
  2. If we include the layout specifiers as parameters in the RUN line, then it saves some duplication, and it's easier to spot the endian differences.
  3. If we can push the non-crashing tests as a preliminary commit, that would be nicer, so we can see the current buggy code (and the tests will remain just in case this patch ever gets reverted).
This revision is now accepted and ready to land.Feb 11 2020, 10:19 AM

We probably want to backport the big-endian fix into 10.0 in some form?
Please do file a bug

courbet updated this revision to Diff 244088.Feb 12 2020, 12:52 AM

rebase on submitted base tests.

Thanks for the review !

This revision was automatically updated to reflect the committed changes.
hans added a comment.Feb 12 2020, 2:31 AM

We probably want to backport the big-endian fix into 10.0 in some form?
Please do file a bug

Was there a bug filed for this yet?

We probably want to backport the big-endian fix into 10.0 in some form?
Please do file a bug

Was there a bug filed for this yet?

#44877

hans added a comment.Feb 12 2020, 4:42 AM

We probably want to backport the big-endian fix into 10.0 in some form?
Please do file a bug

Was there a bug filed for this yet?

#44877

Thanks! I'll follow up there.