Hi,
This is an attempt to fix PR25873.
When splitting the store of a load, the pre splitting algorithm should check if the store uses a slice of the current alloca. If so, then the algorithm should update the slice information to preserve the partition structure. With this patch, the slice information related to the original store is used to create new slices for the split stores. The original store is eventually marked as dead and its associated slice is "killed".
If we don't update the alloca slices, then we risk to wrongly rewrite the partition of the alloca. I have posted a detailed description of the problem found in bug 25873. The reproducible from PR25873 has been added as a new test case for the existing test SROA/basictest.ll.
Example:
%tmpData = alloca %struct.STest; ... [0,8): %3 = load i64, i64* %1, align 8 [8,16): store i64 %5, i64* %2, align 8
Both load and store use a slice of the %tmpData alloca (slice information is between brackets). In this example, the load is a candidate for pre splitting, and the store is the only user of the load.
Before this patch, the pre splitting algorithm would have rewritten the load/store pair as follows:
[0, 4): %3 = load i32, i32* %.sroa_cast, align 8 [4, 8): %4 = load i32, i32* %.sroa_cast1, align 4 [0, 8): %3 = load i64, i64* %1, align 8 (dead) : store i32 %3, i32* %.sroa_cast3, align 8 : store i32 %4, i32* %.sroa_cast5, align 4 [8,16): store i64 %5, i64* %2, align 8 (dead)
The problem is that no slice information is built for the two new split stores. When the alloca is re-analyzed by SROA, two new problematic partitions are created: [8, 12) [12, 16). Those new partitions would overlap with the already consolidated partition [8, 16).
With this patch, we check if the store uses a slice of the alloca. If so, then we build new slices for the split stores and we mark the previous slice as dead. This is done to ensure that we choose the right partitioning of the alloca for the rewriting stage.
After this patch, the load and store are rewritten as:
[0, 4): %3 = load i32, i32* %.sroa_cast, align 8 [4, 8): %4 = load i32, i32* %.sroa_cast1, align 4 [0, 8): %3 = load i64, i64* %1, align 8 (dead) [8, 12): store i32 %3, i32* %.sroa_cast3, align 8 [12,16): store i32 %4, i32* %.sroa_cast5, align 4 : store i64 %5, i64* %2, align 8 (dead)
Please let me know what you think.
Thanks,
Andrea
This test is pretty expensive. I think you want to instead build a map from stores to slices above when we are scanning the alloca slices.