In the following example, the load can either read the value 100 or the value 500. Therefore, it is straightforward that we can replace the load with a phi node as it is shown below:
Example 1:
Before Load Coercion:
BB1: BB2: store i32 100, i32* %P store i32 500, i32* %P \ / BB3: %V = load i32, i32* %P
After Load Coercion:
BB1: BB2: store i32 100, i32* %P store i32 500, i32* %P \ / BB3: %phi = phi i32 [ 100, %BB1], [ 500, %BB2 ]
In the following example, the load has only one dependency in BB1. To eliminate the load we add an artificial dependency in BB2 (%V1) and we replace the load of BB3 with a phi node. In the worst case scenario, the load of BB2 (%V1) will be executed as many times as the load in BB3.
Example 2:
Before Load Coercion:
BB1: BB2: store i32 100, i32* %P | \ / BB3: %V = load i32, i32* %P
After Load Coercion:
BB1: BB2: store i32 100, i32* %P %V1 = load i32, i32* %P \ / BB3: %phi = phi i32 [ 100, %BB1], [ %V1, %BB2 ]
The algorithm has two main steps:
Collection of loads: In Example 3, the load of BB3 is dependent on the the memory phi of BB3. In this case, the operands of the memory phi are the also the operands of the phi that will replace the load of BB3.
Example 3:
BB1: BB2: 1 = MemoryDef(liveOnEntry) 2 = MemoryDef(liveOnEntry) store i32 100, i32* %P store i32 500, i32* %P \ / BB3: 3 = MemoryPhi({T,1},{F,2}) %V = load i32, i32* %P
Next, we check if there are any live-in loads that can be replaced by a phi node. In the following example, we can add a fake dependency in BB2. In this way, we can replace the load of BB3 with a phi node.
Example 4:
BB1: BB2: 0 = MemoryDef(liveOnEntry) | %V1 = load <2xi32>, <2xi32>* %P | \ / BB3: %V = load i32, i32* %P
Code generation: As it is shown in Examples 1 and 2, there are two code generation scenarios: i. If the load has as many dependencies as its predecessors, then we just replace the load with a phi as it is shown in Example 1. ii. If one of the predecessors of the basic block of the load does not have a dependency, then we emit a fake dependency in this predecessor. Next, we can replace the load with a phi node as it is shown in Example 2.
I need to check why NewGVN does not eliminate the add.