Page MenuHomePhabricator

[Sparc] Custom bitcast between f64 and v2i32

Authored by dcederman on Jul 12 2018, 12:38 AM.



Currently bitcasting constants from f64 to v2i32 is done by storing the value to the stack and then loading it again. This is not necessary, but seems to happen because v2i32 is a valid type for Sparc V8. If it had not been legal, we would have gotten help from the type legalizer.

This patch tries to do the same work as the legalizer would have done by bitcasting the floating point constant and splitting the value up into a vector of two i32 values.

Diff Detail


Event Timeline

dcederman created this revision.Jul 12 2018, 12:38 AM
jrtc27 added inline comments.Jul 14 2018, 8:19 AM
3061 ↗(On Diff #155125)

Should we also be checking Src.getSimpleValueType()? This only works if both source and result are v2i32/f64.

dcederman updated this revision to Diff 160355.Aug 13 2018, 7:36 AM

Added assert for Src.getSimpleValueType() and improved formatting.

jyknight added inline comments.Aug 15 2018, 11:53 AM
3060 ↗(On Diff #160355)

I think it's probably better to move all of this into a PerformDAGCombine function. That's where the usual target-independent BITCAST constant handling happens, so doing this stuff there too seems reasonable and will make some of the below issues easier.

3070 ↗(On Diff #160355)

Doesn't this end up duplicating the load instruction? I think the original load won't disappear even if it has no other value uses, because its chain is used (and elimination of loads used only for their chains only happens in DAG Combine, which is already done by this point, possibly).

This also doesn't properly handle the chain output, so the new load might be misordered w.r.t. dependencies. Needs to replace the chain out of the original load with a tokenfactor of both loads.

If the load is volatile, duplicating it is invalid.

3073 ↗(On Diff #160355)

Is it guaranteed here that VT is only v2i32? Maybe just add that to the if.

3075 ↗(On Diff #160355)

This code works right in both endiannesses, right? (I forgot how build-vector endianness goes).

32–33 ↗(On Diff #160355)

Leave a note that there is still non-optimal codegen in this case?

dcederman edited the summary of this revision. (Show Details)

Moved code into a PerformDAGCombine function.

Removed code related to loads. Loads were already handled properly so the code should not have been included in the first place.

Check both source and target for the correct type.

Added code to handle little-endian systems and added new test cases.

The test_intrins_call test should be generating optimal code now.

jyknight accepted this revision.Aug 26 2018, 9:06 AM
This revision is now accepted and ready to land.Aug 26 2018, 9:06 AM
This revision was automatically updated to reflect the committed changes.