This is an archive of the discontinued LLVM Phabricator instance.

[X86] When storing v1i1/v2i1/v4i1 to memory, make sure we store zeros in the rest of the byte
ClosedPublic

Authored by craig.topper on Nov 11 2020, 1:09 PM.

Details

Summary

We can't store garbage in the unused bits. It possible that something like zextload from i1/i2/i4 is created to read the memory. Those zextloads would be legalized assuming the extra bits are 0.

I'm not sure that the code in lowerStore is executed for the v1i1/v2i1/v4i1 case. It looks like the DAG combine in combineStore may have converted them to v8i1 first. And I think we're missing some cases to avoid going to the stack in the first place. But I don't have time to investigate those things at the moment so I wanted to focus on the correctness issue.

Should fix PR48147.

Diff Detail

Event Timeline

craig.topper created this revision.Nov 11 2020, 1:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2020, 1:09 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
craig.topper requested review of this revision.Nov 11 2020, 1:09 PM
RKSimon accepted this revision.Nov 12 2020, 1:34 AM

LGTM - there's a number of optimisations in the new kshift mask clearing codegen that we're missing but we can fix those up later.

llvm/lib/Target/X86/X86ISelLowering.cpp
23882

Any idea why we pad with a INSERT_SUBVECTOR here but use a CONCAT_VECTORS in the other case?

This revision is now accepted and ready to land.Nov 12 2020, 1:34 AM
craig.topper added inline comments.Nov 12 2020, 7:16 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
23882

I think because combineStore runs before type legalization and we try to avoid INSERT_SUBVECTOR early. LowerSTORE runs later, but I guess combineStore is called from LegalizeVectorOps and CONCAT_VECTORS wouldn't become INSERT_SUBVECTOR until LegalizeDAG. So maybe lowerStore should be using CONCAT_VECTORS as well?

RKSimon added inline comments.Nov 12 2020, 1:28 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
23882

Yes, consistently using CONCAT_VECTORS makes sense to me assuming there's no post-legalization gotchas.