Page MenuHomePhabricator

[LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x)

Authored by spatel on Jan 19 2016, 4:58 PM.



This is a step towards solving PR25892:

It won't handle the reported case. As noted by the 'TODO' comments in the patch, we need to relax the hasOneUse() constraint and also match patterns that include memset_chk() and the llvm.memset() intrinsic in addition to memset().

Diff Detail


Event Timeline

spatel updated this revision to Diff 45327.Jan 19 2016, 4:58 PM
spatel retitled this revision from to [LibCallSimplifier] fold memset(malloc(x), 0, x) --> calloc(1, x).
spatel updated this object.
spatel added reviewers: mcrosier, davide, hfinkel.
spatel added a subscriber: llvm-commits.
mcrosier edited edge metadata.Jan 20 2016, 6:46 AM

What happens if there's an intervening store to the malloc'ed memory? (dumb I know.. but..)

Slightly modified version of the test case in memset_chk-1.ll:

define float* @pr25892(i64 %size) #0 {

%call = tail call i8* @malloc(i64 %size) #1
%0 = bitcast i8* %call to i32*        ;; 
store i32 1, i32* %0, align 4           ;; fdata[0] = 1;
%cmp = icmp eq i8* %call, null
br i1 %cmp, label %cleanup, label %if.end


%bc = bitcast i8* %call to float*
%call2 = tail call i64 @llvm.objectsize.i64.p0i8(i8* nonnull %call, i1 false)
%call3 = tail call i8* @__memset_chk(i8* nonnull %call, i32 0, i64 %size, i64 %call2) #1
br label %cleanup


%retval.0 = phi float* [ %bc, %if.end ], [ null, %entry ]
ret float* %retval.0


This does get transformed incorrectly with this patch.


978 ↗(On Diff #45327)

Shouldn't you check that Calloc is non-null here?

What happens if there's an intervening store to the malloc'ed memory? (dumb I know.. but..)

Ah, good catch. I checked 'hasOneUse()' in an earlier draft of this to avoid that scenario, but then I noticed that the example in the bug report had that 'llvm.objectsize' and other uses between the malloc and memset, so I removed that condition. I need to find a way around that.

978 ↗(On Diff #45327)

Yes - that won't work if calloc doesn't exist for the target.

spatel updated this revision to Diff 45448.Jan 20 2016, 2:42 PM
spatel updated this object.
spatel edited edge metadata.

Patch updated: now with less ambition and more TODOs!

  1. Restrict the optimization to hasOneUse() of the malloc. This isn't going to fire for most real code AFAICT because neither the llvm.memset() intrinsic nor the memset_chk() libcall cases will qualify. I don't know how to avoid the intermediate store problem noted by Chad yet, so take a baby step. Sorry PR25892.
  1. The earlier draft added a DataLayout member to FortifiedLibCallSimplifier, but we can just pull that from the module and use it as needed. This way we don't have to change the constructor and affect unrelated code (CodeGenPrepare).
  1. Fixed to make sure that calloc() exists and is available for the transform; if not, bail.

This looks to be in pretty good shape, but I wouldn't mind hearing other's opinions.

942 ↗(On Diff #45448)

Would it make more sense to check for calloc here?

if (!TLI.getLibFunc("calloc", Func) || !TLI.has(Func))

return nullptr;

It would avoid some unnecessary work in the event we don't have calloc. However, I assume we generally have calloc, so this might not be a great suggestion... Just a suggestion that doesn't necessarily have to be acted upon.

spatel added inline comments.Jan 22 2016, 3:31 PM
942 ↗(On Diff #45448)

If we view this in isolation, that sounds reasonable, but the calloc existence check is in emitCalloc because that's the pattern used by the other emit* functions in BuildLibCalls.

I'd prefer to leave this as-is pending resolution of the 'TODO' at line 918 about moving emitCalloc() over to BuildLibCalls to be with its siblings (even though that entire separate file is currently unnecessary because it's only ever used here AFAICT).

It's also possible (there's always a chance!) that someone will come up with some other transform here in LibCallSimplifier that needs a calloc(). In that case, I think it'd also be better to leave the existence check where it is.

mcrosier accepted this revision.Jan 25 2016, 7:37 AM
mcrosier edited edge metadata.

LGTM assuming you've done the necessary correctness/performance due diligence.

942 ↗(On Diff #45448)

All very reasonable comments, which is why I left this to your discretion.

This revision is now accepted and ready to land.Jan 25 2016, 7:37 AM

LGTM assuming you've done the necessary correctness/performance due diligence.

Thanks, Chad. I ran test-suite and see no fails (and don't expect this to fire anyway due to the 1-use restriction). I'm not able to measure any perf differences even with a micro-benchmark on OSX, but at least the code is smaller.

davide accepted this revision.Jan 25 2016, 4:17 PM
davide edited edge metadata.

lgtm, Sanjay, and sorry for the delay reviewing this.

This revision was automatically updated to reflect the committed changes.