This is an archive of the discontinued LLVM Phabricator instance.

Add support for complex aggregate store in InstCombine
AbandonedPublic

Authored by deadalnix on May 13 2015, 4:31 PM.

Details

Summary

As per title. This is a continuation of the work to support aggregate load and store properly. This got throught the aggregate and form an integral using bitcast and various bit manips, and store a large integral at the end representing the layout of the aggregate.

Ensure this is done after alignement calculation so semantic is not altered as far as the machine is concerned.

Diff Detail

Event Timeline

deadalnix updated this revision to Diff 25746.May 13 2015, 4:31 PM
deadalnix retitled this revision from to Add support for complex aggregate store in InstCombine.
deadalnix updated this object.
deadalnix edited the test plan for this revision. (Show Details)
deadalnix added a subscriber: Unknown Object (MLST).May 13 2015, 4:32 PM
deadalnix updated this revision to Diff 25755.May 13 2015, 10:16 PM

Pass PackSize down instead of recomputing it every time it is needed.

majnemer edited edge metadata.May 13 2015, 10:27 PM

This looks very questionable. InstCombine isn't supposed to create integer operations using types like i256.

@majnemer I've been told the exact reverse last I tried to get something similar in (it was generating a set of store instead of one giant one).

deadalnix updated this revision to Diff 26499.May 25 2015, 4:05 PM
deadalnix edited edge metadata.

Rebase.

As per discussion with @majnemer , the open question is what to do if the aggregate is larger than what is supported on the target.

IMO, it is still preferable to generate load too larges, as they are at least somewhat supported as opposed to not supported at all.

deadalnix updated this revision to Diff 26703.May 28 2015, 10:40 AM

Rebase, ping, ping ?

reames removed a reviewer: reames.Jun 15 2015, 4:35 PM
deadalnix updated this revision to Diff 28062.Jun 19 2015, 4:52 PM

Rebased that diff on top of existing master. I'd really like to get that moving. I'm aware of the concern voiced by @majnemer , however

  • No better solution is on the table by now. Asking for something better is ok, but just letting the thing hang around is just poor code review management.
  • The proposed combination is at least understood by most of the toolchain, when fca are not. That is just dumb to reject an improvement simply for the foggy hope of some better improvement that is never comming.

I do think I have proven to be sticking around and can be trusted to continue work on that in the future. inf act, the process has been so slow that pretty much anyone would have given up by now.

It's been a month that this is just sitting there. You guys need to provide a way forward to get it as it is. I just can't reimplement that in a new way every 2 month because nobody has any clue how to move forward on the subject. This is disrespectful of my work and hurtful for the project at large as you can't get a better strategy to get contributor pissed of.

I think it would be useful to take a step back and ask "why is your IR generator creating loads of such large aggregates". I feel like if we had this information, we'd be able to determine what the best course of action is.

Pretty much all frontend need to manipulate aggregate. clang is no exception. Right now all frontend use a set of trick to make that work. clang does it via doing memcpy of the struct into an alloca and letting SROA do the work.

I can do the same on my own or we can facilite frontend writter life by handling aggregates properly. The later is preferable.

deadalnix updated this revision to Diff 32335.Aug 17 2015, 1:17 PM

Rebased, retested, ping ping ?

deadalnix abandoned this revision.Dec 14 2015, 6:40 PM