This is an archive of the discontinued LLVM Phabricator instance.

[PPC] - Add LoadCluster/StoreCluster DAG mutations to the MachineScheduler
AbandonedPublic

Authored by syzaara on Feb 9 2017, 11:57 AM.

Diff Detail

Event Timeline

sfertile created this revision.Feb 9 2017, 11:57 AM
inouehrs added inline comments.Feb 22 2017, 1:27 AM
lib/Target/PowerPC/PPCInstrInfo.cpp
2016

Here, do you mean cache line size?
Some PPC processors use a shorter cache line size (e.g. L1$ of BG/Q or embedded PPC), so maybe better to avoid hardcoding a value.
(But, TTI->getCacheLineSize() always returns 64, since it is only used on BG/Q so far.)

2127

Ditto.

nemanjai edited edge metadata.Feb 22 2017, 5:50 AM

Please be a bit more careful to ensure that comments are complete sentences with the expected punctuation, etc. Also, if your editor doesn't automatically format your code, consider running the patch through clang-format-diff.

lib/Target/PowerPC/PPCInstrInfo.cpp
1937

What is the use of the string parameter?

1940

Naming convention.

1950

"...are instructions are..." -> "... instructions are..."

1953

Please fix the formatting in this comment. I believe we should send out an RFC on llvm-dev to provide a new property for intrinsics that specifies that they have side-effects but do not touch memory. We have such a flag for instructions, I think it makes sense to have it for intrinsics as well. If we do that, I believe this comment should include a FIXME to remove this function once such a property exists and is used correctly in the PPC back end in all instances where it should be used.

1957

Why can't SDNode::getGluedNode() be used for this purpose and eliminate the need for this function?

1959

Is the operand you're after not guaranteed to be the very last operand?
In any case, it seems more natural to loop over the operands in reverse order, terminating before the first - and return the Glue operand when one is found. If the loop terminates, assert and return SDValue().

1965

Why is this needed? It seems that the above loop has terminated either when it has gone over all the operands (caught by previous assert) or when it found an operand whose type is MVT::Glue (whose type you check in this assert again). I think the weird structure of this function makes the logic more difficult to reason about than it needs to be.

1966

Keep in mind that you'll be returning something that you probably don't want to return (i.e. what you would assert on) in a no-asserts build.

1992

Text for the assert.

2001

tlat -> that

2015

Why?

2016

I agree. If we mean cache line size, we should check the cache line size. If we don't have accurate information about the cache line size for the common processors in the TTI implementation, we should update the implementation.

2037

This assert is unnecessary. Already exists in getImm().

2099

I think you may have physical registers at this point for TOC based loads (i.e. X2/R2 may already be part of the instruction at this point). It would make sense to look into whether this is the case and if it is, we should handle physical registers. Other than that, I don't see any compelling need to handle physical registers here.

2102

I am probably missing something here so it's a good indication that a detailed comment is in order...
What I mean is that it appears at first glance that if we have a load-hit-store situation where the store is followed by a load from the exact memory location, we'd happily cluster them together here. Presumably there's something after this (or before?) that will prevent this, but I think it would be good to document that here.

2103

This line looks way too long.

2106

A comment as to why we're exiting here.

lib/Target/PowerPC/PPCInstrInfo.h
52

Please refrain from unrelated formatting changes.

183

"... a used..." -> "... used ..."

204

Sentence fragment in comment, please revise.

sfertile added inline comments.Mar 2 2017, 8:42 PM
lib/Target/PowerPC/PPCInstrInfo.cpp
1937

I was using the string argument for some debugging and left it in by accident. I'll clean it up.

2016

Yeah this is the cache-line size. I'll update it the TTI implementation like you guys suggest as part of this patch

syzaara commandeered this revision.Apr 26 2017, 8:27 AM
syzaara edited reviewers, added: sfertile; removed: syzaara.
syzaara updated this revision to Diff 96750.Apr 26 2017, 8:31 AM
syzaara retitled this revision from [PPC] WIP - Add LoadCluster/StoreCluster DAG mutations to the MachineScheduler to [PPC] - Add LoadCluster/StoreCluster DAG mutations to the MachineScheduler.
syzaara abandoned this revision.May 4 2017, 12:55 PM