This is an archive of the discontinued LLVM Phabricator instance.

[VPlan] Use consecutive numbers to print VPValues instead of addresses.
ClosedPublic

Authored by fhahn on Jan 20 2020, 9:41 PM.

Details

Summary

Currently when printing VPValues we use the object address, which makes
it hard to distinguish VPValues as they usually are large numbers with
varying distance between them.

This patch adds a simple slot tracker, similar to the ModuleSlotTracker
used for IR values. In order to dump a VPValue or anything containing a
VPValue, a slot tracker for the enclosing VPlan needs to be created. The
existing VPlanPrinter can take care of that for the existing code. We
assign consecutive numbers to each VPValue we encounter in a reverse
post order traversal of the VPlan.

The drawback is that the slot tracker is also required to print a single
instruction and needs to be passed in, as currently neither instructions
nor VPBlockBase has a reference to the containing plan. To address this,
I think we should add a reference to the containing VPlan to VPBlockBase
and use that to provide print functions and operator<< without needing
to pass in a slot tracker. What do you think?

Diff Detail

Event Timeline

fhahn created this revision.Jan 20 2020, 9:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2020, 9:41 PM
fhahn added a comment.Jan 20 2020, 9:43 PM

It looks like no tests currently really check the numbering of the VPvalues in the VPlan dump. With this patch, they should be consistent between runs, should I add a test for the exact output?

Unit tests: pass. 62041 tests passed, 0 failed and 783 were skipped.

clang-tidy: unknown.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

gilr added a comment.Feb 3 2020, 1:39 AM

It would indeed be good to improve VPValue printing. Initially, there were only a handful of VPValues so those pointer-based IDs seemed sufficient.
Since VPValues are now both more common and optionally have an underlying IR value, retaining names from the underlying IR where available could go a long way w.r.t both readability and relating to the loop being vectorized; This would hopefully cover most VPValues.
BTW, VP basic blocks could also retain the name of the corresponding IR basic block.
It would also be good to distinguish between "pure" VPValues and VPValues wrapping unnamed IR values (can we somehow retain the latter's enumeration?)

fhahn added a comment.Feb 3 2020, 1:44 PM

It would indeed be good to improve VPValue printing. Initially, there were only a handful of VPValues so those pointer-based IDs seemed sufficient.
Since VPValues are now both more common and optionally have an underlying IR value, retaining names from the underlying IR where available could go a long way w.r.t both readability and relating to the loop being vectorized; This would hopefully cover most VPValues.

Sounds good, printing the name of the underlying value should be quite trivial.

BTW, VP basic blocks could also retain the name of the corresponding IR basic block.

Sounds good!

It would also be good to distinguish between "pure" VPValues and VPValues wrapping unnamed IR values (can we somehow retain the latter's enumeration?)

+1. I think we would still need a mechanism like this patch to provide a better numbering for 'pure' VPValues.

For VPValues with unnamed wrapped IR values, I think it should be possible to print the number in the IR ( this will implicitly create an IR slot tracker and number the function on demand). We could use a scheme similar to machine IR and use something like %ir.(IR name) for VPValues with underlying IR values.

What do you think?

gilr added a comment.Feb 5 2020, 6:46 AM

It would indeed be good to improve VPValue printing. Initially, there were only a handful of VPValues so those pointer-based IDs seemed sufficient.
Since VPValues are now both more common and optionally have an underlying IR value, retaining names from the underlying IR where available could go a long way w.r.t both readability and relating to the loop being vectorized; This would hopefully cover most VPValues.

Sounds good, printing the name of the underlying value should be quite trivial.

BTW, VP basic blocks could also retain the name of the corresponding IR basic block.

Sounds good!

It would also be good to distinguish between "pure" VPValues and VPValues wrapping unnamed IR values (can we somehow retain the latter's enumeration?)

+1. I think we would still need a mechanism like this patch to provide a better numbering for 'pure' VPValues.

Agreed.

For VPValues with unnamed wrapped IR values, I think it should be possible to print the number in the IR ( this will implicitly create an IR slot tracker and number the function on demand). We could use a scheme similar to machine IR and use something like %ir.(IR name) for VPValues with underlying IR values.

What do you think?

On second thought, retaining original IR names and/or numbers is probably worth a separate discussion, also concerning names generated on VPlan execution.
The proposed Slot Tracker seems like a good foundation to have and possibly extend later, e.g. facilitating giving (enumerated) meaningful names to VPInstructions and retaining original IR name/number of live-ins.

Adding a pointer to VPlan in VPBlockBase would indeed be less intrusive to the code; how would this work for VPInstructions not assigned to any block?

fhahn updated this revision to Diff 244008.Feb 11 2020, 2:30 PM

It would indeed be good to improve VPValue printing. Initially, there were only a handful of VPValues so those pointer-based IDs seemed sufficient.
Since VPValues are now both more common and optionally have an underlying IR value, retaining names from the underlying IR where available could go a long way w.r.t both readability and relating to the loop being vectorized; This would hopefully cover most VPValues.

Sounds good, printing the name of the underlying value should be quite trivial.

BTW, VP basic blocks could also retain the name of the corresponding IR basic block.

Sounds good!

It would also be good to distinguish between "pure" VPValues and VPValues wrapping unnamed IR values (can we somehow retain the latter's enumeration?)

+1. I think we would still need a mechanism like this patch to provide a better numbering for 'pure' VPValues.

Agreed.

For VPValues with unnamed wrapped IR values, I think it should be possible to print the number in the IR ( this will implicitly create an IR slot tracker and number the function on demand). We could use a scheme similar to machine IR and use something like %ir.(IR name) for VPValues with underlying IR values.

What do you think?

On second thought, retaining original IR names and/or numbers is probably worth a separate discussion, also concerning names generated on VPlan execution.
The proposed Slot Tracker seems like a good foundation to have and possibly extend later, e.g. facilitating giving (enumerated) meaningful names to VPInstructions and retaining original IR name/number of live-ins.

Adding a pointer to VPlan in VPBlockBase would indeed be less intrusive to the code; how would this work for VPInstructions not assigned to any block?

I've put up a patch to add a pointer to VPlan: D74445. I;ve also added implementations of print() with and without SlotTracker (so we can re-use the existing tracker, e.g. from VPlanPrinter. I still need to update the slot tracker to actually use it. VPInstructions without a block (or a VPBasicBlock without parentPlan) should probably be handled similar to IR instructions without BB, by using something like 'badref'.

What do you think?

gilr added a comment.Mar 1 2020, 8:59 AM

[snip]
I've put up a patch to add a pointer to VPlan: D74445. I;ve also added implementations of print() with and without SlotTracker (so we can re-use the existing tracker, e.g. from VPlanPrinter. I still need to update the slot tracker to actually use it. VPInstructions without a block (or a VPBasicBlock without parentPlan) should probably be handled similar to IR instructions without BB, by using something like 'badref'.

What do you think?

This added pointer to VPlan can indeed take care of enumerating VPInstructions, defaulting to 'badref'.
Essentially all other VPValues except BackedgeTakenCount are wrappers of underlying IR values (right?), so they can print that value e.g. with the proposed %ir. w/o needing SlotTracker's services, thereby simplifying the patch. This would leave the somewhat special BackedgeTakenCount as the only user of the "where:" clause and current pointer-based naming.

fhahn updated this revision to Diff 247525.Mar 1 2020, 3:11 PM

Updated patch to use getPlan(), add unit tests.

[snip]
I've put up a patch to add a pointer to VPlan: D74445. I;ve also added implementations of print() with and without SlotTracker (so we can re-use the existing tracker, e.g. from VPlanPrinter. I still need to update the slot tracker to actually use it. VPInstructions without a block (or a VPBasicBlock without parentPlan) should probably be handled similar to IR instructions without BB, by using something like 'badref'.

What do you think?

This added pointer to VPlan can indeed take care of enumerating VPInstructions, defaulting to 'badref'.
Essentially all other VPValues except BackedgeTakenCount are wrappers of underlying IR values (right?), so they can print that value e.g. with the proposed %ir. w/o needing SlotTracker's services, thereby simplifying the patch. This would leave the somewhat special BackedgeTakenCount as the only user of the "where:" clause and current pointer-based naming.

I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

gilr added a comment.Mar 3 2020, 2:53 AM

Updated patch to use getPlan(), add unit tests.

[snip]
I've put up a patch to add a pointer to VPlan: D74445. I;ve also added implementations of print() with and without SlotTracker (so we can re-use the existing tracker, e.g. from VPlanPrinter. I still need to update the slot tracker to actually use it. VPInstructions without a block (or a VPBasicBlock without parentPlan) should probably be handled similar to IR instructions without BB, by using something like 'badref'.

What do you think?

This added pointer to VPlan can indeed take care of enumerating VPInstructions, defaulting to 'badref'.
Essentially all other VPValues except BackedgeTakenCount are wrappers of underlying IR values (right?), so they can print that value e.g. with the proposed %ir. w/o needing SlotTracker's services, thereby simplifying the patch. This would leave the somewhat special BackedgeTakenCount as the only user of the "where:" clause and current pointer-based naming.

I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

After D74445 all VPInstructions can indeed print themselves by either using their IR name or defaulting to SlotTracker/badref. So this patch can be simplified by not propagating the SlotTracker through the print() methods, since non-VPInstruction VPValues can rely on their underlying IR values for printing. Note that for this to work addVPValue must plant the underlying IR value into the generated VPValue as you do in D74695.
BackedgeTakenCount would then remain the only user of the current pointer-based naming and consequently of the "where:" clause.

fhahn added a comment.Mar 3 2020, 4:22 AM

[snip]
I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

After D74445 all VPInstructions can indeed print themselves by either using their IR name or defaulting to SlotTracker/badref. So this patch can be simplified by not propagating the SlotTracker through the print() methods, since non-VPInstruction VPValues can rely on their underlying IR values for printing. Note that for this to work addVPValue must plant the underlying IR value into the generated VPValue as you do in D74695.
BackedgeTakenCount would then remain the only user of the current pointer-based naming and consequently of the "where:" clause.

Ah I think I got what you are proposing now: just remove the ::print(raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) variants taking a SlotTracker, because we can just instantiate it on demand, by just getting the parent plan for the containing VPBB. This means we might have to re-number the plan multiple times when printing the whole plan through VPlanPrinter, but as this is just for debugging (and the scope of a plan should be relatively small) we can adjust that as follow-up, if required.

gilr added a comment.Mar 3 2020, 8:27 AM

[snip]
I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

After D74445 all VPInstructions can indeed print themselves by either using their IR name or defaulting to SlotTracker/badref. So this patch can be simplified by not propagating the SlotTracker through the print() methods, since non-VPInstruction VPValues can rely on their underlying IR values for printing. Note that for this to work addVPValue must plant the underlying IR value into the generated VPValue as you do in D74695.
BackedgeTakenCount would then remain the only user of the current pointer-based naming and consequently of the "where:" clause.

Ah I think I got what you are proposing now: just remove the ::print(raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) variants taking a SlotTracker, because we can just instantiate it on demand, by just getting the parent plan for the containing VPBB. This means we might have to re-number the plan multiple times when printing the whole plan through VPlanPrinter, but as this is just for debugging (and the scope of a plan should be relatively small) we can adjust that as follow-up, if required.

So the recursive print of any VP entity should use a single, up-to-date SlotTracker. This SlotTracker may be generated/assigned when entering the recursion (e.g. on operator<< or print()) and destroyed/invalidated when leaving it. One could optimize later to invalidate only when needed, i.e. on VPlan changes.
VPlanPrinter can also reset the printed Plan's SlotTracker upon start and invalidate it when done. We'll have to make VPlan's SlotTracker mutable for this to work, right?

fhahn added a comment.Mar 3 2020, 8:47 AM

[snip]
I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

After D74445 all VPInstructions can indeed print themselves by either using their IR name or defaulting to SlotTracker/badref. So this patch can be simplified by not propagating the SlotTracker through the print() methods, since non-VPInstruction VPValues can rely on their underlying IR values for printing. Note that for this to work addVPValue must plant the underlying IR value into the generated VPValue as you do in D74695.
BackedgeTakenCount would then remain the only user of the current pointer-based naming and consequently of the "where:" clause.

Ah I think I got what you are proposing now: just remove the ::print(raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) variants taking a SlotTracker, because we can just instantiate it on demand, by just getting the parent plan for the containing VPBB. This means we might have to re-number the plan multiple times when printing the whole plan through VPlanPrinter, but as this is just for debugging (and the scope of a plan should be relatively small) we can adjust that as follow-up, if required.

So the recursive print of any VP entity should use a single, up-to-date SlotTracker. This SlotTracker may be generated/assigned when entering the recursion (e.g. on operator<< or print()) and destroyed/invalidated when leaving it. One could optimize later to invalidate only when needed, i.e. on VPlan changes.

I think I am missing something.

I think this is exactly what the patch does: when calling print without SlotTracker, it will be created on demand and VPlanPrinter re-uses the same slot tracker for printing all VP entities by passing it to print(). I'm not sure what I should change, besides a follow-up to use the underlying values, if available/

VPlanPrinter can also reset the printed Plan's SlotTracker upon start and invalidate it when done. We'll have to make VPlan's SlotTracker mutable for this to work, right?

hmm, IIUC we could add a SlotTracker as member to VPlan, but I am not sure if we should add it, as it is only required for debugging and keeping track of its state might be more work than it's worth.

gilr added a comment.Mar 4 2020, 8:50 AM

[snip]
I think we also introduce some VPInstructions not connected to any underlying IR during predication and will add more in the future. I think we should do (and need) both, using the IR name when available and the slot tracker otherwise. I think it would be good to have using the underlying IR names as a follow up to this patch. What do you think?

After D74445 all VPInstructions can indeed print themselves by either using their IR name or defaulting to SlotTracker/badref. So this patch can be simplified by not propagating the SlotTracker through the print() methods, since non-VPInstruction VPValues can rely on their underlying IR values for printing. Note that for this to work addVPValue must plant the underlying IR value into the generated VPValue as you do in D74695.
BackedgeTakenCount would then remain the only user of the current pointer-based naming and consequently of the "where:" clause.

Ah I think I got what you are proposing now: just remove the ::print(raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) variants taking a SlotTracker, because we can just instantiate it on demand, by just getting the parent plan for the containing VPBB. This means we might have to re-number the plan multiple times when printing the whole plan through VPlanPrinter, but as this is just for debugging (and the scope of a plan should be relatively small) we can adjust that as follow-up, if required.

So the recursive print of any VP entity should use a single, up-to-date SlotTracker. This SlotTracker may be generated/assigned when entering the recursion (e.g. on operator<< or print()) and destroyed/invalidated when leaving it. One could optimize later to invalidate only when needed, i.e. on VPlan changes.

I think I am missing something.

I think this is exactly what the patch does: when calling print without SlotTracker, it will be created on demand and VPlanPrinter re-uses the same slot tracker for printing all VP entities by passing it to print(). I'm not sure what I should change, besides a follow-up to use the underlying values, if available/

Right.

VPlanPrinter can also reset the printed Plan's SlotTracker upon start and invalidate it when done. We'll have to make VPlan's SlotTracker mutable for this to work, right?

hmm, IIUC we could add a SlotTracker as member to VPlan, but I am not sure if we should add it, as it is only required for debugging and keeping track of its state might be more work than it's worth.

Agreed.

llvm/lib/Transforms/Vectorize/VPlan.h
636–638

It seems all derived recipes have the same implementation for this method, i.e. create a SlotTracker and pass it to the actually-printing variant below. Does it still need to be virtual?

llvm/lib/Transforms/Vectorize/VPlanValue.h
128

Can we retain such API, which generates an ad-hoc SlotTracker for VPInstruction?

fhahn updated this revision to Diff 248219.Mar 4 2020, 9:35 AM

Make void print(raw_ostream &O, const Twine &Indent) const non-virtual and add back
raw_ostream &llvm::operator<<(raw_ostream &OS, const VPValue &V)

fhahn marked 3 inline comments as done.Mar 4 2020, 9:36 AM
fhahn added inline comments.
llvm/lib/Transforms/Vectorize/VPlan.h
636–638

Yes, I've made the non slot tracker variant non-virtual and removed it from the subclasses.

llvm/lib/Transforms/Vectorize/VPlanValue.h
128

Yes, I've brought it back,

gilr accepted this revision.Mar 5 2020, 6:08 AM

LGTM!

This revision is now accepted and ready to land.Mar 5 2020, 6:08 AM
fhahn marked an inline comment as done.Mar 5 2020, 7:04 AM

Thanks Gil! I'll prepare a follow-up to use the underlying values if available soon.

This revision was automatically updated to reflect the committed changes.
RKSimon added a subscriber: RKSimon.Mar 5 2020, 8:54 AM

@fhahn This is causing a msvc warning: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/22465/steps/test-check-all/logs/warnings%20%281%29

C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\unittests\Transforms\Vectorize\VPlanTest.cpp(231): warning C4129: 'l': unrecognized character escape sequence
fhahn added a comment.Mar 5 2020, 9:57 AM

@fhahn This is causing a msvc warning: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/22465/steps/test-check-all/logs/warnings%20%281%29

C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\unittests\Transforms\Vectorize\VPlanTest.cpp(231): warning C4129: 'l': unrecognized character escape sequence

Hmm I am not sure, but it seems like this might be a bug in Visual Studio :( I don't think it should interpret/warn about escape sequences in a raw string literal. I'll try to get a work-around.

foad added a subscriber: foad.Mar 19 2020, 5:19 AM

@fhahn This is causing a msvc warning: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/22465/steps/test-check-all/logs/warnings%20%281%29

C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\unittests\Transforms\Vectorize\VPlanTest.cpp(231): warning C4129: 'l': unrecognized character escape sequence

Hmm I am not sure, but it seems like this might be a bug in Visual Studio :( I don't think it should interpret/warn about escape sequences in a raw string literal. I'll try to get a work-around.

We're seeing this too. Did you come up with anything? Perhaps we should add C4129 to the big list of disabled warnings (https://github.com/llvm/llvm-project/blob/master/llvm/cmake/modules/HandleLLVMOptions.cmake#L515)?

fhahn added a comment.Mar 19 2020, 5:48 AM

@fhahn This is causing a msvc warning: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/22465/steps/test-check-all/logs/warnings%20%281%29

C:\ps4-buildslave2\llvm-clang-x86_64-expensive-checks-win\llvm-project\llvm\unittests\Transforms\Vectorize\VPlanTest.cpp(231): warning C4129: 'l': unrecognized character escape sequence

Hmm I am not sure, but it seems like this might be a bug in Visual Studio :( I don't think it should interpret/warn about escape sequences in a raw string literal. I'll try to get a work-around.

We're seeing this too. Did you come up with anything? Perhaps we should add C4129 to the big list of disabled warnings (https://github.com/llvm/llvm-project/blob/master/llvm/cmake/modules/HandleLLVMOptions.cmake#L515)?

I'll put up a patch to include C4129 in the list. The mix of escape sequences in the VPlan dumps make it quite tricky to get an alternative match string.

I still don't understand why you need these - does something consume custom escape sequences like these?

fhahn added a comment.Mar 19 2020, 7:11 AM

I still don't understand why you need these - does something consume custom escape sequences like these?

The dump is in the DOT graph language, which has a \l escape sequence. It might make sense to also dump just for printing (without piping through dot), but IMO that would be an unrelated improvement.