This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Change getAllDerivedDefinitions() to return an ArrayRef
AbandonedPublic

Authored by Paul-C-Anagnostopoulos on Dec 21 2020, 11:16 AM.

Details

Summary

This will become the first of a series of patches to change getAllDerivedDefinitions() to return a const std::vector reference, rather than a copy of the vector. Now that the vectors are cached, the goal is to reduce the amount of copying.

I have updated four LLVM TableGen backends to accept the const reference, along with one MLIR backend that had to be changed.

I would appreciate any and all comments about this patch with respect to too much, too little, or just the right amount of copying.

Diff Detail

Event Timeline

Paul-C-Anagnostopoulos requested review of this revision.Dec 21 2020, 11:17 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 21 2020, 11:17 AM
lattner accepted this revision.Dec 21 2020, 11:21 AM

Did you consider returning ArrayRef<Record*> instead? This is a bit more of an abstract type. Both work though

mlir/tools/mlir-tblgen/OpInterfacesGen.cpp
82

plz remove the commented out code.

This revision is now accepted and ready to land.Dec 21 2020, 11:21 AM

Is ArrayRef better in some stylistic sense? I'm happy to switch to it.

How do you declare that the vector pointed to by the ArrayRef is constant?

Is ArrayRef better in some stylistic sense? I'm happy to switch to it.

It allows more flexibility in the implementation of such a function (eg: if you change the backing storage to be SmallVector<T>, the API doesn't change - whereas if the API is "const std::vector<T> &" then implementations are constrained to only be able to implement their backing store using std::vector)

How do you declare that the vector pointed to by the ArrayRef is constant?

ArrayRef<T> is constant - there is no API to mutate elements of the array. MutableArrayRef<T> is the mutable equivalent.

Then this does sound like a good idea.

What is the best style for returning the ArrayRef? Just let the return construct it?

return Pair.first->second;

And is it better style to use 'const auto' at all the various places where it's called, or 'const ArrayRef' ?

Yeah just let the return construct it. On the caller side, just say: auto x = Records.getAllDerivedDefinitions(..)

You don't need to say "const ArrayRef<x>" for the same reason you don't need to say "const int x = foo();" it doesn't add much value and causes confusion.

-Chris

I suddenly thought that the change to ArrayRef would require me to change every call to getAllDerivedDefinitions in one revision. So I changed the function and tried compiling. I got no complaints. What mechanism is it that allows an ArrayRef to be returned to a variable declared [const] std::vector<Record *> [&] without complaint?

I suddenly thought that the change to ArrayRef would require me to change every call to getAllDerivedDefinitions in one revision. So I changed the function and tried compiling. I got no complaints. What mechanism is it that allows an ArrayRef to be returned to a variable declared [const] std::vector<Record *> [&] without complaint?

I'm guessing this from ArrayRef.h

/// @}                                                                                                                                                                                                                              
/// @name Conversion operators                                                                                                                                                                                                      
/// @{                                                                                                                                                                                                                              
operator std::vector<T>() const {                                                                                                                                                                                                   
  return std::vector<T>(Data, Data+Length);                                                                                                                                                                                         
}

All righty then. Yes, that appears to do the trick.

(I'd like a word with the syntax people, please . . .)

Yeah C++ syntax isn't great. If you take the path of returning arrayref, it is a bad idea to allow it to convert the a 'const&'. My understanding is that operator you found on arrayref will make a copy of the data (defeating the whole purpose of this patch :-), and that the lifetime will be bound to the statement. This means the reference will actually dangle. :-(

You can either keep it as a const& return or return an array ref and change the callers to use 'auto' or an explicit ArrayRef declaration. In either case, please appease the clang-format gods. Thanks Paul!

-Chris

I am about to update this revision to return an ArrayRef. But there is one compilation unit so far that actually modifies the returned array. Stay tuned for my update.

Paul-C-Anagnostopoulos updated this revision to Diff 313377.EditedDec 22 2020, 9:55 AM
Paul-C-Anagnostopoulos retitled this revision from [TableGen] Change getAllDerivedDefinitions() to return const vector & to [TableGen] Change getAllDerivedDefinitions() to return an ArrayRef.

This update changes getAllDerivedDefinitions so it returns an ArrayRef.

It worked out nicely except in OpInterfacesGen.cpp. There are three calls to getAllDerivedDefinitions. The one on line 63 modifies the returned array. The two on lines 108 and 134 pass it to InterfaceGenerator(), where it is stored as a std::vector. So in all three cases I let it convert the ArrayRef to std::vector, which copies the vector.

Is it sufficient to declare the return value as a std::vector, or is that not enough warning to the reader to watch out for copying?

this looks great to me Paul. I don't see the client that modifies the array, but such a client can do an explicit copy into its own std::vector

llvm/utils/TableGen/CodeEmitterGen.cpp
463–464

I'd personally use 'auto' for this iterator type.

Paul-C-Anagnostopoulos marked an inline comment as done.Dec 22 2020, 10:05 AM

The modification occurs on line 65:

llvm::erase_if(defs, ...

My temptation would be to add a macro to make it clear what's going on:

MODIFY_DEF_VECTOR std::vector<llvm::Record *> defs =
    recordKeeper.getAllDerivedDefinitions("OpInterface");

But I haven't seen that sort of thing done in LLVM.

Yeah please don't do a macro. A comment should be enough.

Yeah C++ syntax isn't great. If you take the path of returning arrayref, it is a bad idea to allow it to convert the a 'const&'. My understanding is that operator you found on arrayref will make a copy of the data (defeating the whole purpose of this patch :-), and that the lifetime will be bound to the statement. This means the reference will actually dangle. :-(

In this case I believe you'd get Reference Lifetime Extension here and the code would be safe/correct, but generally not a good idea to rely on that syntax anyway, except in some very niche template/generic code situations.

You can either keep it as a const& return or return an array ref and change the callers to use 'auto' or an explicit ArrayRef declaration.

Yeah, for the best - I'd probably use ArrayRef personally, since 'auto' hides whether the type is lightweight or heavyweight & at least I'd tend to read as heavyweight unless there were other context clues. (I guess we're seeing more iterator_ranges and other value typed lightweight types that we're probably using auto on, so maybe my reading will/has already changed) Probably short enough to write out without harming readability greatly.

The modification occurs on line 65:

llvm::erase_if(defs, ...

My temptation would be to add a macro to make it clear what's going on:

MODIFY_DEF_VECTOR std::vector<llvm::Record *> defs =
    recordKeeper.getAllDerivedDefinitions("OpInterface");

But I haven't seen that sort of thing done in LLVM.

Yeah, as @lattner said, a macro isn't suitable here (you may find such macros for things like "mark this variable as used" because the macro would expand to an attribute the compiler can use for enforcement or warning suppression, etc - but if the macro never expands to anything it's generally better to use a comment instead (because the comment can use real prose, because it doesn't imply something semantically meaningful that isn't, etc)). If you want to make the conversion explicit you can call the 'vec()' function - I think that'd be enough for me, without further comment, to make it clear the author intended the conversion. But comment probably doesn't hurt either.

I've gone through about eight backends for the first revision. I've used 'auto' when no copy is needed, and added a comment when a copy is needed. I found two places that copy the returned vector but probably don't need to, so I added TODOs there for later consideration.

I will update this Phabricator revision later today and then let it cook for a day or two before pushing it.

I've gone through about eight backends for the first revision. I've used 'auto' when no copy is needed, and added a comment when a copy is needed. I found two places that copy the returned vector but probably don't need to, so I added TODOs there for later consideration.

Sounds good. (if you wanted to, it's possible this could be committed incrementally rather than as a monolithic patch (I don't know how many call sites you're cleaning up here - if it's hundreds of files, might be worth a more incremental approach) - while reference lifetime extension is quirky and not ideal to rely on, it might be OK in a short-lived transition, such as changing the API here to return ArrayRef in one commit, without the cleanup at call sites - then committing changes to the call sites to use auto or ArrayRef as desired in whatever chunks (individual files, per directory batches, etc) seems suitable)

I will update this Phabricator revision later today and then let it cook for a day or two before pushing it.

As mentioned elsewhere - reviews don't generally have timelines like this. If it's sent for review, it should only be committed once approved. ( https://llvm.org/docs/CodeReview.html#code-review-workflow ) There are a few cases where that doesn't tend to hold as much, but that's the general rule.

llvm/utils/TableGen/CodeEmitterGen.cpp
463–464

Yeah, looks like this could probably use a range-based for loop, even, and skip mentioning the iterator type entirely.

I will update this Phabricator revision later today and then let it cook for a day or two before pushing it.

As mentioned elsewhere - reviews don't generally have timelines like this. If it's sent for review, it should only be committed once approved. ( https://llvm.org/docs/CodeReview.html#code-review-workflow ) There are a few cases where that doesn't tend to hold as much, but that's the general rule.

Admittedly it does get a bit weird when a patch gets approved, and then has significant changes after that approval. (I'd probably have read @lattner's first comment/approval as "good to commit as-is, if ArrayRef was considered and not used for some reason", though once the change then went to ArrayRef and subsequent call-site cleanups, probably worth coming back around for a new approval)

I will update this Phabricator revision later today and then let it cook for a day or two before pushing it.

As mentioned elsewhere - reviews don't generally have timelines like this. If it's sent for review, it should only be committed once approved. ( https://llvm.org/docs/CodeReview.html#code-review-workflow ) There are a few cases where that doesn't tend to hold as much, but that's the general rule.

Admittedly it does get a bit weird when a patch gets approved, and then has significant changes after that approval. (I'd probably have read @lattner's first comment/approval as "good to commit as-is, if ArrayRef was considered and not used for some reason", though once the change then went to ArrayRef and subsequent call-site cleanups, probably worth coming back around for a new approval)

Because sometimes even the best of ideas turns out looking weird/problematic in practice.

There are about 15 call sites in these compilation units. I figured that was a good number to include with the ArrayRef change itself. I had to change OpInterfacesGen.cpp because it would not compile.

I said I would leave it here for a couple of days because of the major change after the revision was accepted. The Add Action dropdown includes "Accept Revision", but the top of the page still has "Accepted". I'm happy to wait until some posts LGTM again.

dblaikie accepted this revision.Dec 22 2020, 3:00 PM

There are about 15 call sites in these compilation units. I figured that was a good number to include with the ArrayRef change itself. I had to change OpInterfacesGen.cpp because it would not compile.

Yeah, that sounds feasible.

I said I would leave it here for a couple of days because of the major change after the revision was accepted.

Yep, and I'm sorry for jumping down your throat a bit about it - the issue is that sometimes this sort of approach has been used as a way to bypass complex review "well, no one said no, so I guess that means yes", so I think it's important not to normalize this approach/muddy the waters too much with this sort of thing, even if it's not a huge deal for this particular patch, for instance.

The Add Action dropdown includes "Accept Revision", but the top of the page still has "Accepted". I'm happy to wait until some posts LGTM again.

That'd be best. Yeah, Phab now shows the prior review as "accepted prior diff", which is handy to give some sense that the approval may be out of date (we're pretty relaxed about some post-approval changes, under some bar of "obviousness" - eg: "Approved, but please format this patch or rename this variable to match the naming convention", etc - so it's not a hardline that a post-patch change requires re-approval, but a good reminder to consider that it might - in this case I think there's sufficient changes to merit re-approval).

This is all ready to go, then? Looks OK. I'd personally use a bit less auto, but your call. & if you wanted to fix that one iterator for loop up to use range-based-for either in this patch, or a standalone commit (no need to send it for pre-commit review) etiher before or after this one, that'd be nice but not mandatory.

I will change to a rang-based for loop tomorrow.

Paul-C-Anagnostopoulos marked an inline comment as done.

Changed the loop in CodeEmitterGen.cpp to a range-based loop.

General question: What's the motivation for this work series/direction? Did the lack of caching show up in profiles? Has the addition of profiling shown to be a significant improvement in runtime?
But yeah, if the caching makes sense - seems nice to do this tidy up work that avoids making copies that are no longer needed.

llvm/utils/TableGen/CodeGenDAGPatterns.cpp
3134

I'd probably not put the comment here - it splits the expression possibly unnecessarily (I guess generally it doesn't fit on one 80 col line anyway? But still I'd avoid it - would make refactoring the expression more fussy, etc). Could you put it on a separate line before this statement? (& maybe rephrase it to explain the need to copy - which will be easier with the full 80 columns for the comment, I suspect - like "Make a copy of the forms due to subsequent modifications"?)

Also, it looks like in most of these instances, a copy isn't actually needed - maybe follow-up commits could remove these copies in favor of non-copying algorithms. For instance in this function, the elements are popped back from the vector, but equivalent non-mutating code might look something like:

for (Record* R : llvm::reverse(AMs))
  ComplexPatterns.insert(std::make_pair(R, R));

(Might be easier to make those changes to non-mutation first/in an independent commit, then that'd simplify this change by no longer needing the copies and comments)

At least so far as I can tell, that's all this code is doing - and the other copy cases in this file could be similarly modified. The only copy that looks a bit more necessary seems to be "getAllOpInterfaceDefinitions"

mlir/tools/mlir-tblgen/OpInterfacesGen.cpp
81

Looks like this is necessary because one caller passes the result of getAllOpInterfaceDefinitions to this ctor - so it needs storage. A type erased filtering iterator, or perhaps not... if InterfaceGenerator is just an implementation convenience base class - it could be a template, perhaps a CRTP base that avoids the need for getAllOpInterfaceDefinitions to manifest the copy in memory by providing an iteration API rather than a data structure. But maybe that involves too much stuff having to move into a header? Not sure.

83–84

This change looks incorrect to me - what was the motivation to change the parameter from rvalue ref?
(though it might, regardless of this change, be better written as pass-by-value - then callers with temporaries (like those passing the result of getAllOpInterfaceDefinitions) or explicit std::move for the parameter would be benefit from being able to move into the InterfaceGenerator - and callers that want to keep their own copy of the vector could do so as well - https://abseil.io/tips/117 discusses the general idea in more detail)

I'm not yet clever enough to do profiling. It just seemed like building large record vectors multiple times was a waste of time. Perhaps I embarked on this change without sufficient evidence of its necessity.

I will move the copy comments above the definition.

mlir/tools/mlir-tblgen/OpInterfacesGen.cpp
83–84

The code would not compile as it was. I will investigate the best way to do this.

I'm not yet clever enough to do profiling.

Happy to help with anything in that regard. I don't often do execution profiling, more often memory profiling, but have some rough knowledge/manage to stick it all together from time to time.

It just seemed like building large record vectors multiple times was a waste of time. Perhaps I embarked on this change without sufficient evidence of its necessity.

Yeah, this follow-on change makes sense to me if the initial change does, but that initial change to add caching seems like it should be justified with a profile or the like - otherwise it's hard to know if they're big enough for the copies to matter, etc. (& adding a cache can increase memory usage - which might have other knock-on effects, etc)

I will move the copy comments above the definition.

Great!

I'm not sure where to go from here, then. Do you think I should start over, by eliminating the caching and then profiling?

I don't think the caching causes more copies of the vectors, since the one that would have been returned is simply now cached. And additional copying should be reduced as I go through the uses of getAllDerivedDefinitions().

I'm not sure where to go from here, then. Do you think I should start over, by eliminating the caching and then profiling?

I don't think the caching causes more copies of the vectors, since the one that would have been returned is simply now cached. And additional copying should be reduced as I go through the uses of getAllDerivedDefinitions().

The caching may extend the lifetime of the vector. If it was previously a local variable, it would go out of scope and be deleted when the function ends. Now the caching will keep it around until tablegen exits. So its possible many short lived vectors that didn't have overlapping lifetimes become many long lived objects that do overlap increasing peak memory usage.

Ah yes, good point. I was concerned about time, not memory.

Hmm. Perhaps I should abandon this. What do people think?

Ah yes, good point. I was concerned about time, not memory.

Hmm. Perhaps I should abandon this. What do people think?

I'd say at least put it on hold/revisit the caching addition - sounds like you're generally interested in improving the performance of TableGen, so working on getting good profile results to motivate not just this change but others you might find, etc seems like a good place to start. If it turns out the caching's good, you can revisit this patch. (but I guess from a Phabricator patch status - "abandoned" might be an option, I guess you can always unabandon it if things go this way - your call)

You don't have to immediately revert the caching to begin with - figure out the profiling, try measuring with tblgen as it is in-tree with the caching, and then try locally reverting/profiling and see if what the difference looks like. If there's something in it, great, otherwise you can revert the caching and see what might be the hotter parts that are worth profiling.

Okay, that sounds like a good plan. I'll let this revision float for now. Shall I email you about profiling?

Okay, that sounds like a good plan. I'll let this revision float for now. Shall I email you about profiling?

Sure thing - if you want to use Windows utilities, you might try starting here: https://docs.microsoft.com/en-us/visualstudio/profiling/beginners-guide-to-performance-profiling?view=vs-2019 - if you're using unix-esque tools, https://www.thegeekstuff.com/2012/08/gprof-tutorial/ - not sure if there are quick/easy ways to plumb that through an LLVM build, maybe you can get away with just adding -pg to CFLAGS and rebuilding tblgen.

I've abandoned this revision until I spend some time studying it.