This is an archive of the discontinued LLVM Phabricator instance.

[clang] Return `std::string_view` from `TargetInfo::getClobbers()`
ClosedPublic

Authored by Stoorx on Apr 20 2023, 6:39 AM.

Details

Summary

Change the return type of getClobbers function from const char*
to std::string_view. Update the function usages in CodeGen module.

The reasoning of these changes is to remove unsafe const char*
strings and prevent unnecessary allocations for constructing the
std::string in usages of getClobbers() function.

Diff Detail

Event Timeline

Stoorx created this revision.Apr 20 2023, 6:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2023, 6:39 AM
Stoorx requested review of this revision.Apr 20 2023, 6:39 AM

Context not available.

Please update the diff with context, https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface.

I'm not sure what the benefit of this change is. While I did say less raw pointers = better, perhaps I'd make an exception for const char * for constant strings. In this case, no one is manipulating the string until (I assume) after it's converted to std::string, so we're not removing risky accesses in that way. If the API were returning a const char * that we then did a bunch of find first, split at, strlen, etc on, then this would make more sense.

https://llvm.org/docs/ProgrammersManual.html#passing-strings-the-stringref-and-twine-classes

Also this says you can implicitly construct from a string literal, so I am surprised you need the llvm::StringLiteral. Unless this is just a thing that asserts that it is in fact, a literal (I've not used this before myself).

Overall I applaud the effort but in this particular case it may not be needed.

In this case, no one is manipulating the string until (I assume) after it's converted to std::string, so we're not removing risky accesses in that way.

Maybe the solution is to return some kind of string (std::string or llvm::SmallString) by value or by const reference?

If the API were returning a const char * that we then did a bunch of find first, split at, strlen, etc on, then this would make more sense.

At least StringRef has the Length inside (we do not need to call strlen or someting) and according to documentation it can handle non-null terminated strings.

/// StringRef - Represent a constant reference to a string, i.e. a character
/// array and a length, which need not be null terminated.

Offtop: Is there any discussion platform where I can share some ideas? I have some of them how to refactor TargetInfo classes, but it's a major (and maybe breaking) change I have to discuss prior to implement.

Stoorx updated this revision to Diff 515707.Apr 21 2023, 5:59 AM

Rebase & upload with context

You need to clearly state the "why" of the change in the commit message. Sometimes it seems obvious to the author but a reviewer has to assume and may assume incorrectly.

In this case, I guess that since the result of the function only has its length checked then is added to another string, you would avoid a heap allocation for the new std::string temporary by using StringRef.

You could also do this by making a StringRef from the const char* and checking its length. That does leave a small ambiguity as to whether returning nullptr and empty string is the same thing. Which it is, but it's not clear just from the prototype of the function. So changing getClobbers to return StringRef resolves both issues.

But equally, you might have had a completely different goal. Tell me if I am right :)

Offtop: Is there any discussion platform where I can share some ideas? I have some of them how to refactor TargetInfo classes, but it's a major (and maybe breaking) change I have to discuss prior to implement.

An RFC on discourse is the usual way. If you can supply some work in progress patches or a tree on github then even better (and prevent you from proposing things you later find to be impossible, not that I have ever done that of course :) ).

I've looked through usages of getClobbers again, and found that in each case the result if the function is casted to std::string.

clang/lib/CodeGen/CGBuiltin.cpp
clang/lib/CodeGen/CGStmt.cpp

std::string Constraints = "=r,*Z,~{memory}";
  std::string MachineClobbers =
      static_cast<std::string>(CGF.getTarget().getClobbers());
  if (!MachineClobbers.empty()) {
    Constraints += ',';
    Constraints += MachineClobbers;
  }

I think we can 'make two deals with one shot' if we would:

  • Make std::string_view as a return type of getClobbers (by value, since the std::string_view is 'TriviallyCopyable')
  • (Maybe) Make private static field of std::string type for actual clobber value (or std::string_view type since it can be constexpr-constructed)
  • Return the value of this field from getClobbers
  • Change type of MachineClobbers (in usages, see example above) variable to std::string_view since it actually used for read-only access. (We do not need to construct std::string and destruct it almost immediately)

As a result we get:

  • More robust usage of getClobbers (a bit ridiculous but...)
  • Zero unnecessary allocations

(Also: I've discovered that llvm::StringRef is exact the same thing as std::string_view. I think of it so much :^) )

string_view is a good call, especially as this is not doing anything fancy with it, and we have recently been allowed to use it.

(Maybe) Make private static field of std::string type for actual clobber value (or std::string_view type since it can be constexpr-constructed)

This I don't see the need for. If the string_views reference constant strings doesn't that amount to the same thing?

const std::string_view getClobbers() const override {
    return "";
  }

This makes a string view out of a pointer to a const string, but I guess it's up to the compiler how far it can optimise that.

You may be correct but it seems like a small optimisation.

Stoorx updated this revision to Diff 515728.Apr 21 2023, 7:37 AM
Stoorx retitled this revision from [clang] Return `StringRef` from `TargetInfo::getClobbers()` to [clang] Return `std::string_view` from `TargetInfo::getClobbers()`.
Stoorx edited the summary of this revision. (Show Details)

Refactoring according the discussion

I did not set const qualifier for return type because std::string_view is constant by design. Or should I mark it const explicitly?

I did not set const qualifier for return type because std::string_view is constant by design. Or should I mark it const explicitly?

I'm new to string_view but everything I see backs up that it is a constant view on the data as you say. I'm not sure what making it itself const would do, not worth looking into here.

Please update the commit message (here and locally, in case you happen to be using arc, which prefers one and I forget which one). It should include the reasoning for the change, as we've discussed here.

clang/lib/CodeGen/CGBuiltin.cpp
942–943

You dropped a =, please make sure it compiles before updating the review.

Build issues will also show up in the "Build Status" here.

Stoorx updated this revision to Diff 515752.Apr 21 2023, 8:29 AM
Stoorx edited the summary of this revision. (Show Details)

Fix, Rebase & retitle

DavidSpickett accepted this revision.Apr 21 2023, 8:40 AM

If you have commit access already, wait until the pre-commit builds finish and check there isn't anything failing there.

If you don't, I can commit this for you as I did before. I'll do that on Monday.

This revision is now accepted and ready to land.Apr 21 2023, 8:40 AM