Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
ELF/Config.h | ||
---|---|---|
47–49 ↗ | (On Diff #69090) | I wouldn't think that this class is designed well. Name and Wildcard are mutually exclusive, and if hasWildcard(Sym.Name) is false, you need to use Sym.Wildcard. That's not easy to understand. |
49 ↗ | (On Diff #69090) | This should be named Re. |
ELF/LinkerScript.cpp | ||
130–132 ↗ | (On Diff #69090) | Isn't this slow? |
ELF/Strings.cpp | ||
33 ↗ | (On Diff #69090) | This converts "[*]" to "[.*]" which is wrong. |
46 ↗ | (On Diff #69090) | You need to remove this function. This is not even a "glob" match anymore. |
ELF/Config.h | ||
---|---|---|
47–49 ↗ | (On Diff #69090) | That is because I am trying not to use regex matching when it is unnessecary, like in next case: SymbolBody *B = Sym.IsExternCpp ? Demangled[Sym.Name] : find(Sym.Name); I can probably introduce some helper method hiding this logic. |
ELF/Strings.cpp | ||
33 ↗ | (On Diff #69090) | Before this patch we did not have support of "[" or "]". This patch does not add any testcases as well as does not add support for them. |
ELF/Strings.cpp | ||
---|---|---|
33 ↗ | (On Diff #69090) | The real problem here is that is converts ".foo" to ".foo" (unchanged), what is incorrect. I`ll add testcase and fix that. |
ELF/LinkerScript.cpp | ||
---|---|---|
130–132 ↗ | (On Diff #69090) | I think this should not be slow. Even FreeBSD script |
- Fixed bug (".data" expression was generated from ".data" instead of "\.data"), added testcase.
- Addressed review comments.
ELF/Config.h | ||
---|---|---|
47–49 ↗ | (On Diff #69090) | I think we do not need that new field at all. It can be created on demand since only used in scanVersionScript(). I removed it. |
ELF/Strings.cpp | ||
46 ↗ | (On Diff #69090) | Done. |
ELF/LinkerScript.cpp | ||
---|---|---|
836–841 ↗ | (On Diff #69095) | This function can return a std::regex instead of a vector of std::regexs by concatenating regexs using |. |
ELF/Strings.cpp | ||
29–30 ↗ | (On Diff #69095) | This is not a state machine. |
ELF/SymbolTable.cpp | ||
486 ↗ | (On Diff #69095) | clang-format |
ELF/SymbolTable.h | ||
94 ↗ | (On Diff #69095) | clang-format |
Our software requirements include that GCC should be >= 4.7.0: http://llvm.org/docs/GettingStarted.html#software
However, std::regex wasn't implemented until GCC 4.9.0
Could you use llvm::Regex instead?
ELF/LinkerScript.cpp | ||
---|---|---|
836–841 ↗ | (On Diff #69095) | "|" does not work with std::regex::basic, because not a part of basic regular expressions syntax (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05). I guess removing std::regex::basic should be slower while there is no real profit. |
105 ↗ | (On Diff #69675) | I dont like this and other const casts I had to use, |
ELF/Strings.cpp | ||
29–30 ↗ | (On Diff #69095) | Hmm technically it can be I think. Like from initial state basing on input it do a transition to state N, I changed comment anyways :) |
ELF/LinkerScript.cpp | ||
---|---|---|
95 ↗ | (On Diff #69675) | Nit: we probably should use Re as a local variable name throughout this patch instead of Pat, Pattern nor Patterns because it is no longer a glob pattern (which Pat was named after) but a regular expression. |
847–852 ↗ | (On Diff #69675) | It shouldn't be slow. I did not actually take a look at the implementation, but at least in theory, any regular expression that we use can be compiled to a DFA because we do not use anything that requires NFA such as capturing. If you can't use | because you are using the basic re, you shouldn't use the basic re but instead extended one. |
ELF/LinkerScript.h | ||
97 ↗ | (On Diff #69675) | Why unique_ptr? |
132 ↗ | (On Diff #69675) | Why pointers? |
ELF/Strings.h | ||
19 ↗ | (On Diff #69675) | Let's name this compileGlobPattern because it complies a glob pattern to a re. |
ELF/LinkerScript.cpp | ||
---|---|---|
95 ↗ | (On Diff #69675) | Done. |
847–852 ↗ | (On Diff #69675) | Ok, I see. I can't just switch current: std::vector<llvm::Regex> ExcludedFiles; std::vector<llvm::Regex> SectionPatterns; to llvm::Regex ExcludedFiles; Because as I mantioned above, llvm::Regex does not have constructor with no arguments. |
ELF/LinkerScript.h | ||
97 ↗ | (On Diff #69675) | In compare with std::regex, llvm::Regex does not have constructor with no arguments. |
132 ↗ | (On Diff #69675) | Restriction of API. llvm::Regex is non-copyable. So I cant just copy kept patterns into KeptSections. |
106 ↗ | (On Diff #69826) | Not sure, do we want to rename SectionPatterns and FilePattern to something then ? |
ELF/LinkerScript.cpp | ||
---|---|---|
907–912 ↗ | (On Diff #69826) | Why don't you add a constructor with no arguments to llvm::Regex? |
ELF/LinkerScript.cpp | ||
---|---|---|
907–912 ↗ | (On Diff #69826) | OK, I can do that. What do you think about const_casts this patch introduces ? |
ELF/LinkerScript.cpp | ||
---|---|---|
907–912 ↗ | (On Diff #69826) | One more variant is to leave const_casts as is for now. |
llvm::Regex is not used in that many places, so it seems you can make Match const and update all places where that member function is called.
Actually it is. Just search "Regex" in llvm. I wanted to do it const here: D24027. It requires making error field to be mutable.
David Blaikie was against this change:
"Sounds like the error handling is part of the identity of the object - I'm not sure it'd be a good idea to make that mutable & make the object logically const when it's actually changing that error field. (it seems it'd be easy to introduce misuse there - where one user does something to the Regex, then passes a const ref somewhere, then examines the error field and finds it changed - let alone multithreading)"
His suggestion was:
"(or remove the side-channel error handling, and have the APIs return errors explicitly instead - the side channel's likely to be accidentally unchecked by many callers, which is probably a mistake)"
Reworking llvm::Regex API is not a little change and I am not sure it is worth that as std::regex is a part of c++ 11 and I hope soon all compilers will have a support.
How about defining a copy constructor to llvm::Regex and create mutable copies from const Regexs? Regex seems to be just two-pointer-sized object so copying it should be cheap.
I did not mean we should really do that, only that it is not that simple just to copy it properly.
Probably we can remove the set of error field from match() at all.
It has next code:
if (rc != 0) { // regexec can fail due to invalid pattern or running out of memory. error = rc; return false; }
I do not think that set a srror in case of invalid pattern is fine. As we might want to call it second time with valid one.
Out of memory probably is the error we can ignore.
Method anyways return bool result, that should be enough.
ELF/LinkerScript.cpp | ||
---|---|---|
96–97 ↗ | (On Diff #69983) | if (Re->match(S->getSectionName())) |
ELF/LinkerScript.h | ||
101 ↗ | (On Diff #69983) | Please do not use the implicit conversion from an object of type T to ArrayRef<T>. That confused me. Add {} around FilePattern. You also want to rename compileGlobPatterns since now it takes not a single glob pattern but many patterns. |
107 ↗ | (On Diff #69983) | I'd rename SectionRe and FileRe. |
ELF/Strings.cpp | ||
31 ↗ | (On Diff #69983) | There's a room to simplify this function. First , define a function that compiles single glob pattern. static std::string toRegex(StringRef S); and then use it as Regex elf::compileGlobPatterns(ArrayRef<StringRef> V) { std::string T = "^(" + toRegex(V[0]); for (StringRef S : V.slice(1)) T += "|" + toRegex(S); return Regex(T + ")$"); } |
43 ↗ | (On Diff #69983) | . is not the only one metacharacter we want to quote. For example, if it contains \, it needs to be converted to \\. Did you do that? |
ELF/LinkerScript.h | ||
---|---|---|
101 ↗ | (On Diff #69983) | Ok, I just was unsure what is the best way to avoid that. Was thinking about separate method for single pattern compilation, unfortunately it needs to be exported so rejected that way. |
- Addressed review comments.
- Returned const_casts for Regex.
ELF/LinkerScript.cpp | ||
---|---|---|
96–97 ↗ | (On Diff #69983) | Done. |
ELF/LinkerScript.h | ||
107 ↗ | (On Diff #69983) | Done. |
ELF/Strings.cpp | ||
31 ↗ | (On Diff #69983) | Done. |
43 ↗ | (On Diff #69983) | Nope. Looking on the list of metacharacters I dont think I ever saw them in globs. We also do not have testcases with them yet. Do you think this can be done in separate patch ? |
ELF/Strings.cpp | ||
---|---|---|
42 ↗ | (On Diff #70167) | Are you sure? What will happen if you have this? SECTIONS { .text : { "foo(bar*).o"( .text ); } } where "foo(bar*).o" is an actual file name. |
ELF/Strings.cpp | ||
---|---|---|
42 ↗ | (On Diff #70167) | I did not say it is not possible. I meant it is so unusual that we even do not have testcases covering that. And sutuation is unlikely to happen, that is why I suggested to prepare testcases and separate patch for that. |
test/ELF/wildcards2.s | ||
---|---|---|
18 ↗ | (On Diff #70210) | It is not possible to create file with "|" "/" or "\" in name, so this is not tested here. |
ELF/Strings.cpp | ||
---|---|---|
35–36 ↗ | (On Diff #70210) | for (char F : S) By the way, why F? |
41 ↗ | (On Diff #70210) | StringRef(".+^${}()|/\\[]").find_first_of(F) is probably more conventional. |
42 ↗ | (On Diff #70210) | What is this std::string() for? |
test/ELF/wildcards2.s | ||
18 ↗ | (On Diff #70210) | You don't need to test this this vigorously. It is too easy to break on minor filesystems. Add a test just for .. |
test/ELF/wildcards2.s | ||
---|---|---|
18 ↗ | (On Diff #70210) | Posted this comment too early. Will be done in updated diff I mean. |
- Addressed review comments.
test/ELF/wildcards2.s | ||
---|---|---|
18 ↗ | (On Diff #70210) | So, removed the last added testcase because first one already do check for dot. |
They are already. D24102 looks will not be accepted and the latest diffs of current patch does not require it (that is why it has const_casts for Regex).
ELF/LinkerScript.cpp | ||
---|---|---|
103–105 ↗ | (On Diff #70223) | This is the same as return const_cast<Regex &>(Desc->FileRe).match(Filename) && !const_cast<Regex &>(Desc->ExcludedFiles).match(Filename); |
ELF/LinkerScript.h | ||
101 ↗ | (On Diff #70223) | compileGlobPatterns (notice the last s.) |
106 ↗ | (On Diff #70223) | Maybe this should be ExcludedFileRe. |
ELF/Strings.cpp | ||
35–36 ↗ | (On Diff #70223) | I think we usually use C for a character. |
50 ↗ | (On Diff #70223) | This is not a method but just a function. It doesn't convert into a regex form, but converts into a regex object. It always takes multiple glob patterns (it's just that the array could contain just one element). |
ELF/Strings.h | ||
19 ↗ | (On Diff #70223) | S doesn't match with the actual parameter name. |
Thanks for review !
ELF/Strings.cpp | ||
---|---|---|
51 ↗ | (On Diff #70230) | Because Patterns. Actually choosed between V and P. Will switch to P then. |
ELF/Strings.cpp | ||
---|---|---|
51 ↗ | (On Diff #70230) | Actually I found that it is easy to read if you name variables of a small scope according to it's type rather than what the variable has, so please stick with V or Arr. |
ELF/Strings.cpp | ||
---|---|---|
51 ↗ | (On Diff #70230) | We know that the argument is an array of glob patterns. Because the function name and the comment says so. But if you name it P, it'd make everybody wonder what P is. At least, it doesn't make much sense to me. |