These functions should always return the opposite of the Triple{Environment,OS,Vendor}WasSpecified functions. Unspecified unknown is the same as unspecified, which is why one set of functions should give us what we want. It's possible to have specified unknown, which is why we can't just rely on checking the enum values of vendor/os/environment. We must also ensure that the names of these are empty and not "unknown".
Details
- Reviewers
aprantl labath JDevlieghere clayborg • espindola - Commits
- rGbee015efb5ff: [Utility] Remove Triple{Environment,OS,Vendor}IsUnspecifiedUnknown from ArchSpec
rL354933: [Utility] Remove Triple{Environment,OS,Vendor}IsUnspecifiedUnknown from ArchSpec
rLLDB354933: [Utility] Remove Triple{Environment,OS,Vendor}IsUnspecifiedUnknown from ArchSpec
Diff Detail
- Build Status
Buildable 28534 Build 28533: arc lint + arc unit
Event Timeline
Do we actually need to check the string values in addition to the enum values? It looks like the llvm class tries pretty hard to make that the canonical way to query it. I know triples and arches are pretty tricky in lldb though, so maybe there's a reason that we can't.
I wasn't sure when I was writing this, but I just tried and the tests seem to pass when we just check the enum values so I presume it's fine.
I think this basically defeats the purpose of the IsUnspecifiedUnknown functions, which was to differentiate between foo-unknown-bar and foo--bar triples. That in itself was a pretty big hack, but it seems that there is functionality which depends on this. There was a discussion about this back in december http://lists.llvm.org/pipermail/lldb-dev/2018-December/014437.html, and IIRC it converged to some sort of a conclusion, but I don't think there hasn't been any effort to try to implement it. I'm not sure what exactly it means about the future of this patch, but I'd be cautious about it.
I seem to remember that we currently expect that there are two cases when it comes to unknowns:
- specified unknowns
- unspecific unknowns
A "specified unknown" is when the user actually typed "unknown" in their triple. IIRC there is no "none" to specify that there is no OS, so we use "specified unknowns" for this case. In this case we expect the enum _and_ the string accessor value for vendor and/or OS to be the string "unknown".
"unspecified unkowns" is when the user just types "arrmv7" as the triple. We need to know they didn't specify anything for vendor and OS. In this case the enums return unknown, but the string accessor return values are empty.
So we can't really change this code unless we add a "none" enum and string value to the LLVM triple system so we can tell the difference. Right now we rely on the above mechanism in a lot of places around the code.
This will work when the user types "armv7" as their triple, but when they type "armv7--linux" the vendor gets implicitly set to "unknown" even though the user didn't specify it.
Part of what you said is inconsistent with my understanding. Specifically, if you specify a Triple with the string "foo-unknown-bar" and another Triple with the string "foo--bar" you end up with the same triple. We rely on the llvm::Triple::normalize function to create triples from strings, and there is no way to indicate using this method that a triple has unspecified non-optional parts. We are relying on ArchSpec being created with a Triple object and not a StringRef for this behavior to work.
Yeah it does kind of defeat the purpose of some of these functions. I would like to change it to just IsUnknown (or just delete the functions entirely) because IsUnspecifiedUnknownshould always be the opposite of the WasSpecified functions. My understanding is that the llvm Triple class tries its best to make sure things are marked as the Unknown enum value if they are unspecified or marked as unknown, but there could be some strange edge case I'm not aware of.
Regardless, I would like to add these tests (or similar ones) to document the understanding of when vendors, OSes, and environments are considered unspecified or not.
The "llvm::Triple::normalize" must have been added at some point and I missed it. That will mess up some assumptions in LLDB that have been around for a while.
Yeah it does kind of defeat the purpose of some of these functions. I would like to change it to just IsUnknown (or just delete the functions entirely) because IsUnspecifiedUnknownshould always be the opposite of the WasSpecified functions. My understanding is that the llvm Triple class tries its best to make sure things are marked as the Unknown enum value if they are unspecified or marked as unknown, but there could be some strange edge case I'm not aware of.
Regardless, I would like to add these tests (or similar ones) to document the understanding of when vendors, OSes, and environments are considered unspecified or not.
I agree testing is good. The main things I would like to stay consistent are:
- if any part of a triple isn't specified, we need to be able to tell so that MergeFrom can merge in any unspecified parts (arch, env, os, vendor)
- if any part of a triple was specified, including "unknown" (since there is no "none" for OS or vendor), then it should remain as is during a MergeFrom
Target triples are used by many plug-ins to determine if they should load or not. The DynamicLoaderStatic is used for bare board development and relies on someone being able to specify "armv7-unknown-unknown" so that it knows the triple is has a specified "unknown" for OS and vendor. Then it loads all images at the address that they are in the file (load address == file address).
I would be all for adding a "none" as a valid vendor, os or environment to the llvm triple class. Then all of the unspecified unknown stuff goes away. But I am not sure how likely the LLVM community will like this change since it won't be useful for non LLDB code.
In that case, I plan on doing the following:
- Removing the IsUnspecifiedUnknown functions.
- Adding more tests cases so we know if triple parsing changes later in a way that breaks us in the future.
@clayborg I recently ran into a similar issue and I think that perhaps adding explicit llvm::Triple::Any{Vendor|OS|...} enumerators to llvm::Triple to make this distinction explicit would be the cleanest solution.
Any is fine, we just need a way to say "no os" or "no vendor" or "no environment". The way I am thinking about the way things would be:
If we had a "None":
- "Any" would be "unknown" (specified or unspecified)
- "None" would be the new enum/string
If we add "Any":
- "Any" would be the new enum/string
- "None" would be "unknown" (specified or unspecified)
Seems like less work to add the "None". Otherwise we end up having to change all much more triple stuff in LLVM because if we init a triple with "armv7", it currently defaults to unknown for os, vendor and env and that would need to change. Or all constructors in LLDB would need to change to call a different constructor that would force any unspecified parts to become "Any". Thoughts?
Not sure which is easier.
Still issues from what I see. We need to know if something was specified or not and with your change we don't know the difference between the "none" state (which is current enum = unknown and string = "unknown") and the just not specified case (which is current enum = unknown and string = <empty>)
Actually on re-reading the patch, we are detecting if anything wasn't set and setting it. Makes sense.