Page MenuHomePhabricator

[Utility] Remove Triple{Environment,OS,Vendor}IsUnspecifiedUnknown from ArchSpec
ClosedPublic

Authored by xiaobai on Feb 25 2019, 4:30 PM.

Details

Summary

These functions should always return the opposite of the Triple{Environment,OS,Vendor}WasSpecified functions. Unspecified unknown is the same as unspecified, which is why one set of functions should give us what we want. It's possible to have specified unknown, which is why we can't just rely on checking the enum values of vendor/os/environment. We must also ensure that the names of these are empty and not "unknown".

Diff Detail

Repository
rLLDB LLDB

Event Timeline

xiaobai created this revision.Feb 25 2019, 4:30 PM

Do we actually need to check the string values in addition to the enum values? It looks like the llvm class tries pretty hard to make that the canonical way to query it. I know triples and arches are pretty tricky in lldb though, so maybe there's a reason that we can't.

Do we actually need to check the string values in addition to the enum values? It looks like the llvm class tries pretty hard to make that the canonical way to query it. I know triples and arches are pretty tricky in lldb though, so maybe there's a reason that we can't.

I wasn't sure when I was writing this, but I just tried and the tests seem to pass when we just check the enum values so I presume it's fine.

xiaobai updated this revision to Diff 188309.Feb 25 2019, 10:43 PM

Remove redundant checks for triple components

I think this basically defeats the purpose of the IsUnspecifiedUnknown functions, which was to differentiate between foo-unknown-bar and foo--bar triples. That in itself was a pretty big hack, but it seems that there is functionality which depends on this. There was a discussion about this back in december http://lists.llvm.org/pipermail/lldb-dev/2018-December/014437.html, and IIRC it converged to some sort of a conclusion, but I don't think there hasn't been any effort to try to implement it. I'm not sure what exactly it means about the future of this patch, but I'd be cautious about it.

clayborg requested changes to this revision.Feb 26 2019, 7:13 AM

I seem to remember that we currently expect that there are two cases when it comes to unknowns:

  • specified unknowns
  • unspecific unknowns

A "specified unknown" is when the user actually typed "unknown" in their triple. IIRC there is no "none" to specify that there is no OS, so we use "specified unknowns" for this case. In this case we expect the enum _and_ the string accessor value for vendor and/or OS to be the string "unknown".

"unspecified unkowns" is when the user just types "arrmv7" as the triple. We need to know they didn't specify anything for vendor and OS. In this case the enums return unknown, but the string accessor return values are empty.

So we can't really change this code unless we add a "none" enum and string value to the LLVM triple system so we can tell the difference. Right now we rely on the above mechanism in a lot of places around the code.

This revision now requires changes to proceed.Feb 26 2019, 7:13 AM

A "specified unknown" is when the user actually typed "unknown" in their triple. IIRC there is no "none" to specify that there is no OS, so we use "specified unknowns" for this case. In this case we expect the enum _and_ the string accessor value for vendor and/or OS to be the string "unknown".

This will work when the user types "armv7" as their triple, but when they type "armv7--linux" the vendor gets implicitly set to "unknown" even though the user didn't specify it.

I think this basically defeats the purpose of the IsUnspecifiedUnknown functions, which was to differentiate between foo-unknown-bar and foo--bar triples. That in itself was a pretty big hack, but it seems that there is functionality which depends on this. There was a discussion about this back in december http://lists.llvm.org/pipermail/lldb-dev/2018-December/014437.html, and IIRC it converged to some sort of a conclusion, but I don't think there hasn't been any effort to try to implement it. I'm not sure what exactly it means about the future of this patch, but I'd be cautious about it.

Part of what you said is inconsistent with my understanding. Specifically, if you specify a Triple with the string "foo-unknown-bar" and another Triple with the string "foo--bar" you end up with the same triple. We rely on the llvm::Triple::normalize function to create triples from strings, and there is no way to indicate using this method that a triple has unspecified non-optional parts. We are relying on ArchSpec being created with a Triple object and not a StringRef for this behavior to work.

Yeah it does kind of defeat the purpose of some of these functions. I would like to change it to just IsUnknown (or just delete the functions entirely) because IsUnspecifiedUnknownshould always be the opposite of the WasSpecified functions. My understanding is that the llvm Triple class tries its best to make sure things are marked as the Unknown enum value if they are unspecified or marked as unknown, but there could be some strange edge case I'm not aware of.

Regardless, I would like to add these tests (or similar ones) to document the understanding of when vendors, OSes, and environments are considered unspecified or not.

A "specified unknown" is when the user actually typed "unknown" in their triple. IIRC there is no "none" to specify that there is no OS, so we use "specified unknowns" for this case. In this case we expect the enum _and_ the string accessor value for vendor and/or OS to be the string "unknown".

This will work when the user types "armv7" as their triple, but when they type "armv7--linux" the vendor gets implicitly set to "unknown" even though the user didn't specify it.

I think this basically defeats the purpose of the IsUnspecifiedUnknown functions, which was to differentiate between foo-unknown-bar and foo--bar triples. That in itself was a pretty big hack, but it seems that there is functionality which depends on this. There was a discussion about this back in december http://lists.llvm.org/pipermail/lldb-dev/2018-December/014437.html, and IIRC it converged to some sort of a conclusion, but I don't think there hasn't been any effort to try to implement it. I'm not sure what exactly it means about the future of this patch, but I'd be cautious about it.

Part of what you said is inconsistent with my understanding. Specifically, if you specify a Triple with the string "foo-unknown-bar" and another Triple with the string "foo--bar" you end up with the same triple. We rely on the llvm::Triple::normalize function to create triples from strings, and there is no way to indicate using this method that a triple has unspecified non-optional parts. We are relying on ArchSpec being created with a Triple object and not a StringRef for this behavior to work.

The "llvm::Triple::normalize" must have been added at some point and I missed it. That will mess up some assumptions in LLDB that have been around for a while.

Yeah it does kind of defeat the purpose of some of these functions. I would like to change it to just IsUnknown (or just delete the functions entirely) because IsUnspecifiedUnknownshould always be the opposite of the WasSpecified functions. My understanding is that the llvm Triple class tries its best to make sure things are marked as the Unknown enum value if they are unspecified or marked as unknown, but there could be some strange edge case I'm not aware of.

Regardless, I would like to add these tests (or similar ones) to document the understanding of when vendors, OSes, and environments are considered unspecified or not.

I agree testing is good. The main things I would like to stay consistent are:

  • if any part of a triple isn't specified, we need to be able to tell so that MergeFrom can merge in any unspecified parts (arch, env, os, vendor)
  • if any part of a triple was specified, including "unknown" (since there is no "none" for OS or vendor), then it should remain as is during a MergeFrom

Target triples are used by many plug-ins to determine if they should load or not. The DynamicLoaderStatic is used for bare board development and relies on someone being able to specify "armv7-unknown-unknown" so that it knows the triple is has a specified "unknown" for OS and vendor. Then it loads all images at the address that they are in the file (load address == file address).

I would be all for adding a "none" as a valid vendor, os or environment to the llvm triple class. Then all of the unspecified unknown stuff goes away. But I am not sure how likely the LLVM community will like this change since it won't be useful for non LLDB code.

I agree testing is good. The main things I would like to stay consistent are:

  • if any part of a triple isn't specified, we need to be able to tell so that MergeFrom can merge in any unspecified parts (arch, env, os, vendor)
  • if any part of a triple was specified, including "unknown" (since there is no "none" for OS or vendor), then it should remain as is during a MergeFrom

Target triples are used by many plug-ins to determine if they should load or not. The DynamicLoaderStatic is used for bare board development and relies on someone being able to specify "armv7-unknown-unknown" so that it knows the triple is has a specified "unknown" for OS and vendor. Then it loads all images at the address that they are in the file (load address == file address).

In that case, I plan on doing the following:

  • Removing the IsUnspecifiedUnknown functions.
  • Adding more tests cases so we know if triple parsing changes later in a way that breaks us in the future.
xiaobai updated this revision to Diff 188454.Feb 26 2019, 1:50 PM

Add more tests and rethink approach

xiaobai retitled this revision from [Utility] Allow the value 'unknown' when checking if triple components are unknown to [Utility] Remove Triple{Environment,OS,Vendor}IsUnspecifiedUnknown from ArchSpec.Feb 26 2019, 1:57 PM
xiaobai edited the summary of this revision. (Show Details)

@clayborg I recently ran into a similar issue and I think that perhaps adding explicit llvm::Triple::Any{Vendor|OS|...} enumerators to llvm::Triple to make this distinction explicit would be the cleanest solution.

@clayborg I recently ran into a similar issue and I think that perhaps adding explicit llvm::Triple::Any{Vendor|OS|...} enumerators to llvm::Triple to make this distinction explicit would be the cleanest solution.

Any is fine, we just need a way to say "no os" or "no vendor" or "no environment". The way I am thinking about the way things would be:

If we had a "None":

  • "Any" would be "unknown" (specified or unspecified)
  • "None" would be the new enum/string

If we add "Any":

  • "Any" would be the new enum/string
  • "None" would be "unknown" (specified or unspecified)

Seems like less work to add the "None". Otherwise we end up having to change all much more triple stuff in LLVM because if we init a triple with "armv7", it currently defaults to unknown for os, vendor and env and that would need to change. Or all constructors in LLDB would need to change to call a different constructor that would force any unspecified parts to become "Any". Thoughts?
Not sure which is easier.

clayborg requested changes to this revision.Feb 26 2019, 2:28 PM

Still issues from what I see. We need to know if something was specified or not and with your change we don't know the difference between the "none" state (which is current enum = unknown and string = "unknown") and the just not specified case (which is current enum = unknown and string = <empty>)

This revision now requires changes to proceed.Feb 26 2019, 2:28 PM
clayborg accepted this revision.Feb 26 2019, 2:34 PM

Actually on re-reading the patch, we are detecting if anything wasn't set and setting it. Makes sense.

This revision is now accepted and ready to land.Feb 26 2019, 2:34 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2019, 3:49 PM