This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix reading i686-windows executables with GNU environment
ClosedPublic

Authored by mstorsjo on Jun 21 2022, 5:51 AM.

Details

Summary

25c8a061c5739677d2fc0af29a8cc9520207b923 / D127048 added an option
for setting the ABI to GNU.

When an object file is loaded, there's only minimal verification
done for the architecture spec set for it, if the object file only
provides one.

However, for i386 object files, the PECOFF object file plugin
provides two architectures, i386-pc-windows and i686-pc-windows.
This picks a totally different codepath in
TargetList::CreateTargetInternal, where it's treated as a fat
binary. This goes through more verifications to see if the
architectures provided by the object file matches what the
platform plugin supports.

The PlatformWindows() constructor explicitly adds the
"i386-pc-windows" and "i686-pc-windows" architectures (even when
running on other architectures), which allows this "fat binary
verification" to succeed for the i386 object files that provide
two architectures.

However, after that commit, if the object file is advertised with
the different environment (either when lldb is built in a mingw
environment, or if that setting is set), the fat binary validation
won't accept the file any longer.

Update ArchSpec::IsEqualTo with more logic for the Windows use
cases; mismatching vendors is not an issue (they don't have any
practical effect on Windows), and GNU and MSVC environments are
compatible to the point that PlatformWindows can handle object
files for both environments/ABIs.

As a separate path forward, one could also consider to stop returning
two architecture specs from ObjectFilePECOFF::GetModuleSpecifications
for i386 files.

Diff Detail

Event Timeline

mstorsjo created this revision.Jun 21 2022, 5:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2022, 5:51 AM
mstorsjo requested review of this revision.Jun 21 2022, 5:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2022, 5:51 AM

However, after this commit, if the object file is advertised with the different environment (either when built in a mingw environment, or if that setting is set), the fat binary validation won't accept the file any longer.

Is "this commit" referring to https://reviews.llvm.org/rG25c8a061c5739677d2fc0af29a8cc9520207b923 or to the change we're reviewing here?

I'm struggling to think through what this is actually fixing. Is the issue that one half of a fat binary can have a different ABI? Wouldn't both sides of the fat binary be loaded as the ABI chosen in the setting, or is that exactly what you're fixing.

lldb/source/Utility/ArchSpec.cpp
1006

You could use isOSWindows if you want. Maybe use an intermediate bool to clear up the if statement.

bool both_windows = lhs_triple.isOSWindows() && rhs_triple.isOSWindows();
if ((lhs_triple_vendor != rhs_triple_vendor) && 
(exact_match || !both_windows)) {
alvinhochun added a comment.EditedJun 21 2022, 7:00 AM

Yes, this (patch) looks right to me.

However, after this commit, if the object file is advertised with the different environment (either when built in a mingw environment, or if that setting is set), the fat binary validation won't accept the file any longer.

Is "this commit" referring to https://reviews.llvm.org/rG25c8a061c5739677d2fc0af29a8cc9520207b923 or to the change we're reviewing here?

Sorry, I meant 25c8a061c5739677d2fc0af29a8cc9520207b923.

I'm struggling to think through what this is actually fixing. Is the issue that one half of a fat binary can have a different ABI? Wouldn't both sides of the fat binary be loaded as the ABI chosen in the setting, or is that exactly what you're fixing.

The issue is that in the case of a fat binary, we run a loop to check and pick an architecture out of it: https://github.com/llvm/llvm-project/blob/main/lldb/source/Target/TargetList.cpp#L173-L184 This is checked against the architectures added here: https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/Platform/Windows/PlatformWindows.cpp#L127-L131

(In the case of a non-fat binary, i.e. all other architectures than i386/i686 on PECOFF, we hit this case here, where we don't validate the arch spec at all: https://github.com/llvm/llvm-project/blob/main/lldb/source/Target/TargetList.cpp#L160-L164)

Now since the addition of the new setting in 25c8a061c5739677d2fc0af29a8cc9520207b923, ObjectFile::GetModuleSpecifications can return i386-pc-windows-gnu and i686-pc-windows-gnu, while PlatformWindows provides a list containing i386-pc-windows (which is normalized to i386-pc-windows-msvc) and i686-pc-windows (...-msvc). Due to the differing environments, these are deemed incompatible, and LLDB flat out refuses to load the file, with the error message error: no matching platforms found for this file.

Now since the addition of the new setting in 25c8a061c5739677d2fc0af29a8cc9520207b923, ObjectFile::GetModuleSpecifications can return i386-pc-windows-gnu and i686-pc-windows-gnu, while PlatformWindows provides a list containing i386-pc-windows (which is normalized to i386-pc-windows-msvc) and i686-pc-windows (...-msvc). Due to the differing environments, these are deemed incompatible, and LLDB flat out refuses to load the file, with the error message error: no matching platforms found for this file.

Cool, now I understand. Can you add something like that as a comment in the test file? Without that context the test seems not to be testing much at all.

mstorsjo updated this revision to Diff 438733.Jun 21 2022, 9:13 AM

Added a comment to the testcase.

mstorsjo edited the summary of this revision. (Show Details)Jun 21 2022, 9:14 AM
DavidSpickett accepted this revision.Jun 22 2022, 2:09 AM

LGTM. I like the ifs the way I suggested but up to you, either way it's not the easiest logic to parse.

This revision is now accepted and ready to land.Jun 22 2022, 2:09 AM

LGTM. I like the ifs the way I suggested but up to you, either way it's not the easiest logic to parse.

Oh, sorry, I forgot about that comment. Yes, that does indeed look even nicer, I'll try to add that.

mstorsjo updated this revision to Diff 438951.Jun 22 2022, 2:36 AM

Refactored the code to add the both_windows variable as suggested (which turned out much nicer, thanks!).

As a separate path forward, one could also consider to stop returning
two architecture specs from ObjectFilePECOFF::GetModuleSpecifications
for i386 files.

I think that would be much more preferable. I think the current implementation GetModuleSpecifications is based on a misconception about what the multiple results of that function mean (a fat binary -- as you've discovered).

We wouldn't want to return armv6-*-* (or armv7, v8, ...) for a file that claims it can be run on armv5. Elf files also return just a single result for i386 file.

As a separate path forward, one could also consider to stop returning
two architecture specs from ObjectFilePECOFF::GetModuleSpecifications
for i386 files.

I think that would be much more preferable. I think the current implementation GetModuleSpecifications is based on a misconception about what the multiple results of that function mean (a fat binary -- as you've discovered).

We wouldn't want to return armv6-*-* (or armv7, v8, ...) for a file that claims it can be run on armv5. Elf files also return just a single result for i386 file.

Yep, that's probably a good path forward for the future - I was just weary of that possibly opening another huge can of worms. And regardless of that, I think the functional change of this patch, allowing considering various variations of triples that commonly occur on windows as compatible, is sensible on its own (such issues have caused unnecessary tweaking back and forth once or twice before too).

As a separate path forward, one could also consider to stop returning
two architecture specs from ObjectFilePECOFF::GetModuleSpecifications
for i386 files.

I think that would be much more preferable. I think the current implementation GetModuleSpecifications is based on a misconception about what the multiple results of that function mean (a fat binary -- as you've discovered).

We wouldn't want to return armv6-*-* (or armv7, v8, ...) for a file that claims it can be run on armv5. Elf files also return just a single result for i386 file.

If we'd just set this to the baseline, i386, would that have any effect for how lldb e.g. is able to disassemble/interpret instructions that don't exist in the i386 baseline architecture?

As a separate path forward, one could also consider to stop returning
two architecture specs from ObjectFilePECOFF::GetModuleSpecifications
for i386 files.

I think that would be much more preferable. I think the current implementation GetModuleSpecifications is based on a misconception about what the multiple results of that function mean (a fat binary -- as you've discovered).

We wouldn't want to return armv6-*-* (or armv7, v8, ...) for a file that claims it can be run on armv5. Elf files also return just a single result for i386 file.

If we'd just set this to the baseline, i386, would that have any effect for how lldb e.g. is able to disassemble/interpret instructions that don't exist in the i386 baseline architecture?

It should not have any effect (if it does, that's a separate fix). In the disassembler, we explicitly enable the latest instruction set, and I can't think of anything else that would be impacted by it.

If we'd just set this to the baseline, i386, would that have any effect for how lldb e.g. is able to disassemble/interpret instructions that don't exist in the i386 baseline architecture?

It should not have any effect (if it does, that's a separate fix). In the disassembler, we explicitly enable the latest instruction set, and I can't think of anything else that would be impacted by it.

Thanks - I did some cursory testing with removing the extra i686 everywhere, and at least on a quick test, it seems to work just fine (and requires a minor adjustment to only one testcase).

I found that this duality was introduced in 5e6f45201f0b62c1e7a24fc396f3ea6e10dc880d / D7120 and ad587ae4ca143d388c0ec4ef2faa1b5eddedbf67 / D4658 (CC @zturner), what do you make out of the reasonings in those commits?

If we'd just set this to the baseline, i386, would that have any effect for how lldb e.g. is able to disassemble/interpret instructions that don't exist in the i386 baseline architecture?

It should not have any effect (if it does, that's a separate fix). In the disassembler, we explicitly enable the latest instruction set, and I can't think of anything else that would be impacted by it.

Thanks - I did some cursory testing with removing the extra i686 everywhere, and at least on a quick test, it seems to work just fine (and requires a minor adjustment to only one testcase).

I found that this duality was introduced in 5e6f45201f0b62c1e7a24fc396f3ea6e10dc880d / D7120 and ad587ae4ca143d388c0ec4ef2faa1b5eddedbf67 / D4658 (CC @zturner), what do you make out of the reasonings in those commits?

The first patch seems like it's just working around some mismatches in the windows dynamic loader plugin. I think a better approach would be to have the dynamic loader claim both architectures, though I don't think that is necessary if we're just consistent about what we use. I don't see anything wrong with the second patch (the darwin platform does something similar for arm architectures, even though I'm not convinced it's necessary (the reason it's necessary for darwin is because there they actually make a distinction between armv6XX and armv7YY and treat those as different architectures).

I found that this duality was introduced in 5e6f45201f0b62c1e7a24fc396f3ea6e10dc880d / D7120 and ad587ae4ca143d388c0ec4ef2faa1b5eddedbf67 / D4658 (CC @zturner), what do you make out of the reasonings in those commits?

The first patch seems like it's just working around some mismatches in the windows dynamic loader plugin. I think a better approach would be to have the dynamic loader claim both architectures, though I don't think that is necessary if we're just consistent about what we use. I don't see anything wrong with the second patch (the darwin platform does something similar for arm architectures, even though I'm not convinced it's necessary (the reason it's necessary for darwin is because there they actually make a distinction between armv6XX and armv7YY and treat those as different architectures).

The odd thing about the second one is the added hardcoded AddArch(ArchSpec("i686-pc-windows")); and AddArch(ArchSpec("i386-pc-windows")); at the top of PlatformWindows.cpp.

Anyway, it does seem to work fine in my quick test to get rid of this duality, so I'll post a patch that does that.

The odd thing about the second one is the added hardcoded AddArch(ArchSpec("i686-pc-windows")); and AddArch(ArchSpec("i386-pc-windows")); at the top of PlatformWindows.cpp.

Anyway, it does seem to work fine in my quick test to get rid of this duality, so I'll post a patch that does that.

I think those hardcoded lines are there precisely because of the duality (i.e. wanting to support both i386 and i686 regardless of what the host layer reports). If that is removed, maybe all of that can be changed to the pattern used in other platform plugins.

The odd thing about the second one is the added hardcoded AddArch(ArchSpec("i686-pc-windows")); and AddArch(ArchSpec("i386-pc-windows")); at the top of PlatformWindows.cpp.

Anyway, it does seem to work fine in my quick test to get rid of this duality, so I'll post a patch that does that.

I think those hardcoded lines are there precisely because of the duality (i.e. wanting to support both i386 and i686 regardless of what the host layer reports).

Both that, and because the i386+i686 fat binary arches from ObjectFile triggered the more elaborate arch validation, these had to be present to make it work. Anyway, after getting rid of the duality, those hardcoded ones don't seem to be needed either.

If that is removed, maybe all of that can be changed to the pattern used in other platform plugins.

That looks reasonable!