This is an archive of the discontinued LLVM Phabricator instance.

[ms] Add @feat.00 magic symbol to COFF object files from AsmParser
AbandonedPublic

Authored by epastor on Nov 26 2019, 5:53 PM.

Details

Reviewers
rnk
craig.topper
Summary

Allows assembly files targeting 32-bit Windows to be linked with /safeseh.

For example, Boost Context distributes GNU-syntax assembly files for its primitives, but if assembled with clang, neither link.exe nor lld-link can link them with /safeseh.

Code ported from X86AsmPrinter.cpp, which already emitted this into X86 assembly... but we need it here to inject it into assembly that doesn't already have the symbol defined. (See http://llvm-cs.pcc.me.uk/lib/Target/X86/X86AsmPrinter.cpp#599)

Event Timeline

epastor created this revision.Nov 26 2019, 5:53 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 26 2019, 5:53 PM
epastor updated this revision to Diff 231159.Nov 26 2019, 5:57 PM
  • Fix comment.
Harbormaster completed remote builds in B41533: Diff 231159.
epastor updated this revision to Diff 231290.Nov 27 2019, 9:51 AM
  • Restrict @feat.00 symbol creation to the initial text segment.
epastor edited the summary of this revision. (Show Details)Nov 27 2019, 12:01 PM
epastor updated this revision to Diff 231315.Nov 27 2019, 1:15 PM
  • Fix tests
  • Add @feat.00 testing to basic-coff{,-64}.s

Build result: fail - 60328 tests passed, 1 failed and 732 were skipped.

failed: LLVM.MC/COFF/feat00.s

Log files: console-log.txt, CMakeCache.txt

Is it possible for someone to write something in their hand written assembly related to SEH that would make this behavior incorrect?

epastor updated this revision to Diff 231319.Nov 27 2019, 1:35 PM
  • Fix missed change

Build result: pass - 60329 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

epastor added a comment.EditedNov 27 2019, 1:50 PM

Is it possible for someone to write something in their hand written assembly related to SEH that would make this behavior incorrect?

Not yet, no - or maybe just not yet without extreme contortions. This is only incorrect if an SEH handler is registered, but not registered for safeseh processing (by listing its symbol index in the .sxdata section).

To register an SEH handler, one normally uses the .seh_handler directive - which LLVM doesn't currently implement for 32-bit configurations. (In MASM, one would use the FRAME parameter to PROC.) This would be somewhere between odd and impossible to do by hand.

If at some later date we implement .seh_handler for 32-bit configurations, or (as I'm working on) support PROC when building via a new llvm-ml driver, this would become incorrect; we'd need to set this bit iff the user specified safeseh to the assembler, as ml.exe does with the /safeseh command-line switch.

rnk requested changes to this revision.Nov 27 2019, 3:21 PM

I don't think we should do this implicitly. Assembler should be a low level representation of what the programmer wants in the symbol table. That includes @feat.00. There is existing nasm documentation covering this. Basically, this is something the programmer has to know if they want to target windows:
https://nasm.us/doc/nasmdoc7.html

There are a couple of other @feat.00 bits that the compiler could set. Wouldn't this create problems then with clang -save-temps, where we parse the .s output and assemble it? I'd expect the default @feat.00 and the explicit one in the .s file to conflict.

This revision now requires changes to proceed.Nov 27 2019, 3:21 PM
rnk added a comment.Nov 27 2019, 3:22 PM

Another way to look at it would be, if you round-tripped assembly through llvm-mc repeatedly, would you accumulate @feat.00 symbol definitions in the textual assembly?

epastor added a comment.EditedNov 27 2019, 3:36 PM
In D70757#1762394, @rnk wrote:

There is existing nasm documentation covering this. Basically, this is something the programmer has to know if they want to target windows:
https://nasm.us/doc/nasmdoc7.html

Thanks, Reid! So to clarify: our standards for this are to match nasm for the current interface?

As for roundtripping... I'll have to check, but it seems likely to be a problem. If necessary, I'll withdraw this and leave it to be managed by whatever interface ends up used to emulate MASM. (ml.exe and ml64.exe both emit @feat.00 implicitly into their symbol tables, with ml.exe setting the LSB if /safeseh is passed.)

In D70757#1762394, @rnk wrote:

I don't think we should do this implicitly. Assembler should be a low level representation of what the programmer wants in the symbol table. That includes @feat.00. There is existing nasm documentation covering this. Basically, this is something the programmer has to know if they want to target windows:
https://nasm.us/doc/nasmdoc7.html

Hang on. NASM documents the following:

Former can be easily achieved with any NASM version by adding following line to source code:

$@feat.00 equ 1

As of version 2.03 NASM adds this absolute symbol automatically. If it's not already present to be precise.

Would it be acceptable to match NASM, adding the symbol automatically unless it's already defined? Or do we want to stick closer to the ideal that we emit no symbols other than those explicitly given?

epastor updated this revision to Diff 231734.Dec 2 2019, 10:31 AM
  • Switch to adding @feat.00 as a final symbol, and only if not defined.

Build result: pass - 60329 tests passed, 0 failed and 732 were skipped.

Log files: console-log.txt, CMakeCache.txt

rnk added a comment.Dec 2 2019, 1:12 PM

Thanks, Reid! So to clarify: our standards for this are to match nasm for the current interface?

No, LLVM MC is modeled after GNU binutils as. It's not strictly compatible, since we had to extend the .section directive for COFF to express certain extensions, but binutils is usually the first place we look for prior art. Nasm is just a good second place to look for how other assemblers handle common issues.

As for roundtripping... I'll have to check, but it seems likely to be a problem. If necessary, I'll withdraw this and leave it to be managed by whatever interface ends up used to emulate MASM. (ml.exe and ml64.exe both emit @feat.00 implicitly into their symbol tables, with ml.exe setting the LSB if /safeseh is passed.)

In the spirit of following gnu as when possible, I think it would be better to make this conditional on some kind of masm mode setting.

epastor abandoned this revision.Oct 19 2020, 1:52 PM