This is an archive of the discontinued LLVM Phabricator instance.

[ms] [llvm-ml] Add support for ALIGN, EVEN, and ORG directives
ClosedPublic

Authored by epastor on Dec 2 2020, 1:21 PM.

Details

Summary

Match ML.EXE's behavior for ALIGN, EVEN, and ORG directives both at file level and in STRUCTs.

We currently reject negative offsets passed to ORG inside STRUCTs (in ML.EXE and ML64.EXE, they wrap around as for an unsigned 32-bit integer).

Also, if a STRUCT is declared using an ORG directive, no value of that type can be defined.

Diff Detail

Event Timeline

epastor created this revision.Dec 2 2020, 1:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2020, 1:21 PM
epastor requested review of this revision.Dec 2 2020, 1:21 PM
thakis added a comment.Dec 3 2020, 6:51 AM

Does it wrap around 32-bit in 64-bit output too?

Is this relied on? Or could we error out on negative values instead?

thakis requested changes to this revision.Dec 3 2020, 6:52 AM

(marking "request changes" just so it disappears form my dashboard and can reappear once you've answered -- doesn't necessarily need changes)

This revision now requires changes to proceed.Dec 3 2020, 6:52 AM

Does it wrap around 32-bit in 64-bit output too?

Oddly enough: yes.

Is this relied on? Or could we error out on negative values instead?

I don't think we can prove that it isn't relied on, especially when ORG explicitly allows putting things "outside bounds", so I'd rather maintain compatibility.

epastor requested review of this revision.Dec 10 2020, 7:07 AM
epastor updated this revision to Diff 310890.Dec 10 2020, 7:37 AM

Rebase on parent

Does it wrap around 32-bit in 64-bit output too?

Oddly enough: yes.

Is this relied on? Or could we error out on negative values instead?

I don't think we can prove that it isn't relied on, especially when ORG explicitly allows putting things "outside bounds", so I'd rather maintain compatibility.

Well, we don't need to prove it if we make it an error: We'll know if the diag fires. So if we make this an error first and then build lots of files with it, we'll learn if we run into it. At that point we'll know we need to make it compatible, but maybe we won't have to. Doesn't it seem better to default to less surprising behavior and only put in the compat warts once there's evidence that we need them?

thakis requested changes to this revision.Dec 15 2020, 7:52 AM

(marking "request changes" just so it disappears form my dashboard and can reappear once you've answered -- doesn't necessarily need changes)

This revision now requires changes to proceed.Dec 15 2020, 7:52 AM
epastor updated this revision to Diff 331307.Mar 17 2021, 10:27 AM

Emit an error on negative ORG offsets, per feedback.

We can switch this to a warning if we find that real MASM code relies on the 32-bit wraparound found in ml.exe.

epastor updated this revision to Diff 331310.Mar 17 2021, 10:29 AM

Restore the explicit cast to make it easier to switch to a Warning later

epastor edited the summary of this revision. (Show Details)Mar 17 2021, 10:31 AM
thakis accepted this revision.Apr 16 2021, 8:00 AM
thakis added inline comments.
llvm/lib/MC/MCParser/MasmParser.cpp
3673–3677

we do this in a bunch of places -- maybe Struct should have a updateSizeForField()?

Many of these "add a field" functions look a bit repetitive -- not sure if that's the best way to cut back on that, or if there's a bigger win to be had.

4540

can you add a comment here saying what an empty field name means? ("fields in nameless struct get added to containing struct")

This revision is now accepted and ready to land.Apr 16 2021, 8:00 AM

Does it wrap around 32-bit in 64-bit output too?

Oddly enough: yes.

Is this relied on? Or could we error out on negative values instead?

I don't think we can prove that it isn't relied on, especially when ORG explicitly allows putting things "outside bounds", so I'd rather maintain compatibility.

Well, we don't need to prove it if we make it an error: We'll know if the diag fires. So if we make this an error first and then build lots of files with it, we'll learn if we run into it. At that point we'll know we need to make it compatible, but maybe we won't have to. Doesn't it seem better to default to less surprising behavior and only put in the compat warts once there's evidence that we need them?

I'm not convinced on the philosophical grounds... but I went poking around GitHub for evidence, and couldn't find any uses of ORG inside a STRUCT with a negative offset. So - made into an error!

While I was working on the comments above, I discovered a problem - after improving the tests and fixing various errors, the way we were dealing with determining field offsets caused a failure on the second field after an ORG offset change. I've been working on and off to adjust the code to handle this; it requires a nontrivial rework, as Structure has to track both its Size and the offset of the next field. I'll upload that revision shortly, and I'm afraid it will require a re-review. Sorry, thakis - and I'll try to find someone else if more appropriate.

epastor updated this revision to Diff 351055.Jun 9 2021, 9:19 PM

Fix handling of the second (and later) field after an ORG directive

This required a nontrivial rework of how we handle offset tracking - but I think it simplifies the code significantly.

epastor requested review of this revision.Jun 9 2021, 9:20 PM
epastor marked an inline comment as done.
epastor added inline comments.
llvm/lib/MC/MCParser/MasmParser.cpp
3673–3677

We should definitely do a rework of this. I'll see what I can do in a followup commit.

thakis added inline comments.Jun 23 2021, 1:30 PM
llvm/lib/MC/MCParser/MasmParser.cpp
4663

This is never set anywhere in this function. Remove, and just return false at the end? And I guess the comment in the line above is wrong?

Or if you meant to ReturnVal = true instead of return true on line 4576, add a test that shows why that's important.

Also, please use a better name than ReturnVal then.

4701

Do you expect parseDirectiveEven() to become longer in the future? Why doesn't this do

if (emitAlignTo(2))
  return addErrorSuffix(" in even directive");
epastor updated this revision to Diff 354249.Jun 24 2021, 7:36 AM
epastor marked an inline comment as done.

Addressed comments (removed copy/paste cruft that proved unnecessary)

epastor updated this revision to Diff 354250.Jun 24 2021, 7:43 AM
epastor marked 3 inline comments as done.

Added comment explaining anonymous substructures

llvm/lib/MC/MCParser/MasmParser.cpp
4540

Added for clarity.

4701

Simplified. Thanks!

epastor updated this revision to Diff 354255.Jun 24 2021, 7:57 AM

Rebase on trunk to re-trigger build (after removing parent commit)

thakis accepted this revision.Jun 24 2021, 12:30 PM

lgtm.

This revision is now accepted and ready to land.Jun 24 2021, 12:30 PM