Page MenuHomePhabricator

[ARM] Error out on .arm assembler directives on windows
ClosedPublic

Authored by mstorsjo on Feb 7 2018, 2:06 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

mstorsjo created this revision.Feb 7 2018, 2:06 AM

My initial thought is that there must be a better way than hard-coding a windows OS choice in the hasARM() field, particularly as all the others work entirely from the feature bits. Do you happen to know why clearing the feature bits is insufficient? I haven't touched this part of the code for a long time so I haven't got an answer for why myself. I can see that the assembler might recalculate them from the CPU description when particular directives are used.

I'd also say that there is a more generally useful feature here of "While my processor supports both Arm and Thumb state I choose to enforce that either Arm or Thumb state is used." this could be then enabled for Thumb2 by default for the Windows platform.

So overall I'd like to think that there was no possible alternative before doing this.

fhahn added a subscriber: fhahn.EditedFeb 7 2018, 3:03 AM

Clearing it from there doesn't seem to be enough when assembling .s files from clang though.

Looks like the ARMSubtarget is not initialized for the assembler from clang. But it would be great if we could handle that in ARMSubtarget, rather than duplicating the Windows implies NoARM logic here.

Clearing it from there doesn't seem to be enough when assembling .s files from clang though.

Looks like the ARMSubtarget is not initialized for the assembler from clang. But it would be great if we could handle that in ARMSubtarget, rather than duplicating the Windows implies NoARM logic here.

Yes, that'd clearly be the best. It's the same also if assembling with llvm-mc fwiw, the ARMSubtarget isn't instantiated in that case.

For the practical usecase, I don't have much difference between this (clearly erroring out on .arm directives) and just allowing them through; currently we hit an llvm_unreachable in ARMWinCOFFStreamer::EmitAssemblerFlag if we encounter such a directive. Emitting arm instructions does work technically, but the linker isn't ready to handle all of it (and interworking isn't handled right in all cases). But erroring out obviously is clearer, than having to hunt down cases where things build correctly but fail much more subtly.

I think we should first clarify that currently the only Windows on ARM support is Windows ARM NT (Windows ARM CE still does have both modes of execution). The important thing to note here is that although the CPU supports both modes of execution, the Windows ARM NT kernel does not guarantee the current mode will be preserved (that is, if you trap into the kernel in ARM mode, you are probably going to come out in Thumb mode and fault). I agree that we should try to fix this generally rather than hardcode this to Windows ARM.

I think we should first clarify that currently the only Windows on ARM support is Windows ARM NT (Windows ARM CE still does have both modes of execution). The important thing to note here is that although the CPU supports both modes of execution, the Windows ARM NT kernel does not guarantee the current mode will be preserved (that is, if you trap into the kernel in ARM mode, you are probably going to come out in Thumb mode and fault).

Yup - this is mentioned (briefly) in the corresponding place in ARMSubtarget.cpp, where NoARM is set for windows.

I agree that we should try to fix this generally rather than hardcode this to Windows ARM.

Any concrete suggestions on how to go about it? The flag ARM::FeatureNoARM is generally assigned based on the cpu in ARM.td.

I agree that we should try to fix this generally rather than hardcode this to Windows ARM.

Any concrete suggestions on how to go about it? The flag ARM::FeatureNoARM is generally assigned based on the cpu in ARM.td.

Apologies I haven't got ideas off the top of my head. I may be able to find some time tomorrow or early next week to take a look and make some suggestions.

I agree that we should try to fix this generally rather than hardcode this to Windows ARM.

Any concrete suggestions on how to go about it? The flag ARM::FeatureNoARM is generally assigned based on the cpu in ARM.td.

Apologies I haven't got ideas off the top of my head. I may be able to find some time tomorrow or early next week to take a look and make some suggestions.

@peter.smith - any ideas?

I agree that we should try to fix this generally rather than hardcode this to Windows ARM.

Any concrete suggestions on how to go about it? The flag ARM::FeatureNoARM is generally assigned based on the cpu in ARM.td.

Apologies I haven't got ideas off the top of my head. I may be able to find some time tomorrow or early next week to take a look and make some suggestions.

@peter.smith - any ideas?

My apologies, I've not had a chance to look yet, my current task is taking rather longer than I'd like. I'm intending to take a look when I've finished, hopefully next week, although don't let that stop you taking a look yourself.

I can verify that if execution enters the kernel via trap or otherwise from a program in ARM mode on WinRT or Winphone 8.1 using the default MSVC compilers no ARM->Thumb2 switch occurs and the kernel's Thumb2 code executes in ARM mode, usually causing the system to black screen or hard lock with no dump / bugcheck. The kernel doesn't make any attempt to change the mode on entry or exit. It's been years since I did the research on that but I suspect it hasn't changed extensively and in any event makes the earlier devices flat out disastrous to execute ARM code on from a system perspective; It really doesn't support the mode.

I haven't had the time to disassemble or a device to run Windows 10 IoT or similar yet. Hopefully they've at least limited the crash to userland if it still occurs.

Apologies to take so long to get back to you. I've had a chance to take a look. While I've not got a magic solution I do have some questions/suggestions.

We already set the NoARM variable in ARMSubtarget but we don't clear the FeatureNoARM feature bit. Clearing it from there doesn't seem to be enough when assembling .s files from clang though.

If I set the noarm feature string directly for example --triple=armv7-none-eabi -mattr=+noarm then I get the desired behaviour on your test case. It might be that setting the feature bit directly is insufficient as it may get recalculated later from the triple and mattr, but if you can add +noarm to the mattr feature string this should keep the noarm feature set.
The NoARM and corresponding hasARMOps() in its current form looks like it could and probably should be removed.

  • NoARM is only set when isTargetWindows(), so hasARMOps() is effectively !isTargetWindows(), to me this makes the name hasARMOps() misleading. If it is truly hasARMOps() then it should be calculated from the feature bits.
  • hasARMOps() is only used in a few places and could be easily replaced by !isTargetWindows() if that is the intention.

To summarise:

  • It should be possible to accomplish what you want by using just the featurebits. Can you elaborate on why this doesn't work for you?
  • I suggest renaming or removing hasARMOps(), have I misunderstood what it is used for?

To summarise:

  • It should be possible to accomplish what you want by using just the featurebits. Can you elaborate on why this doesn't work for you?

The big question I have is, where should I initialize/override those feature bits? I see it done statically for different architecture variants in ARM.td, but here it should be set not only based on cpu but based on the OS. The existing case of setting a similar flag is done in ARMSubtarget.cpp, but when just invoking the assembler, ARMSubtarget is never constructed at all. So where should I place the code that sets the FeatureNoARM bit when OS == Windows, so that it gets set when invoking the assembler?

  • I suggest renaming or removing hasARMOps(), have I misunderstood what it is used for?

I'd rather extend it to not only check for windows but also check for the FeatureNoARM bit (and if we resolve the above issue, check only that) - then it would be more useful.

To summarise:

  • It should be possible to accomplish what you want by using just the featurebits. Can you elaborate on why this doesn't work for you?

The big question I have is, where should I initialize/override those feature bits? I see it done statically for different architecture variants in ARM.td, but here it should be set not only based on cpu but based on the OS. The existing case of setting a similar flag is done in ARMSubtarget.cpp, but when just invoking the assembler, ARMSubtarget is never constructed at all. So where should I place the code that sets the FeatureNoARM bit when OS == Windows, so that it gets set when invoking the assembler?

To clarify further - if I invoke the assembler as in the testcase I'm adding here, llvm-mc -triple=thumbv7-win32 < %s (or clang -target armv7-win32 -c file.S, which internally remaps it to thumbv7-win32 before it hits llvm), without manually setting the flag, what class/file should set it?

To clarify further - if I invoke the assembler as in the testcase I'm adding here, llvm-mc -triple=thumbv7-win32 < %s (or clang -target armv7-win32 -c file.S, which internally remaps it to thumbv7-win32 before it hits llvm), without manually setting the flag, what class/file should set it?

Thanks for the response, that makes things a bit clearer. Off the top of my head the place I'd start would be arm::getARMTargetFeatures() in Driver/Toolchains/Arch.cpp As I understand it this builds up the target feature string based on the clang command line arguments, this does have access to the triple so it should be possible to add "+noarm" to the feature string if the OS is windows after the CPU defaults have been added. It looks like that should be called prior to the integrated assembler being launched. I've not looked into llvm-mc, although I suspect it will have a similar place where the feature string is built up.

I've not tested any of the above, it is just where I'd start. I may be able to get some more time later in the week to look at this more thoroughly.

To clarify further - if I invoke the assembler as in the testcase I'm adding here, llvm-mc -triple=thumbv7-win32 < %s (or clang -target armv7-win32 -c file.S, which internally remaps it to thumbv7-win32 before it hits llvm), without manually setting the flag, what class/file should set it?

Thanks for the response, that makes things a bit clearer. Off the top of my head the place I'd start would be arm::getARMTargetFeatures() in Driver/Toolchains/Arch.cpp As I understand it this builds up the target feature string based on the clang command line arguments, this does have access to the triple so it should be possible to add "+noarm" to the feature string if the OS is windows after the CPU defaults have been added. It looks like that should be called prior to the integrated assembler being launched. I've not looked into llvm-mc, although I suspect it will have a similar place where the feature string is built up.

I've not tested any of the above, it is just where I'd start. I may be able to get some more time later in the week to look at this more thoroughly.

Ok, that sounds doable. But ideally there'd be a central place in llvm for setting it, in some place where the arm target is initialized for both normal compilation and for assembling, to avoid scattering the same windows => noarm logic in all users of the target (clang, llvm-mc, possibly others).

mstorsjo updated this revision to Diff 139086.Mar 20 2018, 1:47 AM
mstorsjo edited the summary of this revision. (Show Details)

Clearing the bit in code (as opposed to tablegen source files) when creating the MCSubtargetInfo for this OS.

This should hopefully be a bit cleaner.

Looks a lot better than before. One possible alternative to making the subtarget then subtracting ARM, could we use the same way that nacl adds the nacl-trap feature in ParseARMTriple?

--- a/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
+++ b/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
@@ -153,6 +153,13 @@ std::string ARM_MC::ParseARMTriple(const Triple &TT, StringRef CPU) {
       ARMArchFeature += ",+nacl-trap";
   }
 
+  if (TT.isOSWindows()) {
+    if (ARMArchFeature.empty())
+      ARMArchFeature += "+noarm";
+    else
+      ARMArchFeature += ",+noarm";
+  }
+
   return ARMArchFeature;
 }

This will mean that the target features will be set up without ARM without needing to toggle it later.

Looks a lot better than before. One possible alternative to making the subtarget then subtracting ARM, could we use the same way that nacl adds the nacl-trap feature in ParseARMTriple?

--- a/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
+++ b/lib/Target/ARM/MCTargetDesc/ARMMCTargetDesc.cpp
@@ -153,6 +153,13 @@ std::string ARM_MC::ParseARMTriple(const Triple &TT, StringRef CPU) {
       ARMArchFeature += ",+nacl-trap";
   }
 
+  if (TT.isOSWindows()) {
+    if (ARMArchFeature.empty())
+      ARMArchFeature += "+noarm";
+    else
+      ARMArchFeature += ",+noarm";
+  }
+
   return ARMArchFeature;
 }

This will mean that the target features will be set up without ARM without needing to toggle it later.

Indeed, that's even more straightforward - I didn't notice this part before.

mstorsjo updated this revision to Diff 139257.Mar 21 2018, 12:57 AM

Adding "+noarm" to ARMArchFeature, as suggested, instead of toggling bits later.

Thanks, I'm happy with that. Will be worth waiting to see if there are any comments from the US timezones.

One future refactoring (in a separate patch after this) would be to change all the additions from

if (TT.isThumb()) {
  if (ARMArchFeature.empty())
    ARMArchFeature = "+thumb-mode,+v4t";
  else
    ARMArchFeature += ",+thumb-mode,+v4t";
}

into

if (TT.isThumb()) {
  if (!ARMArchFeature.empty())
    ARMArchFeature += ",";
  ARMArchFeature = "+thumb-mode,+v4t";
}

But I rather do that change consistently for all three blocks after this patch. (Or are there performance reasons for preferring doing it like it is today?)

No performance implications that I know of, the source code history might contain some clues. I'm guessing that the function wouldn't be significant in the context of a compile so I personally think it would be good to clean it up.

Thanks, I'm happy with that. Will be worth waiting to see if there are any comments from the US timezones.

As there hasn't been any comments on it from US timezones for a few days now, I'll commit this soon.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 23 2018, 2:13 AM
This revision was automatically updated to reflect the committed changes.