Page MenuHomePhabricator

[InstCombine] Remove assumptions about int having 32 bits

Authored by msebor on Aug 11 2022, 3:25 PM.



The patch submitted in D129915 and subsequently committed in rG0dcfe7aa35cd exposes a number of assumptions in SimplifyLibcalls.cpp about int having 32 bits. Prior to the change the assumptions, although strictly unsafe, were benign even on 16-bit targets because the validation in TargetLibraryInfoImpl::isValidProtoForLibFunc enforced the corresponding restriction. The change has corrected the function to consider valid on 16-bit targets those functions in whose declarations the int return type or argument corresponds to i16, but it did not update the corresponding call handlers in SimplifyLibcalls.cpp. The end result is that some valid C code that compiled successfully before the change (albeit with suboptimal output) now causes an abort in the handler when targeting 16-bit environments (AVR or MSP430 in tree).

As a follow-up to D129915, this change removes these 32-bit assumptions to let the same code compile successfully even for 16-bit targets, and to also enable the same optimizing transformations for those targets as for 32-bit ones.

I've tested this patch on x86_64-linux.

Diff Detail

Event Timeline

msebor created this revision.Aug 11 2022, 3:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2022, 3:25 PM
msebor requested review of this revision.Aug 11 2022, 3:25 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2022, 3:25 PM

I've verified that this patch fixes the crash for ffs that we saw for a target with 16bit int. Thanks!
I have no idea about test coverage though, I think we're pretty bad at testing lib calls for our own compiler. The ffs crash was the only I saw.

So all in all, if the original patch caused other mismatches between the checks and later expectations I have no clue about really.

But this seems to be a step in the right direction.

Thanks for the confirmation! I've CC'd @efriedma who has reviewed changes in this in this area before. If there are no other comments I'll plan on installing the patch later this week to resolve the ffs abort.

bjope accepted this revision.Aug 15 2022, 9:11 AM

I still think it is hard to understand for which functions isValidProtoForLibFunc was modified in the earlier commit that both refactored and modified the behavior in the same commit. For an out-of-tree target maintainer I need to understand how this impacted our downstream modifications to isValidProtoForLibFunc, while at the same time understanding if the changes impact our target (which isn't just having 16-bit ints, but for example char is also 16-bits, etc). Anyway, now I guess the only way is forward (even if other out-of-tree target maintainers might end up having the same trouble when rebasing later), so I guess I'll stop dwelling about that.

LGTM, as this seem to solve problems seen with D129915.


I do not fully understand the part with "Putchar converts it to unsigned char regardless". Is that referring to the libc implementation , or llvm::emitPutChar, or what?

Not sure if we can deal with this better regardless of the answer. I guess this would work for in-tree targets and "normal" compiler hosts. But unsigned char on the host may have a different size compared to size of unsigned char in the runtime libc implementation for the target. Since the getConstantStringInfo call above assume that a char is 8 bits (at least the in-tree version of getConstantStringInfo) and the compiler probably is built for a host with 8 bit char, then at least the compiler code match up here.

(FWIW, our OOT target with 16-bit char is now guarded by getConstantStringInfo returning false, while in the past I think isValidProtoForLibFunc might have guarded from even calling LibCallSimplifier::optimizePuts and LibCallSimplifier::optimizePrintFString.)

This revision is now accepted and ready to land.Aug 15 2022, 9:11 AM
msebor added inline comments.Aug 16 2022, 2:36 PM

The putchar comment refers to the libc implementation of the function on the target. The host putchar shouldn't come into play here, but I've seen the assumption that char is 8 bits wide on both the host as on the target hardwired into transformations that don't depend on getConstantStringInfo, such as optimizeMemChr and optimizeStrChr (I'd be surprised if those were the only ones). If other sizes are meant to be supported these assumptions should be reviewed and if necessary removed. I don't have access to such targets but I'm willing to help with that.

This revision was automatically updated to reflect the committed changes.