This is an archive of the discontinued LLVM Phabricator instance.

[AST] Use correct APSInt width when evaluating string literals
ClosedPublic

Authored by barannikov88 on Jul 8 2023, 12:45 PM.

Details

Summary

The width of the APSInt values should be the width of an element.
getCharByteWidth returns the size of an element in _host_ bytes, which
makes the width N times greater, where N is the ratio between target's
CHAR_BIT and host's CHAR_BIT.
This is NFC for in-tree targets because all of them have CHAR_BIT == 8.

Diff Detail

Event Timeline

barannikov88 created this revision.Jul 8 2023, 12:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2023, 12:45 PM
barannikov88 requested review of this revision.Jul 8 2023, 12:45 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2023, 12:45 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
cor3ntin accepted this revision.Jul 9 2023, 2:38 AM
cor3ntin added a subscriber: aaron.ballman.

I think this is fine, and it's a nice simplification.

However it doesn't seem to do much of anything in practice: if you look at StringLiteral::mapCharByteWidth, supporting different CHAR_BIT would be more involved. And that's just part of the front end.
I really don't how much effort would be involved for complete support, and... well, as you say, we have no way to test anything.

So my feeling is that we can land that because it simplifies the code, but further work into non 8-bits platforms probably require the input of more folks
and a RFC may be warranted.

Looking around non 8 bits CHAR_BIT seems to have been discussed before https://discourse.llvm.org/t/rfc-on-non-8-bit-bytes-and-the-target-for-it/53455
but i don't know if a conclusion was reached. Maybe worth poking that thread if you are interested?

WDYT @aaron.ballman ?

This revision is now accepted and ready to land.Jul 9 2023, 2:38 AM
aaron.ballman accepted this revision.Jul 10 2023, 8:39 AM

I think this is a good NFC change as it brings us one TINY step closer towards supporting platforms where CHAR_BIT != 8. But I agree with Corentin that a lot more work needs to happen to support that kind of target; restarting the discussion on the linked RFC would be a good idea if you plan to aim for full support.

@barannikov88 Don't forget to land this!

This revision was landed with ongoing or failed builds.Sep 5 2023, 9:14 AM
This revision was automatically updated to reflect the committed changes.