This is an archive of the discontinued LLVM Phabricator instance.

Add function pointer alignment to DataLayout
Needs ReviewPublic

Authored by gchatelet on Feb 6 2023, 4:34 AM.

Details

Reviewers
efriedma
Summary

This is to follow up on https://reviews.llvm.org/D57335.
The patch should not be submitted as-is, the tests are completely broken, but before having a look at them I want to make sure this patch makes sense.

Diff Detail

Event Timeline

gchatelet created this revision.Feb 6 2023, 4:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2023, 4:34 AM
gchatelet requested review of this revision.Feb 6 2023, 4:34 AM

Almost all the specifications you're specifying are wrong. Very few targets should be using "i". Off the top of my head, 32-bit ARM, MIPS targets with MicroMIPS, and certain PowerPC targets should, but most common targets shouldn't. ("i" means that either the target mangles the bottom bits of function pointers, or function pointers don't point directly at code.) And the x86 doesn't specify 4-byte alignment.

For each target, we should try to handle the LLVM backend at the same time...

gchatelet added a comment.EditedFeb 7 2023, 8:21 AM

Almost all the specifications you're specifying are wrong. Very few targets should be using "i". Off the top of my head, 32-bit ARM, MIPS targets with MicroMIPS, and certain PowerPC targets should, but most common targets shouldn't. ("i" means that either the target mangles the bottom bits of function pointers, or function pointers don't point directly at code.)

Ok so I misunderstood the meaning of "i" then.

And the x86 doesn't specify 4-byte alignment.

I thought that specifying it in the DataLayout would help getting rid of the special handling in ConstantFold. I'm confused now.

For each target, we should try to handle the LLVM backend at the same time...

I'm not familiar with this part of the codebase as you can tell. Would you have any pointers to the backend part that needs to be updated?
Shall we try to do one target at a time then?

efriedma added a comment.EditedFeb 7 2023, 12:50 PM

And the x86 doesn't specify 4-byte alignment.

I thought that specifying it in the DataLayout would help getting rid of the special handling in ConstantFold. I'm confused now.

That hack basically says "if the datalayout doesn't specify the alignment of function pointers, assume function pointers are at least 4-byte aligned". If we add the correct datalayout markings for x86, that should solve the problems discussed in D55115 with any hacks, hopefully. The relevant functions should have an explicit alignment marking, so under an "Fn1" datalayout their alignment would be greater than one.

Currently the x86 backend doesn't actually enforce any alignment for functions at -Os/-Oz(no call to setMinFunctionAlignment etc.). If it turns out some minimum is actually necessary for the sake of IR optimizations, we'd need to fix that separately.

For each target, we should try to handle the LLVM backend at the same time...

I'm not familiar with this part of the codebase as you can tell. Would you have any pointers to the backend part that needs to be updated?
Shall we try to do one target at a time then?

I haven't really looked at this in a while... but for each target, the backend has a method to compute the "canonical" datalayout string. This is exposed as TargetMachine::createDataLayout, but it's computed in each target's TargetMachine constructor. Most ways to invoke the backend enforce this string on the module. (I think clang does this separately so that we can do semantic analysis and IR generation even if the relevant backend isn't compiled in. Don't remember the details.)