This is an archive of the discontinued LLVM Phabricator instance.

[SLC] Simplify strcpy and friends with non-zero address spaces
ClosedPublic

Authored by arichardson on Jan 21 2021, 8:09 AM.

Details

Summary

[SLC] Simplify strcpy and friends with non-zero address spaces

The current logic in TargetLibraryInfoImpl::getLibFunc() was only treating
strcpy, etc. with i8* arguments in address space zero as a valid library
function. However, in the CHERI and Morello targets we expect all libc
functions to use address space 200 arguments.

This commit updates isValidProtoForLibFunc() to check that the argument
is a pointer type. This also drops the check for i8* since we should not
be checking the pointee type any more.

Depends on D95138

Diff Detail

Event Timeline

arichardson created this revision.Jan 21 2021, 8:09 AM
arichardson requested review of this revision.Jan 21 2021, 8:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2021, 8:09 AM
arsenm added inline comments.Jan 21 2021, 8:12 AM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

This wouldn't really be right for AMDGPU, but we have no library calls anyway. I also think it's not right to use GlobalsAddressSpace for this purpose. This has nothing to do with globals or creating new values

650

Should never depend on pointee element type

jrtc27 added inline comments.Jan 21 2021, 8:17 AM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

The issue is there's no notion of "the address space that libfuncs use". The additional granularity for globals vs stack (vs others too maybe?) is unhelpful here; for us they'll all be the same (AS 200), but there is no sensible mapping from C language void * (or, in this case, char *) to the AS in use.

650

But then it's not a char pointer?

arichardson added inline comments.Jan 21 2021, 8:17 AM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

More than happy to drop this restriction and just accept any i8*.

I guess something like a "malloc/heap" address space would be more accurate since it's used when checking the realloc return value. It doesn't really matter too much for CHERI since we use 200 for program+alloca+globals.

650

I know this will cause issues in the future, but I wasn't sure how else I can check for i8* ignoring the addres space.

What is the recommended way to avoid this? Or should I just check for ->isPointerType()?

arsenm added inline comments.Jan 21 2021, 8:30 AM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

I believe the current precedent assumes libcalls must use address space 0

650

The pointee type is meaningless in the IR and will be going away. You can just ignore it

arichardson added inline comments.Jan 21 2021, 8:41 AM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

Unfortunately we can't use AS0 for CHERI due to compatibility with with MIPS/RISC-V/AArch64.

Would you be okay with me replacing the == PCharTy with ->isPointerType()? That would allow us to simplify these libcalls and also doesn't depend on the pointee type any more.

arsenm added inline comments.Jan 21 2021, 4:10 PM
llvm/lib/Analysis/TargetLibraryInfo.cpp
647–649

Yes, only being a pointer matters

Use isPointerType() instead.

arichardson edited the summary of this revision. (Show Details)Jan 22 2021, 2:01 AM

"fix" failing strncpy_chk-2.ll

xbolva00 added inline comments.Jan 25 2021, 4:22 AM
llvm/test/Transforms/InstCombine/strcpy-nonzero-as.ll
51–52

Remove TODOs

arichardson marked an inline comment as done.

Drop resolved TODO comments

rebase after latest changes to TLI

arichardson marked 7 inline comments as done.

rebase. Is this okay to land now?

arsenm accepted this revision.Mar 8 2021, 6:16 AM
This revision is now accepted and ready to land.Mar 8 2021, 6:16 AM
mdchen added a subscriber: mdchen.EditedNov 28 2021, 7:05 PM

The pointee type is meaningless in the IR and will be going away.

@arsenm @arichardson Hi, this change let user-defined string functions(e.g., size_t strlen(int*)) pass the check and considered a libfunc, which can cause failure in later optimizations like optimizeStrLen which assumes CharSize is 8. And pointed type does matter in this case. What do you think? (This patch doesn't cause a specific bug but I have doubt about the direction towards)

The pointee type is meaningless in the IR and will be going away.

@arsenm @arichardson Hi, this change let user-defined string functions(e.g., size_t strlen(int*)) pass the check and considered a libfunc, which can cause failure in later optimizations like optimizeStrLen which assumes CharSize is 8. And pointed type does matter in this case. What do you think?

The code should've been compiled with either -fno-builtin-strlen or -ffreestanding, or used the right type for strlen, anything else is invalid. See -Wincompatible-library-redeclaration.

The pointee type is meaningless in the IR and will be going away.

@arsenm @arichardson Hi, this change let user-defined string functions(e.g., size_t strlen(int*)) pass the check and considered a libfunc, which can cause failure in later optimizations like optimizeStrLen which assumes CharSize is 8. And pointed type does matter in this case. What do you think?

The code should've been compiled with either -fno-builtin-strlen or -ffreestanding, or used the right type for strlen, anything else is invalid. See -Wincompatible-library-redeclaration.

Oh I see, thanks for pointing it out!