Page MenuHomePhabricator

[InstCombine] Transform memcpy to ptr load/stores if TBAA says so
AcceptedPublic

Authored by aqjune on Sun, Apr 18, 1:38 AM.

Details

Summary

Folding inttoptr(ptrtoint p) -> p has been known as a source of miscompilation (http://llvm.org/pr34548)
because it swaps the object that some pointer is based-on.
Recently I've been playing with CastInst::isEliminableCastPair to see how removal of it
affects performance, and got very nice results.

On x86-64, compiling single amalgamated file benchmarks (bzip2, gzip, oggenc, ph7, sqlite3, gcc)[1, 2, 3]
didn't show a single difference in assembly even after the folding is removed.
I guess this is highly related to the recent efforts (D88788, D88789, D88979) in reducing the number of
inttoptr casts unnecessarily generated by middle-end transformations.

The int->ptr casts used in the source programs are either

(1) int->ptr casts of a constant, e.g. (int*)-1
(2) manipulating bits using constants, e.g. (int*)((intptr_t)x | 0x3

The first one cannot gain benefit from inttoptr(ptrtoint p) -> p folding anyway.
The second one may gain benefit in theory, but for the 6 benchmarks it didn't. Also, there are possible cases where folding is allowed, and we can suggest using llvm.ptrmask as well for better performance.

For larger benchmark, I chose LLVM (as suggested by @nlopes).
Compiling LLVM with/without this folding showed quite a few differences in assembly, however.
But they are still mostly from inttoptrs generated by the optimizations.

This patch fixes LLVM by using load/store i8* (which is char* in C/C++) instead if !tbaa_struct
says the element had a character type: https://godbolt.org/z/PTaM51fKo

This patch reduces the number of different assembly files from 1255 to 845 (-32%).
This also reduces the total number of inttoptrs after -O3 significantly for some files.

[1] https://people.csail.mit.edu/smcc/projects/single-file-programs/
[2] https://www.sqlite.org/amalgamation.html
[3] https://ph7.symisc.net/downloads.html

Diff Detail

Event Timeline

aqjune created this revision.Sun, Apr 18, 1:38 AM
aqjune requested review of this revision.Sun, Apr 18, 1:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptSun, Apr 18, 1:38 AM
lebedev.ri edited the summary of this revision. (Show Details)Sun, Apr 18, 1:50 AM
aqjune edited the summary of this revision. (Show Details)Sun, Apr 18, 1:50 AM
lebedev.ri edited the summary of this revision. (Show Details)Sun, Apr 18, 1:52 AM
aqjune added a comment.EditedSun, Apr 18, 1:52 AM

Oops, there was a race!

I'll apply your edit.


Nevermind, thank you for the link.

lebedev.ri accepted this revision.Sun, Apr 18, 1:55 AM
lebedev.ri added inline comments.
llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
455

ArrayRef<StringLiteral> / ArrayRef<StringRef>

This revision is now accepted and ready to land.Sun, Apr 18, 1:55 AM
lebedev.ri resigned from this revision.Sun, Apr 18, 1:55 AM

err, didn't mean to accept just yet

This revision now requires review to proceed.Sun, Apr 18, 1:55 AM
lebedev.ri edited the summary of this revision. (Show Details)Sun, Apr 18, 1:58 AM
lebedev.ri accepted this revision.Sun, Apr 18, 2:01 AM
lebedev.ri added a subscriber: lebedev.ri.

Yep, this looks pretty good!

llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
459–461

return is_contained(Descs, Tag1->getString());

469–471

return is_contained(Descs, Id->getString());

This revision is now accepted and ready to land.Sun, Apr 18, 2:01 AM
nikic added a comment.Sun, Apr 18, 2:01 AM

I'm uncomfortable with using TBAA for this purpose. The problem is that not all languages use TBAA, simply because that does not match their aliasing model. They shouldn't be penalized because TBAA information gets reused for an unrelated purpose.

Might it make sense to simply always use a pointer type here (assuming size matches of course)?

I'm uncomfortable with using TBAA for this purpose.

The problem is that not all languages use TBAA, simply because that does not match their aliasing model.
They shouldn't be penalized because TBAA information gets reused for an unrelated purpose.

How does this penalize them?
If they don't use TBAA then there simply won't be a TBAA info on that memcpy,
and we'll use the previous behavior of performing an integer op.

Might it make sense to simply always use a pointer type here (assuming size matches of course)?

No. If it was a pointer we should treat it as such, if it was an integer we should treat it as such.
We shouldn't introduce bogus ptr<->int casts that weren't there originally.

nikic added a comment.Sun, Apr 18, 2:31 AM

I'm uncomfortable with using TBAA for this purpose.

The problem is that not all languages use TBAA, simply because that does not match their aliasing model.
They shouldn't be penalized because TBAA information gets reused for an unrelated purpose.

How does this penalize them?
If they don't use TBAA then there simply won't be a TBAA info on that memcpy,
and we'll use the previous behavior of performing an integer op.

It penalizes them because they cannot use TBAA metadata, because they don't use type-based alias analysis. So they don't have a way to provide the necessary information, without also opting into alias analysis behavior that does not match their language semantics. It binds two things (type hints for memcpy and alias analysis) together that don't have any relation.

I'm assuming here that it's not possible to use TBAA without also impacting alias analysis -- but I'm not really familiar with TBAA and maybe my assumption is simply wrong. Is it possible to specify !tbaa.struct metadata on memcpy's without affecting AA behavior in any way?

Might it make sense to simply always use a pointer type here (assuming size matches of course)?

No. If it was a pointer we should treat it as such, if it was an integer we should treat it as such.
We shouldn't introduce bogus ptr<->int casts that weren't there originally.

Just to clarify my thinking here: While we cannot fold inttoptr(ptrtoint p) -> p, we can fold ptrtoint(inttoptr p) -> p, so always doing the transfer as a pointer would avoid issues if the former fold is removed, because the latter would still eliminate cast pairs if it turns out to be the wrong type.

I'm uncomfortable with using TBAA for this purpose.

The problem is that not all languages use TBAA, simply because that does not match their aliasing model.
They shouldn't be penalized because TBAA information gets reused for an unrelated purpose.

How does this penalize them?
If they don't use TBAA then there simply won't be a TBAA info on that memcpy,
and we'll use the previous behavior of performing an integer op.

It penalizes them because they cannot use TBAA metadata, because they don't use type-based alias analysis. So they don't have a way to provide the necessary information, without also opting into alias analysis behavior that does not match their language semantics.

Sure, but how does *this* patch penalize them?
Isn't that more about the removal of faulty inttoptr(ptrtoint p) -> p fold?

It binds two things (type hints for memcpy and alias analysis) together that don't have any relation.

I'm assuming here that it's not possible to use TBAA without also impacting alias analysis -- but I'm not really familiar with TBAA and maybe my assumption is simply wrong. Is it possible to specify !tbaa.struct metadata on memcpy's without affecting AA behavior in any way?

Might it make sense to simply always use a pointer type here (assuming size matches of course)?

No. If it was a pointer we should treat it as such, if it was an integer we should treat it as such.
We shouldn't introduce bogus ptr<->int casts that weren't there originally.

Just to clarify my thinking here: While we cannot fold inttoptr(ptrtoint p) -> p, we can fold ptrtoint(inttoptr p) -> p, so always doing the transfer as a pointer would avoid issues if the former fold is removed, because the latter would still eliminate cast pairs if it turns out to be the wrong type.

Both don't really sound sound to me, but i may be wrong.
Regardless, this seems like it will create polarly opposite problem to those fixed in referenced patches.

It penalizes them because they cannot use TBAA metadata, because they don't use type-based alias analysis. So they don't have a way to provide the necessary information, without also opting into alias analysis behavior that does not match their language semantics. It binds two things (type hints for memcpy and alias analysis) together that don't have any relation.
I'm assuming here that it's not possible to use TBAA without also impacting alias analysis -- but I'm not really familiar with TBAA and maybe my assumption is simply wrong. Is it possible to specify !tbaa.struct metadata on memcpy's without affecting AA behavior in any way?

TBAA is used because otherwise there is no way to preserve the source’s data type down to middle-end optimizations.
To avoid the unwanted alias analysis, I think we have following options:

  1. Use unsigned char for all other fields in tbaa_struct: unsigned char pointer is supposed to alias any pointer in C/C++, so it won't hurt.
  2. Update optimization pipeline to never run TypeBasedAliasAnalysis for specific frontends.

If the language is supposed to add their own !tbaa/!tbaa_struct support, I guess either

  1. TypeBasedAliasAnalysis can be extended to be more pluggable (allow one to add its own rule)
  2. The language’s own TypeBasedAliasAnalysis can be just implemented and used instead.

Might it make sense to simply always use a pointer type here (assuming size matches of course)?

Just to clarify my thinking here: While we cannot fold inttoptr(ptrtoint p) -> p, we can fold ptrtoint(inttoptr p) -> p, so always doing the transfer as a pointer would avoid issues if the former fold is removed, because the latter would still eliminate cast pairs if it turns out to be the wrong type.

I also think it is desirable to preserve the original pointee type; it's because it again brings the complexity to the semantics of load/store.
Also, it is possible to introduce inttoptr again:

store i64 %i, i64* %p
memcpy(q, p, 8)
->
store i64 %i, i64* %p
%j = load i8* %p
store i8* %j, i8** %q
->
store i64 %i, i64* %p
%j = inttoptr %i to i8*
store i8* j, i8** %q

Existence of inttoptr complicates analyses to many attributes, e.g. nocapture/nofree/etc, so let's move towards avoiding introduction of them. :)

fhahn added a comment.Sun, Apr 18, 3:09 AM

I don't see how this use of TBAA metadata is within the specification of the metadata.

IIUC the type descriptors only encode the layout and relationship between types, but not whether a scalar type is a pointer, integer, floating point or other type. It just happens to be that Clang uses names that are related to the source types. But IIUC those names are arbitrary and Clang could just as well use A, B, C,... or any other naming scheme.

Please let me know if I am missing something that provides the guarantees the patch uses.

I don't see how this use of TBAA metadata is within the specification of the metadata.

IIUC the type descriptors only encode the layout and relationship between types, but not whether a scalar type is a pointer, integer, floating point or other type. It just happens to be that Clang uses names that are related to the source types. But IIUC those names are arbitrary and Clang could just as well use A, B, C,... or any other naming scheme.

Please let me know if I am missing something that provides the guarantees the patch uses.

For metadata nodes whose descriptions are simply C/C++'s type name they shouldn't carry info, but for those with special descriptions would it be reasonable if they are given special meaning?
I thought TBAA used the description because MDNode::isTBAAVtableAccess checks whether description is "vtable pointer".

I don't see how this use of TBAA metadata is within the specification of the metadata.

IIUC the type descriptors only encode the layout and relationship between types, but not whether a scalar type is a pointer, integer, floating point or other type. It just happens to be that Clang uses names that are related to the source types. But IIUC those names are arbitrary and Clang could just as well use A, B, C,... or any other naming scheme.

Please let me know if I am missing something that provides the guarantees the patch uses.

For metadata nodes whose descriptions are simply C/C++'s type name they shouldn't carry info, but for those with special descriptions would it be reasonable if they are given special meaning?
I thought TBAA used the description because MDNode::isTBAAVtableAccess checks whether description is "vtable pointer".

The vtable pointer case seems only to be used by the ThreadSanitizer.

I am not opposed to provide special meaning to certain nodes, but this should be mentioned in the documentation. Currently vtable pointer and any pointer are not documented as special descriptors in the langref's tbaa/tbaa.struct chapters.

@jdoerfert what do you think ?

I am not sure what kind of complications to expect when you have pointers pointing to a non-0 address space. These can differ in size and representation, but that should be taken care of by the size check.

First thought: Why is there no test in which we generate no int2ptr with the patch but do without?

@fhahn: I don't see how this use of TBAA metadata is within the specification of the metadata.

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

First thought: Why is there no test in which we generate no int2ptr with the patch but do without?

@fhahn: I don't see how this use of TBAA metadata is within the specification of the metadata.

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

For CHERI/Morello the type of the memory transfer does matter. For us pointers are 65/129-bit capabilities (32/64-bit addresses, bounds & permission metadata, and a hidden validity bit). The hardware clears the validity bit when storing an integer to a memory location that holds a valid capability. So if we convert a pointer-type memcpy to a i64/i128 load+store the pointer becomes invalid and will trigger in a trap on next dereference.
We therefore have to use pointer-type loads+stores as the default if the size matches the pointer size and pointers in that AS are capabilities.
In our fork this can be checked using DL.isFatPointer(AS) (a weaker version of DL.isNonIntegralAddressSpace(AS) that allows ptrtoint, but not inttoptr).

A few years ago @theraven added a workaround to our fork to allow the memcpy->load+store transformation for 128-bit CHERI (since it is guarded by Size <=8): https://github.com/CTSRD-CHERI/llvm-project/commit/f477ef4d292eb45871fc6cab0262d0844ce5d5aa

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
177

I wonder if it would make sense to default to pointer load+store for non-integral address space?
For CHERI we would definitely want this to be the default, but we can carry that patch downstream.

178
lebedev.ri added inline comments.Mon, Apr 19, 10:26 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
177

I'm not really sure why that would be the right default,
but regardless let's not conflate that here.

arichardson added inline comments.Mon, Apr 19, 10:30 AM
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
177

Please ignore this suggestion. For some reason I read this as SrcAddrSp being the address space of the pointee, which we of course don't have here. The address space of the pointer is irrelevant in this case.

@aqjune can you please add requested langref wording?

nikic added a comment.Wed, Apr 21, 2:34 PM

Once the LangRef changes are there, I'd suggest cross-posting this to llvm-dev for visibility. This patch was discussed at the last alias analysis meeting, and I think most people were in favor of using metadata for this purpose, but weren't super fond of determining the type to use based on string matching. Maybe someone else has more detailed notes...

lebedev.ri added a comment.EditedWed, Apr 21, 2:41 PM

Once the LangRef changes are there, I'd suggest cross-posting this to llvm-dev for visibility. This patch was discussed at the last alias analysis meeting,

(hm, was there no reminder mail, or did i miss it?)

and I think most people were in favor of using metadata for this purpose, but weren't super fond of determining the type to use based on string matching. Maybe someone else has more detailed notes...

To be noted, isn't really inventing a new approach though, it's already done for vtable pointers.
I feel like that if this isn't ok, then the existing code also should be removed.

Once the LangRef changes are there, I'd suggest cross-posting this to llvm-dev for visibility. This patch was discussed at the last alias analysis meeting,

(hm, was there no reminder mail, or did i miss it?)

and I think most people were in favor of using metadata for this purpose, but weren't super fond of determining the type to use based on string matching. Maybe someone else has more detailed notes...

To be noted, isn't really inventing a new approach though, it's already done for vtable pointers.
I feel like that if this isn't ok, then the existing code also should be removed.

It is done for vtable pointers, but that is only used by the thread sanitizer. There was discussion about using the available information in tbaa for guiding the type to load/store (1) vs forcing to use tbaa to guide what kind of load/stores are used. (2)

(1) is deemed to be acceptable, if it is clearly documented. The forced naming based on what names clang produces is unfortunate, but as a stopgap this would be acceptable until we come up with a cleaner mechanism.
That cleaner mechanism could be part of the rework of new struct path tbaa (that was also discussed at the meeting).

For (2) we don't have a good idea yet. Maybe a separate !tb.struct with the same layout could be used for this particular purpose if !tbaa.struct is not available (just thinking out loud).

Thank you for the comments. I'll update this patch to contain tests that were creating inttoptr before.

It is done for vtable pointers, but that is only used by the thread sanitizer. There was discussion about using the available information in tbaa for guiding the type to load/store (1) vs forcing to use tbaa to guide what kind of load/stores are used. (2)

(1) is deemed to be acceptable, if it is clearly documented. The forced naming based on what names clang produces is unfortunate, but as a stopgap this would be acceptable until we come up with a cleaner mechanism.
That cleaner mechanism could be part of the rework of new struct path tbaa (that was also discussed at the meeting).

For (2) we don't have a good idea yet. Maybe a separate !tb.struct with the same layout could be used for this particular purpose if !tbaa.struct is not available (just thinking out loud).

I believe this patch falls into category (1) - this patch is mainly for removal of inttoptrs created by middle-end transformations, helping deletion of inttoptr(ptrtoint p) -> p fold.

I'll create a LangRef patch that describes the policy & share it to llvm-dev tomorrow.

For CHERI/Morello the type of the memory transfer does matter. For us pointers are 65/129-bit capabilities (32/64-bit addresses, bounds & permission metadata, and a hidden validity bit). The hardware clears the validity bit when storing an integer to a memory location that holds a valid capability. So if we convert a pointer-type memcpy to a i64/i128 load+store the pointer becomes invalid and will trigger in a trap on next dereference.
We therefore have to use pointer-type loads+stores as the default if the size matches the pointer size and pointers in that AS are capabilities.
In our fork this can be checked using DL.isFatPointer(AS) (a weaker version of DL.isNonIntegralAddressSpace(AS) that allows ptrtoint, but not inttoptr).

A few years ago @theraven added a workaround to our fork to allow the memcpy->load+store transformation for 128-bit CHERI (since it is guarded by Size <=8): https://github.com/CTSRD-CHERI/llvm-project/commit/f477ef4d292eb45871fc6cab0262d0844ce5d5aa

This is really interesting! IIUC my patch will bring benefit to this case as well? cheri-memcpy.ll also uses !tbaa.struct and my patch preserves the address space of the pointer type.

aqjune updated this revision to Diff 340008.Fri, Apr 23, 6:53 AM

Add the inttoptr-removing example, address comments

aqjune marked 3 inline comments as done.Fri, Apr 23, 6:54 AM

Looks good.

Still two remarks:

  • Can you add a langref update indicating that vtable pointer and any pointer names might be used to drive optimization passes/code generation ?
  • Can you extend the testcase with a vtable pointer and an int case ?

And also two questions:

  • The testcase tries out a p64:64:64 and a p32:32:32 case, assuming that the p32:32:32 case should use a i64. Is this indeed the expected lowering ? Or would we expect two pointer load/stores ?
  • When I try out a struct with single pointer fields and array pointer fields, I see that clang produces separate pointer fields for the single pointers and a large 'char' field for the array of pointers. (See https://www.godbolt.org/z/PT3qorvnq ) Maybe that is something that could be changed later in a pointer field with a larger size ?
aqjune updated this revision to Diff 340105.Fri, Apr 23, 10:30 AM

extend the test case, split the 32bit cast into a separate file to memcpy 4 bytes only

  • The testcase tries out a p64:64:64 and a p32:32:32 case, assuming that the p32:32:32 case should use a i64. Is this indeed the expected lowering ? Or would we expect two pointer load/stores ?

I found that the test case was indeed bogus a bit; the corresponding !tbaa_struct has one pointer field, so it is 4 bytes, but the memcpy was copying 8 bytes.
I made a separate test (memcpy-tbaa32.ll) that copies 4 bytes only.

  • When I try out a struct with single pointer fields and array pointer fields, I see that clang produces separate pointer fields for the single pointers and a large 'char' field for the array of pointers. (See https://www.godbolt.org/z/PT3qorvnq ) Maybe that is something that could be changed later in a pointer field with a larger size ?

This is interesting, the !tbaa_struct is using char. I also think using a pointer type is totally fine in this case.

penzn added a subscriber: penzn.Fri, Apr 23, 2:18 PM

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

I agree with @nikic and @fhahn that we are trying to encode new information beyond what TBAA supposed to describe according to LangRef. I think vtable pointer handling is particularly telling - it benefits from special treatment, since it is not a "regular" pointer. Conceptually this change (along with vtable support) belong to the same problem space as encoding non-aliasing pointers (cough, Fortran). In all this cases we need to know when something is a pointer, while simulateneously looking at the rules of source language to short-circuit optimization heuristics.

Aside from may be generic pointer handling, I am not super comfortable in baking this kind of logic into the backend and would be interested in finding a better solution in the long run. Though the challenge is that different source languages have completely unrelated pointer rules, which would require different heuristics.

@aqjune please can you prepare the same change for SROA?

fhahn added a comment.Mon, Apr 26, 8:12 AM

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

I agree with @nikic and @fhahn that we are trying to encode new information beyond what TBAA supposed to describe according to LangRef. I think vtable pointer handling is particularly telling - it benefits from special treatment, since it is not a "regular" pointer. Conceptually this change (along with vtable support) belong to the same problem space as encoding non-aliasing pointers (cough, Fortran). In all this cases we need to know when something is a pointer, while simulateneously looking at the rules of source language to short-circuit optimization heuristics.

Aside from may be generic pointer handling, I am not super comfortable in baking this kind of logic into the backend and would be interested in finding a better solution in the long run. Though the challenge is that different source languages have completely unrelated pointer rules, which would require different heuristics.

Instead of using !tbaa as proxy for intent, couldn't we get the actual intent by looking at the users of the pointer?

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

I agree with @nikic and @fhahn that we are trying to encode new information beyond what TBAA supposed to describe according to LangRef. I think vtable pointer handling is particularly telling - it benefits from special treatment, since it is not a "regular" pointer. Conceptually this change (along with vtable support) belong to the same problem space as encoding non-aliasing pointers (cough, Fortran). In all this cases we need to know when something is a pointer, while simulateneously looking at the rules of source language to short-circuit optimization heuristics.

Aside from may be generic pointer handling, I am not super comfortable in baking this kind of logic into the backend and would be interested in finding a better solution in the long run. Though the challenge is that different source languages have completely unrelated pointer rules, which would require different heuristics.

Instead of using !tbaa as proxy for intent, couldn't we get the actual intent by looking at the users of the pointer?

No.

fhahn added a comment.Mon, Apr 26, 8:37 AM

I read through the code and I'm not sure if this is even a semantic change. I mean, can't we pick any
type to do the memory transfer expansion? If so, TBAA metadata as a heuristic should be totally fine.
We could also look at uses of the source and target pointers, for example. That said, we should write
in the lang ref that we do use the TBAA names for heuristics and they should be chosen to match the code/intent.

I agree with @nikic and @fhahn that we are trying to encode new information beyond what TBAA supposed to describe according to LangRef. I think vtable pointer handling is particularly telling - it benefits from special treatment, since it is not a "regular" pointer. Conceptually this change (along with vtable support) belong to the same problem space as encoding non-aliasing pointers (cough, Fortran). In all this cases we need to know when something is a pointer, while simulateneously looking at the rules of source language to short-circuit optimization heuristics.

Aside from may be generic pointer handling, I am not super comfortable in baking this kind of logic into the backend and would be interested in finding a better solution in the long run. Though the challenge is that different source languages have completely unrelated pointer rules, which would require different heuristics.

Instead of using !tbaa as proxy for intent, couldn't we get the actual intent by looking at the users of the pointer?

No.

That does not really answer my question for this specific case though I think (or I miss how this is related to this particular issue). IIUC the linked patch correctly, it undos converting pointer load/stores to integer ones if possible.

But my suggestion/question is the other way around: check if a pointer is used to load a pointer instead of checking if tbaa claims it should be treated as pointer.

Aside from may be generic pointer handling, I am not super comfortable in baking this kind of logic into the backend and would be interested in finding a better solution in the long run. Though the challenge is that different source languages have completely unrelated pointer rules, which would require different heuristics.

I think the underlying problem is that there are only two options of types when canonicalizing memcpy/load/store: either integer type or pointer type.
Once it is canonicalized into one of these two, memory accesses with another type requires introduction of casting that is not easy to be removed due to the miscompilation issues.

But my suggestion/question is the other way around: check if a pointer is used to load a pointer instead of checking if tbaa claims it should be treated as pointer.

Well, actually this sounds like a great idea... Let me check whether it works.

@aqjune please can you prepare the same change for SROA?

I'm not familiar with SROA, but will try.

But my suggestion/question is the other way around: check if a pointer is used to load a pointer instead of checking if tbaa claims it should be treated as pointer.

Well, actually this sounds like a great idea... Let me check whether it works.

At least one case where this won't work is when the llvm.memcpy is expanded first, before the containing function is inlined in another one, where the access types are exposed.

But my suggestion/question is the other way around: check if a pointer is used to load a pointer instead of checking if tbaa claims it should be treated as pointer.

Well, actually this sounds like a great idea... Let me check whether it works.

At least one case where this won't work is when the llvm.memcpy is expanded first, before the containing function is inlined in another one, where the access types are exposed.

That's precisely my point about that being a flipped-direction of D88789.
Nowadays, it should be a obviously not a good idea.
I don't believe i will be okay with a patch doing that.

Matt added a subscriber: Matt.Thu, May 6, 7:32 AM