This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/docs/Passes.rst
llvm/include/llvm/Analysis/TargetTransformInfo.h
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
llvm/include/llvm/CodeGen/BasicTTIImpl.h
llvm/include/llvm/InitializePasses.h
llvm/include/llvm/Transforms/Scalar.h
llvm/include/llvm/Transforms/Utils/RelLookupTableConverter.h
llvm/lib/Analysis/TargetTransformInfo.cpp
llvm/lib/Passes/PassBuilder.cpp
llvm/lib/Passes/PassRegistry.def
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
llvm/lib/Transforms/Utils/CMakeLists.txt
llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
llvm/lib/Transforms/Utils/Utils.cpp
llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
llvm/test/Other/new-pm-defaults.ll
llvm/test/Other/new-pm-thinlto-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll
llvm/test/Other/opt-O2-pipeline.ll
llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
llvm/test/Other/opt-O3-pipeline.ll
llvm/test/Other/opt-Os-pipeline.ll
llvm/test/Other/pass-pipelines.ll
llvm/test/Transforms/RelLookupTableConverter/relative_lookup_table.ll
llvm/test/Transforms/RelLookupTableConverter/switch_relative_lookup_table.ll
llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
Differential D94355

[Passes] Add relative lookup table converter pass
Closed, Public

Authored by gulfem on Jan 8 2021, 6:28 PM.

Details

Reviewers

leonardchan
mcgrathr
phosek
hans
lebedev.ri

Commits

rGe96df3e531f5: [Passes] Add relative lookup table converter pass
rG5178ffc7cf92: [Passes] Add relative lookup table converter pass
rG5fd001a5ffba: [Passes] Add relative lookup table converter pass
rG78a65cd945d0: [Passes] Add relative lookup table converter pass

Summary

Lookup tables generate code that is not PIC-friendly and requires dynamic relocations, as described in:
https://bugs.llvm.org/show_bug.cgi?id=45244
This patch adds a pass that converts lookup tables to relative lookup tables to make them PIC-friendly.
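
As a rough illustration of the idea (not code from this patch), the following C++ sketch packs a table of 32-bit offsets and the strings it refers to into one object, so that each entry is a distance from the start of the table rather than an absolute address. The names `RelTable`, `kTable`, and `lookup` are hypothetical, and the single-struct layout is only a stand-in for what the assembler and linker would resolve with PC-relative relocations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a relative lookup table: the offsets and the
// data they refer to live in one object, so no absolute address is stored.
struct RelTable {
  std::int32_t offsets[3]; // distance of each string from the table start
  char zero[5];
  char one[4];
  char two[4];
};

constexpr RelTable kTable = {
    {static_cast<std::int32_t>(offsetof(RelTable, zero)),
     static_cast<std::int32_t>(offsetof(RelTable, one)),
     static_cast<std::int32_t>(offsetof(RelTable, two))},
    "zero",
    "one",
    "two"};

// Load the 32-bit offset for `index` and add it to the table's base address.
inline const char *lookup(std::size_t index) {
  assert(index < 3 && "index out of range");
  const char *base = reinterpret_cast<const char *>(&kTable);
  return base + kTable.offsets[index];
}
```

In this sketch each entry takes four bytes regardless of pointer width, and the table holds no absolute addresses, which is the property that makes such a table PIC-friendly.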

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

There are a very large number of changes, so older changes are hidden.

leonardchan added inline comments. Jan 14 2021, 11:07 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5740–5747 ↗	(On Diff #316538)	Nit: Since this seems to be the exact same as the start of the `ArrayKind` case, it might be cleaner to move this into a lambda or function.

Add dso_local check
Remove -relative-switch-lookup-table flag, and enable it by default
Use shift instead of mul
Remove /ARM test and merge it with /X86 test

gulfem marked 4 inline comments as done. Jan 14 2021, 6:30 PM

gulfem marked 2 inline comments as done. Jan 14 2021, 6:36 PM

gulfem added inline comments.

llvm/test/Transforms/SimplifyCFG/ARM/switch-to-relative-lookup-table.ll
1 ↗	(On Diff #316538)	"Any reason to have tests both under ARM/ and X86/ for this?" There are two existing switch-to-lookup tests, one under ARM/ and one under X86/ (I don't know the original reason). The ARM/ test is much simpler. I have now put all the relative lookup tests under X86/. "Also I'd like to see a test which shows that the transformation happens for PIC code but not for non-PIC." Could you please clarify that? Do you want to see a test that checks the code for the non-PIC case when -fno-PIC is enabled?

Harbormaster completed remote builds in B85277: Diff 316824. Jan 14 2021, 7:16 PM

hans added inline comments. Jan 15 2021, 1:32 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5469 ↗	(On Diff #316824)	Since this is a helper function used internally by the class, it should be private.
5473 ↗	(On Diff #316824)	Actually, this should probably be private too.
5701 ↗	(On Diff #316824)	It's a little annoying that we have to pass in a Builder here (and also to the SwitchLookupTable class) just to create integer types. I don't think a builder is strictly necessary for that. But more importantly, I'm not sure hard-coding this to 64 bits for the pointer and 32 bits for the offset is correct. Shouldn't this depend on the target?
5740–5747 ↗	(On Diff #316538)	One alternative would also be to bake the cases together: `case ArrayKind: case RelOffsetArrayKind: { ... common stuff ... if (Kind == RelOffsetArrayKind) { ... special stuff ... } }`
llvm/test/Transforms/SimplifyCFG/ARM/switch-to-relative-lookup-table.ll
1 ↗	(On Diff #316538)	"Do you want to see a test that checks the code for non-PIC case when -fno-PIC is enabled?" Yes, exactly. This seems important since we don't want the transformation to run for the no-PIC case (and currently I don't see anything in the code which ensures that). It might help to put the relative lookup table test in a separate file, and use two invocations, one with PIC and one without, and check the expected results with different FileCheck prefixes.
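
The merged-case layout suggested above might look roughly like the following standalone sketch. It is not the SimplifyCFG code: `TableKind`, `LookupTable`, and `lookupValue` are hypothetical names, and plain arrays stand in for the IR that the real code would build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class TableKind { SingleValue, Array, RelOffsetArray };

// Hypothetical table description; only the fields for `kind` are used.
struct LookupTable {
  TableKind kind;
  const char *single;          // SingleValue: the one result
  const char *const *pointers; // Array: absolute pointers
  const std::int32_t *offsets; // RelOffsetArray: offsets from `base`
  const char *base;            // RelOffsetArray: address offsets are relative to
  std::size_t size;            // number of entries for the array kinds
};

inline const char *lookupValue(const LookupTable &table, std::size_t index) {
  switch (table.kind) {
  case TableKind::SingleValue:
    return table.single;
  case TableKind::Array:
  case TableKind::RelOffsetArray: {
    // Common part shared by both array kinds: the bounds check.
    assert(index < table.size && "index out of range");
    if (table.kind == TableKind::RelOffsetArray) {
      // Special part: load an offset and add it to the base address.
      return table.base + table.offsets[index];
    }
    return table.pointers[index];
  }
  }
  return nullptr; // not reached for valid kinds
}
```

Sharing the case body keeps the common setup in one place, at the cost of one extra branch on the kind inside it.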

Some targets already do this. Please check that you don't create regressions, especially on PowerPC.

In D94355#2500795, @joerg wrote:

Some targets already do this. Please check that you don't create regressions, especially on PowerPC.

Joerg: any pointers to where ppc does this and what it does exactly?

It might be specific to the jump table case, but it should be instructional on how to do it. One important point is that it avoids inter-section relocations, which are a problem at least on MIPS.

Many backends generate PIC-friendly jump tables. This is about generating IR initializers that translate to the same kind of backend assembly expressions as those backends use for their jump tables.

On all targets I'm aware of, the only kind of expression involving two symbols like this that can be expressed at all in assembly / relocations is a PC-relative case where one of the symbols is in the same section of the same TU as the data containing the reference (and is not dynamically interposable, i.e. has internal linkage or is dso_local). I don't think mips or powerpc is at all unlike any other platform in this regard. Indeed this applies to targets where the pointer size is not 64 bits, too. But as far as I'm aware, in all cases there is a 32-bit signed offset option for PC-relative references and in many that's the only option. So it probably is reasonable to hard-code the table entry offset size as 32 bits, though the pointer size obviously should be whatever is the size of pointers on the particular target.

It's possible the 32-bit offsets won't work in medium or large code models on 64-bit machines. On some machines that have code models where 32-bit offsets are not always sufficient, there are 64-bit PC-relative relocs you can use, so the same method with 64-bit table entries is feasible. But I'm not sure that's the case for all 64-bit machines (it is true of aarch64 and x86-64 when using ELF, but I don't know others offhand). So it might be necessary to parameterize either enabling the PIC-friendly flavor of the optimization, or the offset size to use in table entries, or both, based on the target and the code model setting as interpreted for each particular target.
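
One way to picture the constraint: a 32-bit table entry can only hold distances that fit in a signed 32-bit integer. The helper below is a hypothetical sketch (the name `fitsInSigned32` and the use of plain integers as addresses are assumptions for illustration), not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if the distance from `tableAddr` to `targetAddr` can be stored
// in a signed 32-bit table entry.
constexpr bool fitsInSigned32(std::uint64_t tableAddr,
                              std::uint64_t targetAddr) {
  // Unsigned subtraction wraps; the cast reinterprets the result as a signed
  // distance (two's complement is assumed here).
  return static_cast<std::int64_t>(targetAddr - tableAddr) >=
             std::numeric_limits<std::int32_t>::min() &&
         static_cast<std::int64_t>(targetAddr - tableAddr) <=
             std::numeric_limits<std::int32_t>::max();
}
```

Whether every table-to-target distance in a program satisfies this depends on how far apart code and data may be placed, which is why the discussion ties the 32-bit choice to the code model.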

Apply relative lookup table generation in fPIC mode
Add a test case for single value case
Remove hard-coding 64 bit pointers
Improve implementation and test coverage

Herald added a project: Restricted Project. Jan 21 2021, 5:59 PM

Herald added a subscriber: cfe-commits.

gulfem marked 4 inline comments as done. Jan 21 2021, 6:03 PM

gulfem marked 2 inline comments as done. Jan 21 2021, 6:49 PM

Harbormaster completed remote builds in B86201: Diff 318372. Jan 21 2021, 9:10 PM

Almost there! Just a few more comments on my end.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5694 ↗	(On Diff #318372)	I think we should also return false if it doesn't turn out to be a GlobalValue here.
llvm/test/Transforms/SimplifyCFG/X86/switch_to_relative_lookup_table.ll
201–203 ↗	(On Diff #318372)	It looks like this might not be testing what the comment says it should. It looks like in the phi statements below that each of the cases have different values. I think maybe you'll want something like: %str1.0 = phi i8* [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), %sw.default ], [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), %sw.bb2 ], [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), %sw.bb1 ], [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), %entry ] ret i8* %str1.0 ; instead of `ret void` to ensure the `%str1.0` isn't optimized out Then ensure this just lowers directly to a GEP for `@.str` and no lookup table is generated.

hans added inline comments. Jan 25 2021, 2:31 AM

clang/test/CodeGen/switch-to-lookup-table.c
2 ↗	(On Diff #318372)	Clang codegen tests are not normally used to test LLVM optimizations. I think the tests for this should all live in LLVM, not Clang. (Also aarch64 is not guaranteed to be included as a target in the LLVM build, so this test would not necessarily run.)
6 ↗	(On Diff #318372)	This table and the one below are very hard to read like this. Could you split it into multiple lines using FNOPIC-SAME?
36 ↗	(On Diff #318372)	I think the minimum number of cases for the switch-to-lookup table transformation is only 4 or 5. To make the test easier to read, I'd suggest using the minimum number of cases in the switch.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5658 ↗	(On Diff #318372)	This comment seems unnecessary, at this point we know we're generating the relative table.
5695 ↗	(On Diff #318372)	I don't remember, will isDSOLocal() return true also if it's a private or internal symbol? Otherwise maybe this should check isLocalLinkage() also.
5709 ↗	(On Diff #318372)	I do worry about hard-coding this to 32 bits. As someone pointed out, it would not necessarily hold in all code models for x86. Similarly to the PIC check in ShouldBuildRelLookupTable(), is there some check we could do to make sure 32 bits is appropriate?
5712 ↗	(On Diff #318372)	The Builder points to a specific insertion point in a basic block for the lookup, so it knows the Module and adding the Module parameter is redundant.
llvm/test/Transforms/SimplifyCFG/X86/switch_to_relative_lookup_table.ll
7 ↗	(On Diff #318372)	Same comment as I made for the test under clang/ above: I think fewer switch cases are probably enough to test this, and would make it easier to read. Also splitting the lookup tables over multiple lines would help too.

Simplified test cases and increased readability of the tables
Added x86_64 and aarch64 check and tiny or small code model check to ensure 32-bit offsets
Modified single value test case

gulfem marked 9 inline comments as done. Jan 28 2021, 7:03 PM

gulfem added inline comments.

clang/test/CodeGen/switch-to-lookup-table.c
2 ↗	(On Diff #318372)	I'm not able to use the -fPIC and -fno-PIC options in the `opt` tool. I am setting the `PIC Level` flag to enable -fPIC in `opt`. I thought that testing -fPIC and -fno-PIC in the same file seems easier and more readable in CodeGen tests. Please let me know if you have a good suggestion for how to do that with `opt`. I changed the target to `x86_64-linux` in this test.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5709 ↗	(On Diff #318372)	I added an `x86_64` and `aarch64` target check and a tiny or small code model check to ensure 32-bit offsets. Please let me know if you have any other concerns about that.

Harbormaster completed remote builds in B87108: Diff 320025. Jan 28 2021, 8:08 PM

Can you please add an explanation to the patch's description as to why
we don't want to instead convert non-relative/relative LUT's elsewhere,
please.

lebedev.ri added inline comments. Jan 29 2021, 1:38 AM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5512–5525 ↗	(On Diff #320025)	This should be some TLI/TTI hook.

In D94355#2530225, @lebedev.ri wrote:

Can you please add an explanation to the patch's description as to why
we don't want to instead convert non-relative/relative LUT's elsewhere,
please.

@mcgrathr gave some explanation to that:

Many backends generate PIC-friendly jump tables. This is about generating IR initializers that translate to the same kind of backend assembly expressions as those backends use for their jump tables.

I also want to add to that:
This task specifically tries to make switch-to-lookup tables PIC-friendly, but it does not necessarily try to make all the jump tables PIC-friendly.
There is also other work under way to generate PIC-friendly vtables.
Therefore, converting non-relative lookup tables to relative tables elsewhere can be explored as a separate task.

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
5512–5525 ↗	(On Diff #320025)	"This should be some TLI/TTI hook." Could you please elaborate on that? Are you talking about getting the PIC level?

In D94355#2531504, @gulfem wrote:

In D94355#2530225, @lebedev.ri wrote:

Can you please add an explanation to the patch's description as to why
we don't want to instead convert non-relative/relative LUT's elsewhere,
please.

@mcgrathr gave some explanation to that:

Many backends generate PIC-friendly jump tables. This is about generating IR initializers that translate to the same kind of backend assembly expressions as those backends use for their jump tables.

I also want to add to that:
This task specifically tries to make switch-to-lookup tables PIC-friendly, but it does not necessarily try to make all the jump tables PIC-friendly.
There is also other work under way to generate PIC-friendly vtables.
Therefore, converting non-relative lookup tables to relative tables elsewhere can be explored as a separate task.

Personally, I read that as a non-answer,
because this just reiterates that it can be done elsewhere,
and doesn't answer my question as to why that isn't the path taken.

First of all, I find this patch to be nearly impossible to read. It seems to mix a lot of refactoring with a functional change, making it very hard to focus on the core.

The main difference to the jump table logic is that the latter knows that all referenced addresses are within a function and therefore well contained. Nothing of the like seems to be found here. E.g. if this is supposed to address only unnamed pointers, it should be grouping them together and compute the offsets and then pick the optimal size. That's a transformation that can be beneficial for all modes for a not too large table. But it is hard to see what is going on here with all the seemingly unrelated changes.

Implement it as a separate pass and apply it to user-defined lookup tables as well.

Herald added a subscriber: mgorny. Feb 16 2021, 6:02 PM

@lebedev.ri based on your feedback, I added it as a separate pass and added support for user-defined lookup tables.
Please let me know if you have any comments.

Harbormaster completed remote builds in B89456: Diff 324144. Feb 16 2021, 6:45 PM

gulfem retitled this revision from "[SimplifyCFG] Add relative switch lookup tables" to "[Passes] Add relative lookup table generator pass". Feb 17 2021, 8:54 AM

gulfem edited the summary of this revision.

Thanks for pushing this forward! I think this will be a nice transformation once all the details are worked out.

clang/test/CodeGen/switch-to-lookup-table.c
2 ↗	(On Diff #318372)	Buildbots may not have x86_64 as a registered target, so this test will break some buildbots. I think the opt flags -relocation-model=pic and -relocation-model=static will do the right thing (see for example llvm/test/Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll).
llvm/include/llvm/InitializePasses.h
321	In some places the pass is referred to as a generator and here it's a converter. I think converter is a better name, and it should be consistent.
llvm/include/llvm/Transforms/Utils/RelLookupTableGenerator.h
10 ↗	(On Diff #324144)	For this kind of pass, I think it would be helpful to have a small example in the comment that shows what kind of IR it's looking for, and what it will be transformed to. (Either here or in the .cpp file.)
22 ↗	(On Diff #324144)	No need to call it simple :)
33 ↗	(On Diff #324144)	As clang-tidy points out, the comment doesn't match the actual macro name.
llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp
25 ↗	(On Diff #324144)	Since the functions below are only used within this file, they should be static or in an anonymous namespace. Since you're already using an anonymous namespace for the RelLookupTableConverterPass class, I'd suggest using that for these functions too.
28 ↗	(On Diff #324144)	This explains the "what" of what the code does, but not the "why". Why should the transformation only run for these two targets?
34 ↗	(On Diff #324144)	This also needs the "why".
39 ↗	(On Diff #324144)	Again, this needs the "why". And perhaps it would make sense to put this check first.
58 ↗	(On Diff #324144)	The comment doesn't match the code exactly I think, since further down you also allow GetElementPtr expressions. Maybe the comment could be clearer.
78 ↗	(On Diff #324144)	Should it be hasOneUse() or hasOneUser()? Also maybe this check could come before the for-loop, since it's cheaper and may return early. There probably also needs to be a check that the global has local linkage, otherwise it could be referenced outside the module.
88 ↗	(On Diff #324144)	if you know that the initializer is a ConstantArray, you can use cast<> instead of dyn_cast<> here
91 ↗	(On Diff #324144)	The int32 type here has been mentioned in reviews before. I think at least, the code needs to have a good motivation for why 32 bits is enough.
119 ↗	(On Diff #324144)	The "can" part of the name seems misleading, since this doesn't return true or false, but actually tries to build the lookup table (and possibly returns null if it can't). I'd just drop "can" from the name. Oh, but there is already a generateRelLookupTable() function.. well, maybe there is a better name for one of them.
122 ↗	(On Diff #324144)	This code which checks that the user of the table is a GEP, followed by a load, etc. feels like a continuation of the checks in shouldGenerateRelLookupTableForGlobal(). I would suggest just moving those checks into this function.
125 ↗	(On Diff #324144)	getNextNode() doesn't seem right. It gets the next instruction, but that is not necessarily the same as the instruction using the GEP. Also, the code probably needs to check that the GEP only has one use
129 ↗	(On Diff #324144)	(nit: commonly, llvm code uses an M variable for the module)
166 ↗	(On Diff #324144)	I'd suggest moving this to the top of the function, before even declaring Changed.
169 ↗	(On Diff #324144)	This would be simpler as for (GlobalVariable *GlobalVar : M.globals()) {
llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll
2 ↗	(On Diff #324144)	(you could skip the -mtriple argument here since there is a "target triple" line in the IR below)

It looks like you have everything set up for running on the new PM, but it doesn't look like the pass is added anywhere in the new PM pipeline. Unfortunately, I don't *think* there's anything in the new PM that acts as a "default pipeline that gets added to all other pipelines" similar to PassManagerBuilder::populateModulePassManager, so we'll need to manually include this somewhere in PassBuilder::buildO0DefaultPipeline and PassBuilder::buildPerModuleDefaultPipeline.

Depending on if we also want to support this in [thin]LTO, we may need to add this to more pipelines.

llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp
46 ↗	(On Diff #324144)	We should also check if the switch table itself is dso_local since the right relocation won't be generated if it isn't.
51 ↗	(On Diff #324144)	`cast` since we checked before this is a ConstantArray. Alternatively, what you could have is: if (!GlobalVar.hasInitializer()) return false; ConstantArray *Array = dyn_cast<ConstantArray>(GlobalVar.getInitializer()); if (!Array || !Array->getType()->getElementType()->isPointerTy()) return false;
61 ↗	(On Diff #324144)	Rather than checking the linkage explicitly, you can use `isImplicitDSOLocal` which also has some visibility checks. GlobalVarOp->isDSOLocal() || GlobalVarOp->isImplicitDSOLocal()
64–70 ↗	(On Diff #324144)	It seems that with this, we're limiting this to only arrays with GEPs with globals as the base, but I think this will return false if the array element is just a dso_local global. We definitely should still be taking into account GEPs though. I'm thinking `IsConstantOffsetFromGlobal` might be more useful here since it already contains a bunch of logic for handling ConstantExpr GEPs, then you can check if the global found by that is dso_local.
76–77 ↗	(On Diff #324144)	What's the reason for why the number of users matters?
92–95 ↗	(On Diff #324144)	I think the visibility and linkage should be set the same as those of the original lookup table. I think to avoid many changes to the original lookup table and only focus on the new layout, we should also propagate any properties of the original table to the new relative table. That is, visibility, linkage, attributes, unnamed_addr, etc. should be copied from the original. (For copying attributes, you can use `copyAttributesFrom`.)
llvm/test/Transforms/SimplifyCFG/X86/relative_lookup_table.ll
31 ↗	(On Diff #324144)	We should also have cases that cover other linkages/visibilities: (1) if the table elements are `extern dso_local` or `extern hidden`, we should still expect a relative lookup table; (2) if the switch table is not dso_local/hidden, we shouldn't expect a relative lookup table.

Rename the pass to RelLookupTableConverter to be consistent
Addressed reviewers' feedback
Added tests for user-defined lookup tables and hidden visibility

Herald added subscribers: wenlei, nikic, kerbowa and 3 others. Feb 24 2021, 6:45 PM

gulfem retitled this revision from "[Passes] Add relative lookup table generator pass" to "[Passes] Add relative lookup table converter pass". Feb 24 2021, 6:47 PM

gulfem marked 17 inline comments as done. Feb 24 2021, 6:54 PM

gulfem added inline comments.

clang/test/CodeGen/switch-to-lookup-table.c
2 ↗	(On Diff #318372)	I added `x86-registered-target` to ensure that it only runs on buildbots that have `x86_64` as a registered target
llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp
39 ↗	(On Diff #324144)	I added the reason, please let me know if that's not clear enough.
169 ↗	(On Diff #324144)	Please keep in mind that I'm deleting a global variable while I'm iterating over the global variables. I don't want to end up with an invalidated iterator, and this is why I did not use a simple for loop.
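
The iterator concern can be illustrated outside LLVM with a standard container. This is an analogy only: `std::list` stands in for the module's list of globals, and `eraseNegatives` is a hypothetical helper. The idea is to advance the iterator before the current element may be erased.

```cpp
#include <cassert>
#include <list>

// Erase matching elements while walking the list. The iterator is advanced
// before the erase, so it never refers to a removed element.
inline int eraseNegatives(std::list<int> &values) {
  int erased = 0;
  for (auto it = values.begin(); it != values.end();) {
    auto current = it++; // step past the element first
    if (*current < 0) {
      values.erase(current); // only `current` is invalidated
      ++erased;
    }
  }
  return erased;
}
```

LLVM's `make_early_inc_range`, suggested later in this review, packages the same advance-first pattern so that a range-based for loop can be used.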

Thanks for pushing this forward! I think this will be a nice transformation once all the details are worked out.

Thank you very much for all of your wonderful constructive feedback!
I learned much more about LLVM and IR internals.
Appreciate all your help!

In D94355#2572553, @leonardchan wrote:

It looks like you have everything setup for running on the new PM, but it doesn't look like the pass is added anywhere in the new PM pipeline.

Thank you very much for pointing that out @leonardchan !
I added this pass into both pass managers now.

gulfem added inline comments. Feb 24 2021, 7:09 PM

llvm/lib/Transforms/Utils/RelLookupTableGenerator.cpp
76–77 ↗	(On Diff #324144)	We generate one lookup table per switch statement whenever possible. I think there is only one use of that lookup table, which is the `GetElementPtr` instruction. That is why I checked for one use. Do you see any issue with that?

Harbormaster completed remote builds in B90728: Diff 326265. Feb 24 2021, 9:59 PM

gulfem updated this revision to Diff 328354. Mar 4 2021, 6:29 PM

Use TTI hook for target machine checks

Harbormaster completed remote builds in B92195: Diff 328354. Mar 5 2021, 11:55 AM

One thing that just occurred to me: do we also perhaps want to hide this behind a flag? Right now it's being added to various default optimization pipelines, so some users might be surprised if they suddenly see their lookup tables change (either compiler or user generated). Do we want to make this something more opt-in, or is the layout itself not necessarily a concern to most users?

clang/test/CodeGen/switch-to-lookup-table.c
2 ↗	(On Diff #318372)	+1 on this. Unless this functionally changes something in the clang codebase, this test shouldn't be here. As hans pointed out, setting the `-relocation-model` should be enough.
llvm/include/llvm/CodeGen/BasicTTIImpl.h
384	Sorry, I think you might have explained this offline, but what was the reason for needing this in TTI? I would've thought this information could be found in the `Module` (PIC/no PIC, 64-bit or not, code model). If it turns out all of this is available in `Module`, then we could greatly simplify some logic here by just checking this at the start of the pass run. If TTI is needed, then perhaps it may be better to just inline all these checks in `convertToRelativeLookupTables` since this is the only place this is called. I think we would only want to keep this here as a virtual method if we plan to have multiple TTI-impls overriding this.
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
880	Should this be added at the end of the pipeline? It could be possible that other passes insert lookup tables but they may be untouched by this if this pass runs before them.
llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
31–32	Nit: `//`
41	`isa`
54–71	I think I mentioned this in a previous round of comments, but what you probably want here is just a way to determine if the operand is either a global or some constant offset from a global. Right now it looks like this will ignore other constant expressions like bitcasts or ptrtoints which we also want to catch. I think it might be better to use `IsConstantOffsetFromGlobal` which already handles these cases.
80	`cast`
219	Should we be returning the value returned by this?

Used IsConstantOffsetFromGlobal() function
Added this pass to the end of legacy pass manager pipeline
Moved all the tests under a new test directory

gulfem marked 7 inline comments as done. Mar 12 2021, 6:12 PM

gulfem added inline comments.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
384	Code model or PIC/noPIC is only set in the module if the user explicitly specifies them. TTI hook is necessary to access target machine information like the default code model. TTI is basically the interface to communicate between IR transformations and codegen. I think the checks cannot be moved into the pass because it uses the TargetMachine which is only available/visible in the TTI implementation itself. That's why I added a function into TTI to do target machine specific checks. Similar approach is used during lookup table generation (`shouldBuildLookupTables`) to check codegen info.

Harbormaster completed remote builds in B93629: Diff 330405. Mar 12 2021, 7:15 PM

LGTM pending a few more comments. Should also give some time to let others respond if they have feedback.

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
41–42	Not needed here since you have the `isa` below.
llvm/test/Transforms/RelLookupTableConverter/relative_lookup_table.ll
2–3	We should also check some other `RUN`s to check that this isn't run on cases that return false in `shouldBuildRelLookupTables`: non-PIC, non-64-bit, other code model sizes, etc.
llvm/test/Transforms/RelLookupTableConverter/switch_relative_lookup_table.ll
39	It looks like this test case isn't much different from `string_table` in `relative_lookup_table.ll`? If so, then this file could be removed.
llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
64	Good that you added this, but I think Nico has a bot that automatically updates these BUILD.gn files so manually updating them may not be necessary.

This revision is now accepted and ready to land. Mar 15 2021, 11:56 AM

lebedev.ri added inline comments. Mar 15 2021, 12:04 PM

llvm/include/llvm/CodeGen/BasicTTIImpl.h
390–391	But all tests are using the `x86_64` triple? This is somewhat backwards. If the target wants to disable this, it will need to override this function with `return false;`.

Sorry for being unresponsive for a while, I got distracted by various bugs.

I skimmed this and it's looking great. Just added a few nit picks.

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
32	It would be better if the comment said why. I suppose the reason is we need to be sure there aren't other uses of the table, because then it can't be replaced. But it would be cool if a future version of this patch could handle when there are multiple loads from the table which can all be replaced -- for example this could happen if a function which uses a lookup table gets inlined into multiple places.
47	It would be good with a "why" here too.

Add no_relative_lookup_table.ll test case that checks the cases where relative lookup table should not be generated
Add comments about dso_local check and single use check

gulfem marked 4 inline comments as done. Mar 16 2021, 6:10 PM

gulfem added inline comments.

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
32	I actually ran into the exact case that you described during testing, where a function that uses a switch gets inlined into multiple call sites :) This is only to simplify the analysis, and I now added a TODO section to explore that later.
llvm/test/Transforms/RelLookupTableConverter/switch_relative_lookup_table.ll
39	I renamed this test case to no_relative_lookup_table.ll that checks the cases where relative lookup table should not be generated like in non-pic mode, medium or large code models, and 32 bit architectures, etc.

gulfem marked 2 inline comments as done. Mar 16 2021, 6:20 PM

gulfem added inline comments.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
390–391	Although I used the `x86_64` triple, this optimization can be applied to other 64-bit architectures too, because it is not target-dependent apart from the `isArch64Bit` and `getCodeModel` checks. Is there a target that you have in mind for which we need to disable this optimization? I thought that it makes sense to enable this optimization by default on all the targets that can support it. In case targets want to disable it, they can override it as you said. How can we improve the implementation? If you have suggestions, I'm happy to incorporate them.
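
The shape of such a hook, with a default that targets can override, might be sketched as follows. Everything here is a simplified stand-in: `TargetInfo`, `TargetHooks`, `CodeModel`, and `NoRelTablesTarget` are hypothetical types, and only the method name `shouldBuildRelLookupTables` and the three conditions (PIC, 64-bit, tiny or small code model) are taken from the discussion.

```cpp
#include <cassert>

enum class CodeModel { Tiny, Small, Medium, Large };

// Hypothetical summary of the target properties the hook consults.
struct TargetInfo {
  bool isPIC;
  bool is64Bit;
  CodeModel codeModel;
};

class TargetHooks {
public:
  explicit TargetHooks(TargetInfo info) : info_(info) {}
  virtual ~TargetHooks() = default;

  // Default policy: enabled whenever the generic conditions hold.
  virtual bool shouldBuildRelLookupTables() const {
    if (!info_.isPIC)
      return false; // relative tables only pay off for PIC code
    if (!info_.is64Bit)
      return false;
    // 32-bit offsets are only assumed sufficient in tiny/small code models.
    return info_.codeModel == CodeModel::Tiny ||
           info_.codeModel == CodeModel::Small;
  }

private:
  TargetInfo info_;
};

// A target that cannot emit the needed relocations opts out by overriding.
class NoRelTablesTarget : public TargetHooks {
public:
  using TargetHooks::TargetHooks;
  bool shouldBuildRelLookupTables() const override { return false; }
};
```

With this layout the optimization is on by default for every target that meets the generic conditions, and a single override is enough for a target to turn it off.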

Harbormaster completed remote builds in B94151: Diff 331139. Mar 16 2021, 6:59 PM

@lebedev.ri do you have further comments?
If not, I would like to submit this patch.

lebedev.ri added inline comments. Mar 19 2021, 12:01 PM

llvm/include/llvm/CodeGen/BasicTTIImpl.h
390–391	I'm sorry, I do not understand. Why does the `!TM.getTargetTriple().isArch64Bit()` check exist? To me it reads as "if we aren't compiling for AArch64, don't build rel lookup tables". Am I misreading this?

gulfem added inline comments. Mar 19 2021, 2:19 PM

llvm/include/llvm/CodeGen/BasicTTIImpl.h
390–391	`isArch64Bit` checks whether we have a 64-bit architecture, right? I don't think it specifically checks for `AArch64`, and it can cover other 64-bit architectures like `x86_64` as well.

Thank you!

llvm/include/llvm/CodeGen/BasicTTIImpl.h
390–391	"isArch64Bit checks whether we have a 64-bit architecture, right?" D'oh. I really did read it as AArch64 :/ Sorry.
llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
173–174	`make_early_inc_range()`?
201	I would think this should be `PreservedAnalyses PA; PA.preserveSet<CFGAnalyses>(); return PA;` since this doesn't touch the CFG at all. I think this should get rid of the redundant `Running analysis: TargetIRAnalysis`.

Closed by commit rG78a65cd945d0: [Passes] Add relative lookup table converter pass (authored by gulfem). Mar 22 2021, 3:09 PM

This revision was automatically updated to reflect the committed changes.

gulfem added a commit: rG78a65cd945d0: [Passes] Add relative lookup table converter pass.

thakis added a subscriber: thakis. Mar 22 2021, 3:36 PM

thakis added inline comments.

llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
64	Please don't touch gn files unless you use them. Simple file additions in cmake files are synced automatically (http://github.com/llvmgnsyncbot). You forgot to add the trailing comma and now I had to fix it up manually instead of doing nothing :P

gulfem added inline comments. Mar 22 2021, 3:50 PM

llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn
64	Sorry about that Niko. I can fix it, so you don't need to do anything. Leo actually pointed that out, but I thought that manually changing it won't do any harm. Apparently it did though!

gulfem added a reverting change: rGe3a6d70c6834: Revert "[Passes] Add relative lookup table converter pass". Mar 22 2021, 5:44 PM

arichardson added a subscriber: arichardson. Mar 23 2021, 1:37 AM

arichardson added inline comments.

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
92	This should use the same address space as the original global.

gulfem added a commit: rG5fd001a5ffba: [Passes] Add relative lookup table converter pass. Mar 24 2021, 10:31 AM

gulfem added a reverting change: rG5fbe1fdf1702: Revert "[Passes] Add relative lookup table converter pass". Mar 24 2021, 12:01 PM

gulfem added a commit: rG5178ffc7cf92: [Passes] Add relative lookup table converter pass. Mar 29 2021, 3:12 PM

It looks like this is breaking the Windows/ARM(64) target - it doesn't produce the right relative relocations for symbol differences. It can be reproduced with a testcase like this:

$ cat test.s
        .text
func1:
        ret
func2:
        ret

        .section        .rdata,"dr"
        .p2align        2
table:
        .long   func1-table
        .long   func2-table
$ clang -target aarch64-windows -c -o - test.s | llvm-objdump -r -s -

<stdin>:        file format coff-arm64

RELOCATION RECORDS FOR [.rdata]:
OFFSET           TYPE                     VALUE
0000000000000000 IMAGE_REL_ARM64_ADDR32   func1
0000000000000004 IMAGE_REL_ARM64_ADDR32   func2

Contents of section .text:
 0000 c0035fd6 c0035fd6                    .._..._.
Contents of section .rdata: 
 0000 00000000 04000000                    ........

Those relocations would need to be IMAGE_REL_ARM64_REL32. It looks like the arm/windows target has got the same issue as well.

Would you be ok with reverting this change until I can sort that out, or can we disable the pass for those targets until then?
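
To make the `.long func1-table` entries in the test case above concrete, here is a rough sketch in plain C++ of what a relative lookup table stores and how an entry is resolved. It assumes the table and its targets live in one object, so the distances between them are fixed regardless of load address. `RelTable`, `makeRelTable` and `lookup` are illustrative names only; this is not what the pass emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Table and payload share one object, so "target - table" is a constant
// distance that survives relocation (here: copying the whole object).
struct RelTable {
  std::int32_t Offsets[2]; // analogue of `.long func1-table`, `.long func2-table`
  char First[4];           // holds "one"
  char Second[4];          // holds "two"
};

RelTable makeRelTable() {
  RelTable T{};
  std::memcpy(T.First, "one", 4);
  std::memcpy(T.Second, "two", 4);
  // Each entry is the signed 32-bit distance from the table to its target.
  T.Offsets[0] = static_cast<std::int32_t>(offsetof(RelTable, First) -
                                           offsetof(RelTable, Offsets));
  T.Offsets[1] = static_cast<std::int32_t>(offsetof(RelTable, Second) -
                                           offsetof(RelTable, Offsets));
  return T;
}

// Resolve an entry: address of the table plus the stored offset.
const char *lookup(const RelTable &T, std::size_t Index) {
  assert(Index < 2 && "index out of range");
  const char *Base =
      reinterpret_cast<const char *>(&T) + offsetof(RelTable, Offsets);
  return Base + T.Offsets[Index];
}
```

Because only distances are stored, a byte-for-byte copy of the object still resolves correctly. That is the property an absolute relocation such as `IMAGE_REL_ARM64_ADDR32` does not provide.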

mstorsjo mentioned this in D99572: [AArch64] [COFF] Properly produce cross-section relative relocations. Mar 30 2021, 3:26 AM

In D94355#2657881, @mstorsjo wrote:

It looks like this is breaking the Windows/ARM(64) target - it doesn't produce the right relative relocations for symbol differences. It can be reproduced with a testcase like this:

[...]

Those relocations would need to be IMAGE_REL_ARM64_REL32. It looks like the arm/windows target has got the same issue as well.

Would you be ok with reverting this change until I can sort that out, or can we disable the pass for those targets until then?

It turned out to not be all that hard to fix actually, see D99572 for such a fix. If I can get that landed soon, I think we might not need to act on this one.

krasimir added a reverting change: rGc51e91e04681: Revert "[Passes] Add relative lookup table converter pass". Mar 30 2021, 5:15 AM

rnk added subscribers: aeubanks, rnk. Mar 31 2021, 1:16 PM

rnk added inline comments.

llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
322–323	Putting a ModulePass in the middle of the CodeGen pass pipeline creates a "pass barrier": now instead of applying every pass to each function in turn, the old pass manager will stop, run this whole-module pass, and then run subsequent passes in the next function pass manager on each function in turn. This isn't ideal. @aeubanks, can you follow up to make sure this is addressed? We had the same issues with the SymbolRewriter pass, which if you grep for "Rewrite Symbols" you can see has the same issue. I remember writing a patch to fix it, but I guess I never landed it.

aeubanks added inline comments. Mar 31 2021, 1:52 PM

llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
322–323	I see "Rewrite Symbols" in the codegen pipeline and yeah it's splitting the function pass manager. For this patch, can we just not add the pass to the legacy PM pipeline? It's deprecated and the new PM is already the default for the optimization pipeline.

aeubanks added inline comments. Mar 31 2021, 11:31 PM

llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
322–323	(https://reviews.llvm.org/D99707 for anybody interested)

Would you be ok with reverting this change until I can sort that out, or can we disable the pass for those targets until then?

I will disable the pass for those targets for now.
When the issue is resolved, I would like to enable it for those targets as well.

llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
322–323	> For this patch, can we just not add the pass to the legacy PM pipeline? It's deprecated and the new PM is already the default for the optimization pipeline.

@rnk @aeubanks If it causes issues, I'm OK with removing it from the legacy PM pipeline. When I land this patch, I'll only add it to the new PM.

gulfem added a commit: rGe96df3e531f5: [Passes] Add relative lookup table converter pass. Apr 12 2021, 6:30 PM

mstorsjo mentioned this in rGd5c5cf5ce8d9: [AArch64] [COFF] Properly produce cross-section relative relocations. Apr 14 2021, 2:57 AM

In D94355#2665532, @gulfem wrote:

Would you be ok with reverting this change until I can sort that out, or can we disable the pass for those targets until then?

I will disable the pass for those targets for now.
When the issue is resolved, I would like to enable it for those targets as well.

FYI, I pushed the fix for the aarch64-coff issue now (D99572, rGd5c5cf5ce8d921fc8c5e1b608c298a1ffa688d37) and pushed another commit to remove the code for disabling the pass on aarch64 (rG57b259a852a6383880f5d0875d848420bb3c2945).

FYI, I pushed the fix for the aarch64-coff issue now (D99572, rGd5c5cf5ce8d921fc8c5e1b608c298a1ffa688d37) and pushed another commit to remove the code for disabling the pass on aarch64 (rG57b259a852a6383880f5d0875d848420bb3c2945).

Thank you @mstorsjo!

jsji mentioned this in D100584: [PowerPC] Disable relative lookup table converter pass for AIX. Apr 15 2021, 11:10 AM

jsji mentioned this in rGd88d8c5b8607: [PowerPC] Disable relative lookup table converter pass for AIX. Apr 19 2021, 12:28 PM

This patch breaks a two stage build with LTO:

$ git bisect log
# bad: [f0bc2782f281ca05221d2f1735bbaff6c4b81ebb] [TTI] NFC: Remove unused 'OptSize' parameter from shouldMaximizeVectorBandwidth
# good: [9829f5e6b1bca9b61efc629770d28bb9014dec45] [CVP] @llvm.[us]{min,max}() intrinsics handling
git bisect start 'f0bc2782f281ca05221d2f1735bbaff6c4b81ebb' '9829f5e6b1bca9b61efc629770d28bb9014dec45'
# bad: [1af35e77f4b8c3314dc20a10d579b52f22c75a00] [TTI] NFC: Change getVectorInstrCost to return InstructionCost
git bisect bad 1af35e77f4b8c3314dc20a10d579b52f22c75a00
# bad: [45f8946a759a780e6131256d6d206977b9c128ee] [CodeView] Fix the ARM64 CPUType enum
git bisect bad 45f8946a759a780e6131256d6d206977b9c128ee
# good: [0b439e4cc9dbb5c226121383b84d4f48ab669c55] [libc++] Split std::allocator out of <memory>
git bisect good 0b439e4cc9dbb5c226121383b84d4f48ab669c55
# good: [2eb98d89ac866e32cb56727174e4d1c1413479c8] [mlir][spirv] Allow bitwidth emulation on runtime arrays
git bisect good 2eb98d89ac866e32cb56727174e4d1c1413479c8
# bad: [80aa9b0f7b3ebe53220a398b2939610d8a49e24b] [PowerPC] stop reverse mem op generation for some cases.
git bisect bad 80aa9b0f7b3ebe53220a398b2939610d8a49e24b
# good: [a8ab1f98d22cf15f39dd1c2ce77675e628fceb31] [Evaluator] Look through invariant.group intrinsics
git bisect good a8ab1f98d22cf15f39dd1c2ce77675e628fceb31
# bad: [30f591c3869f3bbe6eca1249dcef1b8337312de6] [lldb] Disable TestLaunchProcessPosixSpawn.py with reproducers
git bisect bad 30f591c3869f3bbe6eca1249dcef1b8337312de6
# good: [6c4f2508e4278ac789230cb05f2bb56a8a7297dc] Revert "[lldb] [gdb-remote client] Refactor handling qSupported"
git bisect good 6c4f2508e4278ac789230cb05f2bb56a8a7297dc
# bad: [e96df3e531f506eea75da0f13d0f8aa9a267f975] [Passes] Add relative lookup table converter pass
git bisect bad e96df3e531f506eea75da0f13d0f8aa9a267f975
# good: [1310a19af06262122a6e9e4f6fbbe9c39ebad76e] [mlir] Use MCJIT to fix integration tests
git bisect good 1310a19af06262122a6e9e4f6fbbe9c39ebad76e
# first bad commit: [e96df3e531f506eea75da0f13d0f8aa9a267f975] [Passes] Add relative lookup table converter pass

My reproduction steps, apologies if they are not reduced enough:

$ mkdir -p build/stage{1,2}

$ cmake \
   -B build/stage1 \
   -G Ninja \
   -DCMAKE_C_COMPILER=$(command -v clang) \
   -DCMAKE_CXX_COMPILER=$(command -v clang++) \
   -DLLVM_USE_LINKER=$(command -v ld.lld) \
   -DLLVM_ENABLE_PROJECTS="clang;lld" \
   -DLLVM_TARGETS_TO_BUILD=host \
   -DLLVM_CCACHE_BUILD=ON \
   -DCMAKE_BUILD_TYPE=Release \
   llvm
...

$ ninja -C build/stage1 all
...

$ cmake \
   -B build/stage2 \
   -G Ninja \
   -DCMAKE_AR=$PWD/build/stage1/bin/llvm-ar \
   -DCMAKE_C_COMPILER=$PWD/build/stage1/bin/clang \
   -DCLANG_TABLEGEN=$PWD/build/stage1/bin/clang-tblgen \
   -DCMAKE_CXX_COMPILER=$PWD/build/stage1/bin/clang++ \
   -DLLVM_USE_LINKER=$PWD/build/stage1/bin/ld.lld \
   -DLLVM_TABLEGEN=$PWD/build/stage1/bin/llvm-tblgen \
   -DCMAKE_RANLIB=$PWD/build/stage1/bin/llvm-ranlib \
   -DLLVM_ENABLE_PROJECTS=clang \
   -DLLVM_TARGETS_TO_BUILD=host \
   -DCMAKE_BUILD_TYPE=Release \
   -DLLVM_ENABLE_LTO=Full \
   llvm

$ ninja -C build/stage2 lib/libclang-cpp.so.13git
...
[2296/2296] Linking CXX shared library lib/libclang-cpp.so.13git
FAILED: lib/libclang-cpp.so.13git
...
ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<3600l, 1l> >::value; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.data.rel.ro..Lreltable._ZN5clang10TargetInfo21getTypeFormatModifierENS_23TransferrableTargetInfo7IntTypeE+0x8)

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<3600l, 1l> >::value; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.data.rel.ro..Lreltable._ZN5clang10TargetInfo21getTypeFormatModifierENS_23TransferrableTargetInfo7IntTypeE+0xC)

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<3600l, 1l> >::value; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.data.rel.ro..Lreltable._ZNK5clang21analyze_format_string14LengthModifier8toStringEv+0x8)

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<60l, 1l> >::value; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.data.rel.ro..Lreltable._ZNK5clang21analyze_format_string14LengthModifier8toStringEv+0x3C)

ld.lld: error: relocation R_X86_64_PC32 cannot be used against symbol llvm::detail::unit<std::ratio<3600l, 1l> >::value; recompile with -fPIC
>>> defined in lto.tmp
>>> referenced by ld-temp.o
>>>               lto.tmp:(.data.rel.ro..Lreltable._ZN4llvm15MCSymbolRefExpr18getVariantKindNameENS0_11VariantKindE+0xC0)
clang-13: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

This patch breaks a two stage build with LTO:

Thanks for the report @nathanchance, and I'm looking at it now.
Is this failure from one of the bots?

In D94355#2717813, @gulfem wrote:

Thanks for the report @nathanchance, and I'm looking at it now.

Thanks!

Is this failure from one of the bots?

No, this was discovered by a user of my LLVM build script:

https://github.com/ClangBuiltLinux/tc-build

https://github.com/kdrag0n/proton-clang-build/runs/2371499466?check_suite_focus=true

@nathanchance do you prefer me to revert the patch first as it might take me a while to investigate it?
Btw, I was able to reproduce the issue by following your steps.

In D94355#2717958, @gulfem wrote:

@nathanchance do you prefer me to revert the patch first as it might take me a while to investigate it?
Btw, I was able to reproduce the issue by following your steps.

I would say if it is going to take longer than the end of the week to fix the issue, it might be nice to have it reverted so that other people's builds continue to work (I know a few people who build with full LTO in a two stage configuration every week).

I would say if it is going to take longer than the end of the week to fix the issue, it might be nice to have it reverted so that other people's builds continue to work (I know a few people who build with full LTO in a two stage configuration every week).

Definitely! If I cannot fix it in two days, I'll revert the change.
Update: @nathanchance I need to investigate this issue further, so I disabled this pass in LTO pre-link phase to fix the broken build (https://reviews.llvm.org/D94355).
Please let me know if there is still an issue in that build!

gulfem mentioned this in D101664: [NewPM] Disable RelLookupTableConverter pass in lto. Apr 30 2021, 2:00 PM

gulfem mentioned this in rG4423a7a09b1b: [NewPM] Disable RelLookupTableConverter pass in LTO. Apr 30 2021, 2:24 PM

pcc added a subscriber: pcc. Jun 23 2021, 8:50 PM

pcc added inline comments.

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
74	In the version of the patch that you committed, you have this check here:

    // If operand is mutable, do not generate a relative lookup table.
    auto *GlovalVarOp = dyn_cast<GlobalVariable>(GVOp);
    if (!GlovalVarOp || !GlovalVarOp->isConstant())
      return false;

Nit: Gloval -> Global

Why is it important whether the referenced global is mutable? The pointer itself is constant.

Herald added a subscriber: ormris. Jun 23 2021, 8:50 PM

gulfem added inline comments. Jun 25 2021, 4:16 PM

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
74	That's a typo, and I will fix that. This optimization does not do a detailed analysis, and it is being conservative. In this case, `GlobalVar` points to the switch lookup table and `GlobalVarOp` points to the elements in the lookup table like strings. To make sure that relative arithmetic works, it just checks whether both `GlobalVar` and `GlobalVarOp` pointers are constants. Did you see an issue on that?

kaz7 mentioned this in D106224: [VE] Disable relative lookup table converter pass for VE. Jul 17 2021, 12:41 PM

kaz7 mentioned this in rGb28e5b791064: [VE] Disable relative lookup table converter pass for VE. Jul 19 2021, 3:25 AM

FWIW, this commit turned out to break the FreeBSD dns/bind916 port, see https://bugs.freebsd.org/259921.

The short story is that the bind9 code on and after this line: https://gitlab.isc.org/isc-projects/bind9/-/blob/main/lib/isc/log.c#L1525 gets changed from something like:

.Ltmp661:
        #DEBUG_VALUE: isc_log_doit:category_channels <- $r12
        .loc    3 0 58                          # log.c:0:58
        xorl    %eax, %eax
        testl   %r15d, %r15d
        setg    %al
        movl    %r15d, %ecx
        negl    %ecx
        movq    %rcx, -840(%rbp)                # 8-byte Spill
        leaq    8328(%r13), %rcx
        #DEBUG_VALUE: isc_log_doit:matched <- 0
        movq    %rcx, -808(%rbp)                # 8-byte Spill
.Ltmp662:
        .loc    3 1552 25 is_stmt 1             # log.c:1552:25

to using a relative lookup table:

.Ltmp661:
        #DEBUG_VALUE: isc_log_doit:category_channels <- $r12
        .loc    3 0 58                          # log.c:0:58
        xorl    %eax, %eax
        testl   %r15d, %r15d
        setg    %al
        movl    %r15d, %edx
        negl    %edx
        leaq    reltable.isc_log_doit(%rip), %rcx
        movq    %rdx, -848(%rbp)                # 8-byte Spill
        movslq  (%rcx,%rdx,4), %rdx
        addq    %rcx, %rdx
        movq    %rdx, -840(%rbp)                # 8-byte Spill
        leaq    8328(%r13), %rcx
        #DEBUG_VALUE: isc_log_doit:matched <- 0
        movq    %rcx, -808(%rbp)                # 8-byte Spill
.Ltmp662:
        .loc    3 1552 25 is_stmt 1             # log.c:1552:25

However, the value of %rcx at the movslq (%rcx,%rdx,4), %rdx statement becomes -2, so it attempts to access data before reltable.isc_log_doit. As that is in .rodata, this leads to a segfault.

The current working theory is that some code is hoisted out of the do-while loop starting at https://gitlab.isc.org/isc-projects/bind9/-/blob/main/lib/isc/log.c#L1531, in particular the [-level] accesses on lines 1613 and 1843:

                                snprintf(level_string, sizeof(level_string),
                                         "%s: ", log_level_strings[-level]);
...
                        } else {
                                syslog_level = syslog_map[-level];
                        }

but maybe these negative offsets confuse the lookup table converter?

jrtc27 added inline comments. Dec 10 2021, 2:46 PM

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
133	This line causes the bug seen in bind. In that case, the GEP has been hoisted, but the load has not. In general the GEP could be in a different basic block, or even in the same basic block with an instruction that may not return (intrinsic, real function call, well-defined language-level exception, etc). You can insert the reltable.shift where the GEP is, and that probably makes sense given it serves (part of) the same purpose, but you must insert the actual reltable.intrinsic where the original load is, unless you've gone to great lengths to prove it's safe not to (which seems best left to the usual culprits like LICM). IR test cases: https://godbolt.org/z/YMdaMrobE (bind is characterised by the first of the two functions)
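
A rough plain-C++ illustration of the hazard described above, not taken from the patch or from bind: index arithmetic (the GEP analogue) touches no memory and can run anywhere, while the table read (the load analogue) is only safe where the original control flow placed it. `LevelNames`, `NumLevels` and `levelName` are hypothetical names, loosely modelled on the `log_level_strings[-level]` access quoted earlier.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical table indexed by the negated log level.
static const char *const LevelNames[] = {"default", "info", "notice", "warning"};
static const long NumLevels = sizeof(LevelNames) / sizeof(LevelNames[0]);

// Returns the name for a zero-or-negative level, or nullptr if the level
// has no table entry.
const char *levelName(int Level) {
  // GEP analogue: pure arithmetic, reads nothing, harmless for any input.
  const long Slot = -static_cast<long>(Level);

  // Guard from the original control flow.
  if (Slot < 0 || Slot >= NumLevels)
    return nullptr;

  // Load analogue: must stay behind the guard. Performing this read where
  // Slot is computed would access memory outside the table for positive
  // levels, which is the kind of fault reported for bind.
  return LevelNames[Slot];
}
```

The same reasoning applies to the converted form: the offset computation may sit next to the GEP, but the relative load has to replace the original load in place.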

jrtc27 added inline comments. Dec 10 2021, 2:52 PM

llvm/include/llvm/CodeGen/BasicTTIImpl.h
396	The meanings of code models aren't really portable across targets... e.g. RISC-V's medium (underlying LLVM name for `-mcmodel=medany`) assumes 32-bit PC-relative offsets, and thus using a 32-bit table-relative offset is safe.
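
For context, the generic checks being discussed can be modelled outside LLVM roughly as follows. This is a simplified sketch: `TargetDesc` is a hypothetical stand-in for the `TargetMachine` queries, and the `CodeModel` enum only mirrors the names of LLVM's code models. It shows the target-independent default, which is exactly where a target such as RISC-V might want a different answer for `Medium`.

```cpp
#include <cassert>

// Mirrors the names of LLVM's code models; simplified for illustration.
enum class CodeModel { Tiny, Small, Kernel, Medium, Large };

// Hypothetical stand-in for the TargetMachine queries used by the hook.
struct TargetDesc {
  bool PositionIndependent;
  bool Is64Bit;
  CodeModel Model;
};

// Generic default: relative tables only for PIC, 64-bit targets whose code
// model keeps everything within reach of a signed 32-bit offset.
bool shouldBuildRelLookupTables(const TargetDesc &T) {
  if (!T.PositionIndependent)
    return false; // non-PIC code gains nothing from relative tables
  if (!T.Is64Bit)
    return false;
  // Conservative assumption: medium/large may exceed 32-bit offsets.
  if (T.Model == CodeModel::Medium || T.Model == CodeModel::Large)
    return false;
  return true;
}
```

A target for which the medium model still guarantees 32-bit reach would override the hook rather than rely on this default.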

gulfem added inline comments. Dec 10 2021, 7:39 PM

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp
133	@dim and @jrtc27 thank you for reporting it. I see what's going wrong, and I uploaded a patch that fixes the issue by ensuring that the call to load.relative.intrinsic is inserted before the load, but not gep. Please see https://reviews.llvm.org/D115571.

Revision Contents

Path	Size
llvm/docs/Passes.rst	5 lines
llvm/include/llvm/Analysis/TargetTransformInfo.h	7 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h	3 lines
llvm/include/llvm/CodeGen/BasicTTIImpl.h	21 lines
llvm/include/llvm/InitializePasses.h	1 line
llvm/include/llvm/Transforms/Scalar.h	1 line
llvm/include/llvm/Transforms/Utils/RelLookupTableConverter.h	70 lines
llvm/lib/Analysis/TargetTransformInfo.cpp	5 lines
llvm/lib/Passes/PassBuilder.cpp	3 lines
llvm/lib/Passes/PassRegistry.def	7 lines
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp	2 lines
llvm/lib/Transforms/Utils/CMakeLists.txt	1 line
llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp	244 lines
llvm/lib/Transforms/Utils/Utils.cpp	1 line
llvm/test/CodeGen/AMDGPU/opt-pipeline.ll	6 lines
llvm/test/Other/new-pm-defaults.ll	8 lines
llvm/test/Other/new-pm-thinlto-defaults.ll	8 lines
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll	10 lines
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll	8 lines
llvm/test/Other/opt-O2-pipeline.ll	2 lines
llvm/test/Other/opt-O3-pipeline-enable-matrix.ll	2 lines
llvm/test/Other/opt-O3-pipeline.ll	2 lines
llvm/test/Other/opt-Os-pipeline.ll	2 lines
llvm/test/Other/pass-pipelines.ll	2 lines
llvm/test/Transforms/RelLookupTableConverter/relative_lookup_table.ll	310 lines
llvm/test/Transforms/RelLookupTableConverter/switch_relative_lookup_table.ll	73 lines
llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn	1 line

Diff 330405

llvm/docs/Passes.rst

[...]
  For example: 4 + (x + 5) ⇒ x + (4 + 5)

  In the implementation of this algorithm, constants are assigned rank = 0,
  function arguments are rank = 1, and other values are assigned ranks
  corresponding to the reverse post order traversal of current function (starting
  at 2), which effectively gives values in deep loops higher rank than values not
  in loops.

+ ``-rel-lookup-table-converter``: Relative lookup table converter
+ -----------------------------------------

+ This pass converts lookup tables to PIC-friendly relative lookup tables.

  ``-reg2mem``: Demote all values to stack slots
  ----------------------------------------------

  This file demotes all registers to memory references. It is intended to be the
  inverse of :ref:`mem2reg <passes-mem2reg>`. By converting to ``load``
  instructions, the only values live across basic blocks are ``alloca``
  instructions and ``load`` instructions before ``phi`` nodes. It is intended
  that this should make CFG hacking much easier. To make later hacking easier,
[...]

llvm/include/llvm/Analysis/TargetTransformInfo.h

[...] public:
    /// Return true if switches should be turned into lookup tables for the
    /// target.
    bool shouldBuildLookupTables() const;

    /// Return true if switches should be turned into lookup tables
    /// containing this constant value for the target.
    bool shouldBuildLookupTablesForConstant(Constant *C) const;

+   /// Return true if lookup tables should be turned into relative lookup tables.
+   bool shouldBuildRelLookupTables() const;

    /// Return true if the input function which is cold at all call sites,
    /// should use coldcc calling convention.
    bool useColdCCForColdCall(Function &F) const;

    /// Estimate the overhead of scalarizing an instruction. Insert and Extract
    /// are set if the demanded result elements need to be inserted and/or
    /// extracted from vectors.
    unsigned getScalarizationOverhead(VectorType *Ty, const APInt &DemandedElts,
[...] public:
    virtual bool LSRWithInstrQueries() = 0;
    virtual bool isTruncateFree(Type *Ty1, Type *Ty2) = 0;
    virtual bool isProfitableToHoist(Instruction *I) = 0;
    virtual bool useAA() = 0;
    virtual bool isTypeLegal(Type *Ty) = 0;
    virtual unsigned getRegUsageForType(Type *Ty) = 0;
    virtual bool shouldBuildLookupTables() = 0;
    virtual bool shouldBuildLookupTablesForConstant(Constant *C) = 0;
+   virtual bool shouldBuildRelLookupTables() = 0;
    virtual bool useColdCCForColdCall(Function &F) = 0;
    virtual unsigned getScalarizationOverhead(VectorType *Ty,
                                              const APInt &DemandedElts,
                                              bool Insert, bool Extract) = 0;
    virtual unsigned
    getOperandsScalarizationOverhead(ArrayRef<const Value *> Args,
                                     ArrayRef<Type *> Tys) = 0;
    virtual bool supportsEfficientVectorElementLoadStore() = 0;
[...] unsigned getRegUsageForType(Type *Ty) override {
      return Impl.getRegUsageForType(Ty);
    }
    bool shouldBuildLookupTables() override {
      return Impl.shouldBuildLookupTables();
    }
    bool shouldBuildLookupTablesForConstant(Constant *C) override {
      return Impl.shouldBuildLookupTablesForConstant(C);
    }
+   bool shouldBuildRelLookupTables() override {
+     return Impl.shouldBuildRelLookupTables();
+   }
    bool useColdCCForColdCall(Function &F) override {
      return Impl.useColdCCForColdCall(F);
    }

    unsigned getScalarizationOverhead(VectorType *Ty, const APInt &DemandedElts,
                                      bool Insert, bool Extract) override {
      return Impl.getScalarizationOverhead(Ty, DemandedElts, Insert, Extract);
    }
[...]

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

[...] public:

    bool useAA() const { return false; }

    bool isTypeLegal(Type *Ty) const { return false; }

    unsigned getRegUsageForType(Type *Ty) const { return 1; }

    bool shouldBuildLookupTables() const { return true; }

    bool shouldBuildLookupTablesForConstant(Constant *C) const { return true; }

+   bool shouldBuildRelLookupTables() const { return true; }

    bool useColdCCForColdCall(Function &F) const { return false; }

    unsigned getScalarizationOverhead(VectorType *Ty, const APInt &DemandedElts,
                                      bool Insert, bool Extract) const {
      return 0;
    }

    unsigned getOperandsScalarizationOverhead(ArrayRef<const Value *> Args,
[...]

llvm/include/llvm/CodeGen/BasicTTIImpl.h

Show All 39 Lines
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/MachineValueType.h"		#include "llvm/Support/MachineValueType.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
		#include "llvm/Target/TargetMachine.h"

#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
#include <limits>		#include <limits>
#include <utility>		#include <utility>

namespace llvm {		namespace llvm {

▲ Show 20 Lines • Show All 318 Lines • ▼ Show 20 Lines	public:
}		}

bool shouldBuildLookupTables() {		bool shouldBuildLookupTables() {
const TargetLoweringBase *TLI = getTLI();		const TargetLoweringBase *TLI = getTLI();
return TLI->isOperationLegalOrCustom(ISD::BR_JT, MVT::Other) \|\|		return TLI->isOperationLegalOrCustom(ISD::BR_JT, MVT::Other) \|\|
TLI->isOperationLegalOrCustom(ISD::BRIND, MVT::Other);		TLI->isOperationLegalOrCustom(ISD::BRIND, MVT::Other);
}		}

		bool shouldBuildRelLookupTables() {
		leonardchanUnsubmitted Not Done Reply Inline Actions Sorry, I think you might have explained this offline, but what was the reason for needing this in TTI? I would've though this information could be found in the `Module` (PIC/no PIC, 64-bit or not, code model). If it turns out all of this is available in `Module`, then we could greatly simplify some logic here by just checking this at the start of the pass run. If TTI is needed, then perhaps it may be better to just inline all these checks in `convertToRelativeLookupTables` since this is the only place this is called. I think we would only want to keep this here as a virtual method if we plan to have multiple TTI-impls overriding this. leonardchan: Sorry, I think you might have explained this offline, but what was the reason for needing this…
		gulfemAuthorUnsubmitted Done Reply Inline Actions Code model or PIC/noPIC is only set in the module if the user explicitly specifies them. TTI hook is necessary to access target machine information like the default code model. TTI is basically the interface to communicate between IR transformations and codegen. I think the checks cannot be moved into the pass because it uses the TargetMachine which is only available/visible in the TTI implementation itself. That's why I added a function into TTI to do target machine specific checks. Similar approach is used during lookup table generation (`shouldBuildLookupTables`) to check codegen info. gulfem: Code model or PIC/noPIC is only set in the module if the user explicitly specifies them. TTI…
		const TargetMachine &TM = getTLI()->getTargetMachine();
		// If non-PIC mode, do not generate a relative lookup table.
		if (!TM.isPositionIndependent())
		return false;

		if (!TM.getTargetTriple().isArch64Bit())
		return false;
		lebedev.ri (Not Done): 1. But all tests are using the `x86_64` triple? 2. This is somewhat backwards: if a target wants to disable this, it will need to override this function with `return false;`.
		gulfem (Author, Done): 1. Although I used the `x86_64` triple, this optimization can be applied to other 64-bit architectures too, because it is not target dependent except for the `isArch64Bit` and `getCodeModel` checks. Is there a target that you have in mind for which we need to disable this optimization? 2. I thought that it makes sense to enable this optimization by default on all the targets that can support it. If targets want to disable it, they can override it as you said. How can we improve the implementation? If you have suggestions, I'm happy to incorporate them.
		lebedev.ri (Not Done): I'm sorry, I do not understand. Why does the `!TM.getTargetTriple().isArch64Bit()` check exist? To me it reads as "if we aren't compiling for AArch64, don't build rel lookup tables". Am I misreading this?
		gulfem (Author, Not Done): `isArch64Bit` checks whether we have a 64-bit architecture, right? I don't think it specifically checks for `AArch64`; it covers other 64-bit architectures like `x86_64` as well.
		lebedev.ri (Not Done): > `isArch64Bit` checks whether we have a 64-bit architecture, right?
		D'oh. I really did read it as AArch64 :/ Sorry.

		/// Relative lookup table entries consist of 32-bit offsets.
		/// Do not generate relative lookup tables for large code models
		/// in 64-bit architectures where 32-bit offsets might not be enough.
		if (TM.getCodeModel() == CodeModel::Medium ||
		jrtc27 (Not Done): The meaning of code models isn't really portable across targets. For example, RISC-V's medium (the underlying LLVM name for -mcmodel=medany) assumes 32-bit PC-relative offsets, so using a 32-bit table-relative offset is safe.
		TM.getCodeModel() == CodeModel::Large)
		return false;

		return true;
		}
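
A standalone sketch of the gating logic discussed in this thread, for readers without an LLVM checkout. `TargetInfo`, this `CodeModel` enum, and `fitsInInt32` are hypothetical stand-ins, not LLVM types; the decision order simply mirrors the hook as posted in this revision. As jrtc27 notes, the code-model test may be more conservative than some targets need.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for the TargetMachine queries used by the hook.
enum class CodeModel { Tiny, Small, Kernel, Medium, Large };

struct TargetInfo {
  bool PositionIndependent;
  bool Is64Bit;
  CodeModel Model;
};

// Same decision order as the hook in this revision of the patch.
bool shouldBuildRelLookupTables(const TargetInfo &T) {
  if (!T.PositionIndependent)
    return false; // Non-PIC code gains nothing from relative tables.
  if (!T.Is64Bit)
    return false;
  // Entries are 32-bit offsets; larger code models might exceed that range.
  if (T.Model == CodeModel::Medium || T.Model == CodeModel::Large)
    return false;
  return true;
}

// True when a (target - table) distance survives truncation to i32.
bool fitsInInt32(std::int64_t Delta) {
  return Delta >= std::numeric_limits<std::int32_t>::min() &&
         Delta <= std::numeric_limits<std::int32_t>::max();
}
```

`fitsInInt32` is only there to make the motivation for the code-model check concrete: a distance of 2^31 bytes or more between a table and its target cannot be stored in an entry.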

bool haveFastSqrt(Type *Ty) {		bool haveFastSqrt(Type *Ty) {
const TargetLoweringBase *TLI = getTLI();		const TargetLoweringBase *TLI = getTLI();
EVT VT = TLI->getValueType(DL, Ty);		EVT VT = TLI->getValueType(DL, Ty);
return TLI->isTypeLegal(VT) &&		return TLI->isTypeLegal(VT) &&
TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);		TLI->isOperationLegalOrCustom(ISD::FSQRT, VT);
}		}

bool isFCmpOrdCheaperThanFCmpZero(Type *Ty) {		bool isFCmpOrdCheaperThanFCmpZero(Type *Ty) {

llvm/include/llvm/InitializePasses.h

	void initializeMetaRenamerPass(PassRegistry&);			void initializeMetaRenamerPass(PassRegistry&);
	void initializeModuleDebugInfoLegacyPrinterPass(PassRegistry &);			void initializeModuleDebugInfoLegacyPrinterPass(PassRegistry &);
	void initializeModuleMemProfilerLegacyPassPass(PassRegistry &);			void initializeModuleMemProfilerLegacyPassPass(PassRegistry &);
	void initializeModuleSummaryIndexWrapperPassPass(PassRegistry&);			void initializeModuleSummaryIndexWrapperPassPass(PassRegistry&);
	void initializeModuloScheduleTestPass(PassRegistry&);			void initializeModuloScheduleTestPass(PassRegistry&);
	void initializeMustExecutePrinterPass(PassRegistry&);			void initializeMustExecutePrinterPass(PassRegistry&);
	void initializeMustBeExecutedContextPrinterPass(PassRegistry&);			void initializeMustBeExecutedContextPrinterPass(PassRegistry&);
	void initializeNameAnonGlobalLegacyPassPass(PassRegistry&);			void initializeNameAnonGlobalLegacyPassPass(PassRegistry&);
				void initializeRelLookupTableConverterLegacyPassPass(PassRegistry &);
				hans (Done): In some places the pass is referred to as a generator and here it's a converter. I think converter is a better name, and it should be consistent.
	void initializeUniqueInternalLinkageNamesLegacyPassPass(PassRegistry &);			void initializeUniqueInternalLinkageNamesLegacyPassPass(PassRegistry &);
	void initializeNaryReassociateLegacyPassPass(PassRegistry&);			void initializeNaryReassociateLegacyPassPass(PassRegistry&);
	void initializeNewGVNLegacyPassPass(PassRegistry&);			void initializeNewGVNLegacyPassPass(PassRegistry&);
	void initializeObjCARCAAWrapperPassPass(PassRegistry&);			void initializeObjCARCAAWrapperPassPass(PassRegistry&);
	void initializeObjCARCAPElimPass(PassRegistry&);			void initializeObjCARCAPElimPass(PassRegistry&);
	void initializeObjCARCContractLegacyPassPass(PassRegistry &);			void initializeObjCARCContractLegacyPassPass(PassRegistry &);
	void initializeObjCARCExpandPass(PassRegistry&);			void initializeObjCARCExpandPass(PassRegistry&);
	void initializeObjCARCOptLegacyPassPass(PassRegistry &);			void initializeObjCARCOptLegacyPassPass(PassRegistry &);

llvm/include/llvm/Transforms/Scalar.h

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// LoopDataPrefetch - Perform data prefetching in loops.			// LoopDataPrefetch - Perform data prefetching in loops.
	//			//
	FunctionPass *createLoopDataPrefetchPass();			FunctionPass *createLoopDataPrefetchPass();

	///===---------------------------------------------------------------------===//			///===---------------------------------------------------------------------===//
	ModulePass *createNameAnonGlobalPass();			ModulePass *createNameAnonGlobalPass();
				ModulePass *createRelLookupTableConverterPass();
	ModulePass *createCanonicalizeAliasesPass();			ModulePass *createCanonicalizeAliasesPass();

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	//			//
	// LibCallsShrinkWrap - Shrink-wraps a call to function if the result is not			// LibCallsShrinkWrap - Shrink-wraps a call to function if the result is not
	// used.			// used.
	//			//
	FunctionPass *createLibCallsShrinkWrapPass();			FunctionPass *createLibCallsShrinkWrapPass();

llvm/include/llvm/Transforms/Utils/RelLookupTableConverter.h

This file was added.

				//===-- RelLookupTableConverterPass.h - Rel Table Conv ----------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				/// \file
				/// This file implements relative lookup table converter that converts
				/// lookup tables to relative lookup tables to make them PIC-friendly.
				///
				/// Switch lookup table example:
				/// @switch.table.foo = private unnamed_addr constant [3 x i8*]
				/// [
				/// i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				/// i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
				/// i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
				/// ], align 8
				///
				/// switch.lookup:
				/// %1 = sext i32 %cond to i64
				/// %switch.gep = getelementptr inbounds [3 x i8*],
				/// [3 x i8*]* @switch.table.foo, i64 0, i64 %1
				/// %switch.load = load i8*, i8** %switch.gep, align 8
				/// ret i8* %switch.load
				///
				/// Switch lookup table will become a relative lookup table that
				/// consists of relative offsets.
				///
				/// @reltable.foo = private unnamed_addr constant [3 x i32]
				/// [
				/// i32 trunc (i64 sub (i64 ptrtoint ([5 x i8]* @.str to i64),
				/// i64 ptrtoint ([3 x i32]* @reltable.foo to i64)) to i32),
				/// i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.1 to i64),
				/// i64 ptrtoint ([3 x i32]* @reltable.foo to i64)) to i32),
				/// i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.2 to i64),
				/// i64 ptrtoint ([3 x i32]* @reltable.foo to i64)) to i32)
				/// ], align 4
				///
				/// IR after converting to a relative lookup table:
				/// switch.lookup:
				/// %1 = sext i32 %cond to i64
				/// %reltable.shift = shl i64 %1, 2
				/// %reltable.intrinsic = call i8* @llvm.load.relative.i64(
				/// i8* bitcast ([3 x i32]* @reltable.foo to i8*),
				/// i64 %reltable.shift)
				/// ret i8* %reltable.intrinsic
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TRANSFORMS_UTILS_RELLOOKUPTABLECONVERTER_H
				#define LLVM_TRANSFORMS_UTILS_RELLOOKUPTABLECONVERTER_H

				#include "llvm/IR/Module.h"
				#include "llvm/IR/PassManager.h"

				namespace llvm {

				// Pass that converts lookup tables to relative lookup tables.
				class RelLookupTableConverterPass
				: public PassInfoMixin<RelLookupTableConverterPass> {
				public:
				RelLookupTableConverterPass() = default;

				PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
				};

				} // end namespace llvm

				#endif // LLVM_TRANSFORMS_UTILS_RELLOOKUPTABLECONVERTER_H
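
The transformation described in the file comment of this header can be modelled outside LLVM. The sketch below is an illustration only, not code from the patch: `Image`, `buildImage`, `loadRelative`, and `lookup` are hypothetical names, and a flat byte vector stands in for the loaded module so that all offset arithmetic stays well defined in C++. The table stores 32-bit `target - table` distances, and the lookup shifts the index by 2 and adds the loaded entry to the table base, which is how the LLVM Language Reference describes `llvm.load.relative`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical model: one flat byte image standing in for a loaded module.
struct Image {
  std::vector<unsigned char> Bytes;
  std::size_t TableBase = 0; // Position of the relative table in Bytes.
};

// Lay out the strings first, then a table of 32-bit (target - table) offsets.
Image buildImage(const std::vector<std::string> &Strings) {
  Image Img;
  std::vector<std::size_t> Targets;
  for (const std::string &S : Strings) {
    Targets.push_back(Img.Bytes.size());
    Img.Bytes.insert(Img.Bytes.end(), S.begin(), S.end());
    Img.Bytes.push_back('\0');
  }
  // Pad so the table starts on a 4-byte boundary (the patch uses align 4).
  while (Img.Bytes.size() % 4 != 0)
    Img.Bytes.push_back(0);
  Img.TableBase = Img.Bytes.size();
  Img.Bytes.resize(Img.TableBase + 4 * Targets.size());
  for (std::size_t I = 0; I < Targets.size(); ++I) {
    // Equivalent of: trunc (sub (ptrtoint target), (ptrtoint table)) to i32.
    std::int32_t Rel = static_cast<std::int32_t>(
        static_cast<std::int64_t>(Targets[I]) -
        static_cast<std::int64_t>(Img.TableBase));
    std::memcpy(&Img.Bytes[Img.TableBase + 4 * I], &Rel, sizeof(Rel));
  }
  return Img;
}

// Read the i32 stored at table + ByteOffset and add it to the table base.
std::size_t loadRelative(const Image &Img, std::size_t ByteOffset) {
  std::int32_t Rel;
  std::memcpy(&Rel, &Img.Bytes[Img.TableBase + ByteOffset], sizeof(Rel));
  return static_cast<std::size_t>(
      static_cast<std::int64_t>(Img.TableBase) + Rel);
}

// Index -> byte offset (the reltable.shift step), then the relative load.
std::string lookup(const Image &Img, std::size_t Index) {
  std::size_t Shifted = Index << 2;
  std::size_t Pos = loadRelative(Img, Shifted);
  return std::string(reinterpret_cast<const char *>(&Img.Bytes[Pos]));
}
```

Calling `lookup(buildImage({"zero", "one", "two"}), 1)` walks the same steps as the converted IR. Because every entry is a distance rather than an absolute address, the image can be placed anywhere without patching the table, which is the PIC-friendliness the patch is after.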

llvm/lib/Analysis/TargetTransformInfo.cpp


	unsigned TargetTransformInfo::getRegUsageForType(Type *Ty) const {			unsigned TargetTransformInfo::getRegUsageForType(Type *Ty) const {
	return TTIImpl->getRegUsageForType(Ty);			return TTIImpl->getRegUsageForType(Ty);
	}			}

	bool TargetTransformInfo::shouldBuildLookupTables() const {			bool TargetTransformInfo::shouldBuildLookupTables() const {
	return TTIImpl->shouldBuildLookupTables();			return TTIImpl->shouldBuildLookupTables();
	}			}

	bool TargetTransformInfo::shouldBuildLookupTablesForConstant(			bool TargetTransformInfo::shouldBuildLookupTablesForConstant(
	Constant *C) const {			Constant *C) const {
	return TTIImpl->shouldBuildLookupTablesForConstant(C);			return TTIImpl->shouldBuildLookupTablesForConstant(C);
	}			}

				bool TargetTransformInfo::shouldBuildRelLookupTables() const {
				return TTIImpl->shouldBuildRelLookupTables();
				}

	bool TargetTransformInfo::useColdCCForColdCall(Function &F) const {			bool TargetTransformInfo::useColdCCForColdCall(Function &F) const {
	return TTIImpl->useColdCCForColdCall(F);			return TTIImpl->useColdCCForColdCall(F);
	}			}

	unsigned			unsigned
	TargetTransformInfo::getScalarizationOverhead(VectorType *Ty,			TargetTransformInfo::getScalarizationOverhead(VectorType *Ty,
	const APInt &DemandedElts,			const APInt &DemandedElts,
	bool Insert, bool Extract) const {			bool Insert, bool Extract) const {

llvm/lib/Passes/PassBuilder.cpp

#include "llvm/Transforms/Utils/LibCallsShrinkWrap.h"		#include "llvm/Transforms/Utils/LibCallsShrinkWrap.h"
#include "llvm/Transforms/Utils/LoopSimplify.h"		#include "llvm/Transforms/Utils/LoopSimplify.h"
#include "llvm/Transforms/Utils/LoopVersioning.h"		#include "llvm/Transforms/Utils/LoopVersioning.h"
#include "llvm/Transforms/Utils/LowerInvoke.h"		#include "llvm/Transforms/Utils/LowerInvoke.h"
#include "llvm/Transforms/Utils/LowerSwitch.h"		#include "llvm/Transforms/Utils/LowerSwitch.h"
#include "llvm/Transforms/Utils/Mem2Reg.h"		#include "llvm/Transforms/Utils/Mem2Reg.h"
#include "llvm/Transforms/Utils/MetaRenamer.h"		#include "llvm/Transforms/Utils/MetaRenamer.h"
#include "llvm/Transforms/Utils/NameAnonGlobals.h"		#include "llvm/Transforms/Utils/NameAnonGlobals.h"
		#include "llvm/Transforms/Utils/RelLookupTableConverter.h"
#include "llvm/Transforms/Utils/StripGCRelocates.h"		#include "llvm/Transforms/Utils/StripGCRelocates.h"
#include "llvm/Transforms/Utils/StripNonLineTableDebugInfo.h"		#include "llvm/Transforms/Utils/StripNonLineTableDebugInfo.h"
#include "llvm/Transforms/Utils/SymbolRewriter.h"		#include "llvm/Transforms/Utils/SymbolRewriter.h"
#include "llvm/Transforms/Utils/UnifyFunctionExitNodes.h"		#include "llvm/Transforms/Utils/UnifyFunctionExitNodes.h"
#include "llvm/Transforms/Utils/UnifyLoopExits.h"		#include "llvm/Transforms/Utils/UnifyLoopExits.h"
#include "llvm/Transforms/Utils/UniqueInternalLinkageNames.h"		#include "llvm/Transforms/Utils/UniqueInternalLinkageNames.h"
#include "llvm/Transforms/Vectorize/LoadStoreVectorizer.h"		#include "llvm/Transforms/Vectorize/LoadStoreVectorizer.h"
#include "llvm/Transforms/Vectorize/LoopVectorize.h"		#include "llvm/Transforms/Vectorize/LoopVectorize.h"
PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,

// Now we need to do some global optimization transforms.		// Now we need to do some global optimization transforms.
// FIXME: It would seem like these should come first in the optimization		// FIXME: It would seem like these should come first in the optimization
// pipeline and maybe be the bottom of the canonicalization pipeline? Weird		// pipeline and maybe be the bottom of the canonicalization pipeline? Weird
// ordering here.		// ordering here.
MPM.addPass(GlobalDCEPass());		MPM.addPass(GlobalDCEPass());
MPM.addPass(ConstantMergePass());		MPM.addPass(ConstantMergePass());

		MPM.addPass(RelLookupTableConverterPass());

return MPM;		return MPM;
}		}

ModulePassManager		ModulePassManager
PassBuilder::buildPerModuleDefaultPipeline(OptimizationLevel Level,		PassBuilder::buildPerModuleDefaultPipeline(OptimizationLevel Level,
bool LTOPreLink) {		bool LTOPreLink) {
assert(Level != OptimizationLevel::O0 &&		assert(Level != OptimizationLevel::O0 &&
"Must request optimizations for the default pipeline!");		"Must request optimizations for the default pipeline!");

llvm/lib/Passes/PassRegistry.def

	MODULE_PASS("globalopt", GlobalOptPass())			MODULE_PASS("globalopt", GlobalOptPass())
	MODULE_PASS("globalsplit", GlobalSplitPass())			MODULE_PASS("globalsplit", GlobalSplitPass())
	MODULE_PASS("hotcoldsplit", HotColdSplittingPass())			MODULE_PASS("hotcoldsplit", HotColdSplittingPass())
	MODULE_PASS("hwasan", HWAddressSanitizerPass(false, false))			MODULE_PASS("hwasan", HWAddressSanitizerPass(false, false))
	MODULE_PASS("khwasan", HWAddressSanitizerPass(true, true))			MODULE_PASS("khwasan", HWAddressSanitizerPass(true, true))
	MODULE_PASS("inferattrs", InferFunctionAttrsPass())			MODULE_PASS("inferattrs", InferFunctionAttrsPass())
	MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())			MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())
	MODULE_PASS("inliner-wrapper-no-mandatory-first", ModuleInlinerWrapperPass(			MODULE_PASS("inliner-wrapper-no-mandatory-first", ModuleInlinerWrapperPass(
	getInlineParams(),			getInlineParams(),
	DebugLogging,			DebugLogging,
	false))			false))
	MODULE_PASS("insert-gcov-profiling", GCOVProfilerPass())			MODULE_PASS("insert-gcov-profiling", GCOVProfilerPass())
	MODULE_PASS("instrorderfile", InstrOrderFilePass())			MODULE_PASS("instrorderfile", InstrOrderFilePass())
	MODULE_PASS("instrprof", InstrProfiling())			MODULE_PASS("instrprof", InstrProfiling())
	MODULE_PASS("internalize", InternalizePass())			MODULE_PASS("internalize", InternalizePass())
	MODULE_PASS("invalidate<all>", InvalidateAllAnalysesPass())			MODULE_PASS("invalidate<all>", InvalidateAllAnalysesPass())
	MODULE_PASS("ipsccp", IPSCCPPass())			MODULE_PASS("ipsccp", IPSCCPPass())
	MODULE_PASS("iroutliner", IROutlinerPass())			MODULE_PASS("iroutliner", IROutlinerPass())
	...
	MODULE_PASS("print-profile-summary", ProfileSummaryPrinterPass(dbgs()))			MODULE_PASS("print-profile-summary", ProfileSummaryPrinterPass(dbgs()))
	MODULE_PASS("print-callgraph", CallGraphPrinterPass(dbgs()))			MODULE_PASS("print-callgraph", CallGraphPrinterPass(dbgs()))
	MODULE_PASS("print", PrintModulePass(dbgs()))			MODULE_PASS("print", PrintModulePass(dbgs()))
	MODULE_PASS("print-lcg", LazyCallGraphPrinterPass(dbgs()))			MODULE_PASS("print-lcg", LazyCallGraphPrinterPass(dbgs()))
	MODULE_PASS("print-lcg-dot", LazyCallGraphDOTPrinterPass(dbgs()))			MODULE_PASS("print-lcg-dot", LazyCallGraphDOTPrinterPass(dbgs()))
	MODULE_PASS("print-must-be-executed-contexts", MustBeExecutedContextPrinterPass(dbgs()))			MODULE_PASS("print-must-be-executed-contexts", MustBeExecutedContextPrinterPass(dbgs()))
	MODULE_PASS("print-stack-safety", StackSafetyGlobalPrinterPass(dbgs()))			MODULE_PASS("print-stack-safety", StackSafetyGlobalPrinterPass(dbgs()))
	MODULE_PASS("print<module-debuginfo>", ModuleDebugInfoPrinterPass(dbgs()))			MODULE_PASS("print<module-debuginfo>", ModuleDebugInfoPrinterPass(dbgs()))
				MODULE_PASS("rel-lookup-table-converter", RelLookupTableConverterPass())
	MODULE_PASS("rewrite-statepoints-for-gc", RewriteStatepointsForGC())			MODULE_PASS("rewrite-statepoints-for-gc", RewriteStatepointsForGC())
	MODULE_PASS("rewrite-symbols", RewriteSymbolPass())			MODULE_PASS("rewrite-symbols", RewriteSymbolPass())
	MODULE_PASS("rpo-function-attrs", ReversePostOrderFunctionAttrsPass())			MODULE_PASS("rpo-function-attrs", ReversePostOrderFunctionAttrsPass())
	MODULE_PASS("sample-profile", SampleProfileLoaderPass())			MODULE_PASS("sample-profile", SampleProfileLoaderPass())
	MODULE_PASS("scc-oz-module-inliner",			MODULE_PASS("scc-oz-module-inliner",
	buildInlinerPipeline(OptimizationLevel::Oz, ThinOrFullLTOPhase::None))			buildInlinerPipeline(OptimizationLevel::Oz, ThinOrFullLTOPhase::None))
	MODULE_PASS("loop-extract-single", LoopExtractorPass(1))			MODULE_PASS("loop-extract-single", LoopExtractorPass(1))
	MODULE_PASS("strip", StripSymbolsPass())			MODULE_PASS("strip", StripSymbolsPass())
	...
	FUNCTION_PASS("print<divergence>", DivergenceAnalysisPrinterPass(dbgs()))			FUNCTION_PASS("print<divergence>", DivergenceAnalysisPrinterPass(dbgs()))
	FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))			FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))
	FUNCTION_PASS("print<postdomtree>", PostDominatorTreePrinterPass(dbgs()))			FUNCTION_PASS("print<postdomtree>", PostDominatorTreePrinterPass(dbgs()))
	FUNCTION_PASS("print<delinearization>", DelinearizationPrinterPass(dbgs()))			FUNCTION_PASS("print<delinearization>", DelinearizationPrinterPass(dbgs()))
	FUNCTION_PASS("print<demanded-bits>", DemandedBitsPrinterPass(dbgs()))			FUNCTION_PASS("print<demanded-bits>", DemandedBitsPrinterPass(dbgs()))
	FUNCTION_PASS("print<domfrontier>", DominanceFrontierPrinterPass(dbgs()))			FUNCTION_PASS("print<domfrontier>", DominanceFrontierPrinterPass(dbgs()))
	FUNCTION_PASS("print<func-properties>", FunctionPropertiesPrinterPass(dbgs()))			FUNCTION_PASS("print<func-properties>", FunctionPropertiesPrinterPass(dbgs()))
	FUNCTION_PASS("print<inline-cost>", InlineCostAnnotationPrinterPass(dbgs()))			FUNCTION_PASS("print<inline-cost>", InlineCostAnnotationPrinterPass(dbgs()))
	FUNCTION_PASS("print<inliner-size-estimator>",			FUNCTION_PASS("print<inliner-size-estimator>",
	InlineSizeEstimatorAnalysisPrinterPass(dbgs()))			InlineSizeEstimatorAnalysisPrinterPass(dbgs()))
	FUNCTION_PASS("print<loops>", LoopPrinterPass(dbgs()))			FUNCTION_PASS("print<loops>", LoopPrinterPass(dbgs()))
	FUNCTION_PASS("print<memoryssa>", MemorySSAPrinterPass(dbgs()))			FUNCTION_PASS("print<memoryssa>", MemorySSAPrinterPass(dbgs()))
	FUNCTION_PASS("print<phi-values>", PhiValuesPrinterPass(dbgs()))			FUNCTION_PASS("print<phi-values>", PhiValuesPrinterPass(dbgs()))
	FUNCTION_PASS("print<regions>", RegionInfoPrinterPass(dbgs()))			FUNCTION_PASS("print<regions>", RegionInfoPrinterPass(dbgs()))
	FUNCTION_PASS("print<scalar-evolution>", ScalarEvolutionPrinterPass(dbgs()))			FUNCTION_PASS("print<scalar-evolution>", ScalarEvolutionPrinterPass(dbgs()))
	FUNCTION_PASS("print<stack-safety-local>", StackSafetyPrinterPass(dbgs()))			FUNCTION_PASS("print<stack-safety-local>", StackSafetyPrinterPass(dbgs()))
	// TODO: rename to print<foo> after NPM switch			// TODO: rename to print<foo> after NPM switch

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

void PassManagerBuilder::populateModulePassManager(

// GlobalOpt already deletes dead functions and globals, at -O2 try a		// GlobalOpt already deletes dead functions and globals, at -O2 try a
// late pass of GlobalDCE. It is capable of deleting dead cycles.		// late pass of GlobalDCE. It is capable of deleting dead cycles.
if (OptLevel > 1) {		if (OptLevel > 1) {
MPM.add(createGlobalDCEPass()); // Remove dead fns and globals.		MPM.add(createGlobalDCEPass()); // Remove dead fns and globals.
MPM.add(createConstantMergePass()); // Merge dup global constants		MPM.add(createConstantMergePass()); // Merge dup global constants
}		}

// See comment in the new PM for justification of scheduling splitting at		// See comment in the new PM for justification of scheduling splitting at
		leonardchan (Done): Should this be added at the end of the pipeline? Other passes could insert lookup tables, and those would be left untouched if this pass runs before them.
// this stage (\ref buildModuleSimplificationPipeline).		// this stage (\ref buildModuleSimplificationPipeline).
if (EnableHotColdSplit && !(PrepareForLTO || PrepareForThinLTO))		if (EnableHotColdSplit && !(PrepareForLTO || PrepareForThinLTO))
MPM.add(createHotColdSplittingPass());		MPM.add(createHotColdSplittingPass());

if (EnableIROutliner)		if (EnableIROutliner)
MPM.add(createIROutlinerPass());		MPM.add(createIROutlinerPass());

if (MergeFunctions)		if (MergeFunctions)
...
// passes to avoid re-sinking, but before SimplifyCFG because it can allow		// passes to avoid re-sinking, but before SimplifyCFG because it can allow
// flattening of blocks.		// flattening of blocks.
MPM.add(createDivRemPairsPass());		MPM.add(createDivRemPairsPass());

// LoopSink (and other loop passes since the last simplifyCFG) might have		// LoopSink (and other loop passes since the last simplifyCFG) might have
// resulted in single-entry-single-exit or empty blocks. Clean up the CFG.		// resulted in single-entry-single-exit or empty blocks. Clean up the CFG.
MPM.add(createCFGSimplificationPass());		MPM.add(createCFGSimplificationPass());

		MPM.add(createRelLookupTableConverterPass());

addExtensionsToPM(EP_OptimizerLast, MPM);		addExtensionsToPM(EP_OptimizerLast, MPM);

if (PrepareForLTO) {		if (PrepareForLTO) {
MPM.add(createCanonicalizeAliasesPass());		MPM.add(createCanonicalizeAliasesPass());
// Rename anon globals to be able to handle them in the summary		// Rename anon globals to be able to handle them in the summary
MPM.add(createNameAnonGlobalPass());		MPM.add(createNameAnonGlobalPass());
}		}


llvm/lib/Transforms/Utils/CMakeLists.txt

add_llvm_component_library(LLVMTransformUtils
LowerSwitch.cpp		LowerSwitch.cpp
MatrixUtils.cpp		MatrixUtils.cpp
Mem2Reg.cpp		Mem2Reg.cpp
MetaRenamer.cpp		MetaRenamer.cpp
ModuleUtils.cpp		ModuleUtils.cpp
NameAnonGlobals.cpp		NameAnonGlobals.cpp
PredicateInfo.cpp		PredicateInfo.cpp
PromoteMemoryToRegister.cpp		PromoteMemoryToRegister.cpp
		RelLookupTableConverter.cpp
ScalarEvolutionExpander.cpp		ScalarEvolutionExpander.cpp
StripGCRelocates.cpp		StripGCRelocates.cpp
SSAUpdater.cpp		SSAUpdater.cpp
SSAUpdaterBulk.cpp		SSAUpdaterBulk.cpp
SampleProfileLoaderBaseUtil.cpp		SampleProfileLoaderBaseUtil.cpp
SanitizerStats.cpp		SanitizerStats.cpp
SimplifyCFG.cpp		SimplifyCFG.cpp
SimplifyIndVar.cpp		SimplifyIndVar.cpp

llvm/lib/Transforms/Utils/RelLookupTableConverter.cpp

This file was added.

				//===- RelLookupTableConverterPass - Rel Table Conv -----------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file implements relative lookup table converter that converts
				// lookup tables to relative lookup tables to make them PIC-friendly.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Transforms/Utils/RelLookupTableConverter.h"
				#include "llvm/Analysis/ConstantFolding.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/IR/BasicBlock.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/Module.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Pass.h"
				#include "llvm/Transforms/Utils/BasicBlockUtils.h"

				using namespace llvm;

				static bool shouldConvertToRelLookupTable(Module &M, GlobalVariable &GV) {
				if (!GV.hasInitializer())
				return false;

				// If lookup table has more than one user,
				// do not generate a relative lookup table.
				leonardchan (Done): Nit: `//`
				hans (Done): It would be better if the comment said why. I suppose the reason is that we need to be sure there aren't other uses of the table, because then it can't be replaced. But it would be cool if a future version of this patch could handle the case where there are multiple loads from the table which can all be replaced. For example, this could happen if a function which uses a lookup table gets inlined into multiple places.
				gulfem (Author, Done): I actually ran into the exact case that you described during testing, where a function that uses a switch gets inlined into multiple call sites :) This is only to simplify the analysis, and I have now added a TODO section to explore that later.
				if (!GV.hasOneUse())
				return false;

				GetElementPtrInst *GEP =
				dyn_cast<GetElementPtrInst>(GV.use_begin()->getUser());
				if (!GEP || !GEP->hasOneUse())
				return false;

				if (!dyn_cast<LoadInst>(GEP->use_begin()->getUser()))
				leonardchan (Done): `isa`
				return false;
				leonardchan (Done): Not needed here since you have the `isa` below.

				if (!isa<LoadInst>(GEP->use_begin()->getUser()))
				return false;

				// If the original lookup table is not dso_local,
				hans (Done): It would be good with a "why" here too.
				// do not generate a relative lookup table.
				if (!(GV.isDSOLocal() || GV.isImplicitDSOLocal()))
				return false;

				ConstantArray *Array = dyn_cast<ConstantArray>(GV.getInitializer());
				// If values are not pointers, do not generate a relative lookup table.
				if (!Array || !Array->getType()->getElementType()->isPointerTy())
				return false;

				const DataLayout &DL = M.getDataLayout();
				for (const Use &Op : Array->operands()) {
				Constant *ConstOp = cast<Constant>(&Op);
				GlobalValue *GVOp;
				APInt Offset;

				// If an operand is not a constant offset from a lookup table,
				// do not generate a relative lookup table.
				if (!IsConstantOffsetFromGlobal(ConstOp, GVOp, Offset, DL))
				return false;

				// If an operand in the lookup table is not dso_local,
				// do not generate a relative lookup table.
				if (!(GVOp->isDSOLocal() || GVOp->isImplicitDSOLocal()))
				return false;
				leonardchan (Done): I think I mentioned this in a previous round of comments, but what you probably want here is just a way to determine whether the operand is either a global or some constant offset from a global. Right now it looks like this will ignore other constant expressions, like bitcasts or ptrtoints, which we also want to catch. I think it might be better to use `IsConstantOffsetFromGlobal`, which already handles these cases.
				}

				return true;
				pcc (Not Done): In the version of the patch that you committed, you have this check here:
					// If operand is mutable, do not generate a relative lookup table.
					auto GlovalVarOp = dyn_cast<GlobalVariable>(GVOp);
					if (!GlovalVarOp || !GlovalVarOp->isConstant())
					  return false;
				Nit: Gloval -> Global. Why is it important whether the referenced global is mutable? The pointer itself is constant.
				gulfem (Author, Done): 1. That's a typo, and I will fix that. 2. This optimization does not do a detailed analysis, and it is being conservative. In this case, `GlobalVar` points to the switch lookup table and `GlobalVarOp` points to the elements in the lookup table, like strings. To make sure that the relative arithmetic works, it just checks whether both `GlobalVar` and `GlobalVarOp` are constants. Did you see an issue with that?
				}

				static GlobalVariable *createRelLookupTable(Function &Func,
				GlobalVariable &LookupTable) {
				Module &M = *Func.getParent();
				ConstantArray *LookupTableArr =
				leonardchan (Done): `cast`
				cast<ConstantArray>(LookupTable.getInitializer());
				unsigned NumElts = LookupTableArr->getType()->getNumElements();
				ArrayType *IntArrayTy =
				ArrayType::get(Type::getInt32Ty(M.getContext()), NumElts);
				GlobalVariable *RelLookupTable = new GlobalVariable(
				M, IntArrayTy, LookupTable.isConstant(), LookupTable.getLinkage(),
				nullptr, "reltable." + Func.getName());
				RelLookupTable->copyAttributesFrom(&LookupTable);

				uint64_t Idx = 0;
				SmallVector<Constant *, 64> RelLookupTableContents(NumElts);

				arichardson (Not Done): This should use the same address space as the original global.
				for (Use &Operand : LookupTableArr->operands()) {
				Constant *Element = cast<Constant>(Operand);
				Type *IntPtrTy = M.getDataLayout().getIntPtrType(M.getContext());
				Constant *Base = llvm::ConstantExpr::getPtrToInt(RelLookupTable, IntPtrTy);
				Constant *Target = llvm::ConstantExpr::getPtrToInt(Element, IntPtrTy);
				Constant *Sub = llvm::ConstantExpr::getSub(Target, Base);
				Constant *RelOffset =
				llvm::ConstantExpr::getTrunc(Sub, Type::getInt32Ty(M.getContext()));
				RelLookupTableContents[Idx++] = RelOffset;
				}

				Constant *Initializer =
				ConstantArray::get(IntArrayTy, RelLookupTableContents);
				RelLookupTable->setInitializer(Initializer);
				RelLookupTable->setUnnamedAddr(GlobalValue::UnnamedAddr::Global);
				RelLookupTable->setAlignment(llvm::Align(4));
				return RelLookupTable;
				}

				static void convertToRelLookupTable(GlobalVariable &LookupTable) {
				GetElementPtrInst *GEP =
				cast<GetElementPtrInst>(LookupTable.use_begin()->getUser());
				LoadInst *Load = cast<LoadInst>(GEP->use_begin()->getUser());

				Module &M = *LookupTable.getParent();
				BasicBlock *BB = GEP->getParent();
				IRBuilder<> Builder(BB);
				Function &Func = *BB->getParent();

				// Generate an array that consists of relative offsets.
				GlobalVariable *RelLookupTable = createRelLookupTable(Func, LookupTable);

				// Place new instruction sequence after GEP.
				Builder.SetInsertPoint(GEP);
				Value *Index = GEP->getOperand(2);
				IntegerType *IntTy = cast<IntegerType>(Index->getType());
				Value *Offset =
				Builder.CreateShl(Index, ConstantInt::get(IntTy, 2), "reltable.shift");

				Function *LoadRelIntrinsic = llvm::Intrinsic::getDeclaration(
				&M, Intrinsic::load_relative, {Index->getType()});
				jrtc27Unsubmitted Not Done Reply Inline Actions This line causes the bug seen in bind. In that case, the GEP has been hoisted, but the load has not. In general the GEP could be in a different basic block, or even in the same basic block with an instruction that may not return (intrinsic, real function call, well-defined language-level exception, etc). You can insert the reltable.shift where the GEP is, and that probably makes sense given it serves (part of) the same purpose, but you must insert the actual reltable.intrinsic where the original load is, unless you've gone to great lengths to prove it's safe not to (which seems best left to the usual culprits like LICM). IR test cases: https://godbolt.org/z/YMdaMrobE (bind is characterised by the first of the two functions) jrtc27: This line causes the bug seen in bind. In that case, the GEP has been hoisted, but the load has…
				gulfem: @dim and @jrtc27, thank you for reporting it. I see what's going wrong, and I uploaded a patch that fixes the issue by ensuring that the call to the load.relative intrinsic is inserted before the load rather than before the GEP. Please see https://reviews.llvm.org/D115571.
				Value *Base = Builder.CreateBitCast(RelLookupTable, Builder.getInt8PtrTy());

				// Create a call to load.relative intrinsic that computes the target address
				// by adding base address (lookup table address) and relative offset.
				Value *Result = Builder.CreateCall(LoadRelIntrinsic, {Base, Offset},
				"reltable.intrinsic");

				// Create a bitcast instruction if necessary.
				if (Load->getType() != Builder.getInt8PtrTy())
				Result = Builder.CreateBitCast(Result, Load->getType(), "reltable.bitcast");

				// Replace load instruction with the new generated instruction sequence.
				BasicBlock::iterator InsertPoint(Load);
				ReplaceInstWithValue(Load->getParent()->getInstList(), InsertPoint, Result);

				// Remove GEP instruction.
				GEP->eraseFromParent();
				}
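
As a rough illustration of what the emitted `reltable.shift` and `reltable.intrinsic` sequence computes, the sketch below models memory as a flat byte buffer. The table holds 32-bit offsets measured from the table's own base, the index is shifted left by 2 to get a byte offset, and the result is the base plus the loaded offset, which is how the LLVM language reference describes `llvm.load.relative`. The names `Image`, `buildImage`, `loadRelative`, and `lookup` are hypothetical and exist only in this model; nothing here is taken from the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical model: "addresses" are indices into one flat byte buffer.
struct Image {
  std::vector<unsigned char> mem;
  std::size_t tableBase = 0; // where the relative lookup table starts
};

// Models llvm.load.relative: read an int32 at base + offset, return base + it.
std::size_t loadRelative(const std::vector<unsigned char> &mem,
                         std::size_t base, std::size_t offset) {
  assert(base + offset + sizeof(std::int32_t) <= mem.size());
  std::int32_t rel;
  std::memcpy(&rel, mem.data() + base + offset, sizeof(rel));
  return static_cast<std::size_t>(static_cast<std::ptrdiff_t>(base) + rel);
}

// Lays out a table of 4-byte relative offsets followed by the string data.
Image buildImage(const std::vector<std::string> &strings) {
  Image img;
  img.mem.resize(strings.size() * sizeof(std::int32_t)); // reserve the table
  for (std::size_t i = 0; i < strings.size(); ++i) {
    std::size_t target = img.mem.size(); // address of this string
    img.mem.insert(img.mem.end(), strings[i].begin(), strings[i].end());
    img.mem.push_back('\0');
    // Store target - base instead of an absolute address.
    std::int32_t rel = static_cast<std::int32_t>(target - img.tableBase);
    std::memcpy(img.mem.data() + img.tableBase + i * sizeof(rel), &rel,
                sizeof(rel));
  }
  return img;
}

// Mirrors the rewritten access: shift the index, then do a relative load.
std::string lookup(const Image &img, std::size_t index) {
  std::size_t offset = index << 2; // index * 4, like reltable.shift
  std::size_t addr = loadRelative(img.mem, img.tableBase, offset);
  return std::string(reinterpret_cast<const char *>(img.mem.data() + addr));
}
```

Because each entry is a difference between two locations in the same image, the table contents do not change when the whole image is loaded at a different address, which is the property that avoids dynamic relocations.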

				// Convert lookup tables to relative lookup tables in the module.
				static bool convertToRelativeLookupTables(
				Module &M, function_ref<TargetTransformInfo &(Function &)> GetTTI) {
				Module::iterator FI = M.begin();
				if (FI == M.end())
				return false;

				// Check if we have a target that supports relative lookup tables.
				if (!GetTTI(*FI).shouldBuildRelLookupTables())
				return false;

				bool Changed = false;

				for (auto GVI = M.global_begin(), E = M.global_end(); GVI != E;) {
				GlobalVariable &GlobalVar = *GVI++;

				if (!shouldConvertToRelLookupTable(M, GlobalVar))
				continue;

				convertToRelLookupTable(GlobalVar);

				// Remove the original lookup table.
				lebedev.ri: `make_early_inc_range()`?
				GlobalVar.eraseFromParent();
				Changed = true;
				}

				return Changed;
				}
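
The loop above advances the iterator before the current global is erased. The same early-increment idiom, which the reviewer suggests expressing with `make_early_inc_range()`, can be sketched with a plain `std::list` standing in for the module's global list. `eraseMatching` and the string elements are assumptions made for this illustration and are not LLVM APIs.

```cpp
#include <list>
#include <string>

// Erases every element for which shouldErase returns true.
// The iterator is advanced before the current node is erased, so erasing
// never invalidates the iterator the loop continues with.
template <typename Pred>
bool eraseMatching(std::list<std::string> &globals, Pred shouldErase) {
  bool changed = false;
  for (auto it = globals.begin(), end = globals.end(); it != end;) {
    auto current = it++; // early increment: step past the node first
    if (!shouldErase(*current))
      continue;
    globals.erase(current); // only `current` is invalidated
    changed = true;
  }
  return changed;
}
```

A range adaptor that performs the increment internally lets the same loop be written as a range-based `for`, which is the readability gain the suggestion is after.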

				PreservedAnalyses RelLookupTableConverterPass::run(Module &M,
				ModuleAnalysisManager &AM) {
				FunctionAnalysisManager &FAM =
				AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();

				auto GetTTI = [&](Function &F) -> TargetTransformInfo & {
				return FAM.getResult<TargetIRAnalysis>(F);
				};

				if (!convertToRelativeLookupTables(M, GetTTI))
				return PreservedAnalyses::all();

				return PreservedAnalyses::none();
				}

				namespace {

				/// Pass that converts lookup tables to relative lookup tables.
				class RelLookupTableConverterLegacyPass : public ModulePass {

				lebedev.ri: I would think this should be `PreservedAnalyses PA; PA.preserveSet<CFGAnalyses>(); return PA;` since this doesn't touch the CFG at all. I think this should get rid of the redundant `Running analysis: TargetIRAnalysis`.
				public:
				/// Pass identification, replacement for typeid
				static char ID;

				/// Specify pass name for debug output
				StringRef getPassName() const override {
				return "Relative Lookup Table Converter";
				}

				RelLookupTableConverterLegacyPass() : ModulePass(ID) {
				initializeRelLookupTableConverterLegacyPassPass(
				*PassRegistry::getPassRegistry());
				}

				bool runOnModule(Module &M) override {
				auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
				return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
				};
				leonardchan: Should we be returning the value returned by this?
				return convertToRelativeLookupTables(M, GetTTI);
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addRequired<TargetTransformInfoWrapperPass>();
				}
				};

				} // anonymous namespace

				char RelLookupTableConverterLegacyPass::ID = 0;

				INITIALIZE_PASS_BEGIN(RelLookupTableConverterLegacyPass,
				"rel-lookup-table-converter",
				"Convert to relative lookup tables", false, false)
				INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
				INITIALIZE_PASS_END(RelLookupTableConverterLegacyPass,
				"rel-lookup-table-converter",
				"Convert to relative lookup tables", false, false)

				namespace llvm {
				ModulePass *createRelLookupTableConverterPass() {
				return new RelLookupTableConverterLegacyPass();
				}
				} // end namespace llvm

llvm/lib/Transforms/Utils/Utils.cpp

Show All 31 Lines	void llvm::initializeTransformUtils(PassRegistry &Registry) {
initializeInstNamerPass(Registry);		initializeInstNamerPass(Registry);
initializeLCSSAWrapperPassPass(Registry);		initializeLCSSAWrapperPassPass(Registry);
initializeLibCallsShrinkWrapLegacyPassPass(Registry);		initializeLibCallsShrinkWrapLegacyPassPass(Registry);
initializeLoopSimplifyPass(Registry);		initializeLoopSimplifyPass(Registry);
initializeLowerInvokeLegacyPassPass(Registry);		initializeLowerInvokeLegacyPassPass(Registry);
initializeLowerSwitchLegacyPassPass(Registry);		initializeLowerSwitchLegacyPassPass(Registry);
initializeNameAnonGlobalLegacyPassPass(Registry);		initializeNameAnonGlobalLegacyPassPass(Registry);
initializePromoteLegacyPassPass(Registry);		initializePromoteLegacyPassPass(Registry);
		initializeRelLookupTableConverterLegacyPassPass(Registry);
initializeStripNonLineTableDebugLegacyPassPass(Registry);		initializeStripNonLineTableDebugLegacyPassPass(Registry);
initializeUnifyFunctionExitNodesLegacyPassPass(Registry);		initializeUnifyFunctionExitNodesLegacyPassPass(Registry);
initializeMetaRenamerPass(Registry);		initializeMetaRenamerPass(Registry);
initializeStripGCRelocatesLegacyPass(Registry);		initializeStripGCRelocatesLegacyPass(Registry);
initializePredicateInfoPrinterLegacyPassPass(Registry);		initializePredicateInfoPrinterLegacyPassPass(Registry);
initializeInjectTLIMappingsLegacyPass(Registry);		initializeInjectTLIMappingsLegacyPass(Registry);
initializeFixIrreduciblePass(Registry);		initializeFixIrreduciblePass(Registry);
initializeUnifyLoopExitsLegacyPassPass(Registry);		initializeUnifyLoopExitsLegacyPassPass(Registry);
Show All 19 Lines

llvm/test/CodeGen/AMDGPU/opt-pipeline.ll

	Show First 20 Lines • Show All 300 Lines • ▼ Show 20 Lines
	; GCN-O1-NEXT: Loop Pass Manager			; GCN-O1-NEXT: Loop Pass Manager
	; GCN-O1-NEXT: Loop Sink			; GCN-O1-NEXT: Loop Sink
	; GCN-O1-NEXT: Lazy Branch Probability Analysis			; GCN-O1-NEXT: Lazy Branch Probability Analysis
	; GCN-O1-NEXT: Lazy Block Frequency Analysis			; GCN-O1-NEXT: Lazy Block Frequency Analysis
	; GCN-O1-NEXT: Optimization Remark Emitter			; GCN-O1-NEXT: Optimization Remark Emitter
	; GCN-O1-NEXT: Remove redundant instructions			; GCN-O1-NEXT: Remove redundant instructions
	; GCN-O1-NEXT: Hoist/decompose integer division and remainder			; GCN-O1-NEXT: Hoist/decompose integer division and remainder
	; GCN-O1-NEXT: Simplify the CFG			; GCN-O1-NEXT: Simplify the CFG
				; GCN-O1-NEXT: Relative Lookup Table Converter
				; GCN-O1-NEXT: FunctionPass Manager
	; GCN-O1-NEXT: Annotation Remarks			; GCN-O1-NEXT: Annotation Remarks

	; GCN-O1-NEXT: Pass Arguments:			; GCN-O1-NEXT: Pass Arguments:
	; GCN-O1-NEXT: FunctionPass Manager			; GCN-O1-NEXT: FunctionPass Manager
	; GCN-O1-NEXT: Dominator Tree Construction			; GCN-O1-NEXT: Dominator Tree Construction

	; GCN-O1-NEXT: Pass Arguments:			; GCN-O1-NEXT: Pass Arguments:
	; GCN-O1-NEXT: FunctionPass Manager			; GCN-O1-NEXT: FunctionPass Manager
	▲ Show 20 Lines • Show All 338 Lines • ▼ Show 20 Lines
	; GCN-O2-NEXT: Loop Pass Manager			; GCN-O2-NEXT: Loop Pass Manager
	; GCN-O2-NEXT: Loop Sink			; GCN-O2-NEXT: Loop Sink
	; GCN-O2-NEXT: Lazy Branch Probability Analysis			; GCN-O2-NEXT: Lazy Branch Probability Analysis
	; GCN-O2-NEXT: Lazy Block Frequency Analysis			; GCN-O2-NEXT: Lazy Block Frequency Analysis
	; GCN-O2-NEXT: Optimization Remark Emitter			; GCN-O2-NEXT: Optimization Remark Emitter
	; GCN-O2-NEXT: Remove redundant instructions			; GCN-O2-NEXT: Remove redundant instructions
	; GCN-O2-NEXT: Hoist/decompose integer division and remainder			; GCN-O2-NEXT: Hoist/decompose integer division and remainder
	; GCN-O2-NEXT: Simplify the CFG			; GCN-O2-NEXT: Simplify the CFG
				; GCN-O2-NEXT: Relative Lookup Table Converter
				; GCN-O2-NEXT: FunctionPass Manager
	; GCN-O2-NEXT: Annotation Remarks			; GCN-O2-NEXT: Annotation Remarks

	; GCN-O2-NEXT: Pass Arguments:			; GCN-O2-NEXT: Pass Arguments:
	; GCN-O2-NEXT: FunctionPass Manager			; GCN-O2-NEXT: FunctionPass Manager
	; GCN-O2-NEXT: Dominator Tree Construction			; GCN-O2-NEXT: Dominator Tree Construction

	; GCN-O2-NEXT: Pass Arguments:			; GCN-O2-NEXT: Pass Arguments:
	; GCN-O2-NEXT: FunctionPass Manager			; GCN-O2-NEXT: FunctionPass Manager
	▲ Show 20 Lines • Show All 343 Lines • ▼ Show 20 Lines
	; GCN-O3-NEXT: Loop Pass Manager			; GCN-O3-NEXT: Loop Pass Manager
	; GCN-O3-NEXT: Loop Sink			; GCN-O3-NEXT: Loop Sink
	; GCN-O3-NEXT: Lazy Branch Probability Analysis			; GCN-O3-NEXT: Lazy Branch Probability Analysis
	; GCN-O3-NEXT: Lazy Block Frequency Analysis			; GCN-O3-NEXT: Lazy Block Frequency Analysis
	; GCN-O3-NEXT: Optimization Remark Emitter			; GCN-O3-NEXT: Optimization Remark Emitter
	; GCN-O3-NEXT: Remove redundant instructions			; GCN-O3-NEXT: Remove redundant instructions
	; GCN-O3-NEXT: Hoist/decompose integer division and remainder			; GCN-O3-NEXT: Hoist/decompose integer division and remainder
	; GCN-O3-NEXT: Simplify the CFG			; GCN-O3-NEXT: Simplify the CFG
				; GCN-O3-NEXT: Relative Lookup Table Converter
				; GCN-O3-NEXT: FunctionPass Manager
	; GCN-O3-NEXT: Annotation Remarks			; GCN-O3-NEXT: Annotation Remarks

	; GCN-O3-NEXT: Pass Arguments:			; GCN-O3-NEXT: Pass Arguments:
	; GCN-O3-NEXT: FunctionPass Manager			; GCN-O3-NEXT: FunctionPass Manager
	; GCN-O3-NEXT: Dominator Tree Construction			; GCN-O3-NEXT: Dominator Tree Construction

	; GCN-O3-NEXT: Pass Arguments:			; GCN-O3-NEXT: Pass Arguments:
	; GCN-O3-NEXT: FunctionPass Manager			; GCN-O3-NEXT: FunctionPass Manager
	Show All 23 Lines

llvm/test/Other/new-pm-defaults.ll

	Show First 20 Lines • Show All 107 Lines • ▼ Show 20 Lines
	; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass			; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass
	; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis			; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running analysis: CallGraphAnalysis			; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis			; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.}}LazyCallGraph::SCC{{.}}>			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.}}LazyCallGraph::SCC{{.}}>
	; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass			; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass			; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
	; CHECK-O2-NEXT: Running pass: OpenMPOptPass on (foo)			; CHECK-O2-NEXT: Running pass: OpenMPOptPass on (foo)
	▲ Show 20 Lines • Show All 124 Lines • ▼ Show 20 Lines
	; CHECK-O-NEXT: Running pass: DivRemPairsPass			; CHECK-O-NEXT: Running pass: DivRemPairsPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass			; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass
	; CHECK-EP-OPTIMIZER-LAST: Running pass: NoOpFunctionPass			; CHECK-EP-OPTIMIZER-LAST: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: CGProfilePass			; CHECK-O-NEXT: Running pass: CGProfilePass
	; CHECK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-O-NEXT: Running pass: ConstantMergePass			; CHECK-O-NEXT: Running pass: ConstantMergePass
				; CHECK-O-NEXT: Running pass: RelLookupTableConverterPass
				; CHECK-O-NEXT: Running analysis: TargetIRAnalysis
	; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo			; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
	; CHECK-LTO-NEXT: Running pass: CanonicalizeAliasesPass			; CHECK-LTO-NEXT: Running pass: CanonicalizeAliasesPass
	; CHECK-LTO-NEXT: Running pass: NameAnonGlobalPass			; CHECK-LTO-NEXT: Running pass: NameAnonGlobalPass
	; CHECK-O-NEXT: Running pass: PrintModulePass			; CHECK-O-NEXT: Running pass: PrintModulePass
	;			;
	; Make sure we get the IR back out without changes when we print the module.			; Make sure we get the IR back out without changes when we print the module.
	; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {			; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {
	; CHECK-O-NEXT: entry:			; CHECK-O-NEXT: entry:
	Show All 27 Lines

llvm/test/Other/new-pm-thinlto-defaults.ll

	Show First 20 Lines • Show All 92 Lines • ▼ Show 20 Lines
	; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass			; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass
	; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis			; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
	; CHECK-O-NEXT: Starting llvm::Module pass manager run.			; CHECK-O-NEXT: Starting llvm::Module pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running analysis: CallGraphAnalysis			; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-PRELINK-O-NEXT: Running analysis: ProfileSummaryAnalysis			; CHECK-PRELINK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
	; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass			; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass			; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
	; CHECK-O2-NEXT: Running pass: OpenMPOptPass on (foo)			; CHECK-O2-NEXT: Running pass: OpenMPOptPass on (foo)
	▲ Show 20 Lines • Show All 126 Lines • ▼ Show 20 Lines
	; CHECK-POSTLINK-O-NEXT: Running pass: InstSimplifyPass			; CHECK-POSTLINK-O-NEXT: Running pass: InstSimplifyPass
	; CHECK-POSTLINK-O-NEXT: Running pass: DivRemPairsPass			; CHECK-POSTLINK-O-NEXT: Running pass: DivRemPairsPass
	; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-POSTLINK-O-NEXT: Running pass: SpeculateAroundPHIsPass			; CHECK-POSTLINK-O-NEXT: Running pass: SpeculateAroundPHIsPass
	; CHECK-POSTLINK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-POSTLINK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-POSTLINK-O-NEXT: Running pass: CGProfilePass			; CHECK-POSTLINK-O-NEXT: Running pass: CGProfilePass
	; CHECK-POSTLINK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-POSTLINK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-POSTLINK-O-NEXT: Running pass: ConstantMergePass			; CHECK-POSTLINK-O-NEXT: Running pass: ConstantMergePass
				; CHECK-POSTLINK-O-NEXT: Running pass: RelLookupTableConverterPass
				; CHECK-POSTLINK-O-NEXT: Running analysis: TargetIRAnalysis
	; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo			; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
	; CHECK-PRELINK-O-NEXT: Running pass: CanonicalizeAliasesPass			; CHECK-PRELINK-O-NEXT: Running pass: CanonicalizeAliasesPass
	; CHECK-PRELINK-O-NEXT: Running pass: NameAnonGlobalPass			; CHECK-PRELINK-O-NEXT: Running pass: NameAnonGlobalPass
	; CHECK-O-NEXT: Running pass: PrintModulePass			; CHECK-O-NEXT: Running pass: PrintModulePass

	; Make sure we get the IR back out without changes when we print the module.			; Make sure we get the IR back out without changes when we print the module.
	; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {			; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {
	; CHECK-O-NEXT: entry:			; CHECK-O-NEXT: entry:
	Show All 27 Lines

llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll

	Show First 20 Lines • Show All 62 Lines • ▼ Show 20 Lines
	; CHECK-O-DAG: Running analysis: LoopAnalysis on foo			; CHECK-O-DAG: Running analysis: LoopAnalysis on foo
	; CHECK-O-DAG: Running analysis: BranchProbabilityAnalysis on foo			; CHECK-O-DAG: Running analysis: BranchProbabilityAnalysis on foo
	; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis on foo			; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis on foo
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.			; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.
	; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass			; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass
	; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis			; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
	; CHECK-O-NEXT: Starting {{.*}}Module pass manager run.			; CHECK-O-NEXT: Starting {{.*}}Module pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running analysis: CallGraphAnalysis			; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.}}LazyCallGraph::SCC{{.}}>			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.}}LazyCallGraph::SCC{{.}}>
	; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass			; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	▲ Show 20 Lines • Show All 124 Lines • ▼ Show 20 Lines
	; CHECK-O-NEXT: Running pass: InstSimplifyPass			; CHECK-O-NEXT: Running pass: InstSimplifyPass
	; CHECK-O-NEXT: Running pass: DivRemPairsPass			; CHECK-O-NEXT: Running pass: DivRemPairsPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass			; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass
	; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.			; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.
	; CHECK-O-NEXT: Running pass: CGProfilePass			; CHECK-O-NEXT: Running pass: CGProfilePass
	; CHECK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-O-NEXT: Running pass: ConstantMergePass			; CHECK-O-NEXT: Running pass: ConstantMergePass
				; CHECK-O-NEXT: Running pass: RelLookupTableConverterPass
				; CHECK-O-NEXT: Running analysis: TargetIRAnalysis
	; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo			; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
	; CHECK-O-NEXT: Running pass: PrintModulePass			; CHECK-O-NEXT: Running pass: PrintModulePass

	; Make sure we get the IR back out without changes when we print the module.			; Make sure we get the IR back out without changes when we print the module.
	; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {			; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr {
	; CHECK-O-NEXT: entry:			; CHECK-O-NEXT: entry:
	; CHECK-O-NEXT: br label %loop			; CHECK-O-NEXT: br label %loop
	; CHECK-O: loop:			; CHECK-O: loop:
	▲ Show 20 Lines • Show All 57 Lines • Show Last 20 Lines

llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll

	Show First 20 Lines • Show All 72 Lines • ▼ Show 20 Lines
	; CHECK-O-DAG: Running analysis: BranchProbabilityAnalysis on foo			; CHECK-O-DAG: Running analysis: BranchProbabilityAnalysis on foo
	; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis on foo			; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis on foo
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass on foo			; CHECK-O-NEXT: Running pass: SimplifyCFGPass on foo
	; CHECK-O-NEXT: Finished {{.*}}Function pass manager run			; CHECK-O-NEXT: Finished {{.*}}Function pass manager run

	; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass			; CHECK-O-NEXT: Running pass: ModuleInlinerWrapperPass
	; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis			; CHECK-O-NEXT: Running analysis: InlineAdvisorAnalysis
	; CHECK-O-NEXT: Starting {{.*}}Module pass manager run.			; CHECK-O-NEXT: Starting {{.*}}Module pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
	; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass			; CHECK-O-NEXT: Running pass: DevirtSCCRepeatedPass
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	▲ Show 20 Lines • Show All 127 Lines • ▼ Show 20 Lines
	; CHECK-O-NEXT: Running pass: InstSimplifyPass			; CHECK-O-NEXT: Running pass: InstSimplifyPass
	; CHECK-O-NEXT: Running pass: DivRemPairsPass			; CHECK-O-NEXT: Running pass: DivRemPairsPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass			; CHECK-O-NEXT: Running pass: SpeculateAroundPHIsPass
	; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.			; CHECK-O-NEXT: Finished {{.*}}Function pass manager run.
	; CHECK-O-NEXT: Running pass: CGProfilePass			; CHECK-O-NEXT: Running pass: CGProfilePass
	; CHECK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-O-NEXT: Running pass: ConstantMergePass			; CHECK-O-NEXT: Running pass: ConstantMergePass
				; CHECK-O-NEXT: Running pass: RelLookupTableConverterPass
				; CHECK-O-NEXT: Running analysis: TargetIRAnalysis
	; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo			; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
	; CHECK-O-NEXT: Running pass: PrintModulePass			; CHECK-O-NEXT: Running pass: PrintModulePass

	; Make sure we get the IR back out without changes when we print the module.			; Make sure we get the IR back out without changes when we print the module.
	; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr			; CHECK-O-LABEL: define void @foo(i32 %n) local_unnamed_addr
	; CHECK-O-NEXT: entry:			; CHECK-O-NEXT: entry:
	; CHECK-O-NEXT: br label %loop			; CHECK-O-NEXT: br label %loop
	; CHECK-O: loop:			; CHECK-O: loop:
	Show All 27 Lines

llvm/test/Other/opt-O2-pipeline.ll

	Show First 20 Lines • Show All 301 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Loop Pass Manager			; CHECK-NEXT: Loop Pass Manager
	; CHECK-NEXT: Loop Sink			; CHECK-NEXT: Loop Sink
	; CHECK-NEXT: Lazy Branch Probability Analysis			; CHECK-NEXT: Lazy Branch Probability Analysis
	; CHECK-NEXT: Lazy Block Frequency Analysis			; CHECK-NEXT: Lazy Block Frequency Analysis
	; CHECK-NEXT: Optimization Remark Emitter			; CHECK-NEXT: Optimization Remark Emitter
	; CHECK-NEXT: Remove redundant instructions			; CHECK-NEXT: Remove redundant instructions
	; CHECK-NEXT: Hoist/decompose integer division and remainder			; CHECK-NEXT: Hoist/decompose integer division and remainder
	; CHECK-NEXT: Simplify the CFG			; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Relative Lookup Table Converter
				; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Annotation Remarks			; CHECK-NEXT: Annotation Remarks
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Bitcode Writer			; CHECK-NEXT: Bitcode Writer
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: Target Library Information			; CHECK-NEXT: Target Library Information
	Show All 18 Lines

llvm/test/Other/opt-O3-pipeline-enable-matrix.ll

	Show First 20 Lines • Show All 313 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Loop Pass Manager			; CHECK-NEXT: Loop Pass Manager
	; CHECK-NEXT: Loop Sink			; CHECK-NEXT: Loop Sink
	; CHECK-NEXT: Lazy Branch Probability Analysis			; CHECK-NEXT: Lazy Branch Probability Analysis
	; CHECK-NEXT: Lazy Block Frequency Analysis			; CHECK-NEXT: Lazy Block Frequency Analysis
	; CHECK-NEXT: Optimization Remark Emitter			; CHECK-NEXT: Optimization Remark Emitter
	; CHECK-NEXT: Remove redundant instructions			; CHECK-NEXT: Remove redundant instructions
	; CHECK-NEXT: Hoist/decompose integer division and remainder			; CHECK-NEXT: Hoist/decompose integer division and remainder
	; CHECK-NEXT: Simplify the CFG			; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Relative Lookup Table Converter
				; CHECK-NEXT: FunctionPass Manager
				rnk: Putting a ModulePass in the middle of the CodeGen pass pipeline creates a "pass barrier": now instead of applying every pass to each function in turn, the old pass manager will stop, run this whole-module pass, and then run subsequent passes in the next function pass manager on each function in turn. This isn't ideal. @aeubanks, can you follow up to make sure this is addressed? We had the same issues with the SymbolRewriter pass, which if you grep for "Rewrite Symbols" you can see has the same issue. I remember writing a patch to fix it, but I guess I never landed it.
				aeubanks: I see "Rewrite Symbols" in the codegen pipeline and yeah it's splitting the function pass manager. For this patch, can we just not add the pass to the legacy PM pipeline? It's deprecated and the new PM is already the default for the optimization pipeline.
				aeubanks: (https://reviews.llvm.org/D99707 for anybody interested)
				gulfem: "For this patch, can we just not add the pass to the legacy PM pipeline? It's deprecated and the new PM is already the default for the optimization pipeline." @rnk @aeubanks If it causes issues, I'm OK with removing it from the legacy PM pipeline. When I land this patch, I'll only add it to the new PM.
	; CHECK-NEXT: Annotation Remarks			; CHECK-NEXT: Annotation Remarks
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Bitcode Writer			; CHECK-NEXT: Bitcode Writer
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: Target Library Information			; CHECK-NEXT: Target Library Information
	Show All 18 Lines
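
The "pass barrier" described in the inline comments can be pictured with a small scheduling model. This is a deliberately simplified sketch and not how the legacy pass manager is implemented: it assumes that consecutive function passes are grouped into one function pass manager that runs the whole group on each function in turn, and that a module pass ends the current group. `Pass` and `schedule` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified description of a pass in the pipeline.
struct Pass {
  std::string name;
  bool isModulePass;
};

// Returns the execution order as "Pass(function)" or "Pass(module)".
std::vector<std::string> schedule(const std::vector<Pass> &pipeline,
                                  const std::vector<std::string> &functions) {
  std::vector<std::string> trace;
  std::size_t i = 0;
  while (i < pipeline.size()) {
    if (pipeline[i].isModulePass) {
      // A module pass runs once over the whole module: the barrier.
      trace.push_back(pipeline[i].name + "(module)");
      ++i;
      continue;
    }
    // Collect the maximal run of consecutive function passes.
    std::size_t j = i;
    while (j < pipeline.size() && !pipeline[j].isModulePass)
      ++j;
    // One function pass manager: all passes of the run on each function.
    for (const std::string &f : functions)
      for (std::size_t k = i; k < j; ++k)
        trace.push_back(pipeline[k].name + "(" + f + ")");
    i = j;
  }
  return trace;
}
```

In this model, a pipeline of two function passes on functions `f` and `g` finishes `f` completely before touching `g`. Placing a module pass between them forces the first pass to run on both functions, then the module pass, then the second pass on both functions, which matches the extra `FunctionPass Manager` line in the pipeline tests above.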

llvm/test/Other/opt-O3-pipeline.ll

	Show First 20 Lines • Show All 306 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Loop Pass Manager			; CHECK-NEXT: Loop Pass Manager
	; CHECK-NEXT: Loop Sink			; CHECK-NEXT: Loop Sink
	; CHECK-NEXT: Lazy Branch Probability Analysis			; CHECK-NEXT: Lazy Branch Probability Analysis
	; CHECK-NEXT: Lazy Block Frequency Analysis			; CHECK-NEXT: Lazy Block Frequency Analysis
	; CHECK-NEXT: Optimization Remark Emitter			; CHECK-NEXT: Optimization Remark Emitter
	; CHECK-NEXT: Remove redundant instructions			; CHECK-NEXT: Remove redundant instructions
	; CHECK-NEXT: Hoist/decompose integer division and remainder			; CHECK-NEXT: Hoist/decompose integer division and remainder
	; CHECK-NEXT: Simplify the CFG			; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Relative Lookup Table Converter
				; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Annotation Remarks			; CHECK-NEXT: Annotation Remarks
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Bitcode Writer			; CHECK-NEXT: Bitcode Writer
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: Target Library Information			; CHECK-NEXT: Target Library Information

llvm/test/Other/opt-Os-pipeline.ll

	; CHECK-NEXT: Loop Pass Manager			; CHECK-NEXT: Loop Pass Manager
	; CHECK-NEXT: Loop Sink			; CHECK-NEXT: Loop Sink
	; CHECK-NEXT: Lazy Branch Probability Analysis			; CHECK-NEXT: Lazy Branch Probability Analysis
	; CHECK-NEXT: Lazy Block Frequency Analysis			; CHECK-NEXT: Lazy Block Frequency Analysis
	; CHECK-NEXT: Optimization Remark Emitter			; CHECK-NEXT: Optimization Remark Emitter
	; CHECK-NEXT: Remove redundant instructions			; CHECK-NEXT: Remove redundant instructions
	; CHECK-NEXT: Hoist/decompose integer division and remainder			; CHECK-NEXT: Hoist/decompose integer division and remainder
	; CHECK-NEXT: Simplify the CFG			; CHECK-NEXT: Simplify the CFG
				; CHECK-NEXT: Relative Lookup Table Converter
				; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Annotation Remarks			; CHECK-NEXT: Annotation Remarks
	; CHECK-NEXT: Module Verifier			; CHECK-NEXT: Module Verifier
	; CHECK-NEXT: Bitcode Writer			; CHECK-NEXT: Bitcode Writer
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: FunctionPass Manager			; CHECK-NEXT: FunctionPass Manager
	; CHECK-NEXT: Dominator Tree Construction			; CHECK-NEXT: Dominator Tree Construction
	; CHECK-NEXT: Pass Arguments:			; CHECK-NEXT: Pass Arguments:
	; CHECK-NEXT: Target Library Information			; CHECK-NEXT: Target Library Information

llvm/test/Other/pass-pipelines.ll

	; the runtime unrolling though.			; the runtime unrolling though.
	; CHECK-O2: Loop Pass Manager			; CHECK-O2: Loop Pass Manager
	; CHECK-O2-NEXT: Loop Invariant Code Motion			; CHECK-O2-NEXT: Loop Invariant Code Motion
	; SPLIT: Hot Cold Splitting			; SPLIT: Hot Cold Splitting
	; CHECK-O2: FunctionPass Manager			; CHECK-O2: FunctionPass Manager
	; CHECK-O2: Loop Pass Manager			; CHECK-O2: Loop Pass Manager
	; CHECK-O2-NEXT: Loop Sink			; CHECK-O2-NEXT: Loop Sink
	; CHECK-O2: Simplify the CFG			; CHECK-O2: Simplify the CFG
				; CHECK-O2: Relative Lookup Table Converter
				; CHECK-O2: FunctionPass Manager
	; CHECK-O2-NOT: Manager			; CHECK-O2-NOT: Manager
	;			;
	; FIXME: There really shouldn't be another pass manager, especially one that			; FIXME: There really shouldn't be another pass manager, especially one that
	; just builds the domtree. It doesn't even run the verifier.			; just builds the domtree. It doesn't even run the verifier.
	; CHECK-O2: Pass Arguments:			; CHECK-O2: Pass Arguments:
	; CHECK-O2: FunctionPass Manager			; CHECK-O2: FunctionPass Manager
	; CHECK-O2-NEXT: Dominator Tree Construction			; CHECK-O2-NEXT: Dominator Tree Construction

	define void @foo() {			define void @foo() {
	ret void			ret void
	}			}

llvm/test/Transforms/RelLookupTableConverter/relative_lookup_table.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -rel-lookup-table-converter -relocation-model=pic -S | FileCheck %s
				; RUN: opt < %s -passes=rel-lookup-table-converter -relocation-model=pic -S | FileCheck %s
				leonardchan: We should also check some other `RUN`s to check that this isn't run on cases that return false in `shouldBuildRelLookupTables`: non-PIC, non-64-bit, other code model sizes, etc.
				target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				@.str = private unnamed_addr constant [5 x i8] c"zero\00", align 1
				@.str.1 = private unnamed_addr constant [4 x i8] c"one\00", align 1
				@.str.2 = private unnamed_addr constant [4 x i8] c"two\00", align 1
				@.str.3 = private unnamed_addr constant [8 x i8] c"default\00", align 1
				@.str.4 = private unnamed_addr constant [6 x i8] c"three\00", align 1
				@.str.5 = private unnamed_addr constant [5 x i8] c"str1\00", align 1
				@.str.6 = private unnamed_addr constant [5 x i8] c"str2\00", align 1
				@.str.7 = private unnamed_addr constant [12 x i8] c"singlevalue\00", align 1

				@a1 = external global i32, align 4
				@b1 = external global i32, align 4
				@c1 = external global i32, align 4
				@d1 = external global i32, align 4

				@a2 = internal global i32 0, align 4
				@b2 = internal global i32 0, align 4
				@c2 = internal global i32 0, align 4
				@d2 = internal global i32 0, align 4

				@hidden0 = external hidden global i32, align 8
				@hidden1 = external hidden global i32, align 8
				@hidden2 = external hidden global i32, align 8
				@hidden3 = external hidden global i32, align 8

				@switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* @a1, i32* @b1, i32* @c1], align 8

				@switch.table.dso_local = private unnamed_addr constant [3 x i32*] [i32* @a2, i32* @b2, i32* @c2], align 8

				@switch.table.hidden = private unnamed_addr constant [3 x i32*] [i32* @hidden0, i32* @hidden1, i32* @hidden2], align 8

				@switch.table.string_table = private unnamed_addr constant [3 x i8*]
				[
				i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
				], align 8

				@switch.table.string_table_holes = private unnamed_addr constant [4 x i8*]
				[
				i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0),
				i8* getelementptr inbounds ([6 x i8], [6 x i8]* @.str.4, i64 0, i64 0)
				], align 8

				@switch.table.single_value = private unnamed_addr constant [3 x i8*]
				[
				i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
				], align 8

				@user_defined_lookup_table.table = internal unnamed_addr constant [3 x i8*]
				[
				i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i32 0, i32 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i32 0, i32 0)
				], align 16

				; Lookup table for non dso-local integer pointers
				; CHECK: @switch.table.no_dso_local = private unnamed_addr constant [3 x i32*] [i32* @a1, i32* @b1, i32* @c1], align

				; Relative switch lookup table for dso-local integer pointers
				; CHECK: @reltable.dso_local = private unnamed_addr constant [3 x i32]
				; CHECK-SAME: [
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @a2 to i64), i64 ptrtoint ([3 x i32]* @reltable.dso_local to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @b2 to i64), i64 ptrtoint ([3 x i32]* @reltable.dso_local to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @c2 to i64), i64 ptrtoint ([3 x i32]* @reltable.dso_local to i64)) to i32)
				; CHECK-SAME: ], align 4

				; Relative switch lookup table for integer pointers with hidden visibility
				; CHECK: @reltable.hidden = private unnamed_addr constant [3 x i32]
				; CHECK-SAME: [
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @hidden0 to i64), i64 ptrtoint ([3 x i32]* @reltable.hidden to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @hidden1 to i64), i64 ptrtoint ([3 x i32]* @reltable.hidden to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint (i32* @hidden2 to i64), i64 ptrtoint ([3 x i32]* @reltable.hidden to i64)) to i32)
				; CHECK-SAME: ], align 4

				; Relative switch lookup table for strings
				; CHECK: @reltable.string_table = private unnamed_addr constant [3 x i32]
				; CHECK-SAME: [
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([5 x i8]* @.str to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.1 to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.2 to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32)
				; CHECK-SAME: ], align 4

				; Relative switch lookup table for strings with holes, where holes are filled with relative offset to default values
				; CHECK: @reltable.string_table_holes = private unnamed_addr constant [4 x i32]
				; CHECK-SAME: [
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([5 x i8]* @.str to i64), i64 ptrtoint ([4 x i32]* @reltable.string_table_holes to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([8 x i8]* @.str.3 to i64), i64 ptrtoint ([4 x i32]* @reltable.string_table_holes to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.2 to i64), i64 ptrtoint ([4 x i32]* @reltable.string_table_holes to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([6 x i8]* @.str.4 to i64), i64 ptrtoint ([4 x i32]* @reltable.string_table_holes to i64)) to i32)
				; CHECK-SAME: ], align 4

				; Single value check
				; CHECK: @reltable.single_value = private unnamed_addr constant [3 x i32]
				; CHECK-SAME: [
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([5 x i8]* @.str to i64), i64 ptrtoint ([3 x i32]* @reltable.single_value to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.1 to i64), i64 ptrtoint ([3 x i32]* @reltable.single_value to i64)) to i32),
				; CHECK-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.2 to i64), i64 ptrtoint ([3 x i32]* @reltable.single_value to i64)) to i32)
				; CHECK-SAME: ], align 4
				;

				; Lookup table check for non dso-local integer pointers
				define i32* @no_dso_local(i32 %cond) {
				; CHECK-LABEL: @no_dso_local(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[SWITCH_GEP:%.*]] = getelementptr inbounds [3 x i32*], [3 x i32*]* @switch.table.no_dso_local, i32 0, i32 [[COND:%.*]]
				; CHECK-NEXT: [[SWITCH_LOAD:%.*]] = load i32*, i32** [[SWITCH_GEP]], align 8
				; CHECK-NEXT: ret i32* [[SWITCH_LOAD]]
				; CHECK: return:
				; CHECK-NEXT: ret i32* @d1
				;
				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i32*], [3 x i32*]* @switch.table.no_dso_local, i32 0, i32 %cond
				%switch.load = load i32*, i32** %switch.gep, align 8
				ret i32* %switch.load

				return: ; preds = %entry
				ret i32* @d1
				}

				; Relative switch lookup table for dso-local integer pointers
				define i32* @dso_local(i32 %cond) {
				; CHECK-LABEL: @dso_local(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 %cond, 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([3 x i32]* @reltable.dso_local to i8*), i32 [[RELTABLE_SHIFT]])
				; CHECK-NEXT: [[BIT_CAST:%.*]] = bitcast i8* [[RELTABLE_INTRINSIC]] to i32*
				; CHECK-NEXT: ret i32* [[BIT_CAST]]
				; CHECK: return:
				; CHECK-NEXT: ret i32* @d2
				;
				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i32*], [3 x i32*]* @switch.table.dso_local, i32 0, i32 %cond
				%switch.load = load i32*, i32** %switch.gep, align 8
				ret i32* %switch.load

				return: ; preds = %entry
				ret i32* @d2
				}

				; Relative switch lookup table for integer pointers with hidden visibility
				define i32* @hidden(i32 %cond) {
				; CHECK-LABEL: @hidden(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 %cond, 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([3 x i32]* @reltable.hidden to i8*), i32 [[RELTABLE_SHIFT]])
				; CHECK-NEXT: [[BIT_CAST:%.*]] = bitcast i8* [[RELTABLE_INTRINSIC]] to i32*
				; CHECK-NEXT: ret i32* [[BIT_CAST]]
				; CHECK: return:
				; CHECK-NEXT: ret i32* @d2
				;
				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i32*], [3 x i32*]* @switch.table.hidden, i32 0, i32 %cond
				%switch.load = load i32*, i32** %switch.gep, align 8
				ret i32* %switch.load

				return: ; preds = %entry
				ret i32* @d2
				}

				; ; Relative switch lookup table for strings
				define i8* @string_table(i32 %cond) {
				; CHECK-LABEL: @string_table(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 %cond, 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([3 x i32]* @reltable.string_table to i8*), i32 [[RELTABLE_SHIFT]])
				; CHECK-NEXT: ret i8* [[RELTABLE_INTRINSIC]]
				; CHECK: return:
				; CHECK-NEXT: ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				;
				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i8*], [3 x i8*]* @switch.table.string_table, i32 0, i32 %cond
				%switch.load = load i8*, i8** %switch.gep, align 8
				ret i8* %switch.load

				return: ; preds = %entry
				ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				}

				; Relative switch lookup table for strings with holes, where holes are filled with relative offset to default values
				define i8* @string_table_holes(i32 %cond) {
				; CHECK-LABEL: @string_table_holes(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 4
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 [[COND]], 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([4 x i32]* @reltable.string_table_holes to i8*), i32 [[RELTABLE_SHIFT]])
				; CHECK-NEXT: ret i8* [[RELTABLE_INTRINSIC]]
				; CHECK: return:
				; CHECK-NEXT: ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				;
				entry:
				%0 = icmp ult i32 %cond, 4
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [4 x i8*], [4 x i8*]* @switch.table.string_table_holes, i32 0, i32 %cond
				%switch.load = load i8*, i8** %switch.gep, align 8
				ret i8* %switch.load

				return: ; preds = %entry
				ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				}


				; Single value check
				; If there is a lookup table, where each element contains the same value,
				; a relative lookup should not be generated
				define void @single_value(i32 %cond) {
				; CHECK-LABEL: @single_value(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: switch.lookup:
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 [[COND]], 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([3 x i32]* @reltable.single_value to i8*), i32 [[RELTABLE_SHIFT]])
				; CHECK: sw.epilog:
				; CHECK-NEXT: [[STR1:%.*]] = phi i8* [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str.5, i64 0, i64 0), %entry ], [ getelementptr inbounds ([12 x i8], [12 x i8]* @.str.7, i64 0, i64 0), %switch.lookup ]
				; CHECK-NEXT: [[STR2:%.*]] = phi i8* [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str.6, i64 0, i64 0), %entry ], [ [[RELTABLE_INTRINSIC]], [[SWITCH_LOOKUP]] ]
				; CHECK-NEXT: ret void

				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %sw.epilog

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i8*], [3 x i8*]* @switch.table.single_value, i32 0, i32 %cond
				%switch.load = load i8*, i8** %switch.gep, align 8
				br label %sw.epilog

				sw.epilog: ; preds = %switch.lookup, %entry
				%str1.0 = phi i8* [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str.5, i64 0, i64 0), %entry ], [ getelementptr inbounds ([12 x i8], [12 x i8]* @.str.7, i64 0, i64 0), %switch.lookup ]
				%str2.0 = phi i8* [ getelementptr inbounds ([5 x i8], [5 x i8]* @.str.6, i64 0, i64 0), %entry ], [ %switch.load, %switch.lookup ]
				ret void
				}

				; Relative lookup table generated for a user-defined lookup table
				define i8* @user_defined_lookup_table(i32 %cond) {
				; CHECK-LABEL: @user_defined_lookup_table(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = icmp sgt i32 [[COND:%.*]], 3
				; CHECK-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; CHECK: cond.false:
				; CHECK-NEXT: [[IDX_PROM:%.*]] = sext i32 [[COND]] to i64
				; CHECK-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i64 [[IDX_PROM]], 2
				; CHECK-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i64(i8* bitcast ([3 x i32]* @reltable.user_defined_lookup_table to i8*), i64 [[RELTABLE_SHIFT]])
				; CHECK-NEXT: br label %cond.end
				; CHECK: cond.end:
				; CHECK-NEXT: [[COND1:%.*]] = phi i8* [ [[RELTABLE_INTRINSIC]], %cond.false ], [ getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0), %entry ]
				; CHECK-NEXT: ret i8* [[COND1]]
				;
				entry:
				%cmp = icmp sgt i32 %cond, 3
				br i1 %cmp, label %cond.end, label %cond.false

				cond.false: ; preds = %entry
				%idxprom = sext i32 %cond to i64
				%arrayidx = getelementptr inbounds [3 x i8*], [3 x i8*]* @user_defined_lookup_table.table, i64 0, i64 %idxprom
				%0 = load i8*, i8** %arrayidx, align 8, !tbaa !4
				br label %cond.end

				cond.end: ; preds = %entry, %cond.false
				%cond1 = phi i8* [ %0, %cond.false ], [ getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0), %entry ]
				ret i8* %cond1
				}

				!llvm.module.flags = !{!0, !1}
				!0 = !{i32 7, !"PIC Level", i32 2}
				!1 = !{i32 1, !"Code Model", i32 1}
				!4 = !{!"any pointer", !5, i64 0}
				!5 = !{!"omnipotent char", !6, i64 0}
				!6 = !{!"Simple C/C++ TBAA"}
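
The CHECK lines in this test replace each pointer table with a table of `i32` offsets measured from the table's own address, and replace the `getelementptr`/`load` pair with `shl` plus a call to `llvm.load.relative`. As a rough model of that arithmetic only (this is not LLVM code; the byte-array "memory", the addresses, and every function name below are made up for illustration), a minimal Python sketch:

```python
import struct


def build_reltable(table_addr, target_addrs):
    """Encode each target as a signed 32-bit offset from the table's address."""
    return b"".join(struct.pack("<i", t - table_addr) for t in target_addrs)


def load_relative(memory, base, offset):
    """Model of llvm.load.relative: read an i32 at base+offset, add it to base."""
    (rel,) = struct.unpack_from("<i", memory, base + offset)
    return base + rel


def make_image(load_base):
    """Lay out three strings and their relative table at a chosen load address."""
    strings = [b"zero\0", b"one\0", b"two\0"]
    memory = bytearray(load_base + 64)
    addrs = []
    cursor = load_base
    for s in strings:
        memory[cursor:cursor + len(s)] = s
        addrs.append(cursor)
        cursor += len(s)
    table_addr = load_base + 32  # hypothetical placement of the table
    table = build_reltable(table_addr, addrs)
    memory[table_addr:table_addr + len(table)] = table
    return memory, table_addr, addrs, table


def lookup(memory, table_addr, index):
    # Mirrors "shl %cond, 2": each table entry is 4 bytes wide.
    return load_relative(memory, table_addr, index << 2)
```

Because every entry stores `target - table`, the encoded table bytes come out the same whichever load address is passed to `make_image`, which is the property that removes the need for dynamic relocation.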

llvm/test/Transforms/RelLookupTableConverter/switch_relative_lookup_table.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt < %s -rel-lookup-table-converter -S | FileCheck %s --check-prefix=FNOPIC
				; RUN: opt < %s -rel-lookup-table-converter -relocation-model=pic -S | FileCheck %s --check-prefix=FPIC

				; RUN: opt < %s -passes=rel-lookup-table-converter -S | FileCheck %s --check-prefix=FNOPIC
				; RUN: opt < %s -passes=rel-lookup-table-converter -relocation-model=pic -S | FileCheck %s --check-prefix=FPIC
				target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				@.str = private unnamed_addr constant [5 x i8] c"zero\00", align 1
				@.str.1 = private unnamed_addr constant [4 x i8] c"one\00", align 1
				@.str.2 = private unnamed_addr constant [4 x i8] c"two\00", align 1
				@.str.3 = private unnamed_addr constant [8 x i8] c"default\00", align 1

				@switch.table.string_table = private unnamed_addr constant [3 x i8*]
				[
				i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
				i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
				], align 8

				; Switch lookup table
				; FNOPIC: @switch.table.string_table = private unnamed_addr constant [3 x i8*]
				; FNOPIC-SAME: [
				; FNOPIC-SAME: i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0),
				; FNOPIC-SAME: i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i64 0, i64 0),
				; FNOPIC-SAME: i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.2, i64 0, i64 0)
				; FNOPIC-SAME: ], align 8

				; Relative switch lookup table
				; FPIC: @reltable.string_table = private unnamed_addr constant [3 x i32]
				; FPIC-SAME: [
				; FPIC-SAME: i32 trunc (i64 sub (i64 ptrtoint ([5 x i8]* @.str to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32),
				; FPIC-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.1 to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32),
				; FPIC-SAME: i32 trunc (i64 sub (i64 ptrtoint ([4 x i8]* @.str.2 to i64), i64 ptrtoint ([3 x i32]* @reltable.string_table to i64)) to i32)
				; FPIC-SAME: ], align 4

				; ; Relative switch lookup table for strings
				define i8* @string_table(i32 %cond) {
				leonardchan: It looks like this test case isn't much different from `string_table` in `relative_lookup_table.ll`? If so, then this file could be removed.
				gulfem (author): I renamed this test case to no_relative_lookup_table.ll, which checks the cases where a relative lookup table should not be generated, such as non-PIC mode, medium or large code models, and 32-bit architectures.
				; FNOPIC-LABEL: @string_table(
				; FNOPIC-NEXT: entry:
				; FNOPIC-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; FNOPIC-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; FNOPIC: switch.lookup:
				; FNOPIC-NEXT: [[SWITCH_GEP:%.*]] = getelementptr inbounds [3 x i8*], [3 x i8*]* @switch.table.string_table, i32 0, i32 [[COND]]
				; FNOPIC-NEXT: [[SWITCH_LOAD:%.*]] = load i8*, i8** [[SWITCH_GEP]], align 8
				; FNOPIC-NEXT: ret i8* [[SWITCH_LOAD]]
				; FNOPIC: return:
				; FNOPIC-NEXT: ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)

				; FPIC-LABEL: @string_table(
				; FPIC-NEXT: entry:
				; FPIC-NEXT: [[TMP0:%.*]] = icmp ult i32 [[COND:%.*]], 3
				; FPIC-NEXT: br i1 [[TMP0]], label [[SWITCH_LOOKUP:%.*]], label [[RETURN:%.*]]
				; FPIC: switch.lookup:
				; FPIC-NEXT: [[RELTABLE_SHIFT:%.*]] = shl i32 %cond, 2
				; FPIC-NEXT: [[RELTABLE_INTRINSIC:%.*]] = call i8* @llvm.load.relative.i32(i8* bitcast ([3 x i32]* @reltable.string_table to i8*), i32 [[RELTABLE_SHIFT]])
				; FPIC-NEXT: ret i8* [[RELTABLE_INTRINSIC]]
				; FPIC: return:
				; FPIC-NEXT: ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				;
				entry:
				%0 = icmp ult i32 %cond, 3
				br i1 %0, label %switch.lookup, label %return

				switch.lookup: ; preds = %entry
				%switch.gep = getelementptr inbounds [3 x i8*], [3 x i8*]* @switch.table.string_table, i32 0, i32 %cond
				%switch.load = load i8*, i8** %switch.gep, align 8
				ret i8* %switch.load

				return: ; preds = %entry
				ret i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.str.3, i64 0, i64 0)
				}
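
The review comments on these tests list the configurations in which the pass is expected to leave lookup tables untouched: non-PIC mode, medium or large code models, and 32-bit architectures. A minimal Python sketch of such a gate follows. The `TargetConfig` type and the function below are hypothetical stand-ins written for illustration; they are not the actual `shouldBuildRelLookupTables` implementation, and the real check may consider more than these three settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetConfig:
    """Hypothetical summary of the target settings named in the review."""
    is_pic: bool        # e.g. opt -relocation-model=pic
    pointer_bits: int   # 64 for x86_64-unknown-linux-gnu
    code_model: str     # "small", "medium", "large", ...


def should_build_rel_lookup_tables(cfg: TargetConfig) -> bool:
    """Return True only for configurations where conversion is expected."""
    if not cfg.is_pic:
        return False  # the FNOPIC runs keep the original pointer table
    if cfg.pointer_bits != 64:
        return False  # 32-bit architectures are excluded
    if cfg.code_model in ("medium", "large"):
        return False  # a 32-bit offset may not reach the target
    return True
```

Under this model, the `FPIC` runs above correspond to `TargetConfig(True, 64, "small")`, and the `FNOPIC` runs to the same configuration with `is_pic=False`.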

llvm/utils/gn/secondary/llvm/lib/Transforms/Utils/BUILD.gn

sources = [
"LowerSwitch.cpp",		"LowerSwitch.cpp",
"MatrixUtils.cpp",		"MatrixUtils.cpp",
"Mem2Reg.cpp",		"Mem2Reg.cpp",
"MetaRenamer.cpp",		"MetaRenamer.cpp",
"ModuleUtils.cpp",		"ModuleUtils.cpp",
"NameAnonGlobals.cpp",		"NameAnonGlobals.cpp",
"PredicateInfo.cpp",		"PredicateInfo.cpp",
"PromoteMemoryToRegister.cpp",		"PromoteMemoryToRegister.cpp",
		"RelLookupTableConverter.cpp"
		leonardchan: Good that you added this, but I think Nico has a bot that automatically updates these BUILD.gn files so manually updating them may not be necessary.
		thakis: Please don't touch gn files unless you use them. Simple file additions in cmake files are synced automatically (http://github.com/llvmgnsyncbot). You forgot to add the trailing comma and now I had to fix it up manually instead of doing nothing :P
		gulfem (author): Sorry about that, Nico. I can fix it, so you don't need to do anything. Leo actually pointed that out, but I thought that manually changing it wouldn't do any harm. Apparently it did though!
"SSAUpdater.cpp",		"SSAUpdater.cpp",
"SSAUpdaterBulk.cpp",		"SSAUpdaterBulk.cpp",
"SampleProfileLoaderBaseUtil.cpp",		"SampleProfileLoaderBaseUtil.cpp",
"SanitizerStats.cpp",		"SanitizerStats.cpp",
"ScalarEvolutionExpander.cpp",		"ScalarEvolutionExpander.cpp",
"SimplifyCFG.cpp",		"SimplifyCFG.cpp",
"SimplifyIndVar.cpp",		"SimplifyIndVar.cpp",
"SimplifyLibCalls.cpp",		"SimplifyLibCalls.cpp",
