Here is the IR output for a small example, in the _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__zeroEv function: https://godbolt.org/z/174Pav
Feb 18 2021
Hmm, I thought mem2reg was supposed to run even at O0. Turns out that's not the case, which explains why the alloca [3 x i64]* was not removed.
Feb 7 2021
Makes sense to me, since the offsets now match the symbol table checked earlier in the test. But I'm not really familiar with symbol processing, and I don't have a local Wasm toolchain to test with.
Oct 7 2020
Hmm, looks like this was already fixed by e5158b52730d323bb8cd2cba6dc6c89b90cba452. I guess I'll just commit the test then?
Oct 5 2020
Sep 28 2020
Sep 26 2020
Fix optional regex
Sep 25 2020
Sep 21 2020
Landed, mind looking at https://reviews.llvm.org/D87258#2267267 when you get the time?
Check relocation output
Is this essentially a fix for the work that was started in https://reviews.llvm.org/D79462 ?
Sep 10 2020
So is the idea that the table indexes would be reserved also by the linker?
Right now wasm-ld completely ignores the table elements in the object files and generates contiguous table entries when it finds TABLE_INDEX relocations.
Ah, so under LTO, the output from WasmObjectWriter is linked by lld with libc, etc., and lld writes the final output? I was hoping to avoid changing the object format to explicitly assign table element indices, since that would involve amending the specification, by just emitting the table elements in the correct order. Would it be sufficient to have WasmObjectWriter::recordRelocation also ensure that TABLE_INDEX relocations are inserted into CodeRelocations in the correct assigned order? Right now I only push them to the front of CodeRelocations to avoid conflicts when assigning indices with subsequent elements.
Switch to absolute_symbol metadata, which is undroppable, instead of wasm.index
Sep 9 2020
Fix undefined arithmetic range ops in LowerTypeTests, split WebAssembly relative symbol bugfix into D87407
Sep 8 2020
Currently, I am seeing some false positive CFI failures that only occur with WebAssembly and not native, so I still need to look into what's causing that.
One thing that's happened since you did all of this originally is multiple-table support; the reference-types proposal includes multiple tables, has been standardized, and will soon be (or is already?) available in all the browsers. IIRC you originally did both a version with and without multiple-tables, but I forget how much of that was in s2wasm vs in LLVM. IMO we should just make CFI require multiple tables, and then we can dedicate as many tables for whatever purposes we like.
I assume that we'll still need the core functionality of this CL, which is having the table layout be determined at IPO time rather than being assigned by the objectWriter/linker?
Can you remind me of what the state was for multiple-table support when you originally wrote all this?
Sep 7 2020
Fix Twine issue
Jul 10 2020
The original intention was to have tests run with both constraint managers, unless Z3 was required, in which case it would be Z3 only, but AFAIK that testing infrastructure was never set up. This was before Z3 got merged into LLVM proper, so the additional USE_Z3_SOLVER was to prevent bots from failing if Z3 wasn't installed. Given that all of this should be unused, if there's interest in setting up test bots for the SMT constraint manager, it should be fine to change any of this as needed.
May 21 2020
Thanks for mentioning it; yeah, I think restrict support would be useful for functions that do access arguments, but it wouldn't cover aliasing in the second case between external functions with attributes that don't access arguments.
Sure, I will add some comments and push. Thanks!
Thanks for the feedback!
May 20 2020
Drop ineffective loop aliasing check
The idea does not seem right. If a call can only write and not read, it cannot necessarily be moved.
Your test2() example looks incorrect. If foo2b writes to str and foo2c reads from it and writes to it, then foo2c should read the value written inside the loop, which can be overwritten on different iterations.
- foo2b writes value 2 to str.
- foo2c reads value from str, increments by 1 and writes back.
- loop runs 3 times.
With foo2b inside the loop, final value in str is 3.
With foo2b outside the loop, final value in str is 5.
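The arithmetic in this counterexample can be checked with a minimal sketch. The bodies of foo2b and foo2c below are assumptions that follow the description in the list above (the real test functions are external declarations), and `str` is modeled as a plain `int` for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-ins for the calls discussed above.
// foo2b only writes: it stores the value 2 through the pointer.
void foo2b(int *str) { *str = 2; }

// foo2c reads the current value, increments it by 1, and writes it back.
void foo2c(int *str) { *str = *str + 1; }

// Original form: foo2b is called on every iteration, so each iteration
// resets the value to 2 before foo2c increments it.
int foo2b_inside_loop() {
  int str = 0;
  for (int i = 0; i < 3; ++i) {
    foo2b(&str);
    foo2c(&str);
  }
  return str;
}

// Transformed form: foo2b is hoisted out of the loop, so the three
// increments from foo2c accumulate on top of the single write.
int foo2b_hoisted() {
  int str = 0;
  foo2b(&str);
  for (int i = 0; i < 3; ++i)
    foo2c(&str);
  return str;
}
```

The two functions return different values (3 and 5), so hoisting a write-only call is not safe when a later call in the loop both reads and writes the same memory.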
Check aliasing on arguments, update tests
I can trigger the runtime crash when comparing the _ZN13LaplaceSolver6SolverILi3EE22assemble_linear_systemERNS1_12LinearSystemE function from the SPEC2006 benchmark 447.dealII, but unfortunately even with llvm-extract, I haven't been able to generate a simple testcase for this.
I'm not entirely convinced that it's a race, but a separate file seems fine. Have you tested with e.g. LLVM_OPTIMIZED_TABLEGEN? It might be safer to use LLVM_MAIN_SRC_DIR instead of CMAKE_SOURCE_DIR.
May 19 2020
I'm not too familiar with this part of CMake, but this mechanism is how CMake themselves implement C++ compilation and testing: https://github.com/Kitware/CMake/blob/master/Modules/CheckCXXSourceRuns.cmake . In fact, this code does include CheckCXXSourceRuns, but seems to reimplement most of the internals. Perhaps @esteffin would know more?
May 17 2020
ping, any feedback?
Apr 24 2020
In order to reload the gold plugin, I'm modifying the Clang driver to pass in our own path as a separate argument, which is the most generic approach. Another method would be to use e.g. dladdr() to grab our own path from one of our exported functions, but that method appears to be a glibc extension which isn't cross-platform.
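For comparison, the dladdr() alternative mentioned above might look roughly like the sketch below. It only applies on platforms that provide dladdr() in `<dlfcn.h>`; the names `plugin_anchor` and `own_module_path` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical anchor symbol; any function defined in this module would do.
extern "C" void plugin_anchor() {}

// Returns the path of the module (shared object or executable) containing
// plugin_anchor, or an empty string if the lookup fails.
std::string own_module_path() {
  Dl_info info;
  // dladdr() returns 0 on failure; dli_fname holds the containing file's path.
  if (dladdr(reinterpret_cast<void *>(&plugin_anchor), &info) == 0 ||
      info.dli_fname == nullptr)
    return std::string();
  return std::string(info.dli_fname);
}
```

On older glibc versions this needs to be linked with `-ldl`. Passing the path explicitly from the driver avoids depending on this call at all, which is why the patch takes that route.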
Thanks, seems to be working now for statically-built passes on a default LLVM build without LLVM_LINK_LLVM_DYLIB.
Support statically-built passes
Apr 19 2020
Apr 16 2020
Apr 15 2020
Apr 14 2020
Here's the updated patch, though it still only works when LLVM is built with LLVM_LINK_LLVM_DYLIB=ON. Otherwise, with or without the LLVM_EXPORTED_SYMBOL_FILE definition, I get the following error with a static build:
Apr 13 2020
It looks like this overlaps with https://reviews.llvm.org/D76866 ; maybe wait for that to merge, then we can handle the gold-specific changes as a followup.
(Probably you need export_executable_symbols_for_plugins to fix the missing symbol issues.)
Alright, is it feasible to add small tests or provide instructions on how you would load an example pass?
Apr 8 2020
Originally, I tried forwarding the -Xclang -load arguments, but couldn't access the options::OPT_plugin arguments from the argument list. I'm not familiar with tablegen and argument processing, but there was some issue with the group not being available, which caused the arguments to be skipped when filtering the argument list. I ended up reusing the OPT_fplugin and OPT_fpass_plugin arguments instead. As far as I can tell, OPT_fplugin seems to work for both Clang and LLVM plugins, but I haven't used the former.
Sure. I've written some local optimizations in a loadable pass that uses one of the LTO extension points, EP_FullLinkTimeOptimizationEarly, but it seems that there's no way to actually load out-of-tree LTO passes into gold. This patch fixes that by modifying the gold plugin to support two additional arguments for loading external passes with the old/new pass manager, though I've only tested it with my local legacy pass.
Apr 7 2020
Mar 31 2020
Some people don't use -DLLVM_USE_LINKER=lld, but use -DLLVM_ENABLE_LLD=On.
Mar 30 2020
Update tests with update_test_checks.py, fix bug
Update tests with update_test_checks.py
I'm unsure about the scope of this. It seems to match a particular pattern, and it is unclear whether this is the right place to do so. Have you considered doing this as part of AAPrivatizablePtr (or a new AbstractAttribute) in the Attributor?
Mar 28 2020
Add more tests, fix bugs
I'm not sure yet if I'll need to support ptrtoint; I haven't rebuilt my benchmarks with the parent changes to see if optimizations are being inhibited. But I am more inclined to avoid it if possible since I'll need to refactor and call the code to infer pointer capture.
Remove unneeded store code
Hmm, yeah, undef is probably too strong. Ok, how about I replace the argument with a fresh alloca? It should still permit load/store optimizations on the original alloca, while still providing some alloca that isn't accessed.
Create fresh alloca instead of undefvalue
Fix load type and store
Thanks for the feedback, I've sent you an email.
Mar 27 2020
I'm not quite sure what the intended semantics for inaccessiblememonly and readnone are here. If a pointer operand is converted to an integer and passed as an argument to a function call, when can that pointer be optimized out? Does that function need to have either attribute, or does it not matter that an integer argument is cast from a pointer? This patch currently assumes that the attribute must still be present on the function (since it can't be set on a non-pointer argument).
Fix test failures
Sure, I'm working on an instrumentation pass that inserts calls which inhibit optimization, so I'm trying to work around the issue using function attributes, and I need to look into memory-to-register promotion next.
Revise based on feedback