This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/
-
lib/Driver/ToolChains/
-
Driver/
-
ToolChains/
7/14
WebAssembly.cpp
-
test/Driver/
-
Driver/
1/2
wasm-toolchain-lto.c

Differential D70500

[WebAssembly] Enable use of wasm-opt and LTO-enabled system libraries
Needs RevisionPublic

Authored by sunfish on Nov 20 2019, 10:36 AM.

Download Raw Diff

Details

Reviewers

sbc100
dschuff
aheejin
thakis

Commits

rG812828984c10: [WebAssembly] Use wasm-opt and LTO libraries when available.

Summary

When the WASM_OPT environment variable is set, run the wasm-opt tool to optimize LLVM's output. Note that using wasm-opt currently means invalidating all debug info, however we can just tell users this when we tell them about WASM_OPT. This fixes PR43796.

Also, add an "llvm-lto" directory to the sysroot library search paths, so that sysroots can optionally provide LTO-enabled system libraries.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

sunfish created this revision.Nov 20 2019, 10:36 AM

Herald added a project: Restricted Project. · View Herald TranscriptNov 20 2019, 10:36 AM

Herald added subscribers: dexonsmith, steven_wu, hiraditya and 4 others. · View Herald Transcript

dschuff added inline comments.Nov 20 2019, 10:49 AM

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	What would you think about adding a way to pass arguments through to wasm-opt on the command line, like we have for the linker, assembler, etc? Something like `-Wo,-O2` (or `-Ww` or whatever; analogous to `-Wl` and `-Wa`). That seems nicer than an env var, although it doesn't solve the problem of controlling whether to run the optimizer in the first place.

Herald added a subscriber: ormris. · View Herald TranscriptNov 20 2019, 10:49 AM

sunfish marked an inline comment as done.Nov 20 2019, 11:11 AM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	My guess here is that we don't need clang to have an option for passing additional flags -- if people want to do extra special things with wasm-opt on clang's output they can just run wasm-opt directly themselves. Does that sound reasonable?

Kinds seems like two different changes here. But lgtm.

clang/lib/Driver/ToolChains/WebAssembly.cpp
141	Is there any kind of president here? i.e. do any other platforms have support for this kind thing? Seems like good idea I'd just like to be sure we follow existing practices.

sunfish marked an inline comment as done.Nov 20 2019, 12:42 PM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
141	I looked around, but didn't see anything similar.

On I just remember why this is probably a bad idea. llvm bitcode is not designed to be stable, unlike object files, so its probably not a good idea to encourage the distributing of bitcode files in SDKs and such. Unless you stronly disagree maybe a good idea to split this patch in two since the WASM_OPT part is non-controversial and logically separate?

If the SDK is distributed with the compiler and version-locked, it seems like it should be ok.

In D70500#1754032, @dschuff wrote:

If the SDK is distributed with the compiler and version-locked, it seems like it should be ok.

Right, but if a given SDK chooses to do that, it can always at its own -L flags. I'm not sure we want to bake this into the driver.

dschuff added inline comments.Nov 20 2019, 12:56 PM

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	Maybe. But I still don't like the use of an env var for this kind of behavior-effecting (i.e. non-debugging) use case. It's hard enough to get reproducible and hermetic build behavior as it is, I definitely wouldn't want to worry about the environment affecting the output in addition to all the random flags, included files, etc.

In D70500#1754012, @sbc100 wrote:

On I just remember why this is probably a bad idea. llvm bitcode is not designed to be stable, unlike object files, so its probably not a good idea to encourage the distributing of bitcode files in SDKs and such.

That's a good point. I think we can address this by including the LLVM revision in the path --I've now updated the patch to do this.

sunfish marked an inline comment as done.Nov 20 2019, 1:26 PM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	If we did -Wo,-O2 or so, we'd still need to be able to find the wasm-opt program to be able to run it. We could just search for it in PATH, but that's also a little dicey from a hermetic build perspective. I borrowed "WASM_OPT" from cargo-wasi. I'm also not a fan of environment variables in general, but this way does have the advantage that people can set it once, and not have to modify their Makefiles to add new flags. Users can think of it as just being part of -O2 and friends.

dschuff added inline comments.Nov 20 2019, 2:38 PM

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	What's the usual way to locate things like external assemblers? Presumably we could use the same mechanism for wasm-opt?

sunfish marked an inline comment as done.Nov 20 2019, 3:18 PM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	It checks the COMPILER_PATH environment variable and -B command-line flags, which I'm not sure we should use here, but it looks like it does fall back to checking PATH at the end. So, what would you think of just checking PATH for wasm-opt?

dschuff added inline comments.Nov 20 2019, 3:51 PM

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	I suspect we'll end up with -B flags if/when people start building interesting or nontrivial toolchains with clang (or we try to make emscripten more standardish), but I'm fine with leaving that out for now. Checking PATH for wasm-opt seems fine to me to locate the binary. Did you have that in mind as also the way to determine whether or not to run wasm-opt (i.e. run if it's there, don't if it's not)? That seems slightly error-prone in the sense that there would be a silent behavior change if something went wrong (e.g. wasm-opt goes missing) but in a world where most clang users are using wasm-opt then using wasm-opt by default seems reasonable; so that seems fine as a way to start out. I do think we will eventually want some way to modify the behavior of wasm-opt though. For that matter wasm-opt might at some point become able to optimize object files (allowing faster links at the cost of less LTO-optimized binaries; we'd run a reduced set of passes post-link or none at all). In that case there'd also have to be further changes if we want builtin support for that.

aprantl removed a subscriber: aprantl.Nov 20 2019, 4:20 PM

Use PATH instead of WASM_OPT to find wasm-opt.

sunfish marked an inline comment as done.Nov 20 2019, 4:41 PM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
97	Yes, we'd just run wasm-opt if it's in the PATH. See the now updated patch. And yeah, this means if your wasm-opt disappears, you silently won't get as much optimization, which is a little unfortunate, but it is the most convenient for the common use case. And yeah, we can always add more options in the future :-).

LGTM on the approach, just one more question on the wasm-opt flags.

clang/lib/Driver/ToolChains/WebAssembly.cpp
105	This chain is slightly confusing. I assume this `getValue()` captures the number in `-O3`, `-O2`, etc? But why then do we need special-casing for 0 and 4 above? For that matter, we should probably not run wasm-opt at all if the opt-level is 0, right?

sbc100 added inline comments.Nov 20 2019, 5:43 PM

clang/test/Driver/wasm-toolchain-lto.c
7	Include the final path segment in the expectation?

sunfish marked 2 inline comments as done.Nov 20 2019, 6:39 PM

sunfish added inline comments.

clang/lib/Driver/ToolChains/WebAssembly.cpp
105	This isn't a wasm thing; O0 and O4 are special. See also here and here for other code which does similar things. The wasm-opt version here is actually simpler because we don't need to special-case Os or Oz. The `if (Arg *A = Args.getLastArg(options::OPT_O_Group)) {` guards against the case where no -O option is given, but you're right, we shouldn't run wasm-opt if -O0 is given. I'll update the patch.
clang/test/Driver/wasm-toolchain-lto.c
7	It's a git revision, so it'd be constantly changing.

Don't run wasm-opt with -O0.

WRT the LTO directory name, there's theoretically the danger that someone (e.g. emscripten) could be doing a rolling release of the compiler and get invalidated within a major revision. But in that case, 1) they should be rebuilding their libraries on each release anyway, and 2) last time I checked, policy was to make auto-upgrade of bitcode work within a revision. So it's probably not a problem.

This revision is now accepted and ready to land.Nov 21 2019, 4:28 PM

Also if at some point we are able to remove a bunch of the driver logic from emcc and move it here, (e.g. running clang to link instead of lld directly) we'll need to watch out for this.

In D70500#1756011, @dschuff wrote:

WRT the LTO directory name, there's theoretically the danger that someone (e.g. emscripten) could be doing a rolling release of the compiler and get invalidated within a major revision.

The LLVM revision here is the git hash, not the major version, so we should be good even if this happens.

Closed by commit rG812828984c10: [WebAssembly] Use wasm-opt and LTO libraries when available. (authored by • sunfishcode). · Explain WhyNov 22 2019, 11:28 PM

This revision was automatically updated to reflect the committed changes.

Please don't add code to the driver that runs programs off PATH. Nothing else does this. If you need to run an external program, look for it next to the compiler, like we do for gas with -fno-integrated-as, linker, etc. People can use the existing -B flag to make clang look elsewhere if they need that.

It's also somewhat uncommon to run non-LLVM binaries, but I can see the need for doing that at least during some transition period (like we did with the assembler).

This revision is now accepted and ready to land.Nov 23 2019, 4:35 AM

ps: The test fails for me, probably because I don't have wasm-opt on PATH? How does this work on the regular bots? Do they all have emscripten installed?

thakis requested changes to this revision.Nov 23 2019, 4:47 AM

This revision now requires changes to proceed.Nov 23 2019, 4:47 AM

In D70500#1757738, @thakis wrote:

ps: The test fails for me, probably because I don't have wasm-opt on PATH? How does this work on the regular bots? Do they all have emscripten installed?

Looks like it might fail with LLVM_APPEND_VC_REV=NO set.

thakis added inline comments.Nov 23 2019, 5:04 AM

clang/lib/Driver/ToolChains/WebAssembly.cpp
141	One immediate problem of this approach is that if HAVE_VCS_VERSION_INC is not set, then the test fails, but if it is set, `clang --version` reports a current git hash, which is either out of date or requires a pretty big rebuild on every single git commit (i.e. not just after `git pull` but also after local commits). Neither's great. Do you expected that sysroots will ship 3-4 LTO folders, to support ~2 years of clang releases? Or do you expect that a sysroot will always support a single clang/llvm revision? If so, maybe the LLVM version is enough?

The test also break in our internal integrate.
Relying on having programs in path and ignoring the -B and sysroot is definitely a bad idea.

I would suggest reverting this patch and re-landing after doing another round of review and fixing those issues.
@thakis, @sunfish, any objections or thoughts on this?

In D70500#1757735, @thakis wrote:

Please don't add code to the driver that runs programs off PATH. Nothing else does this.

Clang's normal GetProgramPath does do this: https://github.com/llvm/llvm-project/blob/master/clang/lib/Driver/Driver.cpp#L4719

If you need to run an external program, look for it next to the compiler, like we do for gas with -fno-integrated-as, linker, etc. People can use the existing -B flag to make clang look elsewhere if they need that.

Unfortunately, wasm-opt isn't typically installed alongside the compiler. It's also not a typical system tool like an assembler, which is why it didn't seem right to search the -B paths.

The first version of my patch used a dedicated environment variable, WASM_OPT, rather than PATH, but I changed it to PATH after review feedback. If people feel using -B paths, and COMPILER_PATH, are appropriate, we can use GetProgramPath itself, though note that that still does have a PATH fallback.

clang/lib/Driver/ToolChains/WebAssembly.cpp
141	Yes, I think we can switch from the VCS string to the LLVM_VERSION string. The documentation says LLVM guarantees bitcode backwards compatibility between minor versions, so this should be sufficient. I'll post a new patch for that, which should also fix the test when HAVE_VCS_VERSION_INC is false.

I've now posted https://reviews.llvm.org/D70677 which should fix the test failure when LLVM_APPEND_VC_REV=NO is set.

In D70500#1758963, @sunfish wrote:

If people feel using -B paths, and COMPILER_PATH, are appropriate, we can use GetProgramPath itself, though note that that still does have a PATH fallback.

I'm not an expert in driver code and how it should behave, but being consistent with how other tools are found definitely looks much better than special-casing a single tool.
Unless there are reasons why wasm-opt should be special, why not use the mechanism used by all other tools?

In D70500#1759831, @ilya-biryukov wrote:

I'm not an expert in driver code and how it should behave, but being consistent with how other tools are found definitely looks much better than special-casing a single tool.
Unless there are reasons why wasm-opt should be special, why not use the mechanism used by all other tools?

@dschuff also suggested this approach offline, so it sounds like the way to go. I've now submitted https://reviews.llvm.org/D70780 which implements this.

(sorry for the slow response here)

If you need to run an external program, look for it next to the compiler, like we do for gas with -fno-integrated-as, linker, etc. People can use the existing -B flag to make clang look elsewhere if they need that.

Unfortunately, wasm-opt isn't typically installed alongside the compiler. It's also not a typical system tool like an assembler, which is why it didn't seem right to search the -B paths.

The first version of my patch used a dedicated environment variable, WASM_OPT, rather than PATH, but I changed it to PATH after review feedback. If people feel using -B paths, and COMPILER_PATH, are appropriate, we can use GetProgramPath itself, though note that that still does have a PATH fallback.

Here's the scenario I'm worried about: Imagine someone packaging a hermetic toolchain that bundles clang, lld, and so on, but it doesn't bundle wasm-opt. After this patch, the toolchain won't behave hermetically for wasm compilations, since it now does different things based on if wasm-opt is on PATH (which it is if a dev is locally playing with emscripten, say).

It's true that the driver falls back to PATH as a last resort measure for the linker, but, as you say, that's usually next to the compiler binary and the fallback isn't needed, but wasm-opt usually isn't there.

What should be the way forward for that? Options:

Add a driver flag that explicitly sets if PATH should be used as final fallback for all tools (including this one)
- and maybe change the default to false since this behavior now goes form "basically never used" to "usually used, for wasm compilations"
Add a dedicated flag that controls if wasm-opt should be looked for in PATH
...?

(See also http://blog.llvm.org/2019/11/deterministic-builds-with-clang-and-lld.html ctrl-f "environment variables")

(ps: Thanks for the two changes landed so far!)

The first option seems reasonable to me. That is, adding a flag that controls whether the PATH is searched for all tools. Hermetic builds would then pass this flag to prevent PATH from playing any role.

I do find it odd that there is a PATH fallback in the existing code in the first place. I agree that basically no compiler other than the "system" compiler should ever use it (and also even the concept of the "system" compiler really only makes much sense on systems like Linux and BSDs where compiling things for the local system is common). I guess the other option here would be to just require that wasm-opt be in the same directory as clang (or one of the -B directories), which we can arrange in wasi-sdk or wherever.
In other words wasm-opt would still be different in that it would be optional (unlike more traditional tools such as ld or as) but it wouldn't have to be unusual in how it's located.

Are there plans to offer a way to disable this behavior (or have it optional in the first place)?
We'd like to run some custom processing between wasm-ld and wasm-opt which can't happen after the latter due to some of its one-way destructive optimizations (i.e. memory-packing or simplify-globals passes).
The only way now is to tell our users to place wasm-opt somewhere where clang can't find it. Or instead of using one clang super-command to manually call -cc1 and wasm-ld separately which is disappointing.

Also, is it even common to place wasm-opt next to the clang executable? Who is this for? Is this documented?
I only stumbled into this by pure luck and was very confused for a while until I ran with -v.

In D70500#1998994, @bernhard wrote:

Are there plans to offer a way to disable this behavior (or have it optional in the first place)?
We'd like to run some custom processing between wasm-ld and wasm-opt which can't happen after the latter due to some of its one-way destructive optimizations (i.e. memory-packing or simplify-globals passes).
The only way now is to tell our users to place wasm-opt somewhere where clang can't find it. Or instead of using one clang super-command to manually call -cc1 and wasm-ld separately which is disappointing.

Also, is it even common to place wasm-opt next to the clang executable? Who is this for? Is this documented?

It's for users who want smaller wasm binaries. It's not currently documented, though yes, it would be nice to document it.

That said, details about how LLVM lays out memory or uses globals are intentionally not documented. Just as we don't make any guarantees about exactly which optimizations we do at -O1 vs -O2, we're don't make any guarantees about which optimizations are done in clang vs wasm-ld vs wasm-opt. Clang and wasm-ld are free to do anything wasm-opt does. Could you describe what you want to do in more detail? It may be possible to find alternative approaches, or to design a feature for it with a clear and documentable scope.

It's for users who want smaller wasm binaries. It's not currently documented, though yes, it would be nice to document it.

But how would a user even end up with wasm-opt in the same directory of clang binaries?

Clang and wasm-ld are free to do anything wasm-opt does. Could you describe what you want to do in more detail? It may be possible to find alternative approaches, or to design a feature for it with a clear and documentable scope.

I'm generating additional data segments which while doing so increases the initial memory pages count and shifts any globals that point to __heap_base and global[0] (stack ptr init).

Due to memory-packing the end of data becomes ambiguous (-export=__data_end would mitigate that but it's nice to be compatible with any un-opt'ed wasm file), and due to simplify-globals with remove-unused-module-elements the global[0] can get copied/moved directly into the code section.
I am aware that this treads very much in undocumented assumptions, it is still experimental but it works. This hidden execution of wasm-opt is mostly making documentation on my end a bit harder.

If wasm-ld were to do these passes on its own I think there would at least be options passable with -Xlinker to make it not do that.

In D70500#1999268, @bernhard wrote:

It's for users who want smaller wasm binaries. It's not currently documented, though yes, it would be nice to document it.

But how would a user even end up with wasm-opt in the same directory of clang binaries?

wasi-sdk may package wasm-opt that way in the future. But, wasm-opt can also be found in the PATH.

Clang and wasm-ld are free to do anything wasm-opt does. Could you describe what you want to do in more detail? It may be possible to find alternative approaches, or to design a feature for it with a clear and documentable scope.

I'm generating additional data segments which while doing so increases the initial memory pages count and shifts any globals that point to __heap_base and global[0] (stack ptr init).

Would it work to increase the memory size, and then put your data in the new space this creates at the end of memory? This would avoid the need to shift anything around or depend on anything about the layout. It isn't yet documented anywhere, but the pattern of always allocating new memory at the end of the address space (as memory.grow does) is common enough and important enough that we probably should document it.

Due to memory-packing the end of data becomes ambiguous (-export=__data_end would mitigate that but it's nice to be compatible with any un-opt'ed wasm file), and due to simplify-globals with remove-unused-module-elements the global[0] can get copied/moved directly into the code section.
I am aware that this treads very much in undocumented assumptions, it is still experimental but it works. This hidden execution of wasm-opt is mostly making documentation on my end a bit harder.

__data_end and __heap_base aren't things you can move around once the linker has defined them. Their values can be baked in elsewhere in the wasm module, so it's not safe in general to insert new memory in between them and shift things around. And it happens that wasi-libc has some code that we hope to enable soon which allocates malloc memory using __heap_base which would likely break if things get shifted around after linking.

Would it work to increase the memory size, and then put your data in the new space this creates at the end of memory?

__data_end and __heap_base aren't things you can move around once the linker has defined them. Their values can be baked in elsewhere in the wasm module, so it's not safe in general to insert new memory in between them and shift things around.

My plan was to embed arbitrary sized files directly into the data memory after the wasm file was created. But I think you're right, it ends up only working with simple wasm programs, more complex code crashes somewhere during __wasm_call_ctors with the approach I took.

I guess custom sections would be the official route but I was hoping to avoid having to pass the entire data through JavaScript before it reaches WASM land.

Also I'm very sorry to have hijacked this patch discussion with something very different. If you have some more hints for me how I could accomplish this I'd very gladly hear them, but if this is the wrong place we should probably leave it at that. Thank you for your time!

We can definitely still talk about what you are trying to do, and it would probably be useful to have more folks involved. Opening an issue on https://github.com/WebAssembly/tool-conventions/ might be useful since it involves the conventions that LLVM and clang use. If it's specific to the web, then https://groups.google.com/forum/#!forum/emscripten-discuss could be another place (even if you don't plan on using emscripten's JS code).

I had an application crash with optimizations enabled, so I wanted to keep the debug info but the automatic wasm-opt kept removing it and I lost another 30 minutes to this, so I'm back :-)

Can we get clang not to automatically call wasm-opt if it is called without -Xlinker -strip-all, or when called with -g ?
Or at the very least, could it forward the -g along the -O* to wasm-opt?

In D70500#1999934, @bernhard wrote:

Would it work to increase the memory size, and then put your data in the new space this creates at the end of memory?

__data_end and __heap_base aren't things you can move around once the linker has defined them. Their values can be baked in elsewhere in the wasm module, so it's not safe in general to insert new memory in between them and shift things around.

My plan was to embed arbitrary sized files directly into the data memory after the wasm file was created. But I think you're right, it ends up only working with simple wasm programs, more complex code crashes somewhere during __wasm_call_ctors with the approach I took.

I think the way to embed data blobs in your final binary is to pass them to the linker as inputs. There are various techniques listed here: https://csl.name/post/embedding-binary-data/. I imagine not all them them will work with today's toolchain but we should strive to make those approaches work where possible.

I guess custom sections would be the official route but I was hoping to avoid having to pass the entire data through JavaScript before it reaches WASM land.

Also I'm very sorry to have hijacked this patch discussion with something very different. If you have some more hints for me how I could accomplish this I'd very gladly hear them, but if this is the wrong place we should probably leave it at that. Thank you for your time!

Revision Contents

Path

Size

clang/

lib/

Driver/

ToolChains/

WebAssembly.cpp

36 lines

test/

Driver/

wasm-toolchain-lto.c

6 lines

Diff 230750

clang/lib/Driver/ToolChains/WebAssembly.cpp

//===--- WebAssembly.cpp - WebAssembly ToolChain Implementation -- C++ --===//		//===--- WebAssembly.cpp - WebAssembly ToolChain Implementation -- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "WebAssembly.h"		#include "WebAssembly.h"
#include "CommonArgs.h"		#include "CommonArgs.h"
		#include "clang/Basic/Version.h"
#include "clang/Config/config.h"		#include "clang/Config/config.h"
#include "clang/Driver/Compilation.h"		#include "clang/Driver/Compilation.h"
#include "clang/Driver/Driver.h"		#include "clang/Driver/Driver.h"
#include "clang/Driver/DriverDiagnostic.h"		#include "clang/Driver/DriverDiagnostic.h"
#include "clang/Driver/Options.h"		#include "clang/Driver/Options.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include "llvm/Option/ArgList.h"		#include "llvm/Option/ArgList.h"
▲ Show 20 Lines • Show All 66 Lines • ▼ Show 20 Lines	if (!Args.hasArg(options::OPT_nostdlib, options::OPT_nodefaultlibs)) {
CmdArgs.push_back("-lc");		CmdArgs.push_back("-lc");
AddRunTimeLibs(ToolChain, ToolChain.getDriver(), CmdArgs, Args);		AddRunTimeLibs(ToolChain, ToolChain.getDriver(), CmdArgs, Args);
}		}

CmdArgs.push_back("-o");		CmdArgs.push_back("-o");
CmdArgs.push_back(Output.getFilename());		CmdArgs.push_back(Output.getFilename());

C.addCommand(std::make_unique<Command>(JA, *this, Linker, CmdArgs, Inputs));		C.addCommand(std::make_unique<Command>(JA, *this, Linker, CmdArgs, Inputs));

		// When optimizing, if wasm-opt is in the PATH, run wasm-opt.
		if (Arg *A = Args.getLastArg(options::OPT_O_Group)) {
		if (llvm::ErrorOr<std::string> WasmOptPath =
		dschuffUnsubmitted Not Done Reply Inline Actions What would you think about adding a way to pass arguments through to wasm-opt on the command line, like we have for the linker, assembler, etc? Something like `-Wo,-O2` (or `-Ww` or whatever; analogous to `-Wl` and `-Wa`). That seems nicer than an env var, although it doesn't solve the problem of controlling whether to run the optimizer in the first place. dschuff: What would you think about adding a way to pass arguments through to wasm-opt on the command…
		sunfishAuthorUnsubmitted Done Reply Inline Actions My guess here is that we don't need clang to have an option for passing additional flags -- if people want to do extra special things with wasm-opt on clang's output they can just run wasm-opt directly themselves. Does that sound reasonable? sunfish: My guess here is that we don't need clang to have an option for passing additional flags -- if…
		dschuffUnsubmitted Not Done Reply Inline Actions Maybe. But I still don't like the use of an env var for this kind of behavior-effecting (i.e. non-debugging) use case. It's hard enough to get reproducible and hermetic build behavior as it is, I definitely wouldn't want to worry about the environment affecting the output in addition to all the random flags, included files, etc. dschuff: Maybe. But I still don't like the use of an env var for this kind of behavior-effecting (i.e.
		sunfishAuthorUnsubmitted Done Reply Inline Actions If we did -Wo,-O2 or so, we'd still need to be able to find the wasm-opt program to be able to run it. We could just search for it in PATH, but that's also a little dicey from a hermetic build perspective. I borrowed "WASM_OPT" from cargo-wasi. I'm also not a fan of environment variables in general, but this way does have the advantage that people can set it once, and not have to modify their Makefiles to add new flags. Users can think of it as just being part of -O2 and friends. sunfish: If we did -Wo,-O2 or so, we'd still need to be able to find the wasm-opt program to be able to…
		dschuffUnsubmitted Not Done Reply Inline Actions What's the usual way to locate things like external assemblers? Presumably we could use the same mechanism for wasm-opt? dschuff: What's the usual way to locate things like external assemblers? Presumably we could use the…
		sunfishAuthorUnsubmitted Done Reply Inline Actions It checks the COMPILER_PATH environment variable and -B command-line flags, which I'm not sure we should use here, but it looks like it does fall back to checking PATH at the end. So, what would you think of just checking PATH for wasm-opt? sunfish: It checks the COMPILER_PATH environment variable and -B command-line flags, which I'm not sure…
		dschuffUnsubmitted Not Done Reply Inline Actions I suspect we'll end up with -B flags if/when people start building interesting or nontrivial toolchains with clang (or we try to make emscripten more standardish), but I'm fine with leaving that out for now. Checking PATH for wasm-opt seems fine to me to locate the binary. Did you have that in mind as also the way to determine whether or not to run wasm-opt (i.e. run if it's there, don't if it's not)? That seems slightly error-prone in the sense that there would be a silent behavior change if something went wrong (e.g. wasm-opt goes missing) but in a world where most clang users are using wasm-opt then using wasm-opt by default seems reasonable; so that seems fine as a way to start out. I do think we will eventually want some way to modify the behavior of wasm-opt though. For that matter wasm-opt might at some point become able to optimize object files (allowing faster links at the cost of less LTO-optimized binaries; we'd run a reduced set of passes post-link or none at all). In that case there'd also have to be further changes if we want builtin support for that. dschuff: I suspect we'll end up with -B flags if/when people start building interesting or nontrivial…
		sunfishAuthorUnsubmitted Done Reply Inline Actions Yes, we'd just run wasm-opt if it's in the PATH. See the now updated patch. And yeah, this means if your wasm-opt disappears, you silently won't get as much optimization, which is a little unfortunate, but it is the most convenient for the common use case. And yeah, we can always add more options in the future :-). sunfish: Yes, we'd just run wasm-opt if it's in the PATH. See the now updated patch. And yeah, this…
		llvm::sys::findProgramByName("wasm-opt")) {
		StringRef OOpt = "s";
		if (A->getOption().matches(options::OPT_O4) \|\|
		A->getOption().matches(options::OPT_Ofast))
		OOpt = "4";
		else if (A->getOption().matches(options::OPT_O0))
		OOpt = "0";
		else if (A->getOption().matches(options::OPT_O))
		dschuffUnsubmitted Not Done Reply Inline Actions This chain is slightly confusing. I assume this `getValue()` captures the number in `-O3`, `-O2`, etc? But why then do we need special-casing for 0 and 4 above? For that matter, we should probably not run wasm-opt at all if the opt-level is 0, right? dschuff: This chain is slightly confusing. I assume this `getValue()` captures the number in `-O3`, `…
		sunfishAuthorUnsubmitted Done Reply Inline Actions This isn't a wasm thing; O0 and O4 are special. See also here and here for other code which does similar things. The wasm-opt version here is actually simpler because we don't need to special-case Os or Oz. The `if (Arg A = Args.getLastArg(options::OPT_O_Group)) {` guards against the case where no -O option is given, but you're right, we shouldn't run wasm-opt if -O0 is given. I'll update the patch. sunfish:* This isn't a wasm thing; O0 and O4 [are special](https://github.com/llvm/llvm…
		OOpt = A->getValue();

		if (OOpt != "0") {
		const char WasmOpt = Args.MakeArgString(WasmOptPath);
		ArgStringList CmdArgs;
		CmdArgs.push_back(Output.getFilename());
		CmdArgs.push_back(Args.MakeArgString(llvm::Twine("-O") + OOpt));
		CmdArgs.push_back("-o");
		CmdArgs.push_back(Output.getFilename());
		C.addCommand(std::make_unique<Command>(JA, *this, WasmOpt, CmdArgs, Inputs));
		}
		}
		}
}		}

WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,		WebAssembly::WebAssembly(const Driver &D, const llvm::Triple &Triple,
const llvm::opt::ArgList &Args)		const llvm::opt::ArgList &Args)
: ToolChain(D, Triple, Args) {		: ToolChain(D, Triple, Args) {

assert(Triple.isArch32Bit() != Triple.isArch64Bit());		assert(Triple.isArch32Bit() != Triple.isArch64Bit());

getProgramPaths().push_back(getDriver().getInstalledDir());		getProgramPaths().push_back(getDriver().getInstalledDir());

if (getTriple().getOS() == llvm::Triple::UnknownOS) {		if (getTriple().getOS() == llvm::Triple::UnknownOS) {
// Theoretically an "unknown" OS should mean no standard libraries, however		// Theoretically an "unknown" OS should mean no standard libraries, however
// it could also mean that a custom set of libraries is in use, so just add		// it could also mean that a custom set of libraries is in use, so just add
// /lib to the search path. Disable multiarch in this case, to discourage		// /lib to the search path. Disable multiarch in this case, to discourage
// paths containing "unknown" from acquiring meanings.		// paths containing "unknown" from acquiring meanings.
getFilePaths().push_back(getDriver().SysRoot + "/lib");		getFilePaths().push_back(getDriver().SysRoot + "/lib");
} else {		} else {
const std::string MultiarchTriple =		const std::string MultiarchTriple =
getMultiarchTriple(getDriver(), Triple, getDriver().SysRoot);		getMultiarchTriple(getDriver(), Triple, getDriver().SysRoot);
		if (D.isUsingLTO()) {
		auto LLVMRevision = getLLVMRevision();
		if (!LLVMRevision.empty()) {
		// For LTO, enable use of lto-enabled sysroot libraries too, if available.
		sbc100Unsubmitted Not Done Reply Inline Actions Is there any kind of president here? i.e. do any other platforms have support for this kind thing? Seems like good idea I'd just like to be sure we follow existing practices. sbc100: Is there any kind of president here? i.e. do any other platforms have support for this kind…
		sunfishAuthorUnsubmitted Done Reply Inline Actions I looked around, but didn't see anything similar. sunfish: I looked around, but didn't see anything similar.
		thakisUnsubmitted Not Done Reply Inline Actions One immediate problem of this approach is that if HAVE_VCS_VERSION_INC is not set, then the test fails, but if it is set, `clang --version` reports a current git hash, which is either out of date or requires a pretty big rebuild on every single git commit (i.e. not just after `git pull` but also after local commits). Neither's great. Do you expected that sysroots will ship 3-4 LTO folders, to support ~2 years of clang releases? Or do you expect that a sysroot will always support a single clang/llvm revision? If so, maybe the LLVM version is enough? thakis: One immediate problem of this approach is that if HAVE_VCS_VERSION_INC is not set, then the…
		sunfishAuthorUnsubmitted Done Reply Inline Actions Yes, I think we can switch from the VCS string to the LLVM_VERSION string. The documentation says LLVM guarantees bitcode backwards compatibility between minor versions, so this should be sufficient. I'll post a new patch for that, which should also fix the test when HAVE_VCS_VERSION_INC is false. sunfish: Yes, I think we can switch from the VCS string to the LLVM_VERSION string. The documentation…
		// Note that the directory is keyed to the LLVM revision, as LLVM's
		// bitcode format is not stable.
		getFilePaths().push_back(getDriver().SysRoot + "/lib/" + MultiarchTriple +
		"/llvm-lto/" + LLVMRevision);
		}
		}
getFilePaths().push_back(getDriver().SysRoot + "/lib/" + MultiarchTriple);		getFilePaths().push_back(getDriver().SysRoot + "/lib/" + MultiarchTriple);
}		}
}		}

bool WebAssembly::IsMathErrnoDefault() const { return false; }		bool WebAssembly::IsMathErrnoDefault() const { return false; }

bool WebAssembly::IsObjCNonFragileABIDefault() const { return true; }		bool WebAssembly::IsObjCNonFragileABIDefault() const { return true; }

▲ Show 20 Lines • Show All 178 Lines • Show Last 20 Lines

clang/test/Driver/wasm-toolchain-lto.c

This file was added.

				// A basic C link command-line with optimization with known OS and LTO enabled.

				// RUN: %clang -### -O2 -flto -no-canonical-prefixes -target wasm32-wasi --sysroot=/foo %s 2>&1 \
				// RUN: \| FileCheck -check-prefix=LINK_OPT_KNOWN %s
				// LINK_OPT_KNOWN: clang{{.}}" "-cc1" {{.}} "-o" "[[temp:[^"]*]]"
				// LINK_OPT_KNOWN: wasm-ld{{.*}}" "-L/foo/lib/wasm32-wasi/llvm-lto/
				sbc100Unsubmitted Not Done Reply Inline Actions Include the final path segment in the expectation? sbc100: Include the final path segment in the expectation?
				sunfishAuthorUnsubmitted Done Reply Inline Actions It's a git revision, so it'd be constantly changing. sunfish: It's a git revision, so it'd be constantly changing.

This is an archive of the discontinued LLVM Phabricator instance.

[WebAssembly] Enable use of wasm-opt and LTO-enabled system librariesNeeds RevisionPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 230750

clang/lib/Driver/ToolChains/WebAssembly.cpp

clang/test/Driver/wasm-toolchain-lto.c

[WebAssembly] Enable use of wasm-opt and LTO-enabled system libraries
Needs RevisionPublic