Some targets may not want to duplicate BB when threading jumps or use different threshold. Adding threshold parameter in that pass's ctor and initializer allows customization on BB duplication.
Diff Detail
Event Timeline
lib/Transforms/Scalar/JumpThreading.cpp | ||
---|---|---|
50 | I have no opinion about the functionality of this patch, but how about giving a name to that magic '6'...like: static unsigned BBDuplicateThresholdDefault = 6; Then you can use that constant as the default parameter assignment, and you don't have to introduce yet another magic number (-1) and cast signed to unsigned. |
PING again
lib/Transforms/Scalar/JumpThreading.cpp | ||
---|---|---|
50 | '-1' here is used to tell whether a value other that default threshold is specified or not. However, that default threshold would be changed through command line option. Adding a name for that magic '6' and using it as default parameter does work correctly when that default threshold is changed through command option but no value is specified through pass creator. |
The functionality of the patch looks okay. Please add a comment that says that non-negative values override the internal default.
If this is supposed to allow for target customization, should the number come from TTI? What is the underlying hardware constraint(s) you're trying to model? Also, a test case would be nice.
I think that he is trying to disable code duplication because GPU languages may contain barriers that must not be duplicated.
I think that he is trying to disable code duplication because GPU languages may contain barriers that must not be duplicated.
If that's the motivation, then I wonder why our "noduplicate" function attribute is not already sufficient.
Hi Hal
Yeah, "noduplicate" could prevent duplicating of barrier calls but that
patch wants to address the potential issue on processors with divergent
control flow, commonly found in GPUs, e.g. AMD/NVIDIA ones. The
scenario is that, if BB is duplicated to exploit more jump threading,
targets with divergent CF may execute more instructions if the
condition is a divergent one.
For updating that threshold from TTI, yeah, if we are interested in
that case. I could come another patch considering both TTI and
user-specified threshold.
Yours
- Michael
- Original Message -----
From: "Michael Liao" <michael.liao@intel.com>
To: "michael liao" <michael.liao@intel.com>, nrotem@apple.com, hfinkel@anl.gov
Cc: spatel@rotateright.com, llvm-commits@cs.uiuc.edu
Sent: Monday, September 29, 2014 6:34:36 PM
Subject: Re: [PATCH] Allow BB duplication threshold to be adjusted through JumpThreading's ctorHi Hal
Yeah, "noduplicate" could prevent duplicating of barrier calls but
that
patch wants to address the potential issue on processors with
divergent
control flow, commonly found in GPUs, e.g. AMD/NVIDIA ones. The
scenario is that, if BB is duplicated to exploit more jump threading,
targets with divergent CF may execute more instructions if the
condition is a divergent one.For updating that threshold from TTI, yeah, if we are interested in
that case. I could come another patch considering both TTI and
user-specified threshold.
I suppose that I don't understand what you mean by "if we are interested." Generally speaking, ctor parameters are useful only for clients who are not using the standard optimization pipeline, and we'd like the standard optimization pipeline to generally work well for a wide range of targets. Thus, a TTI interface is preferred.
From a cost modeling perspective, how can you tell whether the instruction duplication will be worthwhile. Can this be something like 2*(instruction costs) <= (branch cost)?
Thanks again,
Hal
Yours
- Michael
I have no opinion about the functionality of this patch, but how about giving a name to that magic '6'...like:
Then you can use that constant as the default parameter assignment, and you don't have to introduce yet another magic number (-1) and cast signed to unsigned.