- User Since
- Jun 18 2014, 12:59 AM (173 w, 6 d)
Thu, Oct 12
Wed, Oct 11
Tue, Oct 10
While this shouldn't be an issue (legalization should only look at one instruction), there's some out of tree code that looks at other instructions as well, and this is breaking the legalization for that instruction.
Mon, Oct 9
Thanks Amara, LGTM modulo one tiny nitpick which you should feel free to disagree with.
Thu, Oct 5
Not really sure how this can be tested since I can only trigger this when reusing the pass in (julia) JIT.
Tue, Oct 3
- updated to ToT; adapting the G_OR narrowing support added recently by Quentin.
- slightly improve test arm64-fallback.ll by working around not being able to legalize non-power-of-2-sized G_IMPLICIT_DEFs yet.
Mon, Oct 2
Sat, Sep 30
Fri, Sep 29
Use unordered_map instead of (ordered) map.
- Rebased to ToT.
- Addressed all outstanding review comments.
- Used the test-suite on AArch64 to make sure there are no correctness regressions, both in fallback mode and in assert-when-not-legalizable mode.
- Measured compile time impact of this change: it's below the noise level I see on gathering CTMark compile time numbers on my system.
Mon, Sep 25
Sep 12 2017
Sep 11 2017
Still LGTM; please commit.
I thought that some indentations looked a bit strange, so I'd just still check that clang-format formats your changes the same way.
Sep 7 2017
Aug 31 2017
I have felt before that LNT's connection and interaction with the database is too obscure in the code.
Therefore, I like this proposal, even if it means passing around session objects more.
Aug 30 2017
Aug 17 2017
Aug 16 2017
Aug 11 2017
Hi Brian, Hal,
Aug 10 2017
Great to see you taking the effort to make the test-suite more relevant!
I think it'd be helpful if you could also add a little explanation of how adding this code makes the test-suite more relevant.
I'm assuming that this kind of code is common in some domain (HPC?) and that there is no other benchmark already in the test-suite with similar enough properties?
Aug 9 2017
Aug 8 2017
Jul 17 2017
Jul 14 2017
Jul 12 2017
Mostly looks good, just 2 minor remarks.
Jul 7 2017
LGTM from a Cortex-A57 point-of-view: this patch is generating code in line with the optimization guide recommendations and the benchmark numbers quoted look good.
Jul 5 2017
Jul 4 2017
I find it a bit suspicious that this triggers on G_IMPLICIT_DEF which was added in the past few days, but I don't have time right now to dig in and try and understand the full scope of what's going on.
Given the default action for G_IMPLICIT_DEF is defined as NarrowScalar, my impression is that a target will have to specify at least one size for which "needsLegalizingToDifferentSize" is false, e.g. "Legal".
Otherwise, which size should "NarrowScalar" narrow towards on a G_IMPLICIT_DEF?
That being said, the condition "Aspect.Idx >= Actions[Aspect.Opcode - FirstOp].size()" doesn't seem to be related to the infinite loop of ever more narrowing the type size I explained in the previous sentence?
I'm clearly missing something.
I hope Tim will be able to give you more useful feedback.
Jun 30 2017
Jun 29 2017
- "project_revision" seems to generic to me (I wouldn't know whether it's the compiler or the benchmark suites revision). I could probably live with "compiler_revision", but we should have a separate patch for that.
Agreed, it's not the best possible name. It should be the revision of the software for which you're tracking the performance in my mind. That software doesn't have to be a "compiler". Not sure what a better name would be though.
I'd be fine changing it to "compiler_revision" right away, if we need more discussion about how to call the field, then I'd suggest doing that in another patch to not delay this patch.
Let's change it to "compiler_revision" then, as all the other projects I know of so far track performance of a piece of software that could be claimed to be a compiler of sorts.
Hmpf, I should have tried this before making promises: Turns out sqlite does not support renaming a column. This means I cannot easily provide migration code (adding a new column and copying doesn't work nicely either as removing the old column isn't supported either). So I fear this is best addressed by creating a new LNT schema.
Jun 28 2017
Agreed that not losing performance will be the key challenge for any redesign of the SQL schema to avoid having test-suite-specific tables. Anyway, it's indeed a discussion for another time...
Rebased to ToT & ping.
Jun 27 2017
Thank you so much for this, Matthias!
Jun 26 2017
Jun 23 2017
I hope this could all be done within the test-suite, probably extending test-suite/litsupport. I hope/think that no changes to LNT will be needed, but I haven't investigated myself.
For example, I'm thinking that test-suite/litsupport could end up reporting the execution time of a single iteration of the code benchmarked with the Google Benchmark library, and that will map neatly to concepts already understood by LNT.
Jun 20 2017
Thanks for sharing your thoughts - that makes things quite a bit clearer to me.
Jun 19 2017
There is quite a bit of code that is duplicated from LNT to test-suite as part of this patch.
I'm not entirely sure if having this much code duplication between LNT and test-suite is desirable, unless this is part of a bigger redesign, like we discussed in January in the thread starting roughly at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170123/423073.html, with a nicer design in the end?
Jun 1 2017
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
May 30 2017
Abandoning this as my motivation for this (fixing PR30324) has gone away, as that PR got fixed in another way, in r303360.
Splitting the StackProtector pass into a separate analysis and transformation is probably still nice, but probably also better pushed forward by someone with a stronger interest in the stack protector pass and/or someone who is more of an expert in this area.
committed in r304196.
May 29 2017
Thanks all for the review comments!
May 25 2017
LGTM, modulo considering a call instruction to be ending a basic block.
I don't know enough details about the X86_64 instruction set to judge whether the regular expressions are fully correct, but they look plausible to me.
May 24 2017
Looks like a reasonable start to me (with one more nit-pick).
Let's get this in as it's valuable as it is.
It can be improved further incrementally.
May 23 2017
May 19 2017
May 16 2017
Just wanted to point to https://reviews.llvm.org/D29494, which is a patch that isn't fully polished to split the StackProtector pass into an Analysis pass and a Transformation pass.
My motivation for that was also fixing PR30324.
May 11 2017
Thanks for the review on the ARM-specific part, Diana!
@t.p.northover : would you have time for a quick glance at the target-independent parts in LegalizerInfo.h, LegalizerInfo.cpp and LegalizerInfoTest.cpp?
May 10 2017
LGTM, even if you don't implement the nitpicky suggestions I made.
May 9 2017
May 3 2017
May 2 2017
Apr 27 2017
Could you explain the rationale for adding libs/benchmarks-1.1.0?
Also, since that seems to have a different license, you probably need to add that directory to the top-level LICENSE.TXT file as one of the sub-directories with "additional or alternate copyrights,licenses, and/or restrictions".
Apr 26 2017
Adding a unit test.
I've also run the latest version of this patch with command line options '-O3 -DNDEBUG -mcpu=cortex-a57 -mthumb -fomit-frame-pointer' on the test-suite and a range of other benchmarks.
I do see the same few performance regressions as reported by Andrew, for the same reasons; and many performance improvements.
Overall, I see a geomean performance improvement of 1.70% for the benchmarks reporting execution time and a 1.63% performance improvement for the benchmarks reporting scores.
Apr 24 2017
Apr 21 2017
Adding big.LITTLE mcpu options for clang to be more command-line compatible with gcc is a good idea.
Mapping the code generation for it to what is done for the little core also is a defendable position.
I'm just wondering if in some of the implementation of this there could be less copy-paste? I've made a few specific comments in some places of the code where ideally there should be less copy-paste.
Apr 19 2017
Committed in r300664
Apr 18 2017
Move public methods to all be in the same location, as suggested by Ahmed, now that the implementation details are hidden more in private methods.
Apr 14 2017