kristof.beyls (Kristof Beyls)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 18 2014, 12:59 AM (165 w, 4 d)

Recent Activity

Thu, Aug 17

kristof.beyls added a comment to D36717: [test-suite] Add SPEC CPU 2017.

One more remark: The 'ref' dataset of 638.imagick_s on all the computers I tried on took between 2 and 3 hours. Submitted results to SPEC are in a similar range. Unfortunately timeit.py has a 7200 seconds (2 hours) limit hardcoded (for --limit-cpu and --timeout). I had to remove that limit to be able to have a successful run.

Can we make the limit configurable or set higher?

Thu, Aug 17, 10:56 PM

Wed, Aug 16

kristof.beyls added inline comments to D36717: [test-suite] Add SPEC CPU 2017.
Wed, Aug 16, 12:16 AM

Fri, Aug 11

kristof.beyls accepted D36582: [test-suite] Adding HPCCG benchmark.

Thanks again, The -DVERIFICATION should not have been in the Makefile (not used in this app). I added a short note at the top of the README to make things clearer.

Brian

Fri, Aug 11, 10:52 PM
kristof.beyls added a comment to D36582: [test-suite] Adding HPCCG benchmark.

Hi Brian, Hal,

Fri, Aug 11, 12:55 PM

Thu, Aug 10

kristof.beyls added a comment to D36582: [test-suite] Adding HPCCG benchmark.

Great to see you taking the effort to make the test-suite more relevant!
I think it'd be helpful if you could also add a little explanation of how adding this code makes the test-suite more relevant.
I'm assuming that this kind of code is common in some domain (HPC?) and that there is no other benchmark already in the test-suite with similar enough properties?

Thu, Aug 10, 8:55 AM
kristof.beyls added a comment to D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..

Hi Kristof,

Thanks for working on this and I'm really sorry it took so long to reply.

Thu, Aug 10, 7:21 AM

Wed, Aug 9

kristof.beyls updated subscribers of D35402: [zorg] Enable running 'lnt runtest test-suite' instead of 'lnt runtest nt'..
Wed, Aug 9, 8:40 AM

Tue, Aug 8

kristof.beyls added a comment to D35588: Remove --commit/--commit=1 flags/settings..

Interesting. I wonder if one use case for this option was to submit (but not commit) a run and then just obtain a set of performance comparisons against the last submitted run. I think I have seen this pattern once, but I am not using it.

I personally won't be affected by the removal of this option, so I am fine with dropping it immediately. Now, a slightly less agressive approach would be to just switch it's default value, deprecate it and wait for a release or two to actually remove it.

Tue, Aug 8, 6:43 AM

Jul 17 2017

kristof.beyls added a comment to D35501: Add lnt commandline REST client.

This is a wonderful idea!

Jul 17 2017, 11:49 PM

Jul 14 2017

kristof.beyls created D35402: [zorg] Enable running 'lnt runtest test-suite' instead of 'lnt runtest nt'..
Jul 14 2017, 1:17 AM

Jul 12 2017

kristof.beyls accepted D35001: [LNT] Missing "Produced by" field on Run causes UI to crash.

LGTM!

Jul 12 2017, 11:44 PM
kristof.beyls added a comment to D35001: [LNT] Missing "Produced by" field on Run causes UI to crash.

Mostly looks good, just 2 minor remarks.

Jul 12 2017, 7:22 AM

Jul 7 2017

kristof.beyls accepted D34954: [AArch64] Use 16 bytes as preferred function alignment on Cortex-A57..

LGTM from a Cortex-A57 point-of-view: this patch is generating code in line with the optimization guide recommendations and the benchmark numbers quoted look good.

Jul 7 2017, 12:29 AM
kristof.beyls accepted D34961: [AArch64] Use 16 bytes as preferred function alignment on Cortex-A72..
Jul 7 2017, 12:29 AM
kristof.beyls added a comment to D34961: [AArch64] Use 16 bytes as preferred function alignment on Cortex-A72..

From a coding standpoint this all LGTM. However, I'm going to defer to a A72 code owner for final approval.

Jul 7 2017, 12:28 AM

Jul 5 2017

kristof.beyls added inline comments to D35001: [LNT] Missing "Produced by" field on Run causes UI to crash.
Jul 5 2017, 8:49 AM

Jul 4 2017

kristof.beyls added a comment to D34978: [GlobalIsel] fix undefined behavior if Action not set..

Hi Igor,

Would it be possible to add a regression test for this?
I'm wondering when this fails today. I guess it must be when the target has specified an action for e.g. TypeIdx 0, but not TypeIdx 1?
I'm just trying to understand if this should be an assert (if a target hasn't specified something fully enough) or an if statement.

Thanks,

Kristof

Hi Kristof,
I am not sure how to add regression test for this.
Today it fails for example on the follow mir

# RUN: llc -mtriple=x86_64-linux-gnu -O0 -run-pass=legalizer -global-isel %s -o - | FileCheck %s
  ---
  name:            test_implicit_def
  registers:
  body: |
  bb.0.entry:
    liveins:

    %0:_(s64) = G_IMPLICIT_DEF
...

In this case target doesn't set any Action for G_IMPLICIT_DEF.

I find it a bit suspicious that this triggers on G_IMPLICIT_DEF which was added in the past few days, but I don't have time right now to dig in and try and understand the full scope of what's going on.
Given the default action for G_IMPLICIT_DEF is defined as NarrowScalar, my impression is that a target will have to specify at least one size for which "needsLegalizingToDifferentSize" is false, e.g. "Legal".
Otherwise, which size should "NarrowScalar" narrow towards on a G_IMPLICIT_DEF?
That being said, the condition "Aspect.Idx >= Actions[Aspect.Opcode - FirstOp].size()" doesn't seem to be related to the infinite loop of ever more narrowing the type size I explained in the previous sentence?
I'm clearly missing something.
I hope Tim will be able to give you more useful feedback.

Jul 4 2017, 8:34 AM
kristof.beyls added a comment to D34978: [GlobalIsel] fix undefined behavior if Action not set..

Hi Igor,

Jul 4 2017, 4:55 AM

Jun 30 2017

kristof.beyls added inline comments to D32529: [GlobalISel] Make multi-step legalization work..
Jun 30 2017, 1:25 AM

Jun 29 2017

kristof.beyls added a comment to D34584: Introduce new JSON import format.
  • "project_revision" seems to generic to me (I wouldn't know whether it's the compiler or the benchmark suites revision). I could probably live with "compiler_revision", but we should have a separate patch for that.

Agreed, it's not the best possible name. It should be the revision of the software for which you're tracking the performance in my mind. That software doesn't have to be a "compiler". Not sure what a better name would be though.

I'd be fine changing it to "compiler_revision" right away, if we need more discussion about how to call the field, then I'd suggest doing that in another patch to not delay this patch.

Let's change it to "compiler_revision" then, as all the other projects I know of so far track performance of a piece of software that could be claimed to be a compiler of sorts.

Hmpf, I should have tried this before making promises: Turns out sqlite does not support renaming a column. This means I cannot easily provide migration code (adding a new column and copying doesn't work nicely either as removing the old column isn't supported either). So I fear this is best addressed by creating a new LNT schema.

Jun 29 2017, 12:53 PM
kristof.beyls updated the diff for D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..
Jun 29 2017, 9:09 AM
kristof.beyls accepted D34584: Introduce new JSON import format.

Before I dive in the meta discussion, are there any issues with the patch itself?

Jun 29 2017, 12:32 AM

Jun 28 2017

kristof.beyls added a comment to D34584: Introduce new JSON import format.

Drop Run/Info/tag: json itself is a schemaless format so it feels more natural without one. The LNT testsuite type is now determined by the submit URL used, or the '-s' commandline flag on the commandline. Both default to 'nts' if omitted.

I don't fully understand how LNT reacts to this, but on the surface makes sense to me. If someone ever had time to do a major redesign of the database schema, there is a chance we might also be able to drop the need for differences in the database schema per test suite, like "nts"? Anyway, that is definitely far out of scope for this patch.

The patch works fine. People using non-nts testsuites (which I have a strong suspicion only exist within apple), may need to supply a -s MySchema flag for the commandline tools or use db_default/v4/MySchema/submitRun instead of db_default/submitRun to express they don't use the 'nts' default.

This is probably not the right patch to discuss LNT database design. But some points anyway:

  • IMO making SQL tables too generic (like having one row with MetricType, Test, Run and Value; so we could create new MetricTypes on demand) is a sure way to kill performance. As long as we use SQL we should have 1 table column for each metric.
  • A schema is nice when you need to decide bits like the unit of a metric, whether bigger is better or getting a human readable metric name for the website. However we could make good arguments that we only need that on the presentation/analysis side and not as part of the data storage model.

Agreed that not losing performance will be the key challenge for any redesign of the SQL schema to avoid having test-suite-specific tables. Anyway, it's indeed a discussion for another time...

Jun 28 2017, 7:43 AM
kristof.beyls added a comment to D34362: [LNT] Support for different DataSet usage in Polybench for "lnt runtest nt".

It makes more sense to me to be using the --make-param flag to pass a test specific configuration options. If you want to add all these size classes, all the tests should support them, or have them mapped back to nearest size the tests can handle.

I feel having a common flag for test data sizes is better than having to pass them as test specific flag. As you suggested, we can map them back to nearest data sizes for the other tests. I was working with Polybench, so did not look into other tests. Will proceed with it if everyone agrees on adding a flag like "--testdatasizes=mini|small|medium|large|extra_large" and aliasing --small and --large for backward compatibility.

Look at SPEC for example: It has 3 different sizes: test, train and ref. You can describe them roughly as:

  • test: "As fast as posisble, not useful as a benchmark but they touch enough code paths so that they can give you a quick way to test for correctness"
  • train: "Somewhat realistic but smaller inputs, useful to produce data for PGO optimizations. The important aspect here is that the data is different enough from ref, so we don't overfit the code because training and reference data were the same"
  • ref: "Larger inputs running for a longer time producing stable numbers".

    So there is some semantics and intended uses here that is captured fine with "test", "train" and "ref". Mapping this to some generic terms like "small", "medium", "large" that just map to sizes would be a loss IMO.
Jun 28 2017, 7:23 AM
kristof.beyls updated the diff for D32529: [GlobalISel] Make multi-step legalization work..

Rebased to ToT & ping.

Jun 28 2017, 6:28 AM

Jun 27 2017

kristof.beyls updated subscribers of D34584: Introduce new JSON import format.

Thank you so much for this, Matthias!

Jun 27 2017, 12:02 AM

Jun 26 2017

kristof.beyls added a comment to D34581: Fix missing/mismatched html tags.

It would be wonderful if we could programmatically check these during testing. Thanks Matthias!

Jun 26 2017, 8:10 AM

Jun 23 2017

kristof.beyls added a comment to D32272: [XRay] Add Google Benchmark library + initial XRay benchmarks.

LGTM for the most part, thanks for making the changes, it's much simpler now.

Thoughts @MatzeB? I think we can work on running/reporting the microbenchmarks data later, in changes to LNT.

Can you please summarize the state of this and the eventual goal? It looks like:

  1. Eventually, we need a new LNT plugin (i.e. something like what we have in LNTBased/*) that understands how to interpret the output of the Google benchmark library.

I hope this could all be done within the test-suite, probably extending test-suite/litsupport. I hope/think that no changes to LNT will be needed, but I haven't investigated myself.
For example, I'm thinking that test-suite/litsupport could end up reporting the execution time of a single iteration of the code benchmarked with the Google Benchmark library, and that will map neatly to concepts already understood by LNT.

Jun 23 2017, 7:53 AM
kristof.beyls added a comment to D34368: [AArch64] PointerRegClass should be GPR64spRegClass.

Testcase? (I think this affects inline asm.)

This change is actually laying ground for fixing an inline asm bug (PR33134). Test cases will be added along with rest of the fix in a follow up change.

Jun 23 2017, 7:43 AM
kristof.beyls added reviewers for D34362: [LNT] Support for different DataSet usage in Polybench for "lnt runtest nt": MatzeB, cmatthews.

I think nt.py is deprecated, and meant to be replaced by test_suite.py.

Maybe someone who already contributed to lnt should be necessary for acceptance.

Jun 23 2017, 2:23 AM

Jun 20 2017

kristof.beyls added a comment to D34132: Add scripts to perform LNT submissions.

Thanks for sharing your thoughts - that makes things quite a bit clearer to me.

Jun 20 2017, 7:03 AM

Jun 19 2017

kristof.beyls added a comment to D34132: Add scripts to perform LNT submissions.

There is quite a bit of code that is duplicated from LNT to test-suite as part of this patch.
I'm not entirely sure if having this much code duplication between LNT and test-suite is desirable, unless this is part of a bigger redesign, like we discussed in January in the thread starting roughly at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170123/423073.html, with a nicer design in the end?

Jun 19 2017, 7:13 AM

Jun 1 2017

kristof.beyls added a comment to D33758: [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s).

Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.

My guess is that the compiler compile time goes down simply because tablegen produces far less code?

That's right. Instead of generating C++ and having the compiler optimize a single massive function, it's compiling a single copy of each predicate and chaining them together with a table.

Do you also have an insight of how using a state machine "interpreter" changes the run-time of the compiler itself compared to using the "directly-compiled" generated C++ code?
I can't predict really how the different tradeoffs in both approaches will affect the compiler's run-time performance in the end, so it would be nice to share insights you may have for this.

I haven't measured it yet since the priority so far has been to get the compile times back down to a reasonable level. Once I've finished posting the patches for review I'll take a look at the run-time effect.

My intuition is that the compiler was unlikely to have optimized the previous implementation very well (particularly since it didn't inline the single-use lambdas) and assuming that's the case, the overhead of the interpreter is likely to be limited to the cost of the table lookups. It's possible that it will turn out to be faster since the previous implementation had no chance of fitting into an instruction cache whereas the new code may fit on some machines. On machines where it doesn't fit, some paths are more common than others so it may still be able to use the instruction cache better than the old implementation did.

Also, if we end up using a state machine for this, I think it'd be a good idea to, after the design settles, write a blog post or something similar to explain how the state machine works, as I've heard from many people that reverse engineering the DAGISel-produced state machine was hard due to limited or no documentation on it.

Ok.

Jun 1 2017, 1:22 AM
kristof.beyls added a comment to D33758: [globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s).

Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.

Jun 1 2017, 12:26 AM
kristof.beyls added a reviewer for D33734: [AArch64] Add fallback in FastISel fp16 conversions: SjoerdMeijer.
Jun 1 2017, 12:18 AM

May 30 2017

kristof.beyls abandoned D29494: Mostly split the StackProtect Pass into an Analysis Pass and a Transformation Pass..

Abandoning this as my motivation for this (fixing PR30324) has gone away, as that PR got fixed in another way, in r303360.
Splitting the StackProtector pass into a separate analysis and transformation is probably still nice, but probably also better pushed forward by someone with a stronger interest in the stack protector pass and/or someone who is more of an expert in this area.

May 30 2017, 12:26 AM
kristof.beyls closed D33443: Fix PR33031: Cannot scavenge register without an emergency spill slot..

committed in r304196.

May 30 2017, 12:20 AM

May 29 2017

kristof.beyls added inline comments to D33443: Fix PR33031: Cannot scavenge register without an emergency spill slot..
May 29 2017, 8:06 AM
kristof.beyls updated the diff for D33443: Fix PR33031: Cannot scavenge register without an emergency spill slot..

Thanks all for the review comments!

May 29 2017, 7:52 AM

May 25 2017

kristof.beyls accepted D33544: [lnt][profile] Add support for X86-64 CFG view.

LGTM, modulo considering a call instruction to be ending a basic block.
I don't know enough details about the X86_64 instruction set to judge whether the regular expressions are fully correct, but they look plausible to me.

May 25 2017, 5:45 AM

May 24 2017

kristof.beyls accepted D33343: Add some tips on how to benchhmark.

Looks like a reasonable start to me (with one more nit-pick).
Let's get this in as it's valuable as it is.
It can be improved further incrementally.

May 24 2017, 2:39 AM

May 23 2017

kristof.beyls created D33443: Fix PR33031: Cannot scavenge register without an emergency spill slot..
May 23 2017, 7:50 AM

May 19 2017

kristof.beyls updated subscribers of D33343: Add some tips on how to benchhmark.
May 19 2017, 11:53 AM
kristof.beyls added inline comments to D33343: Add some tips on how to benchhmark.
May 19 2017, 11:52 AM

May 16 2017

kristof.beyls added a comment to D33222: [LegacyPassManager] Remove TargetMachine constructors.

Just wanted to point to https://reviews.llvm.org/D29494, which is a patch that isn't fully polished to split the StackProtector pass into an Analysis pass and a Transformation pass.
My motivation for that was also fixing PR30324.

May 16 2017, 1:54 AM

May 11 2017

kristof.beyls added a comment to D32529: [GlobalISel] Make multi-step legalization work..

Thanks for the review on the ARM-specific part, Diana!
@t.p.northover : would you have time for a quick glance at the target-independent parts in LegalizerInfo.h, LegalizerInfo.cpp and LegalizerInfoTest.cpp?

May 11 2017, 7:54 AM

May 10 2017

kristof.beyls accepted D32963: [AARCH64] Fix a comment to match the code. NFC..

LGTM, even if you don't implement the nitpicky suggestions I made.

May 10 2017, 1:57 AM

May 9 2017

kristof.beyls added inline comments to D31965: [SLP] Enable 64-bit wide vectorization for Cyclone.
May 9 2017, 5:24 AM

May 3 2017

kristof.beyls added a comment to D32706: [AArch64] Consider widening instructions in cast cost calculation.

Addressed Kristof's comments.

  • Changed wording to refer to "widening" instructions rather than "lengthening" instructions.
  • Changed function name to is isWideningFree.
  • Updated the logic to consider the W instruction variants in addition to the L variants.
  • Added code generation tests along side the cost test cases.
  • Added additional tests for the W variants.
May 3 2017, 12:42 AM
kristof.beyls added a comment to D31965: [SLP] Enable 64-bit wide vectorization for Cyclone.

Hey Matt,

Hi Adam,

I'm not sure where the conversation about this patch landed, but I'm fine with it being a Cyclone only change for now if that's what you prefer. I haven't had a chance to evaluate it on our cores yet. But when I do, I can easily set MinVectorRegisterBitWith if there's any benefit. How does compile-time look?

There is no measurable compile-time change for AArch64 (testsuite, ctmark, spec).

I believe the idea was to try to enable this for all subtargets, assuming you and @evandro can test this. On the other hand, it's been almost a month and I'd like to wrap this up. So perhaps we should enable this for Cyclone and A57 for now and then perhaps file bugs for the remaining subtargets to evaluate this.

How does that sound, @rengolin and others?

May 3 2017, 12:19 AM

May 2 2017

kristof.beyls added inline comments to D32706: [AArch64] Consider widening instructions in cast cost calculation.
May 2 2017, 8:01 AM
kristof.beyls added inline comments to D32706: [AArch64] Consider widening instructions in cast cost calculation.
May 2 2017, 12:31 AM

Apr 27 2017

kristof.beyls updated subscribers of D32272: [XRay] Add Google Benchmark library + initial XRay benchmarks.

Could you explain the rationale for adding libs/benchmarks-1.1.0?
Also, since that seems to have a different license, you probably need to add that directory to the top-level LICENSE.TXT file as one of the sub-directories with "additional or alternate copyrights,licenses, and/or restrictions".

Apr 27 2017, 4:03 AM

Apr 26 2017

kristof.beyls added a comment to D32529: [GlobalISel] Make multi-step legalization work..

Thanks for splitting this out! The ARM part looks excellent :)
I think you could add some target-independent tests for this in unittests/CodeGen/GlobalISel/LegalizerInfoTest.cpp, to make sure it works to WidenScalar/MoreElements etc to a type that has a Libcall or Custom action. That way we won't have to depend on ARM for testing this functionality.

Apr 26 2017, 9:53 AM
kristof.beyls updated the diff for D32529: [GlobalISel] Make multi-step legalization work..

Adding a unit test.

Apr 26 2017, 9:53 AM
kristof.beyls added a comment to D28152: Cortex-A57 scheduling model for ARM backend (AArch32).

I've also run the latest version of this patch with command line options '-O3 -DNDEBUG -mcpu=cortex-a57 -mthumb -fomit-frame-pointer' on the test-suite and a range of other benchmarks.
I do see the same few performance regressions as reported by Andrew, for the same reasons; and many performance improvements.
Overall, I see a geomean performance improvement of 1.70% for the benchmarks reporting execution time and a 1.63% performance improvement for the benchmarks reporting scores.

Apr 26 2017, 7:30 AM
kristof.beyls created D32529: [GlobalISel] Make multi-step legalization work..
Apr 26 2017, 4:00 AM

Apr 24 2017

kristof.beyls retitled D32438: [LNT] Document how to capture linux perf profiles. from Document how to capture linux perf profiles. to [LNT] Document how to capture linux perf profiles..
Apr 24 2017, 9:37 AM
kristof.beyls created D32438: [LNT] Document how to capture linux perf profiles..
Apr 24 2017, 9:26 AM

Apr 21 2017

kristof.beyls added a comment to D32076: [ARM] Add big.LITTLE mcpu tuning options.

Adding big.LITTLE mcpu options for clang to be more command-line compatible with gcc is a good idea.
Mapping the code generation for it to what is done for the little core also is a defendable position.
I'm just wondering if in some of the implementation of this there could be less copy-paste? I've made a few specific comments in some places of the code where ideally there should be less copy-paste.

Apr 21 2017, 9:43 AM
kristof.beyls added inline comments to D31750: [globalisel] Enable tracing the legalizer with --debug-only=legalize-mir.
Apr 21 2017, 12:11 AM

Apr 19 2017

kristof.beyls closed D31934: [GlobalISel] Support vector-of-pointers in LLT.

Committed in r300664

Apr 19 2017, 8:26 AM
kristof.beyls accepted D31750: [globalisel] Enable tracing the legalizer with --debug-only=legalize-mir.
Apr 19 2017, 12:27 AM

Apr 18 2017

kristof.beyls added inline comments to D31750: [globalisel] Enable tracing the legalizer with --debug-only=legalize-mir.
Apr 18 2017, 5:44 AM
kristof.beyls added a comment to D32048: ClamAV: Copy zlib into ClamAV benchmark.

If a non-trivial amount of time is spent in zlib for clamav and we would want to retain benchmarking of the zlib library, maybe it'd be better to add zlib (with appropriate test input) as a separate benchmark to the test-suite?

There is already spec2000/164.gzip which has pretty much the same inflation/deflation code as zlib. Also this patch is not about adding a new benchmark but simply getting rid of an external dependency!

Apr 18 2017, 2:58 AM
kristof.beyls updated the diff for D31934: [GlobalISel] Support vector-of-pointers in LLT.

Move public methods to all be in the same location, as suggested by Ahmed, now that the implementation details are hidden more in private methods.

Apr 18 2017, 1:10 AM

Apr 14 2017

kristof.beyls added a comment to D32015: Make globalaa-retained.ll test catching more cases..

I have to note that after http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170410/444886.html
[llvm] r300200 - Re-apply "[GVNHoist] Move GVNHoist to function simplification part of pipeline."
that test starts failing.

Apr 14 2017, 2:53 AM
kristof.beyls accepted D32015: Make globalaa-retained.ll test catching more cases..

LGTM, Thanks!

Apr 14 2017, 2:32 AM
kristof.beyls added inline comments to D32015: Make globalaa-retained.ll test catching more cases..
Apr 14 2017, 12:45 AM
kristof.beyls added a comment to D30521: Introduce llc/ExecuteTestCommands pass.

Currently, I really don't understand why this is the right approach...

We sometimes want to test a specific codegen API rather than a whole pass.

This adds a special pass to llc that features a minimalistic scripting language to
call some predefined API functions so we can test the API with the usual lit+FileCheck tools.

If you're actually trying to test a specific API, why aren't unittests the correct approach?

If the problem is that the unittests are hard to write, why not write libraries to make authoring the unittest easier?

I think there is something fundamental about how we write tests in llvm, and why we do not write pass tests in a unit testing framework because hey, who stops us implementing llc/opt as some functions in a unit testing library, they mostly have trivial APIs:

I think there is a beauty and we are hitting a sweet spot in the way most of our testing works in llvm: Forcing tests to be a simple human readable text file keeps things approachable and understandable. Understanding the basics of lit, llc, opt and FileCheck is easy. Having the full power of C++ in a unit test library on the other hand makes it way too easy do "fancy" stuff that is harder to understand because it happens to abstracted away in generic code and multiple layers of APIs that you have to learn and dig through when you actually have to understand a bug.

I don't see how things being abstracted away in a scripting language and complex fake "pass" infrastructure is going to be substantially better here.

Essentially, I totally understand prefering FileCheck and friends when testing something that is reasonably cohesive as a *pass*. But I feel like the point at which a scripting language or other things are being used to handle the fact that we're actually testing a specific API, I would *much rather* just write a unittest against that API directly.

I don't see why wanting FileCheck for things that aren't a pass isn't valid.

I don't think that pass vs. non-pass is the right distinction.

I think FileCheck makes sense when there is already a fundamental reason to have textual output that is sufficiently stable to use in tests. This can be everything from having an IR to having a clear and precise printing structure for an analysis. Those have utility *outside* of tests and also are great tools for testing.

But when the thing we are testing, fundamentally, is *an API*, I think we should test it *by calling that API*.

Admittedly I am not happy to introduce a scripting language here, but it seemed like the pragmatic choice compared with the alternative of building up a pile of small passes for every test or putting .mir code into C++ string literals in a unittest and still lacking all the llc mechanics for bringing up a target or having the nice matching capabilities of FileCheck.

Again, I am very much in favor of having the facilities from 'llc' or 'FileCheck' in useful APIs for unittests to leverage. But I don't think we should be adding a scripting language in order to write tests in a language other than C++ to exercise C++ API calls. I think this is significantly more burdensome than just writing unittests.

Sorry for the slow response, I think we may just fundamentally disagree about the right approach here. Currently, the approach you are proposing would make the resulting tests nearly incomprehensible to me... On the other hand, I think unittests (which are "just" C++ code even if sometimes non-obvious C++ code) *when used well* are much more friendly to people outside of the particular area than a MIR-specific scripting framework.

Apr 14 2017, 12:27 AM
kristof.beyls added a comment to D32048: ClamAV: Copy zlib into ClamAV benchmark.

Avoid the external zlib dependency in ClamAV by copying zlib-1.2.11
source into the benchmark.

External dependencies are problematic in benchmarks because:

They are not compiled with the same compiler/flags as the rest of the benchmarks.
They are an additional burden to setup when running the test-suite. While zlib is a really popular and ubiquitous library it is still sometimes missing in cross-compilation settings.
No external dependencies simplifies the buildsystem.

Apr 14 2017, 12:09 AM

Apr 13 2017

kristof.beyls added inline comments to D31934: [GlobalISel] Support vector-of-pointers in LLT.
Apr 13 2017, 12:48 PM
kristof.beyls updated the diff for D31934: [GlobalISel] Support vector-of-pointers in LLT.

Further tweaks based on Ahmed's feedback.

Apr 13 2017, 12:42 PM
kristof.beyls added a comment to D31934: [GlobalISel] Support vector-of-pointers in LLT.

Thanks very much for the review Ahmed! I think I took all feedback into account in the new version of the patch. Please have another look.

Apr 13 2017, 6:44 AM
kristof.beyls updated the diff for D31934: [GlobalISel] Support vector-of-pointers in LLT.

update patch taking into account Ahmed's review feedback.

Apr 13 2017, 6:30 AM

Apr 12 2017

kristof.beyls added a comment to D31965: [SLP] Enable 64-bit wide vectorization for Cyclone.

Hi Kristof,

Hi Renato,

Hi Adam,

Interesting results! But it doesn't sound like this is Cyclone specific.

Sure it's not, it is just a deployment strategy for this change. See the FIXME in the code.

Rolling it out for Cyclone-only is just a way to get this going in a controllable manner. Other subtargets can roll it this out as people find the time to benchmark and tune this.

As the results section shows I had and still doing some tuning on this. This mostly allows 2-lane vectorization for 32-bit types so they benefit of vectorization is not so great thus the accuracy of the cost model is really tested by enabling this.

I also got the impression that this is a change that is somewhat (but only somewhat) independent of micro-architecture, as I assume this is mostly about trading off the overhead that may be introduced to get data into the vector registers vs the gain by doing arithmetic in a SIMD fashion.
Of course, the cost of getting data in vector registers and the gain of doing arithmetic in a SIMD fashion is somewhat micro-architecture dependent.

I noticed that Adam points to a number of other patches improving things - I'm assuming these other patches lower the cost of getting data into the vector registers?

Yes, do you have rL299482?

Apr 12 2017, 8:44 AM
kristof.beyls added a comment to D31965: [SLP] Enable 64-bit wide vectorization for Cyclone.

Hi Renato,

Hi Adam,

Interesting results! But it doesn't sound like this is Cyclone specific.

Sure it's not, it is just a deployment strategy for this change. See the FIXME in the code.

Rolling it out for Cyclone-only is just a way to get this going in a controllable manner. Other subtargets can roll it this out as people find the time to benchmark and tune this.

As the results section shows I had and still doing some tuning on this. This mostly allows 2-lane vectorization for 32-bit types so they benefit of vectorization is not so great thus the accuracy of the cost model is really tested by enabling this.

Apr 12 2017, 8:08 AM

Apr 11 2017

kristof.beyls added inline comments to D31801: Performance enhancements for Cavium ThunderX2 T99.
Apr 11 2017, 8:17 AM
kristof.beyls created D31934: [GlobalISel] Support vector-of-pointers in LLT.
Apr 11 2017, 6:36 AM

Apr 10 2017

kristof.beyls accepted D31711: [GlobalISel] LegalizerInfo: Enable legalization of non-power-of-2 types.
Apr 10 2017, 11:31 PM

Apr 6 2017

kristof.beyls added inline comments to D31750: [globalisel] Enable tracing the legalizer with --debug-only=legalize-mir.
Apr 6 2017, 5:23 AM
kristof.beyls added inline comments to D31711: [GlobalISel] LegalizerInfo: Enable legalization of non-power-of-2 types.
Apr 6 2017, 2:40 AM

Apr 5 2017

kristof.beyls added inline comments to D31711: [GlobalISel] LegalizerInfo: Enable legalization of non-power-of-2 types.
Apr 5 2017, 8:10 AM

Mar 31 2017

kristof.beyls added inline comments to D31236: Refactor getHostCPUName to allow testing on non-native hardware..
Mar 31 2017, 11:32 AM
kristof.beyls added inline comments to D31503: [GlobalISel]: Fix bug where we can report GISelFailure on erased instructions.
Mar 31 2017, 12:08 AM

Mar 30 2017

kristof.beyls added inline comments to D31236: Refactor getHostCPUName to allow testing on non-native hardware..
Mar 30 2017, 4:45 AM

Mar 29 2017

kristof.beyls added a comment to D31359: [GlobalISel]: Allow backends to custom legalize Intrinsics.

It'd be nice if there could be a test case for the fallback mechanism in-tree for a case like this. It's not immediately obvious to me how such an in-tree test could be created at the moment.
All in all, my impression so far is that patch has 2 changes:

  1. Having a mechanism to not blow up reportGISelFailure when a multi-step legalization fails half-way through
  2. Allowing custom legalization of intrinsics. If I've understood correctly, maybe best to commit them separately once there's agreement?

Yes - you are correct with the above 2 impressions.
Please let me know if I should create two separate reviews instead of this one.

Mar 29 2017, 11:22 AM
kristof.beyls updated the diff for D31236: Refactor getHostCPUName to allow testing on non-native hardware..

. Added a few short tests for a few more cores.

Mar 29 2017, 9:42 AM
kristof.beyls added a comment to D31359: [GlobalISel]: Allow backends to custom legalize Intrinsics.

The other (easier) approach is to reportGlobalISelFailure on intermediate instructions whenever we're unable to legalize/Lower them.

Is this all accounting for the fact that you may not know if you can legalize an intrinsic before you've already erased it? Seems like a bit of an odd situation to be in if so.

Not sure if I understand the question correctly. My view is the Intrinsic was legalized by the backend (created some machine instruction, erased the generic intrinsic and created intermediate instructions). It's entirely possible that the intermediate instructions may fail legalization (for e.g. missing LegalizeActions/custom lowering of other intrinsics etc).

Mar 29 2017, 7:06 AM

Mar 24 2017

kristof.beyls updated the diff for D31236: Refactor getHostCPUName to allow testing on non-native hardware..
  • By refactoring a little bit more, it became possible to test using a regular unit test, not having to use a mock, which makes the test a lot cleaner.
  • During review discussions with Hal, it become clear that it's probably best to stick to unit testing rather than using lit-based testing for this feature.
Mar 24 2017, 7:42 AM

Mar 22 2017

kristof.beyls added a comment to D31236: Refactor getHostCPUName to allow testing on non-native hardware..

I'm curious why you chose to take this approach rather than add some option that allows us to change the file name being read? If we do that, then we can test this with lit tests. I generally think of our practice as using mocking, and unit tests in general, for cases where lit tests aren't practical (or, to put it another way, the infrastructure necessary to enable them is more complicated than unit testing in C++). This does not seem to be the case here. It seems straightforward to make -proc-cpuinfo-file=/foo/bar/cpuinfo.txt (modulo bikeshed) work.

As I recall, there are several other places in Clang where this is also a problem (we have hard-coded file names for /etc/lsb-release, /etc/redhat-release, etc.).

Mar 22 2017, 7:31 AM
kristof.beyls added inline comments to D31236: Refactor getHostCPUName to allow testing on non-native hardware..
Mar 22 2017, 5:23 AM
kristof.beyls created D31236: Refactor getHostCPUName to allow testing on non-native hardware..
Mar 22 2017, 2:50 AM

Mar 17 2017

kristof.beyls added a comment to D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..
  1. Would this allow merging data?

    When sub-arch specific decisions are concerned, having a way to override a base default case would reduce the amount of code on both table-gen and c++ parts.

    For example, we could have a default catch-all case like {1,widen; 32,legal; 33,unsupp}, and later on, based on sub-arch decisions, increment legality, lib calls, etc.

With the patch as is, you indeed need to re-specify the info for all bit sizes.
It could be that I pushed the current patch way too far in breaking existing APIs and if I turned back the patch to only change the internal representations used in LegalizerInfo.cpp,
this would work. So, the idea being that tablegen info and targets specify for which data types an action can be taken that doesn't require changing bitsize, such as Legal, Lower, Libcall.
And then the target-independent logic can hopefully make decisions on most/all operations on how to adapt types to different bitsizes when that's needed.
I'll look into that.

Mar 17 2017, 6:52 AM
kristof.beyls updated the diff for D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..
  • Made the API change to LegalizerInfo::setAction much smaller: the setAction API is largely unchanged now. The only difference is that it no longer allows legalizationActions that change the size to be specified this way.
  • Instead, specifying how to legalize when the size of the type legalized needs to change is specified using a SizeChangeStrategy. In follow-up work, I think that these size-changing strategies will turn out to be largely target-independent, and therefore can be shared between all targets, and not need to be respecified for each target separately.
  • Split setAction into setScalarAction and setPointerAction to avoid having to specify an LLT just to indicate whether the type is a scalar or a pointer(with address space).
  • To keep this patch as NFC as possible, for AArch64, I had to come up with some complicated SizeChangeStrategies. While not pretty, it demonstrates that it is possible to create very custom SizeChangeStrategies. These ugly SizeChangeStrategies are also only expect to be there for a short while, until we make functional-change-changes to allow all non-power-of-two-sized types
  • Moved some of the implementation code from LegalizerInfo.h to LegalizerInfo.cpp
  • A lot of smaller cleanups.
Mar 17 2017, 6:50 AM

Mar 14 2017

kristof.beyls updated the diff for D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..
  • Split LegalizerInfo::setAction into setScalarAction and setPointerAction to avoid having to specify a mostly meaningless LLT as an argument just to indicate whether the type is a scalar or a pointer(with address space).
Mar 14 2017, 7:16 AM
kristof.beyls added a comment to D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..

I haven't read the actual code yet, but I've got a couple questions and a comment based on the description and the conversation so far.

Mar 14 2017, 4:18 AM

Mar 13 2017

kristof.beyls added a comment to D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..

I hope the above makes sense?

It does, but there are two issues here:

Mar 13 2017, 8:49 AM
kristof.beyls added a comment to D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..

Hi Kristof,

I haven't seen the patch at all, but what about situations where 64-bit is done with lib calls? For example, ldivmod takes 64-bit arguments and you wouldn't want to narrow them to 32-bits.

If this patch is intended to just simplify the legal vs. others, it shouldn't have a narrow-all that spans to +inf. Makes sense?

cheers,
--renato

Mar 13 2017, 4:38 AM

Mar 10 2017

kristof.beyls updated the diff for D30529: [RFC][GlobalISel] Enable legalizing non-power-of-2 sized types..
  • Rebased to top-of-trunk.
  • Correctly store Action info for pointer types for targets with multiple address spaces.
  • Added a few basic checks to verify correctness of a targets legalization specification.
  • Added one more static function to LegalizerInfo to make specifications shorter and more readable. (UnsupportedButFor).
  • Adapted setAction API to be able to specify if a particular setAction should be the first specification on the operation, or if it could be a refinement.
  • Introduced a few typedefs to improve readability.
  • Made legalization info identical to current ToT for all backends, so that this patch becomes NFCi.
Mar 10 2017, 8:11 AM