I don't see any test changes for VEXTRACTF128. Do you really need JWriteVecExtractF128? If so, you should add the corresponding test.
Aug 30 2018
Aug 29 2018
Aug 28 2018
The issue with SB uops was fixed.
Aug 27 2018
Now we have WriteCMPXCHGRMW, as craig.topper required.
Aug 22 2018
Aug 21 2018
Aug 17 2018
:start: means the timer was started
In some cases, the ChildTime is already non-zero at the "start" point; what does that mean?
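For reference, here is a minimal sketch of one common way to split nested timer durations into self time and child time. It assumes a stack of frames and an explicit `Now` argument in place of a real clock; `TimerStack` and `Frame` are hypothetical names and are not the patch's implementation. In this scheme ChildTime is always zero at the start point, so a non-zero value there would mean state carried over from an earlier start/stop pair.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bookkeeping, not the data structures used by the patch.
struct Frame {
  double StartTime;
  double ChildTime; // time already credited to nested timers
};

class TimerStack {
  std::vector<Frame> Frames;

public:
  // Every start pushes a fresh frame, so ChildTime is zero at this point.
  void start(double Now) { Frames.push_back({Now, 0.0}); }

  // Returns the self time of the innermost timer and credits its full
  // duration to the parent's ChildTime so the parent does not count it again.
  double stop(double Now) {
    assert(!Frames.empty() && "stop without a matching start");
    Frame Top = Frames.back();
    Frames.pop_back();
    double Total = Now - Top.StartTime;
    if (!Frames.empty())
      Frames.back().ChildTime += Total;
    return Total - Top.ChildTime;
  }
};
```

With an outer timer running from 0 to 10 and an inner one from 1 to 3, the inner timer reports 2 and the outer reports 8, so the two self times add up to the wall time.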
I fixed the issue with WriteCMPXCHGLd, thanks to craig.topper.
Aug 16 2018
I mean, which of the callers of startFrontendTimer() is calling it with a pointer to std::declval()?
The ability to produce debug output for 'ftiming' was added. As a result, it is now possible to check where timers start and stop, and for which functions (see the changes in Utils.h).
Aug 14 2018
I fixed the sched parameters according to craig.topper's suggestions.
Aug 13 2018
The code was rebased. Two tests changed for SandyBridge (Generic). I did not find the relevant numbers in the Intel SDMs or on Agner's site, so I don't know which is better, the new results or the old ones. The old sched model does not use SBPort4 for memory ops, which looks strange to me. Could anyone help me with these tests?
Aug 9 2018
Currently we have 93 warnings for X86 CPU models:
I'm unclear why we would want to assign clang's FrontendTimesIsEnabled from inside CodeGenAction. If I'm understanding the intentions here, the goal was to add more timing infrastructure to clang. But if the enabling is tied to CodeGenAction, then doesn't that mean any new clang timers wouldn't work under -fsyntax-only?
Aug 8 2018
The patch was rebased after D49912.
Aug 7 2018
All of the latest requirements were addressed.
Aug 3 2018
I think you're almost there but D49912 must be completed and committed first
"0.0040" is four milliseconds? You're probably crediting time incorrectly, somehow. Can you tell which FrontendTimeRAII the time is coming from?
The comment was added:
Aug 2 2018
XADD*rr instrs were added. I did not add the rm version because we should decide how best to do it:
The new tests were updated, xadd* tests were added.
Aug 1 2018
The WriteCMPXCHGLd was implemented.
efriedma, I removed redundant RAII objects but I still have the following:
Jul 31 2018
Following efriedma's suggestion, I removed start/stopFrontendTimer where possible and inserted FrontendTimeRAII in several new places. As a result, the patch keeps growing. As another result, I got output like the following (on compiler bootstrap):
0.5920 (165) _ZSt7declvalv (*)
0.5960 (155) _ZSt7declvalv (*)
0.5960 (162) _ZSt7declvalv (*)
0.6000 (167) _ZSt7declvalv (*)
0.6040 (155) _ZSt7declvalv (*)
0.6040 (160) _ZSt7declvalv (*)
0.6040 (169) _ZSt7declvalv (*)
(the above is grep output from build log file)
Jul 30 2018
Jul 28 2018
The SLM & btver2 values: are those placeholders, or are those real values from Agner?
Jul 27 2018
I updated the tests according to Roman's request.
But of course you're right: we need such mca tests. I'll try to prepare such tests for XCHG* and CMPXCHG* instrs (I've just started to work with CMPXCHG*).
But we have a lot of XCHG tests in other places. For example:
Jul 26 2018
WriteShiftDouble was removed, other comments were fixed.
efriedma, do you have any other comments/requirements?
Jul 24 2018
I fixed all requirements from Simon - even the SLM test was fixed.
The main idea of this patch is implemented in D47763.
Is there anything to do to get this going? Or did this get replaced by some other differential? D48222?
Jul 23 2018
We decided that this patch won't include memory versions of the instrs, so I simply fixed small requirements such as "place of WriteBitTest" and "removing of the TableGen CodeGenSchedule diffs".
Jul 20 2018
Jul 19 2018
I removed the changes from D48222.
Comments from Simon and Roman were resolved.
Jul 18 2018
I added required comments and did the required changes.
Jul 17 2018
I replaced 'auto' with real types according to Simon's request.
What happened to the BSWAP patch? Compared to BT (D49243) it should have been pretty trivial to implement.
In fact we have very similar issues there, even with the rr versions: we should separate the implementations for the different sizes 16/32/64. I'm going to complete it tomorrow. But again, most probably it will be without memory operand support, as in D49243.
Now we use WriteRes instead of *WriteResPair.
Folded versions of the instrs will be implemented in the next patches.
http://www.agner.org/optimize/instruction_tables.pdf, page 202, "Intel Haswell", "List of instruction timings and μop breakdown" appears to list all the BT* as having latency of 1.
I mixed up the latency and throughput columns: the latency is missing.
(It would be great if the 'done' checkbox in the notes got checked, too)
Adding startFrontendTimer/stopFrontendTimer helps a little, but it's still difficult to match a given startFrontendTimer to the corresponding stopFrontendTimer because they're in completely different functions in some cases. Do they really need to be scattered like that? If they do, please add comments so someone reading the code can match them up.
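To illustrate the pairing concern, here is a minimal sketch of a scope-bound timer in which the start and the stop always live in the same function. `ScopedTimer`, `TimerEvent` and `timeNestedWork` are hypothetical names used only for this illustration; they are not the patch's FrontendTimeRAII or its start/stopFrontendTimer helpers, and the event log merely stands in for the debug output.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not the patch's FrontendTimeRAII.
struct TimerEvent {
  std::string Name;
  bool IsStart; // true for a start event, false for a stop event
};

class ScopedTimer {
  using Clock = std::chrono::steady_clock;
  std::string Name;
  std::vector<TimerEvent> &Log;
  double &AccumulatedSeconds;
  Clock::time_point Begin;

public:
  ScopedTimer(std::string N, std::vector<TimerEvent> &L, double &Seconds)
      : Name(std::move(N)), Log(L), AccumulatedSeconds(Seconds),
        Begin(Clock::now()) {
    Log.push_back({Name, true});
  }
  // The stop is tied to scope exit, so it always pairs with the start above.
  ~ScopedTimer() {
    AccumulatedSeconds +=
        std::chrono::duration<double>(Clock::now() - Begin).count();
    Log.push_back({Name, false});
  }
  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;
};

// Usage: both timers start and stop inside the function that owns them.
// Seconds accumulates every timer's wall time, nested ones included.
inline std::vector<TimerEvent> timeNestedWork(double &Seconds) {
  std::vector<TimerEvent> Log;
  {
    ScopedTimer Outer("outer", Log, Seconds);
    {
      ScopedTimer Inner("inner", Log, Seconds);
    }
  }
  return Log;
}
```

The log produced by `timeNestedWork` is strictly nested (outer start, inner start, inner stop, outer stop), which is the property that is hard to verify by reading when the start and stop calls sit in different functions.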
Jul 16 2018
I need your help!
It seems I fixed all issues raised by lebedev.ri. The open question is: should I remove the second check inside checkSchedClasses? I'm sure there is a way to get default values for SchedWrites such as WriteALU, WriteFAdd, etc., but at the moment I don't know how to do it. Please help.
Jul 13 2018
I renamed WriteBTr to WriteBitTest.
I could not add memory versions of the instructions because there were some issues. For example, if we have
I'm assuming that you have run ninja check-llvm-tools-llvm-mca-x86 and it was ok. (the test coverage seems to be ok as-is.)
Jul 12 2018
Now the patch keeps only the infrastructure changes needed to produce new TableGen warnings about sched models.
Jul 11 2018
Please can you pull out the BSWAP change into its own patch for review?
Jul 10 2018
I fixed all issues raised by efriedma: GlobalDecl(FD), function body, class names, etc. Many thanks for your help.
Jul 9 2018
I fixed the warnings for BSWAP instrs (see the changes in X86*.td files). If this approach is OK, I'll fix other instrs.
At the moment we have the following warnings for X86:
In fact we have warnings for only 2 targets:
@avt77 Next step is probably to start briefly collating the instructions causing the warnings and investigate how best to fix them (list them here, raise bugs etc.). The aim would be to see if we can remove ALL these warnings before we consider committing this patch. It's probably time to add the maintainers for each target that has warnings so they can be investigated?
Jul 3 2018
Now it's almost complete, but I still have to deal with the default Latency value; I hope to implement it soon. (The current version does not yet produce warnings about identical uOps & Latency during the compiler build: maybe that's OK?)
Hi! I was out for 2 weeks, so I did not do anything here. Is this still of interest to you? I'm going to publish an update asap.
But the question is: how do I get latency/uops from CodeGenSchedClass?
Jun 15 2018
Jun 7 2018
If I use only one check (" // Check if an instruction is always overriden (candidate for a new class?)") I see warnings for P9Model only.
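As a rough illustration of what an "always overridden" check could look like, the sketch below flags instructions whose default scheduling information is overridden in every CPU model. The input shape (`OverridesByModel`), the function name `findAlwaysOverridden`, and the reading of "always" as "in every model" are all assumptions made for this example; the check in the patch works on TableGen records and may differ.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical input: for each CPU model, the set of instructions whose
// default scheduling class is overridden in that model.
using OverridesByModel = std::map<std::string, std::set<std::string>>;

// Returns the instructions overridden in every model, in sorted order.
inline std::vector<std::string>
findAlwaysOverridden(const std::set<std::string> &Instrs,
                     const OverridesByModel &Overrides) {
  std::vector<std::string> Result;
  if (Overrides.empty())
    return Result; // no models, so nothing can be "always" overridden
  for (const std::string &Instr : Instrs) {
    bool Always = true;
    for (const auto &Model : Overrides) {
      if (!Model.second.count(Instr)) {
        Always = false;
        break;
      }
    }
    if (Always)
      Result.push_back(Instr); // candidate for a new class
  }
  return Result;
}
```

For example, if "ModelA" overrides BSWAP32r and BT32rr while "ModelB" overrides only BSWAP32r, the function returns just BSWAP32r; the model names and this instruction selection are illustrative.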
At the moment, I'm studying the debug output related to the generation of all these tables and hope to come up with some realistic logging soon.
Simon, I have had some trouble with my Slack connection, so I only read your message now :-( I'll be back with an answer asap.
Then I'm not sure if we win a lot from this as it makes things less explicit, though I agree that it slightly reduces code duplication.
Jun 6 2018
Of course not. First of all, I suggest putting AVX2 and AVX512 instructions in separate files and using them in models which don't support AVX2 and/or AVX512. Etc.
Obviously, the idea was to use one include file for all targets, but that did not work in the first version of the patch. Now I did a "dirty hack" in TableGen to make it possible. It works for 2 CPUs and should work for the others. The usage is very simple: the common include file should keep the common unsupported instructions, while the specific ones should be inserted into the specific .td files.
Jun 5 2018
And the last question. Again, in P9Model we have
Next, we have for P9Model only:
Something is wrong with the current diagnostic. For example, we have