User Details
- User Since: Apr 11 2016, 3:46 AM (362 w, 9 h)
Aug 30 2018
I don't see any changes for VEXTRACTF128 in the tests. Do you really need this JWriteVecExtractF128? If so, you should add a corresponding test.
Aug 29 2018
Aug 28 2018
The issue with SB uops was fixed.
Aug 27 2018
Now we have WriteCMPXCHGRMW, as craig.topper required.
Aug 22 2018
Aug 21 2018
Aug 17 2018
I fixed the issue with WriteCMPXCHGLd, thanks to craig.topper.
Aug 16 2018
The ability to produce debug output for 'ftiming' was added. As a result, it is now possible to check where timers start and stop, and for which functions (see the changes in Utils.h).
Aug 14 2018
I fixed the sched parameters according to craig.topper's suggestions.
Aug 13 2018
The code was rebased. Two tests changed for SandyBridge (Generic). I could not find the proper numbers in the Intel SDMs or on Agner's site, so I don't know which are better: the new results or the old ones. The old sched model does not use SBPort4 for memory ops, which looks strange to me. Could anyone help me with these tests?
Aug 9 2018
Currently we have 93 warnings for X86 CPU models:
Aug 8 2018
The patch was rebased after D49912.
Aug 7 2018
All of the latest requirements were addressed.
Aug 3 2018
The comment was added:
Aug 2 2018
XADD*rr instrs were added. I did not add the rm version because we should first decide how best to do it:
The new tests were updated, xadd* tests were added.
Aug 1 2018
The WriteCMPXCHGLd was implemented.
efriedma, I removed redundant RAII objects but I still have the following:
Jul 31 2018
Following efriedma's suggestion, I removed start/stopFrontendTimer where possible and inserted FrontendTimeRAII in several new places. As a result, the patch keeps getting bigger. As another result, I got output like this (on compiler bootstrap):
....
0.5920 (165) _ZSt7declvalv (*)
0.5960 (155) _ZSt7declvalv (*)
0.5960 (162) _ZSt7declvalv (*)
0.6000 (167) _ZSt7declvalv (*)
0.6040 (155) _ZSt7declvalv (*)
0.6040 (160) _ZSt7declvalv (*)
0.6040 (169) _ZSt7declvalv (*)
....
(the above is grep output from build log file)
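The general idea of replacing paired start/stop calls with an RAII guard can be sketched roughly as follows. This is only an illustration, not the FrontendTimeRAII class from the patch: `TimingRegistry`, `ScopedTimer`, and `parseSomething` are hypothetical names, and the sketch uses only the C++ standard library.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical registry: accumulated time and call count per function name.
struct TimingRegistry {
  std::map<std::string, Clock::duration> Total;
  std::map<std::string, unsigned> Count;
};

// Hypothetical RAII guard (an assumption for illustration, not the patch's
// FrontendTimeRAII): the timer starts in the constructor and is stopped and
// recorded in the destructor.
class ScopedTimer {
  TimingRegistry &Reg;
  std::string Name;
  Clock::time_point Start;

public:
  ScopedTimer(TimingRegistry &R, std::string N)
      : Reg(R), Name(std::move(N)), Start(Clock::now()) {}

  ~ScopedTimer() {
    Reg.Total[Name] += Clock::now() - Start;
    ++Reg.Count[Name];
  }

  // Non-copyable, so each measured scope is recorded exactly once.
  ScopedTimer(const ScopedTimer &) = delete;
  ScopedTimer &operator=(const ScopedTimer &) = delete;
};

// Every return path stops the timer; no explicit stop call is needed.
int parseSomething(TimingRegistry &Reg, bool Bail) {
  ScopedTimer T(Reg, "parseSomething");
  if (Bail)
    return -1; // early exit is still timed
  return 42;
}
```

With a guard like this, a function with several exit paths needs one declaration at the top instead of a stop call before each return, which is the usual motivation for preferring RAII over manual start/stop pairs.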
Jul 30 2018
Jul 28 2018
Jul 27 2018
I updated the tests according to Roman's request.
But of course you're right: we need such mca tests. I'll try to prepare such tests for XCHG* and CMPXCHG* instrs (I've just started to work with CMPXCHG*).
But we have a lot of XCHG tests in other places. For example:
Jul 26 2018
WriteShiftDouble was removed, other comments were fixed.
efriedma, do you have any other comments/requirements?
Jul 24 2018
I fixed all requirements from Simon - even the SLM test was fixed.
The main idea of this patch is implemented in D47763.
Jul 23 2018
We decided that this patch won't include memory versions of the instrs, so I simply fixed small requirements such as "place of WriteBitTest" and "removing of the TableGen CodeGenSchedule diffs".
Jul 20 2018
Jul 19 2018
I removed the changes from D48222.
Comments from Simon and Roman were resolved.
Jul 18 2018
I added required comments and did the required changes.
Jul 17 2018
I replaced 'auto' with real types according to Simon's request.
In fact we have very similar issues there, even with the rr versions: we should separate the implementations for the different sizes 16/32/64. I'm going to complete it tomorrow. But again, most probably it will be without memory operand support, like in D49243.
Now we use WriteRes instead of *WriteResPair.
Folded versions of the instrs will be implemented in the next patches.
http://www.agner.org/optimize/instruction_tables.pdf, page 202, "Intel Haswell", "List of instruction timings and μop breakdown" appears to list all the BT* as having latency of 1.
I mixed up the latency and throughput columns: the latency is missing.
Jul 16 2018
Hi All,
I need your help!
It seems I fixed all issues raised by lebedev.ri. The open question is: should I remove the second check inside checkSchedClasses? I'm sure there is a way to get default values for SchedWrites such as WriteALU, WriteFAdd, etc., but at the moment I don't know how to do it. Please help.
Jul 13 2018
I renamed WriteBTr to WriteBitTest.
I could not add memory version of the instructions because there were some issues. For example, if we have
Jul 12 2018
Now the patch keeps only the infrastructure changes needed to produce new TableGen warnings about sched models.
Jul 11 2018
Jul 10 2018
I fixed all issues raised by efriedma: GlobalDecl(FD), function body, class names, etc. Many thanks for your help.
Jul 9 2018
I fixed the warnings for BSWAP instrs (see changes in X86*.td files). If this approach is OK I'll fix other instrs.
At the moment we have the following warnings for X86:
In fact we have warnings for 2 targets only:
Jul 3 2018
Now it's almost complete, but it still has to deal with the default Latency value: I hope to implement that soon. (The current version does not yet produce a warning about identical uOps & Latency during the compiler build: maybe that's OK?)
Hi! I was out for 2 weeks, which is why I did not do anything here. Is it still interesting for you? I'm going to publish an update asap.
But the question is: how to get latency/uop from CodeGenSchedClass?
Jun 15 2018
Jun 7 2018
>>! In D47766#1124708, @RKSimon wrote:
I really don't like the idea of a separate file - I think the models need to stay self contained. @courbet 's approach in D47763 seems a lot tidier
If I use one check only (" // Check if an instruction is always overriden (candidate for a new class?)") I see warnings for p9model only.
At the moment, I'm studying the debug output related to the generation of all these tables and hope to come up with some realistic logging soon.
Simon, I have some trouble with my Slack connection, which is why I only read your message now :-( I'll be back with an answer asap.
Jun 6 2018
Of course not. First of all, I suggest putting AVX2 and AVX512 instructions in separate files and using them in models which don't support AVX2 and/or AVX512. Etc.
Obviously, the idea was to use one include file for all targets. But it did not work in the first version of the patch. Now I did a "dirty hack" in TableGen to be able to do it. It works for 2 CPUs and obviously it should work for others. The usage is very simple: the common include file should keep the common unsupported instructions, while the specific ones should be inserted into the specific .td files.
Jun 5 2018
And the last question. Again, in P9Model we have
Next, we have for P9Model only:
Something is wrong with the current diagnostic. For example, we have