- User Since
- Sep 1 2015, 3:36 AM (268 w, 3 d)
Are we sure that the lowering of MachineInstr to MCInst is preserving the operand sequence? Can it be that the immediate is at position 4 for the MCInst only?
I have no idea what an ldrbt looks like as a MachineInstr. The original check should have triggered an assertion for MachineInstr too, then...
Should it always have been checking operand 4 then? I think that makes sense
Uploaded diff with full context
@andreadb Looks like I updated the wrong diff last time, sorry. Addressed comments
Rebased and added context
Fixed variable names
Fixed issue with update forms of ldm* instructions
Wed, Oct 21
@andreadb Makes sense. Let's try an AArch64-like approach for this case. I've updated the patch to handle just IsLdrAm3RegOffPred
Tue, Oct 20
Mon, Oct 19
@dmgreen There is no predicate for basic (w/o shift) moves in the A57 model, that's why mvneq is not touched.
D89553 was pushed instead
@Paul-C-Anagnostopoulos I've removed the test completely. Closing this.
Sat, Oct 17
If you are happy with the mca test showing the changes, the other test could be removed if it's causing more trouble than it's worth?
Fri, Oct 16
I do wish there was a faster way to test what you're testing.
error: command failed with exit status: 2147483651
Thanks! I've created D89553 for this
The main problem in your case was that CheckFunctionPredicate is not very good because it assumes a single operand in input to the function (i.e. a MachineInstr/MCInst operand).
Removed commented line of code
Thu, Oct 15
Removed wrongly added file from diff
Tue, Oct 13
Mon, Oct 12
Fri, Oct 9
Reduced test case
The patch also fixes the Cortex-A57 model for the sxth/uxth/sxtab/uxtab instruction family. Added a test case.
Sep 21 2020
Sounds like you've got your environment all set up. Would it be easy for you to quickly test the changes that you suggested earlier?
LGTM with nits
Sep 20 2020
@flyingforyou There are numerous places where latencies are different from those in arm_cortex_a55_software_optimization_guide_v2.pdf. The values in the guide seem to be correct; at least they match my measurements on real hardware.
Also, there are some forwarding paths not listed by the model
Sep 12 2020
@dmgreen Thx, I've updated the diff.
Sep 11 2020
Sep 10 2020
Sep 2 2020
Our case is a bit different. Given a 512M incremental flush threshold, I tested an LTO build that outputs a 5G bitcode file. BackpatchWord is called 16,613,927 times, of which only 12 need a disk seek. Plus, each access touches 4-8 bytes on a page, and all visited pages are far away from each other. It is likely that the pages are not cached and need to be loaded anyway, and after a load, our code does not access enough data on a page to 'cancel' the page fault cost. So its cost could be very similar to a seek.
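To make the seek count above concrete, here is a minimal toy sketch of the idea, not LLVM's actual BitstreamWriter code. The class `ToyBufferedWriter`, its fields, and the tiny threshold are all hypothetical and chosen only for illustration: a backpatch whose position falls in already-flushed data needs a seek, while one that falls in the in-memory buffer does not.

```python
class ToyBufferedWriter:
    """Hypothetical toy model: buffer writes, flush past a threshold."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.on_disk = bytearray()    # stands in for the already-flushed file
        self.in_memory = bytearray()  # bytes that have not been flushed yet
        self.seeks = 0                # backpatches that had to touch "disk"

    def write(self, data):
        self.in_memory += data
        # Flush the whole buffer once it reaches the threshold.
        if len(self.in_memory) >= self.flush_threshold:
            self.on_disk += self.in_memory
            self.in_memory = bytearray()

    def backpatch_word(self, pos, value):
        """Overwrite 4 bytes at absolute position pos (little-endian)."""
        word = value.to_bytes(4, "little")
        flushed = len(self.on_disk)
        if pos < flushed:
            self.seeks += 1  # at least one byte lives in flushed data
        for i, byte in enumerate(word):
            p = pos + i
            if p < flushed:
                self.on_disk[p] = byte
            else:
                self.in_memory[p - flushed] = byte


w = ToyBufferedWriter(flush_threshold=8)
w.write(bytes(8))                 # reaches the threshold, gets flushed
w.write(bytes(4))                 # stays in memory
w.backpatch_word(0, 0xAABBCCDD)   # lands in flushed data: counts as a seek
w.backpatch_word(8, 0x11223344)   # lands in the buffer: no seek

# Fraction of backpatches that needed a seek in the measurement quoted above.
seek_fraction = 12 / 16_613_927
```

With the quoted numbers, fewer than one backpatch in a million takes the seek path, which is why the cost of that path matters so little here.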
Sep 1 2020
Aug 20 2020
Aug 1 2020
Jul 31 2020
The code in computeDeadSymbols will conservatively mark all copies live if any is. See not only the worklist iteration I modified here, but also the code in visit at lines 878-879, and the preserved GUID handling at lines 814-815. So if there is a collision, the colliding values may conservatively be marked live, and the loop in propagateAttributes will handle them. I.e., they are either all dead or all live (especially after my fix here, which will make the behavior more conservative in the chance case of any alias values that collide with a value marked live during module summary building). Other than the corner case I'm fixing here, this should be a no-op in behavior for that loop in propagateAttributes, which was already skipping all dead values, and which already checks whether it is a GlobalVarSummary.
I made a smaller efficiency improvement (no measurable impact) to skip all summaries for a VI if the first copy is dead. I added an assert to ensure that all copies are dead if any is....
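A rough sketch of the "all dead or all live" invariant described above, as a toy model rather than the real ModuleSummaryIndex code. `ToySummary`, `mark_copies_live_conservatively`, and `live_guids` are hypothetical names, and the index is assumed to map each GUID to a non-empty list of copies.

```python
from dataclasses import dataclass


@dataclass
class ToySummary:
    """Hypothetical stand-in for one summary copy of a value."""
    module: str
    live: bool = False


def mark_copies_live_conservatively(index):
    # If any copy under a GUID is live, mark every copy under it live.
    for copies in index.values():
        if any(s.live for s in copies):
            for s in copies:
                s.live = True


def live_guids(index):
    # Skip a GUID entirely when its first copy is dead; the assert checks
    # the invariant that then all of its copies are dead.
    result = []
    for guid, copies in index.items():
        if not copies[0].live:
            assert all(not s.live for s in copies)
            continue
        result.append(guid)
    return result


index = {
    1: [ToySummary("a.o", live=True), ToySummary("b.o")],
    2: [ToySummary("a.o"), ToySummary("c.o")],
}
mark_copies_live_conservatively(index)
```

After the conservative marking, checking only the first copy is enough to decide whether to skip a GUID, which is the shortcut the efficiency improvement relies on.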
This LGTM, but why is the verifier figuring out how to do its own RPO?
Jul 24 2020
Can anyone look at this, please? Thanks
Jul 23 2020
Jul 22 2020
Does this mean that we run UnreachableMachineBlockElimID twice?