- User Since
- Oct 21 2016, 1:19 AM (100 w, 5 d)
Thu, Sep 6
Tue, Sep 4
- Calculate CSStackSize by accumulating reg-size for each saved register.
- Only sets PairedReg in SavedRegs when it is not AArch64::NoRegister.
- Removed trailing whitespace from test.
Mon, Sep 3
The code in hasLoadFromStackSlot filters out MMO's that have a pseudo source value of type Stack. If that function populates an array of only those MMOs, I think its safe to do a cast<FixedStackPseudoSourceValue>(mmo->getPseudoValue()) and get rid of the FrameAccess struct.
Fri, Aug 31
- Some refactoring to maintain CSStackSize.
- Migrated .ll test to a .mir test.
Fixed issue with incorrect size of MachineMemOperand.
Thu, Aug 30
Aug 17 2018
Added negative tests and moved the tests to their own files test/MC/AArch64/SVE (system-regs.s and system-regs-diagnostics.s).
These system registers were missing from my previous SVE MC patches.
Jul 30 2018
- Updated to reflect changes in D49592 to use TSFlags instead of separate table for annotating instructions as destructive operations.
- Added negative tests for new SVE instructions that were added since uploading the previous patch.
Updated the patch to use TSFlags instead of a separate look-up table to annotate instructions as being destructive operations.
Jul 27 2018
Hi Sjoerd, thanks for your feedback! We actually use the TSFlags in our downstream assembler, but had a few reasons to re-implement it with a GenericTable instead:
- The TSFlags are used for all instructions, taking up global bits for only a relatively small selection of instructions, where these bits make little sense to non-movprfxable instructions. To better utilize the (potentially) cheap move/zeroing capabilities of MOVPRFX, we'll need more fields/bits to describe the kind of operation e.g. whether it is unary, binary, commutative, whether it has a reversed operation (e.g. sub and subr), which in our downstream compiler takes a total of 9 bits. With this in mind, I thought it made more sense to describe this in a separate table instead.
- The table is not queried that often (only when encountered together with a MOVPRFX), so there is little runtime overhead.
Jul 20 2018
It is probably worth noting that the positive and negative tests to explicitly test that instructions (and/or specific forms of those instructions) are or are not movprfx-able, are auto-generated using a separate tool (so my apologies for the large diff :)). The movprfx-diagnostics.s is worth pointing out as testing the specific rules of the MOVPRFX instruction.
Jul 4 2018
Jul 3 2018
Jul 2 2018
Jun 18 2018
Thanks @fhahn I've added a comment to regsEqual describing what it can be used for.
Jun 15 2018
Jun 14 2018
Jun 6 2018
Merged k_ExactFPImm with k_FPImm.
Jun 5 2018
Hi @olista01 , I just tried merging this change with k_FPImm and one of the problems I'm running into is with the '#0.0' and InstAliases.
For example in SVEInstrFormats.td, class sve_int_dup_imm:
InstAlias<"fmov $Zd, #0.0", (!cast<Instruction>(NAME # _S) ZPR32:$Zd, 0, 0), 1>;
The k_FPImm case works a little different.. Instead of it storing the APFloat value, it stores the (AArch64_AM) encoded FP imm value. If the value cannot be encoded, the parser throws an error. Also not all values can be encoded, such as '0.0', for which the parser (for k_FPImm) puts in a "#0.0" string literal into the parsed operands list. The two cases can be merged, but I think it may be better to do that in a separate patch as this will require some refactoring and changes to match e.g. the #0.0 case in existing FP instructions.
Jun 4 2018
Jun 3 2018
Jun 1 2018
Removed some redundant test-cases for fmov alias that are already tested by test/MC/AArch64/SVE/fdup.s
I think the difference with other immediates that have a limited set, is that these immediates have a non-trivial parsing/decoding/printing. Rather than just allowing a given range of integer values the string is first parsed as FP value and then encoded as integer value between 0-255. And the opposite for the decode. I think it makes sense to test these cases individually, although I can see how doing this for the FMOV alias as well could be a bit excessive, so I'll reduce those.
Since the immediate encoding only allows for a limited set of 256 different floating-point immediates, I thought it would make sense to test each of them separately to make sure they are all correctly parsed, assembled and printed, since basically these are all the edge-cases. The negative tests are much smaller and test a few floating point values that cannot be encoded. I can easily reduce the tests, but I think these tests are valuable. What do you think?
May 31 2018
May 30 2018
That sounds similar, but there is some distinction in that it uses predication. It is a predicated splat of an immediate value into the result vector with merging predication.
I used the word 'copy' instead of 'splat' since that is the name of the instruction. Splat would be more appropriate for D47482 .
May 29 2018
May 25 2018
Addressed nits and cleanup isSVEAddSubImm.
May 24 2018
- Removed some brackets and updated the patch to reflect change in D47309.
May 23 2018
LGTM! Thanks for adding the negative tests and rationale.