This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC][NFC] Rename record instructions to use _rec suffix instead of o
ClosedPublic

Authored by jsji on Nov 26 2019, 6:59 PM.

Details

Summary

We use o suffix to indicate record form instuctions,
(as it is similar to dot '.' in mne?)

This was fine before, as we did not support XO-form.
However, with https://reviews.llvm.org/D66902,
we now have XO-form support.

It becomes confusing now to still use 'o' for record form,
and it is weird to have something like 'Oo' .

This patch rename all 'o' instructions to use '_rec' instead.
Also rename isDot to isRecordForm.

Diff Detail

Event Timeline

jsji created this revision.Nov 26 2019, 6:59 PM
steven.zhang accepted this revision.Nov 26 2019, 8:55 PM

LGTM as it is straightforward. However, please hold on for some time in case someone else has other concern.

This revision is now accepted and ready to land.Nov 26 2019, 8:55 PM
jsji updated this revision to Diff 231719.Dec 2 2019, 8:27 AM

Rebased to resolve conflict.

nemanjai requested changes to this revision.Dec 2 2019, 1:44 PM
nemanjai added a subscriber: joerg.

I am going to mark this as requiring changes until we can get more consensus on the direction. This is a very pervasive change that is completely subjective and as such should require agreement from more than two people that work on the target.

Personally, I am on the fence with a slight lean towards keeping the o convention we had before. While I understand that things like ADDOo may appear a bit ambiguous, I would say that ANDIr and RLDICRr is downright weird. To me, the fact that the mnemonic is all uppercase and the o that is a stand-in for the dot is pretty clear, but that may just be the fact that I am so familiar with the existing naming. Furthermore, the flag we set in the multiclasses is called isDOT and there is no clear relationship between this "dot" and the letter r. All that said, I will not block this if the majority is in agreement (or indifferent). But let's get some more eyes on this.

Perhaps @hfinkel, @jhibbits, @echristo or @joerg might have some input on this patch.

This revision now requires changes to proceed.Dec 2 2019, 1:44 PM
jhibbits requested changes to this revision.Dec 2 2019, 1:55 PM

I second @nemanjai, the lowercase 'o' suffix stands out to me as a clear indicator of it being punctuation, not a letter in the instruction. I would much rather see ANDIo than ANDIr, the 'r' just doesn't look right when reading over it. My vote is to reject. It looks like a lot of churn without much benefit.

jsji added a comment.Dec 2 2019, 2:02 PM

Furthermore, the flag we set in the multiclasses is called isDOT and there is no clear relationship between this "dot" and the letter r. All that said, I will not block this if the majority is in agreement (or indifferent). But let's get some more eyes on this.

Thanks for the review. Yes, welcome others' input and comments.
To me isDOT is saying the mne will have dot '.' at the end , that is exactly what the ISA says about mne of record form. And we can also rename isDOT to isRecord if we do need it to be explicit.

jsji added a comment.EditedDec 2 2019, 2:15 PM

I second @nemanjai, the lowercase 'o' suffix stands out to me as a clear indicator of it being punctuation, not a letter in the instruction. I would much rather see ANDIo than ANDIr, the 'r' just doesn't look right when reading over it. My vote is to reject. It looks like a lot of churn without much benefit.

I understand that it might not be a big problem if you are familiar with the code base, but it is confusing to new comers.

For example, if you are discussing about some optimization related to ADD-o,
then you don't know exactly what you are talking about.

You will need to clarify more :

are you talking about big O or small o?
are you talking about assembly addo or opcode ADDO?
I need to change some o-form instructions..... wait, are you talking about record form or XO-form?

This is an effort to avoid confusion for newcomers, especially when dealing with assembly as well.

If r is not as straightforward as it should be, how about _r? eg: ANDI_r instead of ANDIo. Or any other suggestions?

When discussing optimizations related to "ADD-o", if it's "ADDO" I'd call it "ADD-O", but if it's "ADDo" I'd call it "ADD-dot", since all record forms are referred to as 'dot's, even in all the documentation I've seen. For instance, immediately above this text box is the "rlwinm_rldicl_to_andi.mir" test, with the change of "ANDIo8..." to "ANDIr8".... I read that instinctively as "ANDI-dot-8", whereas "ANDIr8" looks like "ANDI-R-8" which doesn't have a corresponding instruction in the ISA. Additionally, the asm string in PPCInstrInfo.td is "andi. ...", which would make someone question what the 'r' is for, when it doesn't even exist in the instruction. Seeing a 'ANDIo' alongside "andi." someone will easily see the 'o' stands for the '.'.

I understand where you're coming from, but with the (supposed) goal being to mimic the mnemonics as closely as possible, I still believe keeping 'o' is the right course.

Yeah, I actually added my initial comment precisely because I thought the existing form is more familiar to those that do not have deep experience with the ISA. What I mean is that the r is fairly intuitive to those of us that are deeply familiar with the ISA - it means "record-form". But to those less familiar with the ISA, the phrase "record-form" may be completely foreign and the choice of r would then be quite confusing. Whereas looking at the mnemonics in the ISA:

add
add.
addo
addo.

and the corresponding mnemonics in our td files:

ADD
ADDo
ADDO
ADDOo

it does not require a lot of familiarity to connect the uppercase letters to the letters in the mnemonic and the lowercase letter to the dot.

(note that I do realize that these have 4/8 on them and chose those to illustrate that the more confusing part to a newcomer is the 32/64 bit versions)

jsji added a comment.Dec 2 2019, 2:56 PM

When discussing optimizations related to "ADD-o", if it's "ADDO" I'd call it "ADD-O", but if it's "ADDo" I'd call it "ADD-dot",

Thanks. Yeah, agree that is a good trick, iff we already know we should call small 'o' as 'dot', instead of O, then it will be better.

But I believe that is not obvious to new members..
At least I did not recall any of our new members being able to read it as 'ADD-dot'.

Hrmm... the current naming convention predates me, but it wasn't difficult to figure out. If you're familiar with the ISA and the fact that the record forms have mnemonics ending in a '.'. Honestly, replacing 'o' with 'r' doesn't seem all that much better to me, as I still need to know that the 'r' stands for record form and understand what that means. It's not like searching the ISA manual for '.', or 'o', or 'r', is going to buy one much of anything. I am sympathetic to none of these being immediately obvious, and I'd be happy with some fix to that situation.

To make things more obvious, we could replace isDOT with isRecordForm, the 'o' suffix with _record (moving the numeric suffix, where present, to before the underscore). What do you think of that?

jsji added a comment.Dec 2 2019, 4:44 PM

To make things more obvious, we could replace isDOT with isRecordForm, the 'o' suffix with _record (moving the numeric suffix, where present, to before the underscore). What do you think of that?

Thanks! Regarding numeric suffixes, ANDI(S) are the only two that were not consistent, i planned to rename them independent of record suffix renaming, in
https://reviews.llvm.org/D70928.

I will update the patch to base on D70928, then do renaming of isDot and 'o' to '_record'.

jsji updated this revision to Diff 231815.Dec 2 2019, 8:13 PM

Rename record instructions to use _record suffix instead of o
Rename isDot to isRecordForm

jsji retitled this revision from [PowerPC][NFC] Rename record instructions to use r suffix instead of o to [PowerPC][NFC] Rename record instructions to use _record suffix instead of o.
jsji added a reviewer: joerg.
jsji updated this revision to Diff 231816.Dec 2 2019, 8:24 PM

Rename RLWINM(I)_recordbm to RLWINM(I)bm_record

jsji edited the summary of this revision. (Show Details)Dec 2 2019, 8:28 PM

I was originally going to suggest a more verbose suffix much as Hal suggested, since any single letter is kind of meaningless. But to be honest, I hate the added verbosity and feel that instruction mnemonics are just an inherently tricky thing that requires experience with the ISA document.

So since the status quo is at the top of my list of naming convention preferences in this case, I will abstain from further comments and will accept whatever the majority decides.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1392–1395

Ugh.

I still hold pretty strongly that this change is a more negative impact than positive. As shown in the tests, it's very easy to read the generated statements as "FOO-dot", and search for what 'foo.' actually is. The only difficult part is the '*8' vs '*4', since you can't search for that, and need to remember to drop the number, which isn't difficult after the first time.

Searching the POWER ISA 3.0B reference, I see IBM using "period(.)", while the "EREF_RM" (NXP/Freescale Book-E reference) uses "dot suffix", with "record-form" only being used when describing the actual encoding.

jsji marked an inline comment as done.Dec 3 2019, 11:09 AM

So since the status quo is at the top of my list of naming convention preferences in this case, I will abstain from further comments and will accept whatever the majority decides.

Thanks for the kindness to support efforts for helping newcomers. @nemanjai

I still hold pretty strongly that this change is a more negative impact than positive.

Can you be more specific about the negative impact of new proposal (_record suffix)?

What I can think of is:

  1. longer name
  2. breaking existing user's habit
  3. breaking downstream code.

#3 shouldn't be a problem if we agree that it is a correct direction to make the code more readable and less confusion, to make it easier for newcomers to contribute.
#2 might be annoying to existing developers, especially experts like you and @nemanjai. However, I believe experts like you will have mercy on newcomers.
#1 I think it depends on how you look at it. -- To me, it is more verbose, and also call more attention to such opcodes: paying attention that this is a record form instruction, it is different from normal form!

As shown in the tests, it's very easy to read the generated statements as "FOO-dot", and search for what 'foo.' actually is.

Again, it is easy for experts like you, but non obvious for newcomers without background.

Searching the POWER ISA 3.0B reference, I see IBM using "period(.)", while the "EREF_RM" (NXP/Freescale Book-E reference) uses "dot suffix", with "record-form" only being used when describing the actual encoding.

Thanks for double check both Power ISA and EREF_RM. Yeah, so, record-form is the ONLY common terminology here.

If you still feel strongly against the new suffixes (_record) that Hal proposed, do you have any other suggestions?

Stepping back, or do you think it is NOT confusing at all? or you think it is somehow a little confusing but no good way to fix it?

Thanks. @jhibbits

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1392–1395

Agree that this ISD name is somehow ugly. We can definitely rename them if needed.
However, looks like to me that they are for a 'AND glue' bug only, should eventually go away.

I personally think the status quo is easy to read. The definitions look like the instructions themselves, and the added verbosity seems very unnecessary. I don't know about others, but I consider people working in the PPCInstr*.td files to be either knowing exactly what they're looking for (grepping), or have a sufficient handle on the ISA that the existing notation is clear. At most I could see changing the 'o' suffix to '_o', which might help newcomers familiar with '_' notation to indicate a diminutive or subscript, but '_record' is very verbose, and as @nemanjai points out, can make some text very ugly.

jsji added a comment.EditedDec 3 2019, 12:12 PM

I personally think the status quo is easy to read. The definitions look like the instructions themselves, and the added verbosity seems very unnecessary. I don't know about others, but I consider people working in the PPCInstr*.td files to be either knowing exactly what they're looking for (grepping), or have a sufficient handle on the ISA that the existing notation is clear. At most I could see changing the 'o' suffix to '_o', which might help newcomers familiar with '_' notation to indicate a diminutive or subscript, but '_record' is very verbose, and as @nemanjai points out, can make some text very ugly.

'status quo' was OK if we don't have XO-form support.

The motivation of this patch is to avoid confusion between 'o' and 'O' suffixes, and make it easier to connect 'o' to record form in ISA,
apparently, changing 'o' to '_o' won't help with that.

I see the benefit of verbose (_record) here: lower the barrier of 2-3 levels for newcomers.

  1. You don't necessarily know 'o' at the end of opcode should be read as 'dot', not as o character itself.
  2. You don't necessarily know that 'o' will be corresponding to period(.) at the end of assembly mnemonics.
  3. You don't necessarily know that mnemonics with period(.) or dot suffixes is called record form in ISA.
  4. You don't necessarily know that record form will modify condition registers

It will take a newcomer at least above 4 steps to figure out what is the difference between SUBICo and SUBIC,
and what is worse, one may easily be blocked by misunderstandings in any of these steps.

But with SUBIC_record, one can easily figure out it in one step: searching SUBIC and record in ISA.

Yes, _record is a bit longer and verbose, and may be somehow *ugly* to someone,
but I see the benefits outweigh tidy/compact code.

What do you think?

lkail added a subscriber: lkail.Dec 4 2019, 1:44 AM

As a newbie of PowerPC backend, I think what @jsji proposed will ease my mind burden when faced with o ended instructions. I have read ISA-3.0 and know

A period (.) as the last character of an instruction
mnemonic means that the instruction *records* sta-
tus information in certain fields of the Condition
Register as a side effect of execution.

I can directly react to the effect of record-form instruction if it ends with _record. If _record is too long, how about _rec, just like X86's _alt ended instructions?

I might object less to a _rec suffix. However, you would still need to understand what "record form" means, which requires understanding of the ISA. Why not simply add a comment block to the top of PPCInstrInfo.td, describing things like this? I have a feeling other newbies would have similar questions/concerns/complaints, and they can't all be addressed by renaming everything.

I would propose something along the lines of the following as a block comment in PPCInstrInfo.td:

The following conventions are used for PowerISA definitions:

  • - 'o' suffix: Record-form of an instruction, updates CR
  • - '8' suffix: 64-bit version of an instruction

These can be chained, such as in the case of 'ANDI8o'

jsji added a comment.EditedDec 5 2019, 9:01 PM

I might object less to a _rec suffix. However, you would still need to understand what "record form" means, which requires understanding of the ISA. Why not simply add a comment block to the top of PPCInstrInfo.td, describing things like this? I have a feeling other newbies would have similar questions/concerns/complaints, and they can't all be addressed by renaming everything.

Although I still prefer _record as what Hal proposed. Verbose and Simplicity. _rec also acceptable to me.

I would propose something along the lines of the following as a block comment in PPCInstrInfo.td:

Sure, we can always add similar comments besides renaming.

The downside of only putting some block comment in certain file is:
most of the time you will still miss it if you don't happen to look into the specific section of that file.

I don't expect everyone be able to find and read this specific block comment in PPCInstrInfo.td,
when he/she start to look at some code, and confused about SUBICo and SUBIC.
He/she may be able to look back into definition of SUBIC, but if the comment is not nearby,
he might not have idea to where to find that block of comments.
And we can't expect everyone to be able to read all important files, or know which file to look at.

jsji added a comment.EditedDec 9 2019, 12:17 PM

Hi all, thanks for the review and discussion.
I'd like to make a final decision about commit or abandon, I don't see any point of leaving this longer.

Currently,
@jsji (me) is still prefer _record, can accept _rec.
@jhibbits is still against _record (prefer no change or at most _rec).
@lkail agreed with _record as a relatively new comer, can accept _rec.
@nemanjai is still kindly silent.

@steven.zhang @hfinkel Any further comments or suggestions? Thanks.

Hi all, thanks for the review and discussion.
I'd like to make a final decision about commit or abandon, I don't see any point of leaving this longer.

Currently,
@jsji (me) is still prefer _record, can accept _rec.
@jhibbits is still against _record (prefer no change or at most _rec).
@lkail agreed with _record as a relatively new comer, can accept _rec.
@nemanjai is still kindly silent.

@steven.zhang @hfinkel Any further comments or suggestions? Thanks.

So the common subset is _rec, right?

When my first time see the suffix "o" in the opcode, it is really hard for me to know that it is record form...(o stands for .) As mentioned before, there is no good way to use short character(such as 'o' or 'r') to indicate too much information. But we could have some information in the instruction definition if people really want to know what it is. Personally, I don't like the _record, even the _rec, as it will make the opcode longer(i.e. ANDI_record_1_GT_BIT8). I will vote for abandon if we want to change it to _rec or _record. But you can skip my opinion as I didn't have strong bias on this :)

jsji added a comment.Dec 9 2019, 8:08 PM

So the common subset is _rec, right?

Yes, you are right! I will update the patch to use _rec.

When my first time see the suffix "o" in the opcode, it is really hard for me to know that it is record form...(o stands for .)

Thanks for double confirming the problem.

I will vote for abandon if we want to change it to _rec or _record.

I think we shouldn't abandon just due to longer name.
As I mentioned above, yes, renaming may be somehow *ugly* to someone,
but I see the benefits outweigh tidy/compact code. (especially for newcomers).

jsji updated this revision to Diff 232992.Dec 9 2019, 8:21 PM

Rename to _rec instead.

Build result: pass - 60660 tests passed, 0 failed and 726 were skipped.

Log files: console-log.txt, CMakeCache.txt

jsji retitled this revision from [PowerPC][NFC] Rename record instructions to use _record suffix instead of o to [PowerPC][NFC] Rename record instructions to use _rec suffix instead of o.Dec 23 2019, 8:01 AM
jsji edited the summary of this revision. (Show Details)
hfinkel accepted this revision as: hfinkel.Dec 30 2019, 3:47 PM

Ping ...

LGTM

lkail accepted this revision as: lkail.Dec 30 2019, 6:15 PM

LGTM. Given some verbosity can really help us to understand code quickly.

jsji added a comment.Dec 30 2019, 6:35 PM

Thanks, Hal & Kai.
@jhibbits @nemanjai , are you OK with this now?

nemanjai accepted this revision.Dec 30 2019, 7:54 PM

I would say that this is as close to a [perhaps begrudging] consensus as we'll get. Lets get this bike shed built. Please go ahead with the commit. LGTM.

Unit tests: pass. 61160 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

jsji updated this revision to Diff 235717.Dec 31 2019, 8:50 AM

clang-tidy failure fixed in https://reviews.llvm.org/rGfcbf05bbdccc8a32f6a80316ea1c13be7e7eeae2.

Rebase and run git-clang-format HEAD^ as sugggested by merge_guards_bot.

Unit tests: pass. 61160 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: pass.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

jsji added a comment.EditedDec 31 2019, 11:58 AM

clang-tidy: fail. Please fix clang-tidy findings.

...llvm/lib/Target/PowerPC/PPCISelLowering.cpp:7399:28: warning: invalid case style for function 'LowerTRUNCATEVector' [readability-identifier-naming]
SDValue PPCTargetLowering::LowerTRUNCATEVector(SDValue Op,

^~~~~~~~~~~~~~~~~~~
 lowerTruncateVector

I have fixed one the the clang-tidy failure in https://github.com/llvm/llvm-project/commit/fcbf05bbdccc8a32f6a80316ea1c13be7e7eeae2,
but as @nemanjai pointed out 'It is questionable how/whether this renaming applies to well established conventions that do not conform…'

In this case it might be worse, as we have some non-conform naming in target independent code as well.

eg:

$ grep ":Lower.*(" llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
const char *TargetLowering::LowerXConstraint(EVT ConstraintVT) const {
SDValue TargetLowering::LowerAsmOutputForConstraint(
void TargetLowering::LowerAsmOperandForConstraint(SDValue Op,
SDValue TargetLowering::LowerToTLSEmulatedModel(const GlobalAddressSDNode *GA,

My understanding is we want to avoid large-scale renaming just due to naming convention, but we should gradually transform to keep the source code clang-tidy conform, and pre-merge testing green?
In some cases, this means we will gradually getting some new code that conform to clang-tidy, but break well established conventions .

Is that OK? @hfinkel
If not, what direction we should pursue? Thanks.

jsji added a comment.Jan 6 2020, 6:49 AM

@jhibbits: are you OK with this now? Most of the other reviewers are OK with this now. Thanks.

jhibbits resigned from this revision.Jan 6 2020, 7:08 AM

Not happy, but won't hold it up.

I'd rather see just the "isDOT" changed to "isRecord" and leave the rest as-is, but I'll defer to everyone else.

This revision is now accepted and ready to land.Jan 6 2020, 7:08 AM
jsji added a subscriber: jhibbits.Jan 6 2020, 7:18 AM

Not happy, but won't hold it up.

I'd rather see just the "isDOT" changed to "isRecord" and leave the rest as-is, but I'll defer to everyone else.

OK.. Sorry to make you unhappy..
Thank you for the kindness to sacrifice and support efforts for helping newcomers.

This revision was automatically updated to reflect the committed changes.