This is an archive of the discontinued LLVM Phabricator instance.

[globalisel][docs] Rework GMIR documentation and add an early GenericOpcode reference
ClosedPublic

Authored by dsanders on Oct 28 2019, 7:13 PM.

Details

Summary

Rework the GMIR documentation to focus more on the end user than the
implementation and tie it in to the MIR document. There was also some
out-of-date information which has been removed.

The quality of the GenericOpcode reference is highly variable and drops
sharply as I worked through them all but we've got to start somewhere :-).
It would be great if others could expand on this too as there is an awful
lot to get through.

Also fix a typo in the definition of G_FLOG. Previously, the comments said
we had two base-2's (G_FLOG and G_FLOG2).

Diff Detail

Event Timeline

dsanders created this revision.Oct 28 2019, 7:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 28 2019, 7:13 PM
arsenm added inline comments.Oct 28 2019, 7:27 PM
llvm/docs/GlobalISel/GenericOpcode.rst
5

This document brings the number of places documenting the generic opcodes to 3. TargetOpcodes.def attempts to document these, as does GenericOpcodes.td. There's a risk of these becoming out of date and inconsistent with each other. We should probably prune these to have one canonical place for documenting them. Ideally this document would be generated from the comments in either the .td or .def file

dsanders marked an inline comment as done.Oct 28 2019, 8:49 PM
dsanders added inline comments.
llvm/docs/GlobalISel/GenericOpcode.rst
5

I don't think we should count TargetOpcodes.def there. With a couple exceptions, the comments for the G_* instructions more or less re-state the name of the instruction but replaces 'G_' with 'Generic ' and tacks ' instruction' on the end. I think we could delete all of those comments and lose very little. That said, the G_TRUNC description has information that isn't present in GenericOpcodes.td which is interesting.

There's a risk of these becoming out of date and inconsistent with each other. We should probably prune these to have one canonical place for documenting them.

I agree. Having a one-line summary in each place makes some sense but duplicating the detail is bad. The end goal with the GMIR+GenericOpcodes pages in our sphinx documentation is to end up with the GMIR equivalent of LangRef.rst, giving the high-level picture, how the pieces fit together, and the specification of the GMIR. We can also cross-reference the GMIR+GenericOpcodes to our existing MIR documentation and LLVM-IR where appropriate (e.g. the reference to the LLVM-IR bitcast which G_BITCAST matches).

Ideally this document would be generated from the comments in either the .td or .def file

I don't really agree with this bit though, the important bit is that there's a single source of truth. There's also some downsides to the generating from comments approach. For example, it actually causes duplication as well as reducing it (although to be fair, this is a problem with the tools available more than an approach). SelectionDAG's ISD::NodeType documentation (https://llvm.org/doxygen/namespacellvm_1_1ISD.html#a22ea9cec080dd5f4f47ba234c2f59110) is a good example of this as the rendered docs is full of holes caused by not duplicating the comment for each enum value and this makes it hard to extract meaning from the rendered docs.

rovka added inline comments.Oct 29 2019, 3:03 AM
llvm/docs/GlobalISel/GMIR.rst
93

I'd suggest "can only access registers from A or registers from B, but never a mix of the two. If we want to perform an operation on data that's split between the register files, we must first copy all of the data into just one of them."

97

I would suggest "the cost of copying data around between different instructions that use it" (or some other way that highlights that we're looking at more that one instruction at a time).

98

"to bind"

100

You say register 5 times in this sentence :)

"Register Banks are a means to constrain the register allocator to use a particular register file for a given virtual register." would be a bit more straightforward.

dsanders updated this revision to Diff 227157.Oct 30 2019, 12:15 PM

Fix for Diana's review comments

dsanders marked 4 inline comments as done.Oct 30 2019, 12:16 PM
rovka added a comment.Oct 31 2019, 2:54 AM

Just a few more typos and stuff.

llvm/docs/GlobalISel/GMIR.rst
93

Typo "or register B"

llvm/docs/GlobalISel/GenericOpcode.rst
99

Typo: "the a value"

207

Typo: "multiple registers specified size"

220

Punctuation.

229

Ditto.

270

I think it's ok for some of the other instructions to have a more summary description and no code block at the moment, but it would be nice to brush this one up a bit since it's different from what people might expect.

274

I'm not a native speaker but I think it's resemblance.

313

Code block should use G_UADDE

319

Missing punctuation.

323

Code block should use G_UMULH

328

Missing punctuation and code block.

349

Missing punctuation for this and the following ones.

419

Typo: treat

438

Typo: "fused multiple add"

443

Ditto.

dsanders updated this revision to Diff 227343.Oct 31 2019, 2:14 PM
dsanders marked 16 inline comments as done.

Fix the various typos (except the 'treat' one which I didn't understand)

dsanders added inline comments.Oct 31 2019, 5:16 PM
llvm/docs/GlobalISel/GenericOpcode.rst
270

I've added a code block and an explicit mention that the offset is in addressable units (usually bytes) but I'm not sure what else to add at the moment.

When I get a chance I'm going to propose renaming this to G_PTR_ADD on the list

274

You're right, I just don't have spell checking enabled in vim :-)

419

I'm not sure what you mean on this one. It looks fine to me

rovka accepted this revision.Nov 4 2019, 1:24 AM

The text itself LGTM.

FWIW I too would prefer generating this automatically (e.g. as is done for clang's AST Matchers) but I don't know the caveats involved in doing that. I suppose we could start with this and automate later (similarly to how we started with hand-written code and then moved to TableGen).

Maybe wait a couple of days to see if anyone feels strongly about automating.

llvm/docs/GlobalISel/GenericOpcode.rst
271

Actually I just noticed it says "bytes" here: https://github.com/llvm/llvm-project/blob/2be17087f8c38934b7fc9208ae6cf4e9b4d44f4b/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h#L409

We should say the same thing in both places, either bytes or addressable units. Or ideally stop saying so much about the gMIR itself in the comments for the MachineIRBuilder.h.

This revision is now accepted and ready to land.Nov 4 2019, 1:24 AM
dsanders updated this revision to Diff 227754.Nov 4 2019, 11:52 AM

Updated MachineIRBuilder function to match GenericOpcodes.td

FWIW I too would prefer generating this automatically (e.g. as is done for clang's AST Matchers) but I don't know the caveats involved in doing that. I suppose we could start with this and automate later (similarly to how we started with hand-written code and then moved to TableGen).

I don't think there's anything insurmountable there but the main one is probably adding build dependencies to the documentation. We can probably achieve the same effect without having to build tblgen using with a Sphinx Extension that is able to read tablegen sufficiently well to handle GenericOpcodes.td as that file has no reason to use non-trivial features of tablegen.

Maybe wait a couple of days to see if anyone feels strongly about automating.

Sure

This revision was automatically updated to reflect the committed changes.