Page MenuHomePhabricator

Machine Level IR Serialization: print and parse LLVM IR.

Authored by arphaman on May 8 2015, 12:47 PM.



This patch is the initial patch for the machine level IR serialization. It creates a new library in CodeGen called Serialization.
This patch introduces the 'mir' file extension that is used by the machine level IR files.

This new library adds a new machine function pass that is responsible for printing out the machine level IR in the new
It also adds the functions and a class that allows llc to parse machine level IR. I've adopted a two stage parsing
approach - at first the optional LLVM IR is parsed so that the corresponding LLVM module can be created. Then
the machine level IR will be parsed at a later stage after llc creates the TargetMachine with the target information
from the already parsed module.

Diff Detail


Event Timeline

arphaman updated this revision to Diff 25360.May 8 2015, 12:47 PM
arphaman retitled this revision from to Machine Level IR Serialization: print and parse LLVM IR..
arphaman updated this object.
arphaman edited the test plan for this revision. (Show Details)
arphaman added reviewers: dexonsmith, bob.wilson, bogner.
arphaman set the repository for this revision to rL LLVM.
arphaman added a subscriber: Unknown Object (MLST).
arphaman updated this revision to Diff 25382.May 8 2015, 4:11 PM

Use @LINE filecheck expression instead of the line number in one of the tests.

arphaman updated this revision to Diff 25598.May 12 2015, 10:04 AM

I've decided to split the patch up so it will be easier to review.
The updated patch has some of the error handling and second stage parsing declarations taken out.

reames added a subscriber: reames.May 12 2015, 3:06 PM

A few random comments, not a systematic review.

General comment: Likely too much complexity for the current functionality. Start with the simple code, add complexity as justified.

10 ↗(On Diff #25598)

I'd strongly suggest adding a clear comment explaining the in progress nature of the work.

Also, terminology wise, I'd strongly suggest you strict with machine instruction representation rather than machine level IR. It's not clear what the later is, while the former is well defined.

18 ↗(On Diff #25598)

Include order.

30 ↗(On Diff #25598)

It looks like this is a helpful class which should live in the cpp file?

28 ↗(On Diff #25598)

Is there a MachineModulePass? The use of the HaveToPrintModule flag feels very dirty.

1 ↗(On Diff #25598)

This is overly generally named. I'm not really clear what the purpose of this file is at all.

221 ↗(On Diff #25598)

Without context, it's impossible to tell if this in the right place. Please upload with full context.

arphaman updated this revision to Diff 25725.May 13 2015, 1:01 PM

I've updated the patch using Philip's suggestions.

arphaman added inline comments.May 13 2015, 1:04 PM
10 ↗(On Diff #25598)


I also replaced the machine level IR terminology with MIR in this patch.

18 ↗(On Diff #25598)


30 ↗(On Diff #25598)

This class will be useful as a public class later, but for now I removed it from this patch.

1 ↗(On Diff #25598)

I've renamed it and updated the comments. Hopefully it is better now.

221 ↗(On Diff #25598)

The updated patch includes more context now.

arphaman added inline comments.May 13 2015, 1:08 PM
28 ↗(On Diff #25598)

There's no machine module pass, but I decided to use a module pass and to get rid of this flag.
This means I will have to add a new pass that prints out machine functions in later patches. It will have to run right after the module printing pass.

arphaman updated this revision to Diff 25825.May 14 2015, 4:45 PM

I rebased this patch and reflowed a long line in lib/CodeGen/LLVMBuild.txt based on
r237367 (it reflows long lines of LLVMBuild files).

arphaman updated this revision to Diff 25875.May 15 2015, 10:27 AM

I've updated the patch with fixes suggested by Duncan:

  • Changed the mir printing pass name.
  • Replaced 'Serialization' with 'MIR'.
  • Modified the parsing interface to use a single error.
  • Fixed namespace '}' comments.
MatzeB added a subscriber: MatzeB.May 15 2015, 12:02 PM

This is mostly a skeleton of the API, so I can't say much about the algorithms/implementation. Also this is one of your first commits so this will be a lot of nitpicking:

In general: For what the code does at the moment there are too many .cpp and .h files for my taste, better start simple and self-contained and break it up later when it has grown (the breaking apart will happen naturally, merging stuff together usually never happens).

30–31 ↗(On Diff #25875)

I changed the doxygen settings yesterday so you don't need to prepend \brief anymore if you have exactly one sentence as brief description.

291 ↗(On Diff #25875)

It may not be llvm coding style but all the others declarations below and above have no extra space in fron of the & here...

1093–1094 ↗(On Diff #25875)

This does not modify the object and should be marked const.

29–31 ↗(On Diff #25875)

My understanding is that this class is only used internally to implement the parser and is not intended for external users. It may be a good idea to move it to the MIRParser.cpp file then and not provide a header for it to make this fact obvious.

1 ↗(On Diff #25875)

Call this MIRPrintingPass.cpp unless you intend to really implement multiple different printing passes.

26–27 ↗(On Diff #25875)

Don't repeat the type name in the documentation comment.

22–23 ↗(On Diff #25875)

Generally the auto keyword is very unfriendly towards readers of your code. On the other hand having very longish types isn't friendly either. So I'm find with avoiding the Error<std::unique_ptr<...>> monster in the getFile() - although the best solution here would be a typedef in the MemoryBuffer class that can be used instead, but that should be part of a separate patch. The 2nd auto should definitely be replaced by std::error.

1 ↗(On Diff #25875)

Call it MIRPrinter.cpp like the declarations inside.

32 ↗(On Diff #25875)

Use llvm_unreachable instead of assert(false);

1 ↗(On Diff #25875)

Call it MIRPrinter.h like the class that is declared within.

24–33 ↗(On Diff #25875)

This looks to me like the external interface can be reduced to something like "void printMIR(raw_ostream &OS, const Module &Mod);" and leave the class internal to the .cpp file.

arphaman updated this revision to Diff 25901.May 15 2015, 3:50 PM

I've updated the patch using the recommendations made by Matthias.

arphaman updated this revision to Diff 25902.May 15 2015, 3:53 PM

Small formatting update that I forgot in the last update.

arphaman updated this revision to Diff 25907.May 15 2015, 5:24 PM

I decided to switch back to the MachineFunctionPass, but this time the 'HaveToPrintModule' flag isn't required - I print out the
module in the 'doFinalization' method. This means I will have to print out the machine functions into a
temporary string and then print out the string after the module is printed in the 'doFinalization' method.

arphaman updated this revision to Diff 25910.May 15 2015, 6:10 PM

Remove unnecessary #include in MIRPrinter header.

MatzeB accepted this revision.May 18 2015, 2:24 PM
MatzeB added a reviewer: MatzeB.

Newest patch LGTM once Matthias and Philip sign off.

I'm fine with it.

This revision is now accepted and ready to land.May 18 2015, 2:24 PM

I haven't reviewed the more recent versions but am happy to defer to
Matthias and Duncan who have.


This revision was automatically updated to reflect the committed changes.
mehdi_amini added inline comments.

The absolute path to llc sounds wrong to me?
(same in the other test).

2015-05-19 16:01 GMT-07:00 Mehdi AMINI <>:



Comment at: llvm/trunk/test/CodeGen/MIR/llvmIRMissing.mir:1
@@ +1,2 @@
+# RUN: ~/build/llvm/bin/llc -start-after branch-folder -stop-after
branch-folder -o /dev/null %s
+# This test ensures that the MIR parser accepts files without the LLVM IR.

The absolute path to llc sounds wrong to me?
(same in the other test).

Yes, this was a mistake that I fixed. However there was an unrelated bug
that caused
the incremental buildbots to fail so I reverted this commit.



I would like to resubmit the reverted commit now that I committed the
AsmParser change in r237833.

I've attached an updated patch and a smaller patch that shows the change
between the reverted commit and the
updated patch.

The updated patch adds a terminating null character to the LLVM IR sourced
that's passed from MIRParser
to LLParser.


2015-05-19 11:21 GMT-07:00 Alex Lorenz <>: