Flags some potential cases where splitting isn't happening and so could result
in confusing results. Also update some test files where there were near misses
in splitting that seemed unintentional.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
51 | This patch made me look at our current behavior: we don't require a new line after the pattern! I mean that it is happy to parse this: func private @foo() // -----func private @foo() -> $ ./bin/mlir-opt -split-input-file split.mlir module { func private @foo() } module { func private @foo() } So I would check exactly --\n here. | |
53 | Can you make splitMarker a StringRef from the beginning? | |
67 | I think this loop deserve a bit more doc |
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
51 | We actually semi rely on this when we generate "print-ir-*" output. We generate output such that each output block can be treated as a separate IR block. |
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
51 | Oh wow! // -----// IR Dump Before CSE //----- // func @foo() { return } // -----// IR Dump Before CSE //----- // func @baz() { return } // -----// IR Dump Before Canonicalizer //----- // func @foo() { return } // -----// IR Dump Before Canonicalizer //----- // func @baz() { return } I didn't notice this trick about the extra // before, nifty! |
It is more a function of number of splits than file size: on files you would check ~4 characters 2x instead of 1x per potential split. You have 1 additional loop over splits, and in case where there is no near misses you have 4N characters additionally checked and stringref's inserted into another vector (which I could reserve space for all to avoid any allocations in the loop), so it is linear in number of splits. If you have near misses throughout the file you end up with 4N additional checks + N-1 StringRef creations and getFromPointer calls - so that could be heavy, but also unexpected that we'd have such near misses throughout the file.
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
51 | I'll revert the newline check then (or I could additionally allow for "" or " " suffices, but not sure that valuable). Although what is the use case? E.g., if you dump all the sections, would you feed this output file into mlir-opt to run a pass over all stages of a pipeline? (I'd have thought you'd just copy & past the part of interest out into a new file post print-ir-* rather than feed in the whole file). |
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
51 | The main concrete use case is that you can open the dump in vscode(or another IDE) and get language server capabilities (without having the different dumps negatively interact with each other). |
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
56–57 | Will this crash if the split is the last line? |
mlir/lib/Support/ToolUtilities.cpp | ||
---|---|---|
56–57 | Good question, it didn't but I think that was as mostly null-terminated main buffer here, but checking explicitly better. |
This patch made me look at our current behavior: we don't require a new line after the pattern!
I think we should?
I mean that it is happy to parse this:
->
So I would check exactly --\n here.