This is an archive of the discontinued LLVM Phabricator instance.

[FileCheck][NFC] Refactor FileCheckDiag into class hierarchy
Needs ReviewPublic

Authored by jdenny on May 30 2022, 12:04 PM.

Details

Summary

Problem

FileCheckDiag and its enum MatchType have outgrown their original purpose. The -dump-input presentation layer (in llvm/utils/FileCheck/FileCheck.cpp) and the FileCheck library's diagnostic emission (in llvm/lib/FileCheck/FileCheck.cpp) are too tightly coupled. The interactions are subtle to understand and maintain. It is difficult for the former to reason about the latter's emitted diagnostics in order to present them in the most readable manner.

Solution

This patch removes MatchType from FileCheckDiag and refactors FileCheckDiag as the base class of a class hierarchy. That class hierarchy is designed to enable the FileCheck library to focus on communicating FileCheck diagnostic information clearly and completely without participating in the specific presentation decisions of -dump-input. -dump-input is then freer to evolve more independently in the way it reasons about emitted diagnostics.

Diff Detail

Event Timeline

jdenny created this revision.May 30 2022, 12:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2022, 12:04 PM
jdenny requested review of this revision.May 30 2022, 12:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2022, 12:04 PM
jdenny added a comment.EditedMay 30 2022, 12:06 PM

Below are some more details that seem worthwhile to capture somewhere
but that seem like too much for the review summary.

Background

Over a year ago in D96653, when we realized that patch was not a good direction, we discussed a new way to mark a FileCheck directive's search range in input dumps. I've tinkered with that idea since then and concluded that FileCheckDiag first needs a rewrite, which is the goal of the current patch.

Several months ago, I finished implementing the search-range idea in subsequent patches, and I have been observing the effect on FileCheck's input dumps since then to be sure I'm happy with it. Eventually I'll post the additional patches to see what others think.

My point in mentioning the search-range work here is that it offers some evidence that the FileCheckDiag rewrite in the current patch is usable for further evolution of input dumps. However, I think this rewrite is worthwhile for code readability and maintainability even if the search-range idea is ultimately not accepted by the community.

Specific Issues Addressed

Result vs. Note

Similar to compiler errors/warnings/remarks vs. notes, the FileCheckDiag series emitted by the FileCheck library contains match results, each of which might be followed by a series of associated notes before the next match result. Without this patch, that association is not formally modeled by FileCheckDiag or clearly documented, and -dump-input isn't able to easily reason about it.

Specifically, without being aware of that association, -dump-input needs some other means to determine (1) whether some notes should be included by -dump-input-filter=error (the default filter), and (2) where to place some notes in the input dump. As a result, the FileCheck library is responsible for copying the following from a match result to any note that doesn't have its own versions: (1) the MatchType, which determines error status, and (2) the input range so that -dump-input can place the note next to the match result. This copied information is redundant as it's already part of the associated match result, and semantically it doesn't really apply to the note itself.

Furthermore, for a note that does not have a match location (e.g., the FileCheck library doesn't include match locations when emitting notes about variable substitutions), the FileCheck library is responsible for reducing the copied location to an empty range as a special case that -dump-input understands means it should not add a marker (like ^~~), which would be misleading in the input dump. However, that means a note can never have a real input range that is empty and that thus refers to just a single position.

These interactions are too subtle and are hard to explain. As a thought experiment to improve the separation of semantics from presentation, imagine an alternative HTML-based presentation (this is purely hypothetical: I have no plans to develop this). Imagine notes appear in pop-up windows when clicking on match results. It is then useless for the FileCheck library to copy the error status and location from a match result to its notes. However, the association between the match result and its notes *is* relevant in *both* the HTML-based presentation and in the existing -dump-input presentation.

This patch formally models the relationship between match results and notes with two classes directly derived from FileCheckDiag: MatchResultDiag and MatchNoteDiag. A MatchNoteDiag does not carry a bogus MatchTy, and a match range is exposed as an Optional. Thus, this patch relieves the FileCheck library of the above -dump-input-specific responsibilities, and -dump-input is now able to cleanly examine the FileCheckDiag series for each required property.

getMarker

Because of the FileCheckDiag class hierarchy, this patch is able to clean up the getMarker function for -dump-input. Without this patch, getMarker encodes the lead character, color, message, and error status individually for every possible MatchType. With this patch, getMarker instead encodes the logic of how the marker properties are generally chosen at the top of the FileCheckDiag hierarchy, and then it overrides those choices where needed lower in the hierarchy. Again, it is now easier for a diagnostic presentation layer to reason about diagnostics.

Diagnostic-Specific Data

Because of the class hierarchy, each FileCheckDiag can contain just the information that is useful for its kind of diagnostic. MatchNoteDiag does not need to store a FileCheckType or check location as that information can be retrieved from the associated MatchResultDiag. MatchCustomNoteDiag is the only class that needs to store a note string. MatchFoundDiag now records both a match range and a search range. Currently, -dump-input does not make use of the search range for MatchFoundDiag, but I plan to change that in the case of errors, as discussed above.

jdenny edited the summary of this revision. (Show Details)May 30 2022, 12:09 PM
jdenny edited the summary of this revision. (Show Details)