This is a work-in-progress patch that adds the ability to specify an AST dump format on the command line. By default, we continue to dump the AST in its usual tree view, but users can now optionally pass -ast-dump=json to get machine-readable JSON output, which makes it easier for third parties to consume the Clang AST.
The patch can currently handle dumping a fair amount of declaration information, some statements, and very few expressions. I got it to the point where it shows useful output in roughly the correct format, but I wanted to get community feedback before continuing the implementation. Once the current approach gains consensus, my plan is to commit the WIP and then finish the implementation in subsequent commits with post-commit review (unless the changes are somehow interesting enough to warrant pre-commit review, of course).
The hybrid approach of using some LLVM JSON functionality and some streaming functionality is deliberate, for performance reasons: collecting the entire AST into memory in a second form means ~2x the memory usage for the AST, which can be prohibitive for large compilation units. Testing this functionality with FileCheck is quite verbose, so if someone has suggestions for a better way to test the JSON output, I'd be happy to consider them.
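To illustrate the memory argument, here is a minimal sketch of the streaming half of the idea, using only the standard library. It is not the code from the patch: Node, streamNode, and dumpToString are hypothetical stand-ins, and the "kind" and "inner" key names are assumed for illustration. Each node is written to the output as it is visited, so only the recursion stack is live at any time and no second copy of the tree is built.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; not a Clang type.
struct Node {
  std::string Kind;
  std::vector<Node> Children;
};

// Writes one node as JSON directly to OS while walking the tree.
// Nothing is buffered per node beyond the recursion stack.
// Assumption: Kind needs no JSON escaping (plain identifiers only).
void streamNode(const Node &N, std::ostream &OS) {
  assert(!N.Kind.empty() && "every node needs a kind");
  OS << "{\"kind\":\"" << N.Kind << "\"";
  if (!N.Children.empty()) {
    OS << ",\"inner\":[";
    bool First = true;
    for (const Node &Child : N.Children) {
      if (!First)
        OS << ",";
      First = false;
      streamNode(Child, OS);
    }
    OS << "]";
  }
  OS << "}";
}

// Convenience wrapper for testing; a real dumper would stream to a file
// or stdout instead of collecting the output in a string.
std::string dumpToString(const Node &Root) {
  std::ostringstream OS;
  streamNode(Root, OS);
  return OS.str();
}
```

The alternative, building a complete JSON value for the whole tree and serializing it at the end, is simpler to write but keeps that second representation alive until the dump finishes.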
I think we've talked about this before, but I don't think growing interfaces like this is the best way forward. An enum is a poor substitute for an object (i.e., making the user of the API responsible for creating the dumper they want to use).
I think that could be made more convenient in the future. What do you think?
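To make the two shapes concrete, here is a rough sketch of what I mean. All names (DumpFormat, Dumper, TreeDumper, JSONDumper, dumpWithEnum, dumpWithObject) are hypothetical and only illustrate the design choice; none of them are existing Clang APIs.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Option A: the interface grows an enum and the callee switches on it.
// Every new format means touching this function.
enum class DumpFormat { Tree, JSON };

std::string dumpWithEnum(const std::string &DeclName, DumpFormat Format) {
  switch (Format) {
  case DumpFormat::Tree:
    return "Decl " + DeclName + "\n";
  case DumpFormat::JSON:
    return "{\"kind\":\"Decl\",\"name\":\"" + DeclName + "\"}";
  }
  assert(false && "unhandled DumpFormat");
  return {};
}

// Option B: the caller creates the dumper it wants and passes it in.
// The dumping entry point no longer knows which formats exist.
class Dumper {
public:
  virtual ~Dumper() = default;
  virtual void visitDecl(const std::string &DeclName, std::ostream &OS) = 0;
};

class TreeDumper : public Dumper {
public:
  void visitDecl(const std::string &DeclName, std::ostream &OS) override {
    OS << "Decl " << DeclName << "\n";
  }
};

class JSONDumper : public Dumper {
public:
  void visitDecl(const std::string &DeclName, std::ostream &OS) override {
    OS << "{\"kind\":\"Decl\",\"name\":\"" << DeclName << "\"}";
  }
};

std::string dumpWithObject(const std::string &DeclName, Dumper &D) {
  std::ostringstream OS;
  D.visitDecl(DeclName, OS);
  return OS.str();
}
```

With option B, a third party can add its own format by subclassing Dumper without any change to the dumping interface, whereas option A requires a new enumerator and a new case for each format.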