diff --git a/clang/docs/InternalsManual.rst b/clang/docs/InternalsManual.rst
--- a/clang/docs/InternalsManual.rst
+++ b/clang/docs/InternalsManual.rst
@@ -572,6 +572,180 @@
 The Frontend library contains functionality useful for building tools on top of
 the Clang libraries, for example several methods for outputting diagnostics.
 
+Compiler Invocation
+-------------------
+
+One of the classes provided by the Frontend library is ``CompilerInvocation``,
+which holds information describing the current invocation of the Clang frontend.
+The information typically comes from the command line constructed by the Clang
+driver, or directly from clients performing custom initialization. The data
+structure is split into logical units used by different parts of the compiler,
+for example ``PreprocessorOptions``, ``LangOptions`` or ``CodeGenOptions``.
+
+Command Line Interface
+----------------------
+
+The command line interface of the Clang ``-cc1`` frontend is defined alongside
+the driver options in ``clang/Driver/Options.td``. The information making up an
+option definition includes the name and prefix (for example ``-std=``), form and
+position of the option value, help text, aliases and more. Each option may
+belong to a certain group and can be marked with zero or more flags. Options
+accepted by the ``-cc1`` frontend are marked with the ``CC1Option`` flag.
+
+Command Line Parsing
+--------------------
+
+The option definitions are processed by the ``-gen-opt-parser-defs`` tablegen
+backend, transformed into a list of invocations of the ``OPTION`` macro and
+stored in ``clang/Driver/Options.inc``. This file is then used to create an
+instance of ``llvm::opt::OptTable``, which acts as a command line preprocessor.
+The preprocessed command line is stored in ``llvm::opt::ArgList``, an object
+that provides an API for performing simple queries on the contents of the
+command line.
+
+Finally, the ``CompilerInvocation::CreateFromArgs`` function is responsible for
+the actual parsing of command line arguments. It maps the contents of
+``llvm::opt::ArgList`` onto fields of ``CompilerInvocation``, normalizing the
+values in the process.
+
+Command Line Generation
+-----------------------
+
+Any valid instance of ``CompilerInvocation`` can also be serialized back into
+semantically equivalent command line arguments in a deterministic manner. This
+enables features such as explicit modules. TODO: Link to appropriate section of
+Modules.rst.
+
+Option Marshalling Infrastructure
+---------------------------------
+
+The code for parsing and generating the command line is largely generated
+automatically from the ``Marshalling`` annotations on the tablegen option
+definitions. The following sections describe the basics of adding marshalling
+annotations to command line options.
+
+First, it is necessary to create a class for constructing references to the
+``CompilerInvocation`` member.
+
+.. code-block::
+
+  class LangOpts<string field> : KeyPathAndMacro<"LangOpts->", field, "LANG_"> {}
+  // CompilerInvocation member                    ^^^^^^^^^^
+  // OPTION_WITH_MARSHALLING prefix                                    ^^^^^
+
+This facility is used to construct key paths that refer to a specific field of
+the ``CompilerInvocation`` member.
+
+.. code-block::
+
+  LangOpts<"ImplicitModules">
+  //        ^^^^^^^^^^^^^^^ LangOptions member
+
+Key paths can be passed to the marshalling utilities that specify what happens
+to the key path during command line parsing and generation::
+
+  def fignore_exceptions : Flag<["-"], "fignore-exceptions">, Flags<[CC1Option]>,
+    MarshallingInfoFlag<LangOpts<"IgnoreExceptions">>;
+
+TODO: Mention PARSE_ and GENERATE_ macros.
+
+Currently, there are several marshalling utilities for different kinds of
+command line options:
+
+**Positive Flag**
+
+The key path defaults to ``false`` and is set to ``true`` when the flag is
+present on the command line.
+
+.. code-block::
+
+  def fignore_exceptions : Flag<["-"], "fignore-exceptions">, Flags<[CC1Option]>,
+    MarshallingInfoFlag<LangOpts<"IgnoreExceptions">>;
+
+**Negative Flag**
+
+The key path defaults to ``true`` and is set to ``false`` when the flag is
+present on the command line.
+
+.. code-block::
+
+  def fno_verbose_asm : Flag<["-"], "fno-verbose-asm">, Flags<[CC1Option]>,
+    MarshallingInfoNegativeFlag<CodeGenOpts<"AsmVerbose">>;
+
+**Negative and Positive Flag**
+
+The key path defaults to the specified value and takes on the value of the
+flag that appears last on the command line.
+
+.. code-block::
+
+  defm pthread : BoolOption<"", "pthread",
+    LangOpts<"POSIXThreads">, DefaultFalse, // default => false
+    PosFlag<SetTrue>,                       // -pthread => true
+    NegFlag<SetFalse>,                      // -no-pthread => false
+    BothFlags<[CC1Option]>>;
+
+With most pairs of flags, ``-cc1`` intentionally accepts only the one that
+changes the default key path value. In this case, the driver is responsible for
+accepting both flags and either forwarding the changing flag or discarding the
+no-op flag.
+
+**String**
+
+The key path defaults to the specified string (or an empty one if omitted).
+When the option appears on the command line, its value is simply copied.
+
+.. code-block::
+
+  def isysroot : JoinedOrSeparate<["-"], "isysroot">, Flags<[CC1Option]>,
+    MarshallingInfoString<HeaderSearchOpts<"Sysroot">, [{"/"}]>;
+
+**List of Strings**
+
+The key path defaults to an empty ``std::vector<std::string>``. Values from
+each appearance of the option are appended to the vector::
+
+  def frewrite_map_file : Separate<["-"], "frewrite-map-file">, Flags<[CC1Option]>,
+    MarshallingInfoStringVector<CodeGenOpts<"RewriteMapFiles">>;
+
+**Integer**
+
+The key path defaults to the specified integer value (or ``0`` if omitted).
+When the option appears on the command line, its value is parsed via
+``llvm::APInt`` and assigned to the key path if successful.
+
+.. code-block::
+
+  def mstack_probe_size : Joined<["-"], "mstack-probe-size=">, Flags<[CC1Option]>,
+    MarshallingInfoStringInt<CodeGenOpts<"StackProbeSize">, "4096">;
+
+**Enumeration**
+
+The key path defaults to the specified value prefixed by the contents of
+``NormalizedValuesScope`` and ``::``. This ensures that a correct reference to
+an enum case is formed even if the enum resides in a different namespace or is
+an enum class. If the option value does not match any of the comma-separated
+values from ``Values``, an error diagnostic is issued. Otherwise, the
+corresponding element from ``NormalizedValues`` (at the same index) is assigned
+to the key path. The number of comma-separated values in ``Values`` must match
+the number of array elements in ``NormalizedValues``.
+
+.. code-block::
+
+  def mthread_model : Separate<["-"], "mthread-model">, Flags<[CC1Option]>,
+    NormalizedValuesScope<"LangOptions::ThreadModelKind">,
+    MarshallingInfoEnum<LangOpts<"ThreadModel">, "POSIX">,
+    Values<"posix,single">,
+    NormalizedValues<["POSIX", "Single"]>;
+
+TODO: Bitfields?, ImpliedByAnyOf, ShouldParseIf
+
+TODO: Reiterate that manual parsing/generation is still okay, especially in
+complex cases.
+
+TODO: Mention that generating command lines is not strictly necessary for
+downstream projects (yet).
+
 
 The Lexer and Preprocessor Library
 ==================================
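The marshalling sections in this patch describe one table of option definitions driving both command line parsing and command line generation. The following is a minimal, self-contained C++ sketch of that idea using an X-macro table. It is not Clang's implementation: ``ToyInvocation``, ``TOY_OPTIONS``, ``parseArgs`` and ``generateArgs`` are hypothetical names invented for illustration, and the rule of emitting an argument only when the field differs from its default is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for CompilerInvocation; not a Clang type.
struct ToyInvocation {
  bool IgnoreExceptions = false; // positive flag: defaults to false
  bool AsmVerbose = true;        // negative flag: defaults to true
  std::string Sysroot = "/";     // string option: defaults to "/"
};

// One table of options (hypothetical). Each row pairs a spelling with a
// key path; string rows also repeat the default value of the field.
#define TOY_OPTIONS(FLAG, NEG_FLAG, STRING)                                    \
  FLAG("-fignore-exceptions", IgnoreExceptions)                                \
  NEG_FLAG("-fno-verbose-asm", AsmVerbose)                                     \
  STRING("-isysroot", Sysroot, "/")

// Parsing: every table row expands into a check against the current
// argument. Unknown arguments are silently ignored in this sketch.
ToyInvocation parseArgs(const std::vector<std::string> &Args) {
  ToyInvocation Inv;
  for (std::size_t I = 0; I < Args.size(); ++I) {
#define PARSE_FLAG(SPELLING, FIELD)                                            \
  if (Args[I] == SPELLING) { Inv.FIELD = true; continue; }
#define PARSE_NEG_FLAG(SPELLING, FIELD)                                        \
  if (Args[I] == SPELLING) { Inv.FIELD = false; continue; }
#define PARSE_STRING(SPELLING, FIELD, DEFAULT)                                 \
  if (Args[I] == SPELLING && I + 1 < Args.size()) {                            \
    Inv.FIELD = Args[++I];                                                     \
    continue;                                                                  \
  }
    TOY_OPTIONS(PARSE_FLAG, PARSE_NEG_FLAG, PARSE_STRING)
#undef PARSE_FLAG
#undef PARSE_NEG_FLAG
#undef PARSE_STRING
  }
  return Inv;
}

// Generation: the same rows expand into code that emits an argument only
// when the field differs from its default. Output order follows the table,
// so the result is deterministic.
std::vector<std::string> generateArgs(const ToyInvocation &Inv) {
  std::vector<std::string> Args;
#define GEN_FLAG(SPELLING, FIELD)                                              \
  if (Inv.FIELD) Args.push_back(SPELLING);
#define GEN_NEG_FLAG(SPELLING, FIELD)                                          \
  if (!Inv.FIELD) Args.push_back(SPELLING);
#define GEN_STRING(SPELLING, FIELD, DEFAULT)                                   \
  if (Inv.FIELD != DEFAULT) {                                                  \
    Args.push_back(SPELLING);                                                  \
    Args.push_back(Inv.FIELD);                                                 \
  }
  TOY_OPTIONS(GEN_FLAG, GEN_NEG_FLAG, GEN_STRING)
#undef GEN_FLAG
#undef GEN_NEG_FLAG
#undef GEN_STRING
  return Args;
}
```

Under these assumptions, feeding the output of ``generateArgs`` back into ``parseArgs`` reproduces the same field values, which is the round-trip property that the Command Line Generation section relies on. Keeping the spelling, key path and default in a single row is what lets one annotation drive both directions.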