By generating the dialect constructor in the .h file, we were forcing
dialects to include many additional header files because:
- Fields of the dialect, e.g. std::unique_ptr<>, could not use forward-declared types.
- Dependent dialects are loaded in the constructor, requiring the full definition of each dependent dialect (which, depending on the file structure of the dialect, may include the operations).
Generating the constructor in the .cpp file instead gives much faster
builds and better aligns with the rest of the code base.
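
As a rough illustration of the forward-declaration point above, here is a
minimal single-file sketch. The names MyDialect and InterfaceImpl are
hypothetical and are not taken from the generated code; the comments mark
which part would sit in a header and which in a source file. It assumes only
standard C++ behavior: std::unique_ptr<T> needs T to be complete wherever the
owning class's constructor and destructor are defined.

```cpp
#include <memory>
#include <string>

// --- Would live in the header (hypothetical MyDialect.h) ---
// Forward declaration only: the header does not need the full definition.
struct InterfaceImpl;

class MyDialect {
public:
  // Declared here, defined out of line. If these were defined inline in the
  // header, InterfaceImpl would have to be a complete type at this point.
  MyDialect();
  ~MyDialect();

  std::string describe() const;

private:
  std::unique_ptr<InterfaceImpl> impl;
};

// --- Would live in the source file (hypothetical MyDialect.cpp) ---
// The full definition is only needed here, next to the constructor and
// destructor bodies.
struct InterfaceImpl {
  std::string name = "my_dialect";
};

MyDialect::MyDialect() : impl(std::make_unique<InterfaceImpl>()) {}
MyDialect::~MyDialect() = default;

std::string MyDialect::describe() const { return impl->name; }
```

With this layout, files that include only the header part never see the
definition of InterfaceImpl, which is the kind of include reduction
described above.
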
Fixes #55044