The description on the patch is really unhelpful.
It should explain why the change is needed/should happen, not how great the new intrinsic is, how bad it is not to support it...
Also, as usual, please upload all patches with full context.
We already try (but fail due to lack of sub op)
to convert this to fshr in AggressiveInstCombine.cpp:92.
There wouldn't be any sub op here, because these are constants,
and we don't seem to canonicalize this rotate pattern with constants to fshl in instcombine/aggressiveinstcombine.
That is a separate issue, i'm not sure if it's a bug or not, so please file a bug.
Yes, I also have the same concerns as Roman. If we don't canonicalize to fshr in instcombine, then we shouldn't canonicalize to fshr in simplifycfg. They're both canonicalization passes.
Given the testing burden, if you really want to go down this route I would recommend changing instcombine *first*, from which you can observe the fallout with a large testing base. Then change this code, which triggers in many fewer cases.
I agree - the canonicalization should happen in instcombine (and that might make doing it in simplifycfg an academic exercise since we can always count on instcombine to do it).
As far as I know we don't have any known issues in funnel shift optimization or codegen.
That is my understanding too. Targets that support some kind of rotate/funnel instruction should produce those from the intrinsic. Targets that don't have that should translate exactly to the sh/sh/or sequence when building SDAG. We already canonicalize to the funnel shift intrinsics for a variable shift amount in several cases, but nobody bothered to add the constant shift amount pattern because it wasn't causing problems like the variable shift case. But using the intrinsics will still theoretically improve passes like inlining and vectorization (assuming they have their cost models straight).
Final note: please don't look at AggressiveInstCombine as your first source for canonicalization truth; use InstCombine for that and propose changes there. "AIC" is a home for expensive/unusual patterns and (at least currently) doesn't run without -O3.