I am posting this just for your reference. This was my initial attempt at fixing the SETcc problem back in February. As you can see, the changes weren't all that extensive. The basic approach was to define new SETCC pseudo-opcodes that return 32-bit/64-bit values, along with patterns that cause a MOV32r0/SETCC sequence to be generated. The pseudo-ops are then lowered to the "real" SETCC opcodes after register allocation (RA).
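To make the post-RA lowering step concrete, here is a toy model in C++. It is only a sketch and not the patch itself: the Inst struct, the SETE32_PSEUDO opcode name, and the helper functions are all made up for illustration and do not correspond to LLVM's MachineInstr API. It handles only one condition and only four registers.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction record. This is NOT LLVM's MachineInstr; every name below
// is hypothetical and exists only for illustration.
struct Inst {
  std::string opcode; // e.g. "SETE32_PSEUDO" (made-up pseudo-opcode name)
  std::string reg;    // physical register chosen by register allocation
};

// Toy mapping from a 32-bit register to its low 8-bit subregister.
// Only the four registers listed here are handled in this sketch.
std::string lowByte(const std::string &reg32) {
  if (reg32 == "eax") return "al";
  if (reg32 == "ecx") return "cl";
  if (reg32 == "edx") return "dl";
  if (reg32 == "ebx") return "bl";
  assert(false && "register not handled by this sketch");
  return "";
}

// Post-RA lowering: replace each 32-bit SETcc pseudo with the real 8-bit
// SETcc writing the low byte of the same register. All other instructions,
// including the MOV32r0 that zeroed the register, pass through unchanged.
std::vector<Inst> lowerSetccPseudos(const std::vector<Inst> &in) {
  std::vector<Inst> out;
  for (const Inst &mi : in) {
    if (mi.opcode == "SETE32_PSEUDO")
      out.push_back({"SETE", lowByte(mi.reg)});
    else
      out.push_back(mi);
  }
  return out;
}
```

The point of the pseudo is that, before RA, the result looks like an ordinary 32-bit value in a single virtual register; only after a physical register has been assigned does the lowering pick the matching 8-bit subregister for the real SETcc.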
The approach seems sound to me, and it fixed the performance kernel that I wrote. But there are a few things that I don't like.
(1) The opcode explosion is kind of gross: we should really have a condition field rather than building the condition into the opcode.
(2) I couldn't figure out a good way to handle the different register constraints in 32-bit vs. 64-bit mode, so I created separate opcodes for them.
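On point (1), one way to picture a condition field: the x86 SETcc encodings differ only in a 4-bit condition code (the second opcode byte is 0x90 plus the code), so a single pseudo could in principle carry the condition as an operand instead of having one opcode per condition. The sketch below is an assumption about how that might look, with hypothetical names (Cond, SetccPseudo, setccOpcodeByte); it is not what the patch does and not LLVM's actual definitions.

```cpp
#include <cstdint>

// x86 condition codes as they appear in the low nibble of the SETcc opcode.
enum class Cond : std::uint8_t {
  O = 0, NO = 1, B = 2, AE = 3, E = 4, NE = 5, BE = 6, A = 7,
  S = 8, NS = 9, P = 10, NP = 11, L = 12, GE = 13, LE = 14, G = 15
};

// Hypothetical single pseudo: the condition and result width are fields,
// not part of the opcode name.
struct SetccPseudo {
  Cond cc;
  unsigned widthBits; // 32 or 64 in this sketch
};

// Second opcode byte of the real instruction (the first byte is 0x0F).
constexpr std::uint8_t setccOpcodeByte(Cond cc) {
  return static_cast<std::uint8_t>(0x90u | static_cast<unsigned>(cc));
}

// Lowering needs only the condition field to select the encoding.
constexpr std::uint8_t lowerOpcodeByte(const SetccPseudo &p) {
  return setccOpcodeByte(p.cc);
}
```

With this shape, adding a result width or a target mode would not multiply the number of opcodes by the number of conditions.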
During testing, I got lots of failures in our test suites and never got around to debugging them. The cause could be something obvious or trivial.
So, feel free to take this code and run with it or just give me feedback on the basic approach.