This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Lame fix for PR20376
Needs ReviewPublic

Authored by pcc on Jul 21 2014, 6:55 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

The problem (as I understand it) is the DAG eventually ends up representing
a necessary EFLAGS copy. But we deliberately do not perform EFLAGS copies
because they are impossible on x86 (see InstrEmitter.cpp:171 or thereabouts),
so we end up using whatever happens to be in EFLAGS at the time of the branch.
This problem does not end up affecting compares/overflow checks because
they are schedule-independent, so we can easily move (or copy) the
instruction to wherever is convenient.

The workaround I came up with was to disable a simplification that may lead
to such a copy occurring, specifically the one that eliminates SETCC nodes, if
they refer to a CopyFromReg which may not be scheduled right before the branch,
in the hope that the SETCC will be scheduled in the right place. This turned
out to be not so great; a number of tests now fail because of extra copies.

One other idea I had was a peephole MI pass that removes SETCC instructions
if safe to do so. Does that sound like a good idea?

There may be a much better way of solving this. I'm not a backend expert;
this is what I have gleaned after staring at selection DAGs for far too
long.

Diff Detail

Event Timeline

pcc updated this revision to Diff 11733.Jul 21 2014, 6:55 PM
pcc retitled this revision from to [RFC] Lame fix for PR20376.
pcc updated this object.
pcc edited the test plan for this revision. (Show Details)
pcc added a subscriber: Unknown Object (MLST).

We actually do have a last-ditch EFLAGS copy available (though with worrying implications for the red-zone, if we ever start using it). It's used by test/CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll for example.

Enabling it more generally is probably just a matter of tweaking getCopyCost. Which might be an idea for robustness anyway.

But I'm not sure that actually affects the best solution here. I rather suspect performing the comparison again would be quicker in general. I suppose the question is how common and performance-critical such fragments are.

Cheers.

Tim.

pcc added a comment.Jul 29 2014, 12:08 PM