This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Compile-time improvement for Machine Late Instructions Cleanup Pass.
AbandonedPublic

Authored by vpykhtin on Mar 24 2023, 7:02 AM.

Details

Reviewers
jonpa
foad
Summary

This is the follow-up for https://github.com/llvm/llvm-project/issues/61397 -
Machine Late Instructions Cleanup Pass takes long time to process huge testcase.

I've made an attempt to improve patch sent by Jonas Paulsson at the github issue, Jonas can you
please try this on your testcases to see if it has any compile-time benefit. Basically it joins
RegDefs and RegKills into single map to reduce lookup time.

I'm submitting original patch from Jonas and then mine so they can be compared for convenience.

Diff Detail

Event Timeline

vpykhtin created this revision.Mar 24 2023, 7:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 7:02 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
vpykhtin requested review of this revision.Mar 24 2023, 7:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2023, 7:02 AM
vpykhtin updated this revision to Diff 508080.Mar 24 2023, 7:04 AM

Mine version of the patch.

vpykhtin updated this revision to Diff 508095.Mar 24 2023, 7:54 AM

Fixed debug build.

Thanks for trying the idea of a single map!

On SystemZ, the two versions show practically identical runtimes, so there is no winner. Is there any difference at all on your machine?

I am thinking maybe it would be possible to lookup Reg only once by calling removeRedundantDef() with MI and also the KillMI. That way, clearKillsForDef() wouldn't have to duplicate the same lookup.

On SystemZ, the two versions show practically identical runtimes, so there is no winner. Is there any difference at all on your machine?

Not really :) Something barely measurable like 2.13 and 2.09 seconds on the huge testcase. I'm curious which cases you use to compare to the current, non-map version?

I am thinking maybe it would be possible to lookup Reg only once by calling removeRedundantDef() with MI and also the KillMI. That way, clearKillsForDef() wouldn't have to duplicate the same lookup.

I need to dig more into what is going on there.

jonpa added a comment.Mar 30 2023, 3:30 AM

On SystemZ, the two versions show practically identical runtimes, so there is no winner. Is there any difference at all on your machine?

Not really :) Something barely measurable like 2.13 and 2.09 seconds on the huge testcase. I'm curious which cases you use to compare to the current, non-map version?

I am thinking maybe it would be possible to lookup Reg only once by calling removeRedundantDef() with MI and also the KillMI. That way, clearKillsForDef() wouldn't have to duplicate the same lookup.

I need to dig more into what is going on there.

If the map lookup time really matters, maybe it could be worth trying a vector of DefMI/KillMI pairs, so lookup time would be O(n) into the vector..? BTW, I really hope we don't have to use first/second accesses, something like RegInstrs[Reg].DefMI seems more readable...

If it's not possible to speed it up further, I think the simple version I tried is acceptable: it avoids the slowdowns in huge functions and is just barely slower on SPEC, which is then probably preferrable...

If the map lookup time really matters, maybe it could be worth trying a vector of DefMI/KillMI pairs, so lookup time would be O(n) into the vector..?

I don't think it's that critical

BTW, I really hope we don't have to use first/second accesses, something like RegInstrs[Reg].DefMI seems more readable...

RegInstrs[Reg] is a lookup in the map, I don't know if compilers are good enough to optimize several identical map lookups.

If it's not possible to speed it up further, I think the simple version I tried is acceptable: it avoids the slowdowns in huge functions and is just barely slower on SPEC, which is then probably preferrable...

I agree, I'll abandon this review then. Maybe it worth to use SmallDenseMap in your version instead of std::map.

jonpa added a comment.Apr 4 2023, 6:24 AM

If the map lookup time really matters, maybe it could be worth trying a vector of DefMI/KillMI pairs, so lookup time would be O(n) into the vector..?

I don't think it's that critical

BTW, I really hope we don't have to use first/second accesses, something like RegInstrs[Reg].DefMI seems more readable...

RegInstrs[Reg] is a lookup in the map, I don't know if compilers are good enough to optimize several identical map lookups.

If it's not possible to speed it up further, I think the simple version I tried is acceptable: it avoids the slowdowns in huge functions and is just barely slower on SPEC, which is then probably preferrable...

I agree, I'll abandon this review then. Maybe it worth to use SmallDenseMap in your version instead of std::map.

I tried this patch again a final time, this time passing the RegInfo pointer to removeRedundantDef() and further into clearKillsForDef(). My idea here is that this would only mean a single lookup for both the def and the kill. This was however also not any faster at all.

I also tried having a vector of RegInfo:s for each MBB - instead of a map. The idea was to make the lookup time linear, but this was much slower in small functions (while perhaps good idea in huge ones).

I have tried different map types before, so I don't think there is anything to gain from that.

I will post my patch as a separate review...

vpykhtin abandoned this revision.Apr 4 2023, 9:09 AM

I tried this patch again a final time, this time passing the RegInfo pointer to removeRedundantDef() and further into clearKillsForDef(). My idea here is that this would only mean a single lookup for both the def and the kill. This was however also not any faster at all.

I also tried having a vector of RegInfo:s for each MBB - instead of a map. The idea was to make the lookup time linear, but this was much slower in small functions (while perhaps good idea in huge ones).

I have tried different map types before, so I don't think there is anything to gain from that.

I will post my patch as a separate review...

Ok, thank you, I'll close this one.