This is an archive of the discontinued LLVM Phabricator instance.

[X86] Reduce unnecessary instregex for AlderlakeP schedule model
ClosedPublic

Authored by HaohaiWen on Nov 11 2022, 7:09 AM.

Details

Summary

Using instregex for simple instruction opcode is much slower than
instrs. This patch replaces them with instrs.

Github issue: 35303

Diff Detail

Event Timeline

HaohaiWen created this revision.Nov 11 2022, 7:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2022, 7:09 AM
HaohaiWen requested review of this revision.Nov 11 2022, 7:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2022, 7:09 AM

@RKSimon , you can compare diff of patch1 and patch2 to see what changed for instregex.

All I'm actually requiring is that the instregex that end in "_Int$" are replaced with (_Int)?$

HaohaiWen updated this revision to Diff 474763.Nov 11 2022, 7:32 AM

Replace simple instregex with instrs

Sorry - disregard my comment - I thought this was for PR58792!

All I'm actually requiring is that the instregex that end in "_Int$" are replaced with (_Int)?$

I have converted something like instregex "^SHA1MSG2rr$" to nstrs SHA1MSG2rr.

_Int version is not CodeGenOnly but the same prefix instruction without _Int is usually CodeGenOnly.
As I explained in https://github.com/llvm/llvm-project/issues/58792. We have some problem to auto gen schedule description for CodeGenOnly instruction due to wrong predicates issue. Therefore, non-Int version opcode will actually use scheduling info for its default schedreadwrite.
I know this is a bug and I'm trying to fix it.

@RKSimon , If you need a quick fix for _Int issue, please feel free to modify this schedule model. I will auto gen a new model after solving this problem.

RKSimon accepted this revision.Nov 11 2022, 9:08 AM

Thanks - the instregex -> instrs diffs LGTM

Do you have any suggestions for scheduler classes that we could add/change to help reduce the number of overrides?

After finishing my cleanup of the the conversions I want to look at the old x86 (not BMI) shift / rotate instructions are they aren't setup properly for RMW etc. Have you noticed anything else?

This revision is now accepted and ready to land.Nov 11 2022, 9:08 AM

Do you have any suggestions for scheduler classes that we could add/change to help reduce the number of overrides?

schedtool has tried to reuse same WriteRes and minimize the amount of extra SchedWriteRes.
Here's simplified procedure to auto emit WriteRes and InstRW:

1. Manually pre associate some schedwrite to it resources so that for each instruction, there's at most 1 schedule write that haven't been associated  with resources. 
2. For each schedule write that haven’t been associated with schedule resources
	2.1 Find out instructions which use it as part of default schedule write.
	2.2 For each instruction, infer Latency, NumMicroOps, Ports so that if this instruction finally use this default schedwrite, then it will get same Lat/NumMicroOps/Ports as we provided in input json.
	2.3 Find out which schedwrite parameters is the most common one and set this to it.
3. For rest instructions, since they can’t get correct result with default schedwrite, we need to create a new schedule write, associate it with correct resources, and use InstRW replace instructions main schedwrite with it.

Does overrides have side effect than default one? I think It's not easy to modify default schedreadwrite since it is used by all X86 schedule modes. If we want to change default schedreadwrite to another one, then we may need to update X86 schedmodels.

After finishing my cleanup of the the conversions I want to look at the old x86 (not BMI) shift / rotate instructions are they aren't setup properly for RMW etc. Have you noticed anything else?

I didn't notice this carefully.