This is an archive of the discontinued LLVM Phabricator instance.

I'm not sure about that. There's no description in Intel SOM. I also googled it when writing #60043 and only found that LEA was improved in Golden Cove, e.g., https://www.hardwaretimes.com/intel-golden-cove-core-architecture-deep-dive-vs-zen-3-and-sunny-cove/

https://uops.info/table.html?search=lea&cb_lat=on&cb_tp=on&cb_uops=on&cb_ports=on&cb_CLX=on&cb_ICL=on&cb_measurements=on&cb_base=on

There are no 3-cycle, port-1-only R32/R64 LEAs on Icelake. They are now single-cycle on ports 1 and 5, and some cases that were previously single-cycle on ports 1 and 5 now run on ports 0/1/5/6.

In D141974#4060943, @pengfei wrote:

I'm not sure about that. There's no description in Intel SOM. I also googled it when writing #60043 and only found that LEA was improved in Golden Cove, e.g., https://www.hardwaretimes.com/intel-golden-cove-core-architecture-deep-dive-vs-zen-3-and-sunny-cove/

Fairly certain it's been changed. uops.info is generally reliable, and I also tested just now on ICX:

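	.global	_start
	.p2align 6
	.text
_start:
	movl	$10000000, %eax		# loop counter: 10M iterations
	xorl	%edx, %edx		# zero %rdx, which carries the dependency chain
loop:
	# 3-operand LEA (base + index*scale + disp); each iteration
	# depends on the %rdx produced by the previous one.
	leaq	1(%rdx, %rax, 8), %rdx
	decl	%eax
	jnz	loop

	movl	$60, %eax		# Linux x86-64 exit syscall
	xorl	%edi, %edi		# exit status 0
	syscall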

Results in:

10,002,046      cycles                                                      
 4,678,897      p0                                                          
 4,678,955      p1                                                          
 5,321,396      p5                                                          
 5,321,559      p6

With 10,000,000 dependent iterations, this should be about 30,000,000 cycles if the LEA had 3c latency.
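The arithmetic behind that estimate can be sketched in a few lines of Python. This assumes the loop-carried dependency through %rdx is the only thing limiting the loop, so cycles per iteration approximates the LEA latency. The function and constant names are hypothetical and only illustrate the calculation. The numbers are the ones reported above.

```python
# Rough latency estimate from the perf counters quoted above.
# Assumption: the loop is bound by the dependency chain through %rdx,
# so cycles / iterations approximates the latency of the LEA.

ITERATIONS = 10_000_000        # initial value of %eax in the test loop
MEASURED_CYCLES = 10_002_046   # "cycles" counter from the run on ICX


def cycles_per_iteration(total_cycles: int, iterations: int) -> float:
    """Average number of cycles spent per loop iteration."""
    return total_cycles / iterations


def expected_cycles(latency: int, iterations: int) -> int:
    """Cycle count expected if each dependent LEA took `latency` cycles."""
    return latency * iterations


measured = cycles_per_iteration(MEASURED_CYCLES, ITERATIONS)
slow_lea_estimate = expected_cycles(3, ITERATIONS)

print(f"measured cycles per iteration: {measured:.4f}")
print(f"cycles expected with 3c latency: {slow_lea_estimate:,}")
```

The measured count is roughly one cycle per iteration, a third of what a 3c LEA would need.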

LGTM. Thanks for the information!

This revision is now accepted and ready to land. Jan 18 2023, 5:32 AM

This revision was landed with ongoing or failed builds. Jan 19 2023, 11:30 AM

Closed by commit rG1d67f2cd7850: Removing 'TuningSlow3OpsLEA' from ICL config (authored by goldstein.w.n).

This revision was automatically updated to reflect the committed changes.

goldstein.w.n added a commit: rG1d67f2cd7850: Removing 'TuningSlow3OpsLEA' from ICL config.

Revision Contents

Path: llvm/lib/Target/X86/X86.td
Size: 1 line

Diff 490614

llvm/lib/Target/X86/X86.td

   list<SubtargetFeature> ICLAdditionalFeatures = [FeatureBITALG,
                                                   FeatureVNNI,
                                                   FeatureVPCLMULQDQ,
                                                   FeatureVPOPCNTDQ,
                                                   FeatureGFNI,
                                                   FeatureRDPID,
                                                   FeatureFSRM];
   list<SubtargetFeature> ICLTuning = [TuningFastGather,
                                       TuningMacroFusion,
-                                      TuningSlow3OpsLEA,
                                       TuningSlowDivide64,
                                       TuningFastScalarFSQRT,
                                       TuningFastVectorFSQRT,
                                       TuningFastSHLDRotate,
                                       TuningFast15ByteNOP,
                                       TuningFastVariableCrossLaneShuffle,
                                       TuningFastVariablePerLaneShuffle,
                                       TuningPrefer256Bit,

[X86] Removing 'TuningSlow3OpsLEA' from ICL config (Closed, Public)
