This is an archive of the discontinued LLVM Phabricator instance.

[ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
Closed (Public)

Authored by spatel on Nov 6 2015, 4:33 PM.

Details

Reviewers

rengolin
t.p.northover
andreadb

Commits

rGaf1b48bfdcc4: [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
rL252639: [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()

Summary

I'm a long way from home on this one...
Believe it or not, this is one step towards solving PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Some background here:
http://reviews.llvm.org/rL248439

The immediate problem is that ARM is using the default TLI cost settings for count-leading/trailing-zeros. I think this should be considered a cheap operation (and therefore fair game for speculation) for any implementation with V6T2 or later.

Another possibility is that we just invert the default settings for the base class hooks. Of the in-tree targets, I'm pretty sure that ARM64 and MIPS should also be making these ops cheap, but they're currently not.
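The hook arrangement described above can be sketched with a toy model. This is only an illustration under stated assumptions: `SubtargetModel`, `TargetLoweringModel`, `ARMLoweringModel`, and `maySpeculateCtlz` are hypothetical stand-ins, not the real LLVM classes; only the hook names and `hasV6T2Ops()` come from this patch.

```cpp
#include <cassert>

// Toy stand-in for a subtarget (hypothetical, not ARMSubtarget itself).
struct SubtargetModel {
  bool HasV6T2 = false;
  bool hasV6T2Ops() const { return HasV6T2; }
};

// Toy stand-in for the base class: the defaults report "not cheap",
// so clients will not speculate unless a target overrides the hooks.
struct TargetLoweringModel {
  virtual ~TargetLoweringModel() = default;
  virtual bool isCheapToSpeculateCttz() const { return false; }
  virtual bool isCheapToSpeculateCtlz() const { return false; }
};

// Toy stand-in for the ARM override proposed here: cheap on V6T2 or later.
struct ARMLoweringModel : TargetLoweringModel {
  explicit ARMLoweringModel(const SubtargetModel &ST) : Subtarget(&ST) {}
  bool isCheapToSpeculateCttz() const override {
    return Subtarget->hasV6T2Ops();
  }
  bool isCheapToSpeculateCtlz() const override {
    return Subtarget->hasV6T2Ops();
  }
  const SubtargetModel *Subtarget;
};

// Hypothetical client: a pass would consult the hook before hoisting a call.
bool maySpeculateCtlz(const TargetLoweringModel &TLI) {
  return TLI.isCheapToSpeculateCtlz();
}

// Small self-check: the base default stays conservative.
void checkDefaults() {
  TargetLoweringModel Base;
  assert(!Base.isCheapToSpeculateCttz());
  assert(!Base.isCheapToSpeculateCtlz());
}
```

Inverting the base-class defaults, the alternative mentioned above, would amount to returning `true` in `TargetLoweringModel` and overriding only on targets where the operation is expensive.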

The net result of allowing this speculation for the new ARM regression tests in this patch is that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr
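Why the zero check can be dropped: on ARM, `clz` of zero yields 32 and `rbit` of zero yields zero, so the unguarded sequences already produce the value the guard would select. The sketch below models those two instructions in portable C++ and compares them with the guarded source form. The function names are hypothetical and chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Model of the ARM clz instruction: leading zero count, 32 for x == 0.
unsigned armClz(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Model of the ARM rbit instruction: reverse the order of the 32 bits.
std::uint32_t armRbit(std::uint32_t x) {
  std::uint32_t r = 0;
  for (int i = 0; i < 32; ++i) {
    r = (r << 1) | (x & 1u);
    x >>= 1;
  }
  return r;
}

// Independent reference for trailing zeros, counting from the low bit.
unsigned referenceCttz(std::uint32_t x) {
  unsigned n = 0;
  while (n < 32 && ((x >> n) & 1u) == 0)
    ++n;
  return n;
}

// Guarded source forms, matching the longer assembly sequences.
unsigned guardedCtlz(std::uint32_t x) { return x == 0 ? 32 : armClz(x); }
unsigned guardedCttz(std::uint32_t x) { return x == 0 ? 32 : referenceCttz(x); }

// Compare guarded and unguarded forms on a handful of sample inputs.
bool guardIsRedundant() {
  const std::uint32_t samples[] = {0u,          1u,          8u,
                                   0x00010000u, 0x80000000u, 0xFFFFFFFFu};
  for (std::uint32_t x : samples) {
    if (guardedCtlz(x) != armClz(x) || guardedCttz(x) != armClz(armRbit(x)))
      return false;
  }
  return true;
}
```

The comparison only samples a few inputs; it is meant to illustrate the identity, not to prove it.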

Diff Detail

Repository: rL LLVM

Event Timeline

spatel updated this revision to Diff 39608. Nov 6 2015, 4:33 PM

spatel retitled this revision to [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz().

spatel updated this object.

spatel added reviewers: t.p.northover, rengolin, andreadb.

spatel added a subscriber: llvm-commits.

Herald added subscribers: rengolin, aemerson. Nov 6 2015, 4:33 PM

spatel mentioned this in D14500: [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz(). Nov 9 2015, 9:27 AM

spatel mentioned this in D14505: [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz(). Nov 9 2015, 11:56 AM

Hi Sanjay,

The logic seems sound. I'm curious as to why the lit.local.cfg wasn't there in the first place. :)

LGTM, with maybe a split, committing the lit config first, then the change itself.

cheers,
--renato

This revision is now accepted and ready to land. Nov 10 2015, 2:25 AM

In D14469#285988, @rengolin wrote:

The logic seems sound. I'm curious as to why the lit.local.cfg wasn't there in the first place. :)

LGTM, with maybe a split, committing the lit config first, then the change itself.

Thanks, Renato! The lit config isn't there because the directory isn't there at all. This would be the first ARM-specific regression test under SimplifyCFG.

In D14469#286185, @spatel wrote:

Thanks, Renato! The lit config isn't there because the directory isn't there at all. This would be the first ARM-specific regression test under SimplifyCFG.

Oops, I missed that. :) LGTM as is. Thanks!

spatel mentioned this in rL252625: [AArch64] add overrides for isCheapToSpeculateCttz() and…. Nov 10 2015, 10:14 AM

Closed by commit rL252639: [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz() (authored by spatel). Nov 10 2015, 11:27 AM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in rL252755: [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz(). Nov 11 2015, 9:27 AM

Revision Contents

Path                                                        Size
llvm/trunk/lib/Target/ARM/ARMISelLowering.h                 3 lines
llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp               8 lines
llvm/trunk/test/Transforms/SimplifyCFG/ARM/cttz-ctlz.ll     43 lines
llvm/trunk/test/Transforms/SimplifyCFG/ARM/lit.local.cfg    5 lines

Diff 39839

llvm/trunk/lib/Target/ARM/ARMISelLowering.h

Context: public:
     shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const override;
     bool shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *AI) const override;

     bool useLoadStackGuardNode() const override;

     bool canCombineStoreAndExtract(Type *VectorTy, Value *Idx,
                                    unsigned &Cost) const override;

+    bool isCheapToSpeculateCttz() const override;
+    bool isCheapToSpeculateCtlz() const override;
+
   protected:
     std::pair<const TargetRegisterClass *, uint8_t>
     findRepresentativeClass(const TargetRegisterInfo *TRI,
                             MVT VT) const override;

   private:
     /// Subtarget - Keep a pointer to the ARMSubtarget around so that we can
     /// make the right decision when generating code for different targets.

llvm/trunk/lib/Target/ARM/ARMISelLowering.cpp

Context: bool ARMTargetLowering::canCombineStoreAndExtract(Type *VectorTy, Value *Idx,
   // or Q register.
   if (BitWidth == 64 || BitWidth == 128) {
     Cost = 0;
     return true;
   }
   return false;
 }

+bool ARMTargetLowering::isCheapToSpeculateCttz() const {
+  return Subtarget->hasV6T2Ops();
+}
+
+bool ARMTargetLowering::isCheapToSpeculateCtlz() const {
+  return Subtarget->hasV6T2Ops();
+}
+
 Value *ARMTargetLowering::emitLoadLinked(IRBuilder<> &Builder, Value *Addr,
                                          AtomicOrdering Ord) const {
   Module *M = Builder.GetInsertBlock()->getParent()->getParent();
   Type *ValTy = cast<PointerType>(Addr->getType())->getElementType();
   bool IsAcquire = isAtLeastAcquire(Ord);

   // Since i64 isn't legal and intrinsics don't get type-lowered, the ldrexd
   // intrinsic must return {i32, i32} and we have to recombine them into a

llvm/trunk/test/Transforms/SimplifyCFG/ARM/cttz-ctlz.ll

; RUN: opt -S -simplifycfg -mtriple=arm -mattr=+v6t2 < %s | FileCheck %s

define i32 @ctlz(i32 %A) {
; CHECK-LABEL: @ctlz(
; CHECK: [[ICMP:%[A-Za-z0-9]+]] = icmp eq i32 %A, 0
; CHECK-NEXT: [[CTZ:%[A-Za-z0-9]+]] = tail call i32 @llvm.ctlz.i32(i32 %A, i1 true)
; CHECK-NEXT: [[SEL:%[A-Za-z0-9.]+]] = select i1 [[ICMP]], i32 32, i32 [[CTZ]]
; CHECK-NEXT: ret i32 [[SEL]]
entry:
  %tobool = icmp eq i32 %A, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:
  %0 = tail call i32 @llvm.ctlz.i32(i32 %A, i1 true)
  br label %cond.end

cond.end:
  %cond = phi i32 [ %0, %cond.true ], [ 32, %entry ]
  ret i32 %cond
}

define i32 @cttz(i32 %A) {
; CHECK-LABEL: @cttz(
; CHECK: [[ICMP:%[A-Za-z0-9]+]] = icmp eq i32 %A, 0
; CHECK-NEXT: [[CTZ:%[A-Za-z0-9]+]] = tail call i32 @llvm.cttz.i32(i32 %A, i1 true)
; CHECK-NEXT: [[SEL:%[A-Za-z0-9.]+]] = select i1 [[ICMP]], i32 32, i32 [[CTZ]]
; CHECK-NEXT: ret i32 [[SEL]]
entry:
  %tobool = icmp eq i32 %A, 0
  br i1 %tobool, label %cond.end, label %cond.true

cond.true:
  %0 = tail call i32 @llvm.cttz.i32(i32 %A, i1 true)
  br label %cond.end

cond.end:
  %cond = phi i32 [ %0, %cond.true ], [ 32, %entry ]
  ret i32 %cond
}

declare i32 @llvm.ctlz.i32(i32, i1)
declare i32 @llvm.cttz.i32(i32, i1)
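As a reading aid for the CHECK lines, the before and after shapes of the ctlz test can be modeled roughly in C++. The `i1 true` argument means the intrinsic's result is unspecified for a zero input; this sketch represents that with an arbitrary sentinel. All names here (`kUnspecified`, `ctlzZeroUndef`, `branchForm`, `selectForm`) are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Arbitrary stand-in for "unspecified result" (an assumption of this sketch).
const std::uint32_t kUnspecified = 0xDEADBEEFu;

// Model of @llvm.ctlz.i32(i32 %A, i1 true): no defined result for zero.
std::uint32_t ctlzZeroUndef(std::uint32_t a) {
  if (a == 0)
    return kUnspecified;
  std::uint32_t n = 0;
  while ((a & 0x80000000u) == 0) {
    a <<= 1;
    ++n;
  }
  return n;
}

// Input shape: branch around the call, then merge the two values (the phi).
std::uint32_t branchForm(std::uint32_t a) {
  if (a == 0)
    return 32;             // phi operand arriving from %entry
  return ctlzZeroUndef(a); // phi operand arriving from %cond.true
}

// Expected shape: the call runs unconditionally and a select picks the result.
std::uint32_t selectForm(std::uint32_t a) {
  bool isZero = (a == 0);               // icmp eq i32 %A, 0
  std::uint32_t ctz = ctlzZeroUndef(a); // speculated call
  return isZero ? 32 : ctz;             // select i1, i32 32, i32 ctz
}

// Self-check: the unspecified value never escapes through the select.
void checkForms() {
  assert(selectForm(0u) == 32);
  assert(branchForm(0u) == selectForm(0u));
  assert(branchForm(0x00F0u) == selectForm(0x00F0u));
}
```

The select discards the speculated value whenever the input is zero, which is why hoisting the call is safe once the target reports it as cheap.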

llvm/trunk/test/Transforms/SimplifyCFG/ARM/lit.local.cfg

config.suffixes = ['.ll']

targets = set(config.root.targets_to_build.split())
if not 'ARM' in targets:
    config.unsupported = True