This is an archive of the discontinued LLVM Phabricator instance.

[mlir] Set random starting index
Needs Review · Public

Authored by jpienaar on Nov 1 2020, 2:00 PM.

Details

Reviewers

Summary

Example that prods the start of the printed ID sequence to identify cases where a
test depends on the current numbering starting at 0. It might be useful for flagging
overly constrained tests that will cause pain when they need updating.

Uses MINSTD and __LINE__ as a simple scheme (which might even be overkill, as 5 as
the starting value would work too).

More of an exploration than something we have to do :)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jpienaar created this revision. Nov 1 2020, 2:00 PM

Herald added a reviewer: rriddle. Nov 1 2020, 2:00 PM

Herald added a project: Restricted Project.

Herald added subscribers: rdzhabarov, tatianashp, msifontes and 14 others.

jpienaar requested review of this revision. Nov 1 2020, 2:00 PM

Herald added subscribers: stephenneuendorffer, nicolasvasilache. Nov 1 2020, 2:00 PM

Harbormaster completed remote builds in B77200: Diff 302175. Nov 1 2020, 2:24 PM

rdzhabarov added inline comments. Nov 2 2020, 9:34 AM

mlir/lib/IR/AsmPrinter.cpp
559	qq: do we document anywhere these keys used for tweaking different things?

rriddle added inline comments. Nov 2 2020, 10:15 AM

mlir/lib/IR/AsmPrinter.cpp
559	I think we should align under fewer macros, if this is something that we actually want to do. Seems like randomized naming starts could be something under EXPENSIVE_CHECKS.
568	Is this guaranteed not to produce duplicates?

rdzhabarov added inline comments. Nov 2 2020, 10:44 AM

mlir/lib/IR/AsmPrinter.cpp
559	+1 to River's point.
568	I was playing with different seeds and have not found any collisions after 10M+ iterations. Curious whether there is math supporting this for any seed we could potentially use? (Also, for our purposes, we could fix the seed to a predefined value; this would guarantee no duplicates for a large number of iterations and would also work for our testing purposes.)
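The kind of empirical check described in this comment can be sketched as follows. This is a minimal illustration, not code from the revision: `minstdStep` and `firstIdsAreDistinct` are hypothetical names, and the 64-bit multiplication is an assumption about how one would keep the product from overflowing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// One MINSTD step: x -> 16807 * x mod (2^31 - 1), computed in 64 bits.
inline std::uint32_t minstdStep(std::uint32_t x) {
  return static_cast<std::uint32_t>((16807ULL * x) % 2147483647ULL);
}

// Returns true if the first `count` IDs produced from `seed` (the seed itself
// is printed first, as in the diff) are pairwise distinct.
inline bool firstIdsAreDistinct(std::uint32_t seed, std::size_t count) {
  std::unordered_set<std::uint32_t> seen;
  std::uint32_t id = seed;
  for (std::size_t i = 0; i < count; ++i) {
    if (!seen.insert(id).second)
      return false; // This ID was already handed out.
    id = minstdStep(id);
    assert(id < 2147483647u && "state stays below the modulus");
  }
  return true;
}
```

Calling `firstIdsAreDistinct(601, 100000)` should report no collisions, while a seed of 0 fails immediately because 0 maps to itself. The set grows with `count`, so a 10M+ run needs a corresponding amount of memory.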

mehdi_amini added inline comments. Nov 2 2020, 1:58 PM

mlir/lib/IR/AsmPrinter.cpp
559	This isn't an "expensive" check though. Seems closer to the LLVM_ENABLE_REVERSE_ITERATION macro in the mindset.

jpienaar marked 2 inline comments as done. Nov 2 2020, 3:24 PM

jpienaar added inline comments.

mlir/lib/IR/AsmPrinter.cpp
559	Correct, this isn't too expensive, nor would it fit in with the other expensive checks. It just makes debug string printing more expensive. Re: documentation, this was just for discussion so I didn't add any, and we don't have anything this invasive with ifdefs at the moment (NDEBUG is almost as much as we have, I think). If folks think it is useful for flushing out bad tests, we could document it in a separate md file. Depending on how many of these we add, we could either also document them in CMake or use a config file.
568	Re: duplicates, not until it has gone full cycle back to the start (once we hit roughly 2^31 value IDs in a module). A seed of 0 doesn't work; beyond that, any 32-bit seed should work, I believe. This is from: Park, Stephen K.; Miller, Keith W. (1988). "Random Number Generators: Good Ones Are Hard To Find". Communications of the ACM. 31 (10): 1192–1201. doi:10.1145/63039.63042.
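One way to sanity-check the recurrence is to compare it against `std::minstd_rand0` from `<random>`, which uses the same Park-Miller parameters (multiplier 16807, increment 0, modulus 2147483647). The sketch below is illustrative only: `nextId`, `seedDegenerates` and `matchesStdMinstd` are hypothetical helpers, not part of the revision. It also shows that, by the arithmetic, 0 is not the only stuck seed: the two nonzero 32-bit multiples of the modulus (2147483647 and 4294967294) collapse to 0 after one step.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// The increment from the diff, with the product widened to 64 bits.
inline std::uint32_t nextId(std::uint32_t id) {
  return static_cast<std::uint32_t>((16807ULL * id) % 2147483647ULL);
}

// True if `seed` ends up stuck at 0: either 0 itself or a nonzero multiple
// of the modulus 2^31 - 1.
inline bool seedDegenerates(std::uint32_t seed) {
  return seed % 2147483647u == 0;
}

// Checks the first `steps` states after `seed` against std::minstd_rand0.
// Assumes a seed in [1, 2^31 - 2] so both generators start from the same state.
inline bool matchesStdMinstd(std::uint32_t seed, unsigned steps) {
  assert(!seedDegenerates(seed) && seed < 2147483647u);
  std::minstd_rand0 reference(seed);
  std::uint32_t id = seed;
  for (unsigned i = 0; i < steps; ++i) {
    id = nextId(id);
    if (id != reference())
      return false;
  }
  return true;
}
```

For seeds in [1, 2^31 - 2] the sequence only repeats after 2^31 - 2 steps, since 16807 generates the full multiplicative group modulo the prime 2^31 - 1.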

nicolasvasilache added a comment.

Stupid Q: if the end goal is to detect SSA value names being used, can't we just switch the pretty printer to print "letter counters" instead?
I.e., by default print SSA names in base 10, and when enabled print them in base "a-z"?
I don't see a need to worry about collisions with such a scheme.
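The "letter counter" idea could look something like the sketch below. It is a hypothetical illustration, not anything from the revision: `letterCounter` is an invented name, and the bijective base-26 encoding (a, b, ..., z, aa, ab, ...) is one assumed reading of 'basis "a-z"'.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Maps a sequential ID to a lowercase letter string:
// 0 -> "a", 25 -> "z", 26 -> "aa", 27 -> "ab", ...
// Distinct IDs always yield distinct strings, so no collision analysis is
// needed among the generated names themselves.
inline std::string letterCounter(unsigned id) {
  std::string name;
  // Work on id + 1 in a wider type so the largest unsigned value is safe.
  unsigned long long n = static_cast<unsigned long long>(id) + 1;
  while (n > 0) {
    --n;
    name.push_back(static_cast<char>('a' + n % 26));
    n /= 26;
  }
  std::reverse(name.begin(), name.end());
  assert(!name.empty() && "every ID maps to at least one letter");
  return name;
}
```

A printer using this would emit `letterCounter(nextValueID++)` in place of the decimal number. Whether such names could clash with names supplied by dialects is a separate question that this sketch does not address.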

Herald added a project: Restricted Project. Oct 3 2022, 2:05 PM

Herald added subscribers: zero9178, bzcheeseman, sdasgup3 and 4 others.

In D90573#3831822, @nicolasvasilache wrote:

Stupid Q: if the end goal is to detect SSA value names being used, can't we just switch the pretty printer to print "letter counters" instead?
I.e., by default print SSA names in base 10, and when enabled print them in base "a-z"?
I don't see a need to worry about collisions with such a scheme.

Yes, even counting with letters rather than numbers should work. I don't think anyone would have a sort along the way that this would break too (but that's not a safe thing to do in general).

Visually random numbering does break things a bit more, and it gets folks out of the mindset that the number reflects order in the function and/or that the SSA ID is retained in some way. The goal is to avoid relying on the incidental, especially in tests.

Herald added a subscriber: Moerafaat. Dec 7 2022, 5:57 PM

Revision Contents

Path	Size
mlir/lib/IR/AsmPrinter.cpp	38 lines

Diff 302175

mlir/lib/IR/AsmPrinter.cpp

[549 lines not shown]	private:
DenseMap<Block *, unsigned> blockIDs;		DenseMap<Block *, unsigned> blockIDs;

/// This keeps track of all of the non-numeric names that are in flight,		/// This keeps track of all of the non-numeric names that are in flight,
/// allowing us to check for duplicates.		/// allowing us to check for duplicates.
/// Note: the value of the map is unused.		/// Note: the value of the map is unused.
llvm::ScopedHashTable<StringRef, char> usedNames;		llvm::ScopedHashTable<StringRef, char> usedNames;
llvm::BumpPtrAllocator usedNameAllocator;		llvm::BumpPtrAllocator usedNameAllocator;

		// If testing that SSA IDs are being used in tests directly.
		#ifdef MLIR_TEST_SSA_ID_SEQUENCE_ASSUMED
		static constexpr unsigned initID() {
		// This could be a more random seed, but given the intention is just to
		// ensure 0 is not assumed, this should suffice.
		return __LINE__ + 42;
		}

		static unsigned incrementID(unsigned &id) {
		unsigned int ij = id;
		// Use MINSTD to increment (modulus 2^31-1, period 2^31-2).
		id = static_cast<unsigned>((16807ULL * ij) % 2147483647ULL);
		return ij;
		}
		#else
		static constexpr unsigned initID() { return 0; }

		static unsigned incrementID(unsigned &id) { return id++; }
		#endif

/// This is the next value ID to assign in numbering.		/// This is the next value ID to assign in numbering.
unsigned nextValueID = 0;		unsigned nextValueID;
/// This is the next ID to assign to a region entry block argument.		/// This is the next ID to assign to a region entry block argument.
unsigned nextArgumentID = 0;		unsigned nextArgumentID;
/// This is the next ID to assign when a name conflict is detected.		/// This is the next ID to assign when a name conflict is detected.
unsigned nextConflictID = 0;		unsigned nextConflictID;
};		};
} // end anonymous namespace		} // end anonymous namespace

SSANameState::SSANameState(		SSANameState::SSANameState(
Operation *op,		Operation *op,
DialectInterfaceCollection<OpAsmDialectInterface> &interfaces) {		DialectInterfaceCollection<OpAsmDialectInterface> &interfaces)
		: nextValueID(initID()), nextArgumentID(initID()),
		nextConflictID(initID()) {
llvm::ScopedHashTable<StringRef, char>::ScopeTy usedNamesScope(usedNames);		llvm::ScopedHashTable<StringRef, char>::ScopeTy usedNamesScope(usedNames);
numberValuesInOp(*op, interfaces);		numberValuesInOp(*op, interfaces);

for (auto &region : op->getRegions())		for (auto &region : op->getRegions())
numberValuesInRegion(region, interfaces);		numberValuesInRegion(region, interfaces);
}		}

void SSANameState::printValueID(Value value, bool printResultNo,		void SSANameState::printValueID(Value value, bool printResultNo,
[121 lines not shown]	void SSANameState::numberValuesInBlock(
// 'arg'.		// 'arg'.
SmallString<32> specialNameBuffer(isEntryBlock ? "arg" : "");		SmallString<32> specialNameBuffer(isEntryBlock ? "arg" : "");
llvm::raw_svector_ostream specialName(specialNameBuffer);		llvm::raw_svector_ostream specialName(specialNameBuffer);
for (auto arg : block.getArguments()) {		for (auto arg : block.getArguments()) {
if (valueIDs.count(arg))		if (valueIDs.count(arg))
continue;		continue;
if (isEntryBlock) {		if (isEntryBlock) {
specialNameBuffer.resize(strlen("arg"));		specialNameBuffer.resize(strlen("arg"));
specialName << nextArgumentID++;		specialName << incrementID(nextArgumentID);
}		}
setValueName(arg, specialName.str());		setValueName(arg, specialName.str());
}		}

// Number the operations in this block.		// Number the operations in this block.
for (auto &op : block)		for (auto &op : block)
numberValuesInOp(op, interfaces);		numberValuesInOp(op, interfaces);
}		}
[19 lines not shown]	void SSANameState::numberValuesInOp(
};		};
if (OpAsmOpInterface asmInterface = dyn_cast<OpAsmOpInterface>(&op))		if (OpAsmOpInterface asmInterface = dyn_cast<OpAsmOpInterface>(&op))
asmInterface.getAsmResultNames(setResultNameFn);		asmInterface.getAsmResultNames(setResultNameFn);
else if (auto *asmInterface = interfaces.getInterfaceFor(op.getDialect()))		else if (auto *asmInterface = interfaces.getInterfaceFor(op.getDialect()))
asmInterface->getAsmResultNames(&op, setResultNameFn);		asmInterface->getAsmResultNames(&op, setResultNameFn);

// If the first result wasn't numbered, give it a default number.		// If the first result wasn't numbered, give it a default number.
if (valueIDs.try_emplace(resultBegin, nextValueID).second)		if (valueIDs.try_emplace(resultBegin, nextValueID).second)
++nextValueID;		incrementID(nextValueID);

// If this operation has multiple result groups, mark it.		// If this operation has multiple result groups, mark it.
if (resultGroups.size() != 1) {		if (resultGroups.size() != 1) {
llvm::array_pod_sort(resultGroups.begin(), resultGroups.end());		llvm::array_pod_sort(resultGroups.begin(), resultGroups.end());
opResultGroups.try_emplace(&op, std::move(resultGroups));		opResultGroups.try_emplace(&op, std::move(resultGroups));
}		}
}		}

[33 lines not shown]	void SSANameState::getResultIDAndNumber(OpResult result, Value &lookupValue,
if (groupSize != 1)		if (groupSize != 1)
lookupResultNo = resultNo - groupResultNo;		lookupResultNo = resultNo - groupResultNo;
lookupValue = owner->getResult(groupResultNo);		lookupValue = owner->getResult(groupResultNo);
}		}

void SSANameState::setValueName(Value value, StringRef name) {		void SSANameState::setValueName(Value value, StringRef name) {
// If the name is empty, the value uses the default numbering.		// If the name is empty, the value uses the default numbering.
if (name.empty()) {		if (name.empty()) {
valueIDs[value] = nextValueID++;		valueIDs[value] = incrementID(nextValueID);
return;		return;
}		}

valueIDs[value] = NameSentinel;		valueIDs[value] = NameSentinel;
valueNames[value] = uniqueValueName(name);		valueNames[value] = uniqueValueName(name);
}		}

/// Returns true if 'c' is an allowable punctuation character: [$._-]		/// Returns true if 'c' is an allowable punctuation character: [$._-]
[39 lines not shown]	StringRef SSANameState::uniqueValueName(StringRef name) {
} else {		} else {
// Otherwise, we had a conflict - probe until we find a unique name. This		// Otherwise, we had a conflict - probe until we find a unique name. This
// is guaranteed to terminate (and usually in a single iteration) because it		// is guaranteed to terminate (and usually in a single iteration) because it
// generates new names by incrementing nextConflictID.		// generates new names by incrementing nextConflictID.
SmallString<64> probeName(name);		SmallString<64> probeName(name);
probeName.push_back('_');		probeName.push_back('_');
while (true) {		while (true) {
probeName.resize(name.size() + 1);		probeName.resize(name.size() + 1);
probeName += llvm::utostr(nextConflictID++);		probeName += llvm::utostr(incrementID(nextConflictID));
if (!usedNames.count(probeName)) {		if (!usedNames.count(probeName)) {
name = StringRef(probeName).copy(usedNameAllocator);		name = StringRef(probeName).copy(usedNameAllocator);
break;		break;
}		}
}		}
}		}

usedNames.insert(name, char());		usedNames.insert(name, char());
[1,608 lines not shown]