
Details

Reviewers

Commits

rG54dd6837265d: [PGO]: Do not update Data->Value field during profile write.
rCRT256543: [PGO]: Do not update Data->Value field during profile write.
rL256543: [PGO]: Do not update Data->Value field during profile write.

Summary

The Values field of the per-function data structure is overloaded for two different purposes. At run time, the field points to the in-memory value profile data, but when the data is serialized to disk, its value is changed to point into the (contiguous) serialization buffer of the value profile data -- these pointers are later relocated during profile reading.

The current implementation simply overwrites the pointer during profile dumping -- this does not work well with multi-threaded programs -- after the Values field is modified, runtime code may either crash or corrupt the data. A zeroed Values field may also be re-initialized with addresses that will be garbage at read time. Introducing a synchronization mechanism would greatly affect runtime performance.

The solution is simple -- instead of overwriting the "Data" data structure, the per-function data array is copied to a small buffer and written out after the Values fields are fixed up in the buffer. The Values field of Data is kept intact. There is no need to clear the memory either: in some use cases, the user code just clears the counters, and all the data allocated for VP (value profiling) can be reused.
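The copy-and-fix-up idea from the summary can be sketched as follows. This is only an illustration under stated assumptions: `FuncData`, `serialize_func_data` and the `OnDiskValues` parameter are hypothetical names, the record layout is reduced to two fields, and the output goes to a memory buffer rather than a file. It is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-function record; the real
 * __llvm_profile_data carries more fields. */
typedef struct FuncData {
  uint64_t NameHash;
  void *Values; /* live pointer still used by instrumented code */
} FuncData;

/* Serialize [Begin, End) into Out, storing OnDiskValues in the Values
 * field of each written record. Only a local copy is modified; the live
 * records are read but never written. Returns the number of bytes written. */
static size_t serialize_func_data(uint8_t *Out, const FuncData *Begin,
                                  const FuncData *End, void *OnDiskValues) {
  size_t Written = 0;
  const FuncData *I;
  assert(Out && Begin && Begin <= End);
  for (I = Begin; I != End; ++I) {
    FuncData Copy;
    memcpy(&Copy, I, sizeof(Copy)); /* snapshot into a small buffer */
    Copy.Values = OnDiskValues;     /* fix up the copy only */
    memcpy(Out + Written, &Copy, sizeof(Copy));
    Written += sizeof(Copy);
  }
  return Written;
}
```

Because the dumping code writes only to `Copy`, other threads that still use the live `Values` pointer never observe a modified value. The caller is assumed to provide an output buffer large enough for all records.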

Diff Detail

Repository: rL LLVM

Event Timeline

davidxl updated this revision to Diff 41977. Dec 4 2015, 8:00 PM

davidxl retitled this revision from to [PGO] Remove data races on Data->Values field.

davidxl updated this object.

davidxl added a reviewer: betulb.

davidxl added a subscriber: llvm-commits.

Patch rebase.

betulb added inline comments. Dec 10 2015, 2:25 PM

lib/profile/InstrProfilingFile.c
23 ↗	(On Diff #42457)	storek -> store ?
lib/profile/InstrProfilingValue.c
152 ↗	(On Diff #42457)	I do not think we really do need a DataValuePtrs array. I like the direction taken by the removal of the update of the Values field of the ProfileData struct, but I do not think this implementation makes the overall need for synchronization any less due to the global declaration of the DataValuePtr's array. This also resurfaces the comments made for a previous implementation i.e. "depending on the implicit ordering of the ProfileData nodes in memory for reading of ValueData nodes". Value profiling data is readable w/o this indirection. I do not think there is any need for this array.

DataValuePtrs won't introduce data races as it is only accessed by the dumping thread.

Other than that, I admit I don't like introducing this overhead either. I will prepare a different patch for this.

(Update the patch according to Betul's feedback)

By removing the dependency on ->Values field from the reader, the runtime patch now becomes quite clean. Lots of unnecessary things are removed:

The value profile data no longer needs to be put into a contiguous array before writing (as no relocation is needed at read time).
This removes the need to compute the total size ahead of time.
It eliminates the problem of possibly dropping hot value sites due to updates from other threads.
It eliminates the need to reserve extra bytes for the buffer, and thus the need to set the reserved space at runtime.
It greatly simplifies dumping -- there is no need for two-pass data collection (one pass for the total size, one for the actual data).

The only major interface change is the data gathering interface -- it will now gather an array of pointers to ValueProfData objects instead of a pointer to the contiguous array.
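A rough sketch of the new gathering shape, one independently allocated record per function collected in an array of pointers, might look like this. `Record`, `gather_records`, `free_records` and the `PayloadSizes` input are hypothetical and stand in for `ValueProfData` and the runtime record helpers, which are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical serialized record: a size header followed by payload bytes. */
typedef struct Record {
  uint32_t TotalSize; /* size of this record in bytes, header included */
} Record;

/* Free every record in the array, then the array itself. */
static void free_records(Record **Array, size_t N) {
  size_t I;
  if (!Array)
    return;
  for (I = 0; I < N; ++I)
    free(Array[I]);
  free(Array);
}

/* Allocate one record per non-empty input; slots for empty inputs stay
 * NULL. *TotalBytes receives the summed record sizes. Returns NULL on
 * allocation failure or when there is nothing to gather. */
static Record **gather_records(const uint32_t *PayloadSizes, size_t N,
                               uint64_t *TotalBytes) {
  Record **Array;
  uint64_t S = 0;
  size_t I;
  assert(PayloadSizes && TotalBytes);
  *TotalBytes = 0;
  Array = (Record **)calloc(N ? N : 1, sizeof(Record *));
  if (!Array)
    return NULL;
  for (I = 0; I < N; ++I) {
    uint32_t Size;
    if (!PayloadSizes[I])
      continue; /* no data for this function */
    Size = (uint32_t)sizeof(Record) + PayloadSizes[I];
    Array[I] = (Record *)calloc(Size, sizeof(uint8_t));
    if (!Array[I]) {
      free_records(Array, N);
      return NULL;
    }
    Array[I]->TotalSize = Size;
    S += Size;
  }
  if (!S) {
    free(Array);
    return NULL;
  }
  *TotalBytes = S;
  return Array;
}
```

Each record is sized and filled in a single pass, so no total has to be computed up front and no spare capacity has to be reserved for data that other threads add in the meantime.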

Any more comments? Since this is a bug fix with good cleanups, I'd like to get this in sooner.

betulb added inline comments. Dec 17 2015, 3:40 PM

lib/profile/InstrProfilingValue.c
159 ↗	(On Diff #42541)	As an overall comment, I'm not too fond of using macros this heavily. Specific to this line, DEF_VALUE_RECORD is used only once and there is a macro definition for its contents. It's not necessary. Secondly, R is passed to the macro w/o being declared in this context. I'd rather have preferred that R is declared first and then passed to the macro (if macro usage is ever a requisite).
lib/profile/InstrProfilingWriter.c
76 ↗	(On Diff #42541)	I think there is an overhead of making the syscall to write as many times as there are profile data variables. I think it was one of the reasons why the other profiling constants/variables were merged into a single global variable and into linker sections to reduce the many iterations needed to write the data.

davidxl added inline comments. Dec 17 2015, 3:46 PM

lib/profile/InstrProfilingValue.c
159 ↗	(On Diff #42541)	yes, after the cleanup, this one can be removed -- as it is not reused anywhere else.
lib/profile/InstrProfilingWriter.c
76 ↗	(On Diff #42541)	We can do the write in batches of course. I will do some measurement to see if that is needed.

betulb added inline comments. Dec 17 2015, 4:35 PM

lib/profile/InstrProfilingFile.c
37 ↗	(On Diff #42541)	Is there a purpose for not directly using free, assigning it to a pointer and calling it through the function pointer?
lib/profile/InstrProfilingValue.c
181 ↗	(On Diff #42541)	Is there any need to return S in this implementation? We may also remove the need to pass the ValueDataSize around.
lib/profile/InstrProfilingWriter.c
68 ↗	(On Diff #42541)	Instead of ValueDataSize, why not check for if ValueDataBegin is null?

davidxl added inline comments. Dec 17 2015, 4:45 PM

lib/profile/InstrProfilingValue.c
159 ↗	(On Diff #42541)	For the record, this macro was also introduced to 'mimic' C++ where variable declaration and construction are done together.
181 ↗	(On Diff #42541)	good point. It was needed to indicate if there is vp data to be written -- but the rewrite makes it not necessary. Will remove.
lib/profile/InstrProfilingWriter.c
68 ↗	(On Diff #42541)	yes. Will change.

Regarding performance, fwrite actually buffers the data, so the number of system calls to write is much smaller than the number of Data entries -- the overall time is dominated by IO, not system calls.

Here is the design of the stress testing:

total number of value data entries to be written out: 3 million
total size of the value data: 1.6G -- this is way larger than an average program can produce -- for instance clang's profile data raw size is about 100M

Test machine is a sandybridge machine.

Results:
a) write out VP data one by one without batching

total number of calls to fwrite: 3M,
total number of calls to write: ~390K.
real time: ~12s; sys time: ~2.5s

b) write out VP data in batches -- batch size is 1024 (i.e, copy 1024 VP data into a buffer and write out)

total number of calls to fwrite: 3K
total number of calls to write: ~6K (yes, it is more than the calls to fwrite -- a very large write can be split into smaller chunks).
real time: ~12s, sys time: ~2s

The savings from the reduced number of system calls are small.

In another experiment, /dev/null is used as the output to remove IO.

a) for non batch case

real time: ~1.6s (average of 10 runs)
sys time: ~0.85s (average of 10 runs)

b) with batch:

real time: ~1.7s (average 10 runs)
sys time: ~0.73s (average of 10 runs).

Note that b) has the additional cost of a memory copy.

Based on the above data, we can probably go with non-batch for simplicity for now. The batch write can be easily added later.
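As a sanity check on the figures above, the average number of bytes moved per call can be computed directly. The sketch assumes "1.6G" means 1.6e9 bytes and "3M", "390K", "3K" and "6K" are decimal counts. `bytes_per_call` is a hypothetical helper, and the conclusions drawn from it are inferences, not statements made in the review.

```c
#include <assert.h>
#include <stdint.h>

/* Average bytes moved per call, rounded down. */
static uint64_t bytes_per_call(uint64_t TotalBytes, uint64_t Calls) {
  assert(Calls != 0);
  return TotalBytes / Calls;
}

/* Figures quoted in the measurements, read as decimal quantities
 * (an assumption; the review does not say whether G means 1e9 or 2^30). */
static const uint64_t TotalBytes = 1600000000ULL;
static const uint64_t NumEntries = 3000000ULL;        /* fwrite calls in (a) */
static const uint64_t UnbatchedWrites = 390000ULL;    /* write calls in (a) */
static const uint64_t BatchedFwrites = 3000ULL;       /* fwrite calls in (b) */
static const uint64_t BatchedWrites = 6000ULL;        /* write calls in (b) */
```

Under these assumptions an entry averages about 533 bytes, each unbatched write moves about 4 KB, each batched fwrite carries about 533 KB, and each batched write moves about 267 KB. The roughly 4 KB per write in case a) would be consistent with stdio coalescing the small fwrite calls into buffer-sized writes, and the two-to-one ratio in case b) with large requests being split.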

Updated the patch according to Betul's feedback.

(VP gather interface is cleaned up -- the size is still needed by the header.)

Reimplement VP data writing using the Buffer Writer API. With the new change, the VP writing buffer size also becomes configurable with an environment variable.
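The buffered writing scheme described here can be sketched roughly as below: records are staged in a fixed-size buffer, the buffer is flushed when the next record would not fit, and records larger than the buffer bypass it. `Sink`, `sink_write`, `write_batched` and `buffer_size_from_env` are hypothetical names; the sketch uses a memory sink instead of the runtime's writer callback and is not the patch's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sink: appends bytes to a fixed array and counts calls. */
typedef struct Sink {
  uint8_t Bytes[256];
  size_t Len;
  unsigned Calls;
} Sink;

static int sink_write(Sink *S, const void *Data, size_t Size) {
  if (Size > sizeof(S->Bytes) - S->Len)
    return -1;
  memcpy(S->Bytes + S->Len, Data, Size);
  S->Len += Size;
  S->Calls++;
  return 0;
}

/* Read a buffer size from the environment; fall back to Default when the
 * variable is unset, empty, or parses to zero. */
static size_t buffer_size_from_env(const char *Name, size_t Default) {
  const char *Str = getenv(Name);
  unsigned long V;
  if (!Str || !Str[0])
    return Default;
  V = strtoul(Str, NULL, 10);
  return V ? (size_t)V : Default;
}

/* Write Records[0..N), each Sizes[I] bytes long, through a staging buffer
 * of BufferSz bytes. NULL slots are skipped. Returns 0 on success. */
static int write_batched(Sink *S, const uint8_t *const *Records,
                         const size_t *Sizes, size_t N, size_t BufferSz) {
  uint8_t *Buf;
  size_t Used = 0, I;
  int Err = 0;
  assert(S && Records && Sizes && BufferSz > 0);
  Buf = (uint8_t *)malloc(BufferSz);
  if (!Buf)
    return -1;
  for (I = 0; I < N && !Err; ++I) {
    if (!Records[I])
      continue; /* no data in this slot */
    if (Used + Sizes[I] > BufferSz) { /* would overflow: flush first */
      if (Used)
        Err = sink_write(S, Buf, Used);
      Used = 0;
      if (Err)
        break;
    }
    if (Sizes[I] > BufferSz) { /* too large to stage: write directly */
      Err = sink_write(S, Records[I], Sizes[I]);
    } else {
      memcpy(Buf + Used, Records[I], Sizes[I]);
      Used += Sizes[I];
    }
  }
  if (!Err && Used) /* final flush */
    Err = sink_write(S, Buf, Used);
  free(Buf);
  return Err;
}
```

The output bytes are the same for any buffer size; only the number of sink calls changes. That is the property the added test runs exercise by repeating the same check with several values of `LLVM_VP_BUFFER_SIZE`.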

Remove a redundant check.

LGTM overall.

lib/profile/InstrProfilingValue.c
135 ↗	(On Diff #43478)	nullptr check for ValueDataSize.
lib/profile/InstrProfilingWriter.c
65 ↗	(On Diff #43478)	1 -> sizeof(uint8_t) . sizeof(char), sizeof(uint8_t) and imm constant value 1 are all interspersed throughout the code. sizeof(uint8_t) should replace all necessary locations to be consistent throughout the code. Similarly both char and uint8_t as types are used interchangeably throughout the code. Using one of the two consistently is preferable.

Address review comments from Betul

Closed by commit rL256543: [PGO]: Do not update Data->Value field during profile write. (authored by davidxl). Dec 28 2015, 11:17 PM

This revision was automatically updated to reflect the committed changes.

Diff 43713

compiler-rt/trunk/lib/profile/InstrProfiling.h

	Show First 20 Lines • Show All 72 Lines • ▼ Show 20 Lines
	void INSTR_PROF_VALUE_PROF_FUNC(			void INSTR_PROF_VALUE_PROF_FUNC(
	#define VALUE_PROF_FUNC_PARAM(ArgType, ArgName, ArgLLVMType) ArgType ArgName			#define VALUE_PROF_FUNC_PARAM(ArgType, ArgName, ArgLLVMType) ArgType ArgName
	#include "InstrProfData.inc"			#include "InstrProfData.inc"
	);			);

	/*!			/*!
	* \brief Prepares the value profiling data for output.			* \brief Prepares the value profiling data for output.
	*			*
	* Prepares a single __llvm_profile_value_data array out of the many			* Returns an array of pointers to value profile data.
	* ValueProfNode trees (one per instrumented function).
	*/			*/
	uint64_t __llvm_profile_gather_value_data(uint8_t **DataArray);			struct ValueProfData;
				struct ValueProfData **__llvm_profile_gather_value_data(uint64_t *Size);

	/*!			/*!
	* \brief Write instrumentation data to the current file.			* \brief Write instrumentation data to the current file.
	*			*
	* Writes to the file with the last name given to \a __llvm_profile_set_filename(),			* Writes to the file with the last name given to \a *
				* __llvm_profile_set_filename(),
	* or if it hasn't been called, the \c LLVM_PROFILE_FILE environment variable,			* or if it hasn't been called, the \c LLVM_PROFILE_FILE environment variable,
	* or if that's not set, the last name given to			* or if that's not set, the last name given to
	* \a __llvm_profile_override_default_filename(), or if that's not set,			* \a __llvm_profile_override_default_filename(), or if that's not set,
	* \c "default.profraw".			* \c "default.profraw".
	*/			*/
	int __llvm_profile_write_file(void);			int __llvm_profile_write_file(void);

	/*!			/*!
	Show All 36 Lines

compiler-rt/trunk/lib/profile/InstrProfilingFile.c

Show All 25 Lines	for (I = 0; I < NumIOVecs; I++) {
if (fwrite(IOVecs[I].Data, IOVecs[I].ElmSize, IOVecs[I].NumElm, File) !=		if (fwrite(IOVecs[I].Data, IOVecs[I].ElmSize, IOVecs[I].NumElm, File) !=
IOVecs[I].NumElm)		IOVecs[I].NumElm)
return 1;		return 1;
}		}
return 0;		return 0;
}		}

static int writeFile(FILE *File) {		static int writeFile(FILE *File) {
uint8_t *ValueDataBegin = NULL;		const char *BufferSzStr = 0;
const uint64_t ValueDataSize =		uint64_t ValueDataSize = 0;
__llvm_profile_gather_value_data(&ValueDataBegin);		struct ValueProfData **ValueDataArray =
int r = llvmWriteProfData(fileWriter, File, ValueDataBegin, ValueDataSize);		__llvm_profile_gather_value_data(&ValueDataSize);
free(ValueDataBegin);		FreeHook = &free;
return r;		CallocHook = &calloc;
		BufferSzStr = getenv("LLVM_VP_BUFFER_SIZE");
		if (BufferSzStr && BufferSzStr[0])
		VPBufferSize = atoi(BufferSzStr);
		return llvmWriteProfData(fileWriter, File, ValueDataArray, ValueDataSize);
}		}

static int writeFileWithName(const char *OutputName) {		static int writeFileWithName(const char *OutputName) {
int RetVal;		int RetVal;
FILE *OutputFile;		FILE *OutputFile;
if (!OutputName || !OutputName[0])		if (!OutputName || !OutputName[0])
return -1;		return -1;

▲ Show 20 Lines • Show All 174 Lines • Show Last 20 Lines

compiler-rt/trunk/lib/profile/InstrProfilingInternal.h

Show First 20 Lines • Show All 46 Lines • ▼ Show 20 Lines	typedef struct ProfDataIOVec {
size_t NumElm;		size_t NumElm;
} ProfDataIOVec;		} ProfDataIOVec;

typedef uint32_t (*WriterCallback)(ProfDataIOVec *, uint32_t NumIOVecs,		typedef uint32_t (*WriterCallback)(ProfDataIOVec *, uint32_t NumIOVecs,
void **WriterCtx);		void **WriterCtx);
uint32_t llvmBufferWriter(ProfDataIOVec *IOVecs, uint32_t NumIOVecs,		uint32_t llvmBufferWriter(ProfDataIOVec *IOVecs, uint32_t NumIOVecs,
void **WriterCtx);		void **WriterCtx);
int llvmWriteProfData(WriterCallback Writer, void *WriterCtx,		int llvmWriteProfData(WriterCallback Writer, void *WriterCtx,
const uint8_t *ValueDataBegin,		struct ValueProfData **ValueDataArray,
const uint64_t ValueDataSize);		const uint64_t ValueDataSize);
int llvmWriteProfDataImpl(WriterCallback Writer, void *WriterCtx,		int llvmWriteProfDataImpl(WriterCallback Writer, void *WriterCtx,
const __llvm_profile_data *DataBegin,		const __llvm_profile_data *DataBegin,
const __llvm_profile_data *DataEnd,		const __llvm_profile_data *DataEnd,
const uint64_t *CountersBegin,		const uint64_t *CountersBegin,
const uint64_t *CountersEnd,		const uint64_t *CountersEnd,
const uint8_t *ValueDataBegin,		struct ValueProfData **ValueDataBeginArray,
const uint64_t ValueDataSize, const char *NamesBegin,		const uint64_t ValueDataSize, const char *NamesBegin,
const char *NamesEnd);		const char *NamesEnd);

extern char *(*GetEnvHook)(const char *);		extern char *(*GetEnvHook)(const char *);
		extern void (*FreeHook)(void *);
		extern void* (*CallocHook)(size_t, size_t);
		extern uint32_t VPBufferSize;

#endif		#endif

compiler-rt/trunk/lib/profile/InstrProfilingValue.c

Show All 15 Lines
#define INSTR_PROF_VALUE_PROF_DATA		#define INSTR_PROF_VALUE_PROF_DATA
#define INSTR_PROF_COMMON_API_IMPL		#define INSTR_PROF_COMMON_API_IMPL
#include "InstrProfData.inc"		#include "InstrProfData.inc"

#define PROF_OOM(Msg) PROF_ERR(Msg ":%s\n", "Out of memory");		#define PROF_OOM(Msg) PROF_ERR(Msg ":%s\n", "Out of memory");
#define PROF_OOM_RETURN(Msg) \		#define PROF_OOM_RETURN(Msg) \
{ \		{ \
PROF_OOM(Msg) \		PROF_OOM(Msg) \
return 0; \		free(ValueDataArray); \
		return NULL; \
}		}

#if COMPILER_RT_HAS_ATOMICS != 1		#if COMPILER_RT_HAS_ATOMICS != 1
COMPILER_RT_VISIBILITY		COMPILER_RT_VISIBILITY
uint32_t BoolCmpXchg(void **Ptr, void *OldV, void *NewV) {		uint32_t BoolCmpXchg(void **Ptr, void *OldV, void *NewV) {
void *R = *Ptr;		void *R = *Ptr;
if (R == OldV) {		if (R == OldV) {
*Ptr = NewV;		*Ptr = NewV;
Show All 40 Lines	if (!Mem)
return 0;		return 0;
if (!COMPILER_RT_BOOL_CMPXCHG(&Data->Values, 0, Mem)) {		if (!COMPILER_RT_BOOL_CMPXCHG(&Data->Values, 0, Mem)) {
free(Mem);		free(Mem);
return 0;		return 0;
}		}
return 1;		return 1;
}		}

static void deallocateValueProfileCounters(__llvm_profile_data *Data) {
uint64_t NumVSites = 0, I;
uint32_t VKI;
if (!Data->Values)
return;
for (VKI = IPVK_First; VKI <= IPVK_Last; ++VKI)
NumVSites += Data->NumValueSites[VKI];
for (I = 0; I < NumVSites; I++) {
ValueProfNode *Node = ((ValueProfNode **)Data->Values)[I];
while (Node) {
ValueProfNode *Next = Node->Next;
free(Node);
Node = Next;
}
}
free(Data->Values);
}

COMPILER_RT_VISIBILITY void		COMPILER_RT_VISIBILITY void
__llvm_profile_instrument_target(uint64_t TargetValue, void *Data,		__llvm_profile_instrument_target(uint64_t TargetValue, void *Data,
uint32_t CounterIndex) {		uint32_t CounterIndex) {

__llvm_profile_data *PData = (__llvm_profile_data *)Data;		__llvm_profile_data *PData = (__llvm_profile_data *)Data;
if (!PData)		if (!PData)
return;		return;

Show All 35 Lines	else if (PrevVNode && !PrevVNode->Next)
Success = COMPILER_RT_BOOL_CMPXCHG(&(PrevVNode->Next), 0, CurrentVNode);		Success = COMPILER_RT_BOOL_CMPXCHG(&(PrevVNode->Next), 0, CurrentVNode);

if (!Success) {		if (!Success) {
free(CurrentVNode);		free(CurrentVNode);
return;		return;
}		}
}		}

/* For multi-threaded programs, while the profile is being dumped, other		COMPILER_RT_VISIBILITY ValueProfData **
threads may still be updating the value profile data and creating new		__llvm_profile_gather_value_data(uint64_t *ValueDataSize) {
value entries. To accommadate this, we need to add extra bytes to the		size_t S = 0;
data buffer. The size of the extra space is controlled by an environment
variable. */
static unsigned getVprofExtraBytes() {
const char *ExtraStr =
GetEnvHook ? GetEnvHook("LLVM_VALUE_PROF_BUFFER_EXTRA") : 0;
if (!ExtraStr || !ExtraStr[0])
return 1024;
return (unsigned)atoi(ExtraStr);
}

/* Extract the value profile data info from the runtime. */
#define DEF_VALUE_RECORD(R, NS, V) \
ValueProfRuntimeRecord R; \
if (initializeValueProfRuntimeRecord(&R, NS, V)) \
PROF_OOM_RETURN("Failed to write value profile data ");

#define DTOR_VALUE_RECORD(R) finalizeValueProfRuntimeRecord(&R);

COMPILER_RT_VISIBILITY uint64_t
__llvm_profile_gather_value_data(uint8_t **VDataArray) {
size_t S = 0, RealSize = 0, BufferCapacity = 0, Extra = 0;
__llvm_profile_data *I;		__llvm_profile_data *I;
if (!VDataArray)		ValueProfData **ValueDataArray;
PROF_OOM_RETURN("Failed to write value profile data ");

const __llvm_profile_data *DataEnd = __llvm_profile_end_data();		const __llvm_profile_data *DataEnd = __llvm_profile_end_data();
const __llvm_profile_data *DataBegin = __llvm_profile_begin_data();		const __llvm_profile_data *DataBegin = __llvm_profile_begin_data();

		if (!ValueDataSize)
		return NULL;

		ValueDataArray =
		(ValueProfData **)calloc(DataEnd - DataBegin, sizeof(void *));
		if (!ValueDataArray)
		PROF_OOM_RETURN("Failed to write value profile data ");

/*		/*
* Compute the total Size of the buffer to hold ValueProfData		* Compute the total Size of the buffer to hold ValueProfData
* structures for functions with value profile data.		* structures for functions with value profile data.
*/		*/
for (I = (__llvm_profile_data *)DataBegin; I != DataEnd; ++I) {		for (I = (__llvm_profile_data *)DataBegin; I != DataEnd; ++I) {
		ValueProfRuntimeRecord R;
DEF_VALUE_RECORD(R, I->NumValueSites, I->Values);		if (initializeValueProfRuntimeRecord(&R, I->NumValueSites, I->Values))
		PROF_OOM_RETURN("Failed to write value profile data ");

/* Compute the size of ValueProfData from this runtime record. */		/* Compute the size of ValueProfData from this runtime record. */
if (getNumValueKindsRT(&R) != 0)		if (getNumValueKindsRT(&R) != 0) {
S += getValueProfDataSizeRT(&R);		ValueProfData *VD = NULL;
		uint32_t VS = getValueProfDataSizeRT(&R);
DTOR_VALUE_RECORD(R);		VD = (ValueProfData *)calloc(VS, sizeof(uint8_t));
}		if (!VD)
/* No value sites or no value profile data is collected. */
if (!S)
return 0;

Extra = getVprofExtraBytes();
BufferCapacity = S + Extra;
*VDataArray = calloc(BufferCapacity, sizeof(uint8_t));
if (!*VDataArray)
PROF_OOM_RETURN("Failed to write value profile data ");		PROF_OOM_RETURN("Failed to write value profile data ");
		serializeValueProfDataFromRT(&R, VD);
ValueProfData *VD = (ValueProfData *)(*VDataArray);		ValueDataArray[I - DataBegin] = VD;
/*		S += VS;
* Extract value profile data and write into ValueProfData structure		}
* one by one. Note that new value profile data added to any value		finalizeValueProfRuntimeRecord(&R);
* site (from another thread) after the ValueProfRuntimeRecord is
* initialized (when the profile data snapshot is taken) won't be
* collected. This is not a problem as those dropped value will have
* very low taken count.
*/
for (I = (__llvm_profile_data *)DataBegin; I != DataEnd; ++I) {
DEF_VALUE_RECORD(R, I->NumValueSites, I->Values);
if (getNumValueKindsRT(&R) == 0)
continue;

/* Record R has taken a snapshot of the VP data at this point. Newly
added VP data for this function will be dropped. */
/* Check if there is enough space. */
if (BufferCapacity - RealSize < getValueProfDataSizeRT(&R)) {
PROF_ERR("Value profile data is dropped :%s \n",
"Out of buffer space. Use environment "
" LLVM_VALUE_PROF_BUFFER_EXTRA to allocate more");
I->Values = 0;
}		}

serializeValueProfDataFromRT(&R, VD);		if (!S) {
deallocateValueProfileCounters(I);		free(ValueDataArray);
I->Values = VD;		ValueDataArray = NULL;
RealSize += VD->TotalSize;
VD = (ValueProfData *)((char *)VD + VD->TotalSize);
DTOR_VALUE_RECORD(R);
}		}

return RealSize;		*ValueDataSize = S;
		return ValueDataArray;
}		}

compiler-rt/trunk/lib/profile/InstrProfilingWriter.c

	/*===- InstrProfilingWriter.c - Write instrumentation to a file or buffer -===*\			/*===- InstrProfilingWriter.c - Write instrumentation to a file or buffer -===*\
	|*			|*
	|* The LLVM Compiler Infrastructure			|* The LLVM Compiler Infrastructure
	|*			|*
	|* This file is distributed under the University of Illinois Open Source			|* This file is distributed under the University of Illinois Open Source
	|* License. See LICENSE.TXT for details.			|* License. See LICENSE.TXT for details.
	|*			|*
	\*===----------------------------------------------------------------------===*/			\*===----------------------------------------------------------------------===*/

	#include "InstrProfiling.h"			#include "InstrProfiling.h"
	#include "InstrProfilingInternal.h"			#include "InstrProfilingInternal.h"
	#include <string.h>			#include <string.h>

				#define INSTR_PROF_VALUE_PROF_DATA
				#include "InstrProfData.inc"
				void (*FreeHook)(void *) = NULL;
				void* (*CallocHook)(size_t, size_t) = NULL;
				uint32_t VPBufferSize = 0;

	/* The buffer writer is reponsponsible in keeping writer state			/* The buffer writer is reponsponsible in keeping writer state
	* across the call.			* across the call.
	*/			*/
	COMPILER_RT_VISIBILITY uint32_t llvmBufferWriter(ProfDataIOVec *IOVecs,			COMPILER_RT_VISIBILITY uint32_t llvmBufferWriter(ProfDataIOVec *IOVecs,
	uint32_t NumIOVecs,			uint32_t NumIOVecs,
	void **WriterCtx) {			void **WriterCtx) {
	uint32_t I;			uint32_t I;
	char **Buffer = (char **)WriterCtx;			char **Buffer = (char **)WriterCtx;
	for (I = 0; I < NumIOVecs; I++) {			for (I = 0; I < NumIOVecs; I++) {
	size_t Length = IOVecs[I].ElmSize * IOVecs[I].NumElm;			size_t Length = IOVecs[I].ElmSize * IOVecs[I].NumElm;
	memcpy(*Buffer, IOVecs[I].Data, Length);			memcpy(*Buffer, IOVecs[I].Data, Length);
	*Buffer += Length;			*Buffer += Length;
	}			}
	return 0;			return 0;
	}			}

	COMPILER_RT_VISIBILITY int llvmWriteProfData(WriterCallback Writer,			COMPILER_RT_VISIBILITY int llvmWriteProfData(WriterCallback Writer,
	void *WriterCtx,			void *WriterCtx,
	const uint8_t *ValueDataBegin,			ValueProfData **ValueDataArray,
	const uint64_t ValueDataSize) {			const uint64_t ValueDataSize) {
	/* Match logic in __llvm_profile_write_buffer(). */			/* Match logic in __llvm_profile_write_buffer(). */
	const __llvm_profile_data *DataBegin = __llvm_profile_begin_data();			const __llvm_profile_data *DataBegin = __llvm_profile_begin_data();
	const __llvm_profile_data *DataEnd = __llvm_profile_end_data();			const __llvm_profile_data *DataEnd = __llvm_profile_end_data();
	const uint64_t *CountersBegin = __llvm_profile_begin_counters();			const uint64_t *CountersBegin = __llvm_profile_begin_counters();
	const uint64_t *CountersEnd = __llvm_profile_end_counters();			const uint64_t *CountersEnd = __llvm_profile_end_counters();
	const char *NamesBegin = __llvm_profile_begin_names();			const char *NamesBegin = __llvm_profile_begin_names();
	const char *NamesEnd = __llvm_profile_end_names();			const char *NamesEnd = __llvm_profile_end_names();
	return llvmWriteProfDataImpl(Writer, WriterCtx, DataBegin, DataEnd,			return llvmWriteProfDataImpl(Writer, WriterCtx, DataBegin, DataEnd,
	CountersBegin, CountersEnd, ValueDataBegin,			CountersBegin, CountersEnd, ValueDataArray,
	ValueDataSize, NamesBegin, NamesEnd);			ValueDataSize, NamesBegin, NamesEnd);
	}			}

				#define VP_BUFFER_SIZE 8 * 1024
				static int writeValueProfData(WriterCallback Writer, void *WriterCtx,
				ValueProfData **ValueDataBegin,
				uint64_t NumVData) {
				ValueProfData **ValueDataArray = ValueDataBegin;
				char *BufferStart = 0, *Buffer;
				ValueProfData *CurVData;
				uint32_t I = 0, BufferSz;

				if (!ValueDataBegin)
				return 0;

				BufferSz = VPBufferSize ? VPBufferSize : VP_BUFFER_SIZE;
				BufferStart = (char *)CallocHook(BufferSz, sizeof(uint8_t));
				if (!BufferStart)
				return -1;

				uint32_t WriteSize = 0;
				Buffer = BufferStart;
				do {
				CurVData = ValueDataArray[I];
				if (!CurVData) {
				I++;
				continue;
				}

				/* Buffer is full or not large enough, it is time to flush. */
				if (CurVData->TotalSize + WriteSize > BufferSz) {
				if (WriteSize) {
				ProfDataIOVec IO[] = {{BufferStart, sizeof(uint8_t), WriteSize}};
				if (Writer(IO, 1, &WriterCtx))
				return -1;
				WriteSize = 0;
				Buffer = BufferStart;
				}
				/* Special case, bypass the buffer completely. */
				if (CurVData->TotalSize > BufferSz) {
				ProfDataIOVec IO[] = {{CurVData, sizeof(uint8_t), CurVData->TotalSize}};
				if (Writer(IO, 1, &WriterCtx))
				return -1;
				FreeHook(ValueDataArray[I]);
				I++;
				}
				} else {
				/* Write the data to buffer */
				ProfDataIOVec IO[] = {{CurVData, sizeof(uint8_t), CurVData->TotalSize}};
				llvmBufferWriter(IO, 1, (void **)&Buffer);
				WriteSize += CurVData->TotalSize;
				FreeHook(ValueDataArray[I]);
				I++;
				}
				} while (I < NumVData);

				/* Final flush. */
				if (WriteSize) {
				ProfDataIOVec IO[] = {{BufferStart, sizeof(uint8_t), WriteSize}};
				if (Writer(IO, 1, &WriterCtx))
				return -1;
				}

				FreeHook(ValueDataBegin);
				FreeHook(BufferStart);
				return 0;
				}

	COMPILER_RT_VISIBILITY int llvmWriteProfDataImpl(			COMPILER_RT_VISIBILITY int llvmWriteProfDataImpl(
	WriterCallback Writer, void *WriterCtx,			WriterCallback Writer, void *WriterCtx,
	const __llvm_profile_data *DataBegin, const __llvm_profile_data *DataEnd,			const __llvm_profile_data *DataBegin, const __llvm_profile_data *DataEnd,
	const uint64_t *CountersBegin, const uint64_t *CountersEnd,			const uint64_t *CountersBegin, const uint64_t *CountersEnd,
	const uint8_t *ValueDataBegin, const uint64_t ValueDataSize,			ValueProfData **ValueDataBegin, const uint64_t ValueDataSize,
	const char *NamesBegin, const char *NamesEnd) {			const char *NamesBegin, const char *NamesEnd) {

	/* Calculate size of sections. */			/* Calculate size of sections. */
	const uint64_t DataSize = DataEnd - DataBegin;			const uint64_t DataSize = DataEnd - DataBegin;
	const uint64_t CountersSize = CountersEnd - CountersBegin;			const uint64_t CountersSize = CountersEnd - CountersBegin;
	const uint64_t NamesSize = NamesEnd - NamesBegin;			const uint64_t NamesSize = NamesEnd - NamesBegin;
	const uint64_t Padding = __llvm_profile_get_num_padding_bytes(NamesSize);			const uint64_t Padding = __llvm_profile_get_num_padding_bytes(NamesSize);

	/* Enough zeroes for padding. */			/* Enough zeroes for padding. */
	const char Zeroes[sizeof(uint64_t)] = {0};			const char Zeroes[sizeof(uint64_t)] = {0};

	/* Create the header. */			/* Create the header. */
	__llvm_profile_header Header;			__llvm_profile_header Header;

	if (!DataSize)			if (!DataSize)
	return 0;			return 0;

	/* Initialize header struture. */			/* Initialize header struture. */
	#define INSTR_PROF_RAW_HEADER(Type, Name, Init) Header.Name = Init;			#define INSTR_PROF_RAW_HEADER(Type, Name, Init) Header.Name = Init;
	#include "InstrProfData.inc"			#include "InstrProfData.inc"

	/* Write the data. */			/* Write the data. */
	ProfDataIOVec IOVec[] = {			ProfDataIOVec IOVec[] = {{&Header, sizeof(__llvm_profile_header), 1},
	{&Header, sizeof(__llvm_profile_header), 1},
	{DataBegin, sizeof(__llvm_profile_data), DataSize},			{DataBegin, sizeof(__llvm_profile_data), DataSize},
	{CountersBegin, sizeof(uint64_t), CountersSize},			{CountersBegin, sizeof(uint64_t), CountersSize},
	{NamesBegin, sizeof(char), NamesSize},			{NamesBegin, sizeof(uint8_t), NamesSize},
	{Zeroes, sizeof(char), Padding}};			{Zeroes, sizeof(uint8_t), Padding}};
	if (Writer(IOVec, sizeof(IOVec) / sizeof(*IOVec), &WriterCtx))			if (Writer(IOVec, sizeof(IOVec) / sizeof(*IOVec), &WriterCtx))
	return -1;			return -1;
	if (ValueDataBegin) {
	ProfDataIOVec IOVec2[] = {{ValueDataBegin, sizeof(char), ValueDataSize}};			return writeValueProfData(Writer, WriterCtx, ValueDataBegin, DataSize);
	if (Writer(IOVec2, sizeof(IOVec2) / sizeof(*IOVec2), &WriterCtx))
	return -1;
	}
	return 0;
	}			}

compiler-rt/trunk/test/profile/instrprof-value-prof.c

	// RUN: %clang_profgen -O2 -o %t %s			// RUN: %clang_profgen -O2 -o %t %s
	// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t 1			// RUN: env LLVM_PROFILE_FILE=%t.profraw %run %t 1
	// RUN: env LLVM_PROFILE_FILE=%t-2.profraw %run %t			// RUN: env LLVM_PROFILE_FILE=%t-2.profraw %run %t
	// RUN: llvm-profdata merge -o %t.profdata %t.profraw			// RUN: llvm-profdata merge -o %t.profdata %t.profraw
	// RUN: llvm-profdata merge -o %t-2.profdata %t-2.profraw			// RUN: llvm-profdata merge -o %t-2.profdata %t-2.profraw
	// RUN: llvm-profdata merge -o %t-merged.profdata %t.profraw %t-2.profdata			// RUN: llvm-profdata merge -o %t-merged.profdata %t.profraw %t-2.profdata
	// RUN: llvm-profdata show --all-functions -ic-targets %t-2.profdata | FileCheck %s -check-prefix=NO-VALUE			// RUN: llvm-profdata show --all-functions -ic-targets %t-2.profdata | FileCheck %s -check-prefix=NO-VALUE
	// RUN: llvm-profdata show --all-functions -ic-targets %t.profdata | FileCheck %s			// RUN: llvm-profdata show --all-functions -ic-targets %t.profdata | FileCheck %s
	// value profile merging current do sorting based on target values -- this will destroy the order of the target			// value profile merging current do sorting based on target values -- this will destroy the order of the target
	// in the list leading to comparison problem. For now just check a small subset of output.			// in the list leading to comparison problem. For now just check a small subset of output.
	// RUN: llvm-profdata show --all-functions -ic-targets %t-merged.profdata | FileCheck %s -check-prefix=MERGE			// RUN: llvm-profdata show --all-functions -ic-targets %t-merged.profdata | FileCheck %s -check-prefix=MERGE
				//
				// RUN: env LLVM_PROFILE_FILE=%t-3.profraw LLVM_VP_BUFFER_SIZE=1 %run %t 1
				// RUN: env LLVM_PROFILE_FILE=%t-4.profraw LLVM_VP_BUFFER_SIZE=8 %run %t 1
				// RUN: env LLVM_PROFILE_FILE=%t-5.profraw LLVM_VP_BUFFER_SIZE=128 %run %t 1
				// RUN: env LLVM_PROFILE_FILE=%t-6.profraw LLVM_VP_BUFFER_SIZE=1024 %run %t 1
				// RUN: env LLVM_PROFILE_FILE=%t-7.profraw LLVM_VP_BUFFER_SIZE=102400 %run %t 1
				// RUN: llvm-profdata merge -o %t-3.profdata %t-3.profraw
				// RUN: llvm-profdata merge -o %t-4.profdata %t-4.profraw
				// RUN: llvm-profdata merge -o %t-5.profdata %t-5.profraw
				// RUN: llvm-profdata merge -o %t-6.profdata %t-6.profraw
				// RUN: llvm-profdata merge -o %t-7.profdata %t-7.profraw
				// RUN: llvm-profdata show --all-functions -ic-targets %t-3.profdata | FileCheck %s
				// RUN: llvm-profdata show --all-functions -ic-targets %t-4.profdata | FileCheck %s
				// RUN: llvm-profdata show --all-functions -ic-targets %t-5.profdata | FileCheck %s
				// RUN: llvm-profdata show --all-functions -ic-targets %t-6.profdata | FileCheck %s
				// RUN: llvm-profdata show --all-functions -ic-targets %t-7.profdata | FileCheck %s

	#include <stdint.h>			#include <stdint.h>
	#include <stdio.h>			#include <stdio.h>
	#include <stdlib.h>			#include <stdlib.h>
	typedef struct __llvm_profile_data __llvm_profile_data;			typedef struct __llvm_profile_data __llvm_profile_data;
	const __llvm_profile_data *__llvm_profile_begin_data(void);			const __llvm_profile_data *__llvm_profile_begin_data(void);
	const __llvm_profile_data *__llvm_profile_end_data(void);			const __llvm_profile_data *__llvm_profile_end_data(void);
	void __llvm_profile_set_num_value_sites(__llvm_profile_data *Data,			void __llvm_profile_set_num_value_sites(__llvm_profile_data *Data,
	▲ Show 20 Lines • Show All 218 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Remove data races on Data->Values field
ClosedPublic
