This is an archive of the discontinued LLVM Phabricator instance.

Differential D64006

Support 64-bit offsets in utility classes (1/5)
Closed (Public)

Authored by ikudrin on Jul 1 2019, 6:17 AM.

Details

Reviewers

dblaikie
bkramer
probinson
aprantl

Commits

rGf5f35c5cd110: Support 64-bit offsets in utility classes (1/5)
rL368013: Support 64-bit offsets in utility classes (1/5)

Summary

Using 64-bit offsets is required to fully implement 64-bit DWARF.
As these classes are used in many different libraries they should temporarily support both 32- and 64-bit offsets.

Diff Detail

Repository: rL LLVM

Event Timeline

ikudrin created this revision. Jul 1 2019, 6:17 AM

Herald added a project: Restricted Project. Jul 1 2019, 6:17 AM

Herald added subscribers: kristina, aprantl.

aprantl added inline comments. Jul 1 2019, 10:06 AM

include/llvm/Support/DataExtractor.h
82	Why do we need both variants? Would the 64-bit variant alone be sufficient?

ikudrin marked an inline comment as done. Jul 1 2019, 10:19 AM

ikudrin added inline comments.

include/llvm/Support/DataExtractor.h
82	In theory, yes. But the class is widely used in LLVM, and not only to parse DWARF sections. Trying to switch to `uint64_t` everywhere will result in a huge patch which will be hard to review. Hence, I decided to move gradually.

aprantl added inline comments. Jul 1 2019, 1:41 PM

include/llvm/Support/DataExtractor.h
82	I think I'd prefer to review one large (but mostly mechanical) patch over having two overloads that could potentially introduce subtle and hard-to-debug-bugs when, e.g., the 32-bit version is accidentally invoked where the 64-bit version was needed.

dblaikie added inline comments. Jul 1 2019, 1:53 PM

include/llvm/Support/DataExtractor.h
82	I would expect a mismatch like that would be rare - since the transition for one user would probably be fairly pervasive (since you pass around uint32_t* all over the place, so most of those will be forced to change when you change the root of that usage). But yeah, I'd at least want this 32+64 support to be /very/ short lived through a few patches over weeks, not months/years.

Initially, I tried to replace uint32_t offsets with uint64_t. Unfortunately, that is not as mechanical as you might expect. Please look at D64059, which is my current progress with that approach. It is far from complete and it is already overwhelmingly complex.

I identified several problems with that approach:

You have to find all the places where offsets are printed and choose the correct format specifiers. It is easy to overlook some.
Sometimes the values are not passed by pointer, so the compiler is of little help in finding those places.
As the offsets are sometimes calculated from several values, you have to dig deeper to decide which of them should be switched to uint64_t and which should not. That is not mechanical work either.
Some clients just don't need 64-bit offsets, at least for now. But with that approach, they have to be updated as well, further increasing the complexity.
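
To illustrate the first problem in the list, here is a minimal sketch of what changes at a print site when an offset is widened. The helpers `formatOffset32` and `formatOffset64` are hypothetical and are not LLVM APIs; the point is only that the format specifier has to follow the type by hand, and a print path that is missed still compiles.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not an LLVM API): prints a 32-bit offset.
std::string formatOffset32(uint32_t Offset) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%08" PRIx32, Offset);
  return Buf;
}

// The 64-bit counterpart: the specifier changes to PRIx64 together with the type.
std::string formatOffset64(uint64_t Offset) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "0x%08" PRIx64, Offset);
  return Buf;
}

// Shows what is lost when a widened offset still goes through the old 32-bit path.
void checkFormatting() {
  const uint64_t Big = 0x100000010ULL; // does not fit in 32 bits
  assert(formatOffset64(Big) == "0x100000010");
  // Narrowing to 32 bits drops the high bits without any diagnostic at run time.
  assert(formatOffset32(static_cast<uint32_t>(Big)) == "0x00000010");
}
```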

Weighing all that against the small chance of misusing an overloaded method, I'd prefer to move at a slower pace, but with patches that are at least understandable.

Thank you for demonstrating this! I agree that the patch also introduces lots of opportunities for subtle bugs. It also increases the memory footprint in places where it likely won't be needed any time soon, such as 64-bit DWARF parsing support. This patch looks much more palatable to me now and I'd be okay with taking it if @dblaikie is okay with it. Is there danger in using the same function name or should we call the 64-bit variants something more explicit?

In D64006#1566831, @aprantl wrote:

Thank you for demonstrating this! I agree that the patch also introduces lots of opportunities for subtle bugs. It also increases the memory footprint in places where it likely won't be needed any time soon, such as 64-bit DWARF parsing support. This patch looks much more palatable to me now and I'd be okay with taking it if @dblaikie is okay with it. Is there danger in using the same function name or should we call the 64-bit variants something more explicit?

I don't feel /super/ strongly that it be done in one go. But it sounds like @ikudrin is suggesting maybe it's not their goal to migrate from 32 to 64 entirely, and that maybe it's OK to have both APIs? I'd really rather that not be the case. I think if there are negative size impacts for this migration to users of the DataExtractor API, those can be worked around by certain clients continuing to store 32 bit offsets in data structures (but mostly these offsets aren't intended to be updated - so their addresses won't be passed directly to DataExtractor) and copying those into 64 bit values before using them as cursors into DataExtractor.

I'm OK with this patch direction - but only so long as there's a plan in weeks (probably not months, definitely not years) to migrate all callers over to 64 bit cursors.

(I could be convinced otherwise - but I think the bar would/should be pretty high to justify keeping both of these)

In D64006#1566831, @aprantl wrote:

...in places where it likely won't be needed any time soon, such as 64-bit DWARF parsing support.

In fact, I am going to implement 64-bit DWARF support. I started from DataExtractor because changing it seems inevitable for that anyway.

In D64006#1567394, @dblaikie wrote:

But it sounds like @ikudrin is suggesting maybe it's not their goal to migrate from 32 to 64 entirely, and that maybe it's OK to have both APIs?

In the beginning, I also thought that having only one possible type for offsets, uint64_t, is preferable. But thinking of that for some time, I cannot find any strong reason not to keep both variants, apart from some kind of aesthetic sense, which is debatable. I really do not see any new harm which may result from having overloads for both 32- and 64-bit offsets. The only issue I can guess is a potential risk of wrapping a 32-bit offset value, but the current implementation is also affected by that, so nothing new is here.
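
For readers unfamiliar with the wrapping risk mentioned here, a minimal sketch follows. `advanceOffset` is a hypothetical helper, not part of DataExtractor; it only shows that a 32-bit cursor wraps modulo 2^32 where a 64-bit cursor keeps the full value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the cursor position after consuming Size bytes.
template <typename OffsetT>
OffsetT advanceOffset(OffsetT Offset, uint32_t Size) {
  return static_cast<OffsetT>(Offset + Size);
}

void checkWrapping() {
  // Unsigned arithmetic wraps modulo 2^32 for a 32-bit cursor...
  assert(advanceOffset<uint32_t>(0xFFFFFFF0u, 0x20) == 0x10u);
  // ...while a 64-bit cursor keeps the full value.
  assert(advanceOffset<uint64_t>(0xFFFFFFF0u, 0x20) == 0x100000010ULL);
}
```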

Moreover, this is just a utility class. It should provide its users with functionality, not force them to satisfy its whims, especially if that complicates things for them without any noticeable value; I mean, storing 32-bit offsets and creating a temporary 64-bit variable just to call the utility class does not seem aesthetic, too. It is the callers who know how big their data is and which data type for offsets reflects their needs better.

Anyway, as I said, my main intent is to add support for 64-bit DWARF. While there are other users, DataExtractor is mainly used in the DebugInfo/DWARF library and I expect to change it to use 64-bit offsets while implementing that support. Hopefully, that will be done with a bunch of relatively small patches. After that, we can decide what to do with the remaining callers. Unfortunately, I cannot promise that this transition may be done in several weeks. It seems that two to four months is a more realistic estimation.

In D64006#1567868, @ikudrin wrote:

In D64006#1566831, @aprantl wrote:

...in places where it likely won't be needed any time soon, such as 64-bit DWARF parsing support.

In fact, I am going to implement 64-bit DWARF support. I started from DataExtractor because changing it seems inevitable for that anyway.

In D64006#1567394, @dblaikie wrote:

But it sounds like @ikudrin is suggesting maybe it's not their goal to migrate from 32 to 64 entirely, and that maybe it's OK to have both APIs?

In the beginning, I also thought that having only one possible type for offsets, uint64_t, is preferable. But thinking of that for some time, I cannot find any strong reason not to keep both variants, apart from some kind of aesthetic sense, which is debatable. I really do not see any new harm which may result from having overloads for both 32- and 64-bit offsets. The only issue I can guess is a potential risk of wrapping a 32-bit offset value, but the current implementation is also affected by that, so nothing new is here.

I think the main one is code duplication - write once/fix once/etc is valuable.

Moreover, this is just a utility class. It should provide its users with functionality, not force them to satisfy its whims, especially if that complicates things for them without any noticeable value; I mean, storing 32-bit offsets and creating a temporary 64-bit variable just to call the utility class does not seem aesthetic, too. It is the callers who know how big their data is and which data type for offsets reflects their needs better.

I think if we knew then what we know now we'd have built it with 64 bit offsets & it wouldn't've been that much imposition to clients that can use 32 bit offsets.

Did you come across cases where you would need to insert new 64 bit temporaries? My understanding would be that anything storing a lot of offsets in a data structure would be storing fixed offsets that were not intended to be mutated (eg: the DIEs in a DIE tree might contain an offset to the start of the attributes in that DIE) - so you wouldn't want to pass the address of that offset directly to DataExtractor (because it would mutate it, then the DIE in the DIE tree would have an offset that no longer points to the start of the attributes - but somewhere in the middle/end - making it unusable from then on) - so such code would likely copy the offset into a local intended for mutation. The difference now is that local would be 64 bit. That doesn't seem like an imposition to me.
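
A minimal sketch of the pattern described in this comment, using toy types. `ToyExtractor`, `ToyEntry` and `sumAttributes` are hypothetical stand-ins invented for illustration and are not LLVM's classes: the stored offset stays 32 bits wide and immutable, and only the local cursor handed to the reader is 64 bits wide.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a DataExtractor-like reader (hypothetical, not LLVM's class).
struct ToyExtractor {
  std::vector<uint8_t> Data;

  // Reads one byte at *OffsetPtr and advances the offset. On failure it
  // returns 0 and leaves the offset unmodified.
  uint8_t getU8(uint64_t *OffsetPtr) const {
    if (*OffsetPtr >= Data.size())
      return 0;
    return Data[(*OffsetPtr)++];
  }
};

// Hypothetical DIE-like record that keeps a compact, fixed 32-bit offset.
struct ToyEntry {
  uint32_t AttrOffset;
};

// Copies the stored offset into a 64-bit local and mutates only the local.
uint32_t sumAttributes(const ToyExtractor &DE, const ToyEntry &E,
                       unsigned Count) {
  uint64_t Cursor = E.AttrOffset; // widened copy; E.AttrOffset is never touched
  uint32_t Sum = 0;
  for (unsigned I = 0; I < Count; ++I)
    Sum += DE.getU8(&Cursor);
  return Sum;
}

void checkCursorCopy() {
  ToyExtractor DE{{1, 2, 3, 4, 5}};
  ToyEntry E{1};
  assert(sumAttributes(DE, E, 3) == 9u); // reads bytes 2, 3 and 4
  assert(E.AttrOffset == 1u);            // the stored offset is still valid
}
```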

Anyway, as I said, my main intent is to add support for 64-bit DWARF. While there are other users, DataExtractor is mainly used in the DebugInfo/DWARF library and I expect to change it to use 64-bit offsets while implementing that support. Hopefully, that will be done with a bunch of relatively small patches. After that, we can decide what to do with the remaining callers. Unfortunately, I cannot promise that this transition may be done in several weeks. It seems that two to four months is a more realistic estimation.

Fair enough - though would it be possible to prioritize finishing the DataExtractor migration (or demonstrating it is not desirable) before necessarily fleshing out the rest of the DWARF 64 support? I'd be concerned it might be left lingering otherwise.

In D64006#1568562, @dblaikie wrote:

I think the main one is code duplication - write once/fix once/etc is valuable.

In that case, what do you think about templates?

Moreover, this is just a utility class. It should provide its users with functionality, not force them to satisfy its whims, especially if that complicates things for them without any noticeable value; I mean, storing 32-bit offsets and creating a temporary 64-bit variable just to call the utility class does not seem aesthetic, too. It is the callers who know how big their data is and which data type for offsets reflects their needs better.

I think if we knew then what we know now we'd have built it with 64 bit offsets & it wouldn't've been that much imposition to clients that can use 32 bit offsets.

Did you come across cases where you would need to insert new 64 bit temporaries? My understanding would be that anything storing a lot of offsets in a data structure would be storing fixed offsets that were not intended to be mutated (eg: the DIEs in a DIE tree might contain an offset to the start of the attributes in that DIE) - so you wouldn't want to pass the address of that offset directly to DataExtractor (because it would mutate it, then the DIE in the DIE tree would have an offset that no longer points to the start of the attributes - but somewhere in the middle/end - making it unusable from then on) - so such code would likely copy the offset into a local intended for mutation. The difference now is that local would be 64 bit. That doesn't seem like an imposition to me.

Seems you are right. We may come across some unusual cases during the transition, but we can solve them in the corresponding patches.

Anyway, as I said, my main intent is to add support for 64-bit DWARF. While there are other users, DataExtractor is mainly used in the DebugInfo/DWARF library and I expect to change it to use 64-bit offsets while implementing that support. Hopefully, that will be done with a bunch of relatively small patches. After that, we can decide what to do with the remaining callers. Unfortunately, I cannot promise that this transition may be done in several weeks. It seems that two to four months is a more realistic estimation.

Fair enough - though would it be possible to prioritize finishing the DataExtractor migration (or demonstrating it is not desirable) before necessarily fleshing out the rest of the DWARF 64 support? I'd be concerned it might be left lingering otherwise.

Well, in many cases they are connected. You need 64-bit offsets because they can be found in 64-bit DWARF sections. Thus, migrating to 64-bit offsets is a half-way to implement 64-bit DWARF in the DebugInfo/DWARF library, so, I think it is better to migrate class-by-class, adding DWARF64 support consciously, rather than just mechanical replacing 32-bit offsets with 64-bit ones. We have already seen that the mechanical approach does not work well.

Maybe we can postpone applying this patch until we have a whole set for the migration. Surely, it will require some efforts to keep all patches in the actual state, and I would be happy to avoid that. But this might be a reasonable way if you want the migration to be as fast as possible.

In D64006#1568874, @ikudrin wrote:

In D64006#1568562, @dblaikie wrote:

I think the main one is code duplication - write once/fix once/etc is valuable.

In that case, what do you think about templates?

They help, for sure - though I'm not sure it's necessary to have even that complexity (having DataExtractor templated on an integer type for the cursor), at least I don't see it yet, maybe with better understanding I might - but for now it really sounds like we would've built this with uint64_t only if we started from scratch.
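
As a rough sketch of how templates could address the duplication concern: one private template holds the logic and two thin public overloads forward to it. `ToyReader` and `getU8Impl` are hypothetical names; this is not necessarily how D64209 or the committed patch is structured, and it is a different design from templating the whole class on the cursor type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reader, not LLVM's DataExtractor.
class ToyReader {
  const uint8_t *Data;
  size_t Size;

  // Single implementation shared by both offset widths.
  template <typename OffsetT> uint8_t getU8Impl(OffsetT *OffsetPtr) const {
    OffsetT Offset = *OffsetPtr;
    if (Offset >= Size)
      return 0; // out of bounds: offset is left unmodified
    *OffsetPtr = Offset + 1;
    return Data[Offset];
  }

public:
  ToyReader(const uint8_t *D, size_t S) : Data(D), Size(S) {}

  // Thin overloads: the logic is written once, in getU8Impl.
  uint8_t getU8(uint64_t *OffsetPtr) const { return getU8Impl(OffsetPtr); }
  uint8_t getU8(uint32_t *OffsetPtr) const { return getU8Impl(OffsetPtr); }
};

void checkTemplateReader() {
  const uint8_t Bytes[] = {7, 8};
  ToyReader R(Bytes, 2);
  uint32_t O32 = 0;
  uint64_t O64 = 1;
  assert(R.getU8(&O32) == 7 && O32 == 1);
  assert(R.getU8(&O64) == 8 && O64 == 2);
  assert(R.getU8(&O64) == 0 && O64 == 2); // past the end: offset unchanged
}
```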

Moreover, this is just a utility class. It should provide its users with functionality, not force them to satisfy its whims, especially if that complicates things for them without any noticeable value; I mean, storing 32-bit offsets and creating a temporary 64-bit variable just to call the utility class does not seem aesthetic, too. It is the callers who know how big their data is and which data type for offsets reflects their needs better.

I think if we knew then what we know now we'd have built it with 64 bit offsets & it wouldn't've been that much imposition to clients that can use 32 bit offsets.

Did you come across cases where you would need to insert new 64 bit temporaries? My understanding would be that anything storing a lot of offsets in a data structure would be storing fixed offsets that were not intended to be mutated (eg: the DIEs in a DIE tree might contain an offset to the start of the attributes in that DIE) - so you wouldn't want to pass the address of that offset directly to DataExtractor (because it would mutate it, then the DIE in the DIE tree would have an offset that no longer points to the start of the attributes - but somewhere in the middle/end - making it unusable from then on) - so such code would likely copy the offset into a local intended for mutation. The difference now is that local would be 64 bit. That doesn't seem like an imposition to me.

Seems you are right. We may come across some unusual cases during the transition, but we can solve them in the corresponding patches.

Anyway, as I said, my main intent is to add support for 64-bit DWARF. While there are other users, DataExtractor is mainly used in the DebugInfo/DWARF library and I expect to change it to use 64-bit offsets while implementing that support. Hopefully, that will be done with a bunch of relatively small patches. After that, we can decide what to do with the remaining callers. Unfortunately, I cannot promise that this transition may be done in several weeks. It seems that two to four months is a more realistic estimation.

Fair enough - though would it be possible to prioritize finishing the DataExtractor migration (or demonstrating it is not desirable) before necessarily fleshing out the rest of the DWARF 64 support? I'd be concerned it might be left lingering otherwise.

Well, in many cases they are connected. You need 64-bit offsets because they can be found in 64-bit DWARF sections. Thus, migrating to 64-bit offsets is a half-way to implement 64-bit DWARF in the DebugInfo/DWARF library, so, I think it is better to migrate class-by-class, adding DWARF64 support consciously, rather than just mechanical replacing 32-bit offsets with 64-bit ones. We have already seen that the mechanical approach does not work well.

Maybe we can postpone applying this patch until we have a whole set for the migration. Surely, it will require some efforts to keep all patches in the actual state, and I would be happy to avoid that. But this might be a reasonable way if you want the migration to be as fast as possible.

Nah, it's fine - happy to let you & @aprantl carry on here - holding the patches out of tree doesn't buy us anything. The main concern is just that things aren't cleaned up & left in a hybrid state, and that risk exists even if we delay this going in-tree until some slightly later point.

To answer one of @aprantl 's later questions - nah, I don't think these need different names. The overloads are distinct, a pointer to uint32_t can't implicitly convert to/from a pointer to uint64_t - so there doesn't seem to be any great risk of confusion there.
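
A small sketch of why the overloads cannot be confused at a call site. `offsetWidth` is a hypothetical pair of free functions used only to make overload selection observable; the `static_assert` checks the language rule that pointers to the two integer types do not convert to each other implicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical overload pair that reports which variant was selected.
int offsetWidth(uint32_t *) { return 32; }
int offsetWidth(uint64_t *) { return 64; }

// There is no implicit conversion between the two pointer types, so passing
// the address of a uint32_t can never silently pick the uint64_t overload.
static_assert(!std::is_convertible<uint32_t *, uint64_t *>::value,
              "uint32_t* must not convert to uint64_t*");
static_assert(!std::is_convertible<uint64_t *, uint32_t *>::value,
              "uint64_t* must not convert to uint32_t*");

void checkOverloadSelection() {
  uint32_t Small = 0;
  uint64_t Large = 0;
  assert(offsetWidth(&Small) == 32);
  assert(offsetWidth(&Large) == 64);
}
```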

ikudrin added a child revision: D64209: [DWARF] Make DWARFDataExtractor possible to be used with both 32- and 64-bit offsets. Jul 4 2019, 8:03 AM

In D64006#1568894, @dblaikie wrote:

In that case, what do you think about templates?

They help, for sure - though I'm not sure it's necessary to have even that complexity (having DataExtractor templated on an integer type for the cursor), at least I don't see it yet, maybe with better understanding I might - but for now it really sounds like we would've built this with uint64_t only if we started from scratch.

D64209 demonstrates how I see using templates for that task. Which way for these temporary patches do you prefer?

wolfgangp added a subscriber: wolfgangp. Jul 23 2019, 3:49 PM

jhenderson added a subscriber: jhenderson. Jul 25 2019, 1:37 AM

I am very keen for this to go ahead, but unfortunately don't have the personal bandwidth to take on any more reviewing of this sort of thing, sorry!

I have prepared a set of patches to switch LLVM and the sub-projects to use 64-bit offsets with DataExtractor and all other places connected to it directly or indirectly. Please, take a look.

ikudrin added a child revision: D65638: Switch LLVM to use 64-bit offsets (2/5). Aug 2 2019, 1:58 AM

ikudrin removed a child revision: D64209: [DWARF] Make DWARFDataExtractor possible to be used with both 32- and 64-bit offsets. Aug 2 2019, 2:11 AM

Looks good - thanks!

This revision is now accepted and ready to land. Aug 5 2019, 1:51 PM

Closed by commit rL368013: Support 64-bit offsets in utility classes (1/5) (authored by ikudrin). Aug 6 2019, 3:49 AM

This revision was automatically updated to reflect the committed changes.

probinson mentioned this in D63713: Add error handling to the DataExtractor class. Aug 8 2019, 11:51 AM

Revision Contents

Path	Size
include/llvm/Support/DataExtractor.h	56 lines
lib/Support/DataExtractor.cpp	152 lines
unittests/Support/DataExtractorTest.cpp	30 lines

Diff 207284

include/llvm/Support/DataExtractor.h

[... 73 lines not shown ...]
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// A pointer to the C string value in the data. If the offset		/// A pointer to the C string value in the data. If the offset
/// pointed to by \a offset_ptr is out of bounds, or if the		/// pointed to by \a offset_ptr is out of bounds, or if the
/// offset plus the length of the C string is out of bounds,		/// offset plus the length of the C string is out of bounds,
/// NULL will be returned.		/// NULL will be returned.
		const char *getCStr(uint64_t *offset_ptr) const;

		/// \overload const char *getCStr(uint32_t *offset_ptr) const
const char *getCStr(uint32_t *offset_ptr) const;		const char *getCStr(uint32_t *offset_ptr) const;

/// Extract a C string from \a *OffsetPtr.		/// Extract a C string from \a *OffsetPtr.
///		///
/// Returns a StringRef for the C String from the data at the offset		/// Returns a StringRef for the C String from the data at the offset
/// pointed to by \a OffsetPtr. A variable length NULL terminated C		/// pointed to by \a OffsetPtr. A variable length NULL terminated C
/// string will be extracted and the \a OffsetPtr will be		/// string will be extracted and the \a OffsetPtr will be
/// updated with the offset of the byte that follows the NULL		/// updated with the offset of the byte that follows the NULL
/// terminator byte.		/// terminator byte.
///		///
/// \param[in,out] OffsetPtr		/// \param[in,out] OffsetPtr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// \return		/// \return
/// A StringRef for the C string value in the data. If the offset		/// A StringRef for the C string value in the data. If the offset
/// pointed to by \a OffsetPtr is out of bounds, or if the		/// pointed to by \a OffsetPtr is out of bounds, or if the
/// offset plus the length of the C string is out of bounds,		/// offset plus the length of the C string is out of bounds,
/// a default-initialized StringRef will be returned.		/// a default-initialized StringRef will be returned.
		StringRef getCStrRef(uint64_t *OffsetPtr) const;

		/// \overload StringRef getCStrRef(uint32_t *OffsetPtr) const
StringRef getCStrRef(uint32_t *OffsetPtr) const;		StringRef getCStrRef(uint32_t *OffsetPtr) const;

/// Extract an unsigned integer of size \a byte_size from \a		/// Extract an unsigned integer of size \a byte_size from \a
/// *offset_ptr.		/// *offset_ptr.
///		///
/// Extract a single unsigned integer value and update the offset		/// Extract a single unsigned integer value and update the offset
/// pointed to by \a offset_ptr. The size of the extracted integer		/// pointed to by \a offset_ptr. The size of the extracted integer
/// is specified by the \a byte_size argument. \a byte_size should		/// is specified by the \a byte_size argument. \a byte_size should
[... 10 lines not shown ...]
/// unmodified.		/// unmodified.
///		///
/// @param[in] byte_size		/// @param[in] byte_size
/// The size in byte of the integer to extract.		/// The size in byte of the integer to extract.
///		///
/// @return		/// @return
/// The unsigned integer value that was extracted, or zero on		/// The unsigned integer value that was extracted, or zero on
/// failure.		/// failure.
		uint64_t getUnsigned(uint64_t *offset_ptr, uint32_t byte_size) const;

		/// \overload uint64_t getUnsigned(uint32_t *offset_ptr, uint32_t byte_size) const
uint64_t getUnsigned(uint32_t *offset_ptr, uint32_t byte_size) const;		uint64_t getUnsigned(uint32_t *offset_ptr, uint32_t byte_size) const;

/// Extract an signed integer of size \a byte_size from \a *offset_ptr.		/// Extract an signed integer of size \a byte_size from \a *offset_ptr.
///		///
/// Extract a single signed integer value (sign extending if required)		/// Extract a single signed integer value (sign extending if required)
/// and update the offset pointed to by \a offset_ptr. The size of		/// and update the offset pointed to by \a offset_ptr. The size of
/// the extracted integer is specified by the \a byte_size argument.		/// the extracted integer is specified by the \a byte_size argument.
/// \a byte_size should have a value greater than or equal to one		/// \a byte_size should have a value greater than or equal to one
[... 9 lines not shown ...]
/// unmodified.		/// unmodified.
///		///
/// @param[in] size		/// @param[in] size
/// The size in bytes of the integer to extract.		/// The size in bytes of the integer to extract.
///		///
/// @return		/// @return
/// The sign extended signed integer value that was extracted,		/// The sign extended signed integer value that was extracted,
/// or zero on failure.		/// or zero on failure.
		int64_t getSigned(uint64_t *offset_ptr, uint32_t size) const;

		/// \overload int64_t getSigned(uint32_t *offset_ptr, uint32_t size) const
int64_t getSigned(uint32_t *offset_ptr, uint32_t size) const;		int64_t getSigned(uint32_t *offset_ptr, uint32_t size) const;

//------------------------------------------------------------------		//------------------------------------------------------------------
/// Extract an pointer from \a *offset_ptr.		/// Extract an pointer from \a *offset_ptr.
///		///
/// Extract a single pointer from the data and update the offset		/// Extract a single pointer from the data and update the offset
/// pointed to by \a offset_ptr. The size of the extracted pointer		/// pointed to by \a offset_ptr. The size of the extracted pointer
/// is \a getAddressSize(), so the address size has to be		/// is \a getAddressSize(), so the address size has to be
/// set correctly prior to extracting any pointer values.		/// set correctly prior to extracting any pointer values.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted pointer value as a 64 integer.		/// The extracted pointer value as a 64 integer.
		uint64_t getAddress(uint64_t *offset_ptr) const {
		return getUnsigned(offset_ptr, AddressSize);
		}

		/// \overload uint64_t getAddress(uint32_t *offset_ptr) const
uint64_t getAddress(uint32_t *offset_ptr) const {		uint64_t getAddress(uint32_t *offset_ptr) const {
return getUnsigned(offset_ptr, AddressSize);		return getUnsigned(offset_ptr, AddressSize);
}		}

/// Extract a uint8_t value from \a *offset_ptr.		/// Extract a uint8_t value from \a *offset_ptr.
///		///
/// Extract a single uint8_t from the binary data at the offset		/// Extract a single uint8_t from the binary data at the offset
/// pointed to by \a offset_ptr, and advance the offset on success.		/// pointed to by \a offset_ptr, and advance the offset on success.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted uint8_t value.		/// The extracted uint8_t value.
		uint8_t getU8(uint64_t *offset_ptr) const;

		/// \overload uint8_t getU8(uint32_t *offset_ptr) const
uint8_t getU8(uint32_t *offset_ptr) const;		uint8_t getU8(uint32_t *offset_ptr) const;

/// Extract \a count uint8_t values from \a *offset_ptr.		/// Extract \a count uint8_t values from \a *offset_ptr.
///		///
/// Extract \a count uint8_t values from the binary data at the		/// Extract \a count uint8_t values from the binary data at the
/// offset pointed to by \a offset_ptr, and advance the offset on		/// offset pointed to by \a offset_ptr, and advance the offset on
/// success. The extracted values are copied into \a dst.		/// success. The extracted values are copied into \a dst.
///		///
[... 9 lines not shown ...]
/// be large enough to hold all requested data.		/// be large enough to hold all requested data.
///		///
/// @param[in] count		/// @param[in] count
/// The number of uint8_t values to extract.		/// The number of uint8_t values to extract.
///		///
/// @return		/// @return
/// \a dst if all values were properly extracted and copied,		/// \a dst if all values were properly extracted and copied,
/// NULL otherise.		/// NULL otherise.
		uint8_t *getU8(uint64_t *offset_ptr, uint8_t *dst, uint32_t count) const;

		/// \overload uint8_t *getU8(uint32_t *offset_ptr, uint8_t *dst, uint32_t count) const
uint8_t *getU8(uint32_t *offset_ptr, uint8_t *dst, uint32_t count) const;		uint8_t *getU8(uint32_t *offset_ptr, uint8_t *dst, uint32_t count) const;

//------------------------------------------------------------------		//------------------------------------------------------------------
/// Extract a uint16_t value from \a *offset_ptr.		/// Extract a uint16_t value from \a *offset_ptr.
///		///
/// Extract a single uint16_t from the binary data at the offset		/// Extract a single uint16_t from the binary data at the offset
/// pointed to by \a offset_ptr, and update the offset on success.		/// pointed to by \a offset_ptr, and update the offset on success.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted uint16_t value.		/// The extracted uint16_t value.
//------------------------------------------------------------------		//------------------------------------------------------------------
		uint16_t getU16(uint64_t *offset_ptr) const;

		/// \overload uint16_t getU16(uint32_t *offset_ptr) const
uint16_t getU16(uint32_t *offset_ptr) const;		uint16_t getU16(uint32_t *offset_ptr) const;

/// Extract \a count uint16_t values from \a *offset_ptr.		/// Extract \a count uint16_t values from \a *offset_ptr.
///		///
/// Extract \a count uint16_t values from the binary data at the		/// Extract \a count uint16_t values from the binary data at the
/// offset pointed to by \a offset_ptr, and advance the offset on		/// offset pointed to by \a offset_ptr, and advance the offset on
/// success. The extracted values are copied into \a dst.		/// success. The extracted values are copied into \a dst.
///		///
[9 lines hidden]		[9 lines hidden]
/// be large enough to hold all requested data.		/// be large enough to hold all requested data.
///		///
/// @param[in] count		/// @param[in] count
/// The number of uint16_t values to extract.		/// The number of uint16_t values to extract.
///		///
/// @return		/// @return
/// \a dst if all values were properly extracted and copied,		/// \a dst if all values were properly extracted and copied,
/// NULL otherwise.		/// NULL otherwise.
		uint16_t *getU16(uint64_t *offset_ptr, uint16_t *dst, uint32_t count) const;

		/// \overload uint16_t *getU16(uint32_t *offset_ptr, uint16_t *dst, uint32_t count) const
uint16_t *getU16(uint32_t *offset_ptr, uint16_t *dst, uint32_t count) const;		uint16_t *getU16(uint32_t *offset_ptr, uint16_t *dst, uint32_t count) const;

/// Extract a 24-bit unsigned value from \a *offset_ptr and return it		/// Extract a 24-bit unsigned value from \a *offset_ptr and return it
/// in a uint32_t.		/// in a uint32_t.
///		///
/// Extract 3 bytes from the binary data at the offset pointed to by		/// Extract 3 bytes from the binary data at the offset pointed to by
/// \a offset_ptr, construct a uint32_t from them and update the offset		/// \a offset_ptr, construct a uint32_t from them and update the offset
/// on success.		/// on success.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the 3 bytes if the value is extracted correctly. If the offset		/// by the 3 bytes if the value is extracted correctly. If the offset
/// is out of bounds or there are not enough bytes to extract this value,		/// is out of bounds or there are not enough bytes to extract this value,
/// the offset will be left unmodified.		/// the offset will be left unmodified.
///		///
/// @return		/// @return
/// The extracted 24-bit value represented in a uint32_t.		/// The extracted 24-bit value represented in a uint32_t.
		uint32_t getU24(uint64_t *offset_ptr) const;

		/// \overload uint32_t getU24(uint32_t *offset_ptr) const
uint32_t getU24(uint32_t *offset_ptr) const;		uint32_t getU24(uint32_t *offset_ptr) const;

/// Extract a uint32_t value from \a *offset_ptr.		/// Extract a uint32_t value from \a *offset_ptr.
///		///
/// Extract a single uint32_t from the binary data at the offset		/// Extract a single uint32_t from the binary data at the offset
/// pointed to by \a offset_ptr, and update the offset on success.		/// pointed to by \a offset_ptr, and update the offset on success.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted uint32_t value.		/// The extracted uint32_t value.
		uint32_t getU32(uint64_t *offset_ptr) const;

		/// \overload uint32_t getU32(uint32_t *offset_ptr) const
uint32_t getU32(uint32_t *offset_ptr) const;		uint32_t getU32(uint32_t *offset_ptr) const;

/// Extract \a count uint32_t values from \a *offset_ptr.		/// Extract \a count uint32_t values from \a *offset_ptr.
///		///
/// Extract \a count uint32_t values from the binary data at the		/// Extract \a count uint32_t values from the binary data at the
/// offset pointed to by \a offset_ptr, and advance the offset on		/// offset pointed to by \a offset_ptr, and advance the offset on
/// success. The extracted values are copied into \a dst.		/// success. The extracted values are copied into \a dst.
///		///
[9 lines hidden]		[9 lines hidden]
/// be large enough to hold all requested data.		/// be large enough to hold all requested data.
///		///
/// @param[in] count		/// @param[in] count
/// The number of uint32_t values to extract.		/// The number of uint32_t values to extract.
///		///
/// @return		/// @return
/// \a dst if all values were properly extracted and copied,		/// \a dst if all values were properly extracted and copied,
/// NULL otherwise.		/// NULL otherwise.
		uint32_t *getU32(uint64_t *offset_ptr, uint32_t *dst, uint32_t count) const;

		/// \overload uint32_t *getU32(uint32_t *offset_ptr, uint32_t *dst, uint32_t count) const
uint32_t *getU32(uint32_t *offset_ptr, uint32_t *dst, uint32_t count) const;		uint32_t *getU32(uint32_t *offset_ptr, uint32_t *dst, uint32_t count) const;

/// Extract a uint64_t value from \a *offset_ptr.		/// Extract a uint64_t value from \a *offset_ptr.
///		///
/// Extract a single uint64_t from the binary data at the offset		/// Extract a single uint64_t from the binary data at the offset
/// pointed to by \a offset_ptr, and update the offset on success.		/// pointed to by \a offset_ptr, and update the offset on success.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted uint64_t value.		/// The extracted uint64_t value.
		uint64_t getU64(uint64_t *offset_ptr) const;

		/// \overload uint64_t getU64(uint32_t *offset_ptr) const
uint64_t getU64(uint32_t *offset_ptr) const;		uint64_t getU64(uint32_t *offset_ptr) const;

/// Extract \a count uint64_t values from \a *offset_ptr.		/// Extract \a count uint64_t values from \a *offset_ptr.
///		///
/// Extract \a count uint64_t values from the binary data at the		/// Extract \a count uint64_t values from the binary data at the
/// offset pointed to by \a offset_ptr, and advance the offset on		/// offset pointed to by \a offset_ptr, and advance the offset on
/// success. The extracted values are copied into \a dst.		/// success. The extracted values are copied into \a dst.
///		///
[9 lines hidden]		[9 lines hidden]
/// be large enough to hold all requested data.		/// be large enough to hold all requested data.
///		///
/// @param[in] count		/// @param[in] count
/// The number of uint64_t values to extract.		/// The number of uint64_t values to extract.
///		///
/// @return		/// @return
/// \a dst if all values were properly extracted and copied,		/// \a dst if all values were properly extracted and copied,
/// NULL otherwise.		/// NULL otherwise.
		uint64_t *getU64(uint64_t *offset_ptr, uint64_t *dst, uint32_t count) const;

		/// \overload uint64_t *getU64(uint32_t *offset_ptr, uint64_t *dst, uint32_t count) const
uint64_t *getU64(uint32_t *offset_ptr, uint64_t *dst, uint32_t count) const;		uint64_t *getU64(uint32_t *offset_ptr, uint64_t *dst, uint32_t count) const;

/// Extract a signed LEB128 value from \a *offset_ptr.		/// Extract a signed LEB128 value from \a *offset_ptr.
///		///
/// Extracts a signed LEB128 number from this object's data		/// Extracts a signed LEB128 number from this object's data
/// starting at the offset pointed to by \a offset_ptr. The offset		/// starting at the offset pointed to by \a offset_ptr. The offset
/// pointed to by \a offset_ptr will be updated with the offset of		/// pointed to by \a offset_ptr will be updated with the offset of
/// the byte following the last extracted byte.		/// the byte following the last extracted byte.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted signed integer value.		/// The extracted signed integer value.
		int64_t getSLEB128(uint64_t *offset_ptr) const;

		/// \overload int64_t getSLEB128(uint32_t *offset_ptr) const
int64_t getSLEB128(uint32_t *offset_ptr) const;		int64_t getSLEB128(uint32_t *offset_ptr) const;

/// Extract an unsigned LEB128 value from \a *offset_ptr.		/// Extract an unsigned LEB128 value from \a *offset_ptr.
///		///
/// Extracts an unsigned LEB128 number from this object's data		/// Extracts an unsigned LEB128 number from this object's data
/// starting at the offset pointed to by \a offset_ptr. The offset		/// starting at the offset pointed to by \a offset_ptr. The offset
/// pointed to by \a offset_ptr will be updated with the offset of		/// pointed to by \a offset_ptr will be updated with the offset of
/// the byte following the last extracted byte.		/// the byte following the last extracted byte.
///		///
/// @param[in,out] offset_ptr		/// @param[in,out] offset_ptr
/// A pointer to an offset within the data that will be advanced		/// A pointer to an offset within the data that will be advanced
/// by the appropriate number of bytes if the value is extracted		/// by the appropriate number of bytes if the value is extracted
/// correctly. If the offset is out of bounds or there are not		/// correctly. If the offset is out of bounds or there are not
/// enough bytes to extract this value, the offset will be left		/// enough bytes to extract this value, the offset will be left
/// unmodified.		/// unmodified.
///		///
/// @return		/// @return
/// The extracted unsigned integer value.		/// The extracted unsigned integer value.
		uint64_t getULEB128(uint64_t *offset_ptr) const;

		/// \overload uint64_t getULEB128(uint32_t *offset_ptr) const
uint64_t getULEB128(uint32_t *offset_ptr) const;		uint64_t getULEB128(uint32_t *offset_ptr) const;

/// Test the validity of \a offset.		/// Test the validity of \a offset.
///		///
/// @return		/// @return
/// \b true if \a offset is a valid offset into the data in this		/// \b true if \a offset is a valid offset into the data in this
/// object, \b false otherwise.		/// object, \b false otherwise.
bool isValidOffset(uint32_t offset) const { return Data.size() > offset; }		bool isValidOffset(uint64_t offset) const { return Data.size() > offset; }

/// Test the availability of \a length bytes of data from \a offset.		/// Test the availability of \a length bytes of data from \a offset.
///		///
/// @return		/// @return
/// \b true if \a offset is a valid offset and there are \a		/// \b true if \a offset is a valid offset and there are \a
/// length bytes available at that offset, \b false otherwise.		/// length bytes available at that offset, \b false otherwise.
bool isValidOffsetForDataOfSize(uint32_t offset, uint32_t length) const {		bool isValidOffsetForDataOfSize(uint64_t offset, uint64_t length) const {
return offset + length >= offset && isValidOffset(offset + length - 1);		return offset + length >= offset && isValidOffset(offset + length - 1);
}		}

/// Test the availability of enough bytes of data for a pointer from		/// Test the availability of enough bytes of data for a pointer from
/// \a offset. The size of a pointer is \a getAddressSize().		/// \a offset. The size of a pointer is \a getAddressSize().
///		///
/// @return		/// @return
/// \b true if \a offset is a valid offset and there are enough		/// \b true if \a offset is a valid offset and there are enough
/// bytes for a pointer available at that offset, \b false		/// bytes for a pointer available at that offset, \b false
/// otherwise.		/// otherwise.
bool isValidOffsetForAddress(uint32_t offset) const {		bool isValidOffsetForAddress(uint64_t offset) const {
return isValidOffsetForDataOfSize(offset, AddressSize);		return isValidOffsetForDataOfSize(offset, AddressSize);
}		}
};		};

} // namespace llvm		} // namespace llvm

#endif		#endif
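
The range check in `isValidOffsetForDataOfSize` above depends on unsigned wraparound: if `offset + length` overflows, the sum is smaller than `offset` and the range is rejected. A minimal standalone sketch of that check follows. The name `isValidRange` and the plain `size` parameter (standing in for `Data.size()`) are assumptions made for illustration and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone helper mirroring the check in the header above.
// `size` stands in for Data.size(); it is not part of DataExtractor.
constexpr bool isValidRange(uint64_t size, uint64_t offset, uint64_t length) {
  // Unsigned addition wraps on overflow, so a wrapped sum is < offset.
  // The second clause requires the last byte of the range to be in bounds.
  return offset + length >= offset && size > offset + length - 1;
}

// A 4-byte read at offset 4 fits in an 8-byte buffer.
static_assert(isValidRange(8, 4, 4), "range inside the buffer");
// An offset near UINT64_MAX wraps around and is rejected.
static_assert(!isValidRange(8, UINT64_MAX - 1, 5), "wrapped range");
```

With 64-bit parameters, an argument such as `-2U` (an `unsigned int` value) is widened before the addition, so the sum no longer wraps; a range like that is still rejected, but by the bounds comparison rather than by the wraparound test.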

lib/Support/DataExtractor.cpp

	//===-- DataExtractor.cpp -------------------------------------------------===//			//===-- DataExtractor.cpp -------------------------------------------------===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "llvm/Support/DataExtractor.h"			#include "llvm/Support/DataExtractor.h"
	#include "llvm/Support/ErrorHandling.h"			#include "llvm/Support/ErrorHandling.h"
	#include "llvm/Support/Host.h"			#include "llvm/Support/Host.h"
	#include "llvm/Support/SwapByteOrder.h"			#include "llvm/Support/SwapByteOrder.h"
	#include "llvm/Support/LEB128.h"			#include "llvm/Support/LEB128.h"
	using namespace llvm;			using namespace llvm;

	template <typename T>			template <typename T, typename TOff>
	static T getU(uint32_t *offset_ptr, const DataExtractor *de,			static T getU(TOff *offset_ptr, const DataExtractor *de,
	bool isLittleEndian, const char *Data) {			bool isLittleEndian, const char *Data) {
	T val = 0;			T val = 0;
	uint32_t offset = *offset_ptr;			TOff offset = *offset_ptr;
	if (de->isValidOffsetForDataOfSize(offset, sizeof(val))) {			if (de->isValidOffsetForDataOfSize(offset, sizeof(val))) {
	std::memcpy(&val, &Data[offset], sizeof(val));			std::memcpy(&val, &Data[offset], sizeof(val));
	if (sys::IsLittleEndianHost != isLittleEndian)			if (sys::IsLittleEndianHost != isLittleEndian)
	sys::swapByteOrder(val);			sys::swapByteOrder(val);

	// Advance the offset			// Advance the offset
	*offset_ptr += sizeof(val);			*offset_ptr += sizeof(val);
	}			}
	return val;			return val;
	}			}

	template <typename T>			template <typename T, typename TOff>
	static T *getUs(uint32_t *offset_ptr, T *dst, uint32_t count,			static T *getUs(TOff *offset_ptr, T *dst, uint32_t count,
	const DataExtractor *de, bool isLittleEndian, const char *Data){			const DataExtractor *de, bool isLittleEndian, const char *Data){
	uint32_t offset = *offset_ptr;			TOff offset = *offset_ptr;

	if (count > 0 && de->isValidOffsetForDataOfSize(offset, sizeof(*dst)*count)) {			if (count > 0 && de->isValidOffsetForDataOfSize(offset, sizeof(*dst)*count)) {
	for (T *value_ptr = dst, *end = dst + count; value_ptr != end;			for (T *value_ptr = dst, *end = dst + count; value_ptr != end;
	++value_ptr, offset += sizeof(*dst))			++value_ptr, offset += sizeof(*dst))
	*value_ptr = getU<T>(offset_ptr, de, isLittleEndian, Data);			*value_ptr = getU<T>(offset_ptr, de, isLittleEndian, Data);
	// Advance the offset			// Advance the offset
	*offset_ptr = offset;			*offset_ptr = offset;
	// Return a non-NULL pointer to the converted data as an indicator of			// Return a non-NULL pointer to the converted data as an indicator of
	// success			// success
	return dst;			return dst;
	}			}
	return nullptr;			return nullptr;
	}			}

				uint8_t DataExtractor::getU8(uint64_t *offset_ptr) const {
				return getU<uint8_t>(offset_ptr, this, IsLittleEndian, Data.data());
				}

	uint8_t DataExtractor::getU8(uint32_t *offset_ptr) const {			uint8_t DataExtractor::getU8(uint32_t *offset_ptr) const {
	return getU<uint8_t>(offset_ptr, this, IsLittleEndian, Data.data());			return getU<uint8_t>(offset_ptr, this, IsLittleEndian, Data.data());
	}			}

	uint8_t *			uint8_t *
				DataExtractor::getU8(uint64_t *offset_ptr, uint8_t *dst, uint32_t count) const {
				return getUs<uint8_t>(offset_ptr, dst, count, this, IsLittleEndian,
				Data.data());
				}

				uint8_t *
	DataExtractor::getU8(uint32_t *offset_ptr, uint8_t *dst, uint32_t count) const {			DataExtractor::getU8(uint32_t *offset_ptr, uint8_t *dst, uint32_t count) const {
	return getUs<uint8_t>(offset_ptr, dst, count, this, IsLittleEndian,			return getUs<uint8_t>(offset_ptr, dst, count, this, IsLittleEndian,
	Data.data());			Data.data());
	}			}

				uint16_t DataExtractor::getU16(uint64_t *offset_ptr) const {
				return getU<uint16_t>(offset_ptr, this, IsLittleEndian, Data.data());
				}

	uint16_t DataExtractor::getU16(uint32_t *offset_ptr) const {			uint16_t DataExtractor::getU16(uint32_t *offset_ptr) const {
	return getU<uint16_t>(offset_ptr, this, IsLittleEndian, Data.data());			return getU<uint16_t>(offset_ptr, this, IsLittleEndian, Data.data());
	}			}

				uint16_t *DataExtractor::getU16(uint64_t *offset_ptr, uint16_t *dst,
				uint32_t count) const {
				return getUs<uint16_t>(offset_ptr, dst, count, this, IsLittleEndian,
				Data.data());
				}

	uint16_t *DataExtractor::getU16(uint32_t *offset_ptr, uint16_t *dst,			uint16_t *DataExtractor::getU16(uint32_t *offset_ptr, uint16_t *dst,
	uint32_t count) const {			uint32_t count) const {
	return getUs<uint16_t>(offset_ptr, dst, count, this, IsLittleEndian,			return getUs<uint16_t>(offset_ptr, dst, count, this, IsLittleEndian,
	Data.data());			Data.data());
	}			}

				uint32_t DataExtractor::getU24(uint64_t *offset_ptr) const {
				uint24_t ExtractedVal =
				getU<uint24_t>(offset_ptr, this, IsLittleEndian, Data.data());
				// The 3 bytes are in the correct byte order for the host.
				return ExtractedVal.getAsUint32(sys::IsLittleEndianHost);
				}

	uint32_t DataExtractor::getU24(uint32_t *offset_ptr) const {			uint32_t DataExtractor::getU24(uint32_t *offset_ptr) const {
	uint24_t ExtractedVal =			uint24_t ExtractedVal =
	getU<uint24_t>(offset_ptr, this, IsLittleEndian, Data.data());			getU<uint24_t>(offset_ptr, this, IsLittleEndian, Data.data());
	// The 3 bytes are in the correct byte order for the host.			// The 3 bytes are in the correct byte order for the host.
	return ExtractedVal.getAsUint32(sys::IsLittleEndianHost);			return ExtractedVal.getAsUint32(sys::IsLittleEndianHost);
	}			}

				uint32_t DataExtractor::getU32(uint64_t *offset_ptr) const {
				return getU<uint32_t>(offset_ptr, this, IsLittleEndian, Data.data());
				}

	uint32_t DataExtractor::getU32(uint32_t *offset_ptr) const {			uint32_t DataExtractor::getU32(uint32_t *offset_ptr) const {
	return getU<uint32_t>(offset_ptr, this, IsLittleEndian, Data.data());			return getU<uint32_t>(offset_ptr, this, IsLittleEndian, Data.data());
	}			}

				uint32_t *DataExtractor::getU32(uint64_t *offset_ptr, uint32_t *dst,
				uint32_t count) const {
				return getUs<uint32_t>(offset_ptr, dst, count, this, IsLittleEndian,
				Data.data());
				}

	uint32_t *DataExtractor::getU32(uint32_t *offset_ptr, uint32_t *dst,			uint32_t *DataExtractor::getU32(uint32_t *offset_ptr, uint32_t *dst,
	uint32_t count) const {			uint32_t count) const {
	return getUs<uint32_t>(offset_ptr, dst, count, this, IsLittleEndian,			return getUs<uint32_t>(offset_ptr, dst, count, this, IsLittleEndian,
	Data.data());			Data.data());
	}			}

				uint64_t DataExtractor::getU64(uint64_t *offset_ptr) const {
				return getU<uint64_t>(offset_ptr, this, IsLittleEndian, Data.data());
				}

	uint64_t DataExtractor::getU64(uint32_t *offset_ptr) const {			uint64_t DataExtractor::getU64(uint32_t *offset_ptr) const {
	return getU<uint64_t>(offset_ptr, this, IsLittleEndian, Data.data());			return getU<uint64_t>(offset_ptr, this, IsLittleEndian, Data.data());
	}			}

				uint64_t *DataExtractor::getU64(uint64_t *offset_ptr, uint64_t *dst,
				uint32_t count) const {
				return getUs<uint64_t>(offset_ptr, dst, count, this, IsLittleEndian,
				Data.data());
				}

	uint64_t *DataExtractor::getU64(uint32_t *offset_ptr, uint64_t *dst,			uint64_t *DataExtractor::getU64(uint32_t *offset_ptr, uint64_t *dst,
	uint32_t count) const {			uint32_t count) const {
	return getUs<uint64_t>(offset_ptr, dst, count, this, IsLittleEndian,			return getUs<uint64_t>(offset_ptr, dst, count, this, IsLittleEndian,
	Data.data());			Data.data());
	}			}

	uint64_t			template <typename TOff>
	DataExtractor::getUnsigned(uint32_t *offset_ptr, uint32_t byte_size) const {			static uint64_t getUnsigned(TOff *offset_ptr, uint32_t byte_size,
				const DataExtractor *de) {
	switch (byte_size) {			switch (byte_size) {
	case 1:			case 1:
	return getU8(offset_ptr);			return de->getU8(offset_ptr);
	case 2:			case 2:
	return getU16(offset_ptr);			return de->getU16(offset_ptr);
	case 4:			case 4:
	return getU32(offset_ptr);			return de->getU32(offset_ptr);
	case 8:			case 8:
	return getU64(offset_ptr);			return de->getU64(offset_ptr);
	}			}
	llvm_unreachable("getUnsigned unhandled case!");			llvm_unreachable("getUnsigned unhandled case!");
	}			}

	int64_t			uint64_t
	DataExtractor::getSigned(uint32_t *offset_ptr, uint32_t byte_size) const {			DataExtractor::getUnsigned(uint64_t *offset_ptr, uint32_t byte_size) const {
				return ::getUnsigned(offset_ptr, byte_size, this);
				}

				uint64_t
				DataExtractor::getUnsigned(uint32_t *offset_ptr, uint32_t byte_size) const {
				return ::getUnsigned(offset_ptr, byte_size, this);
				}

				template <typename TOff>
				static int64_t getSigned(TOff *offset_ptr, uint32_t byte_size,
				const DataExtractor *de) {
	switch (byte_size) {			switch (byte_size) {
	case 1:			case 1:
	return (int8_t)getU8(offset_ptr);			return (int8_t)de->getU8(offset_ptr);
	case 2:			case 2:
	return (int16_t)getU16(offset_ptr);			return (int16_t)de->getU16(offset_ptr);
	case 4:			case 4:
	return (int32_t)getU32(offset_ptr);			return (int32_t)de->getU32(offset_ptr);
	case 8:			case 8:
	return (int64_t)getU64(offset_ptr);			return (int64_t)de->getU64(offset_ptr);
	}			}
	llvm_unreachable("getSigned unhandled case!");			llvm_unreachable("getSigned unhandled case!");
	}			}

	const char *DataExtractor::getCStr(uint32_t *offset_ptr) const {			int64_t
	uint32_t offset = *offset_ptr;			DataExtractor::getSigned(uint64_t *offset_ptr, uint32_t byte_size) const {
				return ::getSigned(offset_ptr, byte_size, this);
				}

				int64_t
				DataExtractor::getSigned(uint32_t *offset_ptr, uint32_t byte_size) const {
				return ::getSigned(offset_ptr, byte_size, this);
				}

				template <typename TOff>
				static const char *getCStr(TOff *offset_ptr, const StringRef &Data) {
				TOff offset = *offset_ptr;
	StringRef::size_type pos = Data.find('\0', offset);			StringRef::size_type pos = Data.find('\0', offset);
	if (pos != StringRef::npos) {			if (pos != StringRef::npos) {
	*offset_ptr = pos + 1;			*offset_ptr = pos + 1;
	return Data.data() + offset;			return Data.data() + offset;
	}			}
	return nullptr;			return nullptr;
	}			}

	StringRef DataExtractor::getCStrRef(uint32_t *OffsetPtr) const {			const char *DataExtractor::getCStr(uint64_t *offset_ptr) const {
	uint32_t Start = *OffsetPtr;			return ::getCStr(offset_ptr, Data);
				}

				const char *DataExtractor::getCStr(uint32_t *offset_ptr) const {
				return ::getCStr(offset_ptr, Data);
				}

				template <typename TOff>
				static StringRef getCStrRef(TOff *OffsetPtr, const StringRef &Data) {
				TOff Start = *OffsetPtr;
	StringRef::size_type Pos = Data.find('\0', Start);			StringRef::size_type Pos = Data.find('\0', Start);
	if (Pos != StringRef::npos) {			if (Pos != StringRef::npos) {
	*OffsetPtr = Pos + 1;			*OffsetPtr = Pos + 1;
	return StringRef(Data.data() + Start, Pos - Start);			return StringRef(Data.data() + Start, Pos - Start);
	}			}
	return StringRef();			return StringRef();
	}			}

	uint64_t DataExtractor::getULEB128(uint32_t *offset_ptr) const {			StringRef DataExtractor::getCStrRef(uint64_t *OffsetPtr) const {
				return ::getCStrRef(OffsetPtr, Data);
				}

				StringRef DataExtractor::getCStrRef(uint32_t *OffsetPtr) const {
				return ::getCStrRef(OffsetPtr, Data);
				}

				template <typename TOff>
				static uint64_t getULEB128(TOff *offset_ptr, const StringRef &Data) {
	assert(*offset_ptr <= Data.size());			assert(*offset_ptr <= Data.size());

	const char *error;			const char *error;
	unsigned bytes_read;			unsigned bytes_read;
	uint64_t result = decodeULEB128(			uint64_t result = decodeULEB128(
	reinterpret_cast<const uint8_t *>(Data.data() + *offset_ptr), &bytes_read,			reinterpret_cast<const uint8_t *>(Data.data() + *offset_ptr), &bytes_read,
	reinterpret_cast<const uint8_t *>(Data.data() + Data.size()), &error);			reinterpret_cast<const uint8_t *>(Data.data() + Data.size()), &error);
	if (error)			if (error)
	return 0;			return 0;
	*offset_ptr += bytes_read;			*offset_ptr += bytes_read;
	return result;			return result;
	}			}

	int64_t DataExtractor::getSLEB128(uint32_t *offset_ptr) const {			uint64_t DataExtractor::getULEB128(uint64_t *offset_ptr) const {
				return ::getULEB128(offset_ptr, Data);
				}

				uint64_t DataExtractor::getULEB128(uint32_t *offset_ptr) const {
				return ::getULEB128(offset_ptr, Data);
				}

				template <typename TOff>
				static int64_t getSLEB128(TOff *offset_ptr, const StringRef &Data) {
	assert(*offset_ptr <= Data.size());			assert(*offset_ptr <= Data.size());

	const char *error;			const char *error;
	unsigned bytes_read;			unsigned bytes_read;
	int64_t result = decodeSLEB128(			int64_t result = decodeSLEB128(
	reinterpret_cast<const uint8_t *>(Data.data() + *offset_ptr), &bytes_read,			reinterpret_cast<const uint8_t *>(Data.data() + *offset_ptr), &bytes_read,
	reinterpret_cast<const uint8_t *>(Data.data() + Data.size()), &error);			reinterpret_cast<const uint8_t *>(Data.data() + Data.size()), &error);
	if (error)			if (error)
	return 0;			return 0;
	*offset_ptr += bytes_read;			*offset_ptr += bytes_read;
	return result;			return result;
	}			}

				int64_t DataExtractor::getSLEB128(uint64_t *offset_ptr) const {
				return ::getSLEB128(offset_ptr, Data);
				}

				int64_t DataExtractor::getSLEB128(uint32_t *offset_ptr) const {
				return ::getSLEB128(offset_ptr, Data);
				}
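
The pattern used throughout this file is a static helper templated on the offset type, with thin 32-bit and 64-bit member overloads forwarding to it. The sketch below illustrates that pattern for ULEB128 decoding without depending on LLVM. The function name `readULEB128`, the raw buffer parameters, and the hand-written decoding loop are assumptions for illustration only; the patch itself delegates to `decodeULEB128`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, templated on the offset type the way the static
// helpers in the diff are. On success it advances *offset_ptr past the
// encoded value; on truncated or overlong input it returns 0 and leaves
// *offset_ptr unmodified.
template <typename TOff>
uint64_t readULEB128(const uint8_t *data, size_t size, TOff *offset_ptr) {
  uint64_t result = 0;
  unsigned shift = 0;
  TOff pos = *offset_ptr;
  while (pos < size && shift < 64) {
    uint8_t byte = data[pos++];
    // The low 7 bits carry the payload, least significant group first.
    result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) { // A clear high bit marks the last byte.
      *offset_ptr = pos;
      return result;
    }
    shift += 7;
  }
  return 0;
}
```

Called with a `uint32_t *` or a `uint64_t *`, the same template body serves both offset widths, which is what allows the two public overloads to coexist during the transition.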

unittests/Support/DataExtractorTest.cpp

//===- llvm/unittest/Support/DataExtractorTest.cpp - DataExtractor tests --===//		//===- llvm/unittest/Support/DataExtractorTest.cpp - DataExtractor tests --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Support/DataExtractor.h"		#include "llvm/Support/DataExtractor.h"
#include "gtest/gtest.h"		#include "gtest/gtest.h"
using namespace llvm;		using namespace llvm;

namespace {		namespace {

		// Test fixture
		template <typename T>
		class DataExtractorTest : public ::testing::Test { };

		// Test DataExtractor with both types which can be used for offsets.
		typedef ::testing::Types<uint32_t, uint64_t> TestTypes;
		TYPED_TEST_CASE(DataExtractorTest, TestTypes);

const char numberData[] = "\x80\x90\xFF\xFF\x80\x00\x00\x00";		const char numberData[] = "\x80\x90\xFF\xFF\x80\x00\x00\x00";
const char stringData[] = "hellohello\0hello";		const char stringData[] = "hellohello\0hello";
const char leb128data[] = "\xA6\x49";		const char leb128data[] = "\xA6\x49";
const char bigleb128data[] = "\xAA\xA9\xFF\xAA\xFF\xAA\xFF\x4A";		const char bigleb128data[] = "\xAA\xA9\xFF\xAA\xFF\xAA\xFF\x4A";

TEST(DataExtractorTest, OffsetOverflow) {		TYPED_TEST(DataExtractorTest, OffsetOverflow) {
DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);		DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);
EXPECT_FALSE(DE.isValidOffsetForDataOfSize(-2U, 5));		EXPECT_FALSE(DE.isValidOffsetForDataOfSize(-2U, 5));
}		}

TEST(DataExtractorTest, UnsignedNumbers) {		TYPED_TEST(DataExtractorTest, UnsignedNumbers) {
DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);		DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);
uint32_t offset = 0;		TypeParam offset = 0;

EXPECT_EQ(0x80U, DE.getU8(&offset));		EXPECT_EQ(0x80U, DE.getU8(&offset));
EXPECT_EQ(1U, offset);		EXPECT_EQ(1U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(0x8090U, DE.getU16(&offset));		EXPECT_EQ(0x8090U, DE.getU16(&offset));
EXPECT_EQ(2U, offset);		EXPECT_EQ(2U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(0x8090FFFFU, DE.getU32(&offset));		EXPECT_EQ(0x8090FFFFU, DE.getU32(&offset));
[29 lines hidden]		[29 lines hidden]
offset = 0;		offset = 0;

EXPECT_EQ(data, DE.getU32(&offset, data, 2));		EXPECT_EQ(data, DE.getU32(&offset, data, 2));
EXPECT_EQ(0xFFFF9080U, data[0]);		EXPECT_EQ(0xFFFF9080U, data[0]);
EXPECT_EQ(0x80U, data[1]);		EXPECT_EQ(0x80U, data[1]);
EXPECT_EQ(8U, offset);		EXPECT_EQ(8U, offset);
}		}

TEST(DataExtractorTest, SignedNumbers) {		TYPED_TEST(DataExtractorTest, SignedNumbers) {
DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);		DataExtractor DE(StringRef(numberData, sizeof(numberData)-1), false, 8);
uint32_t offset = 0;		TypeParam offset = 0;

EXPECT_EQ(-128, DE.getSigned(&offset, 1));		EXPECT_EQ(-128, DE.getSigned(&offset, 1));
EXPECT_EQ(1U, offset);		EXPECT_EQ(1U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(-32624, DE.getSigned(&offset, 2));		EXPECT_EQ(-32624, DE.getSigned(&offset, 2));
EXPECT_EQ(2U, offset);		EXPECT_EQ(2U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(-2137980929, DE.getSigned(&offset, 4));		EXPECT_EQ(-2137980929, DE.getSigned(&offset, 4));
EXPECT_EQ(4U, offset);		EXPECT_EQ(4U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(-9182558167379214336LL, DE.getSigned(&offset, 8));		EXPECT_EQ(-9182558167379214336LL, DE.getSigned(&offset, 8));
EXPECT_EQ(8U, offset);		EXPECT_EQ(8U, offset);
}		}

TEST(DataExtractorTest, Strings) {		TYPED_TEST(DataExtractorTest, Strings) {
DataExtractor DE(StringRef(stringData, sizeof(stringData)-1), false, 8);		DataExtractor DE(StringRef(stringData, sizeof(stringData)-1), false, 8);
uint32_t offset = 0;		TypeParam offset = 0;

EXPECT_EQ(stringData, DE.getCStr(&offset));		EXPECT_EQ(stringData, DE.getCStr(&offset));
EXPECT_EQ(11U, offset);		EXPECT_EQ(11U, offset);
EXPECT_EQ(nullptr, DE.getCStr(&offset));		EXPECT_EQ(nullptr, DE.getCStr(&offset));
EXPECT_EQ(11U, offset);		EXPECT_EQ(11U, offset);
}		}

TEST(DataExtractorTest, LEB128) {		TYPED_TEST(DataExtractorTest, LEB128) {
DataExtractor DE(StringRef(leb128data, sizeof(leb128data)-1), false, 8);		DataExtractor DE(StringRef(leb128data, sizeof(leb128data)-1), false, 8);
uint32_t offset = 0;		TypeParam offset = 0;

EXPECT_EQ(9382ULL, DE.getULEB128(&offset));		EXPECT_EQ(9382ULL, DE.getULEB128(&offset));
EXPECT_EQ(2U, offset);		EXPECT_EQ(2U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(-7002LL, DE.getSLEB128(&offset));		EXPECT_EQ(-7002LL, DE.getSLEB128(&offset));
EXPECT_EQ(2U, offset);		EXPECT_EQ(2U, offset);

DataExtractor BDE(StringRef(bigleb128data, sizeof(bigleb128data)-1), false,8);		DataExtractor BDE(StringRef(bigleb128data, sizeof(bigleb128data)-1), false,8);
offset = 0;		offset = 0;
EXPECT_EQ(42218325750568106ULL, BDE.getULEB128(&offset));		EXPECT_EQ(42218325750568106ULL, BDE.getULEB128(&offset));
EXPECT_EQ(8U, offset);		EXPECT_EQ(8U, offset);
offset = 0;		offset = 0;
EXPECT_EQ(-29839268287359830LL, BDE.getSLEB128(&offset));		EXPECT_EQ(-29839268287359830LL, BDE.getSLEB128(&offset));
EXPECT_EQ(8U, offset);		EXPECT_EQ(8U, offset);
}		}

TEST(DataExtractorTest, LEB128_error) {		TYPED_TEST(DataExtractorTest, LEB128_error) {
DataExtractor DE(StringRef("\x81"), false, 8);		DataExtractor DE(StringRef("\x81"), false, 8);
uint32_t Offset = 0;		TypeParam Offset = 0;
EXPECT_EQ(0U, DE.getULEB128(&Offset));		EXPECT_EQ(0U, DE.getULEB128(&Offset));
EXPECT_EQ(0U, Offset);		EXPECT_EQ(0U, Offset);

Offset = 0;		Offset = 0;
EXPECT_EQ(0U, DE.getSLEB128(&Offset));		EXPECT_EQ(0U, DE.getSLEB128(&Offset));
EXPECT_EQ(0U, Offset);		EXPECT_EQ(0U, Offset);
}		}
}		}
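
The LEB128 tests above check two things for each offset type: the decoded value, and that the offset advances only on success (the `LEB128_error` case expects both the result and the offset to stay 0 for the truncated input `"\x81"`). A minimal sketch of that contract follows, assuming a standalone decoder templated on the offset type. The names `decodeULEB128` and `decodeSLEB128` and the `std::string` input are hypothetical and chosen for illustration; this is not the `DataExtractor` implementation from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not LLVM's DataExtractor): decode an unsigned LEB128
// value starting at *Offset. OffsetT stands in for uint32_t or uint64_t,
// mirroring the TypeParam used by the typed tests.
template <typename OffsetT>
uint64_t decodeULEB128(const std::string &Data, OffsetT *Offset) {
  assert(Offset && "offset pointer must not be null");
  uint64_t Result = 0;
  unsigned Shift = 0;
  OffsetT Pos = *Offset; // work on a copy so failure leaves *Offset untouched
  while (Pos < Data.size() && Shift < 64) {
    uint8_t Byte = static_cast<uint8_t>(Data[Pos++]);
    Result |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0) { // continuation bit clear: last byte
      *Offset = Pos;
      return Result;
    }
  }
  return 0; // truncated or over-long input: report 0, offset unchanged
}

// Hypothetical helper: same loop, plus sign extension from bit 6 of the
// final byte.
template <typename OffsetT>
int64_t decodeSLEB128(const std::string &Data, OffsetT *Offset) {
  assert(Offset && "offset pointer must not be null");
  uint64_t Result = 0;
  unsigned Shift = 0;
  OffsetT Pos = *Offset;
  while (Pos < Data.size() && Shift < 64) {
    uint8_t Byte = static_cast<uint8_t>(Data[Pos++]);
    Result |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0) {
      if (Shift < 64 && (Byte & 0x40))
        Result |= ~static_cast<uint64_t>(0) << Shift; // sign-extend
      *Offset = Pos;
      return static_cast<int64_t>(Result);
    }
  }
  return 0; // truncated or over-long input: report 0, offset unchanged
}
```

For the two-byte input `0xA6 0x49`, `decodeULEB128` yields 9382 and `decodeSLEB128` yields -7002, with the offset advanced to 2 in both cases. These are the values the `LEB128` test expects, assuming its `leb128data` array holds those two bytes (the array is defined outside this excerpt). Templating on the offset type is one way to exercise both widths with a single body, analogous to how the tests switch from `uint32_t offset` to `TypeParam offset`.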
