This is an archive of the discontinued LLVM Phabricator instance.

Tool for using Intel(R) Processor Trace hardware feature
ClosedPublic

Authored by abhishek.aggarwal on May 10 2017, 6:03 AM.

Details

Summary
  1. Provide single library for all Intel specific hardware features instead of individual libraries for each feature
  2. Added Intel(R) Processor Trace hardware feature in this single library. Details about the tool implementing this feature is as follows:

    Tool developed on top of LLDB to provide its users the execution trace of the debugged inferiors. Tool's API are exposed as C++ object oriented interface in a shared library. API are designed especially to be easily integrable with IDEs providing LLDB as an application debugger. Entire API is also available as Python functions through a script bridging interface allowing development of python modules.

    This patch also provides a CLI wrapper to use the Tool through LLDB's command line. Highlights of the Tool and the wrapper are given below:

    ****** Intel(R) Processor Trace Tool: ******
      • Provides execution trace of the debugged application
      • Uses Intel(R) Processor Trace hardware feature (already implemented inside LLDB) for this purpose
    • Collects trace packets generated by this feature from LLDB, decodes and post-processes them
    • Constructs the execution trace of the application
    • Presents execution trace as a list of assembly instructions
      • Provides 4 APIs (exposed as C++ object oriented interface)
    • start trace with configuration options for a thread/process,
    • stop trace for a thread/process,
    • get the execution flow (assembly instructions) for a thread,
    • get trace specific information for a thread
      • Easily integrable into IDEs providing LLDB as application debugger
      • Entire API available as Python functions through script bridging interface
    • Allows developing python apps on top of Tool
      • README_TOOL.txt provides more details about the Tool, its dependencies, building steps and API usage
      • Tool ready to use through LLDB's command line
    • CLI wrapper has been developed on top of the Tool for this purpose

      ******* CLI wrapper: cli-wrapper-pt.cpp *******
      • Provides 4 commands (syntax similar to LLDB's CLI commands):
    • processor-trace start
    • processor-trace stop
    • processor-trace show-trace-options
    • processor-trace show-instr-log
      • README_CLI.txt provides more details about commands and their options

Signed-off-by: Abhishek Aggarwal <abhishek.a.aggarwal@intel.com>

Diff Detail

Event Timeline

clayborg requested changes to this revision.May 10 2017, 10:06 AM

So we don't want clients of SBStructuredData to have to text scrape and write manual decoders. See my inlined comments on the minimal modifications we need to make to SBStructureData first. This then removes a ton of code from your patch. The SBStructureData modifications should probably be done in separate patch as well. internally lldb_private::StructuredData has everything (dicts, array, strings, integers, bools, nulls) inherit from lldb_private::StructuredData::Object. Then SBStructuredData owns a StructuredDataImpl which has a lldb_private::StructuredData::Object shared pointer, so a lldb::SBStructureData can be any object (dicts, array, strings, integers, bools, nulls). So we should expose the ability to get data from SBStructuredData. See my inlined SBStructureData additions I proposed. We really don't want anyone doing manual JSON text scraping.

tools/CMakeLists.txt
8

Can we default this to enabled?

tools/intel-pt/Decoder.cpp
18 ↗(On Diff #98434)

This kind of reach in to internal plug-in sources shouldn't happen as it violates the boundaries. Remove and see comment below.

27 ↗(On Diff #98434)

We really shouldn't be interpreting integer error codes here. The string in the "sberror" should be what you use. Modify the code that creates these errors in the first place to also put a string values you have here into the lldb_private::Error to begin with and remove this function.

177 ↗(On Diff #98434)

Why do we care which debugger this belongs to?

200 ↗(On Diff #98434)

We meed to add API to SBStructuredData to expose some of the stuff from lldb_private::StructuredData so you don't end up text scraping here. Just do what you need for now but I would suggest at the very least:

class SBStructuredData {
  enum Type {
    eTypeArray,
    eTypeDictionary,
    eTypeString,
    eTypeInteger,
    eTypeBoolean
  }
  // Get the type of data in this structured data.
  Type GetType();
  // Get a dictionary for a key value given a key from the top level of this structured data if type is eTypeDictionary
  SBStructuredData GetValueForKey(const char *key);
  // Get the value of this object as a string if type is eTypeString
  bool GetStringValue(char *s, size_t max_len);
  // Get an integer of this object as an integer if type is eTypeInteger
  bool GetIntegerValue(uint64_t &value);
  ...
};

See cleanup code below if we do this.

211–212 ↗(On Diff #98434)
if (!sbstructdata.GetValueForKey("trace-tech").IsValid())
218–219 ↗(On Diff #98434)
if (!sbstructdata.GetValueForKey("intel-pt").IsValid())
538–547 ↗(On Diff #98434)

No text scraping. Please use SBStructureData methods that we will need to add.

602–603 ↗(On Diff #98434)

Remove this function, add real API to SBStructuredData

1046–1047 ↗(On Diff #98434)

Remove this function?

tools/intel-pt/Decoder.h
86 ↗(On Diff #98434)

Should this be named better? IntelTraceOptions? Also does it need to inherit from lldb::SBTraceOptions? Would it make more sense to have this be a "has a" where SBTraceOptions is a member variable?

90 ↗(On Diff #98434)

Does this need to be virtual? Remember that lldb::SBTraceOptions ins't virtual since it is in the public API.

131–134 ↗(On Diff #98434)

Do you actually need the debugger here? It might be nice to avoid this and just get the debugger from the process:

auto debugger = process.GetTarget().GetDebugger();
177 ↗(On Diff #98434)

You probably don't need this function as you shouldn't have to store the SBDebugger anywhere in this class as you can always get it from the process:

auto debugger = process.GetTarget().GetDebugger();

Why do we care which debugger this belongs to?

This revision now requires changes to proceed.May 10 2017, 10:06 AM
krytarowski added a subscriber: krytarowski.EditedMay 10 2017, 11:06 AM

Can we please use the generic process plugin code, not the specific Linux one? If the generic code is insufficient - we should adjust it and alter instances of it (Linux and BSD).

Can we please use the generic process plugin code, not the specific Linux one? If the generic code is insufficient - we should adjust it and alter instances of it (Linux and BSD).

There should be no need to ever #include something from a plug-in in code the gets loaded externally. We should rely on the error strings themselves, not be comparing them to some value. SBError has an error type (generic, posix, mach-kernel, expression results, windows). It we are going to do anything we can make up a new set of them, but the code in this patch is relying on the values, when it should just be using the string value. If the string value wasn't set correctly, fix that is the code that creates the errors.

Can we please use the generic process plugin code, not the specific Linux one? If the generic code is insufficient - we should adjust it and alter instances of it (Linux and BSD).

There should be no need to ever #include something from a plug-in in code the gets loaded externally. We should rely on the error strings themselves, not be comparing them to some value. SBError has an error type (generic, posix, mach-kernel, expression results, windows). It we are going to do anything we can make up a new set of them, but the code in this patch is relying on the values, when it should just be using the string value. If the string value wasn't set correctly, fix that is the code that creates the errors.

Thank you for explanation. This is even better and ideal.

labath edited edge metadata.May 15 2017, 2:05 AM

I don't really like that we are adding a public shared library for every tiny intel feature. Could we at least merge this "plugin" with the existing "intel-mpx plugin" to create one "intel support" library?

Also, adding an external dependency probably deserves a discussion on lldb-dev.

tools/CMakeLists.txt
8

We probably can't, as this code depends on a third party library. In any case, this option should go to LLDBConfig.cmake

tools/intel-pt/CMakeLists.txt
42 ↗(On Diff #98434)

any reason you're not using add_lldb_library here?

53 ↗(On Diff #98434)

All of this needs to go away. I think you only needed it because you are plucking NativeProcessLinux internals, so fixing that should fix this too.

tools/CMakeLists.txt
8

The tool has a dependency on a decoder library. People who are not interested in building this tool will have to explicitly set this flag OFF to avoid build failures caused by absence of this decoder library on their system. I wanted to avoid that. But if you think enabling it by default is a better idea then I will change it.

tools/intel-pt/Decoder.cpp
18 ↗(On Diff #98434)

I am removing this include from here.

27 ↗(On Diff #98434)

Removing this function.

177 ↗(On Diff #98434)

Please see my reply to your comment in Decoder.h file.

200 ↗(On Diff #98434)

I am on it.

538–547 ↗(On Diff #98434)

Working on it.

602–603 ↗(On Diff #98434)

Working on it.

1046–1047 ↗(On Diff #98434)

Please see reply to your comment in Decoder.h file.

tools/intel-pt/Decoder.h
86 ↗(On Diff #98434)
  1. I am changing its name to "TraceOptions".
  2. Inheriting it from SBTraceOptions makes life easier in the sense of reusing the APIs without re-defining them here. This is a backend class. Exposing any new API in PTInfo class (in case SBTraceOptions class undergo any change/addition in future) will require change only in PTDecoder.h file and corresponding implementation file. Plus, concept wise both "SBTraceOptions" and "Info" represent trace options. These were the motives behind using inheritance. Am I missing some important thing here? Please let me know.
90 ↗(On Diff #98434)

You are right. I will remove it.

131–134 ↗(On Diff #98434)

Please see my reply to your next comment below in this file.

177 ↗(On Diff #98434)

As per my understanding, the Unique IDs of SBProcess instances will be unique within 1 SBDebugger instance but will not be unique across two different SBDebugger instances. If I am wrong then I don't need to store Debugger ID here. Otherwise, for each SBProcess instance provided by user through public APIs of the Tool, I will have to check which particular SBDebugger instance it belongs to. Please correct me if I am wrong.

emaste added a subscriber: emaste.May 18 2017, 7:57 PM

I don't really like that we are adding a public shared library for every tiny intel feature. Could we at least merge this "plugin" with the existing "intel-mpx plugin" to create one "intel support" library?

Also, adding an external dependency probably deserves a discussion on lldb-dev.

Hi Paval ... Before starting the development of this whole feature, we had submitted the full proposal to lldb dev list 1.5 years ago. During the discussions, it was proposed to keep the external dependency outside LLDB (i.e. not to be bundled in liblldb shared library). The External dependency required for this tool is not and will never be a part of lldb repository. Users who are interested to use this tool, will download this external dependency separately.

tools/intel-pt/CMakeLists.txt
42 ↗(On Diff #98434)

No. I will try with that.

I don't really like that we are adding a public shared library for every tiny intel feature. Could we at least merge this "plugin" with the existing "intel-mpx plugin" to create one "intel support" library?

Also, adding an external dependency probably deserves a discussion on lldb-dev.

Hi Paval ... Before starting the development of this whole feature, we had submitted the full proposal to lldb dev list 1.5 years ago. During the discussions, it was proposed to keep the external dependency outside LLDB (i.e. not to be bundled in liblldb shared library). The External dependency required for this tool is not and will never be a part of lldb repository. Users who are interested to use this tool, will download this external dependency separately.

Yes I remember that. But, as you say, that was 1.5 years ago, and we haven't heard anything since. Honestly, I had assumed you abandoned that work until you reappeared last month. :)
So I think it's worth updating that thread, as new things may have come up since then (for example, the decision whether to merge the intel stuff into a single library).

abhishek.aggarwal edited edge metadata.

Single feature library development for all intel specific features

  • Merged the developed tool for Intel(R) Processor Trace hardware feature provided in this patch and already existing tool for Intel(R) Memory Protection Extensions support
abhishek.aggarwal edited the summary of this revision. (Show Details)Jun 19 2017, 4:35 AM
abhishek.aggarwal edited the summary of this revision. (Show Details)
abhishek.aggarwal edited the summary of this revision. (Show Details)
abhishek.aggarwal edited the summary of this revision. (Show Details)Jun 19 2017, 4:37 AM
abhishek.aggarwal added inline comments.
tools/intel-pt/CMakeLists.txt
53 ↗(On Diff #98434)

Fixing this.

tools/intel-pt/Decoder.cpp
27 ↗(On Diff #98434)

Currently, the codes are generated by lldb server and remote packets don't support sending error strings from lldb server to lldb client.
Now, I can think of 2 ways:

  1. modify remote packets to send error strings as well (in addition to error codes).
  2. or decode error codes and generate error strings at lldb client side

Which one do you prefer? @labath prefers 1st one (in discussions of https://reviews.llvm.org/D33674). If you also prefers 1st one then I (or @ravitheja) can work on modifying packets accordingly and submit a separate patch for that.

My idea is to keep error codes and its decoding logic here for now. Once we submit patch of modified packets and that patch gets approved, I can remove this decoding function from here all together. Please let me know what you prefer.

labath added inline comments.Jun 19 2017, 5:18 AM
tools/intel-features/CMakeLists.txt
51

"bin" sounds wrong here. Shouldn't this go ti lib${LLVM_LIBDIR_SUFFIX}?
To properly handle DLL targets (i don't know whether ipt support windows) you'd need something like (see function add_lldb_library):

install(TARGETS ${name}
  COMPONENT ${name}
  RUNTIME DESTINATION bin
  LIBRARY DESTINATION lib${LLVM_LIBDIR_SUFFIX}
  ARCHIVE DESTINATION lib${LLVM_LIBDIR_SUFFIX})

Although it may be than you can just call that function yourself.

tools/intel-features/intel-mpx/CMakeLists.txt
6

Normally we build a single .a file for each source folder, which are then linked into other targets as appropriate, and I don't see a reason to deviate from this here.

Same goes for the ipt subfolder.

tools/intel-features/scripts/python-typemaps.txt
2

Could we just use standard lldb typemaps? I don't see anything ipt specific here...

tools/intel-pt/Decoder.cpp
27 ↗(On Diff #98434)

How about we just avoid interpreting the error codes for now and print a generic "error 47" message instead. This avoids the awkward state, where we have two enums that need to be kept in sync, which is a really bad idea.

I think having more generic error code will be very useful - there are a lot of packets that would benefit from more helpful error messages.

Hi Pavel .. My comments are inlined. Please let me know if you have more concerns. I am submitting the patch with all the proposed changes.

tools/intel-features/CMakeLists.txt
51

Fixing it.

tools/intel-features/intel-mpx/CMakeLists.txt
6

Agreed. I am submitting the changes.

tools/intel-features/scripts/python-typemaps.txt
2

Unfortunately, we can't because that file uses lldb_private::PythonString, lldb_private::PythonList etc and for that I will have to link to liblldbPluginScripInterpreterPython.

tools/intel-pt/Decoder.cpp
27 ↗(On Diff #98434)

I am removing this function as @ravitheja is working on enabling lldb remote packets to send error strings directly. So, we don't even need "error 47" message.

Changes after feedback

  • Created static libs for each hardware feature to combine it to generate single shared library
  • Removed error codes decoding logic and directly using error strings

Hi Abhishek,

Thank you for incorporating the changes. I still see some room for improvement in the cmake files. I realize that this is something most people are not familiar with, and is not exactly glamorous work, but it's still part of our codebase and we should make sure it follows best practices.

tools/intel-features/CMakeLists.txt
17

Could we avoid building the shared library altogether if both features are OFF?

tools/intel-features/cli-wrapper.cpp
28

You will need some ifdef magic to make sure these still compile when the feature is off.

tools/intel-features/intel-mpx/CMakeLists.txt
10

What you want here is to define an INTERFACE dependency on the MPX library instead.
vanilla cmake way would be target_link_libraries(lldbIntelMPX INTERFACE LLVMSupport). However, we should use the llvm function instead, as that also handles other llvm-specific magic (for example, this code will break if someone does a LLVM_LINK_LLVM_DYLIB build).

So, I am asking for the third time:
Have you tried using add_lldb_library instead?

The correct invocation should be add_lldb_library(foo.cpp LINK_LIBS Support) and the rest of this file can just go away.

tools/intel-features/scripts/lldb-intel-features.swig
10

There appear to be no typedefs here.

15

You are already defining this as a part of the swig invocation in cmake.

tools/intel-features/scripts/python-typemaps.txt
2

Ok, let's leave this as-is then. We could try factoring out common part of those typemaps, but I'm not sure that's such a good idea at the moment.

Hi Pavel .. I have made the changes you suggested. My apologies for misinterpreting your previous comments but during written communications, it is sometimes difficult to interpret everything correctly. I have tried following LLDB's coding conventions and guidelines. Please let me know if I still missed things that you would have liked to see in this patch. Thanks for your patience :)

tools/intel-features/CMakeLists.txt
17

yes. Adding the check.

tools/intel-features/cli-wrapper.cpp
28

Fixed it.

tools/intel-features/intel-mpx/CMakeLists.txt
10

I am extremely sorry Pavel but I understood it now what you were trying to say in previous comments. Sorry about misinterpreting your comments before. I have used add_lldb_library function now. Please see them in the next patch set.

tools/intel-features/scripts/lldb-intel-features.swig
10

Forgot to remove this. Doing it.

15

removing it.

Cmake files related changes

  • Using lldb's cmake functions insted of vanilla ones for cmake files

Fixed one minor thing in CMakeFile

Thank you for making the changes. I concede that I am not always clear in my communications, and I apologize if I was being rude.

I am happy with the cmake stuff, but I noticed a new issue now. Exceptions are banned in llvm, so we need to figure out how to remove them from this code.

tools/intel-features/intel-mpx/CMakeLists.txt
10

Looks much better. Thanks.

tools/intel-features/intel-pt/Decoder.cpp
412

We can't have exceptions in llvm code. You will have to achieve this differently. Your trick with manually adding -fexceptions will not work anyway if the rest of the code is compiled without exceptions. Although I'm not really sure why you need to protect this vector append in particular, as we don't do this for any other vector elsewhere.

Thanks for your review Pavel. My comments are inlined. Let me know your opinion :)

tools/intel-features/intel-pt/Decoder.cpp
412

I kept the exception handling around stl containers only to catch bad_alloc exception because if this exception occurs, I didn't want the code to just exit but provide user with whatever amount of instruction log is available in the vector. That much amount of instruction log might still be helpful to the user. What is your opinion on that?

Plus, If rest of the code is not being compiled with -fexceptions but just this file, will it not solve the purpose? Let me know what you think about it. I can make changes accordingly then.

labath added inline comments.Jun 29 2017, 6:24 AM
tools/intel-features/intel-pt/Decoder.cpp
412

I don't think there's any negotiating on this point (not with me anyway, you'd need to take this much higher up). But here's what I think anyway:

  • the exception catch will be generally useless as most (all?) current OSs will overcommit virtual memory (and then just kill you if they really run out of memory and swap space)
  • even if you disable overcommit, chances are you will hit an OOM in one of the zillion other places which allocate memory (all of which are unchecked) instead of here. So this single catch will not make a difference.
tools/intel-features/intel-pt/Decoder.cpp
412

Got it. Then I remove exception handling code from here. Submit the patch again. Thanks for elaborating.

Removed exception handling code

Hi Pavel .. I could remove all exception handling code from source files. However, I am still wondering how to disable exception handling while providing python functions for C++ APIs of the features. Can you suggest something here? The whole problem is occurring because of the use of vector<> as an argument to one of the C++ APIs. To provide a python interface for these APIs, I am forced to include std_vector.i (please see tools/intel-features/intel-pt/interface/PTDecoder.i file) which is introducing exception handling code.

labath added a comment.Jul 4 2017, 6:57 AM

Hi Pavel .. I could remove all exception handling code from source files. However, I am still wondering how to disable exception handling while providing python functions for C++ APIs of the features. Can you suggest something here? The whole problem is occurring because of the use of vector<> as an argument to one of the C++ APIs. To provide a python interface for these APIs, I am forced to include std_vector.i (please see tools/intel-features/intel-pt/interface/PTDecoder.i file) which is introducing exception handling code.

Hm... that's a bit of a drag. I guess the SB API never ran into this problem because it always provides it's own vector-like classes (SBModuleList, SBFileSpecList, etc.). I guess the most canonical way would be to follow that example and have your own class for a list of instructions. However, that is a bit annoying, so I do see a case for making code generated by swig as an exception to the rule.

Hi Pavel .. I could remove all exception handling code from source files. However, I am still wondering how to disable exception handling while providing python functions for C++ APIs of the features. Can you suggest something here? The whole problem is occurring because of the use of vector<> as an argument to one of the C++ APIs. To provide a python interface for these APIs, I am forced to include std_vector.i (please see tools/intel-features/intel-pt/interface/PTDecoder.i file) which is introducing exception handling code.

Hm... that's a bit of a drag. I guess the SB API never ran into this problem because it always provides it's own vector-like classes (SBModuleList, SBFileSpecList, etc.). I guess the most canonical way would be to follow that example and have your own class for a list of instructions. However, that is a bit annoying, so I do see a case for making code generated by swig as an exception to the rule.

Hmm .. Is the decision about enabling exception handling only for swig generated code lie in LLDB's community's hands or we will have to ask LLVM community for this too? I can add another class to resolve this problem but I am not sure whether it is the right way to go.

labath added a comment.Jul 4 2017, 7:56 AM

Hm... that's a bit of a drag. I guess the SB API never ran into this problem because it always provides it's own vector-like classes (SBModuleList, SBFileSpecList, etc.). I guess the most canonical way would be to follow that example and have your own class for a list of instructions. However, that is a bit annoying, so I do see a case for making code generated by swig as an exception to the rule.

Hmm .. Is the decision about enabling exception handling only for swig generated code lie in LLDB's community's hands or we will have to ask LLVM community for this too? I can add another class to resolve this problem but I am not sure whether it is the right way to go.

I think we can start with an email to lldb-dev and see how it goes. Nobody on the llvm side uses swig.

I'm not sure what would be a better solution either... I don't like either of them :)

So std::vector shouldn't be used in a public API. You should make a class that wraps your objects. LLDB's public API has lldb::SBInstruction and lldb::SBInstructionList as examples. std::vector on other systems compiles differently for debug and release builds and things will crash. If you need this via swig for internal kind of stuff, then use a typemap where you map std::vector<T> to a list() with T instances in it in python.

I am a bit confused here as well. Are we exporting a plug-in specific python bindings for the Intel PT data? It seems like it would be nice to wrap this API into the SBTrace or other lldb interface? Am I not understanding this correctly?

So std::vector shouldn't be used in a public API. You should make a class that wraps your objects. LLDB's public API has lldb::SBInstruction and lldb::SBInstructionList as examples. std::vector on other systems compiles differently for debug and release builds and things will crash.

Hi Greg .. Does it go for the tools built on top of LLDB (e.g. in my case where feature library is not a part of liblldb shared lib but built on top of it)? If yes, then I will proceed to remove std::vector from C++ public API of the tool and create a new class for it.

If you need this via swig for internal kind of stuff, then use a typemap where you map std::vector<T> to a list() with T instances in it in python.

I want to provide a python interface for the tool's C++ public API as well so that the API can be used in python modules as well. Therefore, I think typemapping to list() will not solve the problem. Am I right?

I am a bit confused here as well. Are we exporting a plug-in specific python bindings for the Intel PT data? It seems like it would be nice to wrap this API into the SBTrace or other lldb interface? Am I not understanding this correctly?

Probably, I didn't understand this comment of yours. We are not exporting python binding for Intel PT data. It is a python binding for Tool's C++ API and this Tool is built on top of LLDB. Did I answer your question? Please let me know if I misunderstood your comment.

So std::vector shouldn't be used in a public API. You should make a class that wraps your objects. LLDB's public API has lldb::SBInstruction and lldb::SBInstructionList as examples. std::vector on other systems compiles differently for debug and release builds and things will crash.

Hi Greg .. Does it go for the tools built on top of LLDB (e.g. in my case where feature library is not a part of liblldb shared lib but built on top of it)? If yes, then I will proceed to remove std::vector from C++ public API of the tool and create a new class for it.

It really comes down to the issue of how stable you want this API to be. If this API is a "you are expected to rebuild any tools you make against this API every time we release a new version" then you can really do anything you want. If you want a stable API to build against, you could choose to follow the LLDB public API rules: no inheritance, fixed number of member variables that never change (which usually means a std::shared_ptr<T>, std::unique_ptr<T> or a T*), and no virtual functions. This effectively renders the C++ API into a C API, but it does force you to have your public API classes have an internal implementation (T) and then the class that you export for your API be the public facing API that everyone uses externally. The nice thing about this is that swig has a really easy time creating all of the script bindings for you if you do it this way. So it really depends on what your future use case of this API is.

If you need this via swig for internal kind of stuff, then use a typemap where you map std::vector<T> to a list() with T instances in it in python.

I want to provide a python interface for the tool's C++ public API as well so that the API can be used in python modules as well. Therefore, I think typemapping to list() will not solve the problem. Am I right?

I believe it would fix your issue for you. You write the glue code that can translate any argument (std::vector<ptdecoder::PTInstruction>) into something more python like like a list of PTInstruction python objects. We do this in a lot of places with LLDB.

I am a bit confused here as well. Are we exporting a plug-in specific python bindings for the Intel PT data? It seems like it would be nice to wrap this API into the SBTrace or other lldb interface? Am I not understanding this correctly?

Probably, I didn't understand this comment of yours. We are not exporting python binding for Intel PT data. It is a python binding for Tool's C++ API and this Tool is built on top of LLDB. Did I answer your question? Please let me know if I misunderstood your comment.

Now I understand. I didn't realize this was a stand alone tool. The main thing to figure out is how stable you want this C++ API that you are exporting to be. If that isn't an issue, feel free to use anything you want in that API. If stability is an issue you want to solve and you want to vend a C++ API, you might think about adopting LLDB's public API rules to help ensure stability.

So std::vector shouldn't be used in a public API. You should make a class that wraps your objects. LLDB's public API has lldb::SBInstruction and lldb::SBInstructionList as examples. std::vector on other systems compiles differently for debug and release builds and things will crash.

Hi Greg .. Does it go for the tools built on top of LLDB (e.g. in my case where feature library is not a part of liblldb shared lib but built on top of it)? If yes, then I will proceed to remove std::vector from C++ public API of the tool and create a new class for it.

It really comes down to the issue of how stable you want this API to be. If this API is a "you are expected to rebuild any tools you make against this API every time we release a new version" then you can really do anything you want. If you want a stable API to build against, you could choose to follow the LLDB public API rules: no inheritance, fixed number of member variables that never change (which usually means a std::shared_ptr<T>, std::unique_ptr<T> or a T*), and no virtual functions. This effectively renders the C++ API into a C API, but it does force you to have your public API classes have an internal implementation (T) and then the class that you export for your API be the public facing API that everyone uses externally. The nice thing about this is that swig has a really easy time creating all of the script bindings for you if you do it this way. So it really depends on what your future use case of this API is.

If you need this via swig for internal kind of stuff, then use a typemap where you map std::vector<T> to a list() with T instances in it in python.

I want to provide a python interface for the tool's C++ public API as well so that the API can be used in python modules as well. Therefore, I think typemapping to list() will not solve the problem. Am I right?

I believe it would fix your issue for you. You write the glue code that can translate any argument (std::vector<ptdecoder::PTInstruction>) into something more python like like a list of PTInstruction python objects. We do this in a lot of places with LLDB.

I am a bit confused here as well. Are we exporting a plug-in specific python bindings for the Intel PT data? It seems like it would be nice to wrap this API into the SBTrace or other lldb interface? Am I not understanding this correctly?

Probably, I didn't understand this comment of yours. We are not exporting python binding for Intel PT data. It is a python binding for Tool's C++ API and this Tool is built on top of LLDB. Did I answer your question? Please let me know if I misunderstood your comment.

Now I understand. I didn't realize this was a stand alone tool. The main thing to figure out is how stable you want this C++ API that you are exporting to be. If that isn't an issue, feel free to use anything you want in that API. If stability is an issue you want to solve and you want to vend a C++ API, you might think about adopting LLDB's public API rules to help ensure stability.

Hi Greg .. Thanks a lot for your feedback and detailed explanation. I am choosing to follow what LLDB SB API are following. Removing std::vector<> from interface. Submitting a new patch.

Removed std::vector<> from public APIs

Much better on the API with Swig. Just a few inlined questions around GetRawBytes.

tools/intel-features/intel-pt/PTDecoder.h
55–56

In the python version of this function, this should return a python string that contains the bytes or return None of there is nothing. Is that how it currently works? One alternate way of doing this is doing what we do with lldb::SBProcess:

size_t ReadMemory(addr_t addr, void *buf, size_t size, lldb::SBError &error);

In python, we use a type map to detect the args "void *buf, size_t size", and we turn this into:

def ReadMemory(self, addr, error):
    return # Python string with bytes from read memory

So one option would be to change GetRawBytes() into:

size_t GetRawBytes(void *buf, size_t size) const;

The return value is now many bytes were filled in. If "buf" is NULL or "size" is zero, you can return the length that you would need as the return value.

And have the python version be:

def GetRawBytes(self):
    return # bytes in a python string or None
59

This might not be needed if GetRawBytes is changed to:

size_t GetRawBytes(void *buf, size_t size) const;

Much better on the API with Swig. Just a few inlined questions around GetRawBytes.

Hi Greg .. Thanks for feedback. My comments are inlined.

tools/intel-features/intel-pt/PTDecoder.h
55–56

Hi Greg.. As per my understanding, if we change ReadMemory() function as you proposed then both typemaps:
%typemap(in) (void *buf, size_t size)
%typemap(argout) (void *buf, size_t size)

will have to be defined. Am i right?

If it is so then I am afraid that I can't reuse these maps from lldb as the 2nd typemap refers to lldb_private::PythonBytes which is an internal library of lldb. Please let me know if I am wrong.

I am not sure how sensitive typemaps are to the names of the arguments? Maybe we can just have you use different names and then the two typemaps won't collide? You can also write some manual code to do the remapping without typemaps.

Regardless of whether you are going to change the API or not, you will need to write something to make it work in python.

const uint8_t * ptdecoder_private:: Instruction::GetRawBytes() const;

What does swig do right now when the above code is converted to python? It needs to make a python string from this and the GetRawBytesSize()...

I am not sure how sensitive typemaps are to the names of the arguments? Maybe we can just have you use different names and then the two typemaps won't collide? You can also write some manual code to do the remapping without typemaps.

Regardless of whether you are going to change the API or not, you will need to write something to make it work in python.

const uint8_t * ptdecoder_private:: Instruction::GetRawBytes() const;

What does swig do right now when the above code is converted to python? It needs to make a python string from this and the GetRawBytesSize()...

Hi Greg .. You are right. I checked the swig converted code for this API and it was not using GetRawBytesSize(). I have changed the C++ API signature as you had suggested and wrote a new typemap to handle it. It is mostly similar to the one defined in lldb typemaps but doesn't use internal lldb implementation of PythonBytes. Let me know if something is not right. I am submitting new patch set. Thanks a lot for your feedback :)

I am not sure how sensitive typemaps are to the names of the arguments? Maybe we can just have you use different names and then the two typemaps won't collide? You can also write some manual code to do the remapping without typemaps.

Regardless of whether you are going to change the API or not, you will need to write something to make it work in python.

const uint8_t * ptdecoder_private:: Instruction::GetRawBytes() const;

What does swig do right now when the above code is converted to python? It needs to make a python string from this and the GetRawBytesSize()...

Changes to make a C++ API work in python modules

  • Added a new python typemap to handle it

Hi Greg .. Can you please review the changes? Please let me know if something needs to be changed. Thanks :)

clayborg accepted this revision.Aug 7 2017, 8:17 AM

I was out on vacation, sorry for the delay. Looks good.

This revision is now accepted and ready to land.Aug 7 2017, 8:17 AM

I was out on vacation, sorry for the delay. Looks good.

Thanks a lot Greg for your feedback. I am merging the patch now.

tools/intel-features/scripts/python-typemaps.txt