This is an archive of the discontinued LLVM Phabricator instance.

[Libomptarget] Change device free routines to accept the allocation kind
ClosedPublic

Authored by jhuber6 on Aug 31 2022, 2:04 PM.

Details

Summary

Previous support for device memory allocators used a single free
routine and did not provide the original kind of the allocation. This is
problematic as some of these memory types required different handling.
Previously this was worked around using a map in runtime to record the
original kind of each pointer. Instead, this patch introduces new free
routines similar to the existing allocation routines. This allows us to
avoid a map traversal every time we free a device pointer.

The only interfaces defined by the standard are omp_target_alloc and
omp_target_free, these do not take a kind as omp_alloc does. The
standard dictates the following:

"The omp_target_alloc routine returns a device pointer that references
the device address of a storage location of size bytes. The storage
location is dynamically allocated in the device data environment of the
device specified by device_num."

Which suggests that these routines only allocate the default device
memory for the kind. So this has been changed to reflect this. This
change is somewhat breaking if users were using omp_target_free as
previously shown in the tests.

Diff Detail

Event Timeline

jhuber6 created this revision.Aug 31 2022, 2:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 2:04 PM
jhuber6 requested review of this revision.Aug 31 2022, 2:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 2:04 PM

I think changing the API of __tgt_rtl_data_delete is good. I like the map lookup disappearing from the cuda plugin. It'll mean mixing plugins and libomptargets from different toolchains will tend to segfault. They're not certain to be semantically interchangeable even if we keep the function name and type the same though.

@jdoerfert / @ronlieb / @RaviNarayanaswamy any of you aware of a requirement that forces slower/clumsy/backwards-compatible here, aka renaming the entry point to __tgt_rtl_data_delete_v2 or similar?

In particular changing the llvm_ prefixed function is definitely OK, anyone using llvm_ prefixed openmp functions signed up for semantics changing en route to standardisation.

Patch itself looks good to me, thanks for writing it.

openmp/libomptarget/include/device.h
410

Might be better to avoid the default argument here in favour of explicit at call site since that'll help pair up allocation kind when reading the implementation. Probably as a follow up patch to this one since the verbosity would detract from the review.

jhuber6 added inline comments.Aug 31 2022, 4:34 PM
openmp/libomptarget/include/device.h
410

The alloc above does the same default argument so it made sense to just copy it.

I think we'd better keep __tgt_rtl_data_delete unchanged, and add another interface. AFAIK we require libomptarget to be aligned with clang, but don't force the plugins.

I think we'd better keep __tgt_rtl_data_delete unchanged, and add another interface. AFAIK we require libomptarget to be aligned with clang, but don't force the plugins.

I figured it was okay to change since I don't recall if we actually provide backwards compatibility for the plugins.

I think we'd better keep __tgt_rtl_data_delete unchanged, and add another interface. AFAIK we require libomptarget to be aligned with clang, but don't force the plugins.

I figured it was okay to change since I don't recall if we actually provide backwards compatibility for the plugins.

Is it stated somewhere? I had a lot of trouble about compatibility before when I tried to change the interface. I'm okay with breaking the backward compatibility, as long as we already tell users. :-)

FWIW, if necessary, we can say LLVM 15 is the last version for backward compatibility.

Is it stated somewhere? I had a lot of trouble about compatibility before when I tried to change the interface. I'm okay with breaking the backward compatibility, as long as we already tell users. :-)

It's not too difficult to just make a new interface and have the old one call the new one with the default. Probably better safe than sorry so I may as well add it if it's a problem.

JonChesterfield accepted this revision.Sep 1 2022, 8:28 AM

Is it stated somewhere? I had a lot of trouble about compatibility before when I tried to change the interface. I'm okay with breaking the backward compatibility, as long as we already tell users. :-)

It's not too difficult to just make a new interface and have the old one call the new one with the default. Probably better safe than sorry so I may as well add it if it's a problem.

It's simple to do but means we're stuck with all the cruft in the cuda plugin as we won't know which form of delete will be called.

It'll be a really easy patch for downstream plugins to adopt (Intel has one iiuc?). The other use case is end users who have their plugin that they use with unrelated clang who I'm not sure exist. In either case, these toolchains will have already manually changed their libomptarget to get it to load their plugin already as we don't have any form of automatic discovery in place.

I think we should change the calling convention and remove all the map lookup stuff from the cuda plugin unless we have a compelling reason not to, where miscellaneous concern about slightly breaking unknown users that are reliant on internal interfaces doesn't seem compelling.

This revision is now accepted and ready to land.Sep 1 2022, 8:28 AM
tianshilei1992 added a comment.EditedSep 1 2022, 8:57 AM

Is it stated somewhere? I had a lot of trouble about compatibility before when I tried to change the interface. I'm okay with breaking the backward compatibility, as long as we already tell users. :-)

It's not too difficult to just make a new interface and have the old one call the new one with the default. Probably better safe than sorry so I may as well add it if it's a problem.

It's simple to do but means we're stuck with all the cruft in the cuda plugin as we won't know which form of delete will be called.

It'll be a really easy patch for downstream plugins to adopt (Intel has one iiuc?). The other use case is end users who have their plugin that they use with unrelated clang who I'm not sure exist. In either case, these toolchains will have already manually changed their libomptarget to get it to load their plugin already as we don't have any form of automatic discovery in place.

I think we should change the calling convention and remove all the map lookup stuff from the cuda plugin unless we have a compelling reason not to, where miscellaneous concern about slightly breaking unknown users that are reliant on internal interfaces doesn't seem compelling.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

The potential change here is users who used omp_target_free to free non-device pointers as this worked before. We technically export the plugin routines as they need to be visible to libomptarget but I don't know if we expect any users to call them directly.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

The potential change here is users who used omp_target_free to free non-device pointers as this worked before. We technically export the plugin routines as they need to be visible to libomptarget but I don't know if we expect any users to call them directly.

The "user" here is old libomptarget.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

The potential change here is users who used omp_target_free to free non-device pointers as this worked before. We technically export the plugin routines as they need to be visible to libomptarget but I don't know if we expect any users to call them directly.

The "user" here is old libomptarget.

It would be an incredibly weird setup to have libomptarget load a plugin from a different install right? We install them to the same directory so it would mean that the user manually edited libomptarget after the install.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

The potential change here is users who used omp_target_free to free non-device pointers as this worked before. We technically export the plugin routines as they need to be visible to libomptarget but I don't know if we expect any users to call them directly.

The "user" here is old libomptarget.

It would be an incredibly weird setup to have libomptarget load a plugin from a different install right? We install them to the same directory so it would mean that the user manually edited libomptarget after the install.

We don't force the plugins to be loaded from the same dir as libomptarget right?

Again, like I said, we can claim LLVM 15 is the last version to guarantee backward compatibility for all the components we require. For later version all components have to be from the same build. We just need to make a consensus in the meeting, document it, and do whatever we want to change the APIs in the future.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

Then we'd have to keep the spurious hashtable inserts in alloc, and consider fixing the pointer leak in the current implementation (the table this deletes grows without bound). I.e. to keep things working for people that we don't have reason to believe exist, the cuda plugin must remain slower than it could be.

I can make my peace with that, but it looks like discarding this worthwhile patch, not applying it and taking the extra complexity without the benefit.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

The potential change here is users who used omp_target_free to free non-device pointers as this worked before. We technically export the plugin routines as they need to be visible to libomptarget but I don't know if we expect any users to call them directly.

The "user" here is old libomptarget.

It would be an incredibly weird setup to have libomptarget load a plugin from a different install right? We install them to the same directory so it would mean that the user manually edited libomptarget after the install.

We don't force the plugins to be loaded from the same dir as libomptarget right?

Again, like I said, we can claim LLVM 15 is the last version to guarantee backward compatibility for all the components we require. For later version all components have to be from the same build. We just need to make a consensus in the meeting, document it, and do whatever we want to change the APIs in the future.

When we load the plugin we just pass the name like libomptarget.rtl.cuda.so so it goes to the linker's search path which is going to search using the default path. I suppose since we use -rpath it's possible these could get mixed up.

Again, like I said, we can claim LLVM 15 is the last version to guarantee backward compatibility for all the components we require. For later version all components have to be from the same build. We just need to make a consensus in the meeting, document it, and do whatever we want to change the APIs in the future.

This would be nice considering a lot of the legacy we carry around. The fact that LLVM 15 released with proper versioned .so objects, e.g. libomptarget.so.15 helps us to suggest that. It would definitely need to be discussed.

You can still use _v2 or whatever in libomptarget. For libomptarget, there is no old interface then. Just keep the old one and all the logic for compatibility in the plugin. There is no confusion.

Then we'd have to keep the spurious hashtable inserts in alloc, and consider fixing the pointer leak in the current implementation (the table this deletes grows without bound). I.e. to keep things working for people that we don't have reason to believe exist, the cuda plugin must remain slower than it could be.

FWIW, we don't have to fix issues for old interface. That's the whole point of deprecating the old interface.

I can make my peace with that, but it looks like discarding this worthwhile patch, not applying it and taking the extra complexity without the benefit.

Of course we will not discard this patch. We just need to find the right way to do it. As long as we make agreement on compatibility, we are good to go. That's a single cut, which can save a lot future arguments.

We discussed how the plugin interfaces don't need backwards compatibility in the meeting today. @tianshilei1992 is this good to land?

tianshilei1992 accepted this revision.Sep 14 2022, 10:11 AM

Yup, let's go!