This patch implements the code generation for OpenMP 5.0 declare mapper (user-defined mapper) constructs. For each declare mapper, a mapper function is generated. These mapper functions will be called by the runtime and/or other mapper functions to achieve user defined mapping.
The design slides can be found at https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx
Details
- Reviewers
ABataev hfinkel Meinersbur kkwli0 jdoerfert - Commits
- rGd47b9438d7b7: [OpenMP 5.0] Codegen support for user-defined mappers.
rL367905: [OpenMP 5.0] Codegen support for user-defined mappers.
rC367905: [OpenMP 5.0] Codegen support for user-defined mappers.
rGa04ffdbb05fb: [OpenMP 5.0] Codegen support for user-defined mappers.
rC367773: [OpenMP 5.0] Codegen support for user-defined mappers.
rL367773: [OpenMP 5.0] Codegen support for user-defined mappers.
Diff Detail
Event Timeline
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8036 | This code has too many common parts with the existing one. Is it possible to merge it somehow or outline into a function? | |
8554 | Hmm, how could you calculate the required size of the array for the mapper? Or this is constant? | |
8588 | I think it would be good to move this part to the runtime. We should just pass the mapper function to the runtime functions and the runtime should call those mapper functions and get base pointers, pointers, sizes and maptypes. | |
8728 | With the buffering-based implementation we need only function. | |
lib/CodeGen/CGOpenMPRuntime.h | ||
351 | Noope, this must be the part of this patch, because it may cause the crash of the compiler during testing. | |
808 | I don't remember precisely, but probably \a->\p | |
812 | Use /// style of comment |
Hi Alexey,
Again, thanks for your review! Sorry I didn't get back to you in time because I was distracted by other things. Please see the comments inline.
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8036 | I tried to merge it with generateAllInfo. The problem is generateAllInfo also generates information for clauses including to, from, is_device_ptr, use_device_ptr, which don't exist for declare mapper. There is no clear way to extract them separately. For example, every 4 or 5 lines, the code is intended to address a different clause type. | |
8554 | I'm not sure I understand your question here. | |
8588 | This part cannot be moved into the runtime, because the runtime does not know the map type associated with the mapper. Another argument can be potentially added to the runtime API, but that will be more work and I don't think it's necessary | |
8728 | Yes, in either case, we only generate functions here. Is there a problem? | |
lib/CodeGen/CGOpenMPRuntime.h | ||
351 | It will not crash the compiler, because this UDMap is only written in this patch, never read. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8036 | If those clauses do not exist for the declare mapper, it is fine, no problems with them. If they don't exist, we can't generate anything for them, no? | |
8554 | Yes, it is the question about the size of mapper array. It is the part of our discussion about mappers generation we had before. You said that it is hard to generate the sizes of the arrays since we don't know the number of map clauses at the codegen phase. bu there we have this number. | |
8588 | I think it is better again discuss the runtime part of the patch, because everything else depends on the runtime. I would suggest to try to implement the solution we discussed before, where the required data is stored in the runtime dynamic arrays and only after that it is used to transfer the data. | |
8728 | Sorry, I meant we'll need only one function. | |
lib/CodeGen/CGOpenMPRuntime.h | ||
351 | Still, you should clear it in this patch. Otherwise, you're breaking data-dependency between the patches and this is not good at all. |
Hi Alexey,
Let's discuss your runtime data mapping scheme next week first. After that we will continue the review of this.
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8554 | Sorry there was probably some miscommunication. What I meant is that after fully expanded, for example, from map(mapper(id):a[0:n]), eventually to map(a.b.c[0:e]) map(a.k) ..., the number of things in the results is unknown at compile time. Here, we only do one level of expansion of one instance based on the declare mapper directive, for example, the mapper is declare mapper(class A a) map(a.b[0:a.n]) map(a.c) | |
8728 | Yes, in that case, only one is needed. | |
lib/CodeGen/CGOpenMPRuntime.h | ||
351 | Sure, if you think that's absolutely necessary, I can add it to this patch. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | This function looks like the universal one, regardless of the type <type_name> specifics. Do we really need to generate it for each particular type and mapper? Or we could use the same function for all types/mappers? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | I think we need a particular mapper function for each type and mapper, because the code generated within the mapper function depends on what type and what mapper it is. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | Hmm, maybe I'm wrong but I don't see significant mapper or type-specific dependencies in this mapper function. It uses the pointer to type and size of the type, but this information can be generalized, I think. Could you point the lines of code that are type and mapper specific? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | Code between line 8857-8965 is type and mapper specific. For instance, generateAllInforForMapper depends on the map clauses associated with the mapper and the internal structure of struct/class type, and generates difference code as a result. BasePointers.size() also depends on the above things. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | Most of these data can be passed as parameters to the function. It would be good, if we could move this function to the libomptaret library and reduce the number of changes (and, thus, complexity) of the compiler itself. It is always easier to review and to maintain the source code written in C/C++ rather than the changes in the compiler codegen. Plus, it may reduce the size of the final code significantly, I assume. I would appreciate it if you would try to move this function to libomptarget. | |
8852 | These data can be passed as parameters to the function, no? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | Hi Alexey, I don't think libomptarget can do this efficiently. For example, class C { int a; double *b } #pragma omp declare mapper(C c) map(c.a, c.b[0:1]) The codegen can directly know there are 2 components (c.a, c.b[0]) in this mapper function (3 actually when we count the pointer), and it can also know the size, starting address, map type, etc. about these components. Passing all these information to libomptarget seems to be a bad idea. Or did I get your idea wrong? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | Yes, I understand this. But can we pass some additional parameters to this function so we don't need to generate a unique copy of almost the same function for all types/mappers? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8724 | For different types/mappers, the skeleton of mapper functions are similar (i.e., the things outlined in the comment here). I would say most other code is unique, for instance, the code to prepare parameters of call to __tgt_push_mapper_component. These code should be much more compared with the skeleton shown here. I cannot think of a way to reduce the code by passing more parameters to this function. Please let me know if you have some suggestions. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Currently currentComponent is generated by the compiler. But can we instead pass this data as an extra parameter to this omp_mapper function. | |
8735 | I don't see this part of logic in the code. Could you show me where is it exactly? | |
8780 | Bad idea to do this. Better to use something like this: SmallString<256> TyStr; llvm::raw_svector_ostream Out(TyStr); CGM.getCXXABI().getMangleContext().mangleTypeName(Ty, Out); |
Fix mapper function name mangling
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Emm, I think this scheme will be very difficult and inefficient. If we pass components as an argument of omp_mapper function, it means that the runtime needs to generate all components related to a map clause. I don't think the runtime is able to do that efficiently. On the other hand, in the current scheme, these components are naturally generated by the compiler, and the runtime only needs to know the base pointer, pointer, type, size. etc. | |
8735 | This part doesn't exist in this patch. Currently we don't really look up the mapper for any mapped variable/array/etc. The next patch will add the code to look up the specified mapper for every map clause, and get the mapper function for them correspondingly. | |
8780 | Sounds good. Thanks! |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | With the current scheme, we may end with the code blowout. We need to generate very similar code for different types and variables. The worst thing here is that we will be unable to optimize this huge amount of code because the codegen relies on the runtime functions and the code cannot be inlined. That's why I would like to move as much as possible code to the runtime rather than to emit it in the compiler. | |
8735 | Then we need at least some kind of TODO note that this part is not implemented in this patch. | |
8768 | Always better to use constructor with the location to generate correct debug info for all the parameters. | |
8861–8862 | I don't like this code very much! It hides the logiс ща the MEMBER_OF flag deep inside and it is going to be very hard to update it in future if there are some changes in the flags. | |
8929 | I don't see this logic in the comment for the function. Could you add more details for all this logic implemented here? | |
8990 |
| |
8998 | Use StringRef or SmallString | |
9015–9021 | Enclose all substatements into braces or none of them. |
Address Alexey's comments
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | I understand your concerns. I think this is the best we can do right now. The most worrisome case will be when we have nested mappers within each other. In this case, a mapper function will call another mapper function. We can inline the inner mapper functions in this scenario, so that these mapper function can be properly optimized. As a result, I think the performance should be fine. | |
8861–8862 | Add a function to calculate this offset. Also modify another existing place using the hard coded number 48. |
Sorry for the delay, Lingda. I tried to find some better solution for this.
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
749 | Do we really need to use int64_t for number of elements? size_t must be enough. | |
8734 | Instead, we can use indirect function calls passed in the array to the runtime. Do you think it is going to be slower? In your current scheme, we generate many runtime calls instead. Could you try to estimate the number of calls in cases if we'll call the mappers through the indirect function calls and in your cuurent scheme, where we need to call the runtime functions many times in each particular mapper? | |
8893 | You can use nuw attribute here, I think | |
9019 | You can use C.toBits(CGM.getSizeSize()) | |
9041 | Just CGM.getSizeSize() | |
9042 | Add nuw attribute |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Hi Alexey, Sorry I don't understand your idea. What indirect function calls do you propose to be passed to the runtime? What are these functions supposed to do? The number of function calls will be exactly equal to the number of components mapped, no matter whether there are nested mappers or not. The number of components depend on the program. E.g., if we map a large array section, then there will be many more function calls. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | I mean the pointers to the mapper function, generated by the compiler. In your comment, it is c.Mapper() |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | If we pass nested mapper functions to the runtime, I think it will slow down execution because of the extra level of indirect function calls. E.g., the runtime will call omp_mapper1, which calls the runtime back, which calls omp_mapper2, .... This can result in a deep call stack. I think the current implementation will be more efficient, which doesn't pass nested mappers to the runtime. One call to the outer most mapper function will have all data mapping done. The call stack will be 2 level deep (the first level is the mapper function, and the second level is __tgt_push_mapper_component) in this case from the runtime. There are also more compiler optimization space when we inline all nested mapper functions. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
749 | Because we use the return value to shift the memberof filed of map type, which is int64_t, so I think int64_t makes sense here. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
749 | Ok, there is a discrepancy between runtime part and compiler part: __tgt_push_mapper_component uses size_t for size, while the runtime function uses int64_t. It won't work for 32bit platform. | |
8734 | Yes, if we leave it as is. But if instead of the bunch unique functions we'll have the common one, that accept list if indirect pointers to functions additionally, and move it to the runtime library, we won't need those 2 functions we have currently. We'll have full access to the mapping data vector in the runtime library and won't need to use those 2 accessors we have currently. Instead, we'll need just one runtime functions, which implements the whole mapping logic. We still need to call it recursively, but I assume the number of calls will remain the same as in the current scheme. Did you understand the idea? If yes, it would good if you coild try to estimate the number of function calls in current scheme and in this new scheme to estimate possible pros and cons. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Hi Alexey, Could you give an example for this scheme? 1) I don't understand how the mapper function can have full access to the mapping data vector without providing these 2 accessors. 2) I don't think it is possible to have a common function instead of bunch of unique functions for each mapper declared. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
749 | This should work on 32bit machines, since 32bit machines also use int64_t for the map type? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
752 | Check the declaration of the runtime function in the runtime patch and here. size parameter here has type size_t, while in runtime part it is int64_t | |
8734 | Hi Lingda, something like this. void __tgt_mapper(void *base, void *begin, size_t size, int64_t type, auto components[]) { // Allocate space for an array section first. if (size > 1 && !maptype.IsDelete) <push>(base, begin, size*sizeof(Ty), clearToFrom(type)); // Map members. for (unsigned i = 0; i < size; i++) { // For each component specified by this mapper: for (auto c : components) { if (c.hasMapper()) (*c.Mapper())(c.arg_base, c.arg_begin, c.arg_size, c.arg_type); else <push>(c.arg_base, c.arg_begin, c.arg_size, c.arg_type); } } // Delete the array section. if (size > 1 && maptype.IsDelete) <push>(base, begin, size*sizeof(Ty), clearToFrom(type)); } void <type>.mapper(void *base, void *begin, size_t size, int64_t type) { auto sub_components[] = {...}; __tgt_mapper(base, begin, size, type, sub_components); } |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
752 | The runtime part uses int64_t because I see every other runtime function use int64_t instead of size_t for size, e.g., __tgt_target, __tgt_target_teams, etc., despite that they are declared to be size_t in Clang codegen. So I guess it is done on purpose? Otherwise we need to modify all these runtime function interface in the future. | |
8734 | Hi Alexey, I don't think this scheme is more efficient than the current scheme. My reasons are:
|
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
752 | I recently committed the patch that fixes this problem in clang. If you're using int64_t in the runtime, the same type must be used in clang codegen. | |
8734 | Hi Lingda, I'm trying to simplify the code generated by clang and avoid some unnecessary code duplications. If the complexity of this scheme is the same as proposed by you, I would prefer to use this scheme unless there are some other opinions.
Also, this solution reduces the number of required runtime functions, instead of 2 we need just 1 and, thus, we need to make fewer runtime functions calls. I think it would better to propose this scheme as an alternate design and discuss it in the OpenMP telecon. What do you think? Or we can try to discuss it in the offline mode via the e-mail with other members. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
752 | That's nice to know. Then I will change the type of size to int64_t as well. | |
8734 | Hi Alexey, I still prefer the current scheme, because:
But if you have a different option, we can discuss it next time in the meeting. I do have a time constraint to work on the mapper implementation. I'll no longer work in this project starting this September, and I have about 30% of my time working on it until then. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Lingda,
I see :(. I understand your concern. In this case, we could try to discuss it offline, in the mailing list, to make it a little bit faster. We just need to hear other opinions on this matter, maybe there are some other pros and cons for these schemes. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Hi Alexey, Sure, let's discuss this in the mailing list. I'll summarize it and send it to the mailing list later.
We won't have recursive calls with inlining.
I think it's even harder to optimized these code out when we move them into the runtime.
I don't quite understand. It's still equal to the number of mappers in any case. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
Good, thanks!
We won't have recursive calls anyway (recursive types are not allowed). Plus, I'm not sure that inlining is the best option here. We have a lot of code for each mapper and I'm not sure that the optimizer will be able to squash it effectively.
Definitely not, unless we use LTO or inlined runtime. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
Sorry I should not say recursive calls. Here it needs to "recursively" call other mapper functions in case of nested mappers, but we don't need it in case of inlining.
But you are proposing to move many code to the runtime here, right? That doesn't make sense to me. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
I'm just not sure that there going be significant problems with the performance because of that. And it significantly simplifies codegen in the compiler and moves the common part into a single function. Plus, if in future we'll need to modify this functionality for some reason, 2 different versions of the compiler will produce incompatible code. With my scheme, you still can use old runtime and have the same functionality as the old compiler and the new one. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Hi Alexey, I think more carefully about your scheme, and I don't think we can solve the 2 problems below with this scheme:
Based on these reasons, I think the current scheme is still more preferable. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Hi Lingda,
|
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
|
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
|
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
Besides, I don't think there is a valid reason that the current scheme is not good. You mentioned it's complex codegen. But it only has less 200 loc here, I don't see why it is complex. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
I think it would be better to share it with others and listen to their opinions. It is better to spend some extra time to provide good design. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 |
Sure, I can release the discussion to the mailing list. I don't see a reason to use the new scheme now. |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Lingda, I meant to send the message to the OpenMP telecon list :) Could you forward the same e-mail to Ravi and others, please? |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
8734 | Sure, probably no one in the public mailing list will care about it. I'll send it to the bi-weekly meeting list. |
Sure, I'll do it next week, since I'm on vacation and don't have access to my desktop.
lib/CodeGen/CGDecl.cpp | ||
---|---|---|
2499 | Why do we need to emit it for simd only mode? | |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
1684 | You're looking for CGF.CurFn twice here, used find member function instead and work with iterator. | |
7061–7069 | Maybe it is better to define a constant constexpr uint64_t OMP_MEMBER_OF_RANK = 48 and then deduce OMP_MAP_MEMBER_OF as ~((1<<OMP_MEMBER_OF_RANK) - 1)? | |
8040 | AFAIK, LLVM has dropped support for msvc 2013, do we still need this? | |
lib/CodeGen/CGOpenMPRuntime.h | ||
352 | Use using instead of typedef. |
lib/CodeGen/CGDecl.cpp | ||
---|---|---|
2499 | This code is not emitting mapper for simd only mode. | |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
7061–7069 | In libomptarget, the same way is used to define OMP_TGT_MAPTYPE_MEMBER_OF: OMP_TGT_MAPTYPE_MEMBER_OF = 0xffff000000000000. So I think they should stay the same. Btw, the number 48 is directly used in libomptarget now, which may need to change in the future. In your code, it assumes bits higher than 48 are all OMP_MAP_MEMBER_OF, which may not be true in the future. My code here is more universal, although it does not look great. What do you think? |
lib/CodeGen/CGDecl.cpp | ||
---|---|---|
2499 | Ah, yes, it is the condition for the early exit. | |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
7061–7069 | You can apply a mask to drop some of the most significant bits if required. My code looks much cleaner it would be good to have the same code in libomptarget too. | |
8040 | Ping. | |
lib/CodeGen/CGOpenMPRuntime.h | ||
805–810 | I don't think we need virtual functions here, non-virtual are good enough. | |
2101 | I think you can drop this function here if the original function is not virtual |
lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
7061–7069 | Maybe let's keep it this way in this patch, then we modify both places in a further patch? | |
8040 | I'm okay either way. I guess it doesn't hurt to keep it? | |
lib/CodeGen/CGOpenMPRuntime.h | ||
2101 | The function for simd only mode includes a llvm_unreachable, so I think it's still needed as a virtual function above |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | It is better to reduce number of virtual functions, if possible. |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | What about virtual function? |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | Since emitUserDefinedMapper here overrides the same function in CGOpenMPRuntime, I think emitUserDefinedMapper of CGOpenMPRuntime needs to be defined as virtual? |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | If you are asking about what virtual function I removed, it is emitUDMapperArrayInitOrDel of CGOpenMPRuntime which is never overrided. |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | I would suggest to not make this virtual too and remove overriden version. It does not help. |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | If we remove virtual from emitUserDefinedMapper, we will have to define it in both CGOpenMPRuntime and CGOpenMPSIMDRuntime, and its definition is identical in both places. I don't think that's good software engineering practice? |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
2101 | WE just don't need an overridden version in CGOpenMPSIMDRuntime. Just remove virtual and remove the overridden function from CGOpenMPSIMDRuntime. |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
810 | Seems to me, this function is used only in emitUserDefinedMapper. I think you can make it static local in the CGOpenMPRuntime.cpp and do not expose it in the interface. |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
810 | emitUserDefinedMapper needs to call createRuntimeFunction of CGOpenMPRuntime, which is private. Which one do you think is better, make createRuntimeFunction public, or have emitUserDefinedMapper not defined in CGOpenMPRuntime? It seems to me that they are similar |
lib/CodeGen/CGOpenMPRuntime.h | ||
---|---|---|
810 | Then make emitUDMapperArrayInitOrDel private instead. |
test/OpenMP/declare_mapper_codegen.cpp | ||
---|---|---|
44–48 | Just like in this test when you're using vars. |
test/OpenMP/declare_mapper_codegen.cpp | ||
---|---|---|
44–48 | Sorry I was not clear before. What do you mean by "predetermined indices" here? If you are referring to, for example, %0 in store i8* %0, i8** [[HANDLEADDR:%[^,]+]], I guess there is no way to get rid of %0 because it means the first argument of the function? |
test/OpenMP/declare_mapper_codegen.cpp | ||
---|---|---|
44–48 | Yes, I meant those %0 like registers. Better to mark them as variables in function declaration and use those names in the checks. |
test/OpenMP/declare_mapper_codegen.cpp | ||
---|---|---|
44–48 | Now it's like define {{.*}}void @.omp_mapper.{{.*}}C.id{{.*}}(i8*, i8*, i8*, i64, i64), I think you are suggesting something like define {{.*}}void @.omp_mapper.{{.*}}C.id{{.*}}(i8* [[HANDLE:%[^,]+]], i8* [[BPTR:%[^,]+]], ...), and later I can use store i8* [[HANDLE]], i8** [[HANDLEADDR:%[^,]+]] I'm not sure how to add names for function arguments. They seems to be always nameless like (i8*, i8*, i8*, i64, i64). Is there a way to do that? |
test/OpenMP/declare_mapper_codegen.cpp | ||
---|---|---|
44–48 | If the clang parameters have names, the llvm params also will get the names. But it is not worth it to add the names to the function. Could just use regexp here to avoid using LLVM register names? Just %{{·+}}. And rely on the order, i.e. remove -DAG checks? |
Why do we need to emit it for simd only mode?