Mon, Jan 13
Wed, Jan 8
I like this general idea; couple of thoughts...
Sun, Jan 5
Tue, Dec 31
Mon, Dec 30
Do you expect any effect at all from doing this? These are all scheduling barriers?
Sat, Dec 28
Fri, Dec 27
Thu, Dec 26
Dec 18 2019
Dec 17 2019
Dec 16 2019
Dec 13 2019
If we really want to be confident that this is robust, we should probably fuzz-test it. Regardless, this seems like a definite improvement. LGTM.
Dec 12 2019
Given we have two implementations, each at a different point in the pipeline, it might be constructive for each of you to write down why you chose that point. I suspect the rationale is hidden by the implementation details.
Actually, early resolution will break the tools, not help them. It will definitely break clangd, for example. The user will try to navigate to the originally called function base and will instead be redirected to the function hst.
Can you please be more specific? Please explain why the user would consider this incorrect behavior. If the point of the tool is to allow the user to navigate to the function actually being called, then navigating to base seems incorrect if that's not the function being called. This is just like any other kind of overload resolution - the user likely wants to navigate to the function being called.
Now the user might want an OpenMP-aware tool that understands differences between host and accelerator behavior, how that affects which functions are called, etc. The user might want this for host/device overloads in CUDA too, but this is really an orthogonal concern.
You wrote the code. You called a function in the expression. Now you want to navigate to this function. You click on it, and instead of the called base you are redirected to hst, because the AST has a link to the hst function in the expression instead of to base.
Sure, but it has that link because that hst function is actually the function being called (assuming that the clangd-using-tool is configured to interpret the code as if compiling for the host). When I click on a function call in a source file, I want to navigate to the function actually being called. I certainly understand that the function being called now depends on compilation context, and the tool in our example is only providing the resolution in one context, but at least it provides one valid answer. An OpenMP-aware tool could navigate to the base function (we do need to preserve information to make this possible). This is just like dealing with some host/device functions in CUDA (where there are separate overloads) - if you click on the function in such a tool you'll probably navigate to the host variant of the function (even if, in some other context, the device overload might be called).
Again, I see this as exactly analogous to overload resolution, or as another example, when calling a function template with specializations. When using such a tool, my experience is that users want to click on the function and navigate to the function actually being called. If, for example, I have a function template with specializations, if the specialized one is being called, I should navigate to the specialization being called (not the base function template).
Or you wrote the wrong context matcher. Or you have a bug in the base function, which should be called by default everywhere but on the host, and you want to fix it. Etc.
While we talk a lot about what you think is bad about this solution it seems we ignore the problems in the current one. Let me summarize a few:
- Take https://godbolt.org/z/XCjQUA where the wrong function is called in the target region (because the "hack" to inject code in the wrong definition is not applicable).
No time for it, just short answers. No definition for the variant - no definition for the base.
Dec 10 2019
Also, check how -ast-print works with your solution. It returns a different result than expected because you transform the code too early. That is incorrect behavior.
This is debatable. AST print does not print the input but the AST, thus what is correct wrt. OpenMP declare variant is nowhere defined but by us.
Arguably, having it print the actually called function and not the base function is preferable. Thus, the new way is actually more informative.
You're completely wrong here! We should keep the original AST. It is used by several tools, and you significantly modify the user code. I consider it a real issue here.
Dec 9 2019
We restricted it for now to function definitions so we don't need to define the mangling as you cannot expect linking. (I did this to get it in TR8 while I figured it will solve all our math.h problems already).
However, we need to avoid collisions with user code, e.g., through the use of symbols in the name that are not allowed to be used by the user (I thought "." is one of them).
Okay, but how do we distinguish functions for which there is a declaration and we need the mangling because the user has provided a definition elsewhere, from those for which there is a declaration and we don't want mangling because we need to link against some system library?
The idea is, declarations inside begin/end declare variant are supposed to be not affected by the begin/end declare variant. That is, if you have declarations you cannot expect variant multi-versioning to happen. Having declarations inside or outside the begin/end declare variant is still fine if they all denote the same function.
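A small illustration of that rule (the function name and match clause are made up for the example; this is a sketch, not the actual patch):

```c
/* Only *definitions* inside a begin/end declare variant region are
 * multi-versioned. A declaration inside the region is unaffected and
 * still names the one ordinary 'work' defined below, so declarations
 * inside and outside the region may denote the same function. */
#pragma omp begin declare variant match(device = {kind(gpu)})
int work(int x); /* declaration only: no variant is created */
#pragma omp end declare variant

int work(int x) { return x + 1; } /* the single, shared definition */
```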
Dec 8 2019
Dec 6 2019
But I can implement !setop if you prefer. (And while I'm at it, perhaps !getop too.)
Did you consider introducing something to change the dag operator instead of this? I think it might be cleaner to have some kind of thing which changes the dag operator, and after that, if you wish to !con them, everything works as expected.
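To ground the discussion, here is a hedged sketch of what such operators might look like (the spellings !getop/!setop and the semantics are purely illustrative; nothing here is an agreed design):

```tablegen
// Hypothetical operators for manipulating a dag's operator node:
//   !getop(d)      -- returns the operator of dag d
//   !setop(d, op)  -- returns a copy of d with its operator replaced by op
//
// Rewriting a pattern's root would then become:
//   !setop((add $a, $b), sub)   ==>   (sub $a, $b)
// after which the result can be !con'ed with further children as usual.
```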
Dec 2 2019
Hrmm... the current naming convention predates me, but it wasn't difficult to figure out if you're familiar with the ISA and the fact that the record forms have mnemonics ending in a '.'. Honestly, replacing 'o' with 'r' doesn't seem all that much better to me, as I still need to know that the 'r' stands for record form and understand what that means. It's not like searching the ISA manual for '.', 'o', or 'r' is going to buy one much of anything. I am sympathetic to none of these being immediately obvious, and I'd be happy with some fix to that situation.
Nov 19 2019
Thanks for posting patches to fix problems with the APFloat implementation. FWIW, this code is hard to review, from my perspective, because it lacks comments (both your new code and the code it is replacing). As you clearly have gone through this and understand what it's doing, could you add a bunch of comments explaining what's going on? I think that would be very helpful for everyone else.
Nov 17 2019
Nov 6 2019
Do we need this at all? Can't we just emit @llvm.assume calls before the loop if we know the alignment? I've done this in the past and it's worked well (by using __builtin_assume_aligned from Clang).
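A minimal sketch of that approach from the Clang side (the function and its parameters are illustrative):

```c
#include <stddef.h>

/* Promising the compiler that p is 32-byte aligned before the loop lets
 * the vectorizer emit aligned loads/stores instead of runtime alignment
 * checks; in IR this lowers to an @llvm.assume-style alignment hint. */
void scale(float *p, size_t n, float s) {
    float *q = (float *)__builtin_assume_aligned(p, 32);
    for (size_t i = 0; i < n; ++i)
        q[i] *= s;
}
```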
Nov 5 2019
Please add a test case.
Nov 4 2019
If this is wrong (or buggy), then it will probably break self-hosting on PPC, so then we'll know.
Oct 30 2019
Oct 29 2019
Oct 28 2019
Oct 25 2019
This is the same as #pragma clang loop vectorize(assume_safety)?
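For reference, the pragma in question is used like this (the loop is just an illustration):

```c
/* assume_safety tells the loop vectorizer it may assume there are no
 * loop-carried memory dependences it must respect in this loop. */
void saxpy(const float *x, float *y, int n, float a) {
#pragma clang loop vectorize(assume_safety)
    for (int i = 0; i < n; ++i)
        y[i] += a * x[i];
}
```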
Oct 23 2019
Oct 22 2019
I don't see how such a warning can help a user. A note about the impact of the pragma on performance can be put into the documentation. Issuing a warning on every use of the pragma may be annoying.
Oct 21 2019
Oct 15 2019
Thanks for posting this. Please send a plain email explaining the proposal to llvm-dev - RFCs regarding new intrinsics and other IR extensions should go there to reach a wide audience.
Oct 10 2019
How do you imagine that we'd use this? Do we need some kind of size to go along with this?
Oct 4 2019
How far down this road can we go? I would like to see infix operators in TableGen because I believe that it will make it easier to use. However, I'm concerned about just adding add as a special case. Of the other similar operators:
Oct 1 2019
Sep 27 2019
Maybe everything is fine, but given this setup, does anyone see any potential problems with compiling these functions for nvptx? I'd like to eventually see a mode where we compile an appropriate subset of these functions for GPU targets -- either in bitcode form for linking with our device-side OpenMP runtime library or as a native object -- to provide a more feature-complete offloading environment.
Sep 26 2019
Is it possible to check the permutation mask in the test case? (I suppose that these are auto-generated tests, but the test coverage isn't all that good if we can't check the mask.)
I apologize for the delay; I've been contemplating what to recommend here...
Sep 25 2019
Sep 23 2019
Please post patches with full context (the full source file should be here). I can't see where moveCodeToFunction is called.
Sep 18 2019
One thought that occurs to me when reviewing this. I think we will assume that F18/flang when it lands in the LLVM project will be an optional thing to build and will be an opt-in project at the start that is off by default. What happens when clang --driver-mode=flang is called and flang has not been built? And where would we put a test for that behaviour? If flang were not to be built, should --driver-mode=flang be unsupported/unrecognised and emit an error message?
Sep 17 2019