- User Since
- Nov 16 2012, 6:02 PM (384 w, 3 d)
Fri, Mar 27
I don't object to doing this, but...
Thu, Mar 26
Wed, Mar 25
Tue, Mar 24
Fri, Mar 6
Wed, Mar 4
Update formatting, etc., and add a reference in the new-contributor section of the dev policy in response to reviewer feedback.
Feb 29 2020
Updated to remove blank line and fix that (1 < N < M) should be (1 <= N <= M).
Feb 28 2020
Updated per review feedback. Please let me know if you see anything else that should be addressed pre-commit.
I would like to thank everyone who provided feedback on this patch. I apologize for the slow turn-around time on this. I'll upload a revised version that, I believe, addresses all feedback.
Feb 27 2020
Unfortunately, we cannot do this kind of thing just because it seems to make sense. The language semantics must be exactly satisfied by the IR-level semantics. I certainly agree that it would make sense for users to be able to mark invariant loads, but this mechanism simply might not be the right one.
Feb 19 2020
Feb 18 2020
Feb 11 2020
Feb 7 2020
Feb 4 2020
Jan 31 2020
Jan 27 2020
Jan 24 2020
Jan 21 2020
Jan 20 2020
I'm not sure whether this is deliberate (but it seems weird) or just a bug. I can ask the GCC developers ...
Jan 19 2020
Can we, at least, put this constant in a header file so we don't repeat it in several places? Also, can we write it as 0x20000000 so that it's more obvious what the value is.
Jan 13 2020
Jan 8 2020
I like this general idea; couple of thoughts...
Jan 5 2020
Dec 31 2019
Dec 30 2019
Do you expect any effect at all from doing this? These are all scheduling barriers?
Dec 28 2019
Dec 27 2019
Dec 26 2019
Dec 18 2019
Dec 17 2019
Dec 16 2019
Dec 13 2019
If we really want to be confident that this is robust, we should probably fuzz-test it. Regardless, this seems like a definite improvement. LGTM.
Dec 12 2019
Given we have two implementations, each at different points in the pipeline, it might be constructive to each write down why you each choose said point. I suspect the rationale is hidden by the implementation details.
Actually, early resolution will break tbe tools, not help them. It will definitely break clangd, for example. The user will try to navigate to originally called function base and instead he will be redirected to the function hst.
Can you please be more specific? Please explain why the user would consider this incorrect behavior. If the point of the tool is to allow the user to navigate to the function actually being called, then navigating to base seems incorrect if that's not the function being called. This is just like any other kind of overload resolution - the user likely wants to navigate to the function being called.
Now the user might want an OpenMP-aware tool that understands differences between host and accelerator behavior, how that affects which functions are called, etc. The user might want this for host/device overloads in CUDA too, but this is really an orthogonal concern.
You wrote the code. You called a function in the expression. Now you want to navivate to this function. Clicked on it and instead of the called base you are redirected to hst because AST has the link to hst functiin inthe expression instead of the base.
Sure, but it has that link because that hst function is actually the function being called (assuming that the clangd-using-tool is configured to interpret the code as if compiling for the host). When I click on a function call in a source file, I want to navigate to the function actually being called. I certainly understand that the function being called now depends on compilation context, and the tool in our example is only providing the resolution in one context, but at least it provides one valid answer. An OpenMP-aware tool could navigate to the base function (we do need to preserve information to make this possible). This is just like dealing with some host/device functions in CUDA (where there are separate overloads) - if you click on the function in such a tool you'll probably navigate to the host variant of the function (even if, in some other context, the device overload might be called).
Again, I see this as exactly analogous to overload resolution, or as another example, when calling a function template with specializations. When using such a tool, my experience is that users want to click on the function and navigate to the function actually being called. If, for example, I have a function template with specializations, if the specialized one is being called, I should navigate to the specialization being called (not the base function template).
You wrote wrong context matcher. You has a bug in the base function, which should be called by default everu sw here but the host, and want to fix it. Etc.
While we talk a lot about what you think is bad about this solution it seems we ignore the problems in the current one. Let me summarize a few:
- Take https://godbolt.org/z/XCjQUA where the wrong function is called in the target region (because the "hack" to inject code in the wrong definition is not applicable).
No time for it, just short answers. No definition for the variant - no definition for the base.
Dec 10 2019
Also, check how -ast-print works with your solution. It returns different result than expected because you're transform the code too early. It is incorrect behavior.
This is debatable. AST print does not print the input but the AST, thus what is correct wrt. OpenMP declare variant is nowhere defined but by us.
Arguably, having it print the actually called function and not the base function is preferable. Thus, the new way is actually more informative.
You're completely wrong here! We shall keep the original AST. This is used in several tools and you significantly modify the user code. I consider it a real issue here.
Dec 9 2019
We restricted it for now to function definitions so we don't need to define the mangling as you cannot expect linking. (I did this to get it in TR8 while I figured it will solve all our math.h problems already).
However, we need to avoid collisions with user code, e.g., through the use of symbols in the name that are not allowed to be used by the user (I thought "." is one of them).
Okay, but how to we distinguish functions for which there is a declaration and we need the mangling because the user has provided a definition elsewhere, from those for which there is a declaration, and we don't want mangling because we need to link to some system library?
The idea is, declarations inside begin/end declare variant are supposed to be not affected by the begin/end declare variant. That is, if you have declarations you cannot expect variant multi-versioning to happen. Having declarations inside or outside the begin/end declare variant is still fine if they all denote the same function.
Dec 8 2019
Dec 6 2019
But I can implement !setop if you prefer. (And while I'm at it, perhaps !getop too.)
Did you consider introducing something to change the dag operator instead of this? I think it might be cleaner to have some kind of thing which changes the dag operator, and after that, if you wish to !con them, everything works as expected.
Dec 2 2019
Hrmm... the current naming convention predates me, but it wasn't difficult to figure out. If you're familiar with the ISA and the fact that the record forms have mnemonics ending in a '.'. Honestly, replacing 'o' with 'r' doesn't seem all that much better to me, as I still need to know that the 'r' stands for record form and understand what that means. It's not like searching the ISA manual for '.', or 'o', or 'r', is going to buy one much of anything. I am sympathetic to none of these being immediately obvious, and I'd be happy with some fix to that situation.
Nov 19 2019
Thanks for posting patches to fix problems with the APFloat implementation. FWIW, this code is hard to review, from my perspective, because it lacks comments (both your new code and the code it is replacing). As you clearly have gone through this and understand what it's doing, could you add a bunch of comments explaining what's going on? I think that would be very helpful for everyone else.
Nov 17 2019
Nov 6 2019
Do we need this at all? Can't we just emit @llvm.assume calls before the loop if we know the alignment? I've done this in the past and it's worked well (by using __builtin_assume_aligned from Clang).
Nov 5 2019
Please add a test case.
Nov 4 2019
If this is wrong (or buggy), then it will probably break self-hosting on PPC, so then we'll know.