twoh (Taewook Oh)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 4 2016, 3:03 PM (80 w, 5 d)

Recent Activity

Wed, Aug 16

twoh committed rL311037: Use the file name from linemarker for debug info if an input is preprocessed….
Use the file name from linemarker for debug info if an input is preprocessed…
Wed, Aug 16, 12:37 PM
twoh closed D36474: Use the file name from linemarker for debug info if an input is preprocessed source. by committing rL311037: Use the file name from linemarker for debug info if an input is preprocessed….
Wed, Aug 16, 12:37 PM
twoh added a comment to D34796: upporting -f(no)-reorder-functions flag, clang side change.

Friendly ping. @davidxl, I think there's no harm to make clang consistent with gcc for compiler options, and I wonder if you have any concerns that I may miss. Thanks!

Wed, Aug 16, 10:42 AM
twoh added a comment to D36474: Use the file name from linemarker for debug info if an input is preprocessed source..

Ping. Thanks!

Wed, Aug 16, 10:38 AM

Thu, Aug 10

twoh committed rL310642: Make .file directive to have basename only.
Make .file directive to have basename only
Thu, Aug 10, 11:18 AM
twoh closed D36018: Make .file directive to have basename only by committing rL310642: Make .file directive to have basename only.
Thu, Aug 10, 11:18 AM

Tue, Aug 8

twoh updated the diff for D36474: Use the file name from linemarker for debug info if an input is preprocessed source..

Addressing dblaikie's comments. Thanks!

Tue, Aug 8, 11:10 AM
twoh updated the diff for D36474: Use the file name from linemarker for debug info if an input is preprocessed source..

Relocate.

Tue, Aug 8, 10:39 AM
twoh created D36474: Use the file name from linemarker for debug info if an input is preprocessed source..
Tue, Aug 8, 10:37 AM

Mon, Aug 7

twoh added a comment to D36018: Make .file directive to have basename only.

Ping. Thanks!

Mon, Aug 7, 9:35 AM

Sat, Aug 5

twoh added a comment to D34796: upporting -f(no)-reorder-functions flag, clang side change.

I think it is generally good to match what GCC does to not to confuse people.

Sat, Aug 5, 11:47 AM

Thu, Aug 3

twoh committed rL309985: Move unit test to the proper location.
Move unit test to the proper location
Thu, Aug 3, 2:08 PM
twoh closed D36282: Move unit test to the proper location by committing rL309985: Move unit test to the proper location.
Thu, Aug 3, 2:08 PM
twoh created D36282: Move unit test to the proper location.
Thu, Aug 3, 1:13 PM

Mon, Jul 31

twoh updated the diff for D34796: upporting -f(no)-reorder-functions flag, clang side change.

Update documentation. Please let me know if I need to update other documents as well. Thanks!

Mon, Jul 31, 12:01 PM
twoh added a comment to D34795: Supporting -f(no)-reorder-functions flag, llvm side change.

@davidxl I think it is theoretically possible, if the if branch is not taken on line 294. Did I miss something? Thanks!

Mon, Jul 31, 9:55 AM
twoh added a comment to D34795: Supporting -f(no)-reorder-functions flag, llvm side change.

Ping. Thanks!

Mon, Jul 31, 8:51 AM
twoh added a comment to D34796: upporting -f(no)-reorder-functions flag, clang side change.

Ping. Thanks!

Mon, Jul 31, 8:51 AM

Fri, Jul 28

twoh updated the diff for D36018: Make .file directive to have basename only.

Delete unnecessary line from the test.

Fri, Jul 28, 2:11 PM
twoh created D36018: Make .file directive to have basename only.
Fri, Jul 28, 2:03 PM

Jun 29 2017

twoh committed rL306758: Remove redundant copy in recurrences.
Remove redundant copy in recurrences
Jun 29 2017, 4:11 PM
twoh closed D31821: Remove redundant copy in recurrences by committing rL306758: Remove redundant copy in recurrences.
Jun 29 2017, 4:11 PM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

@wmi Good call! I fixed the code per your suggestion. Thanks!

Jun 29 2017, 1:31 PM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

minor fix.

Jun 29 2017, 11:04 AM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

Addressing comments from @wmi. Thank you for the suggestion!

Jun 29 2017, 11:03 AM

Jun 28 2017

twoh added a comment to D34795: Supporting -f(no)-reorder-functions flag, llvm side change.

https://reviews.llvm.org/D34796 is clang side change.

Jun 28 2017, 6:12 PM
twoh created D34796: upporting -f(no)-reorder-functions flag, clang side change.
Jun 28 2017, 6:10 PM
twoh created D34795: Supporting -f(no)-reorder-functions flag, llvm side change.
Jun 28 2017, 6:05 PM
twoh added a comment to D31821: Remove redundant copy in recurrences.

Ping. Thanks!

Jun 28 2017, 11:15 AM

Jun 25 2017

twoh added a comment to D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.

Hello @Ayal, can you please update the comments per @hfinkel's request so that I can accept the patch? Thanks!

Jun 25 2017, 2:58 PM

Jun 20 2017

twoh updated the diff for D31821: Remove redundant copy in recurrences.

Addressing @wmi's concern by limiting the targets to the recurrence cycles that only the last instruction of the recurrence (that feeds the PHI instruction) can have uses outside of the recurrence. This is not an ideal solution yet, and more fundamental solution (such as having recurrence optimization as a separate pass and/or using live range analysis for it) should follow. But still I think it is worth to have it here.

Jun 20 2017, 1:44 PM
twoh added a comment to D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.
In D34373#785485, @twoh wrote:
In D34373#784975, @twoh wrote:

I think this is a right approach, but concerned that the experimental results I shared on D32451 show that it is generally better to not to vectorize the low trip count loops. @Ayal, I wonder if you have any results that this patch actually improves the performance. Thanks!

I know that we're currently missing opportunities for large vectorizable loops with low (static) trip counts. Smaller inner loops are also good candidates for unpredicated vectorization, but we may need to be a bit careful because of modeling inaccuracies and phase-ordering effects (e.g. if we don't vectorize a loop, then we'll end up unrolling it when the unroller runs).

Got it. My concern was for small single-level loops with low trip counts, as I observe them pretty frequently. I have no objection accepting this patch and improve the cost estimator separately.

Do you mean that you see such loops frequently with dynamically-small trip counts, or with static trip counts? I assume the small loops with (small) static trip counts will generally be unrolled.

Jun 20 2017, 11:11 AM
twoh added a comment to D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.
In D34373#784975, @twoh wrote:

I think this is a right approach, but concerned that the experimental results I shared on D32451 show that it is generally better to not to vectorize the low trip count loops. @Ayal, I wonder if you have any results that this patch actually improves the performance. Thanks!

I know that we're currently missing opportunities for large vectorizable loops with low (static) trip counts. Smaller inner loops are also good candidates for unpredicated vectorization, but we may need to be a bit careful because of modeling inaccuracies and phase-ordering effects (e.g. if we don't vectorize a loop, then we'll end up unrolling it when the unroller runs).

Jun 20 2017, 9:17 AM

Jun 19 2017

twoh added a comment to D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.

I think this is a right approach, but concerned that the experimental results I shared on D32451 show that it is generally better to not to vectorize the low trip count loops. @Ayal, I wonder if you have any results that this patch actually improves the performance. Thanks!

Jun 19 2017, 10:13 PM
twoh committed rL305729: Improve profile-guided heuristics to use estimated trip count..
Improve profile-guided heuristics to use estimated trip count.
Jun 19 2017, 11:50 AM
twoh closed D32451: Improve profile-guided heuristics to use estimated trip count. by committing rL305729: Improve profile-guided heuristics to use estimated trip count..
Jun 19 2017, 11:49 AM
twoh updated the diff for D32451: Improve profile-guided heuristics to use estimated trip count..

Addressing comments from @tejohnson. Thanks!

Jun 19 2017, 11:35 AM

Jun 18 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

@tejohnson Sorry for the late reply. I was out of internet for days. I have no objection to set the default value to true, but prefer to have it as a separate patch so that we can track the impact on performance of each chance more easily.

Jun 18 2017, 5:20 PM

Jun 12 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

Ping. Thanks!

Jun 12 2017, 1:42 PM

Jun 10 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

@wmi Not at all! Thanks for your comments.

Jun 10 2017, 9:46 PM

Jun 7 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

ping. thanks!

Jun 7 2017, 2:29 PM

Jun 4 2017

twoh added inline comments to D32451: Improve profile-guided heuristics to use estimated trip count..
Jun 4 2017, 10:10 PM

Jun 2 2017

twoh added inline comments to D32451: Improve profile-guided heuristics to use estimated trip count..
Jun 2 2017, 12:54 PM

May 31 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

FYI, below is the compile difference in percentage for spec2006:

May 31 2017, 11:11 AM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

Add missing reference.

May 31 2017, 11:09 AM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

Reimplement the feature in peephole optimizer. This patch makes the values in the recurrence cycle to be tied to each other if possible.

May 31 2017, 10:28 AM
twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

Ping. Thanks!

May 31 2017, 9:50 AM

May 22 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

Comparing the existing implementation and this patch, I don't observe noticeable compile time different with Video-SIMDBench. I compiled the benchmark suite 3 times each and the median was 27.12 sec vs 27.55 sec, while the average was 27.96sec vs 27.89 sec. There were some code size differences, but it is simply more vectorization results bigger code size. There's no difference between OptForSize and non-vectorized.

May 22 2017, 3:33 PM

May 19 2017

twoh updated the diff for D32451: Improve profile-guided heuristics to use estimated trip count..

Addressing Ayal's comments.

May 19 2017, 1:20 PM
twoh added a comment to D31821: Remove redundant copy in recurrences.

@wmi Thank you for your reply. I agree on you that we should consider tied operand group, and as @qcolombet mentioned in the previous comment, if tie operand information is already available before this pass, I'd like to discuss where would be the best place to implement this.

May 19 2017, 1:14 PM

May 17 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

@qcolombet Actually you're right. Having tied operands, this can be done within SSA. Do you have a suggestion for a better place to implement this? I thought what isRevCopyChain does is most close to what this patch does.

May 17 2017, 1:25 PM

May 16 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

@MatzeB Thank you for your comments. I'm collection compile time numbers now with spec2006. For our internal benchmark, which takes about ~40min to compile, the compile time difference was negligible.

May 16 2017, 2:42 PM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

Addressing comments from @MatzeB.

May 16 2017, 2:42 PM

May 13 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

@Ayal Thanks for the pointer.

May 13 2017, 10:38 PM

May 12 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

Ping. Can someone please let me know what would be the best way to have this patch reviewed? Thanks!

May 12 2017, 2:36 PM

May 11 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

I don't have a strong opinion about treating static trip count and profile-based trip count or not. I treated them separately because existing implementation sets OptForSize to true instead of returning false when it makes a profile-based decision. It would be great if someone can explain the reason behind that.

May 11 2017, 10:05 PM

May 7 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

Thanks @Ayal for your comments! If the profile-based trip count checking is moved above the line

May 7 2017, 9:59 PM
twoh updated the diff for D32451: Improve profile-guided heuristics to use estimated trip count..

Addresing comments from Ayal.

May 7 2017, 9:49 PM

May 5 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

Ping.

May 5 2017, 8:46 AM
twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

Ping. Thanks!

May 5 2017, 8:46 AM

Apr 28 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

Ping. If you're too busy to review this patch, could you please recommend someone else? Thanks!

Apr 28 2017, 9:40 AM

Apr 27 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

So I evaluated loop vectorizer with https://github.com/malvanos/Video-SIMDBench, which is introduced in this paper: http://ieeexplore.ieee.org/document/7723550/. I first built it without profile data and ran linux perf with a command of

Apr 27 2017, 2:36 PM
twoh updated the diff for D32451: Improve profile-guided heuristics to use estimated trip count..

Use profile-based trip count estimation only when trip count cannot be computed.

Apr 27 2017, 2:35 PM

Apr 24 2017

twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

@mkuper I can do that, but the issue is that the vectorized part doesn't dominated the execution time of the benchmark (profile is pretty flat across multiple functions), so it may not be suitable to tell the impact of vectorization change. Let me think about a good way to measure the it. Thanks!

Apr 24 2017, 2:17 PM
twoh added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

@mkuper This helps with our internal benchmarks, but I don't have numbers with public benchmarks. What would be the good benchmarks to evaluate the vectorization? Thanks!

Apr 24 2017, 2:00 PM
twoh created D32451: Improve profile-guided heuristics to use estimated trip count..
Apr 24 2017, 1:50 PM

Apr 21 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

@MatzeB I was considering other passes as well for the sake of simplicity, but I concluded two-address instruction is the right place to do. The problem doesn't appear while we're in SSA, but becomes an issue when we make an decision about which operand register should be the destination register. I think this is why isProfitableToCommute function is in two-address instruction pass.

Apr 21 2017, 11:39 AM
twoh added a comment to D31821: Remove redundant copy in recurrences.

Ping. Thanks!

Apr 21 2017, 11:16 AM

Apr 14 2017

twoh added a comment to D31821: Remove redundant copy in recurrences.

Friendly ping. Thanks!

Apr 14 2017, 10:43 AM
twoh added a reviewer for D31821: Remove redundant copy in recurrences: MatzeB.
Apr 14 2017, 10:43 AM

Apr 12 2017

twoh added a comment to D31827: Add a flag that enables more aggressive load PRE.

With numbers, it doesn't seem worth to turn it on by default. @dberlin, just curious, do you have any rough estimation about how much compilation time can be saved by implementing rank-order based iteration optimization?

Apr 12 2017, 11:07 PM
twoh added a comment to D31827: Add a flag that enables more aggressive load PRE.

Below are the numbers with spec2006 (The numbers in the last row are overall for SPEC score and total for compile time and binary size):

Apr 12 2017, 10:52 PM

Apr 11 2017

twoh added a comment to D31827: Add a flag that enables more aggressive load PRE.

@dberlin Thank you for your comments. I agree on you that it is not easy to invent a good heuristics around this knob, but still I find this useful for some cases that actually matters. Assuming that this doesn't make the implementation much complex, wouldn't it be better for compiler users to have more options?

Apr 11 2017, 3:58 PM
twoh added a comment to D31827: Add a flag that enables more aggressive load PRE.

No perf numbers with public benchmarks yet, but with our internal benchmark I confirmed that aggressive load PRE removes redundant instructions in a hot loop. I'll report spec2006 numbers later. This patch is for quick unlocking of the feature before making more fundamental improvement.

Apr 11 2017, 8:55 AM

Apr 7 2017

twoh created D31827: Add a flag that enables more aggressive load PRE.
Apr 7 2017, 1:11 PM
twoh updated the diff for D31821: Remove redundant copy in recurrences.

small changes to the test.

Apr 7 2017, 11:22 AM
twoh updated the summary of D31821: Remove redundant copy in recurrences.
Apr 7 2017, 11:19 AM
twoh created D31821: Remove redundant copy in recurrences.
Apr 7 2017, 11:19 AM

Mar 14 2017

twoh committed rL297808: NFC: Reformats comments according to the coding guildelines..
NFC: Reformats comments according to the coding guildelines.
Mar 14 2017, 11:41 PM
twoh committed rL297805: [BranchFolding] Merge debug locations from common tail instead of removing.
[BranchFolding] Merge debug locations from common tail instead of removing
Mar 14 2017, 10:57 PM
twoh closed D30226: [BranchFolding] Merge debug locations from common tail instead of removing by committing rL297805: [BranchFolding] Merge debug locations from common tail instead of removing.
Mar 14 2017, 10:57 PM

Mar 13 2017

twoh added a comment to D30226: [BranchFolding] Merge debug locations from common tail instead of removing.

@aprantl Could you please give your opinion on the inlined questions? Thanks!

Mar 13 2017, 9:10 AM

Mar 7 2017

twoh committed rL297194: Use filename in linemarker when compiling preprocessed source (Revised).
Use filename in linemarker when compiling preprocessed source (Revised)
Mar 7 2017, 12:32 PM
twoh closed D30663: Use filename in linemarker when compiling preprocessed source (Revised) by committing rL297194: Use filename in linemarker when compiling preprocessed source (Revised).
Mar 7 2017, 12:32 PM

Mar 6 2017

twoh added a comment to D30226: [BranchFolding] Merge debug locations from common tail instead of removing.

@aprantl Thanks for the review! I added inline questions asking for your opinion about consistency vs guideline.

Mar 6 2017, 2:29 PM
twoh updated the diff for D30663: Use filename in linemarker when compiling preprocessed source (Revised).

addressing comments from @inglorion

Mar 6 2017, 12:24 PM
twoh added inline comments to D30663: Use filename in linemarker when compiling preprocessed source (Revised).
Mar 6 2017, 11:53 AM
twoh created D30663: Use filename in linemarker when compiling preprocessed source (Revised).
Mar 6 2017, 10:03 AM

Mar 3 2017

twoh abandoned D30591: Introduce the feature "linux" for tests only for linux.

@inglorion That makes a lot of sense. Maybe we don't even need -g, because -S -emit-llvm shows source_filename. I'll run some more experiments with relpath/abspath and debug locations, and submit a revised patch. Thanks for the comment!

Mar 3 2017, 11:25 PM
twoh added a comment to D30226: [BranchFolding] Merge debug locations from common tail instead of removing.

Friendly Ping. Thanks!

Mar 3 2017, 3:24 PM
twoh created D30591: Introduce the feature "linux" for tests only for linux.
Mar 3 2017, 1:37 PM

Mar 2 2017

twoh committed rL296825: [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y).
[DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y)
Mar 2 2017, 2:10 PM
twoh closed D29813: [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y) by committing rL296825: [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y).
Mar 2 2017, 2:10 PM
twoh updated the diff for D29813: [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y).

Rebase. Fix test/CodeGen/X86/and-sink.ll that added 9days ago.

Mar 2 2017, 2:02 PM

Mar 1 2017

twoh added a comment to D29813: [DAGCombiner] Fix DebugLoc propagation when folding !(x cc y) -> (x !cc y).

Ping. thanks!

Mar 1 2017, 8:30 PM

Feb 27 2017

twoh committed rL296371: [TailDuplicator] Maintain DebugLoc for branch instructions.
[TailDuplicator] Maintain DebugLoc for branch instructions
Feb 27 2017, 11:42 AM
twoh closed D30026: [TailDuplicator] Maintain DebugLoc for branch instructions by committing rL296371: [TailDuplicator] Maintain DebugLoc for branch instructions.
Feb 27 2017, 11:41 AM

Feb 24 2017

twoh added a comment to D30226: [BranchFolding] Merge debug locations from common tail instead of removing.

Yes I observed the case from refresh_potential function in spec2006 429.mcf. The original C code snippet is below:

Feb 24 2017, 10:00 AM

Feb 23 2017

twoh added a comment to D29833: Improve the API of DILocation::getMergedLocation().

@aprantl I like the idea of having a convenience function. I'm afraid I still don't understand @dblaikie on getMergedLocation is better than the proposed convenience function in preventing buggy profile.

Feb 23 2017, 9:13 AM

Feb 22 2017

twoh added a comment to D30026: [TailDuplicator] Maintain DebugLoc for branch instructions.

Friendly Ping. Thanks!

Feb 22 2017, 10:21 PM