This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Enable the debug entry values feature by default
ClosedPublic

Authored by djtodoro on Jan 28 2020, 1:37 AM.

Details

Summary

This patch enables the debug entry values feature.
We have built the LLVM project, and a few more complex apps, with the patch applied.

Done:

  • Remove the experimental -femit-debug-entry-values option
  • Enable it for x86, arm and aarch64
  • Resolve the test failures
  • Leave the llc experimental option for targets that do not support the CallSiteInfo yet

TODO:

  • Add some notes into the ReleaseNotes

Diff Detail

Event Timeline

djtodoro created this revision.Jan 28 2020, 1:37 AM
djtodoro edited the summary of this revision. (Show Details)Jan 28 2020, 1:41 AM

The Umbrella for the issues related to the call_site/entry_vals features: https://bugs.llvm.org/show_bug.cgi?id=44116.

dstenb added inline comments.Jan 28 2020, 5:40 AM
llvm/lib/CodeGen/MachineLICM.cpp
1370–1371

Would it be possible to move these eraseCallSiteInfo() / moveCallSiteInfo() fixes to parent patch(es)? I guess that would mean that -debug-entry-values would have to be added to a number of tests where you saw those issues, just to be removed by this patch again, but I think it may overall make it easier to review the changes.

djtodoro marked an inline comment as done.Jan 28 2020, 5:53 AM
djtodoro added inline comments.
llvm/lib/CodeGen/MachineLICM.cpp
1370–1371

That makes sense to me.

I did not want to do such thing because we will need to remove the experimental flag, but I see that is the only option to see the reason for the *CallSiteInfo() changes.

dstenb added inline comments.Jan 28 2020, 6:06 AM
llvm/test/DebugInfo/MIR/Hexagon/dbgcall-site-instr-before-bundled-call.mir
1

Is there any clean way we could force call site information, e.g. via a llc flag, for these now-unsupported tests? I guess there might be quite a bit of work of getting call site information up and running on these targets, which AFAIK no one is looking into at the moment, so I wonder if these tests risk laying unsupported here for a while?

(Since I added these Hexagon/Sparc/SystemZ tests, I just want to mention that I wrote them as upstream reproducers for issues I saw with our downstream DSP target. I currently do not have any plans on working on enabling call site information for any of those upstream targets.)

asbirlea removed a subscriber: asbirlea.Jan 28 2020, 9:57 AM

Do you have size measurements for the impact of this change (on, say, a self-host optimized build of clang)?

& I think it's probably best to just change the default value of the flag - rather than changing the default and removing the flag. Giving folks some time with an escape hatch might be important.

Is this patch intended for commit - or as a preview/working area? It seems it's got a bunch of things that could/should be done in independent commits (like the "Fix the assertions regarding updating the CallSiteInfo from the MF" parts, for instance - perhaps that was what was raised by another reviewer earlier).

llvm/include/llvm/Target/TargetMachine.h
240–242

Why is this a target machine feature? What aspect of debug entry values is target specific? Is there something about their architectures that makes it unrepresentable? (that seems drastic given the generality of the feature)

I'm just a bit hesitant to introduce target divergence if it can reasonably be avoided - it'll make debug info comparisons, cross platform work, etc, more difficult.

djtodoro marked 2 inline comments as done.Jan 29 2020, 2:36 AM

Do you have size measurements for the impact of this change (on, say, a self-host optimized build of clang)?

I will share these.

& I think it's probably best to just change the default value of the flag - rather than changing the default and removing the flag. Giving folks some time with an escape hatch might be important.

I agree, thanks for the point!

Is this patch intended for commit - or as a preview/working area? It seems it's got a bunch of things that could/should be done in independent commits (like the "Fix the assertions regarding updating the CallSiteInfo from the MF" parts, for instance - perhaps that was what was raised by another reviewer earlier).

That's why I've marked it as a 'WIP'. I am working on splitting the changes into reasonable parts.

llvm/include/llvm/Target/TargetMachine.h
240–242

Since the debug entry values feature relies on the CallSiteInfo (that holds info about arguments and their transferring registers) which is target dependent, the whole feature is target dependent. The CallSiteInfo is being created during the ISel phase, when it performs the call lowering, which is also target dependent. Perhaps renaming the SupportsDebugEntryValues into something like SupportsCallSiteInfo would make sense here?

The idea of introducing the CallSiteInfo was that it could be used by some other tools or parts of the LLVM project. There was a discussion about that on the llvm-dev.

I'm just a bit hesitant to introduce target divergence if it can reasonably be avoided - it'll make debug info comparisons, cross platform work, etc, more difficult.

Maybe an option for disabling the feature will be useful too for that purpose?

llvm/test/DebugInfo/MIR/Hexagon/dbgcall-site-instr-before-bundled-call.mir
1

Is there any clean way we could force call site information, e.g. via a llc flag, for these now-unsupported tests?

I see.. I do not now for any other way. Maybe we should keep the option, and change the default value only.

I guess there might be quite a bit of work of getting call site information up and running on these targets, which AFAIK no one is looking into at the moment, so I wonder if these tests risk laying unsupported here for a while?

I think so. But we should keep the implementation up-to-date even for the targets that missing support for the CallSiteInfo production.

vsk added a comment.Jan 29 2020, 4:47 PM

I ran CTMark using a stage1 Release clang as my compiler, with and without -Xclang -femit-debug-entry-values set using the 'Os-g.cmake' cache file. The compile time difference on my Mac was in the noise (however the machine was not stabilized for perf measurements):

ConfigCTMark Geomean (seconds)MeanStddevRange# Projects
-Os -g -j114.082816.04678.63911[6.8003, 35.4556]10
-Os -g -Xclang -femit-debug-entry-values -j114.074115.93798.3288[6.8240, 34.5410]10

In the entryvals CTMark run, the total size of DWARF sections in the .o's grew from 52166507 bytes (baseline) to 53249808 bytes [1], a 2% size increase. I'll try to grab numbers for a stage2 build tomorrow.



[1] find . -name \*.o -exec size -m {} \; | grep "Section (__DWARF" | sumcol 4

djtodoro updated this revision to Diff 241429.Jan 30 2020, 6:02 AM
djtodoro retitled this revision from [WIP][DebugInfo] Enable the debug entry values feature by default to [DebugInfo] Enable the debug entry values feature by default.
djtodoro edited the summary of this revision. (Show Details)
  • Make the 'fix the asserts' as a separate patch (D73700)
  • Leave the llc experimental option for targets that do not support the CallSiteInfo yet

@vsk Thanks a lot for sharing this!

(however the machine was not stabilized for perf measurements)

What can be the reason of that?

I have also measure the impact on the size, build time and debug location coverage, on the LLVM built binaries with/out the feature. I see that there is no build time difference, the size of the binaries was increased a bit, but it is also minor.

  1. llc binary
    • with no entry values: .debug_loc 241Mi (21.3% of the whole binary)
    • with entry values: .debug_loc 245Mi (21.5% of the whole binary)
  2. lldb binary
    • with no entry values: .debug_loc 4.17Mi (34.1% of the whole binary)
    • with entry values: .debug_loc 4.23Mi (34.3% of the whole binary)

(Please find additional numbers within the file shared.)

The debug location coverage increased (got with the llvm-locstats --only-formal-parameters --compare on the lldb binaries built with/out the entry values):

Please find the additional numbers regarding the debug location improvements within the measurements-debug-entry-vals.txt.

vsk added a comment.EditedJan 30 2020, 12:46 PM

@vsk Thanks a lot for sharing this!

(however the machine was not stabilized for perf measurements)

What can be the reason of that?

Stabilization entails removing variables from the system that can affect performance. This usually entails killing extraneous background processes and disabling cpu frequency scaling. This can make the system hard to use (e.g. our stabilization script kills the bluetooth daemon, which I need to type).

I have some numbers for stage2 clang builds on macOS:

ConfigdSYM Size (bytes)Aggregate DWARF size in .o's (bytes) [1]# of call site DIEs# of call site parameters
Stage2 RelWithDebInfo168267755480581789274175170
Stage2 RelWithDebInfo + entryvals16915869668069013985417517584620

EDIT: I miscounted earlier. I looked at the number of 'call site entries' instead of the number of 'call site DIEs'. I've updated the table with the correct numbers, which look fine to me.

[1] Just summing the size of all 'DWARF' sections in all .o's in the build directory, as before.

In D73534#1850372, @vsk wrote:

@vsk Thanks a lot for sharing this!

(however the machine was not stabilized for perf measurements)

What can be the reason of that?

Stabilization entails removing variables from the system that can affect performance. This usually entails killing extraneous background processes and disabling cpu frequency scaling. This can make the system hard to use (e.g. our stabilization script kills the bluetooth daemon, which I need to type).

OK, thanks for the info. I see, you said your machine was not stable. I thought the debug_infomade the performance to looking bad. Sorry :)

I have some numbers for stage2 clang builds on macOS:

ConfigdSYM Size (bytes)Aggregate DWARF size in .o's (bytes) [1]# of call site DIEs# of call site parameters
Stage2 RelWithDebInfo168267755480581789274175170
Stage2 RelWithDebInfo + entryvals16915869668069013985417517584620

EDIT: I miscounted earlier. I looked at the number of 'call site entries' instead of the number of 'call site DIEs'. I've updated the table with the correct numbers, which look fine to me.

[1] Just summing the size of all 'DWARF' sections in all .o's in the build directory, as before.

Thanks for sharing this! The numbers look very good to me!

vsk added inline comments.Feb 6 2020, 12:53 PM
llvm/include/llvm/Target/TargetMachine.h
240–242

I think this flag logically makes sense as part of TargetMachine, but the name seems strange. I think all targets support "call site info" in the sense of DW_TAG_call_site. I think this should be renamed to SupportsDebugEntryValues.

llvm/lib/CodeGen/MachineFunction.cpp
871

Could you just add a shouldEmitDebugEntryValues predicate to TargetOptions that checks both flags, and use that everywhere? Also, please use || for logical ops.

djtodoro marked 2 inline comments as done.Feb 7 2020, 2:26 AM
djtodoro added inline comments.
llvm/include/llvm/Target/TargetMachine.h
240–242

It makes sense to me.

djtodoro updated this revision to Diff 243109.Feb 7 2020, 2:27 AM

-Rename the flag
-Make the target-option predicate

vsk accepted this revision.Feb 7 2020, 11:15 AM

LGTM with an updated comment. I'll start a thread on llvm-dev about getting this patch landed soon, as everyone interested in this work may not have been following the patches. Perhaps it'd be best to wait to land this until folks have had 1-2 work days to respond to the thread.

llvm/include/llvm/Target/TargetOptions.h
260

Better description might be: When set to true, the EnableDebugEntryValues option forces production of debug entry values even if the target does not officially support it. Useful for testing purposes only. This flag should never be checked directly, always use \ref ShouldEmitDebugEntryValues instead.

This revision is now accepted and ready to land.Feb 7 2020, 11:15 AM
vsk requested changes to this revision.Feb 7 2020, 12:47 PM

Hm, does this turn on entry values at -O0? Could you please add a clang test that ensures that at -O0 + -gdwarf5 + lldb/gdb tuning, entry values are not emitted?

This revision now requires changes to proceed.Feb 7 2020, 12:47 PM
probinson requested changes to this revision.Feb 7 2020, 5:56 PM
probinson added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
840

This implementation makes it impractical to enable entry-values while still using SCE tuning. The Sony debugger people will want to be able to do that.

And in fact this is not how tuning controls are supposed to work. There should be a flag in DwarfDebug, initialized in the ctor, which has a *default* based on the tuning; but the flag can still be overridden regardless of tuning, by a command-line option just as many other tuning-related options are handled in the DwarfDebug ctor.

Based on comments by @vsk on the llvm-dev thread, I think you are trying to track all of the following conditions: (1) the target supports *tracking* entry values; (2) someone wants to *track* entry values even if the target doesn't support it by default; (3) whether to *emit* entry values in the DWARF. Currently (3) is not user-selectable because it is hard-coded based on tuning; this is a bug.

I would also rather not have two separate command-line options for (2) and (3), but if that's what we need, so be it.

djtodoro updated this revision to Diff 243525.Feb 10 2020, 6:04 AM
djtodoro marked an inline comment as done.Feb 10 2020, 6:04 AM

-Add a test case for -O0 ensuring we do not emit the entry values
-Add an option within DwarfDebug that enables emitting the entry values
-Add a test case for the SCE tuning + emit-debug-entry-values

djtodoro marked 2 inline comments as done.Feb 10 2020, 6:12 AM
In D73534#1864855, @vsk wrote:

Hm, does this turn on entry values at -O0? Could you please add a clang test that ensures that at -O0 + -gdwarf5 + lldb/gdb tuning, entry values are not emitted?

We do not generate the debug entry values for the -O0. The test is added.

llvm/include/llvm/Target/TargetOptions.h
260

Very good, thanks! :)

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
840

@probinson Thanks for the comment! I agree.

I would also rather not have two separate command-line options for (2) and (3), but if that's what we need, so be it.

I think (2) and (3) are different, so I've introduced the -emit-debug-entry-values within DwarfDebug.

vsk added a comment.Feb 10 2020, 4:02 PM

lgtm with a nit, but please wait for @probinson to confirm that the debugger tuning issue is sorted out.

llvm/test/DebugInfo/X86/loclists-dwp.ll
23

nice :)

llvm/test/DebugInfo/X86/no-entry-values-with-O0.ll
6 ↗(On Diff #243525)

Please add a note here explaining that entry values /would/ get emitted if the 'AllCallsDescribed' DIFlag was set on fn1's subprogram.

probinson accepted this revision.Feb 10 2020, 4:30 PM

Functionally looks fine, see where I think the commentary should be adjusted.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
423

Emit call-site-param debug info for GDB and LLDB, if the target supports it. Can also be enabled explicitly.

842

Optionally emit call-site-param debug info.
(This is no longer the place where the gdb/lldb decision is made.)

djtodoro updated this revision to Diff 243759.Feb 11 2020, 1:27 AM

-Addressing comments

How does it look now? :)

vsk accepted this revision.Feb 11 2020, 12:53 PM

Looks great. Thanks!

This revision is now accepted and ready to land.Feb 11 2020, 12:53 PM
This revision was automatically updated to reflect the committed changes.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptFeb 12 2020, 1:29 AM
dstenb added inline comments.Feb 12 2020, 4:00 AM
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
866

I'm sorry for commenting on this so late, but when now adapting this patch for our downstream target, for which we will not enable call site info by default yet, I noticed that the call site information wasn't added. I think that this should be ShouldEmitDebugEntryValues().

djtodoro marked an inline comment as done.Feb 12 2020, 4:11 AM
djtodoro added inline comments.
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
866

I thought to restrict the production of the call site info here and only to allow a hand-made testing, but I think you are right, we should use the ShouldEmitDebugEntryValues(). I'll update the patch, thanks!

dstenb added inline comments.Feb 14 2020, 7:52 AM
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
866

Thanks! It is perhaps a bit of a corner case, and we're moving towards enabling call site info by default, but I guess this might be useful when experimenting with adding call site info for other targets in the future.

djtodoro updated this revision to Diff 245123.Feb 18 2020, 4:34 AM

-Addressing the latest comments

It looks like this broke the windows lldb bot:

lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/13902

Thanks for reporting that, are you sure this was the cause of the failure? I'll revert this while investigate, but it does not seem to me this is related to this patch.

I am not sure this was the cause of the failure, but other than your patches, there was only one other change in the first failing build. I have not had time to fully investigate yet.

I am not sure this was the cause of the failure, but other than your patches, there was only one other change in the first failing build. I have not had time to fully investigate yet.

It looks like the issue was fixed and it was not this change: http://lab.llvm.org:8011/builders/lldb-x64-windows-ninja/builds/13911

Thanks for looking into it!

I’ve already reverted the patch, but I’ll reland it again tomorrow. Thanks.

As a heads up, Linaro's ToolChain Working Group's Linux kernel CI lit up on this change. I see it's been reverted (thank you). Please cc me on the updated patch and I can help test it against the Linux kernel.

Edit: Link: https://ci.linaro.org/job/tcwg_kernel-bisect-llvm-master-arm-mainline-allyesconfig/25/artifact/artifacts/build-a82d3e8a6e67473c94a5ce6345372748e9b61718/03-build_linux/console.log

As a heads up, Linaro's ToolChain Working Group's Linux kernel CI lit up on this change. I see it's been reverted (thank you). Please cc me on the updated patch and I can help test it against the Linux kernel.

Edit: Link: https://ci.linaro.org/job/tcwg_kernel-bisect-llvm-master-arm-mainline-allyesconfig/25/artifact/artifacts/build-a82d3e8a6e67473c94a5ce6345372748e9b61718/03-build_linux/console.log

Its hard to tell from the backtrace, but looking at the code, I think this might be a bug that sneaked in when I did D70431. Sorry if that is the case!

ARMBaseInstrInfo::isAddImmediate() does a getReg() without any isReg() guard:

Optional<RegImmPair> ARMBaseInstrInfo::isAddImmediate(const MachineInstr &MI,                                                                                                                                                                                                                                                  
                                                      Register Reg) const {                                                                                                                                                                                                                                                    
[...]
  // TODO: Handle cases where Reg is a super- or sub-register of the                                                                                                                                                                                                                                                           
  // destination register.
  if (Reg != MI.getOperand(0).getReg())
    return None;

As a heads up, Linaro's ToolChain Working Group's Linux kernel CI lit up on this change. I see it's been reverted (thank you). Please cc me on the updated patch and I can help test it against the Linux kernel.

Edit: Link: https://ci.linaro.org/job/tcwg_kernel-bisect-llvm-master-arm-mainline-allyesconfig/25/artifact/artifacts/build-a82d3e8a6e67473c94a5ce6345372748e9b61718/03-build_linux/console.log

Its hard to tell from the backtrace, but looking at the code, I think this might be a bug that sneaked in when I did D70431. Sorry if that is the case!

ARMBaseInstrInfo::isAddImmediate() does a getReg() without any isReg() guard:

Optional<RegImmPair> ARMBaseInstrInfo::isAddImmediate(const MachineInstr &MI,                                                                                                                                                                                                                                                  
                                                      Register Reg) const {                                                                                                                                                                                                                                                    
[...]
  // TODO: Handle cases where Reg is a super- or sub-register of the                                                                                                                                                                                                                                                           
  // destination register.
  if (Reg != MI.getOperand(0).getReg())
    return None;

@dstenb No problem, we have already addressed the issue. Thanks to @vsk and @nickdesaulniers! I'll update the patch.

djtodoro updated this revision to Diff 245345.Feb 19 2020, 1:05 AM
  • Address the issue with ARM describeLoadedValue() (thanks to @vsk, I've reduced the test llvm/test/DebugInfo/MIR/ARM/dbgcallsite-noreg-is-imm-check.mir)

@dstenb No problem, we have already addressed the issue. Thanks to @vsk and @nickdesaulniers! I'll update the patch.

Great, thanks!

  • Address the issue with ARM describeLoadedValue() (thanks to @vsk, I've reduced the test llvm/test/DebugInfo/MIR/ARM/dbgcallsite-noreg-is-imm-check.mir)

I'd like to help test this, but arc patch D73534 is now failing with merge conflicts. Can you please rebase this on master?

  • Address the issue with ARM describeLoadedValue() (thanks to @vsk, I've reduced the test llvm/test/DebugInfo/MIR/ARM/dbgcallsite-noreg-is-imm-check.mir)

I'd like to help test this, but arc patch D73534 is now failing with merge conflicts. Can you please rebase this on master?

I’ve already pushed this. Please rebase on the latest commits.

I’ve already pushed this. Please rebase on the latest commits.

Ah faff707db82d7db12fcd9f7826b8741261230e63. Cool, I can no longer reproduce the previous issue. Thanks for the quick fix.

There's a failure on an ARM 2-stage buildbot which looks related to this: http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-full-sh/builds/3920/steps/build%20stage%202/logs/stdio

I can reproduce the crash on the buildbot machine, so hopefully I'll have a standalone reproducer for you soon.

@ostannard Thanks for reporting that! Please share the case.

Reverted again with rG2f215cf36adc. The investigation is needed.

Reproducer for that crash: P8198 P8199

Nice! Thanks!

mstorsjo added a subscriber: mstorsjo.EditedMar 10 2020, 9:59 AM

This broke compiling for mingw, repro.c:

a(short);
b() { a(1); }

clang -target x86_64-w64-mingw32 -c repro.c -g -O2, which gives Assertion '!MI.isMoveImmediate() && "Unexpected MoveImm instruction"' failed.

This broke compiling for mingw, repro.c:

a(short);
b() { a(1); }

clang -target x86_64-w64-mingw32 -c repro.c -g -O2, which gives Assertion '!MI.isMoveImmediate() && "Unexpected MoveImm instruction"' failed.

Thanks for reporting this! Since this is the case of the X86::MOV16ri the D75326 will solve this issue.

This broke compiling for mingw, repro.c:

a(short);
b() { a(1); }

clang -target x86_64-w64-mingw32 -c repro.c -g -O2, which gives Assertion '!MI.isMoveImmediate() && "Unexpected MoveImm instruction"' failed.

Thanks for reporting this! Since this is the case of the X86::MOV16ri the D75326 will solve this issue.

The alternative is D75974.

This broke compiling for mingw, repro.c:

a(short);
b() { a(1); }

clang -target x86_64-w64-mingw32 -c repro.c -g -O2, which gives Assertion '!MI.isMoveImmediate() && "Unexpected MoveImm instruction"' failed.

Thanks for reporting this! Since this is the case of the X86::MOV16ri the D75326 will solve this issue.

The alternative is D75974.

The D75974 is commited.

Thanks for reporting this! Since this is the case of the X86::MOV16ri the D75326 will solve this issue.

The alternative is D75974.

The D75974 is commited.

Thanks!

Hi,

I see another crash with this change when building gdb.

Reduced test case:
struct type *a(type *, type *, long, long);
enum b {};
static int empty_array(type *, int c) { type *d = a(__null, d, c, c - 1); }
long e;
b f() { empty_array(0, e); }

Repros with:
clang -cc1 -triple x86_64-linux-gnu -emit-obj -disable-free -mrelocation-model pic -pic-level 2 -pic-is-pie -mthread-model posix -mframe-pointer=all -mconstructor-aliases -munwind-tables -dwarf-column-info -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -O2 -x c++

Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1060788

Hi,

I see another crash with this change when building gdb.

Reduced test case:
struct type *a(type *, type *, long, long);
enum b {};
static int empty_array(type *, int c) { type *d = a(__null, d, c, c - 1); }
long e;
b f() { empty_array(0, e); }

Repros with:
clang -cc1 -triple x86_64-linux-gnu -emit-obj -disable-free -mrelocation-model pic -pic-level 2 -pic-is-pie -mthread-model posix -mframe-pointer=all -mconstructor-aliases -munwind-tables -dwarf-column-info -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -O2 -x c++

Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1060788

Oh, that assertion is related to D75036 which I did. I can have a look at that.

Hi,

I see another crash with this change when building gdb.

Reduced test case:
struct type *a(type *, type *, long, long);
enum b {};
static int empty_array(type *, int c) { type *d = a(__null, d, c, c - 1); }
long e;
b f() { empty_array(0, e); }

Repros with:
clang -cc1 -triple x86_64-linux-gnu -emit-obj -disable-free -mrelocation-model pic -pic-level 2 -pic-is-pie -mthread-model posix -mframe-pointer=all -mconstructor-aliases -munwind-tables -dwarf-column-info -debug-info-kind=limited -dwarf-version=4 -debugger-tuning=gdb -O2 -x c++

Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1060788

Oh, that assertion is related to D75036 which I did. I can have a look at that.

I wrote a PR for that: https://bugs.llvm.org/show_bug.cgi?id=45181.

I'll see if I'm able to put together something for that today.

thakis added a subscriber: thakis.Mar 13 2020, 12:37 PM

This makes clang crash, repro at https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c4 I'll revert for now.

I'm also seeing timeouts and use-after-frees on other bots; maybe they're related (or maybe not).

Reduced repro at https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c7

Scrolling up, it looks like this was reverted yesterday already. I'm confused why this wasn't reverted yesterday; all work I did in bisecting and creducing was a duplicate of what happened yesterday :/

Why reverting this one? Is it a different assertion?

The D75036 was the problem with the previous issue reported.

Since we landed the fix for the issue related to the D75036, I'll reland this again.

Re-enabled with the d9b962100942.

If we face a failure again, since this enables the whole feature, I recommend reverting smaller pieces of the feature that was causing the problem, rather than reverting this patch.

Still causing a crash using a previously supplied test https://bugs.chromium.org/p/chromium/issues/detail?id=1061533#c7. Any reason this was not tested with a previous repro?

Oh sorry, I thought it all has been fixed, since all the tests pass.

We should revert the D75036 until we fix all the issues.

Oh sorry, I thought it all has been fixed, since all the tests pass.

We should revert the D75036 until we fix all the issues.

@dstenb do you agree?

vsk added a comment.Mar 20 2020, 10:55 AM

Oh sorry, I thought it all has been fixed, since all the tests pass.

We should revert the D75036 until we fix all the issues.

@dstenb do you agree?

This was an oversight on my part, the fix from D76164 was not complete. I need a few more minutes to write a test, but I should be able to fix this very soon. If we can fix-forward, that would be easier.

In D73534#1934105, @vsk wrote:

Oh sorry, I thought it all has been fixed, since all the tests pass.

We should revert the D75036 until we fix all the issues.

@dstenb do you agree?

This was an oversight on my part, the fix from D76164 was not complete. I need a few more minutes to write a test, but I should be able to fix this very soon. If we can fix-forward, that would be easier.

@vsk That would be great. Thanks a lot!

vsk added a comment.Mar 20 2020, 11:19 AM

@manojgupta I apologize for not catching this earlier. The issue should really be fixed by 636665331bbd4c369a9f33c4d35fb9a863c94646.

Thanks for the quick fix, verified that the crash is fixed on trunk.