This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Remove unnecessary assignment in exclusive_scan
ClosedPublic

Authored by ldionne on Sep 6 2019, 6:20 AM.

Diff Detail

Event Timeline

ldionne created this revision.Sep 6 2019, 6:20 AM

That's worse; it replaced 2n assignments with n assignments, n constructions, and n destructions. Interestingly, assigning over __saved is an optimization I should put into MSVC.

My Twitter comment was about the last init = _VSTD::move(tmp) when first == last;

for (;;)
{
    _Tp __tmp = __b(__init, *__first);
    *__result = __init;

    ++__result;
    if (++__first == __last) break;
    __init = _VSTD::move(__tmp);
}

saves the last assignment.

This comment was removed by BillyONeal.
zoecarver added inline comments.
libcxx/include/numeric
343–348

The compiler can optimize a for loop much better than a do-while loop.

I have not run either through tests yet...

I should have done that before posting, tests fail with the 2 iterator version because it doesn't handle the self-aliasing case. This version combines the good parts of both our implementations and passes tests.

if (_UFirst != _ULast) {
    _Ty _Tmp(_Reduce_op(_Val, *_UFirst)); // temp to enable _First == _Dest, also requirement missing
    for (;;) {
        *_UDest = _Val;
        ++_UDest;
        ++_UFirst;
        if (_UFirst == _ULast) {
            break;
        }

        _Val = _STD move(_Tmp); // Requirement missing from N4713
        _Tmp = _Reduce_op(_Val, *_UFirst);
    }
}
ldionne updated this revision to Diff 307387.Nov 24 2020, 9:33 AM

Update with Billy's suggestion

Herald added a reviewer: Restricted Project. · View Herald TranscriptNov 24 2020, 9:33 AM
ldionne added inline comments.Nov 24 2020, 9:34 AM
libcxx/include/numeric
341

@BillyONeal This is equivalent to *_UDest = _Val; in your version, and I think you could instead use *_UDest = STD move(_Val);.

curdeius added inline comments.
libcxx/include/numeric
338–339

Wouldn't you want to move this line into the loop, and so remove the assignment to __tmp at the end of the loop?
It will remove code duplication and should be the same performance-wise.

ldionne added inline comments.Nov 25 2020, 12:16 PM
libcxx/include/numeric
338–339

As Billy mentioned in an earlier comment, this would trade 1 construction, N assignments and 1 destruction for N constructions, and N destructions.

If we think about a string, for example, I believe this version with N assignments would be faster. WDYT?

ldionne accepted this revision as: Restricted Project.Dec 1 2020, 9:49 AM
This revision was not accepted when it landed; it landed in state Needs Review.Dec 1 2020, 9:50 AM
This revision was automatically updated to reflect the committed changes.