This is an archive of the discontinued LLVM Phabricator instance.

[ProfileData] Sort FuncData before iteration to remove non-determinism
ClosedPublic

Authored by mgrang on Feb 8 2019, 4:31 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

mgrang created this revision.Feb 8 2019, 4:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2019, 4:31 PM
mgrang added a comment.Feb 8 2019, 4:36 PM

tools/llvm-profdata/instr-remap.test is failing in the reverse iteration bot. See http://lab.llvm.org:8011/builders/reverse-iteration/builds/10546/steps/check_all/logs/stdio.

This is because FunctionData is iterated in InstrProfWriter to output function counts, etc. But for two functions with the same name the iteration order is not defined if we use a DenseMap. Changing this to MapVector resolves this.

Note: We can also sort FunctionData before iteration to fix this issue. But that would mean we would need to sort every time we iterate in order to print. Using a MapVector instead is a better option IMO.

Ping for reviews please.

I think this could roughly double the memory utilization of the writer, which might be problematic because the number of records to write can be high. (@dblaikie did some work on reducing memory usage in this area, he might have hard data about this.)

As the write should only occur once, maybe the better tradeoff would be to sort?

In D57986#1398271, @vsk wrote:

I think this could roughly double the memory utilization of the writer, which might be problematic because the number of records to write can be high. (@dblaikie did some work on reducing memory usage in this area, he might have hard data about this.)

As the write should only occur once, maybe the better tradeoff would be to sort?

Yeah, unless I'm missing something (quite possibly) this seems like the sort of place where "do a bunch of processing, then sort, then iterate" is a valid strategy.

@mgrang - does it look like any use case is emitting more than once & modifying in between? I would find that pretty surprising. I think it's just "build build build, stop building, emit" and adding a sort-before-emit would be OK there.

mgrang updated this revision to Diff 188824.Feb 28 2019, 5:38 PM
mgrang retitled this revision from [ProfileData] Remove non-determinism: Change DenseMap to MapVector to [ProfileData] Sort FuncData before iteration to remove non-determinism.
mgrang added a reviewer: dblaikie.
mgrang updated this revision to Diff 188843.Feb 28 2019, 8:38 PM

I think this patch is right in also sorting the function names: AFAICS StringMap doesn't provide that guarantee.

lib/ProfileData/InstrProfWriter.cpp
393–403 ↗(On Diff #188843)

Please run clang-format, the lambda should only be indented by two additional spaces. Also can you put the explicit types for nameA et al.? That should be StringRef and uint64_t for the hashes.

dblaikie added inline comments.Mar 1 2019, 11:13 AM
lib/ProfileData/InstrProfWriter.cpp
389–397 ↗(On Diff #188824)

I guess you could write this as:

return std::tie(A.first, A.second.first) < std::tie(B.first, B.second.first);

perhaps?

mgrang updated this revision to Diff 188951.Mar 1 2019, 11:35 AM
mgrang marked 2 inline comments as done.Mar 1 2019, 12:04 PM
dblaikie accepted this revision.Mar 1 2019, 4:18 PM
dblaikie added inline comments.
lib/ProfileData/InstrProfWriter.cpp
431–432 ↗(On Diff #188951)

Inconsistent variable naming? (some start with upper case, some with lower case)

Currently I think the LLVM style says upper case is required.

Naming the types here might be reasonable for readability, but I'm not sure.

This revision is now accepted and ready to land.Mar 1 2019, 4:18 PM
mgrang updated this revision to Diff 189002.Mar 1 2019, 4:28 PM

Addressed comments.

mgrang marked an inline comment as done.Mar 1 2019, 4:29 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 1 2019, 4:47 PM