Page MenuHomePhabricator

[LNT] Fixed incorrect orders behavior in case of miss formatted llvm_project_revision
Needs ReviewPublic

Authored by kpdev42 on Sep 10 2021, 1:11 AM.

Details

Summary

LNT expected that llvm_project_revision is a version in the format 1.2.3. But it is a string and may contain any symbols. Someone may use hex numbers, dashes, etc. If llvm_project_revision does not contain any digit at all the behavior is undefined and caused inconsistent DB. hash(str) is a workaround to fix this behavior. Note it does not fix the sorting.

Diff Detail

Event Timeline

kpdev42 created this revision.Sep 10 2021, 1:11 AM
kpdev42 requested review of this revision.Sep 10 2021, 1:11 AM

If the goal is to prevent internal error when the version is not made of the expected dotted version, I'd rather LNT error out when importing a report if the version is invalid.

This code wont change previous behavior for versions which contains digits.
In this commit it is intended to prevent a situation when DB became silently inconsistent if llvm_project_revision does not contain any digit.
e.g: "alpha", "beta", "private", "main", "test", etc

This code wont change previous behavior for versions which contains digits.
In this commit it is intended to prevent a situation when DB became silently inconsistent if llvm_project_revision does not contain any digit.
e.g: "alpha", "beta", "private", "main", "test", etc

My point is that llvm_project_revision should only contain digits and dots. We should give an error if that's not the case rather than trying to avoid DB inconsistencies.

One real problem that we are having is that the most common submission format that I know if is the git describe output. We are using that everywhere.

The numeric part is really what we want from that, but folks seem to just shoot the whole string in. Totally breaks the graphs and all sorts of stuff.

I was thinking about special casing that somehow.

<well-know-tag>-123456-HHHHHHHHHH

Today I worked with the LNT instance which is configured to generate reports (format_version=1) with run_order like "76543-toolchain-default-heuristic-change-09-13-2021-14:59-392". The limitation to digits and dots will break too much. It is not an option.

Note the class Order calls convert_revision() from _get_comparison_discriminant(). _get_comparison_discriminant() is used in many places including eq().
So two different orders without digits in llvm_project_revision will be wrongly considered equivalent.
index = orders.index(order) inside _getOrCreateOrder() will return incorrect index too.
This patch fixes these issues.
I think this patch is safe and useful.

Hadn't thought of the git describe case, that would lead to a correct ordering indeed. However if llvm_revision_version is a hash the resulting ordering of runs would be completely useless. That's why I'd rather LNT refusing any revision that's not a version number so that the user has to make an inform decision on how to convert to a version number. For instance for "llvmorg-14-init-4025-g005fc11ebdd6" the user could convert to 14.4025 while for "abc" user could decide to convert to 1.2.3 (letter replaced by its position in alphabet, or by its ascii code). @cmatthews feel free to override me, but I think this will create silent problems, even though it might make some usecase easier (git describe case you mentioned)