This patch adds an initial set of micro benchmarks for the matrix types
extension.
Details
- Reviewers
anemet paquette LuoYuanke SjoerdMeijer
Diff Detail
- Repository
- rOLDT svn-test-suite
Event Timeline
MicroBenchmarks/MatrixTypes/main.cpp

| Line | Comment |
|---|---|
| 147 | Why 15 and 19? |
MicroBenchmarks/MatrixTypes/main.cpp

| Line | Comment |
|---|---|
| 147 | No particular reason. It could be 17 and 13, or a similar combination around the 16-element range. The intention is to also cover cases where the number of elements isn't a power of 2, as well as more unusual combinations. |
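
To illustrate the point about covering dimensions that are not powers of 2, here is a minimal sketch in plain C++. It does not use the Clang matrix types extension that the patch benchmarks; the `Matrix` struct, `multiply`, and `onesProductMatches` are hypothetical stand-ins chosen so the example compiles with any C++14 compiler. The same template is instantiated for regular sizes (16 x 16) and odd ones (15 x 19).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a fixed-size matrix (column-major storage).
// The benchmark in the patch uses the Clang matrix types extension instead.
template <typename T, std::size_t R, std::size_t C>
struct Matrix {
  std::array<T, R * C> data{};  // value-initialized, so all elements are zero

  T &at(std::size_t r, std::size_t c) {
    assert(r < R && c < C);
    return data[c * R + r];
  }
  const T &at(std::size_t r, std::size_t c) const {
    assert(r < R && c < C);
    return data[c * R + r];
  }
};

// Naive (R x K) * (K x C) multiply; dimensions are compile-time constants,
// so each instantiation exercises a different shape.
template <typename T, std::size_t R, std::size_t K, std::size_t C>
Matrix<T, R, C> multiply(const Matrix<T, R, K> &a, const Matrix<T, K, C> &b) {
  Matrix<T, R, C> out;
  for (std::size_t c = 0; c < C; ++c)
    for (std::size_t k = 0; k < K; ++k)
      for (std::size_t r = 0; r < R; ++r)
        out.at(r, c) += a.at(r, k) * b.at(k, c);
  return out;
}

// Hypothetical helper: multiplies two all-ones matrices and checks that
// every element of the product equals the inner dimension K.
template <std::size_t R, std::size_t K, std::size_t C>
bool onesProductMatches() {
  Matrix<double, R, K> a;
  Matrix<double, K, C> b;
  a.data.fill(1.0);
  b.data.fill(1.0);
  const Matrix<double, R, C> out = multiply(a, b);
  for (double v : out.data)
    if (v != static_cast<double>(K))
      return false;
  return true;
}
```

Calling `onesProductMatches<15, 19, 15>()` next to `onesProductMatches<16, 16, 16>()` exercises both an odd shape and a power-of-2 shape through the same code path.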
Looks decent as an initial commit to me. Two high level questions:
- I haven't looked at these MicroBenchmarks in the test-suite yet, but in general it would be convenient if a benchmark also does a correctness check. Do you think there would be any value in doing that here? If so, would that be easy to add?
- In benchmarking, stable numbers are convenient. Since the input is randomly generated, I was wondering if there could be timing differences depending on different inputs? But I guess not here?
Agreed, that would indeed be convenient. Let me change that.
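A correctness check of the kind discussed here could look roughly like the sketch below: compute the product with a simple scalar reference and compare it element-wise against the benchmarked result. This is not the code from the patch; `referenceMultiply`, `allClose`, the column-major layout, and the tolerance are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical scalar reference: out (rows x cols) = a (rows x inner) *
// b (inner x cols), with all matrices stored column-major.
std::vector<float> referenceMultiply(const std::vector<float> &a,
                                     const std::vector<float> &b,
                                     std::size_t rows, std::size_t inner,
                                     std::size_t cols) {
  assert(a.size() == rows * inner && b.size() == inner * cols);
  std::vector<float> out(rows * cols, 0.0f);
  for (std::size_t c = 0; c < cols; ++c)
    for (std::size_t k = 0; k < inner; ++k)
      for (std::size_t r = 0; r < rows; ++r)
        out[c * rows + r] += a[k * rows + r] * b[c * inner + k];
  return out;
}

// Hypothetical comparison: true when both vectors have the same size and
// every pair of elements agrees within a tolerance scaled by magnitude.
// The negated comparison also rejects NaN values.
bool allClose(const std::vector<float> &x, const std::vector<float> &y,
              float relTol = 1e-4f) {
  if (x.size() != y.size())
    return false;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const float scale =
        std::max(1.0f, std::max(std::fabs(x[i]), std::fabs(y[i])));
    if (!(std::fabs(x[i] - y[i]) <= relTol * scale))
      return false;
  }
  return true;
}
```

Running the check once outside the timed loop keeps it from affecting the measured numbers. A tolerance-based comparison is used instead of exact equality because a vectorized multiply may accumulate in a different order than the scalar reference.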
> In benchmarking, stable numbers are convenient. Since the input is randomly generated, I was wondering if there could be timing differences depending on different inputs? But I guess not here?
I would expect that differences in the actual FP values do not affect the throughput or latency of the floating-point units. I don't think there's anything about that in the public Arm Cortex tuning guides. On the devices I have access to, the numbers have been relatively stable so far, although some individual benchmarks occasionally show rather large swings (like +100% in runtime). My working theory is that this is due to system noise. If there's a real issue, I think we can address it once it appears.
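If input-dependent timing ever did turn out to matter, one common way to rule it out is to seed the generator with a fixed value so every run sees identical data. The sketch below assumes this approach; it is not taken from the patch, and `makeInput`, the seed, and the value range are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical input generator: a fixed seed makes the sequence repeatable
// from run to run on the same standard library implementation.
std::vector<float> makeInput(std::size_t n, unsigned seed = 42u,
                             float lo = -1.0f, float hi = 1.0f) {
  assert(lo < hi);
  std::mt19937 gen(seed);
  std::uniform_real_distribution<float> dist(lo, hi);
  std::vector<float> values(n);
  for (float &v : values)
    v = dist(gen);
  return values;
}
```

Note that `std::mt19937` produces the same raw sequence everywhere, but the mapping done by `std::uniform_real_distribution` is implementation-defined, so the values are only guaranteed to match across runs built against the same standard library.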
Feel free to ignore: "On Subnormal Floating Point and Abnormal Timing"
http://www.ieee-security.org/TC/SP2015/papers-archived/6949a623.pdf
A NaN matrix times a NaN matrix will be slow.
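For anyone who wants to rule out the special values raised here, a small guard can classify the inputs before timing starts. This is only a sketch: `hasOnlyNormalOrZero` and `requireWellBehavedInput` are hypothetical helpers, and whether such values actually change timing depends on the hardware.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical check: true when every element is either a normal
// floating-point value or zero, i.e. no NaN, infinity, or subnormal.
bool hasOnlyNormalOrZero(const std::vector<float> &values) {
  for (float v : values) {
    const int cls = std::fpclassify(v);
    if (cls != FP_NORMAL && cls != FP_ZERO)
      return false;
  }
  return true;
}

// Hypothetical guard a benchmark could call once before the timed loop.
inline void requireWellBehavedInput(const std::vector<float> &values) {
  assert(hasOnlyNormalOrZero(values) &&
         "input contains NaN, infinity, or subnormal values");
}
```

Checking only the inputs does not cover intermediate results; products of very small normal values can still underflow into the subnormal range during the multiply.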