
compare.py: compute and print 'OVERALL GEOMEAN' aggregate #1289

Merged: 1 commit into main, Nov 24, 2021

Conversation

LebedevRI
Collaborator

@LebedevRI LebedevRI commented Nov 23, 2021

Despite the wide variety of features we provide,
some people still have the audacity to complain and demand more.

Concretely, I very often would like to see the overall result
of a benchmark: is the 'new' side better or worse, overall,
across all the non-aggregate time/CPU measurements?

This comes up for me most often when I want to quickly see
what effect some LLVM optimization change has on the benchmark.

The idea is straightforward: produce four lists,
wall times for the LHS benchmark, CPU times for the LHS benchmark,
wall times for the RHS benchmark, CPU times for the RHS benchmark;
then compute the geomean of each of those four lists,
and compute the two percentage changes, between

  • the geomean wall time of the LHS benchmark and the geomean wall time of the RHS benchmark
  • the geomean CPU time of the LHS benchmark and the geomean CPU time of the RHS benchmark
    and voila!
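The aggregation described above can be sketched in plain Python. This is a minimal illustration, not the actual compare.py code, and the benchmark values are made up:

```python
import math

def geomean(values):
    # Geometric mean computed in the log domain, so long lists of
    # large values do not overflow the running product.
    assert values, "geomean of an empty list is undefined"
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical non-aggregate wall times, already in a common unit.
lhs_wall = [10.0, 20.0, 40.0]
rhs_wall = [9.0, 18.0, 36.0]

# Percentage change of the RHS geomean relative to the LHS geomean.
change = geomean(rhs_wall) / geomean(lhs_wall) - 1.0
print(f"OVERALL GEOMEAN (wall): {change:+.4f}")  # -0.1000, i.e. 10% faster
```

The same computation is then repeated for the CPU-time lists.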

It is complicated by the fact that it needs to gracefully handle
different time units, so a pandas.Timedelta dependency is introduced.
That is the only library that does not barf on floating-point times;
I have tried numpy.timedelta64 (it only takes integers)
and Python's datetime.timedelta (it does not take nanoseconds),
and they won't do.
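As a small illustration of why pandas.Timedelta fits: it accepts fractional values in any time unit and exposes the duration as integer nanoseconds via `.value`, which is enough to bring mixed-unit measurements onto a common scale. The `to_ns` helper below is hypothetical, not part of the patch:

```python
import pandas as pd

def to_ns(value, unit):
    # pandas.Timedelta accepts floats; .value is the duration as
    # integer nanoseconds, giving a common unit for comparison.
    return pd.Timedelta(value, unit=unit).value

print(to_ns(2.5, "us"))  # 2500 -- a fractional microsecond value survives
print(to_ns(1, "ms"))    # 1000000
```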

Fixes #1147

@google-cla google-cla bot added the cla: yes label Nov 23, 2021
@LebedevRI LebedevRI requested a review from dmah42 November 23, 2021 22:00
@dmah42 dmah42 merged commit d6ba952 into google:main Nov 24, 2021
@LebedevRI LebedevRI deleted the compare-geomean branch November 24, 2021 11:07
jackgerrits added a commit to jackgerrits/benchmark that referenced this pull request Nov 29, 2021
google#1289 added `pandas` to the `gbench` module in the `tools/` directory. That PR added `pandas` to the root `requirements.txt` but not to the `requirements.txt` in the `tools/` directory.
sergiud pushed a commit to sergiud/benchmark that referenced this pull request Jan 13, 2022
@chfast
Contributor

chfast commented Jan 13, 2022

How to use this? I'm doing regular comparison and don't see any "OVERALL GEOMEAN".

@LebedevRI
Collaborator Author

> How to use this? I'm doing regular comparison and don't see any "OVERALL GEOMEAN".

It's not behind any option; it's just there. Perhaps the version of benchmark/tools you use doesn't have it yet?

@chfast
Copy link
Contributor

chfast commented Jan 14, 2022

Ah, I was on the master branch...


Successfully merging this pull request may close these issues.

[FR] compare.py: compute overall stats - e.g. geomean