author    Lars Wirzenius <liw@liw.fi>  2014-03-02 20:20:16 +0000
committer Lars Wirzenius <liw@liw.fi>  2014-03-02 20:20:16 +0000
commit    c463c8bd575dbd1ec64ee028467e5e638e762e88 (patch)
tree      b3dcf32ba9bf34d1156d8dd40132d9a618fa31f5
parent    d7f4b168eb3148d3943eea3fa4d759d3d42a22b0 (diff)
parent    5fbefbf041a6a8222ff64763ca5aabbfd5449396 (diff)
download  obnam-c463c8bd575dbd1ec64ee028467e5e638e762e88.tar.gz
Merge branch 'liw/benchmarks'
-rw-r--r--  NEWS                        5
-rw-r--r--  README.benchmarks          79
-rwxr-xr-x  obnam-benchmark           527
-rwxr-xr-x  obnam-benchmark-summary   136
-rw-r--r--  obnam-benchmark.1.in      133
-rw-r--r--  setup.py                    2
6 files changed, 565 insertions, 317 deletions
diff --git a/NEWS b/NEWS
index 2a0d17d8..f5c85a30 100644
--- a/NEWS
+++ b/NEWS
@@ -38,6 +38,11 @@ Version 1.7, released UNRELEASED
future releases. The error codes are meant to be easy to search for,
and will allow error messages to be translated in the future.
+* The `obnam-benchmark` program got rewritten so that it'll do
+ something useful, but at the same time, it is no longer useful as a
+ general tool. It is now expected to be run from the Obnam source
+ tree (a cloned git repository), and isn't installed anymore.
+
Bug fixes:
* Obnam now creates a `trustdb.gpg` in the temporary GNUPGHOME it uses
diff --git a/README.benchmarks b/README.benchmarks
new file mode 100644
index 00000000..30241aa9
--- /dev/null
+++ b/README.benchmarks
@@ -0,0 +1,79 @@
+README for Obnam benchmarks
+===========================
+
+I've tried a number of approaches to benchmarks with Obnam over the
+years, but no approach has prevailed. This README describes my current
+approach in the hope that it will evolve into something useful.
+
+Ideally I would optimise Obnam for real-world use, but for now, I will
+be content with the simple synthetic benchmarks described here.
+
+Lars Wirzenius
+
+Overview
+--------
+
+I do not want a large number of different benchmarks, at least for
+now. I want a small set that I can and will run systematically, at
+least for each release. Too much data can be just as bad as too little
+data: if it takes too much effort to analyse the data, then that eats
+into development time. That said, hard numbers are better than
+guesses.
+
+I've decided on the following data sets:
+
+* 10^6 empty files, spread over 1000 directories with 1000 files
+ each. Obnam has (at least with repository format 6) a high overhead
+ per file, regardless of the contents of the file, and this is a
+ pessimal situation for that.
+
+ The interesting numbers here are: number of files backed up per
+ second, and size of backup repository.
+
+* A single directory with a single file, 2^40 bytes (1 TiB) long,
+  with little repetition in the data. This benchmarks the opposite end
+  of the spectrum of number of files vs size of data.
+
+ The interesting numbers here are number of bytes of actual file data
+ backed up per second and size of backup repository.
+
+Later, I may add more data sets. An intriguing idea would be to
+generate data from [Summain] manifests, where everything except the
+actual file data is duplicated from anonymised manifests captured from
+real systems.
+
+[Summain]: http://liw.fi/summain/
+
+For each data set, I will run the following operations:
+
+* An initial backup.
+* A no-op second generation backup.
+* A restore of the second generation, with `obnam restore`.
+* A restore of the second generation, with `obnam mount`.
+
+I will measure the following about each operation:
+
+* Total wall-clock time.
+* Maximum VmRSS memory, as logged by Obnam itself.
+
+I will additionally capture Python profiler output of each operation,
+to allow easier analysis of where time is going.
+
+I will run the benchmarks without compression or encryption, at least
+for now, and in general use the default settings built into Obnam for
+everything, unless there's a need to tweak them to make the benchmark
+work at all.
+
+Benchmark results
+-----------------
+
+A benchmark run will produce the following:
+
+* A JSON file with the measurements given above.
+* A Python profiling file for each operation for each dataset.
+ (Two datasets times four operations gives eight profiles.)
+
+I will run the benchmark for each release of Obnam, starting with
+Obnam 1.6.1. I will not care about Larch versions at this time: I will
+use the installed version. I will store the resulting data sets in a
+separate git repository for reference.
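The empty-files dataset described above (many directories, a fixed number of files per directory) can be sketched as follows. This is an illustrative Python 3 sketch, not part of the merge; the helper name and the reduced file count are assumptions chosen so the example runs quickly:

```python
import os
import tempfile

def create_empty_files(root, num_files, files_per_dir=1000):
    # Create num_files empty files, files_per_dir per subdirectory,
    # mirroring the 10^6-empty-files dataset layout described above.
    for i in range(num_files):
        subdir = os.path.join(root, 'dir-%d' % (i // files_per_dir))
        if i % files_per_dir == 0:
            os.mkdir(subdir)
        with open(os.path.join(subdir, 'file-%d' % i), 'w'):
            pass

root = tempfile.mkdtemp()
create_empty_files(root, 2500, files_per_dir=1000)
# 2500 files land in three subdirectories: dir-0, dir-1, dir-2.
print(sorted(os.listdir(root)))
```

At the benchmark's real scale (10^6 files) the same loop simply runs a thousand directories deep; the per-file cost of open/close is exactly the overhead the dataset is designed to stress.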
diff --git a/obnam-benchmark b/obnam-benchmark
index d5dc3dc1..6d3ccd16 100755
--- a/obnam-benchmark
+++ b/obnam-benchmark
@@ -1,6 +1,6 @@
#!/usr/bin/env python
#
-# Copyright 2010, 2011 Lars Wirzenius
+# Copyright 2014 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -16,203 +16,364 @@
# along with this program. If not, see <http://www.gnu.org/licenses/>.
-import cliapp
-import ConfigParser
-import glob
-import logging
+import json
import os
+import platform
import shutil
-import socket
-import subprocess
+import stat
+import sys
import tempfile
+import time
+import cliapp
+import Crypto.Cipher.ARC4
+import larch
+import ttystatus
-class ObnamBenchmark(cliapp.Application):
- default_sizes = ['1g/100m']
- keyid = '3B1802F81B321347'
- opers = ('backup', 'restore', 'list_files', 'forget')
+class BinaryJunkGenerator(object):
- def add_settings(self):
- self.settings.string(['results'], 'put results under DIR (%default)',
- metavar='DIR', default='../benchmarks')
- self.settings.string(['obnam-branch'],
- 'use DIR as the obnam branch to benchmark '
- '(default: %default)',
- metavar='DIR',
- default='.')
- self.settings.string(['larch-branch'],
- 'use DIR as the larch branch (default: %default)',
- metavar='DIR',
- )
- self.settings.string(['seivot-branch'],
- 'use DIR as the seivot branch '
- '(default: installed seivot)',
- metavar='DIR')
- self.settings.boolean(['with-encryption'],
- 'run benchmark using encryption')
-
- self.settings.string(['profile-name'],
- 'short name for benchmark scenario',
- default='unknown')
- self.settings.string_list(['size'],
- 'add PAIR to list of sizes to '
- 'benchmark (e.g., 10g/1m)',
- metavar='PAIR')
- self.settings.bytesize(['file-size'], 'how big should files be?',
- default=4096)
- self.settings.integer(['generations'],
- 'benchmark N generations (default: %default)',
- metavar='N',
- default=5)
- self.settings.boolean(['use-sftp-repository'],
- 'access the repository over SFTP '
- '(requires ssh to localhost to work)')
- self.settings.boolean(['use-sftp-root'],
- 'access the live data over SFTP '
- '(requires ssh to localhost to work)')
- self.settings.integer(['sftp-delay'],
- 'add artifical delay to sftp transfers '
- '(in milliseconds)')
- self.settings.string(['description'], 'describe benchmark')
- self.settings.boolean(['drop-caches'], 'drop kernel buffer caches')
- self.settings.string(['seivot-log'], 'seivot log setting')
-
- self.settings.boolean(['verify'], 'verify restores')
+ key = b'obnam-benchmark'
+ data = b'fake live data' * 1024
- def process_args(self, args):
- self.require_tmpdir()
-
- obnam_revno = self.bzr_revno(self.settings['obnam-branch'])
- if self.settings['larch-branch']:
- larch_revno = self.bzr_revno(self.settings['larch-branch'])
- else:
- larch_revno = None
-
- results = self.results_dir(obnam_revno, larch_revno)
-
- obnam_branch = self.settings['obnam-branch']
- if self.settings['seivot-branch']:
- seivot = os.path.join(self.settings['seivot-branch'], 'seivot')
- else:
- seivot = 'seivot'
-
- generations = self.settings['generations']
-
- tempdir = tempfile.mkdtemp()
- env = self.setup_gnupghome(tempdir)
-
- sizes = self.settings['size'] or self.default_sizes
- logging.debug('sizes: %s' % repr(sizes))
-
- file_size = self.settings['file-size']
- profile_name = self.settings['profile-name']
-
- for pair in sizes:
- initial, inc = self.parse_size_pair(pair)
-
- msg = 'Profile %s, size %s inc %s' % (profile_name, initial, inc)
- print
- print msg
- print '-' * len(msg)
- print
-
- obnam_profile = os.path.join(results,
- 'obnam--%(op)s-%(gen)s.prof')
- output = os.path.join(results, 'obnam.seivot')
- if os.path.exists(output):
- print ('%s already exists, not re-running benchmark' %
- output)
- else:
- argv = [seivot,
- '--obnam-branch', obnam_branch,
- '--incremental-data', inc,
- '--file-size', str(file_size),
- '--obnam-profile', obnam_profile,
- '--generations', str(generations),
- '--profile-name', profile_name,
- '--sftp-delay', str(self.settings['sftp-delay']),
- '--initial-data', initial,
- '--output', output]
- if self.settings['larch-branch']:
- argv.extend(['--larch-branch', self.settings['larch-branch']])
- if self.settings['seivot-log']:
- argv.extend(['--log', self.settings['seivot-log']])
- if self.settings['drop-caches']:
- argv.append('--drop-caches')
- if self.settings['use-sftp-repository']:
- argv.append('--use-sftp-repository')
- if self.settings['use-sftp-root']:
- argv.append('--use-sftp-root')
- if self.settings['with-encryption']:
- argv.extend(['--encrypt-with', self.keyid])
- if self.settings['description']:
- argv.extend(['--description',
- self.settings['description']])
- if self.settings['verify']:
- argv.append('--verify')
- self.runcmd(argv, env=env)
-
- shutil.rmtree(tempdir)
-
- def require_tmpdir(self):
- if 'TMPDIR' not in os.environ:
- raise cliapp.AppException('TMPDIR is not set. '
- 'You would probably run out of space '
- 'on /tmp.')
- if not os.path.exists(os.environ['TMPDIR']):
- raise cliapp.AppException('TMPDIR points at a non-existent '
- 'directory %s' % os.environ['TMPDIR'])
- logging.debug('TMPDIR=%s' % repr(os.environ['TMPDIR']))
+ def __init__(self):
+ self.cipher = Crypto.Cipher.ARC4.new(self.key)
+ self.buffer = ''
+
+ def get(self, num_bytes):
+ n = 0
+ result = []
+ while n < num_bytes:
+ if not self.buffer:
+ self.buffer = self.cipher.encrypt(self.data)
+
+ part = self.buffer[:num_bytes - n]
+ result.append(part)
+ n += len(part)
+ self.buffer = self.buffer[len(part):]
+
+ return ''.join(result)
+
+
+class StepInfo(object):
+
+ def __init__(self, label):
+ self.label = label
+ self.info = {
+ 'step': label,
+ }
+
+ def add_info(self, key, value):
+ self.info[key] = value
+
+ def stop_timer(self):
+ self.end = time.time()
+
+ def __enter__(self):
+ self.start = time.time()
+ self.end = None
+ return self
+
+ def __exit__(self, exc_type, exc_val, exc_tb):
+ if exc_type is None:
+ if self.end is None:
+ self.end = time.time()
+ self.info['duration'] = self.end - self.start
+ return False
- @property
- def hostname(self):
- return socket.gethostname()
+
+class ObnamBenchmark(object):
+
+ def __init__(self, settings, results_dir, srctree, junk_generator):
+ self.settings = settings
+ self.results_dir = results_dir
+ self.srctree = srctree
+ self.junk_generator = junk_generator
+
+ @classmethod
+ def add_settings(self, settings):
+ pass
@property
- def obnam_branch_name(self):
- obnam_branch = os.path.abspath(self.settings['obnam-branch'])
- return os.path.basename(obnam_branch)
-
- def results_dir(self, obnam_revno, larch_revno):
- parent = self.settings['results']
- parts = [self.hostname, self.obnam_branch_name, str(obnam_revno)]
- if larch_revno:
- parts.append(str(larch_revno))
- prefix = os.path.join(parent, "-".join(parts))
-
- get_path = lambda counter: "%s-%d" % (prefix, counter)
-
- counter = 0
- dirname = get_path(counter)
- while os.path.exists(dirname):
- counter += 1
- dirname = get_path(counter)
- os.makedirs(dirname)
- return dirname
-
- def setup_gnupghome(self, tempdir):
- gnupghome = os.path.join(tempdir, 'gnupghome')
- shutil.copytree('test-gpghome', gnupghome)
+ def benchmark_name(self):
+ s = self.__class__.__name__
+ if s.endswith('Benchmark'):
+ s = s[:-len('Benchmark')]
+ return s
+
+ def result_filename(self, label, suffix):
+ return os.path.join(
+ self.results_dir,
+ '%s-%s%s' % (self.benchmark_name, label, suffix))
+
+ def run(self):
+ self.tempdir = tempfile.mkdtemp()
+ self.live_data = self.create_live_data_dir()
+ self.repo = self.create_repo()
+ step_infos = []
+
+ steps = [
+ ('create-live-data', self.create_live_data),
+ ('initial-backup', self.backup),
+ ('no-op-backup', self.backup),
+ ('obnam-verify', self.obnam_verify),
+ ('obnam-mount', self.obnam_mount),
+ ('cleanup',
+ lambda si:
+ self.cleanup(si) if self.settings['cleanup'] else None),
+ ]
+
+ for label, method in steps:
+ print ' %s' % label
+ with StepInfo(label) as step_info:
+ method(step_info)
+ step_infos.append(step_info)
+
+ return {
+ 'steps': [step_info.info for step_info in step_infos],
+ }
+
+ def create_live_data_dir(self):
+ live_data = os.path.join(self.tempdir, 'live-data')
+ os.mkdir(live_data)
+ return live_data
+
+ def create_repo(self):
+ repo = os.path.join(self.tempdir, 'repo')
+ os.mkdir(repo)
+ return repo
+
+ def create_live_data(self, step_info):
+ # Subclasses MUST override this.
+ raise NotImplementedError()
+
+ def backup(self, step_info):
+ self.run_obnam(
+ ['backup', '-r', self.repo, self.live_data], step_info.label)
+ step_info.stop_timer()
+ step_info.add_info('repo-size', self.sum_of_file_sizes(self.repo))
+ step_info.add_info(
+ 'live-data-size', self.sum_of_file_sizes(self.live_data))
+
+ def obnam_verify(self, step_info):
+ self.run_obnam(
+ ['verify', '-r', self.repo],
+ step_info.label)
+
+ def obnam_mount(self, step_info):
+ mount = os.path.join(self.tempdir, 'mount')
+ os.mkdir(mount)
+
+ self.run_obnam(
+ ['mount', '-r', self.repo, '--to', mount],
+ step_info.label)
+
+ cliapp.runcmd(['tar', '-cf', '/dev/null', mount + '/.'])
+ time.sleep(1)
+
+ try:
+ cliapp.runcmd(['fusermount', '-u', mount])
+ except cliapp.AppException as e:
+ print 'ERROR from fusermount: %s' % str(e)
+
+ def cleanup(self, step_info):
+ shutil.rmtree(self.tempdir)
+
+ def run_obnam(self, args, label):
+ base_command = [
+ self.settings['obnam-cmd'],
+ '--no-default-config',
+ '--log', self.result_filename(label, '.log'),
+ '--log-level', 'debug',
+ ]
env = dict(os.environ)
- env['GNUPGHOME'] = gnupghome
- return env
+ env['OBNAM_PROFILE'] = self.result_filename(label, '.prof')
+ cliapp.runcmd(base_command + args, env=env, cwd=self.srctree)
- def bzr_revno(self, branch):
- p = subprocess.Popen(['bzr', 'revno'], cwd=branch,
- stdout=subprocess.PIPE)
- out, err = p.communicate()
- if p.returncode != 0:
- raise cliapp.AppException('bzr failed')
+ def sum_of_file_sizes(self, root_dir):
+ total = 0
+ for dirname, subdirs, basenames in os.walk(root_dir):
+ for basename in basenames:
+ pathname = os.path.join(dirname, basename)
+ st = os.lstat(pathname)
+ if stat.S_ISREG(st.st_mode):
+ total += st.st_size
+ return total
- revno = out.strip()
- logging.debug('bzr branch %s has revno %s' % (branch, revno))
- return revno
- def parse_size_pair(self, pair):
- return pair.split('/', 1)
+class EmptyFilesBenchmark(ObnamBenchmark):
+ files_per_dir = 1000
-if __name__ == '__main__':
- ObnamBenchmark().run()
+ @classmethod
+ def add_settings(self, settings):
+ settings.integer(
+ ['empty-files-count'],
+            'number of empty files for %s' % self.__name__,
+ default=10**6)
+
+ @property
+ def num_files(self):
+ return self.settings['empty-files-count']
+
+ def create_live_data(self, step_info):
+ step_info.add_info('empty-files-count', self.num_files)
+ for i in range(self.num_files):
+ subdir = os.path.join(
+ self.live_data, 'dir-%d' % (i / self.files_per_dir))
+ if (i % self.files_per_dir) == 0:
+ os.mkdir(subdir)
+ filename = os.path.join(subdir, 'file-%d' % i)
+ with open(filename, 'w'):
+ pass
+
+
+class SingleLargeFileBenchmark(ObnamBenchmark):
+
+ @classmethod
+ def add_settings(self, settings):
+ settings.bytesize(
+ ['single-large-file-size'],
+            'size of file to create for %s' % self.__name__,
+ default='1TB')
+
+ @property
+ def file_size(self):
+ return self.settings['single-large-file-size']
+
+ def create_live_data(self, step_info):
+ step_info.add_info('single-large-file-size', self.file_size)
+ filename = os.path.join(self.live_data, 'file.dat')
+ with open(filename, 'w') as f:
+ n = 0
+ max_chunk_size = 2**10
+ ts = ttystatus.TerminalStatus()
+ ts['written'] = 0
+ ts['total'] = self.file_size
+ ts.format(
+ '%ElapsedTime() '
+ 'writing live data: %ByteSize(written) of %ByteSize(total) '
+ '(%PercentDone(written,total))')
+ while n < self.file_size:
+ num_bytes = min(max_chunk_size, self.file_size - n)
+ data = self.junk_generator.get(num_bytes)
+ f.write(data)
+ n += len(data)
+ ts['written'] = n
+ ts.clear()
+ ts.finish()
+
+
+class ObnamBenchmarkRunner(cliapp.Application):
+
+ benchmark_classes = [
+ EmptyFilesBenchmark,
+ SingleLargeFileBenchmark,
+ ]
+
+ def add_settings(self):
+ self.settings.string(
+ ['obnam-cmd'],
+ 'use CMD as the argv[0] to invoke obnam',
+ metavar='CMD',
+ default='./obnam')
+
+ self.settings.string(
+ ['obnam-treeish'],
+ 'run Obnam from TREEISH in its git repository',
+ metavar='TREEISH',
+ default='HEAD')
+
+ self.settings.string(
+ ['results-dir'],
+ 'put results in DIR',
+ metavar='DIR',
+ default='.')
+
+ self.settings.boolean(
+ ['cleanup'],
+ 'clean up after each benchmark?',
+ default=True)
+
+ for benchmark_class in self.benchmark_classes:
+ benchmark_class.add_settings(self.settings)
+ def process_args(self, args):
+ results_dir = self.create_results_dir()
+ self.store_settings_in_results(results_dir)
+ result_obj = {
+ 'system-info': self.get_system_info_dict(),
+ 'versions': self.get_version_info_dict(),
+ }
+
+ srctree = self.prepare_source_tree()
+
+ junk_generator = BinaryJunkGenerator()
+ benchmark_infos = {}
+ for benchmark_class in self.benchmark_classes:
+ print 'Benchmark %s' % benchmark_class.__name__
+ benchmark = benchmark_class(
+ self.settings, results_dir, srctree, junk_generator)
+ benchmark_info = benchmark.run()
+ benchmark_infos[benchmark.benchmark_name] = benchmark_info
+ result_obj['benchmarks'] = benchmark_infos
+
+ self.save_result_obj(results_dir, result_obj)
+
+ shutil.rmtree(srctree)
+
+ def create_results_dir(self):
+ results = os.path.abspath(self.settings['results-dir'])
+ if not os.path.exists(results):
+ os.mkdir(results)
+ return results
+
+ def store_settings_in_results(self, results):
+ cp = self.settings.as_cp()
+ filename = os.path.join(results, 'obnam-benchmark.conf')
+ with open(filename, 'w') as f:
+ cp.write(f)
+
+ def get_system_info_dict(self):
+ return {
+ 'hostname': platform.node(),
+ 'machine': platform.machine(),
+ 'architecture': platform.architecture(),
+ 'uname': platform.uname(),
+ }
+
+ def get_version_info_dict(self):
+ treeish = self.settings['obnam-treeish']
+ sha1 = cliapp.runcmd(['git', 'show-ref', treeish]).split()[0]
+ describe = cliapp.runcmd(['git', 'describe', treeish]).strip()
+ return {
+ 'obnam-treeish': treeish,
+ 'obnam-sha1': sha1,
+ 'obnam-version': describe,
+ 'larch-version': larch.__version__,
+ }
+
+ def prepare_source_tree(self):
+ srctree = tempfile.mkdtemp()
+ self.extract_sources_from_git(srctree)
+ self.build_obnam(srctree)
+ return srctree
+
+ def extract_sources_from_git(self, srctree):
+ cliapp.runcmd(
+ ['git', 'archive', self.settings['obnam-treeish']],
+ ['tar', '-C', srctree, '-xf', '-'])
+
+ def build_obnam(self, srctree):
+ cliapp.runcmd(
+ ['python', 'setup.py', 'build_ext', '-i'],
+ cwd=srctree)
+
+ def save_result_obj(self, results_dir, result_obj):
+ filename = os.path.join(results_dir, 'benchmark.json')
+ with open(filename, 'w') as f:
+ json.dump(result_obj, f, indent=4)
+
+
+if __name__ == '__main__':
+ ObnamBenchmarkRunner().run()
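The `StepInfo` helper added in the script above times each benchmark step with a context manager, letting a step stop the clock early and attach extra measurements. A minimal Python 3 sketch of the same pattern (illustrative, not the script itself):

```python
import time

class StepInfo:
    """Collect a step label, optional measurements, and wall-clock duration."""

    def __init__(self, label):
        self.info = {'step': label}
        self.end = None

    def add_info(self, key, value):
        self.info[key] = value

    def stop_timer(self):
        # Let a step stop the clock before post-processing (such as
        # measuring repository size) that should not count as duration.
        self.end = time.time()

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            if self.end is None:
                self.end = time.time()
            self.info['duration'] = self.end - self.start
        return False  # never swallow exceptions

with StepInfo('initial-backup') as step:
    time.sleep(0.01)               # stand-in for running the backup
    step.add_info('repo-size', 12345)
```

Collecting each step's `info` dict into a list is what makes the final JSON result a flat, easily summarised structure.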
diff --git a/obnam-benchmark-summary b/obnam-benchmark-summary
new file mode 100755
index 00000000..7edd3a64
--- /dev/null
+++ b/obnam-benchmark-summary
@@ -0,0 +1,136 @@
+#!/usr/bin/env python
+#
+# Copyright 2014 Lars Wirzenius
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+
+import json
+import os
+
+import cliapp
+
+
+MiB = 2**20
+GiB = 2**30
+
+
+class ObnamBenchmarkSummary(cliapp.Application):
+
+ columns = (
+ ('version', 'version'),
+ ('ef-speed', 'EF files/s'),
+ ('ef-repo-size', 'EF repo (GiB)'),
+ ('lf-speed', 'LF MiB/s'),
+ ('lf-repo-size', 'LF repo (GiB)'),
+ )
+
+ def process_args(self, args):
+ summaries = []
+ for dirname in args:
+ summary = self.summarise_directory(dirname)
+ summaries.append(summary)
+ self.show_summaries(summaries)
+
+ def summarise_directory(self, dirname):
+ filename = os.path.join(dirname, 'benchmark.json')
+ with open(filename) as f:
+ obj = json.load(f)
+
+ return {
+ 'version':
+ self.get_obnam_version(obj),
+ 'ef-speed':
+ '%.0f' % self.get_empty_files_speed(obj),
+ 'ef-files':
+ self.get_empty_files_count(obj),
+ 'ef-repo-size':
+ self.format_size(self.get_empty_files_repo_size(obj), GiB),
+ 'lf-speed':
+ self.format_size(self.get_large_file_speed(obj), MiB),
+ 'lf-size':
+ self.format_size(self.get_large_file_size(obj), GiB),
+ 'lf-repo-size':
+ self.format_size(self.get_large_file_repo_size(obj), GiB),
+ }
+
+ def get_obnam_version(self, obj):
+ return obj['versions']['obnam-version']
+
+ def get_empty_files_speed(self, obj):
+ count = self.get_empty_files_count(obj)
+ step = self.find_step(obj, 'EmptyFiles', 'initial-backup')
+ return count / step['duration']
+
+ def get_empty_files_count(self, obj):
+ step = self.find_step(obj, 'EmptyFiles', 'create-live-data')
+ return step['empty-files-count']
+
+ def get_empty_files_repo_size(self, obj):
+ step = self.find_step(obj, 'EmptyFiles', 'initial-backup')
+ return step['repo-size']
+
+ def get_large_file_speed(self, obj):
+ file_size = self.get_large_file_size(obj)
+ step = self.find_step(obj, 'SingleLargeFile', 'initial-backup')
+ return file_size / step['duration']
+
+ def get_large_file_size(self, obj):
+ step = self.find_step(obj, 'SingleLargeFile', 'create-live-data')
+ return step['single-large-file-size']
+
+ def get_large_file_repo_size(self, obj):
+ step = self.find_step(obj, 'SingleLargeFile', 'initial-backup')
+ return step['repo-size']
+
+ def find_step(self, obj, benchmark_name, step_name):
+ for step in obj['benchmarks'][benchmark_name]['steps']:
+ if step['step'] == step_name:
+ return step
+        raise Exception('step %s not found' % step_name)
+
+ def format_size(self, size, unit):
+ return '%.0f' % (size / unit)
+
+ def show_summaries(self, summaries):
+ lines = [[title for key, title in self.columns]]
+
+ for s in summaries:
+ line = [str(s[key]) for key, title in self.columns]
+ lines.append(line)
+
+ widths = self.compute_column_widths(lines)
+
+ titles = lines[0]
+ results = sorted(lines[1:])
+ for line in [titles] + results:
+ cells = []
+ for i, cell in enumerate(line):
+ cells.append('%*s' % (widths[i], cell))
+ self.output.write(' | '.join(cells))
+ self.output.write('\n')
+
+ def compute_column_widths(self, lines):
+ widths = []
+ n = len(lines[0])
+ for col in range(n):
+ width = 0
+ for line in lines:
+ width = max(width, len(line[col]))
+ widths.append(width)
+ return widths
+
+
+if __name__ == '__main__':
+ ObnamBenchmarkSummary().run()
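The summary tool's table layout (compute each column's width from its longest cell, then right-align) is a standard pattern. A small self-contained Python 3 sketch of the same idea, with made-up example rows:

```python
def compute_column_widths(lines):
    # Width of each column is the length of the longest cell in it.
    return [max(len(line[col]) for line in lines)
            for col in range(len(lines[0]))]

def format_table(lines):
    widths = compute_column_widths(lines)
    return '\n'.join(
        ' | '.join('%*s' % (w, cell) for w, cell in zip(widths, line))
        for line in lines)

rows = [
    ['version', 'EF files/s', 'LF MiB/s'],   # header, as in the tool
    ['1.6.1', '118', '33'],                  # hypothetical numbers
    ['1.7', '131', '35'],
]
print(format_table(rows))
```

The `'%*s'` format takes the width as an argument, which is what lets a single pass over the widths list align every cell.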
diff --git a/obnam-benchmark.1.in b/obnam-benchmark.1.in
deleted file mode 100644
index f52ee74c..00000000
--- a/obnam-benchmark.1.in
+++ /dev/null
@@ -1,133 +0,0 @@
-.\" Copyright 2011 Lars Wirzenius <liw@liw.fi>
-.\"
-.\" This program is free software: you can redistribute it and/or modify
-.\" it under the terms of the GNU General Public License as published by
-.\" the Free Software Foundation, either version 3 of the License, or
-.\" (at your option) any later version.
-.\"
-.\" This program is distributed in the hope that it will be useful,
-.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
-.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-.\" GNU General Public License for more details.
-.\"
-.\" You should have received a copy of the GNU General Public License
-.\" along with this program. If not, see <http://www.gnu.org/licenses/>.
-.\"
-.TH OBNAM-BENCHMARK 1
-.SH NAME
-obnam-benchmark \- benchmark obnam
-.SH SYNOPSIS
-.SH DESCRIPTION
-.B obnam-benchmark
-benchmarks the
-.BR obnam (1)
-backup application,
-by measuring how much time it takes to do a backup, restore, etc,
-in various scenarios.
-.B obnam-benchmark
-uses the
-.BR seivot (1)
-tool for actually running the benchmarks,
-but makes some helpful assumptions about things,
-to make it simpler to run than running
-.B seivot
-directly.
-.PP
-Benchmarks are run using two different usage profiles:
-.I mailspool
-(all files are small), and
-.I mediaserver
-(all files are big).
-For each profile,
-test data of the desired total size is generated,
-backed up,
-and then several incremental generations are backed up,
-each adding some more generated test data.
-Then other operations are run against the backup repository:
-restoring,
-listing the contents of,
-and removing each generation.
-.PP
-The result of the benchmark is a
-.I .seivot
-file per profile,
-plus a Python profiler file for each run of
-.BR obnam .
-These are stored in
-.IR ../benchmarks .
-A set of
-.I .seivot
-files can be summarized for comparison with
-.BR seivots-summary (1).
-The profiling files can be viewed with the usual Python tools:
-see the
-.B pstats
-module.
-.PP
-The benchmarks are run against a version of
-.B obnam
-checked out from version control.
-It is not (currently) possible to run the benchmark against an installed
-version of
-.BR obnam.
-Also the
-.I larch
-Python library,
-which
-.B obnam
-needs,
-needs to be checked out from version control.
-The
-.B \-\-obnam\-branch
-and
-.B \-\-larch\-branch
-options set the locations,
-if the defaults are not correct.
-.SH OPTIONS
-.SH ENVIRONMENT
-.TP
-.BR TMPDIR
-This variable
-.I must
-be set.
-It controls where the temporary files (generated test data) is stored.
-If this variable was not set,
-they'd be put into
-.IR /tmp ,
-which easily fills up,
-to the detriment of the entire system.
-Thus.
-.B obnam-benchmark
-requires that the location is set explicitly.
-(You can still use
-.I /tmp
-if you want, but you have to set
-.B TMPDIR
-explicitly.)
-.SH FILES
-.TP
-.BR ../benchmarks/
-The default directory where results of the benchmark are stored,
-in a subdirectory named after the branch and revision numbers.
-.SH EXAMPLE
-To run a small benchmark:
-.IP
-TMPDIR=/var/tmp obnam-benchmark --size=10m/1m
-.PP
-To run a benchmark using existing data:
-.IP
-TMPDIR=/var/tmp obnam-benchmark --use-existing=$HOME/Mail
-.PP
-To view the currently available benchmark results:
-.IP
-seivots-summary ../benchmarks/*/*mail*.seivot | less -S
-.br
-seivots-summary ../benchmarks/*/*media*.seivot | less -S
-.PP
-(You need to run
-.B seivots-summary
-once per usage profile.)
-.SH "SEE ALSO"
-.BR obnam (1),
-.BR seivot (1),
-.BR seivots-summary (1).
diff --git a/setup.py b/setup.py
index b421b79d..deda1ba1 100644
--- a/setup.py
+++ b/setup.py
@@ -199,7 +199,7 @@ setup(name='obnam',
author='Lars Wirzenius',
author_email='liw@liw.fi',
url='http://liw.fi/obnam/',
- scripts=['obnam', 'obnam-benchmark', 'obnam-viewprof'],
+ scripts=['obnam', 'obnam-viewprof'],
packages=['obnamlib', 'obnamlib.plugins', 'obnamlib.fmt_6'],
ext_modules=[Extension('obnamlib._obnam', sources=['_obnammodule.c'])],
data_files=[('share/man/man1', glob.glob('*.1'))],