| author | Lars Wirzenius <liw@liw.fi> | 2014-03-02 20:20:16 +0000 |
|---|---|---|
| committer | Lars Wirzenius <liw@liw.fi> | 2014-03-02 20:20:16 +0000 |
| commit | c463c8bd575dbd1ec64ee028467e5e638e762e88 (patch) | |
| tree | b3dcf32ba9bf34d1156d8dd40132d9a618fa31f5 | |
| parent | d7f4b168eb3148d3943eea3fa4d759d3d42a22b0 (diff) | |
| parent | 5fbefbf041a6a8222ff64763ca5aabbfd5449396 (diff) | |
| download | obnam-c463c8bd575dbd1ec64ee028467e5e638e762e88.tar.gz | |
Merge branch 'liw/benchmarks'
| -rw-r--r-- | NEWS | 5 |
| -rw-r--r-- | README.benchmarks | 79 |
| -rwxr-xr-x | obnam-benchmark | 527 |
| -rwxr-xr-x | obnam-benchmark-summary | 136 |
| -rw-r--r-- | obnam-benchmark.1.in | 133 |
| -rw-r--r-- | setup.py | 2 |
6 files changed, 565 insertions, 317 deletions
@@ -38,6 +38,11 @@ Version 1.7, released UNRELEASED
 future releases. The error codes are meant to be easy to search for,
 and will allow error messages to be translated in the future.
 
+* The `obnam-benchmark` program got rewritten so that it'll do
+  something useful, but at the same time, it is no longer useful as a
+  general tool. It is now expected to be run from the Obnam source
+  tree (a cloned git repository), and isn't installed anymore.
+
 Bug fixes:
 
 * Obnam now creates a `trustdb.gpg` in the temporary GNUPGHOME it uses
diff --git a/README.benchmarks b/README.benchmarks
new file mode 100644
index 00000000..30241aa9
--- /dev/null
+++ b/README.benchmarks
@@ -0,0 +1,79 @@
+README for Obnam benchmarks
+===========================
+
+I've tried a number of approaches to benchmarks with Obnam over the
+years, but no approach has prevailed. This README describes my current
+approach in the hope that it will evolve into something useful.
+
+Ideally I would optimise Obnam for real-world use, but for now, I will
+be content with the simple synthetic benchmarks described here.
+
+Lars Wirzenius
+
+Overview
+--------
+
+I do not want a large number of different benchmarks, at least for
+now. I want a small set that I can and will run systematically, at
+least for each release. Too much data can be just as bad as too little
+data: if it takes too much effort to analyse the data, then that eats
+into development time. That said, hard numbers are better than
+guesses.
+
+I've decided on the following data sets:
+
+* 10^6 empty files, spread over 1000 directories with 1000 files
+  each. Obnam has (at least with repository format 6) a high overhead
+  per file, regardless of the contents of the file, and this is a
+  pessimal situation for that.
+
+  The interesting numbers here are: number of files backed up per
+  second, and size of backup repository.
+
+* A single directory with a single file, 2^40 bytes (1 TiB) long,
+  with little repetition in the data.
+  This benchmarks the opposite end of the spectrum of number of
+  files vs size of data.
+
+  The interesting numbers here are number of bytes of actual file data
+  backed up per second and size of backup repository.
+
+Later, I may add more data sets. An intriguing idea would be to
+generate data from [Summain] manifests, where everything except the
+actual file data is duplicated from anonymised manifests captured from
+real systems.
+
+[Summain]: http://liw.fi/summain/
+
+For each data set, I will run the following operations:
+
+* An initial backup.
+* A no-op second generation backup.
+* A restore of the second generation, with `obnam restore`.
+* A restore of the second generation, with `obnam mount`.
+
+I will measure the following about each operation:
+
+* Total wall-clock time.
+* Maximum VmRSS memory, as logged by Obnam itself.
+
+I will additionally capture Python profiler output of each operation,
+to allow easier analysis of where time is going.
+
+I will run the benchmarks without compression or encryption, at least
+for now, and in general use the default settings built into Obnam for
+everything, unless there's a need to tweak them to make the benchmark
+work at all.
+
+Benchmark results
+-----------------
+
+A benchmark run will produce the following:
+
+* A JSON file with the measurements given above.
+* A Python profiling file for each operation for each dataset.
+  (Two datasets times four operations gives eight profiles.)
+
+I will run the benchmark for each release of Obnam, starting with
+Obnam 1.6.1. I will not care about Larch versions at this time: I will
+use the installed version. I will store the resulting data sets in a
+separate git repository for reference.
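The empty-files data set the README describes can be generated in a few lines. This is an illustrative sketch, not the benchmark script itself: the `create_empty_files` helper name is mine, and the file count is scaled down from 10^6 for demonstration.

```python
import os
import tempfile


def create_empty_files(root, num_files, files_per_dir=1000):
    # Spread num_files empty files over subdirectories that hold
    # files_per_dir files each, like the EmptyFiles data set.
    for i in range(num_files):
        subdir = os.path.join(root, 'dir-%d' % (i // files_per_dir))
        if i % files_per_dir == 0:
            os.mkdir(subdir)
        # An empty file: content does not matter for this data set.
        with open(os.path.join(subdir, 'file-%d' % i), 'w'):
            pass


root = tempfile.mkdtemp()
create_empty_files(root, 2500)  # scaled down from 10^6 for illustration
```

With 2500 files and 1000 files per directory this creates three subdirectories (`dir-0` through `dir-2`), which is why the full 10^6-file data set ends up spread over exactly 1000 directories.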
diff --git a/obnam-benchmark b/obnam-benchmark index d5dc3dc1..6d3ccd16 100755 --- a/obnam-benchmark +++ b/obnam-benchmark @@ -1,6 +1,6 @@ #!/usr/bin/env python # -# Copyright 2010, 2011 Lars Wirzenius +# Copyright 2014 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by @@ -16,203 +16,364 @@ # along with this program. If not, see <http://www.gnu.org/licenses/>. -import cliapp -import ConfigParser -import glob -import logging +import json import os +import platform import shutil -import socket -import subprocess +import stat +import sys import tempfile +import time +import cliapp +import Crypto.Cipher.ARC4 +import larch +import ttystatus -class ObnamBenchmark(cliapp.Application): - default_sizes = ['1g/100m'] - keyid = '3B1802F81B321347' - opers = ('backup', 'restore', 'list_files', 'forget') +class BinaryJunkGenerator(object): - def add_settings(self): - self.settings.string(['results'], 'put results under DIR (%default)', - metavar='DIR', default='../benchmarks') - self.settings.string(['obnam-branch'], - 'use DIR as the obnam branch to benchmark ' - '(default: %default)', - metavar='DIR', - default='.') - self.settings.string(['larch-branch'], - 'use DIR as the larch branch (default: %default)', - metavar='DIR', - ) - self.settings.string(['seivot-branch'], - 'use DIR as the seivot branch ' - '(default: installed seivot)', - metavar='DIR') - self.settings.boolean(['with-encryption'], - 'run benchmark using encryption') - - self.settings.string(['profile-name'], - 'short name for benchmark scenario', - default='unknown') - self.settings.string_list(['size'], - 'add PAIR to list of sizes to ' - 'benchmark (e.g., 10g/1m)', - metavar='PAIR') - self.settings.bytesize(['file-size'], 'how big should files be?', - default=4096) - self.settings.integer(['generations'], - 'benchmark N generations (default: %default)', - metavar='N', - default=5) - 
self.settings.boolean(['use-sftp-repository'], - 'access the repository over SFTP ' - '(requires ssh to localhost to work)') - self.settings.boolean(['use-sftp-root'], - 'access the live data over SFTP ' - '(requires ssh to localhost to work)') - self.settings.integer(['sftp-delay'], - 'add artifical delay to sftp transfers ' - '(in milliseconds)') - self.settings.string(['description'], 'describe benchmark') - self.settings.boolean(['drop-caches'], 'drop kernel buffer caches') - self.settings.string(['seivot-log'], 'seivot log setting') - - self.settings.boolean(['verify'], 'verify restores') + key = b'obnam-benchmark' + data = b'fake live data' * 1024 - def process_args(self, args): - self.require_tmpdir() - - obnam_revno = self.bzr_revno(self.settings['obnam-branch']) - if self.settings['larch-branch']: - larch_revno = self.bzr_revno(self.settings['larch-branch']) - else: - larch_revno = None - - results = self.results_dir(obnam_revno, larch_revno) - - obnam_branch = self.settings['obnam-branch'] - if self.settings['seivot-branch']: - seivot = os.path.join(self.settings['seivot-branch'], 'seivot') - else: - seivot = 'seivot' - - generations = self.settings['generations'] - - tempdir = tempfile.mkdtemp() - env = self.setup_gnupghome(tempdir) - - sizes = self.settings['size'] or self.default_sizes - logging.debug('sizes: %s' % repr(sizes)) - - file_size = self.settings['file-size'] - profile_name = self.settings['profile-name'] - - for pair in sizes: - initial, inc = self.parse_size_pair(pair) - - msg = 'Profile %s, size %s inc %s' % (profile_name, initial, inc) - print - print msg - print '-' * len(msg) - print - - obnam_profile = os.path.join(results, - 'obnam--%(op)s-%(gen)s.prof') - output = os.path.join(results, 'obnam.seivot') - if os.path.exists(output): - print ('%s already exists, not re-running benchmark' % - output) - else: - argv = [seivot, - '--obnam-branch', obnam_branch, - '--incremental-data', inc, - '--file-size', str(file_size), - 
'--obnam-profile', obnam_profile, - '--generations', str(generations), - '--profile-name', profile_name, - '--sftp-delay', str(self.settings['sftp-delay']), - '--initial-data', initial, - '--output', output] - if self.settings['larch-branch']: - argv.extend(['--larch-branch', self.settings['larch-branch']]) - if self.settings['seivot-log']: - argv.extend(['--log', self.settings['seivot-log']]) - if self.settings['drop-caches']: - argv.append('--drop-caches') - if self.settings['use-sftp-repository']: - argv.append('--use-sftp-repository') - if self.settings['use-sftp-root']: - argv.append('--use-sftp-root') - if self.settings['with-encryption']: - argv.extend(['--encrypt-with', self.keyid]) - if self.settings['description']: - argv.extend(['--description', - self.settings['description']]) - if self.settings['verify']: - argv.append('--verify') - self.runcmd(argv, env=env) - - shutil.rmtree(tempdir) - - def require_tmpdir(self): - if 'TMPDIR' not in os.environ: - raise cliapp.AppException('TMPDIR is not set. 
' - 'You would probably run out of space ' - 'on /tmp.') - if not os.path.exists(os.environ['TMPDIR']): - raise cliapp.AppException('TMPDIR points at a non-existent ' - 'directory %s' % os.environ['TMPDIR']) - logging.debug('TMPDIR=%s' % repr(os.environ['TMPDIR'])) + def __init__(self): + self.cipher = Crypto.Cipher.ARC4.new(self.key) + self.buffer = '' + + def get(self, num_bytes): + n = 0 + result = [] + while n < num_bytes: + if not self.buffer: + self.buffer = self.cipher.encrypt(self.data) + + part = self.buffer[:num_bytes - n] + result.append(part) + n += len(part) + self.buffer = self.buffer[len(part):] + + return ''.join(result) + + +class StepInfo(object): + + def __init__(self, label): + self.label = label + self.info = { + 'step': label, + } + + def add_info(self, key, value): + self.info[key] = value + + def stop_timer(self): + self.end = time.time() + + def __enter__(self): + self.start = time.time() + self.end = None + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + if exc_type is None: + if self.end is None: + self.end = time.time() + self.info['duration'] = self.end - self.start + return False - @property - def hostname(self): - return socket.gethostname() + +class ObnamBenchmark(object): + + def __init__(self, settings, results_dir, srctree, junk_generator): + self.settings = settings + self.results_dir = results_dir + self.srctree = srctree + self.junk_generator = junk_generator + + @classmethod + def add_settings(self, settings): + pass @property - def obnam_branch_name(self): - obnam_branch = os.path.abspath(self.settings['obnam-branch']) - return os.path.basename(obnam_branch) - - def results_dir(self, obnam_revno, larch_revno): - parent = self.settings['results'] - parts = [self.hostname, self.obnam_branch_name, str(obnam_revno)] - if larch_revno: - parts.append(str(larch_revno)) - prefix = os.path.join(parent, "-".join(parts)) - - get_path = lambda counter: "%s-%d" % (prefix, counter) - - counter = 0 - dirname = 
get_path(counter) - while os.path.exists(dirname): - counter += 1 - dirname = get_path(counter) - os.makedirs(dirname) - return dirname - - def setup_gnupghome(self, tempdir): - gnupghome = os.path.join(tempdir, 'gnupghome') - shutil.copytree('test-gpghome', gnupghome) + def benchmark_name(self): + s = self.__class__.__name__ + if s.endswith('Benchmark'): + s = s[:-len('Benchmark')] + return s + + def result_filename(self, label, suffix): + return os.path.join( + self.results_dir, + '%s-%s%s' % (self.benchmark_name, label, suffix)) + + def run(self): + self.tempdir = tempfile.mkdtemp() + self.live_data = self.create_live_data_dir() + self.repo = self.create_repo() + step_infos = [] + + steps = [ + ('create-live-data', self.create_live_data), + ('initial-backup', self.backup), + ('no-op-backup', self.backup), + ('obnam-verify', self.obnam_verify), + ('obnam-mount', self.obnam_mount), + ('cleanup', + lambda si: + self.cleanup(si) if self.settings['cleanup'] else None), + ] + + for label, method in steps: + print ' %s' % label + with StepInfo(label) as step_info: + method(step_info) + step_infos.append(step_info) + + return { + 'steps': [step_info.info for step_info in step_infos], + } + + def create_live_data_dir(self): + live_data = os.path.join(self.tempdir, 'live-data') + os.mkdir(live_data) + return live_data + + def create_repo(self): + repo = os.path.join(self.tempdir, 'repo') + os.mkdir(repo) + return repo + + def create_live_data(self, step_info): + # Subclasses MUST override this. 
+ raise NotImplementedError() + + def backup(self, step_info): + self.run_obnam( + ['backup', '-r', self.repo, self.live_data], step_info.label) + step_info.stop_timer() + step_info.add_info('repo-size', self.sum_of_file_sizes(self.repo)) + step_info.add_info( + 'live-data-size', self.sum_of_file_sizes(self.live_data)) + + def obnam_verify(self, step_info): + self.run_obnam( + ['verify', '-r', self.repo], + step_info.label) + + def obnam_mount(self, step_info): + mount = os.path.join(self.tempdir, 'mount') + os.mkdir(mount) + + self.run_obnam( + ['mount', '-r', self.repo, '--to', mount], + step_info.label) + + cliapp.runcmd(['tar', '-cf', '/dev/null', mount + '/.']) + time.sleep(1) + + try: + cliapp.runcmd(['fusermount', '-u', mount]) + except cliapp.AppException as e: + print 'ERROR from fusermount: %s' % str(e) + + def cleanup(self, step_info): + shutil.rmtree(self.tempdir) + + def run_obnam(self, args, label): + base_command = [ + self.settings['obnam-cmd'], + '--no-default-config', + '--log', self.result_filename(label, '.log'), + '--log-level', 'debug', + ] env = dict(os.environ) - env['GNUPGHOME'] = gnupghome - return env + env['OBNAM_PROFILE'] = self.result_filename(label, '.prof') + cliapp.runcmd(base_command + args, env=env, cwd=self.srctree) - def bzr_revno(self, branch): - p = subprocess.Popen(['bzr', 'revno'], cwd=branch, - stdout=subprocess.PIPE) - out, err = p.communicate() - if p.returncode != 0: - raise cliapp.AppException('bzr failed') + def sum_of_file_sizes(self, root_dir): + total = 0 + for dirname, subdirs, basenames in os.walk(root_dir): + for basename in basenames: + pathname = os.path.join(dirname, basename) + st = os.lstat(pathname) + if stat.S_ISREG(st.st_mode): + total += st.st_size + return total - revno = out.strip() - logging.debug('bzr branch %s has revno %s' % (branch, revno)) - return revno - def parse_size_pair(self, pair): - return pair.split('/', 1) +class EmptyFilesBenchmark(ObnamBenchmark): + files_per_dir = 1000 -if __name__ 
== '__main__': - ObnamBenchmark().run() + @classmethod + def add_settings(self, settings): + settings.integer( + ['empty-files-count'], + 'number of empty files for %s' % self.__class__.__name__, + default=10**6) + + @property + def num_files(self): + return self.settings['empty-files-count'] + + def create_live_data(self, step_info): + step_info.add_info('empty-files-count', self.num_files) + for i in range(self.num_files): + subdir = os.path.join( + self.live_data, 'dir-%d' % (i / self.files_per_dir)) + if (i % self.files_per_dir) == 0: + os.mkdir(subdir) + filename = os.path.join(subdir, 'file-%d' % i) + with open(filename, 'w'): + pass + + +class SingleLargeFileBenchmark(ObnamBenchmark): + + @classmethod + def add_settings(self, settings): + settings.bytesize( + ['single-large-file-size'], + 'size of file to create for %s' % self.__class__.__name__, + default='1TB') + + @property + def file_size(self): + return self.settings['single-large-file-size'] + + def create_live_data(self, step_info): + step_info.add_info('single-large-file-size', self.file_size) + filename = os.path.join(self.live_data, 'file.dat') + with open(filename, 'w') as f: + n = 0 + max_chunk_size = 2**10 + ts = ttystatus.TerminalStatus() + ts['written'] = 0 + ts['total'] = self.file_size + ts.format( + '%ElapsedTime() ' + 'writing live data: %ByteSize(written) of %ByteSize(total) ' + '(%PercentDone(written,total))') + while n < self.file_size: + num_bytes = min(max_chunk_size, self.file_size - n) + data = self.junk_generator.get(num_bytes) + f.write(data) + n += len(data) + ts['written'] = n + ts.clear() + ts.finish() + + +class ObnamBenchmarkRunner(cliapp.Application): + + benchmark_classes = [ + EmptyFilesBenchmark, + SingleLargeFileBenchmark, + ] + + def add_settings(self): + self.settings.string( + ['obnam-cmd'], + 'use CMD as the argv[0] to invoke obnam', + metavar='CMD', + default='./obnam') + + self.settings.string( + ['obnam-treeish'], + 'run Obnam from TREEISH in its git repository', 
+ metavar='TREEISH', + default='HEAD') + + self.settings.string( + ['results-dir'], + 'put results in DIR', + metavar='DIR', + default='.') + + self.settings.boolean( + ['cleanup'], + 'clean up after each benchmark?', + default=True) + + for benchmark_class in self.benchmark_classes: + benchmark_class.add_settings(self.settings) + def process_args(self, args): + results_dir = self.create_results_dir() + self.store_settings_in_results(results_dir) + result_obj = { + 'system-info': self.get_system_info_dict(), + 'versions': self.get_version_info_dict(), + } + + srctree = self.prepare_source_tree() + + junk_generator = BinaryJunkGenerator() + benchmark_infos = {} + for benchmark_class in self.benchmark_classes: + print 'Benchmark %s' % benchmark_class.__name__ + benchmark = benchmark_class( + self.settings, results_dir, srctree, junk_generator) + benchmark_info = benchmark.run() + benchmark_infos[benchmark.benchmark_name] = benchmark_info + result_obj['benchmarks'] = benchmark_infos + + self.save_result_obj(results_dir, result_obj) + + shutil.rmtree(srctree) + + def create_results_dir(self): + results = os.path.abspath(self.settings['results-dir']) + if not os.path.exists(results): + os.mkdir(results) + return results + + def store_settings_in_results(self, results): + cp = self.settings.as_cp() + filename = os.path.join(results, 'obnam-benchmark.conf') + with open(filename, 'w') as f: + cp.write(f) + + def get_system_info_dict(self): + return { + 'hostname': platform.node(), + 'machine': platform.machine(), + 'architecture': platform.architecture(), + 'uname': platform.uname(), + } + + def get_version_info_dict(self): + treeish = self.settings['obnam-treeish'] + sha1 = cliapp.runcmd(['git', 'show-ref', treeish]).split()[0] + describe = cliapp.runcmd(['git', 'describe', treeish]).strip() + return { + 'obnam-treeish': treeish, + 'obnam-sha1': sha1, + 'obnam-version': describe, + 'larch-version': larch.__version__, + } + + def prepare_source_tree(self): + srctree = 
tempfile.mkdtemp() + self.extract_sources_from_git(srctree) + self.build_obnam(srctree) + return srctree + + def extract_sources_from_git(self, srctree): + cliapp.runcmd( + ['git', 'archive', self.settings['obnam-treeish']], + ['tar', '-C', srctree, '-xf', '-']) + + def build_obnam(self, srctree): + cliapp.runcmd( + ['python', 'setup.py', 'build_ext', '-i'], + cwd=srctree) + + def save_result_obj(self, results_dir, result_obj): + filename = os.path.join(results_dir, 'benchmark.json') + with open(filename, 'w') as f: + json.dump(result_obj, f, indent=4) + + +if __name__ == '__main__': + ObnamBenchmarkRunner().run() diff --git a/obnam-benchmark-summary b/obnam-benchmark-summary new file mode 100755 index 00000000..7edd3a64 --- /dev/null +++ b/obnam-benchmark-summary @@ -0,0 +1,136 @@ +#!/usr/bin/env python +# +# Copyright 2014 Lars Wirzenius +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. 
+ + +import json +import os + +import cliapp + + +MiB = 2**20 +GiB = 2**30 + + +class ObnamBenchmarkSummary(cliapp.Application): + + columns = ( + ('version', 'version'), + ('ef-speed', 'EF files/s'), + ('ef-repo-size', 'EF repo (GiB)'), + ('lf-speed', 'LF MiB/s'), + ('lf-repo-size', 'LF repo (GiB)'), + ) + + def process_args(self, args): + summaries = [] + for dirname in args: + summary = self.summarise_directory(dirname) + summaries.append(summary) + self.show_summaries(summaries) + + def summarise_directory(self, dirname): + filename = os.path.join(dirname, 'benchmark.json') + with open(filename) as f: + obj = json.load(f) + + return { + 'version': + self.get_obnam_version(obj), + 'ef-speed': + '%.0f' % self.get_empty_files_speed(obj), + 'ef-files': + self.get_empty_files_count(obj), + 'ef-repo-size': + self.format_size(self.get_empty_files_repo_size(obj), GiB), + 'lf-speed': + self.format_size(self.get_large_file_speed(obj), MiB), + 'lf-size': + self.format_size(self.get_large_file_size(obj), GiB), + 'lf-repo-size': + self.format_size(self.get_large_file_repo_size(obj), GiB), + } + + def get_obnam_version(self, obj): + return obj['versions']['obnam-version'] + + def get_empty_files_speed(self, obj): + count = self.get_empty_files_count(obj) + step = self.find_step(obj, 'EmptyFiles', 'initial-backup') + return count / step['duration'] + + def get_empty_files_count(self, obj): + step = self.find_step(obj, 'EmptyFiles', 'create-live-data') + return step['empty-files-count'] + + def get_empty_files_repo_size(self, obj): + step = self.find_step(obj, 'EmptyFiles', 'initial-backup') + return step['repo-size'] + + def get_large_file_speed(self, obj): + file_size = self.get_large_file_size(obj) + step = self.find_step(obj, 'SingleLargeFile', 'initial-backup') + return file_size / step['duration'] + + def get_large_file_size(self, obj): + step = self.find_step(obj, 'SingleLargeFile', 'create-live-data') + return step['single-large-file-size'] + + def 
get_large_file_repo_size(self, obj): + step = self.find_step(obj, 'SingleLargeFile', 'initial-backup') + return step['repo-size'] + + def find_step(self, obj, benchmark_name, step_name): + for step in obj['benchmarks'][benchmark_name]['steps']: + if step['step'] == step_name: + return step + raise Exception('step %s not found' % step) + + def format_size(self, size, unit): + return '%.0f' % (size / unit) + + def show_summaries(self, summaries): + lines = [[title for key, title in self.columns]] + + for s in summaries: + line = [str(s[key]) for key, title in self.columns] + lines.append(line) + + widths = self.compute_column_widths(lines) + + titles = lines[0] + results = sorted(lines[1:]) + for line in [titles] + results: + cells = [] + for i, cell in enumerate(line): + cells.append('%*s' % (widths[i], cell)) + self.output.write(' | '.join(cells)) + self.output.write('\n') + + def compute_column_widths(self, lines): + widths = [] + n = len(lines[0]) + for col in range(n): + width = 0 + for line in lines: + width = max(width, len(line[col])) + widths.append(width) + return widths + + +if __name__ == '__main__': + ObnamBenchmarkSummary().run() diff --git a/obnam-benchmark.1.in b/obnam-benchmark.1.in deleted file mode 100644 index f52ee74c..00000000 --- a/obnam-benchmark.1.in +++ /dev/null @@ -1,133 +0,0 @@ -.\" Copyright 2011 Lars Wirzenius <liw@liw.fi> -.\" -.\" This program is free software: you can redistribute it and/or modify -.\" it under the terms of the GNU General Public License as published by -.\" the Free Software Foundation, either version 3 of the License, or -.\" (at your option) any later version. -.\" -.\" This program is distributed in the hope that it will be useful, -.\" but WITHOUT ANY WARRANTY; without even the implied warranty of -.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -.\" GNU General Public License for more details. 
-.\" -.\" You should have received a copy of the GNU General Public License -.\" along with this program. If not, see <http://www.gnu.org/licenses/>. -.\" -.TH OBNAM-BENCHMARK 1 -.SH NAME -obnam-benchmark \- benchmark obnam -.SH SYNOPSIS -.SH DESCRIPTION -.B obnam-benchmark -benchmarks the -.BR obnam (1) -backup application, -by measuring how much time it takes to do a backup, restore, etc, -in various scenarios. -.B obnam-benchmark -uses the -.BR seivot (1) -tool for actually running the benchmarks, -but makes some helpful assumptions about things, -to make it simpler to run than running -.B seivot -directly. -.PP -Benchmarks are run using two different usage profiles: -.I mailspool -(all files are small), and -.I mediaserver -(all files are big). -For each profile, -test data of the desired total size is generated, -backed up, -and then several incremental generations are backed up, -each adding some more generated test data. -Then other operations are run against the backup repository: -restoring, -listing the contents of, -and removing each generation. -.PP -The result of the benchmark is a -.I .seivot -file per profile, -plus a Python profiler file for each run of -.BR obnam . -These are stored in -.IR ../benchmarks . -A set of -.I .seivot -files can be summarized for comparison with -.BR seivots-summary (1). -The profiling files can be viewed with the usual Python tools: -see the -.B pstats -module. -.PP -The benchmarks are run against a version of -.B obnam -checked out from version control. -It is not (currently) possible to run the benchmark against an installed -version of -.BR obnam. -Also the -.I larch -Python library, -which -.B obnam -needs, -needs to be checked out from version control. -The -.B \-\-obnam\-branch -and -.B \-\-larch\-branch -options set the locations, -if the defaults are not correct. -.SH OPTIONS -.SH ENVIRONMENT -.TP -.BR TMPDIR -This variable -.I must -be set. -It controls where the temporary files (generated test data) is stored. 
-If this variable was not set, -they'd be put into -.IR /tmp , -which easily fills up, -to the detriment of the entire system. -Thus. -.B obnam-benchmark -requires that the location is set explicitly. -(You can still use -.I /tmp -if you want, but you have to set -.B TMPDIR -explicitly.) -.SH FILES -.TP -.BR ../benchmarks/ -The default directory where results of the benchmark are stored, -in a subdirectory named after the branch and revision numbers. -.SH EXAMPLE -To run a small benchmark: -.IP -TMPDIR=/var/tmp obnam-benchmark --size=10m/1m -.PP -To run a benchmark using existing data: -.IP -TMPDIR=/var/tmp obnam-benchmark --use-existing=$HOME/Mail -.PP -To view the currently available benchmark results: -.IP -seivots-summary ../benchmarks/*/*mail*.seivot | less -S -.br -seivots-summary ../benchmarks/*/*media*.seivot | less -S -.PP -(You need to run -.B seivots-summary -once per usage profile.) -.SH "SEE ALSO" -.BR obnam (1), -.BR seivot (1), -.BR seivots-summary (1). @@ -199,7 +199,7 @@ setup(name='obnam', author='Lars Wirzenius', author_email='liw@liw.fi', url='http://liw.fi/obnam/', - scripts=['obnam', 'obnam-benchmark', 'obnam-viewprof'], + scripts=['obnam', 'obnam-viewprof'], packages=['obnamlib', 'obnamlib.plugins', 'obnamlib.fmt_6'], ext_modules=[Extension('obnamlib._obnam', sources=['_obnammodule.c'])], data_files=[('share/man/man1', glob.glob('*.1'))], |
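The summary tool above works by pulling individual steps out of the `benchmark.json` that `obnam-benchmark` writes. A minimal sketch of the same computation (files per second for the initial backup of the EmptyFiles data set), assuming the JSON structure produced by the runner; the numeric values below are invented for illustration:

```python
# A miniature benchmark.json in the structure obnam-benchmark writes
# (durations and sizes here are made up for illustration).
result_obj = {
    'benchmarks': {
        'EmptyFiles': {
            'steps': [
                {'step': 'create-live-data',
                 'empty-files-count': 1000000,
                 'duration': 120.0},
                {'step': 'initial-backup',
                 'duration': 2000.0,
                 'repo-size': 3 * 2**30},
            ],
        },
    },
}


def find_step(obj, benchmark_name, step_name):
    # Locate one step's measurements, as the summary tool does.
    for step in obj['benchmarks'][benchmark_name]['steps']:
        if step['step'] == step_name:
            return step
    raise KeyError('step %s not found' % step_name)


count = find_step(result_obj, 'EmptyFiles', 'create-live-data')['empty-files-count']
duration = find_step(result_obj, 'EmptyFiles', 'initial-backup')['duration']
print('%.0f files/s' % (count / duration))
```

This mirrors the `get_empty_files_speed` calculation in `obnam-benchmark-summary`: the file count is recorded by the `create-live-data` step, while the elapsed time comes from the `initial-backup` step.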