From 675059db2349bea4ebea9ee141046f5d271f4794 Mon Sep 17 00:00:00 2001 From: distix ticketing system Date: Mon, 24 Jul 2017 18:16:36 +0000 Subject: imported mails --- .../Maildir/new/1500920195.M176537P30379Q1.koom | 190 +++++++++++++++++++++ 1 file changed, 190 insertions(+) create mode 100644 tickets/11c32688f6ae4c039e5fef65b5007a88/Maildir/new/1500920195.M176537P30379Q1.koom diff --git a/tickets/11c32688f6ae4c039e5fef65b5007a88/Maildir/new/1500920195.M176537P30379Q1.koom b/tickets/11c32688f6ae4c039e5fef65b5007a88/Maildir/new/1500920195.M176537P30379Q1.koom new file mode 100644 index 0000000..1ae0068 --- /dev/null +++ b/tickets/11c32688f6ae4c039e5fef65b5007a88/Maildir/new/1500920195.M176537P30379Q1.koom @@ -0,0 +1,190 @@ +Return-Path: +X-Original-To: distix@pieni.net +Delivered-To: distix@pieni.net +Received: from yaffle.pepperfish.net (yaffle.pepperfish.net [88.99.213.221]) + by pieni.net (Postfix) with ESMTPS id E1379417FB + for ; Mon, 24 Jul 2017 18:15:59 +0000 (UTC) +Received: from platypus.pepperfish.net (unknown [10.112.101.20]) + by yaffle.pepperfish.net (Postfix) with ESMTP id 83373418A5; + Mon, 24 Jul 2017 19:15:59 +0100 (BST) +Received: from ip6-localhost.nat ([::1] helo=platypus.pepperfish.net) + by platypus.pepperfish.net with esmtp (Exim 4.80 #2 (Debian)) + id 1dZhtf-00054g-FS; Mon, 24 Jul 2017 19:15:59 +0100 +Received: from [10.112.101.21] (helo=mx3.pepperfish.net) + by platypus.pepperfish.net with esmtps (Exim 4.80 #2 (Debian)) + id 1dZhtf-00054V-2J + for ; Mon, 24 Jul 2017 19:15:59 +0100 +Received: from barracuda.pco-inc.com ([71.4.36.131]) + by mx3.pepperfish.net with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) + (Exim 4.89) (envelope-from ) + id 1dZhtd-0005OM-4C + for obnam-support@obnam.org; Mon, 24 Jul 2017 19:15:59 +0100 +X-ASG-Debug-ID: 1500920148-0573a21092342010001-phrF5L +Received: from Loki.pcopen.net ([10.0.0.65]) by barracuda.pco-inc.com with + ESMTP id bZPUn48p0LOZ2zPl (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 + bits=256 
verify=NO); Mon, 24 Jul 2017 11:15:48 -0700 (PDT) +X-Barracuda-Envelope-From: lperkins@openeye.net +Received: from LOKI.pcopen.net ([fe80::39f5:aaff:14af:6002]) by + Loki.pcopen.net ([fe80::39f5:aaff:14af:6002%10]) with mapi id 14.03.0351.000; + Mon, 24 Jul 2017 11:15:49 -0700 +From: "Laurence Perkins (OE)" +To: "liw@liw.fi" +Thread-Topic: Variable Chunksize +X-ASG-Orig-Subj: Re: Variable Chunksize +Thread-Index: AQHTALO1js8PRLWMaEGpkNs8UOGlYqJb6R8AgAGEm4CAAvDXAIADVVeAgAACZYCAAA9cAA== +Date: Mon, 24 Jul 2017 18:15:48 +0000 +Message-ID: <1500920142.13826.15.camel@openeye.net> +References: <1500484994.13826.5.camel@openeye.net> + <20170719181232.sdqihqdqldsgzmtd@liw.fi> + <1500571405.13826.8.camel@openeye.net> + <20170722141756.yzxatuvogrdsh4jv@liw.fi> + <1500916329.13826.13.camel@openeye.net> + <20170724172043.s2ykrfwcusyzdcgd@liw.fi> +In-Reply-To: <20170724172043.s2ykrfwcusyzdcgd@liw.fi> +Accept-Language: en-US +Content-Language: en-US +X-MS-Has-Attach: yes +X-MS-TNEF-Correlator: +x-originating-ip: [10.0.50.60] +MIME-Version: 1.0 +X-Barracuda-Connect: UNKNOWN[10.0.0.65] +X-Barracuda-Start-Time: 1500920148 +X-Barracuda-Encrypted: ECDHE-RSA-AES256-SHA384 +X-Barracuda-URL: https://10.0.0.6:443/cgi-mod/mark.cgi +X-Barracuda-Scan-Msg-Size: 2537 +X-Virus-Scanned: by bsmtpd at pco-inc.com +X-Barracuda-BRTS-Status: 1 +X-Barracuda-Spam-Score: 0.00 +X-Barracuda-Spam-Status: No, SCORE=0.00 using global scores of TAG_LEVEL=1000.0 + QUARANTINE_LEVEL=1000.0 KILL_LEVEL=5.0 tests= +X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.41254 + Rule breakdown below + pts rule name description + ---- ---------------------- -------------------------------------------------- +X-Pepperfish-Transaction: 05d4-030d-7dc3-9bf6 +X-Spam-Score: -1.9 +X-Spam-Score-int: -18 +X-Spam-Bar: - +X-Scanned-By: pepperfish.net, Mon, 24 Jul 2017 19:15:59 +0100 +X-Spam-Report: Content analysis details: (-1.9 points) + pts rule name description + ---- ---------------------- 
-------------------------------------------------- + -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% + [score: 0.0000] +X-ACL-Warn: message may be spam +X-Scan-Signature: 7caf9b1c1dd65fb728edbedec5ffc17f +Cc: "obnam-support@obnam.org" +Subject: Re: Variable Chunksize +X-BeenThere: obnam-support@obnam.org +X-Mailman-Version: 2.1.5 +Precedence: list +List-Id: Obnam backup software discussion +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +Content-Type: multipart/mixed; boundary="===============4633542379024154335==" +Mime-version: 1.0 +Sender: obnam-support-bounces@obnam.org +Errors-To: obnam-support-bounces@obnam.org + +--===============4633542379024154335== +Content-Language: en-US +Content-Type: multipart/signed; micalg=pgp-sha512; + protocol="application/pgp-signature"; boundary="=-o1kvqKOW1q8lnQliKXtT" + +--=-o1kvqKOW1q8lnQliKXtT +Content-Type: text/plain; charset="UTF-7" +Content-Transfer-Encoding: quoted-printable + + + +On Mon, 2017-07-24 at 20:20 +-0300, Lars Wirzenius wrote: ++AD4 On Mon, Jul 24, 2017 at 05:12:15PM +-0000, Laurence Perkins (OE) ++AD4 wrote: ++AD4 +AD4 Smaller chunk size makes deduplication more precise regardless of ++AD4 +AD4 the ++AD4 +AD4 type of splitting, but it should generate some pretty big savings ++AD4 +AD4 on ++AD4 +AD4 similar data without reducing the chunk size because it will be ++AD4 +AD4 better ++AD4 +AD4 at finding identical chunks of data since it's not relying on the= +m ++AD4 +AD4 being at fixed offsets. ++AD4=20 ++AD4 If you have actual measurements of this, please report them. Some ++AD4 years ago when this idea first came up in the Obnam context, using ++AD4 the proposed type of chunking without reducing chunk size ++AD4 significantly didn't much help in de-duplication. Only when the ++AD4 average chunk size became much smaller, did de-duplication get a lot ++AD4 better, but then the number of chunks became a problem. ++AD4=20 ++AD4 Guessing isn't helpful here.
Even if it were, now is not a good time ++AD4 for me to spend any time on this, and I don't want to even consider a ++AD4 patch for this until green albatross is in shape. ++AD4=20 + +Mathematically, it's going to depend a lot on your dataset. You will +never see any benefit on files smaller than two chunks, nor on files +where new data is simply appended to the end. This is probably the +kind of data most users have at this point (along with data that is +never modified), so it's definitely not worth diverting your attention +away from completing green albatross. =20 + +However, when backing up sparse disk images, cloned VMs, chroot +tarballs, certain kinds of database file, or anything that spans +multiple chunks and is routinely subjected to random insertions, I +commonly see fixed-chunk tools (including Obnam) go quickly +until they hit the first inserted bit of data, and then proceed to re- +transfer the entire rest of the file because all the chunks are off.=20 +For one of my machines, this often results in hundreds of gigabytes of +unnecessary data transfer per backup run. This is probably not +representative of the typical Obnam user, but there have been a few +people asking for advice about how to optimize such things on the +mailing list, so I doubt I'm the only one. + +I strongly suspect that all the extra hashing is going to crater +performance regardless, but I'll give it a shot. Since the repo +formats are chunksize-agnostic, impact on other sections of the program +should be virtually nil.
+--=-o1kvqKOW1q8lnQliKXtT +Content-Type: application/pgp-signature; name="signature.asc" +Content-Description: This is a digitally signed message part +Content-Transfer-Encoding: 7bit + +-----BEGIN PGP SIGNATURE----- + +iQIzBAABCgAdFiEEFbYe3ereZkZxAoz7C4CSuysVUSAFAll2OU4ACgkQC4CSuysV +USDS5Q/5AZrLSzkyGO2pgYsYul+uxC4GfAOWWOSQjBcMjyRPWG+NY8lytJtHPm+F +nTsKiCPVSh9NMhW3C/Mpho+LcYBLF4Vy8sMhnQ5MPI41sBSOXnKr4+b1Ed5hLLeD +LgfuMZECoDvy4AhESaafp4AC1YmfGJuCAnG0R65IpcjkncKo2u/wiG/4x3HWTFI9 +6CpwK0Z2U8TzntMayLsWkZ5CQC6DnzScwCXAupqgZIoERJLMXiZGl4sjXNtGAva9 +NtcDFiJXtOfIutyyAjmfX28e3LvN1EaRo706UBXSNQns0ZWkhqj2geRr/g+1JF7u +HQpstObdryM94Y0QpRf70dNEDh+nlRm/vm+iEZwP/zgTBSK489ieUFghn+UhLSuC +1Y2h69gmLUXq2p0wxrG/Sdo/6t1UbhOLq3PnfVbEdzfzZA4/00vKGkFi29mJZkGY +NOcIaozz41lkI9SHl/FrCmlS92muEYoyYLyvjzsWD0Ee/bFj5GXnQ0i6JTTDSHZ+ +i5EiyvAHc5mjgucqqIBt6LfdGVj9/o9t59qvwUkhvVCN8PbInkYNRy1rmIYUJaEu +xBPL7/BVWl6ec056rHACUuCKwwI9GwrJZJwx/hjXtILLwjL+JB5bd+wzaBZ19Wyy +l9Q1worhQ/hd+aPuh2vVznyO7ZMDaEFp0YaVwOhLhkgVb/bZfaU= +=GD/e +-----END PGP SIGNATURE----- + +--=-o1kvqKOW1q8lnQliKXtT-- + + +--===============4633542379024154335== +Content-Type: text/plain; charset="us-ascii" +MIME-Version: 1.0 +Content-Transfer-Encoding: 7bit +Content-Disposition: inline + +_______________________________________________ +obnam-support mailing list +obnam-support@obnam.org +http://listmaster.pepperfish.net/cgi-bin/mailman/listinfo/obnam-support-obnam.org + +--===============4633542379024154335==-- + -- cgit v1.2.1