|
This bug was exposed by a test data set provided by Rob Kendrick.
Saving reference count groups would sometimes overwrite a correct
saved group with one that had all counts set to zero. This would
happen when the "dirty set" (node ids whose refcounts had been changed
but not yet saved) contained, for example, node ids for the Nth and
(N+2)th groups, but not the (N+1)th group between them. The old
save_refcounts logic would blithely save the (N+1)th group anyway, and
if that group's reference counts happened not to be in the in-memory
refcount dict, it would save zeroes instead.
To fix this, save_refcounts now only saves groups that have any dirty
refcounts, and skips saving a refcount group that is all clean.
To do this efficiently, the encode_refcounts function signature has
changed: it now takes the set of keys it is to actually put into the
group. This set is computed by save_refcounts.
This means that all call sites for encode_refcounts need to be fixed
as well. Luckily the fix is easy: there's only one production use of
it; the rest are tests or benchmarks.
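A rough sketch of the idea, not the actual code: the dirty node ids
are bucketed by group, and only groups with at least one dirty id are
written. Everything here is a simplifying assumption made for
illustration: the group size of 4, the dirty_groups helper, and the
plain dict "saved" standing in for on-disk storage. The encoding step
is left out entirely.

```python
def dirty_groups(dirty, group_size):
    """Map each group's start id to the dirty node ids inside it."""
    groups = {}
    for node_id in dirty:
        start = (node_id // group_size) * group_size
        groups.setdefault(start, set()).add(node_id)
    return groups


def save_refcounts(refcounts, dirty, saved, group_size=4):
    """Save only groups that have dirty refcounts; skip clean ones.

    'saved' is a hypothetical stand-in for persistent storage: a dict
    mapping a group's start id to a list of group_size counts.
    """
    for start, keys in sorted(dirty_groups(dirty, group_size).items()):
        group = saved.setdefault(start, [0] * group_size)
        # Only the dirty keys are put into the group; other slots
        # keep whatever was saved before.
        for node_id in keys:
            group[node_id - start] = refcounts.get(node_id, 0)
    dirty.clear()
```

With dirty ids in the groups starting at 0 and 8 but none in the
group starting at 4, the middle group is never touched, so its
previously saved counts survive.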
|