author | Lars Wirzenius <liw@liw.fi> | 2010-04-25 06:09:23 +1200 |
---|---|---|
committer | Lars Wirzenius <liw@liw.fi> | 2010-04-25 06:09:23 +1200 |
commit | dc39bf76e7255d4a022216cc96c7bc0a9b1679fc (patch) | |
tree | bb8b2108b34b532358837140349bb119223268a2 /dupfiles | |
parent | 12db03c7f917f6d53f2b8990af8ac55ffe21b4f0 (diff) | |
download | dupfiles-dc39bf76e7255d4a022216cc96c7bc0a9b1679fc.tar.gz |
Do not follow symlinks when statting.
Report all hardlinks to the same file as duplicates.
This is probably stupid, but avoids a bug: if foo and bar
are hardlinks to the same inode, and foobar is not, but
has identical content, then previously it was arbitrary
whether foo or bar was the one reported as a duplicate of
foobar. Further, only that one of foo and bar would be made
into a hardlink with foobar. So the next run would report
the other one as a duplicate.
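The foo/bar/foobar situation described above can be reproduced with a small sketch. The helper names `file_identity` and `build_scenario` are hypothetical and are not part of dupfiles; the sketch only assumes a filesystem that supports hardlinks.

```python
import os
import tempfile


def file_identity(pathname):
    """Return (device, inode) for pathname without following symlinks."""
    st = os.lstat(pathname)
    return (st.st_dev, st.st_ino)


def build_scenario(dirname):
    """Create foo, bar (a hardlink to foo), and foobar (a separate copy)."""
    foo = os.path.join(dirname, "foo")
    bar = os.path.join(dirname, "bar")
    foobar = os.path.join(dirname, "foobar")
    with open(foo, "w") as f:
        f.write("same content\n")
    os.link(foo, bar)  # bar shares foo's inode
    with open(foobar, "w") as f:
        f.write("same content\n")  # identical bytes, but a different inode
    return foo, bar, foobar


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        foo, bar, foobar = build_scenario(d)
        # foo and bar are the same file; foobar only has the same content.
        print(file_identity(foo) == file_identity(bar))
        print(file_identity(foo) == file_identity(foobar))
```

Before this commit, only the first of foo and bar to be visited was kept as a candidate, so which one ended up paired with foobar depended on traversal order.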
Diffstat (limited to 'dupfiles')
-rwxr-xr-x | dupfiles | 8 |
1 files changed, 2 insertions, 6 deletions
@@ -58,14 +58,10 @@ class DuplicateFileFinder(object):
             subdirs.sort()
             pathnames = [os.path.join(dirname, f) for f in filenames]
             for pathname in pathnames:
-                stat = os.stat(pathname)
+                stat = os.lstat(pathname)
                 t = (stat.st_dev, stat.st_ino, pathname)
                 if stat.st_size in self.by_size:
-                    for dev, ino, pathname in self.by_size[stat.st_size]:
-                        if stat.st_dev == dev and stat.st_ino == ino:
-                            break
-                    else:
-                        self.by_size[stat.st_size].append(t)
+                    self.by_size[stat.st_size].append(t)
                 else:
                     self.by_size[stat.st_size] = [t]
         self.progress.finished()
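As a rough illustration of the behaviour after this change, here is a standalone sketch of the size-grouping step. The function name `group_by_size` and the use of `os.walk` and `dict.setdefault` are assumptions made for the example; the real code lives in a method of `DuplicateFileFinder` and stores its results in `self.by_size`. Like the hunk shown, the sketch does not filter out symlinks; it only avoids following them.

```python
import os


def group_by_size(root):
    """Map file size -> list of (st_dev, st_ino, pathname) tuples.

    Every pathname gets its own entry, so all hardlinks to one inode
    are kept as candidates instead of only the first one seen.
    """
    by_size = {}
    for dirname, subdirs, filenames in os.walk(root):
        subdirs.sort()
        pathnames = [os.path.join(dirname, f) for f in filenames]
        for pathname in pathnames:
            st = os.lstat(pathname)  # lstat: do not follow symlinks
            t = (st.st_dev, st.st_ino, pathname)
            # Append unconditionally; no check for an already-seen inode.
            by_size.setdefault(st.st_size, []).append(t)
    return by_size
```

Entries that share both `st_dev` and `st_ino` are hardlinks to one file, so a later comparison step could still recognise them without reading the contents.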