Friday, January 29, 2010

Find Duplicate Photos in an iPhoto Library

03/04/2011 - Update: Fixed problems with invalid paths and quoting in the service.
06/23/2016 - Update: Updated the download link. I'm not really sure whether this still works with newer versions of iPhoto or Photos, though.

There are many shareware tools for finding duplicate pictures in an iPhoto library, but this should be a simple operation and honestly shouldn't require a fee. So I created a simple Service with Automator to solve the problem.

This service is very straightforward: select your iPhoto library in Finder and go to Services->Find Duplicates in iPhoto.

The service will do the following steps:
  • Calculate the MD5 checksum of each original picture in the library. The checksum is taken from the original backup of the picture, so an edited version will still get caught as a duplicate.
  • Sort the list and find any pictures that have the same checksum.
  • Add any duplicates found to a new Album named Duplicates. (I had intended to use a keyword but this turned out to be simpler.)
You can then view the Album and flag and delete photos at your discretion.
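
The first two steps above can be sketched roughly as follows. This is a minimal Python sketch, not the service's actual code; the function names `md5_of_file` and `find_duplicates` are made up for illustration, and the step that adds the matches to an iPhoto album is left out.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(originals_dir):
    """Group every file under originals_dir by checksum.

    Returns a list of groups; each group is a sorted list of two or
    more paths whose contents are byte-for-byte identical.
    """
    by_checksum = defaultdict(list)
    for root, _dirs, names in os.walk(originals_dir):
        for name in names:
            path = os.path.join(root, name)
            by_checksum[md5_of_file(path)].append(path)
    return [sorted(paths) for paths in by_checksum.values() if len(paths) > 1]
```

Calling `find_duplicates` on the library's Originals folder would return one list per set of identical pictures. Grouping in a dictionary stands in for the "sort and compare neighbours" step; both find the same matches.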

Notes:
  • No files are actually deleted, just flagged.
  • The way iPhoto works, choosing delete will just remove the photo from the album, not from iPhoto. To get around this, select the photos in the album and then switch to the all Pictures view. Your selection will remain, and you can then delete the selected photos.

Install Instructions:

28 comments:

Dave & Collette said...

"Extract the service into the '~Home/Library/Services/' folder."

Could you further explain how to do this? My '~Home/Library/' folder doesn't have a 'Services/' folder. The only Library that does is in the 'System/' folder.

If this is it, do you drag & drop it into the Services folder?

Thanks!

Jeremy Pyne said...

It was slightly mistyped; it's actually under ~/Library/Services (Home -> Library -> Services). If the directory does not exist you can just create it. Another option is to extract the file to /Library/Services, which will install the service for all users.

Just drag the downloaded service into either of those folders. (Not the zip file itself, the workflow within it.)

sjohnso said...

Sounds like this will be pretty nice. However, the link to the zip file doesn't seem to work. Could you resolve the issue or provide an alternate means of download?

Thanks!!

Natashka said...

This is what I really need, thank u so much :) But the thing is I don't get how u activate the service. I don't get the following step: "simply select your iPhoto library in finder and go to Services->Find Duplicates in iPhoto."
What do u have to do specifically?!

Jeremy Pyne said...

You have to select your iPhoto Library in Finder and then run the service. The default Library should be saved in your "HOME FOLDER/Pictures/iPhoto Library". If you are unsure whether you found the correct file, you can examine the details; the Kind field will be set to "iPhoto Library". Once you find the library itself, you can just right-click on it and select Find Duplicates in iPhoto.

Tiger said...

Hi Jeremy,

I was excited to find this automator service. I have not had success with the trial of Duplicate Annihilator and don't have confidence in it.

I have used Automator services and folder actions successfully in the past, and had no trouble getting this one set up. I select the library, right click, choose "Find Duplicates in iPhoto". I notice the spinning gear icon on the task bar indicating it is running, and a new Album folder gets added to my iPhoto library called "Duplicates". So far so good -- but the Duplicates folder is always empty, even when I know there are duplicate files there.

Using iPhoto '09 (8.1.2 (424)) on OS X 10.6.4

Am I missing something? I appreciate your efforts and I would really like to get this working :-)

Cheers,
Tiger

Jeremy Pyne said...

The problem you are probably seeing is that this service only does a very simple checksum comparison of the images. That is, it will only flag images that are bit-for-bit identical: any change in size, difference in format, or edit will cause them to show as different files. (File name changes and non-destructive edits in iPhoto will, however, still appear as duplicates.)

You can easily test if the script is in fact working by importing the same image twice and then running the action.

I have tried to find a better comparison than simple checksums, but the only ideas I have had so far require much more complex coding and produce many false positives. (I have tried comparing median colors, image deltas, gray-scales, and thumbnails in the past with no luck.)

Charles said...

The link to the zip file seems to be broken. Is there another location?

Craig Pitout said...

Hi Jeremy,

Any chance you could compare a number of factors from the EXIF data?

Cheers

The Doggie Style Team said...

Hi Jeremy,

I'm having the same problem as Tiger and I've tried importing the same photo twice and running the script but I still end up with an empty Duplicates Album.

Any ideas as to what might be happening?

Brad said...

I am having the same problem. The folder is created but it's empty.

Jeremy Pyne said...

I've noticed some problems myself. I'll take a look when I get back from vacation.

Bob said...

I'm seeing the same empty folder issue... Snow Leopard/iPhoto 09. Not sure when I applied the last iPhoto update, so can't say for sure if that caused the issue or not.

Mike said...

I get the album as well but no duplicates... Any solutions or ideas?

Paul said...

I had terrible problems with duplicates in iPhoto, 10,000 images with about 50% duplicates. After trying several solutions I found one that works well.

It's a bit of a pain, but really the other options are not working. I even bought software like "duplicate annihilator" but even that won't catch everything.

Here is a sure fire way to eliminate duplicates, even if you have multiple iphoto libraries you want to merge.

1) right-click iPhoto library "show package contents" & copy "originals" & "modified" folders to another location. (If you have other iPhoto libs, do the same with those too.)

2) Download Apple's "Developer Tools" & use the included application "FileMerge" to merge your modified & original folders. (If you have other iPhoto libs, first merge all your "modified" folders into one & all your "original" folders into one.)

3) Finally do a by-hand check of the folders inside. You may have "2010/Spain Holiday" & "2010/Spain Holiday2" you can manually drag the photos from holiday2 into holiday & then delete holiday2 folder.

4) Now you should be good. Create a ZIP file or backup of your original iPhoto Library. Open iPhoto & delete everything. Then import your newly created merged folder. Check it over, if everything is OK you are safe to delete the ZIP backup you made of the iPhoto library.

mitchbvi said...

Jeremy

I have a number of iPhoto libraries. For those I have upgraded to iLife '11, the service will not add the album. Any thoughts on this? Thanks.

Mitch

Arnaud said...

Really good script. It helped me a lot. There's just one error in it, which explains why it doesn't work: you didn't quote the directory path in the find command, so it fails for "iPhoto Library" (the path contains a space).
If you replace the line
`find $dir/Originals -type file -print0 | xargs -0 md5 -r > $file`;
by
`find "$dir/Originals" -type file -print0 | xargs -0 md5 -r > $file`;
the script works
Thx.
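
To see why the unquoted path fails, here is a small Python illustration of how a shell splits the two commands into arguments. `shlex.split` follows POSIX shell word-splitting rules; the library path is a hypothetical example, and `-type f` is used here as the portable spelling of the file test.

```python
import shlex

# Hypothetical library location; the space in "iPhoto Library" is the problem.
library = "/Users/me/Pictures/iPhoto Library"

unquoted = "find " + library + "/Originals -type f"
quoted = "find " + shlex.quote(library + "/Originals") + " -type f"

# Without quotes the path is split in two, so find looks for two
# directories that do not exist.
print(shlex.split(unquoted))
# ['find', '/Users/me/Pictures/iPhoto', 'Library/Originals', '-type', 'f']

# With quotes the path stays a single argument.
print(shlex.split(quoted))
# ['find', '/Users/me/Pictures/iPhoto Library/Originals', '-type', 'f']
```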

ejannett said...

Hi,

Thanks a lot for this. Like others here, I've only found non-free tools for such a common feature.

Following your approach I've tried a faster version which should scale better on large libraries.

============================
#!/usr/bin/perl -w

use strict;
use File::Find;
use Digest::MD5;

# Directory to scan, e.g. the library's Originals folder.
my $dir = shift or die "usage: $0 <directory>\n";

# Map of file size => list of files with that size.
my %candidates = ();

# Group every regular file by size; only same-size files can be duplicates.
sub is_wanted_file {
    if ( -f $File::Find::name ) {
        my $size = -s $File::Find::name;
        push @{ $candidates{$size} }, $File::Find::name;
    }
}

# Return the hex MD5 digest of a file's contents.
sub getMD5 {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!";
    binmode($fh);
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}

find({ wanted => \&is_wanted_file, no_chdir => 1, follow => 1 }, $dir);

foreach my $size (keys %candidates) {
    my @files = @{ $candidates{$size} };
    next unless @files > 1;

    # Take one file at a time and split the rest into matches and non-matches.
    while (@files > 0) {
        my $file     = pop(@files);
        my $digest   = getMD5($file);
        my @sames    = ();
        my @notsames = ();
        foreach my $f (@files) {
            if ($digest eq getMD5($f)) {
                push @sames, $f;
            } else {
                push @notsames, $f;
            }
        }
        # Print the file together with every file that matches it.
        if (@sames > 0) {
            print "$file\n";
            print "$_\n" for @sames;
        }
        @files = @notsames;
    }
}
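
The idea behind the script above (compare file sizes first, and only checksum files that share a size) can also be sketched in Python. This is an illustrative sketch rather than a port: the function names are made up, and unlike the Perl version it hashes each candidate file only once.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicates_by_size_then_md5(top):
    """Return groups of identical files under top, hashing as few files as possible."""
    # Pass 1: group by size, which needs no file reads at all.
    by_size = defaultdict(list)
    for root, _dirs, names in os.walk(top):
        for name in names:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: checksum only files that share their size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_digest = defaultdict(list)
        for path in paths:
            by_digest[file_md5(path)].append(path)
        groups.extend(sorted(g) for g in by_digest.values() if len(g) > 1)
    return groups
```

On a large library most files have a unique size, so the expensive checksum step is skipped for them.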

Doug said...

Is there a good way to just find duplicate names and change them, i.e. photos from different cameras or people that have the same name, e.g. IMG_0505?

The iPad doesn't like to import duplicate names, although iPhoto has no problem with them.

I would just like to find all the duplicate names so I can change them
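
Listing the clashing names can be sketched in a few lines of Python. This is only an illustration: `duplicate_names` is a made-up helper, comparing names case-insensitively is an assumption, and the sketch only reports clashes; it does not rename anything in the library.

```python
import os
from collections import defaultdict


def duplicate_names(top):
    """Map each file name used more than once under top to its sorted paths.

    Names are compared case-insensitively (an assumption), so
    IMG_0505.JPG and img_0505.jpg count as the same name.
    """
    by_name = defaultdict(list)
    for root, _dirs, names in os.walk(top):
        for name in names:
            by_name[name.lower()].append(os.path.join(root, name))
    return {name: sorted(paths) for name, paths in by_name.items() if len(paths) > 1}
```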

rossjudson said...

If you're actually trying to delete duplicates, the script is useful if you modify it just a bit, so it prints out only the duplicate file names, not the original. It will then build a Duplicate album that contains just the second or later copy of given photos.

Deleting or putting these in the trash doesn't do the job, though, because that leaves a copy somewhere in iPhoto's events.

You can take advantage of the fact that in iPhoto, a picture can only be in one Event at a time. Bring up the Duplicates album that was created (with the second and later copy of the images), select them all, and flag them (make sure you don't have anything else flagged, beforehand). Then create an event, from the flagged pictures. The duplicates are all gathered into an event that has nothing but duplicates in it (and they're removed from wherever they are). Give the newly created event a name like "duplicate pics".

Delete the duplicates album the script created. Check out the duplicate pics event and make sure you really want to delete them. Drag the event to the trash (the iPhoto trash), and click on the trash and empty it.

At that point you've actually removed and deleted all of the duplicate pics.
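
The "print only the second and later copies" modification can be sketched like this in Python. The helper name `extra_copies` is hypothetical, and treating the first path in sorted order as the one to keep is an arbitrary assumption; the checksum alone cannot tell which copy is the "original".

```python
def extra_copies(groups):
    """From each group of identical files, keep one path and return the rest.

    groups is a list of lists of paths with identical contents. The first
    path in sorted order is kept (an arbitrary choice); all others are
    returned as candidates for the Duplicates album.
    """
    extras = []
    for group in groups:
        ordered = sorted(group)
        extras.extend(ordered[1:])
    return extras
```

For example, `extra_copies([["b.jpg", "a.jpg", "c.jpg"], ["x.jpg", "y.jpg"]])` returns `["b.jpg", "c.jpg", "y.jpg"]`.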

Jeremy Pyne said...

Yes, I intentionally add both versions to the Duplicates album, as one may have more metadata associated with it, so I don't want to assume one is trash and just throw it out. Also, in the newest version of iPhoto you can right-click and explicitly trash a photo, and it will be removed from iPhoto, not just from the album. (In the older version you had to mark all the photos and switch back to the main photos view to delete them.)

As for name dups, we could locate these with a script, but there's really no way to change the names programmatically without doing a raw SQL operation, which I wouldn't advise as it could break the whole library.

Christian Albert Mueller said...

Hi,
I can just confirm what rossjudson said.
I started your script and was very happy to see all the duplicates in a new album, just to realize... it's NOT only duplicates, it's also the originals...
So I would have to go through nearly 6000 (!) images and delete them manually. That's not the way. If it were just the duplicates, I would not mind and would just click trash... so I can't use the script. It would be great if you could make that fit.
anyhow thanks for the try
greets
chris

lauren said...

I am also facing the same issue as Christian Albert Mueller: the listing shows the originals along with the duplicates. Can you please work to resolve this? I appreciate your efforts to help for free. Thanks

JJ said...

Thank you for this! it works! Simply works!

mwildcard said...

I've got an error:

An error occurred while converting "com.apple.cocoa.string" to "com.apple.iphoto.photo-object."

Well, really the error is "Workflow has stopped," etc., but when I click "show workflow" and run it again as a workflow with a "Get selected finder items" action added (since it's being run as a workflow, not a service), then I get the above error.

I am good with Automator, decent with AppleScript, and once upon a time I did do programming in Perl... but that was years ago and only for about six months.

Can anyone help with this?

(If it makes any difference, the iPhoto Library isn't in the default location on my computer; it's in the "Shared" folder so that my wife can view the same photos in iPhoto from her login.)

Aaron said...

This is a great service, with one catch for my particular iphoto library:
iPhoto has added metadata to the pix I've imported.
Can anyone provide a modified script that md5()s the image data only, stripping out header info?

That would be super helpful.

Aaron said...

OK, I figured it out:
Instead of using find | xargs to build /tmp/pictures.txt, we're going to use exiftool and a Perl loop. There's probably a better way to do this, but here's what I got:

replace this line:
`find "$dir/Originals/" -type file -print0 | xargs -0 md5 -r > $file`;

with this:
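@files = `find "$dir" -type file`;
open (FILE, ">$file") or die "Cannot write $file: $!";
foreach $myfile (@files) {
    chomp($myfile);
    # Hash the image with all metadata stripped, so only image data is compared.
    my $md5 = `exiftool -o - -all= "$myfile" | md5`;
    chomp($md5);
    my $line = "$md5 $myfile";
    print "$line\n";    # progress output
    print FILE "$line\n";
}
close(FILE);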

Seems like it's going to take significantly longer, since we're piping everything through exiftool, but it should identify duplicates regardless of EXIF data.
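
The same metadata-stripping checksum could be sketched in Python as below. It assumes the `exiftool` command is installed and on the PATH, and reuses the `-o - -all=` arguments from the Perl loop above; the function names are made up for illustration. Passing the arguments as a list avoids the shell, so paths with spaces or quotes need no escaping.

```python
import hashlib
import subprocess


def strip_metadata(path):
    """Return the file's bytes with all metadata removed.

    Assumes exiftool is installed; "-o -" writes the result to stdout
    and "-all=" deletes every metadata tag.
    """
    result = subprocess.run(
        ["exiftool", "-o", "-", "-all=", path],
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def content_md5(path, strip=strip_metadata):
    """Return the MD5 of a picture's contents after metadata is stripped.

    strip can be swapped for another function taking a path and
    returning bytes, e.g. for testing without exiftool.
    """
    return hashlib.md5(strip(path)).hexdigest()
```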

MadDog said...

I got this error running the Workflow on Mac OS X 10.7.3 Lion

Conversion from Text to iPhoto photos failed - 1 error

An error occurred while converting "com.apple.cocoa.string" to "com.apple.iphoto.photo-object"

Please help!

Best Regards.