Magento is a great platform for building powerful e-commerce stores. However, for developers with little experience of the platform, it’s easy to run into problems that can be pretty daunting to resolve.

If you use Magento, you will know that a great way to import products is the CSV dataflow import. A common issue with the dataflow import, however, is the image import. Every time you import a CSV, Magento copies images from /media/import into the /media/catalog/product folder, which is perfect, as all the images are resized to the correct dimensions as well. The problem arises when you import or update products that are already in the database. If you haven’t updated the image location in the CSV to the cached version, Magento won’t replace the existing images; instead it imports the same image again and creates a copy named filename_2.jpg and so on. After multiple imports of the same products, the product pages show the same image several times. Annoying, right?

Not anymore.

Here is a small script to take care of duplicate images in a product. The script goes through every product and automatically deletes any duplicate images it finds, comparing images by the MD5 hash of their file contents and always keeping the base image. Save the code into a file called delete.php (make sure to add the PHP opening tag at the top; the closing tag is optional in a file that contains only PHP), upload it to your website’s root folder, and run it from your browser at yourbaseurl/delete.php

require_once 'app/Mage.php';
Mage::app()->setCurrentStore(Mage_Core_Model_App::ADMIN_STORE_ID);
error_reporting(E_ALL | E_STRICT);
Mage::setIsDeveloperMode(true);
ini_set('display_errors', 1);
ob_implicit_flush(true);

$mediaApi  = Mage::getModel('catalog/product_attribute_media_api');
$_products = Mage::getModel('catalog/product')->getCollection();
$mediaDir  = Mage::getBaseDir('media') . '/catalog/product';
$i     = 0;
$total = count($_products);
$count = 0; // number of duplicate images removed

foreach ($_products as $_prod) {
    // Load the full product so its media gallery is available
    $_product = Mage::getModel('catalog/product')->load($_prod->getId());
    $_md5_values = array(); // hashes of images already seen for this product

    // Protect the base image: record its hash first so it is never removed
    $base_image = $_product->getImage();
    if ($base_image && $base_image != 'no_selection') {
        $filepath = $mediaDir . $base_image;
        // is_file() rather than file_exists(), so a directory is never hashed
        if (is_file($filepath)) {
            $_md5_values[] = md5_file($filepath);
        }
    }

    $i++;
    echo "\r\n processing product $i of $total ";

    // Loop through the product's gallery images
    $_images = $_product->getMediaGalleryImages();
    if ($_images) {
        foreach ($_images as $_image) {
            // Skip the base image itself
            if ($_image->getFile() == $base_image) {
                continue;
            }

            $filepath = $mediaDir . $_image->getFile();
            if (!is_file($filepath)) {
                continue;
            }
            $md5 = md5_file($filepath);

            if (in_array($md5, $_md5_values)) {
                // Same content as an image already seen: remove it
                $mediaApi->remove($_product->getId(), $_image->getFile());
                echo "\r\n removed duplicate image from " . $_product->getSku();
                $count++;
            } else {
                $_md5_values[] = $md5;
            }
        }
    }
}

echo "\r\n done: removed $count duplicate image(s) ";
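
The heart of the script above is the MD5 comparison, and that part can be tried on its own without Magento. Here is a minimal sketch of the idea. The helper find_duplicate_files() and the demo file names are made up for illustration and are not part of Magento; the helper takes a list of file paths and returns the ones whose contents match an earlier file in the list.

```php
<?php
// Hypothetical helper, not part of Magento: returns the paths whose
// contents have the same MD5 hash as an earlier path in the list.
function find_duplicate_files(array $paths)
{
    $seen = array();       // md5 hash => first path seen with that content
    $duplicates = array(); // paths that repeat an earlier file's content
    foreach ($paths as $path) {
        if (!is_file($path)) {
            continue; // skip missing files and directories
        }
        $md5 = md5_file($path);
        if (isset($seen[$md5])) {
            $duplicates[] = $path;
        } else {
            $seen[$md5] = $path;
        }
    }
    return $duplicates;
}

// Demo: two files with identical content and one different file
$dir = sys_get_temp_dir() . '/dedupe_demo_' . uniqid();
mkdir($dir);
file_put_contents("$dir/shirt.jpg", 'same bytes');
file_put_contents("$dir/shirt_2.jpg", 'same bytes');
file_put_contents("$dir/hat.jpg", 'other bytes');

$duplicates = find_duplicate_files(array(
    "$dir/shirt.jpg",
    "$dir/shirt_2.jpg",
    "$dir/hat.jpg",
));
print_r(array_map('basename', $duplicates));
```

Only shirt_2.jpg is reported, because shirt.jpg comes first in the list and is treated as the original. The Magento script gets the same effect by hashing the base image before any gallery image. The sketch keys an array by hash instead of calling in_array(), which is a small design choice that keeps lookups fast when a product has many images.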

NOTE: This script is provided as is. I highly recommend taking a backup before you run it, and you run it at your own risk. Thank you.