DupShelf

How duplicate detection works

DupShelf finds exact duplicate photos on your computer, and the scan runs inside the browser tab you opened. Your library is not uploaded to our servers for scanning. This page explains the scan flow, what SHA-256 matching means, how folder permissions work, and how to verify privacy yourself.

Step-by-step workflow

  1. Add images from your computer

    In Chrome or Edge on desktop, choose a folder to scan nested subfolders, or add files manually with drag-and-drop or paste. Safari and Firefox support file batches but not full-folder scan.

  2. Scan locally with SHA-256

    DupShelf reads each supported image file on your device, computes a SHA-256 hash of the raw bytes, and groups files that share the same hash. You can track progress or cancel during long scans.

  3. Review groups and pick keepers

    Open each duplicate group, preview thumbnails, and mark one keeper per group. The app estimates space you can recover before you commit to any action.

  4. Move or export, then delete on your schedule

    Move non-keepers into the dupshelf-duplicate-images folder inside your library, or download a CSV. DupShelf never deletes files; you remove them yourself in Finder or Explorer.
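
Steps 3 and 4 can be sketched in a few lines of TypeScript. This is only an illustration, not DupShelf's actual code: the DuplicateGroup shape, the function names, and the CSV column layout are all assumptions made for the example.

```typescript
// Hypothetical data shape: one group of byte-identical files with a chosen keeper.
interface DuplicateGroup {
  hash: string;      // SHA-256 hex digest shared by every file in the group
  sizeBytes: number; // the same for all files, since their bytes are identical
  paths: string[];   // every file in the group, keeper included
  keeper: string;    // the one path the user marked to keep
}

// Space recoverable if every non-keeper is eventually deleted.
function recoverableBytes(groups: DuplicateGroup[]): number {
  return groups.reduce((sum, g) => sum + g.sizeBytes * (g.paths.length - 1), 0);
}

// Quote a CSV field when it contains a comma, quote, or line break (RFC 4180 style).
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// One row per file. The column layout is an assumption, not DupShelf's format.
function toCsv(groups: DuplicateGroup[]): string {
  const rows = ["hash,path,size_bytes,action"];
  for (const g of groups) {
    for (const path of g.paths) {
      const action = path === g.keeper ? "keep" : "duplicate";
      rows.push([g.hash, csvField(path), String(g.sizeBytes), action].join(","));
    }
  }
  return rows.join("\r\n");
}
```

For a group of three 100-byte files with one keeper, recoverableBytes counts 200 bytes, because only the two non-keepers would be removed.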

SHA-256 exact matching

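For each image file, DupShelf reads the raw bytes on disk and computes a SHA-256 cryptographic hash, a fixed-length (256-bit) fingerprint of the file contents. Files with the same hash are, for all practical purposes, byte-for-byte identical, regardless of filename, folder path, or extension: no two different inputs with the same SHA-256 hash have ever been publicly demonstrated.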
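
As a rough illustration of the hashing step, the sketch below computes a SHA-256 hex digest with the browser's standard Web Crypto API (crypto.subtle.digest). The function name sha256Hex is made up for this example, and DupShelf's own implementation may differ.

```typescript
// Compute the SHA-256 digest of a buffer and return it as lowercase hex.
// Relies on the standard Web Crypto API, which is available in modern browsers.
async function sha256Hex(data: ArrayBuffer): Promise<string> {
  const digest = await crypto.subtle.digest("SHA-256", data);
  // Convert each byte of the 32-byte digest to two hex characters.
  return Array.from(new Uint8Array(digest), (b) => b.toString(16).padStart(2, "0")).join("");
}
```

In a browser, a File object can be hashed with sha256Hex(await file.arrayBuffer()). Only the bytes are passed in, so renaming or moving the file does not change the result. Note that this simple form reads the whole file into memory before hashing.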

That makes exact mode the safest first cleanup pass: if two files group together, keeping one copy is safe because every file in the group has the same content. Hashing runs in web workers where supported so the main UI stays responsive. Large folders can take time on slow USB drives, which is why progress and cancel are built in.
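
One way to picture progress and cancel is a loop that reports after each file and checks a cancellation signal between files. The sketch below uses the standard AbortController and AbortSignal; the names hashAll and ScanOptions are hypothetical, and the web worker dispatch is left out to keep the example short.

```typescript
// Hypothetical options for a scan loop; not DupShelf's actual API.
interface ScanOptions {
  signal?: AbortSignal; // lets the caller cancel the scan
  onProgress?: (done: number, total: number) => void;
}

// Hash items one by one, reporting progress and stopping when cancelled.
async function hashAll<T>(
  items: T[],
  hashFn: (item: T) => Promise<string>,
  { signal, onProgress }: ScanOptions = {},
): Promise<string[]> {
  const hashes: string[] = [];
  for (const item of items) {
    // Checked between files, so a cancel takes effect once the current file is done.
    if (signal?.aborted) throw new Error("Scan cancelled");
    hashes.push(await hashFn(item));
    onProgress?.(hashes.length, items.length);
  }
  return hashes;
}
```

A caller would create an AbortController, pass controller.signal in the options, and call controller.abort() when the user cancels.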

We do not use filenames, EXIF dates, or thumbnail pixels to decide duplicates, only full file content. Two different photos that happen to share the same file size will not match, because their bytes differ.
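
Grouping by content then comes down to bucketing files by their digest. The following is a minimal sketch, assuming each file has already been hashed; HashedFile and groupDuplicates are illustrative names, not part of DupShelf.

```typescript
// Hypothetical record: a file path plus its precomputed SHA-256 hex digest.
interface HashedFile {
  path: string;
  hash: string;
}

// Bucket files by digest and keep only buckets with two or more files.
function groupDuplicates(files: HashedFile[]): Map<string, string[]> {
  const byHash = new Map<string, string[]>();
  for (const { path, hash } of files) {
    const bucket = byHash.get(hash);
    if (bucket) bucket.push(path);
    else byHash.set(hash, [path]);
  }
  // A digest seen only once has no duplicate, so it is dropped.
  return new Map(Array.from(byHash).filter(([, paths]) => paths.length > 1));
}
```

Filenames and extensions never enter the comparison: IMG_001.jpg and backup/IMG_001 (copy).jpeg land in the same group whenever their digests match.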

Exact duplicates vs similar photos

Exact duplicates share identical bytes—the same file saved twice under different names, or the same export dropped in two folders.

Similar photos look alike to a perceptual algorithm (dHash or a comparable perceptual hash) but may differ in compression, crop, or burst timing. Tools like Scanly or PixDuplicate lean perceptual; they help with re-compressed WhatsApp forwards but can produce false positives on burst shots you still want.
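
To make the contrast concrete, here is a sketch of the generic difference-hash (dHash) idea. It is not used by DupShelf's exact mode, and nothing here is taken from the tools named above. The input is assumed to be an image already downscaled to a small grayscale grid (commonly 9 columns by 8 rows, which yields 64 bits); the downscaling step is omitted.

```typescript
// Difference hash over a small grayscale grid (rows of brightness values, 0 to 255).
// Each bit records whether a pixel is brighter than its right-hand neighbour.
function dHash(gray: number[][]): string {
  let bits = "";
  for (const row of gray) {
    for (let x = 0; x + 1 < row.length; x++) {
      bits += row[x] > row[x + 1] ? "1" : "0";
    }
  }
  return bits;
}

// Number of differing bits between two equal-length hashes.
function hammingDistance(a: string, b: string): number {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) diff++;
  }
  return diff;
}
```

Because only the direction of brightness changes is recorded, a uniformly brightened copy of an image produces the same dHash even though its bytes, and therefore its SHA-256 hash, differ. Perceptual tools treat a small Hamming distance as a match, which is also how near-identical burst shots end up grouped.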

DupShelf ships exact mode today by design. Optional similar-image detection is planned as a clearly labeled, opt-in mode—exact mode will remain the default because trust matters more than feature count.

Folder permissions in Chrome and Edge

Full-folder scan uses the File System Access API. You grant access in a native browser prompt—we cannot read paths you did not choose.

  • Read access — required to enumerate and hash images under the folder you picked, including subfolders.
  • Write access — requested when you use Move to folder. Non-keepers go into dupshelf-duplicate-images inside your library; keepers stay in place.
  • CSV export — works with read-only access if move is not available (some network mounts).
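
The read-access case can be sketched as a recursive walk over directory handles. The interfaces below are minimal stand-ins for the members of File System Access API handles that the example touches (kind, name, and values()). The extension list is inferred from the supported formats in the FAQ, and selecting files by extension is an assumption rather than a description of DupShelf's code.

```typescript
// Minimal structural types covering only what this sketch uses from real handles.
interface FileEntry {
  kind: "file";
  name: string;
}
interface DirEntry {
  kind: "directory";
  name: string;
  values(): AsyncIterable<FileEntry | DirEntry>;
}

// Assumed extensions for JPEG, PNG, WebP, GIF, BMP, and AVIF.
const IMAGE_EXTENSIONS = new Set(["jpg", "jpeg", "png", "webp", "gif", "bmp", "avif"]);

function isSupportedImage(name: string): boolean {
  const dot = name.lastIndexOf(".");
  return dot !== -1 && IMAGE_EXTENSIONS.has(name.slice(dot + 1).toLowerCase());
}

// Depth-first walk that yields the relative path of every supported image.
async function* walkImages(dir: DirEntry, prefix = ""): AsyncGenerator<string> {
  for await (const entry of dir.values()) {
    const path = prefix + entry.name;
    if (entry.kind === "directory") {
      yield* walkImages(entry, path + "/");
    } else if (isSupportedImage(entry.name)) {
      yield path;
    }
  }
}
```

In Chrome or Edge, the starting handle would come from window.showDirectoryPicker(), which opens the native folder prompt. A real handle exposes the same members, though TypeScript may need a cast to accept it as a DirEntry. Paths outside the chosen folder are never reachable from that handle.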

Revoke permissions at any time in your browser's site settings (Chrome or Edge). Previously moved files remain on disk; the app does not undo moves automatically. See the privacy policy for more detail.

Verify privacy in five minutes

  1. Open DupShelf in Chrome or Edge and open Developer Tools → Network.
  2. Start a scan on a small test folder with a few images.
  3. Filter requests by the site's domain. You should see page assets, not multi-megabyte POST bodies containing your photos.
  4. After the first load, try working offline; local scanning should still run for files on disk.
  5. Review duplicate groups before deleting anything in your file manager.

Limitations and honest expectations

  • Re-compressed or cropped versions of the same scene are not grouped in exact mode.
  • Cloud-only placeholders (e.g. online-only Drive files) are not readable until downloaded.
  • Scan time for very large libraries depends on RAM, CPU, and disk speed; scan one subfolder at a time if needed.
  • Session restore can reload results once hashing has completed; if hashing is interrupted, you need to rescan.

Common questions

Are my photos uploaded during a scan?
No. Scanning runs in your browser tab on your machine. Image bytes are not sent to DupShelf servers for duplicate detection. You can verify in Developer Tools → Network during a test scan.

What is the difference between exact and similar duplicate detection?
Exact mode groups files with identical bytes (SHA-256), even when filenames differ. Similar-image detection uses perceptual hashing and can group resized or re-compressed lookalikes, but it may also group burst shots you want to keep. DupShelf ships exact mode first; similar mode will be optional when it launches.

Why use SHA-256 instead of comparing file names?
Filenames lie: WhatsApp forwards, sync tools, and downloads rename the same file constantly. A cryptographic hash depends only on file content: identical files always produce the same hash, and for practical purposes an equal hash means equal content.

Which browsers support folder scan?
Chrome and Edge on desktop support recursive folder pick and optional move-to-folder via the File System Access API. Safari and Firefox can add files in batches. Mobile browsers are best for small sets; for a full library, copy it to a desktop computer and scan it there.

What image formats are supported?
JPEG, PNG, WebP, GIF, BMP, and AVIF when your browser can read them. Videos, PDFs, and other files in the same folder are skipped.

Does DupShelf delete duplicate photos?
No. The app moves duplicates to a quarantine subfolder (dupshelf-duplicate-images) or exports a CSV. You delete files yourself after reviewing thumbnails.
