DupShelf

Scan a folder for duplicate images

Single-folder duplicate scanning sounds simple, but many browser tools only accept dragged files or cap batch size. DupShelf is built for a full recursive folder scan: pick a root directory in Chrome or Edge, walk nested subfolders, hash every supported image, and present exact duplicate groups with progress, cancel, keeper marks, and optional move or CSV export.
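As a rough illustration of the last step in that pipeline, here is a minimal sketch of grouping hashed files into exact duplicate groups. The HashedImage shape and the groupExactDuplicates name are assumptions made for this example; DupShelf's internal data model is not documented here.

```typescript
// Hypothetical record shape, for illustration only.
interface HashedImage {
  path: string; // path relative to the chosen root
  size: number; // size in bytes
  hash: string; // hex digest of the file contents
}

// Plain string comparison, used to keep the output order stable.
function compareText(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Group records that share a content hash; keep only groups with 2+ members.
function groupExactDuplicates(records: HashedImage[]): HashedImage[][] {
  const byHash = new Map<string, HashedImage[]>();
  records.forEach((rec) => {
    const bucket = byHash.get(rec.hash);
    if (bucket) {
      bucket.push(rec);
    } else {
      byHash.set(rec.hash, [rec]);
    }
  });

  const groups: HashedImage[][] = [];
  byHash.forEach((bucket) => {
    if (bucket.length > 1) {
      // Sort members by path so the result does not depend on scan order.
      groups.push(bucket.slice().sort((a, b) => compareText(a.path, b.path)));
    }
  });
  // Sort groups by hash so the listing is stable between runs.
  groups.sort((a, b) => compareText(a[0].hash, b[0].hash));
  return groups;
}
```

Because grouping depends only on the hash, feeding the same records in a different order yields the same groups.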


Choosing the right root folder

Pick the highest folder that still makes sense for your job—Downloads, Pictures, a Google Drive sync directory, or a copied phone backup. Scanning too high (entire user profile) takes longer; too low (one subfolder) may miss duplicates elsewhere. You can always run multiple passes.

During the scan

You will see separate stages for enumeration and hashing, each with a live progress indicator. Cancel stops the work if you selected the wrong path. Large libraries on slow USB drives can take tens of minutes, so keep the tab open until hashing completes; after that point, session restore can bring your review back if the tab closes.
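The progress and cancel behavior can be pictured with a small sketch. Everything here is hypothetical: CancelFlag, ScanOutcome, and hashWithProgress are names invented for the example, and hashOne stands in for whatever hashing routine is actually used. The loop is synchronous to keep the example short; in a browser it would need to be asynchronous so that a Cancel click can be handled between files.

```typescript
// Hypothetical cancel flag; a real UI would set it from a Cancel button.
interface CancelFlag {
  cancelled: boolean;
}

interface ScanOutcome {
  hashes: { path: string; hash: string }[]; // results gathered so far
  total: number;                            // files found during enumeration
  cancelled: boolean;                       // true if the loop stopped early
}

// Hash files one at a time, reporting progress after each file and
// checking the cancel flag before starting the next one.
function hashWithProgress(
  paths: string[],
  hashOne: (path: string) => string,
  flag: CancelFlag,
  onProgress: (done: number, total: number) => void
): ScanOutcome {
  const hashes: { path: string; hash: string }[] = [];
  for (let i = 0; i < paths.length; i++) {
    if (flag.cancelled) {
      return { hashes, total: paths.length, cancelled: true };
    }
    hashes.push({ path: paths[i], hash: hashOne(paths[i]) });
    onProgress(hashes.length, paths.length);
  }
  return { hashes, total: paths.length, cancelled: false };
}
```

Checking the flag once per file keeps cancellation responsive without interrupting a hash that is already in progress.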

After the scan: review UI

Each duplicate group expands to show thumbnails, file names, paths, and sizes. Mark one keeper per group; the UI estimates the space you can recover. You can sort the groups, and rescan if you add files later.
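The space estimate is simple arithmetic: within each group, every file except the keeper counts toward recoverable space. A minimal sketch follows, assuming a hypothetical GroupMember shape and assuming that groups without exactly one keeper are left out of the total (the actual rule DupShelf applies may differ).

```typescript
// Hypothetical shape for one file inside a duplicate group.
interface GroupMember {
  path: string;
  size: number;    // size in bytes
  keeper: boolean; // true for the copy the user wants to keep
}

// Sum the sizes of non-keepers across all groups that have exactly one keeper.
function recoverableBytes(groups: GroupMember[][]): number {
  let total = 0;
  groups.forEach((group) => {
    const keeperCount = group.filter((m) => m.keeper).length;
    if (keeperCount !== 1) {
      return; // assumption: unresolved groups do not count toward the estimate
    }
    group.forEach((m) => {
      if (!m.keeper) {
        total += m.size;
      }
    });
  });
  return total;
}
```

For a group of three identical 100-byte files with one keeper, the estimate is 200 bytes.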

Move vs CSV export

Move-to-folder (Chrome or Edge, with write permission) places non-keepers in a dupshelf-duplicate-images folder inside your library, which makes for an easy visual audit. CSV export suits automation, spreadsheets, or read-only mounts where moving files is not allowed.
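For readers who plan to feed the export into scripts or spreadsheets, the sketch below shows one way such a CSV could be produced. The column names (group, path, size_bytes, keeper) and the CsvRow shape are assumptions for illustration, not DupShelf's documented export format. The quoting follows the usual RFC 4180 convention so that paths containing commas or quotes survive a round trip.

```typescript
// Hypothetical row shape; the real export columns may differ.
interface CsvRow {
  group: number;
  path: string;
  size: number;
  keeper: boolean;
}

// Quote a field if it contains a comma, quote, or line break,
// doubling any embedded quotes (RFC 4180 style).
function csvField(value: string): string {
  if (/[",\r\n]/.test(value)) {
    return '"' + value.replace(/"/g, '""') + '"';
  }
  return value;
}

// Build the full CSV text with a header row and CRLF line endings.
function toCsv(rows: CsvRow[]): string {
  const lines: string[] = ["group,path,size_bytes,keeper"];
  rows.forEach((r) => {
    lines.push(
      [String(r.group), csvField(r.path), String(r.size), r.keeper ? "yes" : "no"].join(",")
    );
  });
  return lines.join("\r\n") + "\r\n";
}
```

Folder names such as "Trips, 2023" are common in photo libraries, which is why the path column is the one that needs quoting.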

Supported formats and skipped files

JPEG, PNG, WebP, GIF, BMP, and AVIF are hashed when readable. PDFs, videos, and documents are ignored. Online-only cloud placeholders are not readable until downloaded to disk.
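A filter like this is often implemented as an extension check before any file is read. The sketch below assumes that approach; DupShelf may instead inspect file contents or MIME types. The extension list (including both jpg and jpeg for JPEG) and the isSupportedImage name are assumptions for the example.

```typescript
// Assumed extension list for the formats named above.
const SUPPORTED_EXTENSIONS: string[] = ["jpg", "jpeg", "png", "webp", "gif", "bmp", "avif"];

// Return true if a file name (not a full path) has a supported image extension.
function isSupportedImage(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  // No dot, a leading dot only (dotfile), or a trailing dot: no usable extension.
  if (dot <= 0 || dot === fileName.length - 1) {
    return false;
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(ext) !== -1;
}
```

Lowercasing the extension matters in practice because cameras and phones often write names like IMG_0001.JPG.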

Rescan and session restore

Added new photos? Rescan the folder. Closed the tab mid-review? DupShelf can restore session state from browser storage so you do not re-hash from scratch.
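How the restore works internally is not described here, so the following is only a sketch of one plausible design: save the computed hashes under a storage key, and reuse them when the same root is opened again. KeyValueStore is a minimal subset of the Web Storage interface (the browser's localStorage object satisfies it); SessionSnapshot, SESSION_KEY, and the fingerprint format are all hypothetical.

```typescript
// Minimal subset of the Web Storage interface; localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical snapshot layout.
interface SessionSnapshot {
  version: number;
  rootName: string;
  hashes: { [fingerprint: string]: string }; // fingerprint -> content hash
}

const SESSION_KEY = "dupshelf-session-sketch"; // hypothetical key name

// Cheap identity for a file: if path, size, and mtime match, reuse the hash.
function fingerprint(path: string, size: number, lastModified: number): string {
  return path + "|" + size + "|" + lastModified;
}

function saveSession(store: KeyValueStore, snap: SessionSnapshot): void {
  store.setItem(SESSION_KEY, JSON.stringify(snap));
}

// Return the saved snapshot, or null if it is missing, corrupt,
// from another version, or belongs to a different root.
function loadSession(store: KeyValueStore, rootName: string): SessionSnapshot | null {
  const raw = store.getItem(SESSION_KEY);
  if (raw === null) {
    return null;
  }
  try {
    const snap = JSON.parse(raw) as SessionSnapshot;
    if (snap.version !== 1 || snap.rootName !== rootName) {
      return null;
    }
    return snap;
  } catch (e) {
    return null; // corrupt data: fall back to a full scan
  }
}
```

Web Storage is typically limited to a few megabytes per origin, so a large library would likely need IndexedDB instead; the snapshot idea stays the same.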

Comparing folder scan to drag-and-drop batch

Drag-and-drop is fine for ten files. Folder scan is for libraries: nested paths, consistent keeper rules, and space totals across the tree. If you already dragged files, consider rescanning the parent folder once to catch copies you missed in subfolders.

Enterprise-scale folders

Marketing teams with tens of thousands of assets should scan campaign subfolders per quarter rather than one giant root. If review spans days, rely on session restore to pick up where you left off. Export CSV for asset managers who track deletion in a DAM workflow.

Splitting huge libraries

If one root has 50k images, consider scanning by year subfolder (2023, 2024) to reduce memory pressure. Per-year scans cannot see duplicates that span two years, so either combine the per-folder results yourself or run a second pass on the parent once the child folders are clean.
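If you already have a file listing (for example from an earlier CSV export), a short helper can show how the library would split into per-subfolder batches. This is a sketch only: partitionByTopFolder is a hypothetical helper, it assumes forward-slash relative paths, and it uses the label "(root)" for files that sit directly in the root.

```typescript
// Split relative paths into batches keyed by their first path segment.
function partitionByTopFolder(relativePaths: string[]): Map<string, string[]> {
  const batches = new Map<string, string[]>();
  relativePaths.forEach((p) => {
    const slash = p.indexOf("/");
    // Files directly in the root get a placeholder key.
    const key = slash === -1 ? "(root)" : p.slice(0, slash);
    const batch = batches.get(key);
    if (batch) {
      batch.push(p);
    } else {
      batches.set(key, [p]);
    }
  });
  return batches;
}
```

Comparing batch sizes before scanning helps decide whether a subfolder is small enough to scan in one pass.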

Accidentally picked the wrong folder

Cancel during hashing if you notice the wrong path early. If the scan has already finished, start a new scan on the correct folder; results from the previous scan do not mix with the new ones.

Summary and next steps

Folder scanning is the feature that separates a toy duplicate checker from a library tool. When you pick a root, you are declaring scope: everything under this tree is fair game for hashing, nothing outside. That discipline prevents accidents and makes results explainable. The scan itself is deterministic: same files, same hashes, same groups every time.

After your first successful scan, consider a naming convention for future imports so duplicates are obvious even before hashing: YYYY-MM event folders, a single canonical Downloads archive, and no repeated manual copies from USB sticks. If you maintain multiple roots (work vs personal), scan them separately and keep separate quarantine folders. Rescan after big imports; session restore helps when life interrupts review.

If performance suffers, freeing up resources helps more than software tricks: an SSD, a wired USB connection, and fewer open browser tabs. Folder scan is the default power feature, so use it on Downloads, then Pictures, then any sync directory on disk. Cancel and rescan are cheap compared to upload-and-wait services.

Questions

Which formats are scanned?: Common web image formats when the browser can read them. Unsupported types are skipped silently.
Does it scan subfolders?: Yes. The folder picker includes nested directories under the root you grant.
Can I cancel a long scan?: Yes. Use cancel if you picked the wrong directory or need to free the tab.
Why is hashing slow on external drives?: USB bandwidth and spin-up latency matter. Copying to an internal SSD first can speed a very large library.
What if move fails?: Export CSV and clean manually, or check that the browser has write permission to that path.
How many files can one folder have?: Thousands are common. Start with a subfolder on low-RAM machines to test stability.
Does scan order affect results?: No. All files in the tree are hashed; grouping is by content only.