Removing Duplicate Files at Scale


Delvin Bonilla

Jan 22, 2026, 4:37:58 PM
to GAM for Google Workspace
Hi,

I'm working on a Drive storage reduction project in a large environment and have been analyzing duplicate files using admin tools, GAM, and GAT+. Reporting is clear, but I'm trying to understand what is actually doable.

Has anyone successfully deleted duplicate files at scale while preserving sharing, or is that not feasible due to ownership and permissions? Also, if duplicates are shared with different users, have you found any workable cleanup approaches?

I'm also curious whether anyone has used commands or workflows to safely delete only the non-shared duplicates. I wanted to see if others have tackled this already, or if there is a recommended way to test outside of live data, such as a Workspace test environment.

Thank you.
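
As a rough illustration of the "non-shared duplicates only" idea, here is a minimal Python sketch. It assumes duplicate groups have already been built and that each file record carries a hypothetical `permission_count` field, where a count of 1 means only the owner has access. It does not call GAM or the Drive API. It only produces a list of candidate file IDs for review.

```python
def deletion_candidates(groups):
    """Return IDs of unshared copies that could be reviewed for deletion.

    Shared copies are never candidates. If no copy in a group is shared,
    the first copy is kept so that one copy always survives.
    """
    candidates = []
    for files in groups:
        unshared = [f["id"] for f in files if f["permission_count"] == 1]
        if len(unshared) == len(files):
            unshared = unshared[1:]  # nothing is shared: keep one copy
        candidates.extend(unshared)
    return candidates

# Hypothetical duplicate groups (illustrative values only).
groups = [
    [{"id": "A1", "permission_count": 1},
     {"id": "A2", "permission_count": 1},
     {"id": "A3", "permission_count": 3}],   # A3 is shared, so it is kept
    [{"id": "B1", "permission_count": 2},
     {"id": "B2", "permission_count": 2}],   # both shared: no candidates
    [{"id": "C1", "permission_count": 1},
     {"id": "C2", "permission_count": 1}],   # none shared: keep C1
]

print(deletion_candidates(groups))
```

This sketch ignores who owns each copy. If the copies in a group belong to different users, removing one user's unshared copy takes the file away from that user, so ownership would need to be part of any real selection rule.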

Ross Scroggs

Jan 22, 2026, 5:24:37 PM
to google-ap...@googlegroups.com
How do you determine if two files are duplicates?

Ross
----
Ross Scroggs




Delvin Bonilla

Jan 23, 2026, 10:42:51 AM
to GAM for Google Workspace
Hi Ross,

We are currently using two methods. Initially, via GAM, we applied a pretty basic rule: if the name and size are identical, then it's a duplicate. We didn't use any file checksums; however, by that alone we were able to find 23 TB worth of duplicates, with their associated file IDs.

We also use Google Admin Tools, which provides more information, such as MD5, path, and who each file is shared with.
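
For comparison, here is a minimal Python sketch of the two matching rules mentioned in this message: name plus size, versus MD5 checksum plus size. The record fields (`id`, `name`, `size`, `md5`) and the sample values are hypothetical and not taken from any real export. The point is only that name-and-size matching can group files whose contents differ, and can miss renamed copies.

```python
from collections import defaultdict

def group_duplicates(files, key_fields):
    """Group file records by the given fields; return only groups with 2+ members."""
    groups = defaultdict(list)
    for f in files:
        key = tuple(f[k] for k in key_fields)
        groups[key].append(f["id"])
    return {k: ids for k, ids in groups.items() if len(ids) > 1}

# Hypothetical sample records (illustrative values only).
files = [
    {"id": "A1", "name": "report.pdf", "size": 1000, "md5": "aaa"},
    {"id": "B2", "name": "report.pdf", "size": 1000, "md5": "aaa"},
    # Same name and size, but different content:
    {"id": "C3", "name": "report.pdf", "size": 1000, "md5": "bbb"},
    # Renamed copy with identical content:
    {"id": "D4", "name": "Copy of report.pdf", "size": 1000, "md5": "aaa"},
]

by_name_size = group_duplicates(files, ("name", "size"))
by_md5_size = group_duplicates(files, ("md5", "size"))

print(by_name_size)
print(by_md5_size)
```

Google-native files such as Docs and Sheets do not carry an MD5 checksum in Drive, so a checksum-based rule would need to handle them separately.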

Ross Scroggs

Jan 23, 2026, 10:59:47 AM
to google-ap...@googlegroups.com
Delvin,

Send me a private Meet/Zoom invitation; I'm curious about 25TB of duplicates.

Ross
----
Ross Scroggs


Russ Thibeault

Jan 23, 2026, 1:58:15 PM
to GAM for Google Workspace
Please do follow up on this, whether or not you have success. This is a great endeavor that would benefit so many of us and I for one would appreciate any insights.

I'm also curious about the 25TB of dups. If you don't mind my asking, what is the total size of your Workspace Drive?

Ross Scroggs

Jan 23, 2026, 2:22:37 PM
to google-ap...@googlegroups.com
My bet is that domain-shared files are being counted as duplicates. I'm offline for 4 hours and will check in later.

Ross
----
Ross Scroggs
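
One way to check this hypothesis against an export: if a listing covers all users and includes files they do not own, the same file ID can appear once for every user who can see the file. The Python sketch below assumes hypothetical rows with `user`, `id`, `name`, and `size` fields, and collapses them to one row per file ID before any duplicate counting.

```python
def unique_by_id(rows):
    """Keep the first row seen for each file ID; repeats are the same file."""
    seen = {}
    for row in rows:
        seen.setdefault(row["id"], row)
    return list(seen.values())

# Hypothetical listing rows (illustrative values only).
rows = [
    {"user": "alice@example.com", "id": "F1", "name": "budget.xlsx", "size": 500},
    # Same file ID listed again under a second user: a shared file, not a copy.
    {"user": "bob@example.com", "id": "F1", "name": "budget.xlsx", "size": 500},
    # Different file ID with the same name and size: a separate copy.
    {"user": "bob@example.com", "id": "F2", "name": "budget.xlsx", "size": 500},
]

files = unique_by_id(rows)
apparent_bytes = sum(r["size"] for r in rows)   # counts the shared file twice
actual_bytes = sum(f["size"] for f in files)    # counts each file ID once

print(len(rows), len(files), apparent_bytes, actual_bytes)
```

If the duplicate total drops sharply after collapsing by file ID, much of the apparent duplication is likely one file seen by many users rather than separate copies.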


Delvin Bonilla

Jan 23, 2026, 2:33:36 PM
to GAM for Google Workspace
Hi Ross,

Thank you for the offer. I will send you one as soon as I can. It's a little busy here today, so most likely looking at Monday or Tuesday if that is ok with you.

For now the command we used was: 
gam config auto_batch_min 1 redirect csv ./all_files.csv multiprocess all users print filelist choose mydrive_any fields id,name,mimeType,owners,size,md5 fullpath showownedby any

Our total Workspace storage in use at the moment is 55 TB.
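
As a sketch of how a CSV like `all_files.csv` might be summarized, the Python below estimates how many bytes would be freed if one copy per (MD5, size) group were kept. The column names `id`, `size`, and `md5Checksum` are assumptions about the header row and should be checked against the actual file. The sample data is made up, and rows without a checksum or size are simply skipped.

```python
import csv
import io
from collections import defaultdict

def reclaimable_bytes(csv_text, id_col="id", size_col="size", md5_col="md5Checksum"):
    """Estimate bytes freed if one copy per (md5, size) group is kept."""
    seen_ids = set()
    groups = defaultdict(int)  # (md5, size) -> number of distinct file IDs
    for row in csv.DictReader(io.StringIO(csv_text)):
        file_id, md5, size = row[id_col], row[md5_col], row[size_col]
        if not md5 or not size or file_id in seen_ids:
            continue  # skip rows lacking checksum/size, and repeated file IDs
        seen_ids.add(file_id)
        groups[(md5, int(size))] += 1
    # Every copy beyond the first in a group counts as reclaimable.
    return sum(size * (count - 1) for (_, size), count in groups.items())

# Made-up sample in the assumed column layout.
sample = """id,name,size,md5Checksum
F1,a.pdf,100,aaa
F1,a.pdf,100,aaa
F2,b.pdf,100,aaa
F3,c.pdf,200,bbb
F4,doc,,
"""

print(reclaimable_bytes(sample))
```

For a real export, the file contents could be read with `open(path, newline="", encoding="utf-8").read()` and passed in, or the function adapted to stream from the file directly.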

Ross Scroggs

Jan 23, 2026, 7:09:13 PM
to google-ap...@googlegroups.com
Delvin,

I'm in California (GMT-8 PST) and am available starting at 7:30AM on either Monday or Tuesday; send me a private Meet/Zoom invitation.

Ross
----
Ross Scroggs


dbon...@lfny.org

Jan 29, 2026, 10:23:58 AM
to google-ap...@googlegroups.com, ross.s...@gmail.com
Delvin Bonilla is inviting you to a scheduled Zoom meeting.
Important: All Lycée-organized Zoom meetings are recorded and archived


Topic: [GAM] Removing Duplicate Files at Scale
Time: Jan 30, 2026 12:00 PM Eastern Time (US and Canada)

Join Zoom Meeting
https://lfny.zoom.us/j/6178805609?omn=81386737886

Meeting ID: 617 880 5609


Bonilla, Delvin

Jan 29, 2026, 10:26:09 AM
to google-ap...@googlegroups.com
Hi Ross,

Thank you again for your support with this. It truly makes a difference, and we really appreciate it.
I was able to send you a meeting invitation for tomorrow, if you can make it; if not, that's OK, and we can try Monday as you suggested.

Thank you again!
Delvin Bonilla


Ross Scroggs

Jan 29, 2026, 10:33:18 AM
to google-ap...@googlegroups.com
I have a schedule conflict tomorrow morning; Monday is free after 7:30AM PST.

Ross
----
Ross Scroggs


Bonilla, Delvin

Jan 29, 2026, 10:43:01 AM
to google-ap...@googlegroups.com
Hi Ross,

Thank you again. I updated the invitation.
See you then!

Have a great weekend!
Delvin Bonilla
Database Administrator | Lycée Français de New York



Bonilla, Delvin

Jan 29, 2026, 1:48:08 PM
to google-ap...@googlegroups.com
Hi Ross,

A few colleagues who’ve been working on this project with me are very interested in the discussion and would like to join the call, if that’s okay.
I’ve invited them to the meeting and wanted to give you a heads-up.

Thanks again,
Delvin Bonilla
Database Administrator | Lycée Français de New York

