Find all files above a certain size?

Ian Crew

Feb 27, 2019, 7:04:12 AM
to 'Jordan Tinsley' via GAM for G Suite
Hi all:

Does anyone know of a way (ideally using GAM, but other tools are a possibility) to generate a list of all files above a given size for an entire G Suite instance? Ideally, we’d want to retrieve owner and file size for each file above the limit. 

Note that we’re interested in the cases where a *single file* is over the limit, not the ones where a user’s *total* usage/quota is over that limit...

Thanks!

Ian
--
___
Ian Crew

IST-Architecture, Platforms and Integration (API)
Earl Warren Hall, Second Floor
University of California, Berkeley

Ross Scroggs

Feb 27, 2019, 10:14:21 AM
to google-ap...@googlegroups.com
Ian,

The API has no support for querying by file size, so you'll have to write a script.
Do the following:
gam user us...@domain.com print filelist id title filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'" > userfiles.csv
You get these headers: Owner,id,title,fileSize,mimeType
fileSize is only populated for non-Google files, so native Google Docs, Sheets, and Slides will show an empty value.

Ross
--
You received this message because you are subscribed to the Google Groups "GAM for G Suite" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-apps-man...@googlegroups.com.
To post to this group, send email to google-ap...@googlegroups.com.
Visit this group at https://groups.google.com/group/google-apps-manager.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-apps-manager/CAD2sLFtb%3DTwGr0t44KXtNudPKPhx%3DXze0Vt4kr2jUndfkKixHQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Ian Crew

Feb 27, 2019, 12:21:27 PM
to google-ap...@googlegroups.com
Thanks Ross!

So if I wanted to do that for all users, I’d take the output from 
gam print users emails and pipe that into the command you referenced?

I’m thinking this could be a pretty simple script:

grab the list of all of the users (above)
loop through that list with the “get all files” command you suggested (below)
extract fields 1 and 4
look for files that are above the limit
save the results

Of course, it’d take a while to run, but otherwise it’s not (conceptually) difficult...

Sound about right?

Cheers,

Ian

Ross Scroggs

Feb 27, 2019, 12:28:01 PM
to google-ap...@googlegroups.com
Ian,

No piping required:
gam all users print filelist id title filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'" > userfiles.csv

That's going to be a big file

Ian Crew

Feb 27, 2019, 12:43:31 PM
to google-ap...@googlegroups.com
Even easier, thanks.

This could become a one-liner, I think:

gam all users print filelist id title filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'"  | awk -F ‘,’ $4>1000000000{'{print "$1,$4”}'

(which would also cut down on the output a good bit….)

Cheers,

Ian

+KimNilsson

Mar 4, 2019, 3:59:14 AM
to GAM for G Suite
Ian, did you figure out a complete one-liner?

When I tried yours with piping to file awk gave me an error.

gamx all users print filelist id title filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'"  | awk -F ‘,’ $4>1000000000{'{print "$1,$4”}' > large_userfiles.csv

awk: no program given

Joetje F

Mar 4, 2019, 5:43:56 AM
to GAM for G Suite
This will work:

gamx all users print filelist id filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'"  | awk -F, '{ if ($3 > 100000) { print $2,$3 } }'

I'm leaving out title as it may contain a comma
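
A minimal sketch of the same filter wrapped in a reusable shell function, assuming the column order Owner,id,fileSize,mimeType that the command above produces. The function name filter_large_files and its threshold argument are made up for illustration. It passes the header row through and skips rows whose fileSize is empty (native Google files):

```shell
# filter_large_files MIN_BYTES  (hypothetical helper, not part of GAM)
# Reads CSV on stdin, assumed to have the columns Owner,id,fileSize,mimeType,
# and prints Owner,id,fileSize for every file larger than MIN_BYTES.
filter_large_files() {
  awk -F, -v min="$1" '
    BEGIN { OFS = "," }
    # Keep the header row so the output is still a valid CSV.
    NR == 1 { print $1, $2, $3; next }
    # Rows with an empty fileSize (native Google files) are skipped.
    $3 != "" && $3 + 0 > min { print $1, $2, $3 }
  '
}

# Possible usage, assuming gamx is installed and authorized:
#   gamx all users print filelist id filesize mimetype \
#     query "mimeType != 'application/vnd.google-apps.folder'" \
#     | filter_large_files 1000000000 > large_userfiles.csv
```

Splitting on plain commas is only safe here because title was left out. If a quoted field containing a comma is added back, the field numbers shift and the filter will misfire.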

+KimNilsson

Mar 4, 2019, 6:55:32 AM
to GAM for G Suite
Yes, so far so good.
1834/5947 users processed.

Ian Crew

Mar 4, 2019, 12:16:45 PM
to 'Jordan Tinsley' via GAM for G Suite
Thanks for debugging the code, Joetje! I think I accidentally left an extra single quote in there someplace, and great catch re. commas in the filename. I might still have the output be  { print $1,$2,$3 }  to include the user’s account name, and make sure that the output is comma-separated, so:

gam all users print filelist id filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'"  | awk -F, '{ OFS = ","; ORS = "\n"; if ($3 > 100000) { print $1,$2,$3 } }'

Extending that a bit, it’s possible to have it all run in the background, and have things end up in log files and csv files, named with the date and time (assuming you’re running macOS/Linux):

gam all users print filelist id filesize mimetype query "mimeType != 'application/vnd.google-apps.folder'" 2> "./large_files_`date '+%Y-%m-%d_%H-%M-%S'`.log" | awk -F, '{ OFS = ","; ORS = "\n"; if ($3 > 100000) { print $1,$2,$3 } }' > "./large_files_`date '+%Y-%m-%d_%H-%M-%S'`.csv" &

This could be useful, for example, if you wanted to run it via cron job on a periodic basis.
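
One possible refinement, sketched under assumptions: the command above calls date twice, so the log and CSV names can differ by a second. Capturing the timestamp once in a variable keeps the two names paired. The names stamp, log_file, csv_file, and run_report are hypothetical, and the report function is only defined here, not started:

```shell
# Capture the timestamp once so the log and the CSV share the same name stem.
stamp=$(date '+%Y-%m-%d_%H-%M-%S')
log_file="./large_files_${stamp}.log"
csv_file="./large_files_${stamp}.csv"

# run_report (hypothetical name): the same gam | awk pipeline as above,
# with gam's stderr going to the log and the filtered rows to the CSV.
run_report() {
  gam all users print filelist id filesize mimetype \
    query "mimeType != 'application/vnd.google-apps.folder'" 2> "$log_file" |
    awk -F, 'BEGIN { OFS = "," } $3 > 100000 { print $1, $2, $3 }' > "$csv_file"
}

# To start it in the background on a machine where gam is installed:
#   run_report &
```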

Cheers,

Ian

Ross Scroggs

Mar 4, 2019, 3:29:37 PM
to google-ap...@googlegroups.com
Ian,

I've added this to Advanced GAM 4.65.67.

$ gam user testuser1 print filelist minimumfilesize 10000 title size

Getting all Drive Files/Folders that match query ('me' in owners) for test...@rdschool.org

Got 133 Drive Files/Folders for test...@rdschool.org...

Owner,name,size

test...@rdschool.org,TestRevision.pdf,10286

test...@rdschool.org,GamUpdate.txt,359837

test...@rdschool.org,users3.csv,23578

test...@rdschool.org,users.csv,23578

test...@rdschool.org,dashedusers.txt,30691

test...@rdschool.org,clip.dv,625560000

test...@rdschool.org,allusersprofile.out,54709

test...@rdschool.org,TestMHTML.mht,186659

test...@rdschool.org,Kurzweil-K2000-Dual-Bass-C1.wav,1322298

test...@rdschool.org,TextWithPicture.odt,11452334
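
A rough sketch for reading output like this, assuming size is always the last column as in the sample above: it lists the largest files first. The function name largest_first is made up for illustration. It keys on the last comma-separated field rather than the third, so a quoted file name that contains commas does not break the sort:

```shell
# largest_first (hypothetical helper): reads CSV on stdin whose LAST column
# is the file size, keeps the header row, and sorts the rest by size, descending.
largest_first() {
  {
    # Pass the header row through unchanged.
    IFS= read -r header
    printf '%s\n' "$header"
    # Prefix each row with its size and a tab, sort on that key, then strip it.
    awk -F, '{ print $NF "\t" $0 }' |
      sort -t "$(printf '\t')" -k1,1nr |
      cut -f2-
  }
}

# Possible usage, assuming the gam output was saved to BigFiles.csv:
#   largest_first < BigFiles.csv
```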


Ross





Seth Dimbert (Hillel Yeshiva)

May 11, 2022, 2:12:26 PM
to GAM for Google Workspace
Ross, how can your new, one-line command be used against all users in a particular OU?

Ross Scroggs

May 11, 2022, 2:20:51 PM
to google-ap...@googlegroups.com
Seth,

gam redirect csv ./BigFiles.csv ou "/Path/To/OU" print filelist minimumfilesize 10000 title size

Seth Dimbert (Hillel Yeshiva)

May 11, 2022, 2:43:28 PM
to GAM for Google Workspace
Thanks. I ran a slightly modified command:

gam all users print filelist minimumfilesize 10000 title size todrive

It's slow... but it's working. Two follow-ups:
  1. I realize I should have asked gam for the fileID of each file, too. That way, I could grant myself View access to the ones I want to check out, right?
  2. I recall another discussion you and I had where we broke a similar command into two parts, filtering the list of results before querying for details. Could we have somehow filtered this list by files above 10000 before reporting the file list? Would that be faster?
-Seth