Find all files Published to the Web


Shaun Goodwin

Aug 23, 2024, 3:09:37 AM
to GAM for Google Workspace
We've recently completed an audit where we are required to report on all files available publicly.

We've used these tools to find files shared as "anyone with the link" in My Drive and Shared Drives.

However, we now have a requirement to find all files published to the web without restriction.

Ideally we'd like to do this for both My Drive and Shared Drives.






Ross Scroggs

Aug 23, 2024, 1:14:38 PM
to google-ap...@googlegroups.com
Shaun,

What is "published to web without restriction"?

Ross
----
Ross Scroggs



--
You received this message because you are subscribed to the Google Groups "GAM for Google Workspace" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-apps-man...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-apps-manager/f5fca419-67e7-4695-bea0-1f8d24b33414n%40googlegroups.com.

Maj Marshall Giguere

Aug 23, 2024, 1:41:35 PM
to google-ap...@googlegroups.com
Shaun;

If you have found all the files with the ACL "anyoneWithLink", then you already know all the files that could be publicly visible. However, that is not definitive: someone could simply share a file with one or more external users. In cases like this I scan all the files in my domain for ACLs outside our domains. Finding links to files with "anyoneWithLink" on the web is beyond what GAM can help you with; you would have to search the entire web to find those links. Or I totally misunderstand what you're trying to do.

Ross Scroggs

Aug 23, 2024, 2:07:54 PM
to google-ap...@googlegroups.com
Maybe you mean anyoneCanFind, which is expressed two ways:
type anyone withLink False
type anyone allowFileDiscovery True
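
For anyone checking permissions against the Drive API directly rather than through GAM, the two expressions above can be sketched as a test on a single permission record. This is only a minimal Python sketch, not GAM's implementation: the function name is hypothetical, and it assumes permission dicts shaped like Drive API v3 permissions (`type`, `allowFileDiscovery`) or the older v2 style (`type`, `withLink`).

```python
def is_anyone_can_find(permission):
    """Return True if a permission dict means "anyone can find this file".

    Assumes a Drive-style permission dict; the helper name is hypothetical.
    """
    if permission.get("type") != "anyone":
        return False
    # v3-style field: the file is discoverable through search.
    if "allowFileDiscovery" in permission:
        return bool(permission["allowFileDiscovery"])
    # v2-style field: withLink False means a link is NOT required.
    if "withLink" in permission:
        return not permission["withLink"]
    return False
```

A permission with `type` "anyone" and `allowFileDiscovery` False corresponds to anyoneWithLink and is deliberately not matched here.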

Ross
----
Ross Scroggs


Ross Scroggs

Aug 23, 2024, 2:31:51 PM
to google-ap...@googlegroups.com
gam config auto_batch_min 1 num_threads 20 redirect csv ./NoRestrictionShares.csv multiprocess redirect stderr - multiprocess all users print filelist fields id,name,mimetype fullpath query "visibility='anyoneCanFind'" pm type anyone allowfilediscovery true em pmfilter oneitemperrow
config auto_batch_min 1 num_threads 20 - Speed up processing
redirect csv ./NoRestrictionShares.csv multiprocess - Write the results to a CSV file
redirect stderr - multiprocess - Clean up the "Getting" messages
all users - Everybody
print filelist fields id,name,mimetype fullpath - What to get
query "visibility='anyoneCanFind'" - Google filter
pm type anyone allowfilediscovery true em pmfilter oneitemperrow - GAM filter

Ross


----
Ross Scroggs




Shaun Goodwin

Aug 23, 2024, 5:30:21 PM
to GAM for Google Workspace
Hi All,

We have found 3 ways a user can share externally.
1. Shared as "anyone with the link" - we have a solve for this
2. Shared with external users, identified by domain as Maj mentioned above - we have a solve for this
3. Files published externally - no solve for this

We haven't found a way to identify files that are published, or to tell files published to "anyone" apart from files published but restricted to "domain".

See screenshots below of how a user can Publish a file externally.

[Screenshots: Publish to Web 1.png, Publish to Web2.png]

Ross Scroggs

Aug 23, 2024, 6:06:50 PM
to google-ap...@googlegroups.com
Shaun,

Very interesting, send me a Meet/Zoom invitation.

Ross
----
Ross Scroggs



Maj Marshall Giguere

Aug 23, 2024, 6:15:38 PM
to google-ap...@googlegroups.com
Shaun;

I've never looked at that menu. I assumed it was similar to the other "Share" menus, and the help page isn't helpful about how this all happens; it just cautions you not to share private items publicly. It may be connected with Google Sites. Google's files API doesn't make it obvious how to identify such files. It may be one of those things the APIs haven't caught up to yet.

The only relevant query item I can find is visibility:
visibility | =, != | The visibility level of the file. Valid values are anyoneCanFind, anyoneWithLink, domainCanFind, domainWithLink, and limited. Surround with single quotes (').




Maj Marshall E Giguere

NH Wing Director of IT

Civil Air Patrol, U.S. Air Force Auxiliary

GoCivilAirPatrol.com

nhwg.cap.gov

Volunteers serving America's communities, saving lives, and shaping futures.



Maj Marshall Giguere

Aug 23, 2024, 6:32:09 PM
to google-ap...@googlegroups.com
Interesting.  If you publish to the web the link generated is simply a file id with "/pub" appended.  The actual ACL is completely uninteresting. The actual machinery must be embedded in the permission object itself?

Ross Scroggs

Aug 23, 2024, 6:56:03 PM
to google-ap...@googlegroups.com

# I create a doc in My Drive
# Get the info
$ gam redirect stdout ./unpub.txt user testsimple show fileinfo 1H62lJN6b2xpL2nzyuk69wc9WJR2NREtcfe5MQIpKRXQ

# I publish the doc
# Get the updated info
$ gam redirect stdout ./pub.txt user testsimple show fileinfo 1H62lJN6b2xpL2nzyuk69wc9WJR2NREtcfe5MQIpKRXQ

# What are the differences
$ diff unpub.txt pub.txt
92,93c92,93
<     thumbnailLink: https://lh3.googleusercontent.com/drive-storage/AJQWtBO_8viy1sdWwkljiglRo4SO4HnEePf73TO5IbUdWSinJ2GLSKAnbYVwZChzoTPPswKFtI4OReUaPMYHFv0Hjn9H16r9\
qtimPudy6y27L_4m7yhpzpOE7xbMe7dG1A=s220
<     thumbnailVersion: 2
---
>     thumbnailLink: https://lh3.googleusercontent.com/drive-storage/AJQWtBPREzg3VK0gufndE4fEIJ6eMtbNT0TbTUkG_OYfHT5WnMFvr1-YeiOG4yFn7vPq6-heXnk0vTxd_3rPnTnqG36xNwGO\
cnr1ObEdsDzkx_MZXbVe21IFSUuSULGXJg=s220
>     thumbnailVersion: 3
95c95
<     version: 6
---
>     version: 9
97c97
<     viewedByMeTime: 2024-08-23T15:44:44-07:00
---
>     viewedByMeTime: 2024-08-23T15:46:37-07:00


There seems to be nothing that indicates that it is shared

Ummmm?

Ross
----
Ross Scroggs



Maj Marshall Giguere

Aug 23, 2024, 7:22:32 PM
to google-ap...@googlegroups.com
Ross;

I concur, and there is nothing in the APIs either.


Ian Crew

Aug 23, 2024, 7:34:41 PM
to google-ap...@googlegroups.com
Does Google Support have anything to say/advise on this, maybe? The fact that something like that isn't auditable is concerning (to say the least).....

--
Ian Crew

Architect, Communication and Collaboration Services
Productivity & Collaboration Services
Berkeley IT
University of California, Berkeley


Shaun Goodwin

Aug 23, 2024, 7:45:51 PM
to GAM for Google Workspace
Thanks everyone for their contribution to this topic.

It seems everyone has found the same results as us.

When a file is published, "/pub" is appended to the end of the URL.
Publishing is available across the suite of Google products, e.g. Sheets, Docs, etc.

That alone isn't enough to know whether it's public as files can be published but restricted to domain only.

We haven't identified a way via either advanced gam or the API directly to identify these files.

We have a case open with Google support.


The Drive API documentation states:
includePermissionsForView (string): Specifies which additional view's permissions to include in the response. Only 'published' is supported.

However, there still doesn't appear to be any difference for a published file.


Ian Crew

Aug 23, 2024, 7:48:10 PM
to google-ap...@googlegroups.com
Thanks for summarizing Shaun! Please do report back if Support tells you anything useful (once you argue your way past tier 1, that is--good luck with that part!) I'd be extremely curious to hear what you learn.

Cheers,

Ian



Ross Scroggs

Aug 23, 2024, 8:16:05 PM
to google-ap...@googlegroups.com
Confirming that including includePermissionsForView='published' in the API call didn't return any additional info.
----
Ross Scroggs


Ross Scroggs

Aug 23, 2024, 8:45:04 PM
to google-ap...@googlegroups.com
See: https://developers.google.com/drive/api/reference/rest/v3/permissions#Permission

view (string): Indicates the view for this permission. Only populated for permissions that belong to a view. 'published' is the only supported value.

How one assigns a permission to a view is unknown to me.

Ross

----
Ross Scroggs



Maj Marshall Giguere

Aug 23, 2024, 11:00:30 PM
to google-ap...@googlegroups.com
Although "includePermissionsForView" is listed for the "get" call, the API doesn't return anything for it. Maybe an oversight on Google's part.

Jose Luis Rodenas

Aug 24, 2024, 9:24:27 AM
to GAM for Google Workspace
"Published to the web" can be found in the revisions endpoint of the Drive API, or if you are lazy you can just send an HTTP HEAD request to the URL and check the status code returned.
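
To illustrate the revisions route, here is a minimal Python sketch that reads the publish flags from revision records. It assumes the revisions have already been fetched as a list of dicts carrying the Drive API v3 revision fields `id`, `published` and `publishedOutsideDomain`, ordered oldest first; the function name and the shape of the returned summary are hypothetical, not part of GAM or the API.

```python
def published_state(revisions):
    """Summarise the publish flags found in a list of Drive revision dicts.

    Scans from the newest revision back to the oldest and reports the
    first one marked as published. Assumes the list is ordered oldest first.
    """
    for rev in reversed(revisions):
        if rev.get("published"):
            return {
                "published": True,
                "outside_domain": bool(rev.get("publishedOutsideDomain", False)),
                "revision_id": rev.get("id"),
            }
    # No revision carries the published flag.
    return {"published": False, "outside_domain": False, "revision_id": None}
```

Only files where `outside_domain` comes back True would be published without a domain restriction; a file published to the domain only would report `published` True and `outside_domain` False.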

Maj Marshall Giguere

Aug 24, 2024, 10:00:28 AM
to google-ap...@googlegroups.com
Jose;

Thanks for the pointer.  The "revisions" is not a place I would have looked.


Maj Marshall Giguere

Aug 24, 2024, 10:04:52 AM
to google-ap...@googlegroups.com
Shaun;

So, thanks to Jose for pointing out that the publication status is reported in the "revisions" part of the API. That being the case, a file's published status can be found using GAM. Example:

>gam user alpaca@example print filerevisions id:<fileIDHere>

The attributes at the tail end of the output (revisions.1.publishAuto, revisions.1.published, revisions.1.publishedOutsideDomain) will be True if a file is published.



Ian Crew

Aug 24, 2024, 10:54:08 AM
to google-ap...@googlegroups.com
Yikes, that’s rough. Think about a domain with potentially millions of files, coupled with the fact that any one Google Docs/Sheets/Slides document could have thousands of revisions. The published files would therefore be effectively unfindable.

And does the revisions API cover every revision of every file back to the beginning of time?

😱


Maj Marshall Giguere

Aug 24, 2024, 12:05:15 PM
to google-ap...@googlegroups.com
Ian;

I've been doing some experimentation on that topic and here's what I can definitively say. Good news, bad news. The bad news: the API will return the count of revisions followed by a list of revision IDs. The good news: you can filter on "publishedOutsideDomain".
For a one off:
> gam user alp...@example.com print filerevisions id:$fid fields publishedOutsideDomain

Owner,id,revisions,revisions.0.id,revisions.1.id,revisions.2.id,revisions.2.publishedOutsideDomain
alp...@example.com,aslLAdcX_f029ZqudFPUdtqQe3CGfIg,3,1,12,114,True

Assuming you have a list of all your file ID's and their owners you can do something like this:
> gam config csv_output_row_filter ".*publishedOutsideDomain:text=True" csv gam user ~owner print filerevisions id:"~id" fields publishedOutsideDomain

If you do the above it appears that you will only see the currently active published state of the file.  My syntax may be a bit off, but you get the basic idea.
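
The same filtering can also be done after the fact on a saved CSV. The following is a minimal Python sketch, assuming output shaped like the one-off example above (an `Owner` column, an `id` column, and one or more `revisions.N.publishedOutsideDomain` columns); the function name is hypothetical and this is not how GAM's own row filter is implemented.

```python
import csv
import io
import re

# Matches columns such as "revisions.2.publishedOutsideDomain".
FLAG_COLUMN = re.compile(r"^revisions\.\d+\.publishedOutsideDomain$")


def rows_published_outside_domain(csv_text):
    """Return (owner, file id) pairs whose rows have any flag column set to True."""
    reader = csv.DictReader(io.StringIO(csv_text))
    flagged = []
    for row in reader:
        for column, value in row.items():
            # DictReader uses a None key for overflow cells; skip those.
            if column is None or not FLAG_COLUMN.match(column):
                continue
            if (value or "").strip().lower() == "true":
                flagged.append((row.get("Owner"), row.get("id")))
                break
    return flagged
```

Matching the column name with a regular expression avoids depending on which revision index happens to carry the flag for a given file.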


Ross Scroggs

Aug 24, 2024, 12:13:16 PM
to google-ap...@googlegroups.com
gam user alp...@example.com print filerevisions id:$fid fields publishedOutsideDomain select last 1

This returns just the last revision.

I'm working on this:

gam config csv_output_row_filter "revisions.0.publishedOutsideDomain:boolean:true" auto_batch_min 1 num_threads 20 redirect csv ./PublishedDocs.csv multiprocess redirect stdout - multiprocess redirect stderr stdout all users print filerevisions my_publishable_items select last 1


Ross

----
Ross Scroggs



Ross Scroggs

Aug 24, 2024, 1:41:01 PM
to google-ap...@googlegroups.com
6.80.13

gam config csv_output_row_filter "revisions.0.publishedOutsideDomain:boolean:true" auto_batch_min 1 num_threads 20 redirect csv ./PublishedDocs.csv multiprocess redirect stdout - multiprocess redirect stderr stdout all users print filerevisions my_publishable_items select last 1


Ross


----
Ross Scroggs


Ross Scroggs

Aug 24, 2024, 4:31:19 PM
to google-ap...@googlegroups.com
Shaun/Ian/Marsh/Others:

This should get you what you want.


Thanks to Jose Luis for the key pointer.

Ross
----
Ross Scroggs

