Case Study: What do you do when a department has been working with a vendor that created Google files and folders in a shared My Drive folder that is housed on the vendor’s Google org?
I want to take a moment and say thank you to Ross Scroggs for all his time and help in counseling us during this project. There were more than a few times when I was at my wits’ end and he helped us get the project back on track. I am forever grateful. He is truly a bright light in this community, and I appreciate all his efforts. I wanted to write this up to help others who may be in a similar circumstance. We had to deal with multiple roadblocks along the way, and I want this to be a cautionary tale for others.
I work for a school system that hired Vendor A to create a series of workbooks and answer keys for a curriculum that the district paid them to create. The problem came when the workbooks were completed and the project concluded 4 years later: neither the vendor nor the department involved knew how to take possession of these files for the district. As the lead district Google administrator, I got a service desk ticket to take a look. The longer I looked at this, the deeper the rabbit hole went. I want to make it clear that at no time was any of this put in front of anyone in IT or IT Security in my org for approval or checks and balances before the department opened the service ticket. The department had already linked the files to syllabuses and did not want to just copy the files over to our org, because the links would change and they would have to update all the necessary course material. Initially we thought we were dealing with a shared drive on Vendor A’s Google Workspace, which would have made this a lot easier. But we soon found out it was a “My Drive” folder that the owner of Vendor A had created.
As we dug further, we realized we were dealing with more than just documents owned by Vendor A users and documents owned by district users. Vendor A had subcontracted with other vendors and people who were using either other email domains or consumer Gmail accounts (@gmail.com). The files were actually owned by the individuals who created them and not by the vendor org. What a mess!
To make matters worse, during the discovery process we got word that students were able to get into some of the answer keys. We needed a way to bring all of these files under the district Google Workspace org, clean up the file permissions, and set controls. We wanted to stop the security leaks and maintain control of the structure from the top.
The workbook files and folders totaled over 10,000 items, just under 18 GB, that needed to be moved and secured appropriately. It was important that we maintained the folder structure so people would be able to find things, and that links to documents continued to work as we worked through this process.
The first thing we had to do was understand what was there, so we asked the vendor to give us edit access at the root of the Workbook folder so we could pull reports with GAM. After the initial report pull, we figured out that we needed three different reports against the structure: one for external domains, one for the vendor domain, and one for the district domain.
## This will look for files in the folder structure whose owner is in any domain other than the Vendor A or district domains, giving us a report on all external accounts involved:
gam redirect csv ./shareMockExternalDoms.csv user <user with edit access to root of my drive share> print filelist select <root my drive folder id> fields id,name,createddate,mimetype,basicpermissions,modifieddate,owners.emailaddress pm type user role owner notdomainlist <list your domains separated by comma here> em pmfilter showownedby any filepath
## This will look at the Vendor A domain for files that are owned by its users in the folder structure:
gam redirect csv ./vendorshareMockownedbyVendorA.csv user <user with edit access to root of my drive share> print filelist select <root my drive folder id> fields id,name,createddate,mimetype,basicpermissions,modifieddate,owners.emailaddress pm type user role owner domainlist <List Vendor domain(s) here separated by comma> em pmfilter showownedby any filepath
## This will look at the District domain for files that are owned by its users:
gam redirect csv ./vendorshareMockownedbyDistrict.csv user <user with edit access to root of my drive> print filelist select <root my drive folder id> fields id,name,createddate,mimetype,basicpermissions,modifieddate,owners.emailaddress pm type user role owner domainlist <List district / your org domain(s) here separated by a comma> em pmfilter showownedby any filepath
Once we had these three reports, they clearly showed us who owned each individual file and folder on the “My Drive” share that was housed on and owned by Vendor A. We could now point to all the files and folders knowing who owned what, along with the quantity of files in each category: owned by district users, owned by vendor users, and owned by users external to both domains.
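To get the per-category quantities out of the three CSV reports, a small script along these lines could help. This is only a sketch, not something we ran during the project. It assumes the owner address lands in a column named `owners.0.emailAddress` (check the header row of your own CSV and adjust), and the function names are made up for illustration.

```python
import csv
from collections import Counter

# Assumed column name: check the header row of your own report and adjust.
OWNER_COLUMN = "owners.0.emailAddress"


def count_by_owner_domain(csv_file, owner_column=OWNER_COLUMN):
    """Return a Counter of item counts keyed by the owner's email domain."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        owner = (row.get(owner_column) or "").strip().lower()
        # Rows with no owner address are grouped so they are not silently lost.
        domain = owner.rpartition("@")[2] if "@" in owner else "(unknown)"
        counts[domain] += 1
    return counts


def print_summary(paths, owner_column=OWNER_COLUMN):
    """Print a per-domain tally for each report file in paths."""
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            counts = count_by_owner_domain(handle, owner_column)
        print(path)
        for domain, total in counts.most_common():
            print(f"  {domain}: {total}")
```

For example, `print_summary(["./shareMockExternalDoms.csv"])` would list each external domain with the number of items its users own, which makes it easier to see who the vendor needs to contact first.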
For us this made it easy to figure out what needed to come next. In our case we needed the vendor to reach out to everyone who came up on the external domain report and get them to manually move their files to a temporary shared drive that we set up on our district org. This ensured that the individual ownership would be stripped off the file or folder and the district could take possession. As a Google admin, I set up a shared drive and gave all the external email addresses Contributor access to it so they could use it as a drop-off point. We tried having them transfer ownership directly from a Gmail account to an org account, and that was not possible. Using an org-created shared drive was the best way. It did not secure the files yet, but it allowed us to take possession of them. Once the external folks dropped off their files, we removed their access to the temporary shared drive. The files kept all their original ACL permissions when they were dropped off. We ran the external domain report once every two weeks and sent it out to all parties involved so we could track the progress of the files being moved, just in case things began to stall. During this project we only had to poke and prod a few times to jump-start the process. I referred to this first part as Process A.
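Tracking progress between two runs of the external domain report can be done by comparing the file IDs in each CSV. The sketch below is one hedged way to do that. It relies on the `id` column requested in the report commands, and the function names and the labels in the result are my own, not anything GAM produces.

```python
import csv


def load_ids(csv_file, id_column="id"):
    """Return the set of file IDs listed in one report."""
    return {row[id_column] for row in csv.DictReader(csv_file) if row.get(id_column)}


def compare_runs(previous_ids, current_ids):
    """Summarize how the externally owned items changed between two runs."""
    return {
        # Items in the older report that no longer show up as externally owned.
        "no_longer_listed": len(previous_ids - current_ids),
        # Items still owned by external accounts.
        "remaining": len(current_ids),
        # Items that appeared since the older report was pulled.
        "newly_listed": len(current_ids - previous_ids),
    }
```

A steady drop in `remaining` from one report to the next is the sign that the drop-offs are happening. A nonzero `newly_listed` means someone external is still creating files in the old structure.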
While Process A was happening, we were talking with the department to get an idea of who needed access to these files inside our district once we completed the move, because we wanted to restructure the file permissions. For us, setting up a few dynamic groups was the way to get this done, by writing a few CEL queries in the Google Workspace Admin console.
After Vendor A had all the external files addressed, we moved on to Process B, “the move”. We created a new shared drive on the district org and granted the appropriate users and groups access at the root of this new drive. I want to state a couple of things up front about how these move commands work. GAM does not move folders; it recreates them. That means any links to specific folders are going to change. GAM will only move files. When writing the command, we chose to leave the source folders behind in the structure so people in the department and at Vendor A could see the progress that was made. I was afraid that if the folders disappeared, the less technical folks might not be able to navigate, causing panic.
We wanted to allow duplicate files in case two items were named identically, ignore folder permissions, retain the source folders in their original location, and keep GAM from creating shortcuts to files, which it does by default. That is why you see the following at the end of the command: duplicatefiles duplicatename copysubfolderpermissions false retainsourcefolders createshortcutsfornonmovablefiles false
We ran the same command on both consoles. The district side was run by me, and the vendor hired a Google admin contractor to run the GAM scripts with me. I knew that since we had to run the command twice, I would see failures when my console tried to move files that were owned by the other org, and vice versa. That is why we opted for the createshortcutsfornonmovablefiles false option. If this switch is left off, GAM will create shortcuts for anything it can’t move. This would have left the folders and files littered with dead shortcuts.
MergeWithParent is used to take everything under the parent folder from the My Drive without the parent itself. The top folder that the My Drive share is rooted at is not moved to the shared drive; only its child folders and files are. If you had a parent folder called top folder and subfolders called child1, child2, etc., then with the MergeWithParent switch on your move command, top folder will not be created on the shared drive. GAM will just start moving child1, child2, and so on, and whatever files are in the root of your top folder will show up in the root of your shared drive. After the move your structure will look like this:
Shared Drive -> Children from top folder and files in root.
If you leave the MergeWithParent switch off, GAM will put top folder and all items below it on the drive, so it will look like this:
Shared Drive Name -> Top Folder -> Children
All your files and folders should stay in the appropriate child folders on the shared drive after the move. If your structure goes 5 or 6 folder levels deep, you should see the same structure on the other side. Just remember that the folders on the shared drive are recreated, so they will have new folder IDs after the move.
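If the two layouts are hard to picture, here is a tiny model of the path mapping described above. To be clear, this is not GAM code and does not call GAM. It is just an illustration of where an item would land with and without MergeWithParent, and the function and argument names are made up.

```python
from pathlib import PurePosixPath


def destination_path(source_path, top_folder, shared_drive, merge_with_parent):
    """Map a path under top_folder to where it would land on the shared drive."""
    relative = PurePosixPath(source_path).relative_to(top_folder)
    base = PurePosixPath(shared_drive)
    if not merge_with_parent:
        # Without the option, the top folder itself is recreated on the drive.
        base = base / PurePosixPath(top_folder).name
    return str(base / relative)


# With the switch: children land directly in the root of the shared drive.
print(destination_path("top folder/child1/key.doc", "top folder", "Shared Drive", True))
# Without the switch: the top folder is recreated first.
print(destination_path("top folder/child1/key.doc", "top folder", "Shared Drive", False))
```

The only difference between the two results is whether the top folder name appears between the shared drive and the children. Everything below that level keeps its relative position either way.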
## We ran the following command on move day:
gam redirect stdout .\movefilestosd.txt redirect stderr stdout user <put user name with edit rights to my drive and manager Access to Share Drive> move drivefile <home drive folder ID> teamdriveparentid <shared drive folder ID> mergewithparent duplicatefiles duplicatename copysubfolderpermissions false retainsourcefolders createshortcutsfornonmovablefiles false
I had some discussions with Ross Scroggs about this entire process. If you are worried about explicitly defined permissions on the Google files themselves, he made an additional update to the GAM source code to add a new parameter, movefilepermissions false. Please see the note below the command:
## Moves files and folders without keeping any explicit permissions on files or folders.
gam redirect stdout .\movefilestosd.txt redirect stderr stdout user <put user name with edit rights to my drive and manager Access to Share Drive> move drivefile <My Drive Folder ID> teamdriveparentid <teamdriveid> mergewithparent duplicatefiles duplicatename copysubfolderpermissions false retainsourcefolders createshortcutsfornonmovablefiles false movefilepermissions false
** The movefilepermissions option was added in GAM version 7.45.00
Unfortunately, the movefilepermissions option did not exist when we did the move, so we had to do something different to clean up the explicit file permissions.
Even though we did not copy the folder permissions over, we soon realized that most files had non-inherited (explicit) permissions individually defined on them. We needed to clean this up because of the amount of oversharing that was happening from the original Vendor A folder structure. Our root permissions on the shared drive were not traversing through the file structure correctly. We came up with the following to scan the files in the folder structure:
## Gets the list of explicit (non-inherited) ACLs on the shared drive:
gam redirect csv ./ExplicitShares.csv user <Manager of Share Drive> print filelist select teamdriveid <teamdriveid> fields id,name,mimetype,basicpermissions pm inherited false em pmfilter oneitemperrow
## We reviewed the data and everything looked good before going further. Please scan your CSV files carefully, looking for blank fields or formatting issues.
## Danger: the command below deletes the non-inherited explicit ACLs on these files.
gam config num_threads 20 redirect stdout .\DeleteExplicitShares.txt multiprocess redirect stderr stdout csv ExplicitShares.csv gam user ~Owner delete drivefileacl ~id id:~~permission.id~~
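Before running a bulk delete like the one above, a quick check for blank fields in the CSV can save some grief. The sketch below is one way to do it, not what we actually ran. It looks only at the three columns the delete command reads (`Owner`, `id`, and `permission.id`), and the function name is my own.

```python
import csv

# Columns referenced by the delete command (~Owner, ~id, ~~permission.id~~).
REQUIRED_COLUMNS = ("Owner", "id", "permission.id")


def find_bad_rows(csv_file, required=REQUIRED_COLUMNS):
    """Return (line_number, blank_columns) for rows with blank required fields."""
    reader = csv.DictReader(csv_file)
    missing_headers = [c for c in required if c not in (reader.fieldnames or [])]
    if missing_headers:
        raise ValueError(f"CSV is missing columns: {missing_headers}")
    problems = []
    for row in reader:
        # Short rows come back with None for absent fields, so treat None as blank.
        blank = [c for c in required if not (row[c] or "").strip()]
        if blank:
            problems.append((reader.line_num, blank))
    return problems
```

Open `ExplicitShares.csv` with `newline=""` and pass the handle to `find_bad_rows`. An empty list means every row has the fields the delete command substitutes. Anything else points at the line to inspect by hand before you run the delete.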
We found over 9,000 explicitly defined permissions on individual file ACLs and removed them. We felt good and started to hand the project off to the next team for Process C. Then they pointed out that some folks were having trouble getting to certain files. The group they were in did not have access to the specific file, and we saw an “Access removed” tab when clicking the Share button in the GUI. If you clicked the settings button on the share menu, you would have seen a check box labeled “Limit Access to <name of file here>”. We started looking at some of these instances and realized what was going on.
## We needed to test for inheritedPermissionsDisabled = true; this was how we could find that check box en masse. We wanted a way to find all those files on the shared drive so we could flip the bit back to false.
gam redirect csv ./AllLimitedItems.csv user <manager of share drive> print filelist select teamdriveid <YOUR_SHARED_DRIVE_ID> query "inheritedPermissionsDisabled = true" fields id,name,mimetype,webviewlink
## Danger: the command below makes changes. After reviewing the CSV generated above, we updated the affected files by doing the following:
gam config num_threads 20 redirect stdout ./ClearLimitedAccessResults.txt multiprocess csv ./AllLimitedItems.csv gam user <manager of share drive> update drivefile "~id" inheritedPermissionsDisabled false
After all of the above, we finally got the kinks out of the file and folder permission structure on this project. It took us over a year to get this completely organized, because we had to coordinate things and make sure there were not going to be disruptions in file access for our end users. Going forward we will maintain the structure from the top down by using dynamic groups, plus additional permissions for specific users where appropriate. We wrote a few simple CEL queries to keep these dynamic groups continually updated. If you haven’t worked with dynamic groups, I highly suggest looking into them.
I hope this was helpful to a few. Let me know if you have any questions.
Thanks,
Ed