Perkeep next

michaeljafarr

Jan 19, 2023, 8:27:24 PM
to Perkeep
I have some questions for the Perkeep community to gauge interest in a recalibration of Perkeep's scope.

Perkeep is the most complete system that understands content-addressed storage (CAS) and file indexing. Its schema/camliTypes and related signing systems are simple yet allow great extensibility. The CLI, importers, mobile/web apps, encryption and redundant syncing systems are all non-trivial in their own right and have been implemented with the future in mind. I've looked across a range of other systems (IPFS, Syncthing, Seafile, Gluster, Upspin, Filestash, etc.), and Perkeep is by far the best match for what I am looking for.

Some file/data management scenarios are not served well by any existing solution. I have some high-level suggestions below, but I'd like to hear your general views on whether there is a need for more features in the personal data management space that Perkeep inhabits. Do people in the Perkeep community see any opportunities to extend Perkeep's core purpose or scope in general?

There are lots of great things in Perkeep, and there are many fundamental things that should never change:
 - the beliefs described on perkeep.org
 - content-addressed storage
 - objects, not files, plus the claims and schema system, etc.
 - the indexing system - how it is managed, extended etc.
 - the ecosystem, including importers and the UIs
 - focus on open formats and protocols

My thoughts on a new capability for Perkeep include the following two things, hopefully with minimal impact on existing components. Neither would be trivial, but they could increase the reach of this platform.
  1. Keep some form of metadata on all data everywhere, even where the data itself isn't stored in blob storage. Allow users to see this data in a familiar file hierarchy and control smart-ish import rules for these files via a Perkeep management UI. If nothing else, this one thing would lower the threshold for those wishing to start using Perkeep.
  2. Include additional rules for classifying, tagging and handling data, based on contextual information about where the 'file' came from.

I'd guess that the typical responses would be a combination of these things (hopefully more of the last one):
  1. Perkeep already covers the essential features; if not, then Upspin et al. do.
  2. This will make the system so much more complicated that it will become unmaintainable or unmarketable.
  3. This might help grow the user community, leading to more significant contributions and general support.

I've kept this brief because there is a good chance that Perkeepers may not be motivated to change Perkeep - no blame if that is the case.  However, if there is some interest in discussing this topic, I'd like to be involved.  I have ideas for integrating such new functionality without polluting the existing components and related new formats and protocols.  What are your thoughts on expanding the scope of Perkeep to include new features or capabilities? Are there any specific areas where you would like to see Perkeep grow?

Cheers
Mike

Jim DeLaHunt

Jan 20, 2023, 3:16:54 AM
to per...@googlegroups.com

Hello, Michael:

I am Jim DeLaHunt. I recently discovered Perkeep. A lot of Perkeep's fundamentals resonate with me, as they do with you. Like you, I would like some scope extensions to address my archiving requirements. But the scope extensions I have my eye on may differ slightly from yours.

On 2023-01-19 17:20, michaeljafarr wrote:
My thoughts on a new capability for Perkeep include the following two things, hopefully with minimal impact on existing components. Neither would be trivial, but they could increase the reach of this platform.
  1. Keep some form of metadata on all data everywhere, even where the data itself isn't stored in blob storage. Allow users to see this data in a familiar file hierarchy and control smart-ish import rules for these files via a Perkeep management UI. If nothing else, this one thing would lower the threshold for those wishing to start using Perkeep.
  2. Include additional rules for classifying, tagging and handling data, based on contextual information about where the 'file' came from.

My biggest requirement that seems unmet by Perkeep is to archive files and directory trees as files and directory trees, rather than dissolving them into the blob store. Part of what I want to preserve is rich file system metadata from past file systems, most notably resource forks from older Mac OS HFS+ file systems, and extended attributes from present-day APFS file systems. I also want to be able to archive a software source code directory tree with its filenames, structural relationships, and timestamps intact.

A way to do this is to track metadata for files and directory trees, without dissolving the directory trees themselves.

I have year-based directories of bits going back 35+ years. I have directory trees of photographs. I want to index, manage, and explore that content, but I do not want to do anything to modify those files themselves.

However, I am very new to Perkeep. I don't yet have a clear idea how much of this Perkeep does in fact do, and what represents new scope.

Best regards,
     —Jim DeLaHunt

-- 
    --Jim DeLaHunt, jd...@jdlh.com     http://blog.jdlh.com/ (http://jdlh.com/)

Michael Farr

Jan 21, 2023, 4:46:04 AM
to per...@googlegroups.com
Hi Jim,

Directory structure features are what I am looking for as well. Perkeep can import a directory structure as a permanode, which is sometimes OK for archiving. I think it's an area where people could struggle with Perkeep, and I'd like to make that easier without changing the core structure of what Perkeep does.

It would be great if there were a comprehensive hierarchy UI that helped to both attach attributes and manage items in a familiar way - possibly that alone would be sufficient for your needs?

Also, there are a few patterns for how directories can be interpreted into attributes. I'm still looking for easier ways to do these things:
 - As you mentioned, the capture date can be inferred from the directory name (or some other attribute) when it isn't in the file's internal metadata.
 - Client folder structures. Besides using folders to infer metadata, you may want extra steps around client data before it is imported into Perkeep.
 - Git folders or other folders that are synced to some cloud structure. Arguably, synced git folders should never go in Perkeep.
 - Items in 'system' folders. Ideally, Perkeep could be pointed at a root drive and asked to import everything that wasn't a program or system file.
 - Extended attributes, as you mentioned. There isn't one answer for how those attributes should flow into child files/items within Perkeep.
 - etc.
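
Two of the patterns above (a capture year taken from a directory name, and never importing git folders) can be sketched as simple path rules. This is a minimal Go sketch under stated assumptions: the `Hint` type and the `ClassifyPath` function are hypothetical names invented here, not Perkeep APIs, and the rule "a four-digit directory name within a plausible range is a year" is only one possible heuristic.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strconv"
	"strings"
)

// Hint is a hypothetical result of applying path-based rules to one file.
type Hint struct {
	Skip        bool // true if the file should not be imported at all
	CaptureYear int  // 0 if no year could be inferred from the path
}

// ClassifyPath applies two illustrative rules to a relative path:
// anything under a ".git" directory is skipped, and the first directory
// whose name is a four-digit number between minYear and maxYear is
// treated as the capture year.
func ClassifyPath(rel string, minYear, maxYear int) Hint {
	parts := strings.Split(filepath.ToSlash(rel), "/")
	var h Hint
	// Only directory components are inspected; the last element is the file name.
	for _, dir := range parts[:len(parts)-1] {
		if dir == ".git" {
			return Hint{Skip: true}
		}
		if h.CaptureYear == 0 && len(dir) == 4 {
			if y, err := strconv.Atoi(dir); err == nil && y >= minYear && y <= maxYear {
				h.CaptureYear = y
			}
		}
	}
	return h
}

func main() {
	for _, p := range []string{
		"archive/1998/letters/draft.txt",
		"src/project/.git/config",
		"photos/misc/img001.jpg",
	} {
		fmt.Printf("%-32s %+v\n", p, ClassifyPath(p, 1980, 2100))
	}
}
```

In practice the rules would need to be user-configurable, since a folder named `2001` could just as easily be a project number as a year.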

Thanks for your input. 
Mike
