Hamish Friedlander wrote:
> We could refactor the current system to just have pluggable filesystem backends for Files and Images without any of the rest of the design, but that doesn't solve the versioning or other issues. It also doesn't seem to get us significantly closer to the system I described (although we could always just not do that) - although it will require fixing the same "core is tied to assets" issues.

My suggestion is that this refactoring means that all of the work that you'd like to do (which is good work) is the creation of the new backend rather than the mandatory replacement of current core functionality.

What I am recommending is to design the backend API such that it is possible to make a backwards-compatible back-end that doesn't support versioning.

As a quick sketch, something like this: https://gist.github.com/anonymous/ea30f2d02f5bc8230216

It's missing a lot of methods (to check whether files exist, delete them, etc). Key points:

- GUIDs are generated by the backend, and no assumptions are made about their format (a GUID could be a filename or a SHA).
- A "relative URL" is passed to the file-creation methods (setContent and transferFile) to assist in deciding how the content should be saved.
- Deciding what the link is should be the responsibility of the back-end.

If you have those three features in your API, we'll be able to make a backend that saves to the assets directory and doesn't support versioning; specifically, it would be a mutable backend.
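(For readers without access to the gist, a rough sketch of the shape those points describe; the method names setContent, transferFile, hasFile and getLinkForGUID all appear in this thread, but the interface name and exact signatures below are assumptions, not the gist's actual code:)

    // A guess at the API shape; illustrative only.
    interface AssetBackendSketch {
        // Store raw $content; $relativeURL is a hint the backend may use
        // to decide how the content should be saved. Returns a
        // backend-generated GUID; no assumptions are made about its
        // format (could be a filename, a SHA, ...).
        public function setContent($content, $relativeURL);

        // As setContent(), but reads from a local file path.
        public function transferFile($localPath, $relativeURL);

        // Whether this backend holds content for the given GUID.
        public function hasFile($guid);

        // The backend alone decides what the public link for a GUID is.
        public function getLinkForGUID($guid);
    }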
> Specifically, my design relies on the filename being the same on all backends. Otherwise, when using cascading backends, you need to store the filename for each backend in the database. You might not even know that when the file is first uploaded - I'm imagining the filesystem -> S3 synchronisation would happen in the background, so the files will just "appear" on S3 at some point in the future, without any opportunity to query the filename or write it back to the database.

If I understand correctly, what this would mean for my API above is that the *GUIDs* would need to be consistent across backends, so that code of this form would work:

    if ($s3Backend->hasFile($guid)) {
        return $this->redirectTo($s3Backend->getLinkForGUID($guid));
    } else {
        return $this->redirectTo($filesystemBackend->getLinkForGUID($guid));
    }

Thinking about it more, I think a content SHA on its own is an inappropriate GUID. The problem is that uploading the same file into two different places is quite likely to happen from time to time, and if the download URL can't differ for those two download links, it will confuse people as to what's going on.

A simple solution would be to pack the user-expected URL into the GUID, making it something like "sha:relativeURL". I'm sure there are better solutions too, but it's worth considering this problem independently from my previous commentary about backwards compatibility.

If we're going to drop the whole notion of the Files & Images section letting you manage a hierarchy of files, where the hierarchy corresponds to the URL, that's probably a bridge too far.
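(To make the "sha:relativeURL" packing concrete, a minimal sketch; these helper functions are invented for illustration and are not part of any proposed API:)

    // Illustrative helpers for GUIDs of the form "<content-sha>:<relative-url>".
    function packGUID($sha, $relativeURL) {
        return $sha . ':' . $relativeURL;
    }

    function unpackGUID($guid) {
        // Split on the first colon only, so the relative URL is kept intact.
        list($sha, $relativeURL) = explode(':', $guid, 2);
        return array('SHA' => $sha, 'RelativeURL' => $relativeURL);
    }

Two uploads of identical content into different folders would then share a SHA but still get distinct GUIDs, so their download links can differ.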
On Thursday, 23 October 2014 16:57:24 UTC+13, Hamish Friedlander wrote:

> <snip>
>
> So I can't see an easy bridge between "the backend for Files and Images is pluggable" and "it's easy to have versioned files", or at least no easier than it would be now.
> My suggestion <snip>

> The problem is that uploading the same file into two different places is quite likely to happen from time to time.

> A simple solution would be to pack the user-expected URL into the GUID, making it something like "sha:relativeURL". I'm sure there are better solutions too, but it's worth considering this problem independently from my previous commentary about backwards compatibility.

> If we're going to drop the whole notion of the Files & Images section letting you manage a hierarchy of files, where the hierarchy corresponds to the URL, that's probably a bridge too far.
Personally, I don't see why versioning needs to be in core. I think the core API should expose enough for this functionality to be easily added as a module, which at the moment it doesn't (at least not without causing pain and torment to those involved).
Just so I understand, the File data field will store a hash, right? In which case, what would the job of the File DataObject be?
As is already being discussed on GitHub, I've been working on the abstraction of the filesystem backend, which will solve one of the issues here. I'm also going to talk to Hamish at some point in the future to ensure we don't cause any conflicts and that we're both heading in the same direction. Any decisions that come out of that will be relayed back to the dev list for discussion.
I am suggesting that the hierarchy visible in Files & Images isn't replicated anywhere else - it's in the database only.
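(A minimal sketch of what that could look like; the class and field names below are invented for illustration and are not the real File schema:)

    // Illustrative only: the Files & Images tree is plain database data,
    // while the storage backend only ever sees an opaque key.
    class SketchFile extends DataObject {
        private static $db = array(
            'Name'       => 'Varchar(255)',
            'StorageKey' => 'Varchar(255)', // GUID issued by the backend
        );
        private static $has_one = array(
            'Parent' => 'SketchFolder', // hierarchy lives in the DB only
        );
    }

Moving a file in the tree then only updates ParentID; no rename or re-sync is needed on any backend.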
> "Just replicate exactly what we’ve got, except it can go in S3 too" would be an improvement over the current system, but there's no particular technical difficulty in it, it's just work.
Something shouldn’t need to be technically difficult before it gets solved.
> And we’re basically *only* discussing CMS here, since the File class and the Files & Images section are in CMS.
File, Folder, Filesystem and Image are all in framework, not CMS. AssetAdmin is the only part in the CMS.
Awesome discussion! Somebody (Hamish?) should summarise this into a design doc soon though, it becomes quite time-intensive to read ;)

Hamish wrote:
> By just storing by key, there's no difficulty keeping multiple backends in sync. When using multiple servers and storing locally, rsyncing between servers will never raise a clash. And we can just ask each registered backend in turn "got this key?", so when we're using S3 we can serve from S3 once the async replication has uploaded it, and off a local filesystem before then.

I like the idea of "cascading backends", but I can't see from the above discussion whether the design also considers parallel backends. For example, you might want to store all images uploaded by content editors on S3, but get documents they reference out of their own read-only document management system (DMS). This could be solved by adding a backend type to each File record. While you could go and ask each backend for availability of this key, I'd say that's a performance issue, particularly for remote backends where this check is a relatively slow HTTP call.
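(As a concrete sketch of the cascading lookup being described, reusing the hasFile()/getLinkForGUID() calls from earlier in the thread; the function itself is invented for illustration:)

    // Ask each registered backend in turn "got this key?", in priority
    // order: e.g. S3 first once async replication has uploaded the file,
    // with the local filesystem as the fallback before then.
    function linkForGUID(array $backends, $guid) {
        foreach ($backends as $backend) {
            if ($backend->hasFile($guid)) {
                return $backend->getLinkForGUID($guid);
            }
        }
        return null; // no backend holds this file yet
    }

    // e.g. $link = linkForGUID(array($s3Backend, $filesystemBackend), $guid);

This is exactly where the performance concern above bites: every hasFile() miss against a remote backend costs an HTTP round-trip, which is the argument for recording a backend type (or caching lookups) per File record.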
The case of a DMS backend might be a misuse of the API though, since other consumers are expected to contribute to the file repository there (create, remove and change keys as well as content), causing sync issues. Do we expect to be the only consumer and contributor to repositories used by the File API?

We haven't discussed what happens to the Folder API much in here. I guess it would be only one way to structure file records, while others might rely on HTTP calls to list files and traverse hierarchies. Yet others might rely on non-hierarchical local organisation through tags (many-many). So far I haven't seen anything in the presented interfaces that would contradict these use cases, but we should consider these aspects when designing, right?
"Just replicate exactly what we've got, except it can go in S3 too" would be an improvement over the current system, but there's no particular technical difficulty in it, it's just work.However I can't see any way to solve versioning or replication across backends without some core support. So I'd oppose any solution that didn't at least have a clear idea how to allow those problems to be solved.