Ok, I will clarify my previous message and answer these questions:
1) What are "234", "123" and "25" in the path to the file?
The formula itself, i.e. how the path is converted into the multi-level
structure, isn't actually that important; I don't like the existing
solution anyway. First of all it takes an MD5 hash of "<path>/<file>"
(e.g. "/system/user_files/avatars/example.jpg"). Then it sums the hex
digits together (e.g. a5f23 = 10+5+15+2+3).
Since an MD5 hash is a 32-character hexadecimal number and each digit is
at most 15, the result is between 0 and 32*15 = 480. This number becomes
the name of the first level. If the settings require more levels, the
system takes the MD5 hash of the previous hash and the scenario repeats
until the required number of levels is reached. As a result, the defined
number of folder names is generated, e.g. 234, 322, 123...
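The scheme described above could be sketched in Python like this (a rough illustration only, not actual TBA code; the function name and arguments are made up):

```python
import hashlib

def level_names(path, levels):
    """Build multi-level folder names by summing the hex digits of
    repeated MD5 hashes, as described above (illustrative sketch)."""
    names = []
    current = path
    for _ in range(levels):
        digest = hashlib.md5(current.encode()).hexdigest()
        # Sum the 32 hex digits: each is 0..15, so the sum is 0..480.
        names.append(str(sum(int(ch, 16) for ch in digest)))
        current = digest  # the next level hashes the previous hash
    return names

print(level_names("/system/user_files/avatars/example.jpg", 3))
```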
2) What do we do with the resized versions of the files?
There, in TBA, resized files are stored together with the original ones.
Since the clear event uses the proper methods to get the file path
($object->GetField('File', 'full_path')), there shouldn't be any problem
with removing them.
3) What pros and cons do I see in the system that we already have in TBA?
- too many folders at each level (theoretically up to 481, since the
digit sums range from 0 to 480), which makes such storages very hard to
move;
- user-unfriendly: you can easily browse this storage, but you will
never find anything unless you know the exact location;
- too much computation in the algorithm that generates the paths;
+ all files are stored automatically and distributed uniformly, so you
will never be bothered by the problem of a huge number of files.
Alexander Obuhovich wrote:
> MediaWiki also has code that structures uploaded images into
> sub-folders, e.g. "a", "a1" and so on. I don't know what logic is
> being used, but the purpose reminds me of what Nikita proposed.
>
> Resized file clean event is not a problem, since it can be easily
> changed to:
>
> 1. delete files in subfolders
> 2. delete the whole subfolder structure
>
>
> Maybe we could even come up with a universal solution (for uploads
> into any folders and any files), which should speed up work where a
> lot of files are uploaded.
>
>
>
> On Sun, Jan 30, 2011 at 7:35 PM, Dmitry A. <dand...@gmail.com> wrote:
>
> Hi guys,
>
>
> Nikita, great point! Also, I just wanted to make sure you've
> understood Alex's point about the "Enter" key. It's quite hard to
> read and comprehend the text when it's all in one big paragraph. To
> make your ideas simpler to understand, just start breaking them into
> smaller pieces (paragraphs) and we'll all be happy! ;)
>
> Please note that we do appreciate that you post your opinions and
> start or participate in discussions. Let's make sure it's easy
> for all of us to read and understand.
>
>
> Now back to your original idea. Yes, I do support your point, and I
> have personally come across at least 2 projects where the number of
> files in a folder became extremely high and the folder got close to
> unusable. It was Linux, but I am sure with Windows things would be
> even worse. As a matter of fact, Linux can't even delete the files
> when there are 2,000 or so of them, since the "rm" command won't
> accept that many parameters.
>
>
> Let's start with setting our *ultimate goal* by listening to each and
> every idea, and then come up with a plan for reaching it.
>
> *Goal:*
>
> Have the ability to store a large number (2,000+) of user-uploaded
> files (system/user_files) in a special folder structure, so these
> files can be easily deleted, moved and accessed.
>
>
> *Possible Solution:*
Upload new file:
Revert file:
Thanks for your input, Alex, it is very interesting.
One of the things I worry about: if we are going to use a hash function to compute the image folders every time we need to access a file, it will take much more time to execute.
I would store the full path of the image to speed up processing.
DA
We could create a base method for that, which would work in the SIMPLEST
possible way (even as simple as [1st letter of filename]/[2nd letter of
filename]). In this case:
1) developers would have the possibility to override this method and
meet their own requirements;
2) in the future we can change the base method (based on the experience
we gain using the first one) very easily;
3) we won't spend much time right now, when it is really hard to find
the best solution without having that experience.
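The simplest base method mentioned above could look something like this (an illustrative sketch; the function name and the "_" fallback for short filenames are my own assumptions):

```python
def simple_subpath(filename):
    """Sketch of the simplest scheme: [1st letter]/[2nd letter]/filename.
    Falls back to "_" when the filename is too short (assumed behaviour)."""
    first = filename[0] if len(filename) > 0 else "_"
    second = filename[1] if len(filename) > 1 else "_"
    return "%s/%s/%s" % (first, second, filename)

print(simple_subpath("example.jpg"))  # → e/x/example.jpg
```

Since it needs no hashing at all, it also addresses the concern about computing a hash on every access.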