1. We're looking for an easy and efficient way to store blobs (arrays of
bytes) in some sort of circular queue and return a key someone can use for
later access to that blob. The blobs can have varying lengths. I was
wondering whether sparse files would be a reasonable approach for this,
since from what I gather they are an efficient mechanism for storing
messages of varying length.
I guess you could also store each blob in its own file, but then I think the
overhead of creating a file per blob (there may be millions) might be costly.
I could also store all the blobs in a single "normal" (non-sparse) file, but
then the housekeeping of walking the chain of blobs (most likely I'll need
to chain them if I don't use a sparse file) might be costly in terms of
performance, and it also adds to the code I'll have to write. Though I guess
NTFS keeps its own list of the sections of the file that contain data.
2. If I want to copy a sparse file to other Windows machines running NTFS,
do I need to write my own code to do that, or does CopyFile() handle sparse
files by copying only the parts of the file which contain data, so that the
copied file is exactly the same as the source file?
--
Thanks,
Nick
nickno...@community.nospam
remove "nospam" change community. to msn.com
For the first question, about an easy and efficient way to store blobs, I
need some time to think it over more carefully and will report back as soon
as possible.
For the second question, about copying a sparse file: based on my tests,
neither xcopy nor robocopy
(http://technet.microsoft.com/en-us/library/cc733145.aspx) keeps the sparse
attribute on the target file. The CopyFile API does not handle sparse files
well either, so we need to write code or use a third-party solution. Sample
code is provided by FlexHex at
http://www.flexhex.com/docs/articles/sparse-files.phtml
At the bottom of that page, you will find the source code of a utility that
copies sparse files.
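For reference, the technique such a utility uses can be sketched with the Win32 sparse-file controls: mark the destination sparse, ask NTFS for the source's allocated ranges with FSCTL_QUERY_ALLOCATED_RANGES, and copy only those ranges. This is an untested, Windows-only sketch, not the FlexHex code itself; it assumes at most 64 allocated ranges and omits error handling.

```c
#include <windows.h>

/* Copy only the allocated ranges of an open sparse file (hSrc) into hDst.
   Sketch only: assumes <= 64 allocated ranges and elides error checks. */
BOOL CopySparse(HANDLE hSrc, HANDLE hDst, LONGLONG srcSize)
{
    DWORD bytes;

    /* Make the destination sparse so unwritten regions become holes. */
    DeviceIoControl(hDst, FSCTL_SET_SPARSE, NULL, 0, NULL, 0, &bytes, NULL);

    /* Ask NTFS which byte ranges of the source actually hold data. */
    FILE_ALLOCATED_RANGE_BUFFER query, ranges[64];
    query.FileOffset.QuadPart = 0;
    query.Length.QuadPart = srcSize;
    DeviceIoControl(hSrc, FSCTL_QUERY_ALLOCATED_RANGES,
                    &query, sizeof(query), ranges, sizeof(ranges),
                    &bytes, NULL);

    DWORD n = bytes / sizeof(FILE_ALLOCATED_RANGE_BUFFER);
    for (DWORD i = 0; i < n; i++) {
        /* Position both files at the range, then copy it in chunks. */
        SetFilePointerEx(hSrc, ranges[i].FileOffset, NULL, FILE_BEGIN);
        SetFilePointerEx(hDst, ranges[i].FileOffset, NULL, FILE_BEGIN);
        LONGLONG left = ranges[i].Length.QuadPart;
        BYTE buf[65536];
        while (left > 0) {
            DWORD chunk = (left > (LONGLONG)sizeof(buf))
                              ? (DWORD)sizeof(buf) : (DWORD)left;
            DWORD got;
            ReadFile(hSrc, buf, chunk, &got, NULL);
            WriteFile(hDst, buf, got, &got, NULL);
            left -= got;
        }
    }

    /* Extend the destination to the full logical size so trailing holes survive. */
    LARGE_INTEGER end; end.QuadPart = srcSize;
    SetFilePointerEx(hDst, end, NULL, FILE_BEGIN);
    SetEndOfFile(hDst);
    return TRUE;
}
```

Note that the ranges NTFS reports are cluster-aligned, so the copy preserves where the data is, but the exact on-disk allocation of the target may still differ slightly from the source.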
Regards,
Jialiang Ge (jia...@online.microsoft.com, remove 'online.')
Microsoft Online Community Support
"nickdu" <nickno...@community.nospam> wrote in message
news:ABA4112C-311F-48C1...@microsoft.com...
You didn't say anything about blob sizes, persistency, etc.
"nickdu" <nickno...@community.nospam> wrote in message
news:ABA4112C-311F-48C1...@microsoft.com...
long GetBlob(int index, byte *buffer, long bytes)
{
    // Use 64-bit math: (index + 1) * 1GB overflows 32-bit arithmetic
    Seek(sparse, ((int64)index + 1) * 1GB);
    long length;
    Read(sparse, &length, sizeof(length));
    return Read(sparse, buffer, min(bytes, length));
}
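Fleshed out in standard C, the same indexing scheme might look like this. The slot size is shrunk to 1 MiB and the stream is kept in a global, both purely for brevity of the sketch; error handling is minimal:

```c
#define _FILE_OFFSET_BITS 64   /* keep off_t 64-bit on 32-bit platforms */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define SLOT_SIZE ((int64_t)1 << 20)   /* 1 MiB per slot; the post proposes 1 GB */

static FILE *sparse;                   /* opened elsewhere, e.g. with "r+b" */

/* Write blob #index: a length prefix followed by the data.
   Slot 0 is reserved for metadata, hence the (index + 1). */
long PutBlob(int index, const unsigned char *data, long bytes)
{
    int64_t off = ((int64_t)index + 1) * SLOT_SIZE;
    if (fseeko(sparse, off, SEEK_SET) != 0) return -1;
    long length = bytes;
    fwrite(&length, sizeof length, 1, sparse);
    return (long)fwrite(data, 1, (size_t)bytes, sparse);
}

/* Read blob #index into buffer; returns the number of bytes copied. */
long GetBlob(int index, unsigned char *buffer, long bytes)
{
    int64_t off = ((int64_t)index + 1) * SLOT_SIZE;
    if (fseeko(sparse, off, SEEK_SET) != 0) return -1;
    long length = 0;
    if (fread(&length, sizeof length, 1, sparse) != 1) return -1;
    long n = length < bytes ? length : bytes;
    return (long)fread(buffer, 1, (size_t)n, sparse);
}
```

On NTFS the untouched gaps between slots stay unallocated once the file is marked sparse, which is exactly the property the post is counting on.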
Otherwise if you don't use sparse files and put one blob right after the
next, then you either need to maintain an index in the file or you need to
read the list each time you want to retrieve a blob.
Most likely I'll use the first blob to store some info, like the next free
index.
I indicated the blob sizes are variable and I can't pick a max. Well,
actually I am picking a max, but it will be an unreasonably large value,
much larger than anyone should need. That's the beauty of using sparse
files, or so I gather. There must be some reason the OS chose sparse files
for the NTFS change journal.
By the way, I don't see anything wrong with simply storing an index at the
start of the file. That will also add persistence. Any overhead would be no
greater than the filesystem's overhead for sparse-file access; actually, it
would be much less.
"nickdu" <nickno...@community.nospam> wrote in message
news:AA991457-B503-4C1E...@microsoft.com...
The Change Journal needs something like a sparse file to store and read
its entries as fast as possible using pre-allocated slack space.
With a circular-queue method, you can discard old entries when writing
new ones.
Have you considered compression? This might be more reasonable today,
given that its speed and overhead can be negligible.
Have you considered the scatter/gather I/O functions? These may apply
well to circular queues too, where the queue has a pre-defined size. In
that case it may be a queue of buffer pointers, which can be passed to
the WriteFileGather() and ReadFileScatter() functions.
Also, you might want to consider using your own B-tree index system
with variable-size data records (this is what we do). Each record has
a metadata block header that defines its length (and also contains
magic numbers for integrity checking).
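A header along those lines can be tiny; the field names and the magic value below are illustrative, not a prescribed format:

```c
#include <stdint.h>

#define BLOB_MAGIC 0xB10BB10Bu   /* arbitrary sentinel for integrity checks */

/* Fixed-size header written in front of every variable-size record. */
typedef struct {
    uint32_t magic;    /* must equal BLOB_MAGIC or the record is torn/garbage */
    uint32_t length;   /* payload bytes that follow the header */
} BlobHeader;

/* Validate a header read back from disk before trusting its length field. */
int header_ok(const BlobHeader *h, uint32_t max_len)
{
    return h->magic == BLOB_MAGIC && h->length <= max_len;
}
```

Rejecting any record whose magic or length is implausible is what lets a reader walk the file after a crash without wandering into garbage.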
You can also look at SQLite, whose current version is said to be pretty
fast for blobs; it uses a B-tree concept on the back end. If yours is
simply a back-end application (not serving users or multiple clients),
SQLite is something many of the top applications use as a fast,
lightweight, SQL-ready framework for single-application storage,
configuration, etc. Google Apps, Thunderbird, and Firefox all use SQLite
for their configuration and data storage.
--
The indexing would look something like:
GetBlob(int index, byte *bytes, int size)
{
    Seek(_sparse, ((int64)index + 1) * 1GB, SEEKOFFSET_ORIGIN);
    Read(_sparse, &length, sizeof(length));
    Read(_sparse, bytes, min(length, size));
}
Most likely I would use the first block to store information, like the next
free block. Other than that, the indexing would be as shown above, just
like indexing an array. That's the beauty of sparse files, right? You can
be wasteful and pick a huge block size, and the OS will only consume the
space you actually use. Of course the OS must do some bookkeeping, but
better the OS than me.
The NTFS change journal uses a sparse file, so I assume it saw a benefit
in doing so.
The blob size, as I mentioned, is variable. I would pick an upper bound way
bigger than I can conceive anyone using and choose that as the slot size. It
allows very efficient indexing without wasting space; at least, that's what
I gather from what I've read about sparse files. Also, the NTFS change
journal uses sparse files, so there must be some benefit, right?
What do you mean by persistency? Data will be persisted to the sparse file.
I will most likely have to age things out, though not with an LRU scheme, as
that would require keeping some sort of LRU list, which would complicate
things. I would probably just make the sparse file circular and pick an
upper bound on the index. When it wraps, I would end up writing over the
first blob.
This points out another thing I might have to do: write the index into the
slot so I can tell when data has been overwritten. If someone asks for
index x and the slot for index x contains data written for index y, I can
return an error. In this case, index x modulo the size of the circular
buffer equals index y modulo the size of the circular buffer.
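That overwrite check can be sketched with a small in-memory model; the capacity and names below are made up for illustration:

```c
#include <stdint.h>

#define QUEUE_SLOTS 8u   /* tiny illustrative capacity */

/* Each slot remembers the absolute index of the blob it currently holds.
   (On disk this would be the index stored in the slot's header; the
   ambiguity of the zero-initialized slot 0 is ignored for brevity.) */
static uint64_t slot_index[QUEUE_SLOTS];

/* Map an ever-growing absolute index onto a slot in the circular file. */
static uint32_t slot_of(uint64_t index) { return (uint32_t)(index % QUEUE_SLOTS); }

/* Writing blob #index claims its slot, overwriting whatever was there. */
void put_index(uint64_t index) { slot_index[slot_of(index)] = index; }

/* A lookup succeeds only if the slot still holds the requested index;
   after a wrap, the stored index differs and the caller gets an error. */
int get_index_ok(uint64_t index) { return slot_index[slot_of(index)] == index; }
```

Once a later index y with y % QUEUE_SLOTS == x % QUEUE_SLOTS has been written, the lookup for x fails, which is exactly the error case described above.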
BTW: this pseudocode is inefficient and can be improved by using
page-size-aligned reads. The seek is unnecessary and will preclude multiple
readers; use overlapped I/O instead. The two reads can be consolidated for
small blobs (less than one buffer), while large blobs will require multiple
reads. All of this is true regardless of how the file(s) that store the
data are arranged.
Also, storing metadata creates the possibility of file corruption. Consider
the case of process termination during a blob insert: if the metadata is
updated and the blob data isn't, garbage will be returned when the app asks
for that blob; if the metadata isn't updated, the insert will be lost
completely; and, worst of all, if a partial metadata update is made, the
whole file might become unreadable. These risks can be mitigated by using
FILE_FLAG_NO_BUFFERING + FILE_FLAG_WRITE_THROUGH, but you must write code
that can handle the consequences of corruption and/or check for these
conditions and try to correct them. This is something else that the file
system does for you directly.
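One sketch of making torn inserts at least detectable, in the spirit of the warning above: write the payload first and a commit marker last, so a reader can reject a record whose marker never reached the file. The function names and marker value are invented for this example, and fflush() only pushes data to the OS; it is not the durability that FILE_FLAG_WRITE_THROUGH provides.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define COMMIT_MARK 0xC0FFEEu   /* written last; absent => record is torn */

/* Append: length, payload, then the commit marker last, so termination
   mid-insert leaves a record the reader can recognize as incomplete. */
int append_record(FILE *f, const void *data, uint32_t len)
{
    uint32_t mark = COMMIT_MARK;
    if (fwrite(&len, sizeof len, 1, f) != 1) return 0;
    if (fwrite(data, 1, len, f) != len) return 0;
    fflush(f);                    /* payload handed to the OS first... */
    if (fwrite(&mark, sizeof mark, 1, f) != 1) return 0;
    fflush(f);                    /* ...then the marker commits it */
    return 1;
}

/* Read one record at the current position; returns its length,
   or 0 if the record is torn or malformed. */
int read_record(FILE *f, void *out, uint32_t max)
{
    uint32_t len, mark;
    if (fread(&len, sizeof len, 1, f) != 1 || len > max) return 0;
    if (fread(out, 1, len, f) != len) return 0;
    if (fread(&mark, sizeof mark, 1, f) != 1 || mark != COMMIT_MARK) return 0;
    return (int)len;
}
```

This only detects a lost insert; recovering from it (truncating the tail, resuming the queue) is still code the application has to write, which is the poster's point about the filesystem doing this for you.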
"nickdu" <nickno...@community.nospam> wrote in message
news:A3F5E9E5-3FC4-49A7...@microsoft.com...
As it was pseudocode I provided, I was not too concerned with performance.
"nickdu" <nickno...@community.nospam> wrote in message
news:E9493378-5CA5-4FE8...@microsoft.com...
Do these references persist across restarts? If so, then how do you expect
them to be stored? As part of a larger, already existing entity?
It was not the specifics of your pseudocode that I was responding to, but
the nature of it and the clear indications that you are not considering the
full scope of the problems you will need to solve for this solution to be
fully functional and reliable. However, as it is obvious that I can't
dissuade you from your purpose, I have endeavoured to point out some of the
problems that you will face.
The main advantage of this design is eliminating the need to call CreateFile
each time a new blob is written or retrieved, by keeping a single data file
always open. Depending on your data rate this overhead is likely
insignificant, but if your data rate is really high then you may need to
follow this approach. Of course, the primary reason to use a sparse file is
to simplify the code at the expense of I/O performance, and that would seem
counterproductive in this case. Also, the code complexity added by keeping a
blob-to-block map rather than a static offset is not significant compared
with the complexity required to prevent file corruption during hardware or
OS failures.
And hence my original question: what is it about your program that makes it
possible to do this better than the filesystem?
"nickdu" <nickno...@community.nospam> wrote in message
news:3E3FF996-50D7-4179...@microsoft.com...
In addition, I have the ability to enumerate all the blobs in the store.
I won't need a blob-to-block map: when someone asks for the blob at
index 5, I seek to location 5 * 1GB, read the length of the blob, read the
blob, and return it to them. That's it.
The goal was to reduce the complexity of my code. The OS already gives me
this feature for free, so why not use it? The Change Journal seems to have
thought it was a good feature, and I have the same sort of requirement (I
think): a variable blob size and a wish not to waste a lot of disk space.
So I set an arbitrarily large slot size and let the OS handle figuring out
what space to consume.
Of course, I want to verify that the OS implementation is good and that my
requirements are such that using sparse files makes sense, which is why I'm
asking the question here.
"nickdu" <nickno...@community.nospam> wrote in message
news:CAFD464B-8725-41F4...@microsoft.com...
"nickdu" <nickno...@community.nospam> wrote in message
news:97AD404E-8038-404A...@microsoft.com...
I'm not stuck on this implementation at all. From what I know of the sparse
file technology, it seems to address some of my requirements. It's also used
by the OS to implement the Change Journal, so I assume the implementation is
decent. I would rather not write something myself if the OS has already
implemented it for me; doing so would only increase the time to market and
the amount of code I would have to maintain.