use for streaming archives


Ariel Feinerman

Mar 22, 2012, 9:13:43 AM
to libarchive-discuss
Hi,

I want to use libarchive to decompress and then unarchive a stream of bytes
coming from a web connection, and write the result to a file on the fly,
block by block.

I can call archive_read_open(a, mydata, NULL, myread, myclose);

but what then?

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {

    // Append the new data to receivedData.
    [_receivedData appendData:data];

    if ([_receivedData length] >= MAX) {
        // How do I hand this data to libarchive?
    }
}

Tim Kientzle

Mar 23, 2012, 2:15:40 AM
to libarchiv...@googlegroups.com

This can be done pretty cleanly with two threads.

In one thread:
* Your code will call archive_read_next_header() and archive_read_data() to get entries from the archive.
* Libarchive will call myread() when it needs more bytes.
* myread() will block while waiting for more bytes from a queue data structure.

In a separate thread:
* Your web connection will provide received bytes to didReceiveData
* didReceiveData will store them in a queue
* The queue will wake up the thread with myread()

I haven't worked with it, but GCD (Grand Central Dispatch) is supposed
to have some very nice data stream queue handling that supports exactly
this type of structure.
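
The two-thread structure above can be sketched in plain C with POSIX threads. This is only a minimal sketch under assumptions: `byte_queue`, `chunk`, `queue_init`, `queue_push` and `queue_close` are hypothetical names invented here, not libarchive API, and `struct archive` is only forward-declared so the fragment compiles without `archive.h`. The network thread would call `queue_push()` for each received block and `queue_close()` when the download ends; `myread()` has the shape of a libarchive read callback and blocks until data arrives.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

struct archive;  /* opaque here; a real program would #include <archive.h> */

/* One received chunk of bytes (hypothetical helper type). */
struct chunk {
    struct chunk *next;
    size_t len;
    unsigned char data[];
};

/* Hypothetical queue shared by the network thread and the libarchive thread. */
struct byte_queue {
    pthread_mutex_t lock;
    pthread_cond_t ready;
    struct chunk *head, *tail;
    struct chunk *current;   /* chunk last handed to libarchive */
    int closed;              /* set once the download has finished */
};

void queue_init(struct byte_queue *q)
{
    memset(q, 0, sizeof(*q));
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->ready, NULL);
}

/* Network thread: call for every block the connection delivers. */
int queue_push(struct byte_queue *q, const void *bytes, size_t len)
{
    if (len == 0)
        return 0;            /* a zero-length block would look like end of data */
    struct chunk *c = malloc(sizeof(*c) + len);
    if (c == NULL)
        return -1;
    c->next = NULL;
    c->len = len;
    memcpy(c->data, bytes, len);

    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = c;
    else
        q->head = c;
    q->tail = c;
    pthread_cond_signal(&q->ready);   /* wake the thread waiting in myread() */
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Network thread: call once when the connection has finished. */
void queue_close(struct byte_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
}

/* libarchive thread: read callback; blocks until a chunk or the end arrives. */
ssize_t myread(struct archive *a, void *client_data, const void **buff)
{
    struct byte_queue *q = client_data;
    (void)a;

    free(q->current);        /* the previously returned block is done with */
    q->current = NULL;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL && !q->closed)
        pthread_cond_wait(&q->ready, &q->lock);
    struct chunk *c = q->head;
    if (c != NULL) {
        q->head = c->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);

    if (c == NULL)
        return 0;            /* closed and drained: end of data */
    q->current = c;
    *buff = c->data;
    return (ssize_t)c->len;
}
```

With libarchive linked in, the queue would be passed as the client data, along the lines of the archive_read_open(a, mydata, NULL, myread, myclose) call from the original question. Each chunk is copied on push so that the block handed out stays valid until the next call to myread().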

Good luck!

Tim

Ariel Feinerman

Mar 23, 2012, 5:58:48 AM
to libarchive-discuss
Tim,

Thank you for your answer! Sorry, but could you explain by giving a
sample? I have not worked with libarchive before at all.
I will use NSThread or NSOperation for more control.

Joerg Sonnenberger

Mar 23, 2012, 10:34:33 AM
to libarchiv...@googlegroups.com
On Thu, Mar 22, 2012 at 2:13 PM, Ariel Feinerman <ariel...@gmail.com> wrote:
> I want to use libarchive to decompress and then unarchive a stream of bytes
> coming from a web connection, and write the result to a file on the fly,
> block by block.
>
> I can call archive_read_open(a, mydata, NULL, myread, myclose);
>
> but what then?

An example of doing this can be found in

http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/pkgtools/pkg_install/files/lib/pkg_io.c?rev=1.11&content-type=text/x-cvsweb-markup&only_with_tag=MAIN

Look for open_archive_by_url. It uses libfetch, an alternative
to curl. libfetch provides a FILE *-like interface with buffered read and
write, but that's not relevant for this purpose. The interesting part
here is the use of the open, read and close hooks. After that, you can use
the normal archive_read API to process the archive as if it were local.
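
The three hooks can be sketched as below. This is not the pkg_io.c code: it is a minimal sketch that uses a stdio FILE * as a stand-in for the libfetch handle, and `stream_state`, `stream_open`, `stream_read` and `stream_close` are hypothetical names chosen for illustration. `struct archive` is only forward-declared so the fragment compiles without `archive.h`; the numeric return values mirror libarchive's ARCHIVE_OK (0) and ARCHIVE_FATAL (-30).

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

struct archive;  /* opaque here; a real program would #include <archive.h> */

/* Hypothetical per-stream state handed to libarchive as client data. */
struct stream_state {
    const char *path;        /* stand-in for a URL */
    FILE *fp;                /* stand-in for the network handle */
    unsigned char buf[4096]; /* block handed out on each read */
};

/* Open hook: acquire the stream. 0 is ARCHIVE_OK, -30 is ARCHIVE_FATAL. */
int stream_open(struct archive *a, void *client_data)
{
    struct stream_state *s = client_data;
    (void)a;
    s->fp = fopen(s->path, "rb");
    return s->fp != NULL ? 0 : -30;
}

/* Read hook: fill the buffer, point *buff at it, return the byte count. */
ssize_t stream_read(struct archive *a, void *client_data, const void **buff)
{
    struct stream_state *s = client_data;
    (void)a;
    size_t n = fread(s->buf, 1, sizeof(s->buf), s->fp);
    if (n == 0 && ferror(s->fp))
        return -1;           /* a negative value reports an error */
    *buff = s->buf;
    return (ssize_t)n;       /* 0 reports the end of the data */
}

/* Close hook: release the stream. */
int stream_close(struct archive *a, void *client_data)
{
    struct stream_state *s = client_data;
    (void)a;
    if (s->fp != NULL) {
        fclose(s->fp);
        s->fp = NULL;
    }
    return 0;
}
```

In a real program these would be registered with archive_read_open(a, &state, stream_open, stream_read, stream_close), after which archive_read_next_header() and the other archive_read calls work as they do for a local file.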

Joerg

Tim Kientzle

Mar 24, 2012, 4:01:07 AM
to libarchiv...@googlegroups.com

On Mar 23, 2012, at 2:58 AM, Ariel Feinerman wrote:

> Tim,
>
> Thank you for your answer! Sorry, but could you explain by giving a
> sample? I have not worked with libarchive before at all.
> I will use NSThread or NSOperation for more control.

Joerg gave a pointer to one example.

There are many more in the Wiki Examples page:

https://github.com/libarchive/libarchive/wiki/Examples

I can't specifically help you with NSThread or NSOperation;
I haven't programmed Mac OS in quite a few years.

Tim

Ariel Feinerman

Mar 24, 2012, 9:44:01 AM
to libarchive-discuss
I mean: how do I work with an incomplete chunk of bytes from NSData?

On Mar 24, 11:01 am, Tim Kientzle <t...@kientzle.com> wrote:
> Joerg gave a pointer to one example.
>
> There are many more in the Wiki Examples page:
>
>    https://github.com/libarchive/libarchive/wiki/Examples
Those examples work with whole memory buffers or files.

Tim Kientzle

Mar 26, 2012, 11:49:33 AM
to libarchiv...@googlegroups.com
Libarchive takes care of that for you. You give it whatever chunks you have and it merges them if necessary.
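
To make the idea of merging concrete, here is a toy model of the consumer side. It is emphatically not libarchive's actual code, only a sketch of how a reader can assemble a fixed number of bytes from blocks of whatever size the callback happens to return. `piece_source`, `piece_read`, `merger` and `merge_read` are hypothetical names, and `struct archive` is only forward-declared.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

struct archive;  /* opaque here; a real program would #include <archive.h> */

/* Hypothetical source that delivers data in uneven, non-empty pieces. */
struct piece_source {
    const char **pieces;   /* array of NUL-terminated pieces */
    size_t count;          /* number of pieces */
    size_t next;           /* index of the next piece to deliver */
};

/* Read callback: hands out one piece per call, whatever its size. */
ssize_t piece_read(struct archive *a, void *client_data, const void **buff)
{
    struct piece_source *src = client_data;
    (void)a;
    if (src->next == src->count)
        return 0;                       /* end of data */
    *buff = src->pieces[src->next];
    return (ssize_t)strlen(src->pieces[src->next++]);
}

/* Toy consumer state: bytes left over from the last block. */
struct merger {
    const unsigned char *pending;
    size_t pending_len;
};

/* Collect exactly `want` bytes into `out`, pulling more blocks as needed.
 * Returns the number of bytes collected (less than `want` only at the end). */
size_t merge_read(struct merger *m, void *client_data, void *out, size_t want)
{
    size_t got = 0;
    while (got < want) {
        if (m->pending_len == 0) {
            const void *block = NULL;
            ssize_t n = piece_read(NULL, client_data, &block);
            if (n <= 0)
                break;
            m->pending = block;
            m->pending_len = (size_t)n;
        }
        size_t need = want - got;
        size_t take = need < m->pending_len ? need : m->pending_len;
        memcpy((unsigned char *)out + got, m->pending, take);
        m->pending += take;
        m->pending_len -= take;
        got += take;
    }
    return got;
}
```

The point for the NSData case is that the callback never has to wait for a "complete" chunk: it can pass on each piece as received, and the reading side is responsible for stitching them together.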

Tim


Ariel Feinerman

Mar 28, 2012, 9:57:15 AM
to libarchive-discuss
Hi,

I wrote a few lines of code.

Can someone answer the questions in the comments, or must I use a stream like NSStream?

@implementation MyClass

// Read callback: libarchive calls this whenever it needs more bytes.
ssize_t archive_read(struct archive *a, void *client_data, const void **buff)
{
    NSData *data = (NSData *)client_data;
    *buff = [data bytes];
    // Should the buffer be locked while it is read?
    // What if there are 0 bytes in the buffer?
    return [data length];
}

// Close callback: nothing to release here, so just report success.
int archive_close(struct archive *a, void *client_data)
{
    return ARCHIVE_OK;
}

- (void)start {

    if ([self isCancelled]) {
        [self setIsFinished:YES];
        [self setIsExecuting:NO];
        return;
    }

    [self setIsExecuting:YES];
    [self setIsFinished:NO];

    NSURLRequest *theRequest = [NSURLRequest requestWithURL:_urlFromRead
                                                cachePolicy:NSURLRequestUseProtocolCachePolicy
                                            timeoutInterval:4];
    NSURLConnection *theConnection = [[NSURLConnection alloc]
        initWithRequest:theRequest delegate:self];

    if (theConnection) {
        _receivedData = [NSMutableData new];
    }
    else {
        NSLog(@"Error: cannot create the connection");
    }

    struct archive *a;
    struct archive_entry *entry;

    a = archive_read_new();
    archive_read_support_filter_all(a);
    archive_read_support_format_all(a);

    archive_read_open(a, _receivedData, NULL, archive_read, NULL);

    while (archive_read_next_header(a, &entry) == ARCHIVE_OK) {

        // archive_read_data_into_fd(a, f);
        // archive_read_data();
        // archive_read_data_block();
        // archive_read_extract();
        // Which of these is the better choice?
    }

    archive_read_free(a);
}

- (void)connection:(NSURLConnection *)connection
didReceiveResponse:(NSURLResponse *)response {

    // This method is called when the server has determined that it
    // has enough information to create the NSURLResponse.
    // It can be called multiple times, for example in the case of a
    // redirect, so each time we reset the data.

    [_receivedData setLength:0];
}

- (void)connection:(NSURLConnection *)connection didReceiveData:(NSData *)data {

    // Append the new data to receivedData.

    [_receivedData appendData:data];

    // Should the buffer be locked while it is written?
    // Is there a way to have the callback called only when the next data
    // is available, on the same thread, to avoid the lock?
}

@end

kahg...@gmail.com

Nov 18, 2018, 12:08:35 AM
to libarchive-discuss
Hi Tim,
I really liked your response, and I have some questions about it:

1. Is there a code example for your suggestion? That would help me very much. (I'm working in C/C++ on Linux.)
2. In the second thread, how do I know how much data I have to push to the queue before I release the first thread to continue running? How can I know how much data libarchive wants to read now?
I saw this example:
ssize_t myread(struct archive *a, void *client_data, const void **buff)
{
  struct mydata *mydata = client_data;
  *buff = mydata->buff;
  return (read(mydata->fd, mydata->buff, 10240));
}
  What is this "magic" number, 10240?
  Maybe each time libarchive calls the myread callback, I provide it with 10240 bytes until it has finished reading? So each time I read 10240, and the last time I give it less than 10240. How will libarchive know that we have finished reading and that it doesn't have to call my read callback again?
3. What is *client_data? What will libarchive send there?


Thanks in advance,
your response will help me a lot.

On Friday, March 23, 2012 at 08:15:40 UTC+2, Tim Kientzle wrote:

Tim Kientzle

Nov 18, 2018, 12:42:20 AM
to kahg...@gmail.com, libarchiv...@googlegroups.com

> On Nov 13, 2018, at 8:28 AM, kahg...@gmail.com wrote:
>
> Hi Tim,
> I really liked your response, and I have some questions about it:
>
> 1. Is there a code example for your suggestion? That would help me very much. (I'm working in C/C++ on Linux.)

You should find a good book on Linux multithreading.


> 2. In the second thread, how do I know how much data I have to push to the queue before I release the first thread to continue running? How can I know how much data libarchive wants to read now?

Libarchive only requires that you give it blocks with at least 1 byte. (A block with zero bytes indicates the end of the data.)


> I saw this example:
> ssize_t myread(struct archive *a, void *client_data, const void **buff)
> {
>   struct mydata *mydata = client_data;
>   *buff = mydata->buff;
>   return (read(mydata->fd, mydata->buff, 10240));
> }
>
> What is this "magic" number, 10240?

That is just an example. You can use any amount you wish.


> Maybe each time libarchive calls the myread callback, I provide it with 10240 bytes until it has finished reading? So each time I read 10240, and the last time I give it less than 10240. How will libarchive know that we have finished reading and that it doesn't have to call my read callback again?

When you give libarchive a block with zero bytes, it will assume that is the end of the data.
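
As a hedged illustration of both points (the block size is arbitrary, and a zero-byte block ends the data), here is the quoted callback with the size turned into a field. The `block_size` field and the `drain()` helper are assumptions added for this sketch; `drain()` only stands in for libarchive's side of the contract and is not its real code. `struct archive` is only forward-declared so the fragment compiles without `archive.h`.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

struct archive;  /* opaque here; a real program would #include <archive.h> */

/* Same shape as the quoted example, with the block size made explicit. */
struct mydata {
    int fd;            /* descriptor to read from (file, pipe, socket) */
    size_t block_size; /* any size of at least 1 byte works */
    void *buff;        /* block_size bytes, owned by the caller */
};

ssize_t myread(struct archive *a, void *client_data, const void **buff)
{
    struct mydata *mydata = client_data;
    (void)a;
    *buff = mydata->buff;
    /* read() returns fewer bytes than requested when less is available,
     * 0 at end of file, and -1 on error; all three are passed straight on. */
    return read(mydata->fd, mydata->buff, mydata->block_size);
}

/* Stand-in for the caller of the callback: keep asking for blocks until
 * one comes back with zero bytes (end of data) or a negative value (error). */
ssize_t drain(struct mydata *mydata, size_t *blocks)
{
    const void *p;
    ssize_t n, total = 0;
    *blocks = 0;
    while ((n = myread(NULL, mydata, &p)) > 0) {
        total += n;
        (*blocks)++;
    }
    return n < 0 ? n : total;
}
```

Because read() itself returns 0 at end of file, the callback needs no extra end-of-data logic; a short final block followed by a zero-byte block is enough.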


> 3. What is *client_data? What will libarchive send there?

Libarchive does not look at that or do anything with it. All it does is to give that data to your callback. You can use it for anything you wish.
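
A small sketch of what that means in practice: the pointer registered as client data comes back unchanged on every callback, so it can carry whatever state the callback needs. `mem_source` and `mem_read` are hypothetical names for this illustration, the `calls` counter is just an example of private bookkeeping, and `struct archive` is only forward-declared.

```c
#include <stddef.h>
#include <sys/types.h>

struct archive;  /* opaque here; a real program would #include <archive.h> */

/* Hypothetical state: an in-memory source plus bookkeeping of our own. */
struct mem_source {
    const unsigned char *data;
    size_t len;
    size_t offset;     /* how much has been handed out so far */
    size_t block;      /* bytes to hand out per call */
    unsigned calls;    /* how many times data was requested */
};

ssize_t mem_read(struct archive *a, void *client_data, const void **buff)
{
    struct mem_source *src = client_data;   /* the pointer we registered */
    (void)a;
    src->calls++;
    size_t left = src->len - src->offset;
    size_t n = left < src->block ? left : src->block;
    *buff = src->data + src->offset;
    src->offset += n;
    return (ssize_t)n;  /* 0 once everything has been handed out */
}
```

Two archives opened with two different `mem_source` structures never interfere with each other, because each callback invocation only touches the state it was handed.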

Good luck!

Tim

lala leo

Nov 18, 2018, 2:32:37 AM
to t...@kientzle.com, libarchiv...@googlegroups.com
Thank you, Tim!

On Sun, Nov 18, 2018 at 7:42, Tim Kientzle <t...@kientzle.com> wrote: