How to download large filesets.


Ville Rantanen

Sep 11, 2013, 1:41:51 PM
to andur...@googlegroups.com

Since Anduril is designed to re-execute incomplete and failed processes, downloading large filesets can be tedious: a network error may force you to restart the whole download.

Here is an unorthodox script that downloads a set of files whose URLs are stored in a CSV. Each URL gets its own component instance, so the downloads can run in parallel.
The script downloads the files into an upstream location, which is normally forbidden behaviour for a component. However, with this method the download folder itself is never deleted, even if a component instance crashes, and because wget is run with the -c option, an interrupted download can be resumed later.
Here, "files" is an initially empty folder where the downloads are stored.

files=INPUT(path="files")

list_of_urls=StringInput(content="""File
http://ftp.funet.fi/pub/Linux/INSTALL/Ubuntu/dvd-releases/releases/raring/release/ubuntu-13.04-desktop-armhf+omap4.img.torrent
http://ftp.funet.fi/pub/Linux/INSTALL/Ubuntu/dvd-releases/releases/raring/release/ubuntu-13.04-desktop-powerpc.iso.torrent
http://ftp.funet.fi/pub/Linux/INSTALL/Ubuntu/dvd-releases/releases/quantal/release/ubuntu-12.10-server-armhf+omap.img.torrent
"""
)

downloads={}
binder={}
// One BashEvaluate instance per URL, so the downloads can run in parallel.
for i,x:std.enumerate(std.itercsv(list_of_urls)) {
    // wget -c resumes a partial file; && skips wget if the cd fails.
    downloads[i]=BashEvaluate(var1=files, param1=x.File,
                script="cd @var1@ && wget -c @param1@",
                failOnErr=true,
                echoStdOut=true)
    binder[i]=downloads[i].optOut1
}
binds_downloads=ArrayCombiner(binder)

using_the_downloads=Folder2Array(files, @bind=binds_downloads)

The last component, "Folder2Array", can be replaced by any component that uses the folder of downloaded files. Just keep the @bind annotation, so that the component runs only after all downloads have finished.
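
If you want to try the download step outside Anduril first, the command run by each BashEvaluate instance can be sketched as a small shell function. This is only an illustration: the function name fetch_with_resume and the retry count are my own assumptions and are not part of Anduril or of the script above. The only thing taken from the script is "cd into the folder, then wget -c the URL".

```shell
#!/bin/sh
# Sketch: resumable download of one URL into a fixed folder.
# fetch_with_resume and its retry count are hypothetical helpers,
# not part of Anduril; only "wget -c" comes from the script above.

fetch_with_resume() {
    local dir="$1" url="$2" tries="${3:-3}"
    local attempt=1
    while [ "$attempt" -le "$tries" ]; do
        # -c continues a partially downloaded file instead of starting over.
        # The subshell keeps the cd from changing the caller's directory,
        # and && skips wget entirely if the folder does not exist.
        if (cd "$dir" && wget -c "$url"); then
            return 0
        fi
        echo "attempt $attempt of $tries failed for $url" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage (not run here): fetch_with_resume files "$url"
```

Because the partial file stays in the folder between attempts, each retry continues from where the previous one stopped, which is the same reason the Anduril version survives a crashed component instance.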
