Curl Download Tar.gz


Soon Gangi

Jul 21, 2024, 9:45:49 PM
to asvipatsu

The tarball download works fine with the wget command. I tried reading more about the difference between the two and feel that curl should have worked. I never had issues using curl before when downloading archives, e.g., the Linux source tarball. I am really not sure whether the issue is with curl or the golang server. Any explanation would be helpful.

curl download tar.gz


DOWNLOAD » https://urllie.com/2zz9Xi



At its most basic you can use cURL to download a file from a remote server. To download the homepage of example.com you would use curl example.com. cURL can use many different protocols but defaults to HTTP if none is provided. It will, however, try other protocols as well and it can intelligently guess which protocol to use if hints are given. For instance, if you use curl ftp.example.com it will automatically try the FTP:// protocol.

Quite often when learning curl you will either get an unexpected output or no output at all. The -v option is very useful in these situations. The -v option displays all the information in the request sent to the remote server and the response it receives.

If a site has WordPress installed, for example, and it uses 301 redirects, you will by default download only the redirect response. To follow the redirects and get the final file you need the -L option. If you try curl google.com you will just get the redirect page; if you then try curl -L google.com you will get the page you were after.

When you are writing a script using cURL you will sometimes want to view only the response headers, without seeing the data or the request. A clean view of what is happening, without all the data to obscure things, can help with debugging. To do this, use the -I option. For instance, in the previous example with Google we could use curl -I google.com.

If you want to verify that your SSL cert is valid without using your browser and running into potential caching issues, use curl --cacert mycert.crt. This is also useful if you need to validate the connection to ensure that you are connecting to the right server.

As stated in the introduction, there are over 100 command line options for cURL, so we have covered only some of the most commonly used ones in our experience. cURL can do a lot more than described above, and man curl is a good place to start to find out more. If you prefer a web-based reference then is the authoritative one.

(without the -O option) so that curl streams the remote contents to stdout from where it can be piped into gunzip, but then you would need to redirect the gunzip output to overwrite the target uncompressed file as appropriate.
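The pipe-into-gunzip idea can also be sketched in Python for anyone scripting the download instead of using the shell. This is only an illustration: the function names gunzip_chunks and write_gunzipped are made up here, and the chunks are assumed to be whatever byte pieces arrive from the download.

```python
import zlib
from typing import Iterable, Iterator


def gunzip_chunks(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress a gzip stream piece by piece, like a pipe into gunzip."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    tail = decompressor.flush()
    if tail:
        yield tail


def write_gunzipped(chunks: Iterable[bytes], target_path: str) -> int:
    """Overwrite target_path with the decompressed stream; return bytes written."""
    written = 0
    # Mode "wb" truncates the file first, so an existing target is overwritten.
    with open(target_path, "wb") as out:
        for data in gunzip_chunks(chunks):
            written += out.write(data)
    return written
```

Nothing is held in memory beyond the current chunk, which is the same reason the shell pipe works well for large files.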

Follow redirects when downloading. Sometimes a web server has hidden redirects for security and/or random reasons. If you don't follow the redirect, the wrong data gets downloaded and your application reading the piped data gets confused. You can follow redirects with curl using the -L flag.
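One cheap way to notice that a redirect page was saved instead of the archive is to look at the first bytes of what was downloaded. The sketch below assumes the expected file is gzip-compressed; the helper names are invented for this example.

```python
GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def looks_like_gzip(data: bytes) -> bool:
    """Return True if the data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def describe_download(data: bytes) -> str:
    """Roughly classify downloaded bytes as 'gzip', 'html' or 'unknown'."""
    if looks_like_gzip(data):
        return "gzip"
    head = data[:256].lstrip().lower()
    # A redirect or error page usually starts with an HTML tag.
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"
```

If describe_download reports "html" for something that should be a tar.gz, a missing -L is a likely suspect.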

I'm trying to build a new server with wildfly 9.02 but I can't download wildfly using wget or curl. Either tool returns an error that the connection is refused but if I put the same URL into a browser window on my desktop it downloads the file just fine. Since I have no browser on the server, it would be very helpful if I could download directly rather than have to create a mount to a fileshare somewhere to get the file. This is for a Teiid server and the Teiid download worked just fine with wget.

I've actually worked around the issue with a mount to a file share but I am trying to create a script that will build additional servers for me so I'd like to resolve the wget or curl question. Is there a better way to get the downloads?

(I know that this is going to sound more like a firewall problem. But given DOD and corporate regulations, I cannot get I.T. to help here. So I want to see if there is a way to trick OPAM to allow me to manually use tar downloads, or get curl to work).

I also zipped and attached the in (sysout.bin) file and out (Sysoutbin_decoded.tar.gz) file used in the curl. With the output file, if we remove the top 5 lines it can be saved as a valid archive/tar.gz and then be opened with a zip utility. Also included, codesnippet.txt shows the Python pattern the backend developers are using to accomplish what we are trying to do here.
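The manual fix described above (drop the top 5 lines, then open the rest as tar.gz) could be scripted roughly like this. It is a sketch under the assumption that the extra lines are newline-terminated and come before the archive bytes; the function names are hypothetical and this is not the backend developers' code.

```python
import io
import tarfile
from typing import List


def strip_leading_lines(data: bytes, count: int) -> bytes:
    """Drop the first `count` newline-terminated lines from a byte string."""
    offset = 0
    for _ in range(count):
        newline = data.find(b"\n", offset)
        if newline == -1:
            raise ValueError("fewer lines than expected before the payload")
        offset = newline + 1
    return data[offset:]


def list_tar_gz_members(data: bytes) -> List[str]:
    """Open bytes as a gzip-compressed tar archive and return member names."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()
```

Working on bytes rather than text matters here, because decoding the archive part as text would corrupt it.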

Now ./x.py build fails with:
...
downloading -lang.org/dist/2020-11-18/rust-std-beta-sparcv9-sun-solaris.tar.gz
######################################################################### 100.0%
extracting /opt/rust/build/cache/2020-11-18/rust-std-beta-sparcv9-sun-solaris.tar.gz
curl: (22) The requested URL returned error: 404

In the Nextcloud installation there's a file lib/private/Streamer.php that checks the user agent present in the HTTP headers. If the user agent contains "macintosh" or "mac os x" it's supposed to provide a tar format instead of zip. I pass a user agent with curl and wget that matches the criteria, but it's gobbled up somewhere else: the user agent reported includes WebDAV stuff instead, so the match does not occur. I'm either doing something wrong or there's a bug in the code. It could also be that with curl/wget we're not passing all the right headers, as opposed to a web browser, and so different things are happening.
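To make the described check concrete, here is how I understand the logic, written as a Python sketch. This is not the Nextcloud code (that is PHP and I have not reproduced it); archive_format_for is a made-up name and only the two substrings come from the description above.

```python
def archive_format_for(user_agent: str) -> str:
    """Pick 'tar' for Mac-looking user agents and 'zip' for everything else."""
    ua = user_agent.lower()
    if "macintosh" in ua or "mac os x" in ua:
        return "tar"
    return "zip"
```

With this logic, a user agent that has been replaced by a WebDAV client string would fall through to zip, which matches the behaviour reported.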

Tar (Tape Archive) is a popular file archiving format on Linux. It can be combined with gzip (tar.gz) or bzip2 (tar.bz2) for compression. The tar command is the most widely used command line utility for creating compressed archive files (packages, source code, databases and so much more) that can be transferred easily from one machine to another or over a network.
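For completeness, the same tar plus gzip combination is available from Python's standard tarfile module. A minimal sketch, with invented helper names and paths left to the caller:

```python
import os
import tarfile
from typing import List


def create_tar_gz(source_dir: str, archive_path: str) -> None:
    """Pack source_dir into a gzip-compressed tar archive."""
    # Store entries under the directory's own name, not its full path.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level)


def list_tar_gz(archive_path: str) -> List[str]:
    """Return the member names stored in a .tar.gz file."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Using "w:bz2" and "r:bz2" as the modes gives the tar.bz2 variant.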

GDAL can access files located on "standard" file systems, i.e. in the / hierarchy on Unix-like systems or in C:, D:, etc... drives on Windows. But most GDAL raster and vector drivers use a GDAL-specific abstraction to access files. This makes it possible to access less standard types of files, such as in-memory files, compressed files (.zip, .gz, .tar, .tar.gz archives), encrypted files, files stored on network (either publicly accessible, or in private buckets of commercial cloud storage services), etc.

To point to a file inside a .tar, .tgz or .tar.gz file, the filename must be of the form /vsitar/path/to/the/file.tar/path/inside/the/tar/file, where path/to/the/file.tar is relative or absolute and path/inside/the/tar/file is the relative path to the file inside the archive.
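The filename form is plain string concatenation, so a small helper can build it. The sketch below does not call GDAL at all; vsitar_path is a hypothetical name and the example paths are invented.

```python
def vsitar_path(archive_path: str, inner_path: str) -> str:
    """Join an archive path and a member path into a /vsitar/ filename."""
    # An absolute archive path keeps its leading slash, so the result
    # then contains "/vsitar//...", which follows from the form above.
    return "/vsitar/" + archive_path.rstrip("/") + "/" + inner_path.lstrip("/")
```

For example, vsitar_path("data/scenes.tar.gz", "2020/a.tif") gives /vsitar/data/scenes.tar.gz/2020/a.tif, which could then be passed to GDAL as a filename.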

A generic /vsicurl/ file system handler exists for online resources that do not require particular signed authentication schemes. It is specialized into sub-filesystems for commercial cloud storage services, such as /vsis3/, /vsigs/, /vsiaz/, /vsioss/ or /vsiswift/.

While most GDAL raster and vector file systems can be accessed in a remote way with /vsicurl/ and other derived virtual file systems, performance is highly dependent on the format, and even for a given format on the special data arrangement. Performance also depends on the particular access pattern made to the file.

/vsicurl/ is a file system handler that allows on-the-fly random reading of files available through HTTP/FTP web protocols, without prior download of the entire file. It requires GDAL to be built against libcurl.

Starting with GDAL 2.3, options can be passed in the filename with the following syntax: /vsicurl?[option_i=val_i&]*url= where each option name and value (including the value of "url") is URL-encoded. Currently supported options are:
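As a sketch of the filename syntax only, the helper below URL-encodes each option name, each value and the url itself. The function name is made up, and use_head appears in the usage note purely as an illustrative option name; check the GDAL documentation for the options your version accepts.

```python
from typing import Mapping, Optional
from urllib.parse import quote


def vsicurl_path(url: str, options: Optional[Mapping[str, str]] = None) -> str:
    """Build a /vsicurl?option=value&url=... filename with URL-encoded parts."""
    parts = []
    for name, value in (options or {}).items():
        # safe="" makes quote() also encode "/" and ":".
        parts.append(quote(name, safe="") + "=" + quote(str(value), safe=""))
    # The url parameter goes last, matching the syntax described above.
    parts.append("url=" + quote(url, safe=""))
    return "/vsicurl?" + "&".join(parts)
```

Calling vsicurl_path("http://example.com/a.tif", {"use_head": "no"}) returns /vsicurl?use_head=no&url=http%3A%2F%2Fexample.com%2Fa.tif.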

Starting with GDAL 2.1.3, the CURL_CA_BUNDLE or SSL_CERT_FILE configuration options can be used to set the path to the Certification Authority (CA) bundle file (if not specified, curl will use a file in a system location).

Starting with GDAL 2.3, the CPL_VSIL_CURL_NON_CACHED configuration option can be set to values like /vsicurl/ :/vsicurl/ _directory, so that at file handle closing, all cached content related to the mentioned file(s) is no longer cached. This can help when dealing with resources that can be modified during execution of GDAL related code. Alternatively, VSICurlClearCache() can be used.

Starting with GDAL 2.1, /vsicurl/ will try to query directly redirected URLs to Amazon S3 signed URLs during their validity period, so as to minimize round-trips. This behavior can be disabled by setting the configuration option CPL_VSIL_CURL_USE_S3_REDIRECT to NO.

/vsicurl_streaming/ is a file system handler that allows on-the-fly sequential reading of files streamed through HTTP/FTP web protocols, without prior download of the entire file. It requires GDAL to be built against libcurl.

Although this file handler is able to seek to random offsets in the file, this will not be efficient. If you need efficient random access and the server supports range downloading, you should use the /vsicurl/ file system handler instead.
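Range downloading here refers to HTTP requests that carry a Range header asking for a slice of the file. The sketch below only builds such a request with Python's standard library and sends nothing; range_request is an invented helper and it is not how GDAL itself is implemented.

```python
from urllib.request import Request


def range_request(url: str, start: int, length: int) -> Request:
    """Build an HTTP request for `length` bytes beginning at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    end = start + length - 1  # byte ranges are inclusive on both ends
    return Request(url, headers={"Range": "bytes={}-{}".format(start, end)})
```

A server that supports ranges normally answers such a request with status 206 and only the requested bytes, while one that does not typically sends the whole file with status 200.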

/vsis3/ is a file system handler that allows on-the-fly random reading of (primarily non-public) files available in AWS S3 buckets, without prior download of the entire file. It requires GDAL to be built against libcurl.

/vsis3_streaming/ is a file system handler that allows on-the-fly sequential reading of (primarily non-public) files available in AWS S3 buckets, without prior download of the entire file. It requires GDAL to be built against libcurl.

/vsigs/ is a file system handler that allows on-the-fly random reading of (primarily non-public) files available in Google Cloud Storage buckets, without prior download of the entire file. It requires GDAL to be built against libcurl.

/vsigs_streaming/ is a file system handler that allows on-the-fly sequential reading of (primarily non-public) files available in Google Cloud Storage buckets, without prior download of the entire file. It requires GDAL to be built against libcurl.

/vsiaz/ is a file system handler that allows on-the-fly random reading of (primarily non-public) files available in Microsoft Azure Blob containers, without prior download of the entire file. It requires GDAL to be built against libcurl.
