WARC/1.0
WARC-Record-ID: <urn:uuid:67d699d5-1b5a-4766-9762-424b0f1f61b6>
Content-Length: 226358
WARC-Date: 2018-07-05T07:29:29Z
WARC-Type: response
WARC-Target-URI: https://www.libertatea.ro/stiri/declaratia-unica-trebuie-depusa-pana-pe-16-iulie-care-reducerile-sunt-acordate-de-fisc-2318001
Content-Type: application/http; msgtype=response
WARC-Payload-Digest: sha1:NN25PNASXCFSJLONXPO4GZ4AAORYJ6LV
WARC-Block-Digest: sha1:W4665EQRODKGSAXTR7LRHU6QDO5NCFXP
Hi Jai,
Yes. If you have the file name (S3 path), the WARC record offset, and the record
length (in bytes), you can fetch a single WARC record. See also
https://groups.google.com/forum/#!msg/common-crawl/pQ34q-_EARU/FLFtvTfXAwAJ
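
In case it helps, here is a rough sketch of what that looks like in Python. It is not taken from the thread linked above. It assumes the WARC files are reachable over HTTP under https://data.commoncrawl.org/ (adjust the prefix if you read from S3 directly) and that every record is stored as its own gzip member, so the byte range can be decompressed on its own. The function names are made up for illustration.

```python
import gzip
import urllib.request

# Assumption: public HTTP prefix for Common Crawl data; change it if needed.
BASE_URL = "https://data.commoncrawl.org/"


def range_header(offset: int, length: int) -> str:
    """Build the HTTP Range value for `length` bytes starting at `offset`."""
    # The end position of an HTTP byte range is inclusive, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def decode_record(raw: bytes) -> bytes:
    """Decompress one record, assuming it is a self-contained gzip member."""
    return gzip.decompress(raw)


def fetch_record(filename: str, offset: int, length: int) -> bytes:
    """Fetch and decompress a single WARC record via a ranged GET request."""
    request = urllib.request.Request(
        BASE_URL + filename,
        headers={"Range": range_header(offset, length)},
    )
    with urllib.request.urlopen(request) as response:
        return decode_record(response.read())
```

You would call fetch_record() with the file name, offset and length taken from the URL index entry for the page you want. The result is the raw record: WARC headers, then HTTP headers, then the payload.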
Best,
Sebastian
On 07/12/2018 12:52 PM, Jai Pancholi wrote:
>
> Would using something like a byte offset
> <http://www.automatingosint.com/blog/2015/08/osint-python-common-crawl/> be useful?
>