I have downloaded a single warc.gz file, but it seems to contain only metadata. I want the title (or header name) of each page, but I don't see it anywhere in the file. I want to use Elasticsearch to get the URLs when I search by title, so I am looking for the title names in the data.
This is one of the records:
WARC/1.0
WARC-Type: metadata
WARC-Date: 2016-05-24T06:19:32Z
WARC-Record-ID: <urn:uuid:63cb79b8-5916-4a92-adb4-a67446db63aa>
WARC-Refers-To: <urn:uuid:08e9fbe3-c842-418a-b0f4-e20718591155>
Content-Type: application/json
Content-Length: 1101
{"Envelope":{"Format":"WARC",
"WARC-Header-Length":"423",
"Block-Digest":"sha1:33WFHKNHJ64W55DRWPWJFJ35T4AN54QF",
"Actual-Content-Length":"20",
"WARC-Header-Metadata":{"WARC-Type":"metadata",
"WARC-Date":"2016-05-24T06:19:32Z",
"WARC-Warcinfo-ID":"<urn:uuid:50214873-4101-4984-8452-db7b5475da62>",
"Content-Length":"20",
"WARC-Record-ID":"<urn:uuid:08e9fbe3-c842-418a-b0f4-e20718591155>",
"WARC-Concurrent-To":"<urn:uuid:c03dfd08-562b-44bf-8c09-dae24d7f67c7>",
"Content-Type":"application/warc-fields"},
"Payload-Metadata":{"Trailing-Slop-Length":"4",
"WARC-Metadata-Metadata":{"Trailing-Slop-Length":"0",
"Metadata-Records":[{"Name":"fetchTimeMs",
"Value":"284"}],
"Actual-Content-Length":"20"},
"Actual-Content-Type":"application/metadata-fields"}},
"Container":{"Compressed":true,
"Gzip-Metadata":{"Footer-Length":"8",
"Deflate-Length":"328",
"Header-Length":"10",
"Inflated-CRC":"-4818014",
"Inflated-Length":"447"},
"Offset":"1038238145",
"Filename":"CC-MAIN-20160524002110-00017-ip-10-185-217-139.ec2.internal.warc.gz"}}
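To make the goal concrete, here is a minimal Python sketch of the lookup being asked about. It assumes that, for records describing an HTML response, the title sits under Envelope > Payload-Metadata > HTTP-Response-Metadata > HTML-Metadata > Head > Title and the URL under Envelope > WARC-Header-Metadata > WARC-Target-URI. Those key paths, the function name extract_title_and_url, and the sample response record are assumptions for illustration; none of them appear in the record above, which describes a metadata record and so would yield no title.

```python
import json


def extract_title_and_url(record_json):
    """Return (title, url) from one JSON envelope; either may be None.

    The key paths below are assumed, not taken from the record in this post.
    """
    envelope = json.loads(record_json).get("Envelope", {})
    url = envelope.get("WARC-Header-Metadata", {}).get("WARC-Target-URI")
    head = (
        envelope.get("Payload-Metadata", {})
        .get("HTTP-Response-Metadata", {})
        .get("HTML-Metadata", {})
        .get("Head", {})
    )
    return head.get("Title"), url


# Trimmed-down version of a metadata record like the one above: no title, no URL.
metadata_record = json.dumps({
    "Envelope": {
        "WARC-Header-Metadata": {"WARC-Type": "metadata"},
        "Payload-Metadata": {"Actual-Content-Type": "application/metadata-fields"},
    }
})

# Hypothetical response record, made up to show the assumed layout.
response_record = json.dumps({
    "Envelope": {
        "WARC-Header-Metadata": {
            "WARC-Type": "response",
            "WARC-Target-URI": "http://example.com/",
        },
        "Payload-Metadata": {
            "HTTP-Response-Metadata": {
                "HTML-Metadata": {"Head": {"Title": "Example Domain"}}
            }
        },
    }
})

for record in (metadata_record, response_record):
    title, url = extract_title_and_url(record)
    if title and url:
        # This (title, url) pair is what would be sent to Elasticsearch.
        print(title, "->", url)
```

Only records that yield both a title and a URL would be worth indexing; records like the one pasted above would simply be skipped.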
Bhavana.