After having heard about Clojure for a while, I started playing around
with it a couple of days ago (which led to this:
http://github.com/citizen428/ClojureX in case anyone is interested).
Anyway, I'm now trying to write a small program which extracts the
enclosure URLs from an RSS 2.0 feed and downloads the linked files
into the current working directory. The main problem is that I can't
figure out how to use duck-streams to achieve what I want. I've made
quite a few unsuccessful attempts so far (which may have to do with
the fact that I don't have much of a grounding in either Lisp or
Java).
I'm posting my entire code here; please feel free to comment if you
have suggestions on the coding style or anything else.
(ns citizen428.rssfetcher
  (:use [clojure.contrib.zip-filter.xml :only (attr xml->)]
        [clojure.contrib.duck-streams])
  (:require [clojure.zip :as zip] [clojure.xml :as xml])
  (:import [java.net URL] [java.io BufferedOutputStream]))
(defn get-enclosure-urls
  "Extracts enclosure URLs from an RSS 2.0 feed"
  [url]
  (let [feed (zip/xml-zip (xml/parse url))]
    (xml-> feed :channel :item :enclosure (attr :url))))
(defn fetch-enclosures
  "Fetches the files provided in a URL list"
  [urls]
  (doseq [url urls]
    (let [[file-name] (re-find #"(\w|[-.])+$" url)]
      -magic missing- )))
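The file-name step in the listing relies on how `re-find` behaves when the pattern contains a group: it returns a vector of the whole match followed by the group, which is why the `let` destructures only the first element. A minimal sketch of that behaviour, using made-up example.com URLs and a hypothetical helper name (`url->file-name`) that are not part of the original program:

```clojure
;; Same pattern as in fetch-enclosures: a trailing run of word
;; characters, hyphens and dots.
(def file-name-pattern #"(\w|[-.])+$")

(defn url->file-name
  "Hypothetical helper: returns the trailing file name of url,
  or nil when the URL does not end in a matching run."
  [url]
  (let [[file-name] (re-find file-name-pattern url)]
    file-name))

;; With a group in the pattern, re-find returns [whole-match group].
(re-find file-name-pattern "http://example.com/audio/ep-01.mp3")
;; => ["ep-01.mp3" "3"]

(url->file-name "http://example.com/audio/ep-01.mp3")
;; => "ep-01.mp3"

;; A query string changes what counts as the trailing run.
(url->file-name "http://example.com/audio/ep-01.mp3?id=42")
;; => "42"

;; A URL ending in a slash has no match at all.
(url->file-name "http://example.com/audio/")
;; => nil
```

The last two cases suggest that feeds whose enclosure URLs carry query strings or end in a slash would need a different way of picking the file name.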
Thanks,
Michael
#clojure to the rescue. replaca pointed me to the documentation of
clojure.contrib.http.agent, which has a nice example of what I wanted
to do. Here's the solution:
(defn fetch-enclosures
  "Fetches the files provided in a URL list"
  [urls]
  (doseq [url urls]
    (let [[name] (re-find #"(\w|[-.])+$" url)]
      (http-agent url
                  :handler (fn [agnt]
                             (with-open [w (writer name)]
                               (copy (stream agnt) w)))))))
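One thing that may be worth checking with the solution above: `writer` opens a character-oriented stream, so binary enclosures such as audio files could be altered by the character encoding on the way to disk. A byte-for-byte alternative can be sketched as follows, assuming a Clojure release that ships `clojure.java.io` (1.2 or later) rather than the contrib libraries used in this thread. The helper name `save-url!` is hypothetical, and unlike `http-agent` this version downloads synchronously:

```clojure
(require '[clojure.java.io :as io])

(defn save-url!
  "Hypothetical helper: copies the raw bytes found at url into target.
  url may be a URL string or java.net.URL; target a path or File."
  [url target]
  (with-open [in  (io/input-stream url)      ; byte stream, no decoding
              out (io/output-stream target)] ; byte stream, no encoding
    (io/copy in out)))

(defn fetch-enclosures-sync
  "Sketch of a blocking variant of fetch-enclosures built on save-url!."
  [urls]
  (doseq [url urls]
    (let [[file-name] (re-find #"(\w|[-.])+$" url)]
      ;; Skip URLs for which no file name could be derived.
      (when file-name
        (save-url! url file-name)))))
```

Because both ends are opened as byte streams, the copied file should be identical to what the server sent. Files are written relative to the current working directory, as in the original program.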
Thanks,
Michael