data mining air quality data.


Nafis

Apr 27, 2012, 9:53:48 PM
to airqualityegg
I noticed that someone is web-scraping some AirNow sites and putting
the data into Pachube (e.g., https://pachube.com/feeds/23848), but I
don't see any sites from NY.
Does anyone know how to get additional sites added? I'd like to have
access to the PM2.5 particulate data.

The New York State Department of Environmental Conservation's Bureau of
Air Quality Surveillance has a number of monitoring stations around the
state. My local station
(http://www.dec.ny.gov/airmon/stationStatus.php?stationNo=5) reports O3,
SO2, CO, Temp, RH, BP, and AQI.

I set up a temporary app to scarf the data and post it to Pachube
(https://pachube.com/feeds/57363).

As we start validating our sensors, is there a more coordinated plan
to get this data into Pachube? Does anyone have a server that could be
set up to do this? Does Pachube have some capability to grab data at
a defined interval?

chduke

Apr 28, 2012, 2:28:07 AM
to airqualityegg
Why should Pachube host AQ data from other sites? To compare Egg data
with other sources and also to have additional sensor data (like O3)?
Maybe...
I am not sure Pachube should provide such a feature; I do not think
there is a common way to parse data from online weather stations
globally.
Maybe users who wish to do so could implement their own system for
combining AQ data from several sources with Pachube. As for server
hosting, depending on your dev environment there are many cloud-based
solutions (e.g., Jelastic and Google App Engine for Java).


Nafis

Apr 28, 2012, 9:17:50 AM
to airqualityegg
If the comparison/missing data is not on Pachube, then each user is
going to have to write/host code to scarf data from other sites. It
would be easier if they only had to deal with one API. Since the Air
Egg is for a broad range of users, the more Pachube apps we can
provide to view/analyze the data, the better?

Adrian McEwen

Apr 28, 2012, 9:35:53 AM
to airqua...@googlegroups.com
One way to get the data into Pachube from external sites would be to
write scrapers on ScraperWiki <https://scraperwiki.com/>. That could do
the processing from the external sensor sites, and then call the Pachube
API to push the data in.
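
To make that concrete, here is a rough Python sketch of the two halves of such a scraper: pull parameter/value pairs out of a station page, then build (but not send) an update request. Everything specific in it is an assumption rather than a confirmed detail: the sample HTML is made up and a real station page will need its own parsing, the feed ID and key are placeholders, and the endpoint URL, `X-PachubeApiKey` header, and JSON layout are how I recall the v2 API, so check them against the API docs before relying on them.

```python
import json
from html.parser import HTMLParser
from urllib.request import Request

# Hypothetical station page; a real page will have a different layout.
SAMPLE_HTML = """
<table>
  <tr><th>Parameter</th><th>Value</th></tr>
  <tr><td>O3</td><td>0.031</td></tr>
  <tr><td>CO</td><td>0.2</td></tr>
  <tr><td>Status</td><td>OK</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made of <th> cells stay empty
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row[-1] += data.strip()


def parse_readings(html):
    """Return {parameter: value} for rows whose second cell is numeric."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    readings = {}
    for row in parser.rows:
        if len(row) < 2:
            continue
        try:
            readings[row[0]] = float(row[1])
        except ValueError:
            continue  # skip non-numeric rows such as status text
    return readings


def build_update(feed_id, api_key, readings):
    """Build a PUT request for a feed update without sending it."""
    body = {
        "version": "1.0.0",  # assumed format version
        "datastreams": [
            {"id": name, "current_value": str(value)}
            for name, value in sorted(readings.items())
        ],
    }
    return Request(
        url=f"https://api.pachube.com/v2/feeds/{feed_id}",  # assumed endpoint
        data=json.dumps(body).encode("utf-8"),
        headers={"X-PachubeApiKey": api_key,  # assumed header name
                 "Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_update("12345", "YOUR_API_KEY", parse_readings(SAMPLE_HTML))
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Actually sending it would be a call to `urllib.request.urlopen(request)`; I left that out so the sketch runs without network access or a real key.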

Cheers,

Adrian.

chduke

Apr 28, 2012, 10:53:35 AM
to airqualityegg
@Nafis: Your thinking is absolutely right.
I am just worried about the diversity of the online weather data
resources: how could Pachube parse the data from each weather station
close to every user? It is hard even to find out which resources are
available and how their data is formatted.

Adrian's suggestion is also an alternative. Keep in mind, it is more
likely that users who might be more interested in the additional data
are also able to implement their own service.

Nafis

Apr 28, 2012, 11:09:05 PM
to airqualityegg
ScraperWiki is pretty cool, but it looks like you have to pay for
hourly scraping. Some sites would need to be scraped more frequently.
What tools do you think we should be using to visualize the air
quality data? Is the Air Egg project going to produce AirNow-type
(http://airnow.gov/) plots?
I'm just trying to think ahead for when we have a couple thousand eggs
out there :-)