
Hacking Wolfram|Alpha


Fred Klingener

Jul 23, 2009, 3:54:33 AM
While waiting for the Wolfram|Alpha API, I've been nosing around what
we have available, looking for ways that Mathematica can make use of W|
A. Here's an exercise that I thought was interesting enough to share
(and post to a place where I could find it again.)

My goal was to draw a graph of the specific entropy of superheated
steam as a function of pressure at a constant temperature.

If I feed into W|A's browser query window "entropy steam 400F 60psia",
I get a newsy page that includes the answer "Result: 7175 J/(kg K)".
The part of W|A that will some day (I hope) respond to requests for
alternate unit systems isn't hooked up yet, but getting any answer is
cool.

The next step is to find out whether the result can be retrieved by
Mathematica. The URL of the response page in the particular example I
ran was "http://www31.wolframalpha.com/input/?i=entropy+steam+400F+60psia".
I'm guessing that the "31" in the "www31" is unnecessary and that it
might cause problems later, so I'll leave it out. Import[url, "Elements"]
returns a normal-looking list of import elements:

In[1]:= Import["http://www.wolframalpha.com/input/?i=entropy+steam+400F+60psi",
"Elements"]

Out[1]= {"Data", "FullData", "Hyperlinks", "Images", "ImageURLs",
"Plaintext", "Source", "Title", "XMLObject"}

The construction of the W|A response page is complex, and (at least in
this case) the results are presented in graphic elements that lie
outside the reach of the "Data" and "FullData" Imports. The result pod
is available as one of the Images, but not in a way that makes the
numerical value accessible. The numbers appear in the Source, but
ultimately, the most promising attack has to be picking the XMLObject
apart.

In[2]:= xml = Import[
"http://www.wolframalpha.com/input/?i=entropy++steam+400+degree+F+60+psia",
"XMLObject"];

The xml object is a pretty daunting thing, but it does have a
structure, and the structure does yield to systematic disassembly.
There is a variety of XMLElements that serve different purposes:
tables, scripts, and some that evidently make web Mathematica calls. An
inspection of the xml object suggested that the most vulnerable spots
were the XMLElements with the tag "img". While the "src" attributes
point to images computed elsewhere, an "alt" attribute and a "title"
attribute were defined for the Result pod, presumably for display if
the call to the image generator failed.

All of the XMLElements with "img" tags can be listed by:

In[22]:= imgs = Cases[xml, XMLElement["img", _, _], Infinity];

and evidently, the one I need is the fourth.

In[23]:= imgs[[4]];

which is a Mathematica expression with the following characteristics:

In[29]:= {Head[#], Depth[#], Length[#]} &@%

Out[29]= {XMLElement, 4, 3}

I can pick out the attributes section of the fourth XMLElement, which
is a List of Rules.

In[30]:= imgs[[4]] /. XMLElement[_, attr_, _] -> attr;
Head[#] & /@ %

Out[31]= {Rule, Rule, Rule, Rule}

I can extract a String representation of the value assigned to the
"title" attribute with the replacement:

In[8]:= "title" //. %%

Out[8]= "7175 J/(kg K) (joules per kilogram kelvin)..."
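
The extraction steps above can be tried offline, without querying W|A at all. This is a minimal sketch: the markup in sample is a made-up stand-in for one "img" element of a result page (the real page source is not reproduced in this thread), and the names sample, attrs, and title are illustrative only.

```mathematica
(* Hypothetical stand-in for a fragment of a result page; not real W|A markup *)
sample = "<div><img src=\"pod.gif\" alt=\"7175 J/(kg K)\" \
title=\"7175 J/(kg K) (joules per kilogram kelvin)\"/></div>";

(* Parse the string into a symbolic XML expression *)
xml = ImportString[sample, "XML"];

(* Collect every img element at any depth *)
imgs = Cases[xml, XMLElement["img", _, _], Infinity];

(* The second argument of an XMLElement is its list of attribute rules *)
attrs = First[imgs] /. XMLElement[_, a_, _] :> a;

(* Look up the "title" attribute by applying those rules *)
title = "title" /. attrs
```

On a real page the position of the interesting element (the fourth, in the session above) has to be found by inspection and may change whenever the page layout does.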

The appearance of the superfluous spelled-out units is a nuisance, but
easy enough to fix.

In[9]:= StringTake[#, -1 + StringPosition[#, "(j"][[1, 1]]] &@%

Out[9]= "7175 J/(kg K) "

and a variable can be assigned to the entropy that I'm after.

In[10]:= s = ToExpression[%]

Out[10]= (7175 J)/(K kg)

Some shuffling is required to match the units in which I expressed
the input query:

In[11]:= << Units`

In[12]:= Quiet@Convert[s /. J -> Joule /. kg -> Kilogram /. K ->
Kelvin,
BTU/(Pound Rankine)]

Out[12]= (1.71371 BTU)/(Pound Rankine)
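
The conversion can be cross-checked by hand without the Units` package. The sketch below assumes the International Table BTU (1055.05585262 J); Units` may carry a slightly different BTU value, so the last digit can differ from Out[12]. The variable names are illustrative.

```mathematica
(* Assumed definitions: International Table BTU, avoirdupois pound, Rankine degree *)
joulesPerBTU = 1055.05585262;
kgPerPound = 0.45359237;
kelvinPerRankine = 5/9;

(* 1 BTU/(lb R) expressed in J/(kg K); comes out very close to 4186.8 *)
factor = joulesPerBTU/(kgPerPound kelvinPerRankine)

(* 7175 J/(kg K) expressed in BTU/(lb R); about 1.7137 *)
7175/factor
```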

which matches up pretty well (somewhere between "astonishing" and "what
did I expect?") with the value of 1.7135 BTU/(lb R) given by my
trusty, crusty 1936 Keenan and Keyes "Thermodynamic Properties of
Steam."

Pretty nifty.

To draw the plot, I can nest the whole mess into one inscrutable knot
and poll W|A a couple of times with a range of pressures.

In[17]:= << Units`
ListLinePlot[
table = {#,
First@
Quiet@
Convert[
(
ToExpression[StringTake[#, -1 + StringPosition[#, "(j"]
[[1, 1]]] &@
(
"title" //.
(
Cases[

Import["http://www.wolframalpha.com/input/?i=\
entropy++steam+400+degree+F+" <> ToString[#] <> "+psia", "XMLObject"]
, XMLElement["img", _, _], Infinity][[4]] //.
XMLElement["img", attr_, _] -> attr
)
)
]
)
/. J -> Joule /. K -> Kelvin /. kg -> Kilogram, BTU/(Pound
Rankine)]
} & /@ {40, 60, 80, 100}
]

...

For all that work, the plot should at least be smooth.

In[19]:= line = Fit[table, {1, x, x^2}, x]

Out[19]= 1.88242 - 0.00353908 x + 0.0000123901 x^2

In[20]:= Plot[
line
, {x, 40, 100}
, AxesOrigin -> {40, 1.64}
, AxesLabel -> {"pressure (psia)"
,
"entropy (BTU/(lb R))"
}
, PlotLabel -> "Superheated Steam\nEntropy vs Pressure at 400\[Degree] F"
, BaseStyle -> "Label"
]

...

A procedure like this, making a few programmed calls to W|A,
evidently doesn't attract the attention of whatever anti-jamming
safeguards W|A has built in. It's slow, and maybe that's the
inherent protection.

Overall, what have I learned about how I'd like an API to work?

1.) An API user shouldn't have to become expert in JavaScript, web
Mathematica, and XML all at once to use it.

2.) W|A should place computed results within reach of an Import[url,
"Data"].

3.) If it's a Mathematica API, numerical results should be presented
in a way that's compatible with Units`.

4.) Depending on the way the load evolves, W|A calls should be
Listable, with the hope of avoiding the numerous separate calls I made
in the example.

Fred Klingener

gigabi...@brockeng.com

Jul 29, 2009, 5:55:30 AM
On Jul 23, 3:54 am, Fred Klingener <gigabitbuc...@BrockEng.com> wrote:
> While waiting for the Wolfram|Alpha API, I've been nosing around what
> we have available, looking for ways that Mathematica can make use of W|
> A. Here's an exercise that I thought was interesting enough to share
> (and post to a place where I could find it again.)

...

I'm so ashamed to reply to my own post, and I'm so ashamed to report
how much headroom there was to streamline the method to extract
Mathematica-ready data from Wolfram|Alpha output.

Here's the density of gasoline in a Units`-ready form.

-----------------------------------------------------------------------
<<Units`
Clear[rho]
rho =

First@Cases[

Import[
"http://www.wolframalpha.com/input/?i=gasoline+density+in+lb%\
2Fgal", "XMLObject"]

, Rule["title", x_] :>
ToExpression[StringReplace[x, z__ ~~ "lb/gal" ~~ __ -> z]] Pound/
Gallon /; StringMatchQ[x, __ ~~ "lb/gal" ~~ __]

, Infinity]
----------------------------------------------------------

The method sends the query string to W|A and Imports the result as an
XMLObject. The Cases searches all levels (the Infinity argument) of
the XMLObject for Rules that replace the "title" attribute, where the
numerical results appear.

There doesn't seem to be any reason that W|A codes the numerical
answer in a rule like this, but there doesn't seem to be a reason not
to either. Whatever the design intent, the form puts simple numerical
data in an accessible place.

Then, there's a StringReplace to pick out the numerical part of the
string to the left of the "lb/gal" units and an attachment of the
Units` units.

It's still not quite right. The /;StringMatchQ seems extraneous, the
thrashing between expressions and Strings is messy, but I couldn't
figure how to make it work without all that.
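
One possible way to avoid both the StringMatchQ guard and the StringReplace step is to let StringCases do the matching and the extraction together, since it returns an empty list for strings that do not contain the unit. This is only a sketch on made-up strings: the entries in titles are hypothetical placeholders (the number is not a W|A result), and the approach assumes the value is an ordinary number followed by whitespace and "lb/gal".

```mathematica
(* Hypothetical "title" strings standing in for what Cases would collect from the page *)
titles = {"Input interpretation", "6.0 lb/gal (pounds per gallon)"};

(* NumberString matches the characters of a number; strings without the unit yield {} *)
values = Flatten[
  StringCases[titles,
   n : NumberString ~~ Whitespace ~~ "lb/gal" :> ToExpression[n]]]
```

With Units` loaded, First[values] Pound/Gallon would then attach the units as in the code above.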

API? We don' need no steenking API!

Cheers,
Fred Klingener

gigabi...@brockeng.com

Aug 2, 2009, 6:00:12 AM
On Jul 29, 5:55 am, "gigabitbuc...@BrockEng.com"

<gigabitbuc...@BrockEng.com> wrote:
> On Jul 23, 3:54 am, Fred Klingener <gigabitbuc...@BrockEng.com> wrote:
>
> > While waiting for the Wolfram|Alpha API, I've been nosing around what
> > we have available, looking for ways that Mathematica can make use of W|
> > A. Here's an exercise that I thought was interesting enough to share
> > (and post to a place where I could find it again.)
>
> ...
>
> I'm so ashamed to reply to my own post, and I'm so ashamed to report
> how much headroom there was to streamline the method to extract
> Mathematica-ready data from Wolfram|Alpha output.

But bat-winged gotchas lurk everywhere.

Looking for the energy stored in an 18v, 3.0 amp-hour lithium ion
battery, I fed

"3.0 amp hour 18 volt in Joule"

into W|A's query line and got back a page at

"http://www.wolframalpha.com/input/?i=3.0%20amp%20hour%2018%20volt%20in%20Joule&t=ff3tb01"

which reports the answer as 194 400 J. This matches the answer
returned by Units`.
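
The figure is easy to confirm by plain arithmetic, since one amp-hour is 3600 coulombs and one coulomb times one volt is one joule. A one-line check, with no units package involved:

```mathematica
(* 3.0 A h * 3600 s/h * 18 V, in joules *)
energy = 3.0*3600*18
```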

So I fed the query line into the code from the second post above, with
adjustments:

<< Units`

Clear[W]
W =

First@Cases[

xml = Import[
"http://www.wolframalpha.com/input/?i=3%20Amp%20hour%2018%20volt%20in%20Joule&t=ff3tb01",
"XMLObject"]

, Rule["title", x_] :>

ToExpression[StringReplace[x, z__ ~~ "J" ~~ __ -> z]] Joule /;
StringMatchQ[x, __ ~~ "J" ~~ __], Infinity]

and got 77600 Joule as a result.

Never mind how long it took me to figure out the obvious. The
ToExpression[] interprets the string result 194 400 as 194 x 400 and
evaluates it.

Slot[] Function[] Prefix[] Out[] Not Not (That's swearing in
Mathematica.)

I applied my customary Monte Carlo programming techniques and
discovered that ToExpression could be persuaded to interpret the
string the way I wanted it to iff I suggested TeXForm.

In[39]:= << Units`

In[80]:= Clear[W]
W =

First@Cases[

xml = Import[
"http://www.wolframalpha.com/input/?i=3%20Amp%20hour%2018%20volt%20in%20Joule&t=ff3tb01",
"XMLObject"]

, Rule["title", x_] :>

ToExpression[StringReplace[x, z__ ~~ "J" ~~ __ -> z],
TeXForm] Joule /; StringMatchQ[x, __ ~~ "J" ~~ __], Infinity]

Out[81]= 194400 Joule
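
An alternative to the TeXForm route, sketched here on a bare string rather than a live page, is to strip the digit-group separators before calling ToExpression. The helper name ungroup is made up for this example, and it assumes the separator is an ordinary space character; a page that uses a thin or non-breaking space would need the pattern adjusted.

```mathematica
(* Remove a space only when it sits directly between two digits *)
ungroup[s_String] :=
 StringReplace[s, RegularExpression["(?<=\\d) (?=\\d)"] -> ""]

(* Without the fix, the space is read as multiplication: 194*400 *)
ToExpression["194 400"]

(* With the fix, the digits are joined into one integer *)
ToExpression[ungroup["194 400"]]
```

The lookbehind and lookahead leave the digits themselves untouched, so a space between a number and its unit is not affected.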

Maybe W|A doesn't WANT this to be easy.

Cheers,
Fred Klingener

magma

Aug 3, 2009, 5:48:04 AM

> I applied my customary Monte Carlo programming techniques ......
> Cheers,
> Fred Klingener

Congratulations. You hit the jackpot! :-)

AES

Aug 3, 2009, 5:46:31 AM
In article <h53o3c$1k4$1...@smc.vnet.net>,
"gigabi...@BrockEng.com" <gigabi...@BrockEng.com> wrote:

> Looking for the energy stored in an 18v, 3.0 amp-hour lithium ion
> battery, I fed
>
> "3.0 amp hour 18 volt in Joule"
>
> into W|A's query line and got back a page at

> , which reports the answer as 194 400 J. This matches the answer
> returned by Units`.
>

> Cheers,
> Fred Klingener

If you're interested in batteries (they're more useful overall than
W|A?), you might try asking W|A something like

"3.0 amp hour 18 volt ?? pounds in feet"

(with suitable rearrangement or rephrasing).

That is, the energy delivery capabilities of any battery technology can
be expressed by a single number: the "battery height" of that
technology, defined as the maximum height to which any battery using
that technology can potentially lift itself against gravity, using some
kind of perfectly efficient lifting machinery.

I've not done a systematic exploration of the battery heights of
different battery technologies (lithium, lead acid, etc), but this is
a useful as well as instructive number. If the battery height of
the battery technology used in your Prius or Tesla is 10,000 feet and
the batteries make up 10% of the weight of the vehicle, it doesn't
matter what the claimed range of your vehicle may be -- you're never
going to get from A to B, even starting out fully charged, if there's a
1,001 foot pass between A and B.
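
The reasoning above can be written as a short sketch. Setting the stored energy E equal to the potential energy m g h gives a battery height h = E/(m g), and a vehicle whose battery is a fraction f of its total weight can climb at most f h. The function names batteryHeight and climbLimit are made up for this illustration, and perfectly efficient lifting is assumed throughout.

```mathematica
(* Standard gravity in m/s^2 *)
g = 9.80665;

(* Height in metres to which a battery could lift its own mass, from E = m g h *)
batteryHeight[energyJoules_, massKg_] := energyJoules/(massKg g)

(* A vehicle with battery weight fraction f climbs at most f times the battery height *)
climbLimit[height_, batteryFraction_] := batteryFraction height

(* The example in the text: 10,000 ft battery height, 10% battery weight; gives 1000 ft *)
climbLimit[10000, 0.10]
```

That limit of 1,000 feet is what rules out the 1,001 foot pass.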

gigabi...@brockeng.com

Aug 3, 2009, 5:49:11 AM
On Aug 2, 6:00 am, "gigabitbuc...@BrockEng.com"

<gigabitbuc...@BrockEng.com> wrote:
> On Jul 29, 5:55 am, "gigabitbuc...@BrockEng.com"
>
> <gigabitbuc...@BrockEng.com> wrote:
> > On Jul 23, 3:54 am, Fred Klingener <gigabitbuc...@BrockEng.com> wrote:
>
> > > While waiting for the Wolfram|Alpha API, I've been nosing around what
> > > we ...

http://www.wolframalpha.com/termsofuse.html:

----------------------------------

Methods of Access

The Wolfram|Alpha service may be used only by a human being using a
conventional web browser to manually enter queries one at a time.
Because Wolfram|Alpha is doing computation, not just lookup, each
query may require significant CPU time on multiple parallel servers.
Any attempt to use a robot, script, or organized group of humans to
repeatedly access Wolfram|Alpha could place an unacceptable load on
the system, and is strictly forbidden.

If our monitoring systems detect an attempt to access the service in a
forbidden way, to execute systematic patterns of queries, to index the
website, or to do anything else that we feel jeopardizes the integrity
of our system or access to it by other users, we may terminate or
suspend access to the service for specific users or IP ranges.

----------------------

I don't need a lawyer to tell me that the Mathematica programs shown
in this thread violate the Terms of Service.

Bye.

Fred Klingener

