How to extract text from a JavaScript block in an HTML document?


Seven

Feb 27, 2010, 12:29:34 PM
to greasemonkey-users
There is a webpage I'm trying to write a script for, and that page
contains a Javascript block like this:

<script language="JavaScript" type="text/javascript">
var _editor_lang = "en";
var _jive_is_reply = "true";
var _jive_gui_quote_text = "BLAH BLAH BLAH BLAH BLAH";
var _jive_tables_enabled = "true";
var _jive_images_enabled = "true";
</script>

I need to get that BLAH BLAH BLAH text (because I want to insert it
elsewhere), but how do I extract it?

thanks,

Seven

RodMcguire

Feb 27, 2010, 12:44:25 PM
to greasemonkey-users
umm, your text is the value of

unsafeWindow._jive_gui_quote_text

unless scripts on the page have changed it.
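
A minimal sketch of that suggestion, assuming the script runs under Greasemonkey (where unsafeWindow is defined). The helper name readQuoteText is hypothetical and only there to keep the read in one place:

```javascript
// Hypothetical helper: read the page variable once and return a
// primitive string copy, so later changes by the page do not affect it.
function readQuoteText(pageWindow) {
  return String(pageWindow._jive_gui_quote_text);
}

// unsafeWindow only exists inside a Greasemonkey userscript, so guard
// the call to keep this snippet harmless elsewhere.
if (typeof unsafeWindow !== 'undefined') {
  var quoteText = readQuoteText(unsafeWindow);
  // ... insert quoteText wherever it is needed ...
}
```

Note that any read through unsafeWindow touches an object the page controls, so this is only reasonable on pages you trust.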

Seven

Feb 27, 2010, 2:07:02 PM
to greasemonkey-users
Thanks, Rod!

(I did try to look it up, if that's what that "umm" means.)

cc

Feb 27, 2010, 5:44:42 PM
to greasemon...@googlegroups.com
Cautionary note: read the entirety of
http://wiki.greasespot.net/UnsafeWindow, especially if you're going to
be running the script on a page you don't control. Bad things can happen
when you use unsafeWindow.

On 2010-02-27 11:07, Seven wrote:
> Thanks, Rod!
>
> (I did try to look it up, if that's what that "umm" means.)

--
cc | pseudonymous |<http://carlclark.mp/>


--
‖ Confidence is the feeling you have before you really understand the
problem. ‖ http://tagzilla.mozdev.org v0.066

Sam

Feb 27, 2010, 6:48:09 PM
to greasemon...@googlegroups.com
This is why I was going to recommend the innerHTML and substring approach to solving the problem. It would still be difficult for someone to change the variable to anything terribly malicious unless you were going to eval the contents or for some reason call it as if it were a function, but I suppose it all depends on what you do with the value. You could test for strange contents and reject it, or make sure it's a string by using
var myval = new String(unsafeWindow.tehvariable)
or possibly by checking typeof(unsafeWindow.tehvariable)=='string' to avoid issues. It largely depends on what you're doing with it; if you're throwing it at a database there are obvious risks there. Copy it to your own variable before you use it in case they change it later.
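
A rough sketch of the text-based approach, which never touches unsafeWindow. It uses textContent and a regular expression rather than innerHTML and substring, and the function names extractStringVar and findQuoteText are made up for illustration. It assumes the value is a double-quoted string literal, as in the original page:

```javascript
// Hypothetical helper: pull the double-quoted string literal assigned to
// a named variable out of the raw text of a <script> block.
function extractStringVar(scriptText, varName) {
  // Escape regex metacharacters that may appear in the variable name.
  var escaped = varName.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var pattern = new RegExp(
    'var\\s+' + escaped + '\\s*=\\s*"((?:[^"\\\\]|\\\\.)*)"'
  );
  var match = pattern.exec(scriptText);
  // The literal body is returned as written; escape sequences such as
  // \" are not decoded.
  return match ? match[1] : null;
}

// Scan every script element of a document for the variable.
function findQuoteText(doc) {
  var scripts = doc.getElementsByTagName('script');
  for (var i = 0; i < scripts.length; i++) {
    var value = extractStringVar(scripts[i].textContent, '_jive_gui_quote_text');
    if (value !== null) {
      return value;
    }
  }
  return null;
}
```

In a userscript this would be called as findQuoteText(document). Because it only reads the script source as text, no page-defined code runs during the lookup.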





cc

Feb 28, 2010, 3:49:03 AM
to greasemon...@googlegroups.com
[Arg, forgot to check the linked page before sending; turns out it doesn't have enough examples. Added a note to the page.]

Actually, it's a lot worse than that, and it's trivial for someone to make a "variable" that was quite malicious indeed: see http://groups.google.com/group/greasemonkey-dev/tree/browse_frm/thread/933ecdb307c4386d/864b5121ad4698cb for details.

esquifit

Feb 28, 2010, 3:57:43 AM
to greasemon...@googlegroups.com
On Sun, Feb 28, 2010 at 9:49 AM, cc <carl...@lavabit.com> wrote:
> [Arg, forgot to check the linked page before sending; turns out it doesn't
> have enough examples. Added a note to the page.]
>
> Actually, it's a lot worse than that, and it's trivial for someone to make a
> "variable" that was quite malicious indeed: see
> http://groups.google.com/group/greasemonkey-dev/tree/browse_frm/thread/933ecdb307c4386d/864b5121ad4698cb
> for details.

The exploit shown in this thread has been fixed since then. This
means that the page doesn't have access to the privileged GM_* api
anymore, at least in this way. I still don't know whether this makes
unsafeWindow actually safe (up to our understanding of the risks), or
if there are further evil actions possible that the page could carry
out and that don't involve GM_*.

esquifit

Feb 28, 2010, 4:03:52 AM
to greasemon...@googlegroups.com
On Sun, Feb 28, 2010 at 12:48 AM, Sam <qufi...@gmail.com> wrote:
> You could test for
> strange contents and reject it, make sure it's a string by using
> var myval = new String(unsafeWindow.tehvariable) or you could try using
> typeof(unsafeWindow.tehvariable)=='string' possibly to avoid issues,

Actually this is exactly what you should avoid. Merely mentioning
"unsafeWindow.something" is (potentially) unsafe, even if
"something" is a variable, since the page can define "something" to be
a getter method that walks back up the call stack until it reaches
the Greasemonkey sandbox scope and does things there with the privileges
of this sandbox, if any. The current release of Greasemonkey makes
provisions for hindering access to the GM_* API; I don't know if there
are other risks.
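
To illustrate only the getter part of this point (not the call-stack escalation), here is a small self-contained sketch. The object fakePageWindow is a stand-in for unsafeWindow, and every name in it is hypothetical:

```javascript
// Stand-in for the page's window object; in a real userscript this role
// would be played by unsafeWindow.
var fakePageWindow = {};
var getterCalls = 0;

// The page can define the "variable" as an accessor property, so any
// read of it runs a function that the page wrote.
Object.defineProperty(fakePageWindow, 'tehvariable', {
  get: function () {
    getterCalls += 1; // arbitrary page code would run here
    return 'looks like a plain string';
  }
});

// Both of these defensive-looking reads invoke the getter.
var looksSafe = typeof fakePageWindow.tehvariable === 'string';
var copy = String(fakePageWindow.tehvariable);
```

After this runs, looksSafe is true and getterCalls is 2: the type check passed, yet page-defined code executed on each read.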

Anthony Lieuallen

Feb 28, 2010, 9:36:41 AM
to greasemon...@googlegroups.com
On 2/28/2010 3:57 AM, esquifit wrote:
> The exploit shown in this thread has been fixed since then. This
> means that the page doesn't have access to the privileged GM_* api
> anymore, at least in this way. I still don't know whether this makes
> unsafeWindow actually safe..

Well said. The unsafeWindow is unsafe specifically because JavaScript
is so powerful and flexible. A blessing and a curse. In short, there
are plenty of good ways to interact with the content page, if that's
really what you want to do:

http://wiki.greasespot.net/Category:Coding_Tips:Interacting_With_The_Page

cc

Feb 28, 2010, 12:13:27 PM
to greasemon...@googlegroups.com
Ah, OK. Hadn't remembered that specifically. My bad.

Sam

Feb 28, 2010, 3:23:13 PM
to greasemon...@googlegroups.com
Ideally there would be some way for the "caller" to opt out of providing any sort of identity built into JavaScript. Isn't it a flaw in wrappers that such information could possibly get through? If the engine knows who is calling the function, shouldn't any wrapper be blacklisted, and the caller be an empty object, or an object that pretends to be an empty version of any function that was requested? There must be a list of wrappers that exist somewhere that could be utilized on a low level to block that information by indexed references.

I don't think there is any useful functionality to worry about breaking, though it might render some functions un-callable from GM unless the fake "caller" would respond to any request with an empty string or simply the correct type of return value for the real function within the wrapper. Ideally you could specify from your script what any of your fake caller's functions would return, in case the script were to try to detect whether or not you were exposed.

I don't know for sure, but can't GM also look at the caller and deny access to its own functions? Caller is misleading if a page function can pretend that the scope called its own function. There must be some way to see where the chain was initiated without getting a reference to it, even if it is just a boolean that indicates whether the origin (or the implicit this) of the line of code being executed before it returns was within the scope of your current wrapper or not. The value would be notScope or something to that effect. I don't think this should be available to the page scope, or else it could be used to block GM from calling functions. When a wrapper is calling a function, notScope could be undefined, or otherwise the test not performed.

Doesn't the engine have a reference to its current scope's root to compare with the function's root scope at the time it parses a function being called on any page script? Do I have to complete Firefox to find out?

I mean, I think it's really important that the user owns their web browser. Why would you ever provide any information to the page? I would rather run the page in a sandbox than the userscript.
