Bug with split() or column add?

Thad Guidry

Feb 17, 2012, 4:17:35 PM
to google-r...@googlegroups.com
Today I tried to split out some IDs from some iTunes URL metadata, such as

http://itunes.apple.com/album/the-line/id397913818?i=397913832&uo=5

using this GREL expression:

value.split("/id")[1].split("?i=")[1].chomp("&uo=5")

I was able to get 1 column created just fine, but then it got stuck on the 2nd column with the WORKING... spinner just twirling away. I noticed some errors in the log (see below).

I then shut down Refine and fired it back up. The project file had been open and running on my server for about a day, so I thought a restart was a good idea in case of weird Amazon EC2 memory or storage problems or something.

After I restarted Refine, I was able to run that GREL expression on the 2nd column for all 400,000 URL rows. So who knows what the real cause was... anyway...

LOG
20:52:51.507 [                   refine] GET /command/core/get-project-metadata (99ms)
20:52:51.583 [                   refine] GET /command/core/get-models (76ms)
20:52:51.722 [                   refine] POST /command/core/get-rows (139ms)
20:52:51.722 [                   refine] GET /command/core/get-history (0ms)
20:52:51.785 [                   refine] GET /command/core/get-history (63ms)
20:54:02.326 [                   refine] POST /command/core/preview-expression (70541ms)
20:54:02.384 [                   refine] GET /command/core/get-expression-history (58ms)
20:54:02.406 [                   refine] GET /command/core/get-starred-expressions (22ms)
20:54:02.407 [                   refine] GET /command/core/get-expression-language-info (1ms)
20:54:19.709 [                   refine] POST /command/core/preview-expression (17302ms)
20:54:31.032 [                   refine] POST /command/core/log-expression (11323ms)
20:54:31.032 [                   refine] POST /command/core/add-column (0ms)
20:54:31.045 [          org.mortbay.log] Error for /command/core/add-column (13ms)
java.lang.AbstractMethodError: java.lang.Exception.toString()Ljava/lang/String;
        at com.google.refine.grel.ast.FunctionCallExpr.evaluate(FunctionCallExpr.java:71)
        at com.google.refine.grel.ast.FunctionCallExpr.evaluate(FunctionCallExpr.java:62)
        at com.google.refine.operations.column.ColumnAdditionOperation$1.visit(ColumnAdditionOperation.java:198)
        at com.google.refine.browsing.util.ConjunctiveFilteredRows.visitRow(ConjunctiveFilteredRows.java:76)
        at com.google.refine.browsing.util.ConjunctiveFilteredRows.accept(ConjunctiveFilteredRows.java:65)
        at com.google.refine.operations.column.ColumnAdditionOperation.createHistoryEntry(ColumnAdditionOperation.java:151)
        at com.google.refine.model.AbstractOperation$1.createHistoryEntry(AbstractOperation.java:52)
        at com.google.refine.process.QuickHistoryEntryProcess.performImmediate(QuickHistoryEntryProcess.java:73)
        at com.google.refine.process.ProcessManager.queueProcess(ProcessManager.java:82)
        at com.google.refine.commands.Command.performProcessAndRespond(Command.java:248)
        at com.google.refine.commands.EngineDependentCommand.doPost(EngineDependentCommand.java:81)
        at com.google.refine.RefineServlet.service(RefineServlet.java:177)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
        at org.mortbay.servlet.UserAgentFilter.doFilter(UserAgentFilter.java:81)
        at org.mortbay.servlet.GzipFilter.doFilter(GzipFilter.java:132)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:938)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:755)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)

--
-Thad
http://www.freebase.com/view/en/thad_guidry

Tom Morris

Feb 17, 2012, 4:43:31 PM
to google-r...@googlegroups.com
On Fri, Feb 17, 2012 at 4:17 PM, Thad Guidry <thadg...@gmail.com> wrote:
> Today I tried to split out some IDs from some itunes url metadata, such as
>
> http://itunes.apple.com/album/the-line/id397913818?i=397913832&uo=5
>
> using this GREL expression:
>
> value.split("/id")[1].split("?i=")[1].chomp("&uo=5")
>
> 20:54:31.032 [                   refine] POST /command/core/add-column (0ms)
> 20:54:31.045 [          org.mortbay.log] Error for /command/core/add-column (13ms)
> java.lang.AbstractMethodError: java.lang.Exception.toString()Ljava/lang/String;
>         at com.google.refine.grel.ast.FunctionCallExpr.evaluate(FunctionCallExpr.java:71)

I don't think my source tree is exactly the same as what you're
running, but I think this is the code snippet:

try {
    return _function.call(bindings, args);
} catch (Exception e) {
    return new EvalError(e.toString()); // <-- apparently the line the trace points at
}

It would be exceedingly weird for an Exception to not have a
toString() method (especially since Object.toString() is implemented),
so I'm not sure what exception got thrown here.

Tom

David Huynh

Feb 20, 2012, 3:04:16 AM
to google-r...@googlegroups.com
I got nothing, either.

Thad, the next time this happens, could you please not shut down Refine right away? Poke at it some more and see if you get more exceptions.

Thanks,

David

Thad Guidry

Feb 20, 2012, 10:28:34 AM
to google-r...@googlegroups.com
I did actually poke at it over and over...before shutting it down.

I did a SHIFT F5, then tried to add the column again with that expression. Again, the WORKING... spinner, with the same error in the log. I tried it 3 times with the same expression and got the same error all 3 times. I gave up, shut down Refine, and tried again; then it worked.

David Huynh

Feb 20, 2012, 9:44:01 PM
to google-r...@googlegroups.com

Thanks, Thad. I've seen some weird errors in the log for instances of Refine that have been up for a few days (even without invoking any command). Maybe this is another symptom of the same cause.

David

Thad Guidry

Mar 10, 2012, 9:38:17 PM
to google-r...@googlegroups.com
Guys,

Looks like this happened again, but this time my instance was only about 5 mins fresh after starting Refine and then trying to perform this GREL expression:

value.split("/id")[1].split("?i=")[1].chomp("&uo=5") 

on a column of data with 400,000 rows that has a basic iTunes album pattern such as this (where the album name and the 2 IDs are the only parts that differ; otherwise the pattern is the same on all 400,000 rows):

http://itunes.apple.com/album/better-the-devil-you-know/id317373594?i=317373712&uo=5

Now, what I have noticed is that the error seems to happen when I use split() and chomp() and does not seem to happen when I use partition().

Does that narrow the problem down, I wonder? (Log attached; running the newest version, r2459.)

errorDuringColumnCreateWithChomp.txt

Tom Morris

Mar 10, 2012, 11:35:28 PM
to google-r...@googlegroups.com
On Sat, Mar 10, 2012 at 9:38 PM, Thad Guidry <thadg...@gmail.com> wrote:
> Guys,
>
> Looks like this happened again, but this time my instance was only about 5
> mins fresh after starting Refine and then trying to perform this GREL
> expression:
>
> value.split("/id")[1].split("?i=")[1].chomp("&uo=5")
>
> on a column of data with 400,000 rows that has a basic itunes album pattern
> such as this (where the album name and 2 ids are the uniquely different
> parts, otherwise the pattern is the same on all 400,000 rows):
>
> http://itunes.apple.com/album/better-the-devil-you-know/id317373594?i=317373712&uo=5
>
> Now, what I have noticed is that the error seems to happen when I use
> split() and chomp() and does not seem to happen when I use partition().
>
> Does that narrow the problem down, I wonder ?  (log attached and running
> newest version r2459 )

Anything special about the data that it blows up on? One thing that I
notice about your expression is that if it's fed

http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5

it might not give the result you expect. Could a data pattern like
that be triggering unexpected behavior?

Tom

Thad Guidry

Mar 11, 2012, 3:32:55 AM
to google-r...@googlegroups.com
On Sat, Mar 10, 2012 at 10:35 PM, Tom Morris <tfmo...@gmail.com> wrote:
> On Sat, Mar 10, 2012 at 9:38 PM, Thad Guidry <thadg...@gmail.com> wrote:
> > Guys,
> >
> > Looks like this happened again, but this time my instance was only about 5
> > mins fresh after starting Refine and then trying to perform this GREL
> > expression:
> >
> > value.split("/id")[1].split("?i=")[1].chomp("&uo=5")
> >
> > on a column of data with 400,000 rows that has a basic itunes album pattern
> > such as this (where the album name and 2 ids are the uniquely different
> > parts, otherwise the pattern is the same on all 400,000 rows):
> >
> > http://itunes.apple.com/album/better-the-devil-you-know/id317373594?i=317373712&uo=5
> >
> > Now, what I have noticed is that the error seems to happen when I use
> > split() and chomp() and does not seem to happen when I use partition().
> >
> > Does that narrow the problem down, I wonder ?  (log attached and running
> > newest version r2459 )
>
> Anything special about the data that it blows up on?  One thing that I
> notice about your expression is that if it's fed

Nothing special.  I try to add the new column with that expression and it gets stuck on WORKING... with the error in the console.

> http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5
>
> it might not give the result you expect.  Could a data pattern like
> that be triggering unexpected behavior?

I haven't a clue.  That's why I am throwing this back to you guys to figure out.
 
--
-Thad
http://www.freebase.com/view/en/thad_guidry

Tom Morris

Mar 11, 2012, 10:45:53 AM
to google-r...@googlegroups.com

Sorry, I shouldn't have been so oblique. Your expression

value.split("/id")[1].split("?i=")[1].chomp("&uo=5")

on my made up album

http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5

is going to give you this array for the first part of the evaluation

[
http://itunes.apple.com/album,
-rather-know-the-devil,
6666666?i=66666666&uo=5
]

the next split will then give you an array out of bounds exception
since it only generates a single element array.
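The two evaluations can be traced with a small Python sketch. It is only a stand-in: it assumes GREL's split() on a literal separator and chomp() behave like Python's str.split() and a suffix strip for these inputs, and the names chomp, extract_track_id, GOOD_URL and TRICKY_URL are made up for illustration.

```python
# Stand-in for value.split("/id")[1].split("?i=")[1].chomp("&uo=5").
# Assumption: GREL split()/chomp() act like the Python below on these inputs.

GOOD_URL = "http://itunes.apple.com/album/the-line/id397913818?i=397913832&uo=5"
TRICKY_URL = "http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5"


def chomp(s, suffix):
    """Remove suffix from the end of s if present (hypothetical helper)."""
    return s[: -len(suffix)] if suffix and s.endswith(suffix) else s


def extract_track_id(url):
    # Text between the first "/id" and the next "/id" (or the end of the string).
    after_id = url.split("/id")[1]
    # Raises IndexError when "?i=" does not occur in that piece.
    after_i = after_id.split("?i=")[1]
    return chomp(after_i, "&uo=5")


if __name__ == "__main__":
    print(extract_track_id(GOOD_URL))
    print(TRICKY_URL.split("/id"))
    try:
        extract_track_id(TRICKY_URL)
    except IndexError as err:
        print("second split has no element [1]:", err)
```

On GOOD_URL the sketch returns the track ID. On TRICKY_URL the album slug itself begins with "id", so the first split yields three pieces and element [1] no longer contains "?i=".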

It's still a weird exception though. Since you probably did your own
build, one thing I'd do is a

refine clean
refine build

sequence to make sure that you don't have mismatched class files of
different vintages lying around. From the description of the
exception, that kind of mismatch is the main cause of this error.

I'll see if I can add some more debug code to the exception handler to
give better information about what function is blowing up and what
arguments it's being passed.

Tom

Thad Guidry

Mar 11, 2012, 11:42:43 AM
to google-r...@googlegroups.com

> Sorry, I shouldn't have been so oblique.  Your expression
>
>  value.split("/id")[1].split("?i=")[1].chomp("&uo=5")
>
> on my made up album
>
>   http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5
>
> is going to give you this array for the first part of the evaluation
>
>  [
>  http://itunes.apple.com/album,
>  -rather-know-the-devil,
>  6666666?i=66666666&uo=5
>  ]

Ah, that makes sense. You're right, I did not think of "I'd", lol. So...

This works as a better method to reach the album ID itself:

value.rpartition("?i=")[0].rpartition("/id")[2]

and then the track ID itself:

value.rpartition("?i=")[2].chomp("&uo=5")
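A quick Python check of the two rpartition() expressions above, assuming GREL's rpartition() splits around the last occurrence of the fragment the way Python's str.rpartition() does when the fragment is present. The helper names (chomp, album_id, track_id) and the URLS list are made up for illustration.

```python
# Python stand-ins for:
#   value.rpartition("?i=")[0].rpartition("/id")[2]   (album ID)
#   value.rpartition("?i=")[2].chomp("&uo=5")         (track ID)
# Assumption: GREL rpartition() matches str.rpartition() when the fragment occurs.

URLS = [
    "http://itunes.apple.com/album/better-the-devil-you-know/id317373594?i=317373712&uo=5",
    "http://itunes.apple.com/album/id-rather-know-the-devil/id6666666?i=66666666&uo=5",
]


def chomp(s, suffix):
    """Remove suffix from the end of s if present (hypothetical helper)."""
    return s[: -len(suffix)] if suffix and s.endswith(suffix) else s


def album_id(url):
    # Everything before the last "?i=", then whatever follows the last "/id".
    return url.rpartition("?i=")[0].rpartition("/id")[2]


def track_id(url):
    # Everything after the last "?i=", minus the trailing "&uo=5".
    return chomp(url.rpartition("?i=")[2], "&uo=5")


if __name__ == "__main__":
    for url in URLS:
        print(album_id(url), track_id(url))
```

Because both lookups work from the right-hand end of the URL, an album slug that happens to start with "id" no longer shifts the pieces.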
 
> the next split will then give you an array out of bounds exception
> since it only generates a single element array.
>
> It's still a weird exception though.  Since you probably did your own
> build, one thing I'd do is a
>
>  refine clean
>  refine build
>
> sequence to make sure that you don't have mismatched class files of
> different vintages lying around.  From the description of the
> exception, that kind of mismatch is the main cause of this error.


Yes, I also do a clean and build with the latest version of Ant.
 
> I'll see if I can add some more debug code to the exception handler to
> give better information about what function is blowing up and what
> arguments it's being passed.
>
> Tom

Agreed it is weird, Tom, but don't bother too much unless you think it would help overall error handling.

--
-Thad
http://www.freebase.com/view/en/thad_guidry