SCALA processing data out of SQL database


Gmail

Mar 11, 2011, 6:12:23 PM
to scala...@googlegroups.com
Hi,

I would like to know an elegant way to process data out of an SQL database.

a simple query could be:
val rs = statement.executeQuery("SELECT * FROM MyTable LIMIT 5")

thanks for your help.

aliunix

Kevin Wright

Mar 11, 2011, 6:16:55 PM
to Gmail, scala...@googlegroups.com
a. Scala supports all Java libraries, including JDBC
d. Scala is not an SQL keyword, you don't need to SHOUT its name
e. "Gmail" is potentially very confusing as an email username
--
Kevin Wright

gtalk / msn : kev.lee...@gmail.com
mail: kevin....@scalatechnology.com
vibe / skype: kev.lee.wright
quora: http://www.quora.com/Kevin-Wright
twitter: @thecoda

"My point today is that, if we wish to count lines of code, we should not regard them as "lines produced" but as "lines spent": the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger" ~ Dijkstra

Tony Morris

Mar 11, 2011, 6:16:51 PM
to scala...@googlegroups.com
You want elegant eh? Try this to get started:

* scalaz/Iteratee.scala
* case class OrSQLException[A](o: Either[java.sql.SQLException, A])
* case class Connect[A](k: java.sql.Connection => WriterT[OrSQLException, IndSeq[L], A])

--
Tony Morris
http://tmorris.net/


Kevin Wright

Mar 11, 2011, 6:19:20 PM
to tmo...@tmorris.net, Tony Morris, scala...@googlegroups.com
You seem to have accidentally missed the part about actually issuing a database query

 



Tony Morris

Mar 11, 2011, 6:21:17 PM
to Kevin Wright, tmo...@tmorris.net, scala...@googlegroups.com
On 12/03/11 09:19, Kevin Wright wrote:
> On 11 March 2011 23:16, Tony Morris <tonym...@gmail.com> wrote:
>
>> On 12/03/11 09:12, Gmail wrote:
>>> Hi,
>>>
>>> I would like to know an elegant way to process data out of SQL database.
>>>
>>> a simple query could be:
>>> val rs = statement.executeQuery("SELECT * FROM MyTable LIMIT 5")
>>>
>>> thanks for your help.
>>>
>>> aliunix
>> You want elegant eh? Try this to get started:
>>
>> * scalaz/Iteratee.scala
>> * case class OrSQLException[A](o: Either[java.sql.SQLException, A])
>> * case class Connect[A](k: java.sql.Connection =>
>> WriterT[OrSQLException, IndSeq[L], A]
>
>
> You seemed to have accidentally missed the part about actually issuing a
> database query
>
>
No.

aliunix

Mar 11, 2011, 6:49:44 PM
to scala...@googlegroups.com
OK!

Thanks for your replies.
Well, I'm quite new to Scala but coming from the Java world.
Your hints look fine, but they seem a lot more complicated than processing an SQL response from Java.

What I need is quite "simple" to explain:
I managed to execute an SQL statement from my Scala code:
val rs = stmt.executeQuery("SELECT * FROM MyTable LIMIT 5")

I just need a way to process "rs" without explicitly knowing the content of "MyTable",
a kind of pretty-printing of the content on standard output.

any idea?

Many thanks,
aliunix

Sadek Drobi

Mar 11, 2011, 9:22:58 PM
to Gmail, scala...@googlegroups.com
Actually, that is a question I've been working on for the last few months. I have implemented a Packrat parser for parsing SQL results (for the Play framework), with some added utility functions.

So in your example it will look something like

SQL("SELECT * FROM MyTable LIMIT 5").as(MyTable *)

having

object MyTable extends MagicParser[MyTable]

As with any parser combinator, you gain parser reuse and composability.

Sadek
--
www.sadekdrobi.com
ʎdoɹʇuǝ

Derek Williams

Mar 11, 2011, 9:23:55 PM
to aliunix, scala...@googlegroups.com
On Fri, Mar 11, 2011 at 4:49 PM, aliunix <aliu...@gmail.com> wrote:
> any idea?

If you don't plan on using one of the suggested libraries to assist
you, doing this in Scala is much like doing it in Java.

--
Derek

Srirangan

Mar 11, 2011, 9:42:00 PM
to scala...@googlegroups.com
If you are more comfortable constructing your own SQL queries, Querulous by Twitter will be very attractive.

Querulous

An agreeable way to talk to your database.

Features

  • Handles all the JDBC bullshit so you don't have to: type casting for primitives and collections, exception handling and transactions, and so forth;
  • Fault tolerant: configurable strategies such as timeouts, mark-dead thresholds, and retries;
  • Designed for operability: rich statistics about your database usage and extensive debug logging;
  • Minimalist: minimal code, minimal assumptions, minimal dependencies. You write highly-tuned SQL and we get out of the way;
  • Highly modular, highly configurable.

--
Thanks | Srirangan

Tony Morris

Mar 11, 2011, 10:12:17 PM
to scala...@googlegroups.com
Unfortunately, there is a contention between what you are calling
"simple" and "elegant." It's not a simple problem and to solve it
effectively and elegantly requires a lot of "complexity." I use quotes
here because I don't necessarily agree that it is in fact complex or not
simple, but I expect this is how you are using these terms.

I mention these points because I think, regardless of how you choose to
tackle the problem, it's important to own up. I am currently attempting
to solve this exact problem using Scala in my day job. Among those who
are also attempting it, there is not yet a generally agreed approach.

Hope that helps.


aliunix

Mar 11, 2011, 10:21:28 PM
to tmo...@tmorris.net, scala...@googlegroups.com
Hi,

Thank you all for your replies and pointers.
Following this forum is very exciting! :-)
Indeed, "simple" and "elegant" are not that easy to achieve together in programming.
Anyway, the pointers provided are helping me get closer to my goal: doing it in a purely functional style.

Thanks a lot (and still checking out whether there will be additional posts on the topic ;-) )

Best,
aliunix


HamsterofDeath

Mar 12, 2011, 4:23:18 AM
to scala...@googlegroups.com
not possible, not in any language i know

AGYNAMIX Torsten Uhlmann

Mar 12, 2011, 4:48:38 AM
to scala-user
I'm not sure if I understand your question correctly.

If you want to know more about the table you are selecting from without building that knowledge into your application, you can use ResultSet.getMetaData(), which returns a ResultSetMetaData (http://download.oracle.com/javase/6/docs/api/java/sql/ResultSetMetaData.html).

Iterating over its columns, you learn their names and types. With that knowledge you can extract the different values from the ResultSet and do whatever you like with them.
I guess that's the same mechanism that's used by DB tools (SQL Developer or DBVisualizer and lots of others) to dynamically query tables.

Doing this in Scala is basically the same as in Java, just with a different syntax. I used that approach in a little backup script in Groovy that would export tables and Blobs to CSV files and create a file for each blob value...
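
A minimal sketch of the metadata approach described above, assuming nothing beyond the standard JDBC API. The object name ResultSetPrinter and its three helpers are made up for illustration; they are not from any library mentioned in this thread.

```scala
import java.sql.ResultSet

// Hypothetical helper, for illustration only.
object ResultSetPrinter {

  /** Column labels read from the metadata (JDBC columns are 1-indexed). */
  def columnLabels(rs: ResultSet): Vector[String] = {
    val md = rs.getMetaData
    (1 to md.getColumnCount).map(i => md.getColumnLabel(i)).toVector
  }

  /** Every remaining row, each cell rendered as text; SQL NULL becomes "null". */
  def rows(rs: ResultSet): Vector[Vector[String]] = {
    val columnCount = rs.getMetaData.getColumnCount
    val buf = Vector.newBuilder[Vector[String]]
    while (rs.next()) {
      buf += (1 to columnCount).map(i => String.valueOf(rs.getObject(i))).toVector
    }
    buf.result()
  }

  /** Pure formatting step: one line per row, cells separated by " | ". */
  def render(header: Vector[String], body: Vector[Vector[String]]): String =
    (header +: body).map(_.mkString(" | ")).mkString("\n")
}
```

With an open ResultSet rs, println(ResultSetPrinter.render(ResultSetPrinter.columnLabels(rs), ResultSetPrinter.rows(rs))) would print the table without the code knowing anything about MyTable.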

Is that what you asked for?

Torsten.
Kind Regards,

Torsten Uhlmann

-- 
Torsten Uhlmann
AGYNAMIX(R). Passionate Software.
Inh. Torsten Uhlmann
Buchenweg 5, 09380 Thalheim
Phone:       +49 3721 273445
Fax:             +49 3721 273446
Mobile:       +49 151 12412427
Web:           http://www.agynamix.de
Skype:        torsten.uhlmann
Facebook, Twitter: agynamix

aliunix

Mar 12, 2011, 5:40:19 AM
to AGYNAMIX Torsten Uhlmann, scala-user
Yes, thank you!
actually I figured out a way to do it:

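import java.sql.Connection

// Loan pattern: run f on a resource, then always close it.
// The structural type accepts anything with a close() method (called reflectively).
def using[A <: { def close(): Unit }, B](param: A)(f: A => B): B =
  try f(param)
  finally param.close()

def start(conn: Connection) = using(conn.createStatement) { st =>
  using(st.executeQuery("SELECT * FROM MyTable LIMIT 5")) { rs =>
    // ResultSetMetaData has no close() method, so it cannot be passed to using
    val md = rs.getMetaData
    /* metadata handling here */
  }
}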
I'm not sure whether this is a concise and Scala-like way to do it or whether there is a more interesting idea...
But your feedback really helped!

Thanks,
aliunix

Jason Zaugg

Mar 12, 2011, 6:10:54 AM
to aliunix, AGYNAMIX Torsten Uhlmann, scala-user
On Sat, Mar 12, 2011 at 11:40 AM, aliunix <aliu...@gmail.com> wrote:
> Iterating this you learn about the names of the columns and their types.
> With that knowledge you can extract the different values from the ResultSet
> and do whatever you like with it.
> I guess that's the same mechanism that's used by DB tools (SQL Developer or
> DBVisualizer and lots of others) to dynamically query tables.
> Doing this in Scala is basically the same as in Java, just with a different
> syntax. I used that approach in a little backup script in Groovy that would
> export tables and Blobs to CSV files and create a file for each blob
> value...

That's a one-liner with spring-jdbc:

http://static.springsource.org/spring/docs/3.0.x/javadoc-api/org/springframework/jdbc/core/simple/SimpleJdbcOperations.html#queryForList(java.lang.String, java.util.Map)

-jason

Nils Kilden-Pedersen

Mar 12, 2011, 10:57:11 AM
to Gmail, scala...@googlegroups.com
On Fri, Mar 11, 2011 at 5:12 PM, Gmail <aliu...@gmail.com> wrote:
> Hi,
>
> I would like to know an elegant way to process data out of SQL database.
>
> a simple query could be:
>        val rs = statement.executeQuery("SELECT * FROM MyTable LIMIT 5")


Using O/R Broker:

val list = broker.readOnly() { session =>
  session.selectAll("SELECT * FROM MyTable LIMIT 5")
}

Joshua Gooding

Mar 12, 2011, 11:30:51 AM
to scala-user
You're not going to get a simple explanation from Tony :)


Kevin Wright

Mar 12, 2011, 11:37:40 AM
to Joshua Gooding, scala-user
On 12 March 2011 16:30, Joshua Gooding <skav...@gmail.com> wrote:
> You're not going to get a simple explanation from Tony :)


Of course you will... It's just going to be for some contramorphic definition of "simple"

Nils Kilden-Pedersen

Mar 12, 2011, 3:00:32 PM
to aliunix, scala...@googlegroups.com
On Fri, Mar 11, 2011 at 5:49 PM, aliunix <aliu...@gmail.com> wrote:
> OK!
>
> Thanks for your replies.
> Well, I'm quite new to scala but coming from java world.
> Your hints look fine but it seems to be a lot more complicated than processing SQL response from java.
>
> what I need is quite "simple" to explain:
> I managed to execute SQL statement from my scala code:
> val rs = stmt.executeQuery("SELECT * FROM MyTable LIMIT 5")
>
> I just need a way to process "rs" without explicitly knowing the content of "MyTable"
> a king to pretty printing the content on standard output

Ahh, ok, then try this:

object MapExtractor extends RowExtractor[Map[String, Any]] {
  def extract(row: Row) = {
    var map = new scala.collection.immutable.HashMap[String, Any]
    row.columns foreach { name ⇒
      map += name -> row.any(name)
    }
    map
  }
}

object PrintTable {
  def main(args: Array[String]) {
    val token = Token("SELECT * FROM MyTable LIMIT 5", 'myTableSelectAll, MapExtractor)
    val rows = broker.readOnly() { session ⇒
      session.selectAll(token)
    }
    rows foreach (println)
  }
}

HamsterofDeath

Mar 12, 2011, 3:11:51 PM
to scala...@googlegroups.com
the ideal solution would be for either the language or ide or both to know sql and dynamically add getters to the resultset depending on the result of the select statement.
instead of rs.getInt("columnName"), you should be able to call "rs.getRealMethodWithColumnNameReturningTheCorrectType"

aliunix

Mar 12, 2011, 3:34:21 PM
to HamsterofDeath, scala...@googlegroups.com
Yeah!

That's more or less what I tried to do with the following pattern matching:

metadata.getColumnType(column) match {
  case Types.VARCHAR | Types.CHAR | Types.LONGVARCHAR => rs.getString(column)
  case Types.INTEGER | Types.BIGINT | Types.SMALLINT | Types.NUMERIC => rs.getLong(column).toString
  case Types.DECIMAL | Types.DOUBLE | Types.FLOAT | Types.REAL => rs.getDouble(column).toString
  case Types.DATE | Types.TIME | Types.TIMESTAMP => rs.getTimestamp(column).toString
  // any other column type would otherwise throw a MatchError
  case _ => String.valueOf(rs.getObject(column))
}

Regarding the processing, I did something like:

def processRow(rs: ResultSet, md: ResultSetMetaData): String =
  ((1 to md.getColumnCount) flatMap { column => processCell(rs, md, column) }).mkString

where processCell is in charge of handling all the retrieved data from SQL with the pattern matching above.

Nils' solution sounds very interesting. I'll try to see how to integrate something like that in my code ;-)

Still very interested in getting ideas and pointers ;-)

aliunix

HamsterofDeath

Mar 12, 2011, 3:56:02 PM
to scala...@googlegroups.com
i started to think about a dsl in scala that secretly executes sql calls in the background.
you need to explain to scala which tables exist:
class TableDescPerson(val name:ColDef[String], val age:ColDef[Int])

put all of these into a big object and then access the tables like this:
AllTables.person select (TableDescPerson.name, TableDescPerson.age) where
and so on, using lots of implicits. the resulting row would be a typed tuple.
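
A rough sketch of how such typed column descriptions might fit together, under heavy assumptions: every name here (ColDef, Table, Query2, Person) is hypothetical, no SQL is actually executed, and a row is stood in for by a plain Map. It only shows how the column types can flow into a typed tuple.

```scala
// Hypothetical mini-DSL, for illustration only; not the API of any real library.
object TypedSelectSketch {

  // A column that knows its name and the Scala type T of its values.
  final case class ColDef[T](name: String) {
    // Unchecked cast: this sketch simply trusts the declared column type.
    def read(row: Map[String, Any]): T = row(name).asInstanceOf[T]
  }

  // A two-column query: the SQL text plus a function turning a raw row into a typed tuple.
  final case class Query2[A, B](sql: String, extract: Map[String, Any] => (A, B))

  final case class Table(name: String) {
    def select[A, B](a: ColDef[A], b: ColDef[B]): Query2[A, B] =
      Query2(s"SELECT ${a.name}, ${b.name} FROM $name", row => (a.read(row), b.read(row)))
  }

  // Table description, as in "explain to scala which tables exist".
  object Person {
    val table = Table("person")
    val name  = ColDef[String]("name")
    val age   = ColDef[Int]("age")
  }
}
```

Here Person.table.select(Person.name, Person.age) has type Query2[String, Int], so the compiler knows each extracted row is a (String, Int).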

Nils Kilden-Pedersen

Mar 12, 2011, 4:10:00 PM
to aliunix, HamsterofDeath, scala...@googlegroups.com
On Sat, Mar 12, 2011 at 2:34 PM, aliunix <aliu...@gmail.com> wrote:
> Yeah!
>
> That's somehow what I tried to do by the following pattern matching:
>
> metadata.getColumnType(columnNumber) match {
> case (Types.VARCHAR | Types.CHAR | Types.LONGVARCHAR) => rs.getString(column)
>     case (Types.INTEGER | Types.BIGINT | Types.SMALLINT | Types.NUMERIC) => rs.getLong(column).toString
>     case (Types.DECIMAL | Types.DOUBLE | Types.FLOAT | Types.REAL) => rs.getDouble(column).toString
>     case (Types.DATE | Types.TIME | Types.TIMESTAMP) => rs.getTimestamp(column).toString
> }

Actually, if you're using a half-decent JDBC implementation, calling rs.getObject already gives you the right data type back (or in O/R Broker: row.any)

aliunix

Mar 12, 2011, 4:37:21 PM
to HamsterofDeath, scala...@googlegroups.com
I'm delighted to see the outcome ;-)

aliunix

Mar 12, 2011, 4:38:02 PM
to Nils Kilden-Pedersen, HamsterofDeath, scala...@googlegroups.com
You're right!
rs.getObject(column).toString
is enough!

Thanks
aliunix

HamsterofDeath

Mar 12, 2011, 4:44:49 PM
to aliunix, Nils Kilden-Pedersen, scala...@googlegroups.com
i predict an NPE
(not sure: are you supposed to write "an" or "a"? i know N is not a vowel, but it is pronounced "en" like in "end")
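
The predicted NPE comes from SQL NULL: getObject returns null for a NULL column, and calling toString on it throws. One way to guard against that, sketched with a made-up helper name (NullSafeCell) and an arbitrary choice of "NULL" as the placeholder text:

```scala
// Hypothetical helper, for illustration only.
object NullSafeCell {
  // Option(null) is None, so a SQL NULL is rendered as the placeholder instead of throwing.
  def render(value: AnyRef): String =
    Option(value).map(_.toString).getOrElse("NULL")
}
```

It would be used as NullSafeCell.render(rs.getObject(column)) in place of rs.getObject(column).toString.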

Kevin Wright

Mar 12, 2011, 5:36:31 PM
to HamsterofDeath, aliunix, Nils Kilden-Pedersen, scala...@googlegroups.com
On 12 March 2011 21:44, HamsterofDeath <h-s...@gmx.de> wrote:
> i predict an NPE
> (not sure: are you supposed to write "an" or "a"? i know N is not a vocal, but is pronounced "en" like in "end")

The formal rule is to go by the spelled-out letter names: http://en.wikipedia.org/wiki/English_alphabet#Letter_names

So it's: an "en pee e" 

The rule of thumb is: If it begins with a vowel-sound, use "an"

 



Tony Morris

Mar 12, 2011, 5:39:49 PM
to scala...@googlegroups.com
Stop using the word simple to mean "I do not understand" and complex to
mean, "I do not understand."
By request from Viktor.


--
Tony Morris
http://tmorris.net/


Kevin Wright

Mar 12, 2011, 6:01:35 PM
to tmo...@tmorris.net, scala...@googlegroups.com


On 12 Mar 2011 22:39, "Tony Morris" <tonym...@gmail.com> wrote:
>
> Stop using the word simple to mean "I do not understand" and complex to
> mean, "I do not understand."
> By request from Viktor.
>

I agree, to a certain limit. If you'd said that the use of `Either` and monadic comprehension is an elegant solution to the problem, then you'd be right, and would have some valid basis for the inevitable follow-up claims of anti-intellectualism.

But you didn't. Instead you just posted 3 lines of code that show little more than the names chosen by scalaz to represent certain concepts. That isn't simple, it's accidental complexity.

Worse still, it's only loosely relevant.

Tony Morris

Mar 12, 2011, 6:07:37 PM
to Kevin Wright, tmo...@tmorris.net, scala...@googlegroups.com
On 13/03/11 09:01, Kevin Wright wrote:
> On 12 Mar 2011 22:39, "Tony Morris" <tonym...@gmail.com> wrote:
>> Stop using the word simple to mean "I do not understand" and complex to
>> mean, "I do not understand."
>> By request from Viktor.
>>
> I agree, to a certain limit. If you'd said that the use of `Either` and
> monadic comprehension is an elegant solution to the problem, then you'd be
> right, and would have some valid basis for the inevitable follow-up claims
> of anti-intellectualism.
>
> But you didn't. Instead you just posted 3 lines of code that show little
> more than the names chosen by scalaz to represent certain concepts. That
> isn't simple, it's accidental complexity.
This bullshit gets under my skin a bit.

You have no idea what I posted. No fucking clue whatsoever. Nada. I know
this because what you just wrote is completely wrong on more than one
level. It's a dead give-away.

Stop trying to claim you do understand, or come to understand what I
posted, so that you can pass comment. Why is this unreasonable to you?
Do you think you are less transparent than you are? You're not doing
anyone any favours by making stuff up -- I am of the understanding that
helping people is the intention. Why is "please try" such an
unreasonable request?

> Worse still, it's only loosely relevant.

You have demonstrated that you are severely under-qualified to make this call. Worse still, you're probably going to insist otherwise.

Stefan Zeiger

Mar 12, 2011, 6:19:22 PM
to scala...@googlegroups.com
On 2011-03-12 21:56, HamsterofDeath wrote:
> i started to think about a dsl in scala that secretly executes sql calls in the background.
> you need to explain to scala which tables exist:
> class TableDescPerson(val name:ColDef[String], val age:ColDef[Int])
>
> put all of these into a big object and then access the tables like this:
> AllTables.person select (TableDescPerson.name, TableDescPerson.age) where
> and so on, using lots of implicits. the resulting row would be a typed tuple.

That's pretty much what ScalaQuery does. And it does indeed make heavy use of implicits (as type classes).

-sz

Jim Powers

Mar 12, 2011, 6:32:33 PM
to scala...@googlegroups.com
> * scalaz/Iteratee.scala

OK, this is definitely intriguing. As a result I've spent a bunch of time reading up on Iteratees (here: http://john-millikin.com/articles/understanding-iteratees/ and here: http://www.haskell.org/pipermail/haskell-cafe/2008-September/047738.html). I'm sold.

> * case class OrSQLException[A](o: Either[java.sql.SQLException, A])
> * case class Connect[A](k: java.sql.Connection =>
> WriterT[OrSQLException, IndSeq[L], A]

So, the question is: is the IndSeq[L] for logging or for collecting results, or are the results in A? I presume the latter (IndSeq[L] -> logging, and A -> results), since a bunch of support for that was recently added to scalaz.

So, guessing at the parts that still need to be done: use Iteratees to "wrap" things like statements and (more likely) ResultSets.

Am I at least on the right track?

--
Jim Powers

Tony Morris

Mar 12, 2011, 6:36:01 PM
to scala...@googlegroups.com
Hi Jim,
Yes you are. Also, much of what has been discussed elsewhere in this
thread can be modelled by the latter.
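
The reading confirmed here (log entries accumulate on one side, the result or an SQLException on the other) can be sketched without scalaz. This is only an approximation under stated assumptions: a plain Vector[L] stands in for IndSeq[L], a tuple stands in for WriterT, L is made an explicit type parameter, and the helper names point, log and attempt are invented for the sketch.

```scala
import java.sql.{Connection, SQLException}

// Illustrative stand-in; not the scalaz types named in the thread.
object ConnectSketch {

  // A database action: given a connection, produce log entries plus either an error or a value.
  final case class Connect[L, A](k: Connection => (Vector[L], Either[SQLException, A])) {

    def map[B](f: A => B): Connect[L, B] =
      flatMap(a => Connect.point[L, B](f(a)))

    // Sequencing: stop at the first failure, but keep every log entry written so far.
    def flatMap[B](f: A => Connect[L, B]): Connect[L, B] = Connect[L, B] { c =>
      k(c) match {
        case (log, Left(e)) => (log, Left(e))
        case (log, Right(a)) =>
          val (log2, result) = f(a).k(c)
          (log ++ log2, result)
      }
    }
  }

  object Connect {
    // An action that succeeds with a without touching the connection.
    def point[L, A](a: A): Connect[L, A] =
      Connect[L, A](_ => (Vector.empty[L], Right(a)))

    // An action that only writes one log entry.
    def log[L](entry: L): Connect[L, Unit] =
      Connect[L, Unit](_ => (Vector(entry), Right(())))

    // Wrap a raw JDBC call, turning a thrown SQLException into a Left.
    def attempt[L, A](f: Connection => A): Connect[L, A] = Connect[L, A] { c =>
      try { (Vector.empty[L], Right(f(c))) }
      catch { case e: SQLException => (Vector.empty[L], Left(e)) }
    }
  }
}
```

Because map and flatMap are defined, several such actions can be combined in a for comprehension and run once against a single connection by applying k.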

Jim Balter

Mar 12, 2011, 7:33:50 PM
to scala...@googlegroups.com, Kevin Wright
There's that compulsion again.

-- Jim

Jim Powers

Mar 12, 2011, 7:54:14 PM
to scala...@googlegroups.com
Tony and others...

It looks like IndSeq[L] is a wrapper around Finger Trees. Daniel
Spiewak gave a talk at the NE Scala Symposium
(http://www.nescala.org/2011/) on functional data structures and
covered Finger Trees at some length. The claim Daniel made is that
Finger Trees have very desirable on-paper (time/space complexity)
properties but the actual run-time behavior isn't in agreement with
"the theory". The video can be watched here:
http://vimeo.com/20262239

Thoughts? Comments? Open to all of course.

--
Jim Powers

Tony Morris

Mar 12, 2011, 8:00:45 PM
to scala...@googlegroups.com
Yes I've seen it.

I'm either misunderstanding something important or Daniel is. There is a
lot of nonsense out there, so my optimising algorithm for nonsense
detection compels me to dismiss it and carry on, but I'm also curious
what on earth it could possibly mean to have such a dichotomy in the
first place. If you can clear it up any more that would be great.


--
Tony Morris
http://tmorris.net/


Jim Powers

Mar 12, 2011, 9:29:05 PM
to scala...@googlegroups.com
On Sat, Mar 12, 2011 at 8:00 PM, Tony Morris <tonym...@gmail.com> wrote:
> I'm either misunderstanding something important or Daniel is. There is a
> lot of nonsense out there, so my optimising algorithm for nonsense
> detection compels me to dismiss it and carry on, but I'm also curious
> what on earth it could possibly mean to have such a dichotomy in the
> first place. If you can clear it up any more that would be great.

Well, as best as I can tell there are 4 possibilities

~ == Flips good to not good
T == Test procedure good (for some appropriate definition of "good")
I == Implementation of algorithm good (for some appropriate definition
of "good")

(T,I) => Daniel's claims are valid
(~T,I) => Daniel's claims are not to be relied upon because the
testing procedure is flawed
(T,~I) => Daniel's implementation exhibits the undesirable behavior
claimed (reliably) but the claim is likely not generalizable because
the implementation is not good
(~T,~I) => No useful information can be extracted from Daniel's claims

I take it that you consider the case (T,I) to be unlikely, which is
understandable. One would expect that the only meaningful definition
of "a good implementation" that could be applied in this case is one
where the theoretical properties of the algorithm are exhibited. It
is certainly possible to verify his claims.

--
Jim Powers

Kevin Wright

Mar 13, 2011, 10:24:21 AM
to Tony Morris, scala...@googlegroups.com


On 13 Mar 2011 13:25, "Tony Morris" <tmo...@tmorris.net> wrote:
>
> You may continue to protest all you like but your constant demands to respond such that "as long as Kevin understands it" will continue to be met with resistance. If you continue to make these demands in such a poorly thought-out manner, I will continue to think to myself, "fuck you", dismiss your babble and carry on.
>
> Of course I want to teach, but even the most dedicated will take prognostic factors into account. If you think you represent a majority in your whining, you're horribly mistaken, again.
>

I believe I represent the majority in having limited time in which to study entire new branches of mathematics, yes.

If you want to teach, do so.

I challenge you to stand by your claim and take all the creative energy that you clearly possess for the sake of personal criticism, using it to explain iteratees instead.

Judge for yourself whether or not it's a useful exercise based on the responses to this list, don't just take my word for it.

> I hope you work it out, I really do. I don't enjoy this one bit.
>

Jim Powers

Mar 13, 2011, 10:40:54 AM
to scala...@googlegroups.com

Well, as a side note on iteratees, one of the links I provided on them said that asking the original author to explain them required a "save vs. confusion" :-).

--
Jim Powers


Chris Twiner

Mar 13, 2011, 11:56:59 AM
to Kevin Wright, Tony Morris, scala...@googlegroups.com

I side with Kevin about the direct usefulness of such replies but, as is almost always the case, Tony gives plenty of information for the truly curious.

I personally tend to think of Iteratees simply as resumable folds. In this case the db can say it's out of data or the row printer can say it's had enough (either for now or in total).

The main trick is that the next operation kind of acts as a continuation for the one after it. This then allows "state" to be kept together with the computation itself and lovely composition to boot.

My personal experience with them is via xml comprehension and processing - think composable stream handling made easy (Scales Xml for those interested). And yes I do leverage Scalaz for that :-)
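
The "resumable fold" description can be made concrete with a very small sketch. This is a simplified model written for this thread, not the scalaz Iteratee API: the names Input, Step, Done, Cont, take, feed and run are all choices made here, and error handling is left out.

```scala
// Simplified iteratee model, for illustration only.
object IterateeSketch {

  // What a producer can hand to a consumer: one element, or "no more data".
  sealed trait Input[+E]
  final case class El[E](e: E) extends Input[E]
  case object EOF extends Input[Nothing]

  // A consumer is either finished with a result, or waiting for the next input.
  sealed trait Step[E, A]
  final case class Done[E, A](result: A) extends Step[E, A]
  final case class Cont[E, A](next: Input[E] => Step[E, A]) extends Step[E, A]

  // Consumer: collect at most n elements, then declare "I've had enough".
  def take[E](n: Int, acc: Vector[E] = Vector.empty[E]): Step[E, Vector[E]] =
    if (n <= 0) Done[E, Vector[E]](acc)
    else Cont[E, Vector[E]] {
      case El(e) => take(n - 1, acc :+ e)
      case EOF   => Done[E, Vector[E]](acc)
    }

  // Producer: feed elements until the consumer is done or the source runs dry.
  @annotation.tailrec
  def feed[E, A](source: Iterator[E], step: Step[E, A]): Step[E, A] = step match {
    case Cont(k) if source.hasNext => feed(source, k(El(source.next())))
    case other                     => other
  }

  // Signal end of input and extract the result, if the consumer yields one.
  def run[E, A](step: Step[E, A]): Option[A] = step match {
    case Done(a) => Some(a)
    case Cont(k) => k(EOF) match {
      case Done(a) => Some(a)
      case Cont(_) => None
    }
  }
}
```

The state of the fold (acc and the remaining count) lives inside the continuation, so the producer never sees it, and a Cont step returned by feed can be fed again later from another source.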

Tony Morris

Mar 13, 2011, 9:08:54 PM
to scala...@googlegroups.com
I've been trying to download the video to help clear this up but twice
it has failed at 80%. Can you provide a more direct link? Vimeo doesn't
allow fast-forward it seems, and I cannot find the video elsewhere.

As a side-note, the use of IndSeq here is already an unfortunate
compromise because a more suitable structure such as Endo[List[A]] has
not been shown to be viable in Scala (is there a way?). If there is a
way of improving the situation, I'd be very keen to see how.

Jim Powers

Mar 13, 2011, 9:48:13 PM
to scala...@googlegroups.com
Well, if you mean a link to the code in question, then it can be found here:

https://github.com/djspiewak/extreme-cleverness

His slides:

PDF: http://bit.ly/fZKfTc
Keynote: http://bit.ly/ge1VhF

There are a lot of animations so if you're on a Mac (I'm under Fedora
so no love) the Keynote version is the way to go because of embedded
animations.

Sadly, he's using buildr (http://buildr.apache.org/ - I've not had the
greatest success using it) to build the project, but it should not be
hard to move to SBT or Maven as necessary; it all seems self-contained.

Not to fan the flames of recent fires involving you (and myself on
another thread), but I'm trying to catch up on all the papers,
presentations, videos, etc. behind the various things in Scalaz, I
haven't even cracked open Endomorphisms yet - that's about 50 papers
or so down the list. Your posts have that effect on people ;-).

Hope the link is what you want.

--
Jim Powers

Tony Morris

Mar 13, 2011, 9:57:02 PM
to scala...@googlegroups.com
On 14/03/11 11:48, Jim Powers wrote:
> I
> haven't even cracked open Endomorphisms yet - that's about 50 papers
> or so down the list. Your posts have that effect on people ;-).
Sorry, didn't think of that. This particular one is pretty easy to chew
off, spit out and carry on.

// endomorphism
// Wikipedia: endomorphism is a morphism (or homomorphism) from a
// mathematical object to itself.
case class Endo[A](k: A => A)

Pretty easy eh?

Scalaz used to have:
case class DList[A](z: Endo[List[A]])

However, Paul Chiusano and friends tried to use it in production only to
find that Scala didn't have sufficient optimisation to make it viable.
We tried to save it to no avail. The reason it would be useful is
because it has O(1) cons and snoc (append), which would be great for
logging. It has an O(n) toList method that runs along the functions. I've
not found a way to save the DList -- would be great if we were wrong
about its viability.
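
For readers who have not met difference lists, here is a naive sketch of the idea built on the two case classes above. It is not the old Scalaz implementation; the operations ++, :+, +:, empty and single are a guess at a reasonable minimal surface.

```scala
// Naive difference list, for illustration only.
object DListSketch {

  final case class Endo[A](k: A => A)

  // A DList is a function that prepends its own elements to whatever list follows.
  final case class DList[A](z: Endo[List[A]]) {
    // O(n): apply the accumulated function to the empty list.
    def toList: List[A] = z.k(Nil)

    // O(1): appending is just function composition; nothing is copied.
    def ++(that: DList[A]): DList[A] = DList(Endo(z.k compose that.z.k))

    def :+(a: A): DList[A] = this ++ DList.single(a) // snoc
    def +:(a: A): DList[A] = DList.single(a) ++ this // cons
  }

  object DList {
    def empty[A]: DList[A] = DList(Endo[List[A]](identity))
    def single[A](a: A): DList[A] = DList(Endo[List[A]](a :: _))
  }
}
```

One known weakness of this naive form is that a long run of appends builds a deep chain of nested closures, so toList can need a correspondingly deep call stack. Whether that is the optimisation problem referred to above is not stated in the thread.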

Thanks for the pointers -- I'll check it out when I get a moment.

Jim Powers

Mar 13, 2011, 10:12:05 PM
to scala-user
On Sun, Mar 13, 2011 at 9:57 PM, Tony Morris <tonym...@gmail.com> wrote:
> On 14/03/11 11:48, Jim Powers wrote:
>> I
>> haven't even cracked open Endomorphisms yet - that's about 50 papers
>> or so down the list.  Your posts have that effect on people ;-).
> Sorry, didn't think of that. This particular one is pretty easy to chew
> off, spit out and carry on.
>
> // endomorphism
> // Wikipedia: endomorphism is a morphism (or homomorphism) from a
> mathematical object to itself.
> case class Endo[A](k: A => A)
>
> Pretty easy eh?

HA! I saw that on Wikipedia as well, here's the deal (no doubt you
encounter this *all the time*): from the perspective of someone not
coming from a (modern - Lisps not counting) functional programming
background the natural question is "what's the point?", or "what does
that buy me?", or more commonly "huh?"

I've been programming for a long time and have read a lot, apparently
I've been reading mostly the wrong stuff ;-).

I take it somewhat on faith that I *must* learn this stuff because
people I recognize as objectively vastly smarter than I are able to
make very good use of the concepts (merely in how one thinks about
problems) as well as the various "implementations" of said concepts.

Ah well, back to work.

--
Jim Powers

Tony Morris

Mar 13, 2011, 10:28:16 PM
to scala...@googlegroups.com
Yes, I do see this often and I struggle to give an answer that is both
accurate and satisfactory to the enquirer. The use of endomorphisms here
is not a really good example to help illustrate. I may well have just
used case class DList[A](k: List[A] => List[A]) however you miss out on
some goodies. For example Endo[A] forms a monoid -- a really useful one
in fact. There are a large number of useful operations on monoids, here
is one specialised to the Endo monoid:

def collapse[A](a: List[A]): A = a.foldRight(Endo(z => z)) { case
(Endo(a), Endo(b)) => Endo(a compose b) }

Let's be clear here. This is one operation, specialised to one reducible
container (List) on one specific monoid (Endo). What about all the other
containers and monoids? You might get a glimpse here at potential
usefulness.
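
One way to picture "all the other monoids" is a small type class. The sketch below is an assumption-laden illustration, not Scalaz's actual Monoid: the trait, its method names and the two instances are hypothetical. It still fixes List as the container and generalises only over the monoid.

```scala
object MonoidSketch {
  final case class Endo[A](k: A => A)

  // Hypothetical minimal type class: an identity element plus an associative append.
  trait Monoid[M] {
    def zero: M
    def append(x: M, y: M): M
  }

  object Monoid {
    // Endo[A] is a monoid: identity function as zero, composition as append.
    implicit def endoMonoid[A]: Monoid[Endo[A]] = new Monoid[Endo[A]] {
      def zero: Endo[A] = Endo[A]((a: A) => a)
      def append(x: Endo[A], y: Endo[A]): Endo[A] = Endo[A](x.k compose y.k)
    }

    // Int under addition is another monoid, used here only for contrast.
    implicit val intAddition: Monoid[Int] = new Monoid[Int] {
      def zero: Int = 0
      def append(x: Int, y: Int): Int = x + y
    }
  }

  // One definition serves every monoid that has an instance in scope.
  def collapse[M](ms: List[M])(implicit m: Monoid[M]): M =
    ms.foldRight(m.zero)((x, acc) => m.append(x, acc))
}
```

With these definitions, collapse(List(1, 2, 3)) adds the numbers, while collapse on a List[Endo[Int]] composes the functions into a single Endo[Int]; the fold itself is written once.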

The fact of the matter is that you (any programmer) will use these
concepts knowingly or not. The useful question is whether or not it is
used knowingly, not whether it is used at all (it is, it just is). Then
there is the next question of whether or not it matters if it is
knowingly or not (who cares?). I think it is beneficial to use it
knowingly -- in particular because it does a few things:

a) it prevents you from taking extraneous excursions because you can
draw on others' attempts and see those results
b) it gives you more information with which to make decisions when you
are on your own i.e. cannot draw on the efforts of others
c) it's fun and interesting

Worse, too often I see "fumbling in the dark" when someone is just
reinventing a well-documented concept that the fumbler has simply
refused to acknowledge or written off as irrelevant. This is often done
with errors, redundancy and incompleteness. As you might imagine, this
is painful to watch. So painful that I am compelled to step in
sometimes. I'm sure it is agreed that we don't particularly want to be
in either position of fumbling or having to point it out. That's a huge
benefit alone in my opinion.

Hope this helps.

Jim Powers

Mar 13, 2011, 10:46:03 PM
to scala...@googlegroups.com
Great explanation. Some of the concepts I've at least gotten used to
(Monoids), Endomorphisms as Monoids is certainly new to me.

Explicit vs implicit: Agreed. Even if you're *using* the concepts
implicitly one is still better off knowing *what* one is implicitly
using.

For context: I'm generally viewed as the "vanguard" (occasionally
crazy) at work. For illustration - getting people to use type
constructors (higher-kinds) and functional closures (correctly, most
of the time) was a major triumph. Getting people to STOP using
Java-idioms is very hard when they cannot see why they should stop
(and look upon the new-fangled stuff with suspicion).

Much appreciate the time. You should take that last post and throw it
up on your blog. Better than Wikipedia.

--
Jim Powers

Tony Morris

Mar 13, 2011, 11:01:27 PM
to scala-user
On 14/03/11 12:28, Tony Morris wrote:
> def collapse[A](a: List[A]): A = a.foldRight(Endo(z => z)) { case
> (Endo(a), Endo(b)) => Endo(a compose b) }
>

that type signature should be:

def collapse[A](a: List[Endo[A]]): Endo[A]

in order to satisfy the stated specialisation to Endo.

Tony Morris

Mar 13, 2011, 11:21:29 PM
to scala...@googlegroups.com
On 14/03/11 12:46, Jim Powers wrote:
> Great explanation. Some of the concepts I've at least gotten used to
> (Monoids), Endomorphisms as Monoids is certainly new to me.
Yeah, these cases are good examples of the concept itself being far less
fanciful than the term used to denote it. The terms often inflate the
expectations of difficulty of people coming to terms with what they
denote -- it's tough to overcome this for all involved.

"It's not as hard as you think, please trust me on this." Too often this
is construed as an under-estimation of how difficult a concept is,
because the person stating it already understands that concept. The best
way I have found to deal with this is to work through for one instance
and then introspect on that initial over-estimation of difficulty so
that the next time I say, "it's not as hard as you think", then some
amount of trust has been gained because there is agreement that I was
right the first time.

> Explicit vs implicit: Agreed. Even if you're *using* the concepts
> implicitly one is still better off knowing *what* one is implicitly
> using.
>
> For context: I'm generally viewed as the "vanguard" (occasionally
> crazy) at work.

Ha, I'm happy to share the crazy title with you. Actually, I am lucky to
have a colleague who is very good at learning, so we feed off each other
a lot.

> For illustration - getting people to use type
> constructors (higher-kinds) and functional closures (correctly, most
> of the time) was a major triumph. Getting people to STOP using
> Java-idioms is very hard when they cannot see why they should stop
> (and look upon the new-fangled stuff with suspicion).
>
> Much appreciate the time. You should take that last post and throw it
> up on your blog. Better than Wikipedia.

I'll have to think about that one.

Chris Marshall

Mar 14, 2011, 4:48:30 AM
to tmo...@tmorris.net, scala...@googlegroups.com
Code to play with from Tony :-)

scala> case class Endo[A](k: A => A)
defined class Endo

scala> def collapse[A](a: List[A]): A = a.foldRight(Endo(z => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
<console>:7: error: missing parameter type
       def collapse[A](a: List[A]): A = a.foldRight(Endo(z => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
                                                         ^

Ah. Err. Perhaps...

scala> def collapse[A](a: List[A]): A = a.foldRight(Endo[A]((z : A) => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
<console>:7: error: type mismatch;
 found   : (A) => Any
 required: (A) => A
       def collapse[A](a: List[A]): A = a.foldRight(Endo[A]((z : A) => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
                                                                                                              ^
<console>:7: error: type mismatch;
 found   : Endo[A]
 required: A
       def collapse[A](a: List[A]): A = a.foldRight(Endo[A]((z : A) => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
                                                                           ^

So this thing must return an A, but quite plainly the return type of the foldr expression is the same as its first argument, which is an Endo[A]. Was it supposed to be?

scala> def collapse[A](a: List[Endo[A]]): Endo[A] = a.foldRight(Endo[A]((z : A) => z)) { case(Endo(a), Endo(b)) => Endo(a compose b) }
collapse: [A](a: List[Endo[A]])Endo[A]

So we have chained a bunch of functions A => A together to get one big A => A. 
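
To make the chaining concrete, here is a small usage sketch of the version that compiled. The object name CollapseDemo and the values addOne, double and both are made up for illustration. Because the fold uses compose, the function at the end of the list is applied first.

```scala
object CollapseDemo {
  final case class Endo[A](k: A => A)

  // Fold the list of wrapped functions into one wrapped function.
  def collapse[A](a: List[Endo[A]]): Endo[A] =
    a.foldRight(Endo[A]((z: A) => z)) {
      case (Endo(f), Endo(g)) => Endo[A](f compose g)
    }

  val addOne = Endo[Int](_ + 1)
  val double = Endo[Int](_ * 2)

  // Equivalent to x => addOne.k(double.k(x)): double runs first, then addOne.
  val both = collapse(List(addOne, double))
}
```

Here CollapseDemo.both.k(5) doubles 5 to 10 and then adds one, giving 11. Collapsing an empty list gives back the identity function.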

Chris



Philippe Lhoste

Mar 14, 2011, 5:30:06 AM
to scala...@googlegroups.com
On 14/03/2011 02:48, Jim Powers wrote:
> Sadly, he's using buildr (http://buildr.apache.org/ - I've had not the
> greatest success using it)

Same here, I tried this with JRuby and failed to get all Ruby dependencies. :-(

> to build the project, but should not be
> hard to move to SBT or Maven as necessary, seems all self-contained.

You can add Gradle to the list, it is a good tool to compile Scala. (Even if it uses the
evil twin/competitor of Scala, i.e. Groovy, as its scripting language.)

--
Philippe Lhoste
-- (near) Paris -- France
-- http://Phi.Lho.free.fr
-- -- -- -- -- -- -- -- -- -- -- -- -- --

Meredith Gregory

Mar 17, 2011, 4:48:33 PM
to tmo...@tmorris.net, Tony Morris, scala...@googlegroups.com
Dear Jim,

When an object of interest has sufficient internal structure then endomorphisms likewise have interesting structure. Consider a graph. This has lots of internal structure. Endomorphisms shift nodes around, respecting connection structure. Specifically, if 
  • G is a graph, 
  • v1, v2 two of its vertices and 
  • e an edge in G with src( e ) = v1, trgt( e ) = v2, 
then a graph (endo)morphism, m : G => G has got to satisfy 
  • src( m( e ) ) = m( v1 ) = m( src( e ) ) and likewise,   
  • trgt( m( e ) ) = m( v2 ) = m( trgt( e ) )
The nifty thing about the collection of all endomorphisms on G, call it E( G ), is that it enjoys a structure of its own. Firstly, because endomorphisms have the same source and target they can compose. In symbols, if
  • G is a graph, and
  • m1 : G => G,
  • m2 : G => G
then
  • (m1 o m2)( v ) := m2( m1( v ) ) where v is a vertex in G makes sense as an expression, likewise
  • (m2 o m1)( v ) = m1( m2( v ) ).
Moreover, this composition is associative; and the identity morphism, id, leaving G unperturbed, is also an identity for the composition. That is
  • m1 o (m2 o m3) = (m1 o m2) o m3
  • id o m = m = m o id
Question: are there any endomorphisms on G that are not isomorphisms, i.e. that don't have an inverse? While you're thinking, let's pick out the ones that are isomorphisms (in case there are any that are not). We'll call this collection, the collection of automorphisms, A( G ). A( G ) is closed under composition (meaning that the composition of automorphisms is again an automorphism). Clearly, it contains id. And, it has an inverse operation, ~m, taking a morphism to its inverse. This means the collection, A( G ), is a group.

The group of automorphisms of an object (be it a graph or some other kind of data structure, like a vector space or a manifold or a list or a tree) shows up everywhere. The reason is that it encodes a ton of information about the internal symmetries of the object. Moreover, the principle of measuring the symmetry of an object of interest is a profound one. It goes to the core of our desire to be DRY. Once you have a real measure of the symmetries you know a lot more about how not to repeat yourself because then you can see when something is really a reflection of a property rather than a property itself.
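
As a rough illustration of how automorphisms measure symmetry, the sketch below enumerates them by brute force for a tiny graph. It makes simplifying assumptions that are not in the text above: vertices are Ints, and a graph has at most one edge per (source, target) pair, so an edge is just a pair. All names are hypothetical.

```scala
object AutomorphismDemo {
  // Simplified directed graph: an edge is identified by its (source, target) pair.
  final case class Graph(vertices: Set[Int], edges: Set[(Int, Int)])

  // Try every bijection on the vertices and keep those that map
  // the edge set exactly onto itself.
  def automorphisms(g: Graph): List[Map[Int, Int]] = {
    val vs = g.vertices.toList
    vs.permutations
      .map(p => vs.zip(p).toMap)
      .filter(m => g.edges.map { case (s, t) => (m(s), m(t)) } == g.edges)
      .toList
  }

  // A directed 3-cycle: 0 -> 1 -> 2 -> 0.
  val triangle = Graph(Set(0, 1, 2), Set((0, 1), (1, 2), (2, 0)))
  val symmetries = automorphisms(triangle)
}
```

For the directed 3-cycle only the three rotations survive the filter; a swap of two vertices would reverse an edge, so it is rejected. The surviving maps are closed under composition and include the identity, which is the group structure described above.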

Best wishes,

--greg
--
L.G. Meredith
Managing Partner
Biosimilarity LLC
7329 39th Ave SW

Jim Powers

Mar 17, 2011, 6:00:50 PM
to scala...@googlegroups.com
Stupendously awesome!  Thanks much.

RE: Groups - I did cover the basics of group theory in one of my undergraduate Physics classes, but it has been a while.  Typically we took advantage of certain symmetry properties to simplify the computation of wave functions/energies, but the "math" stuff was worked out on paper.  We were programming in Fortran and C - not exactly "high-level" stuff ;-).

RE: DRY - yes, I gather that: once a general structure is worked out capture the concept (typeclass) followed by structure specific expressions of the concept (instances).  Now it's possible to write an expression that uses specific concepts (typeclass) but now, effectively, any instance can be provided.  The choice of instance depends on the problem being solved.  Please correct me if I'm completely misguided here.

RE: Question: are there any endomorphisms on G that are not isomorphisms, i.e. that don't have an inverse? - I will need to think about that more, but given what is here

An invertible endomorphism of X is called an automorphism.

It would certainly seem that, in general, not all endomorphisms are automorphisms so some endomorphisms would not have an inverse.  I must say that it's not immediately clear how I would construct such an animal on graphs.  Perhaps something where two distinct vertices v1 and v2 in G map to the same vertex v' in G'.  But I'm just pulling that out of my posterior. ;-)
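
A small check of that guess, under simplifying assumptions (edges are plain (source, target) pairs, and all names here are invented for the example): take a graph with an edge a -> b and a loop b -> b, and send both vertices to b. Every edge still lands on an edge of the same graph, so the map is an endomorphism, but two distinct vertices collapse into one, so it cannot have an inverse.

```scala
object EndoNotIsoDemo {
  final case class Graph[V](vertices: Set[V], edges: Set[(V, V)])

  // A vertex map is an endomorphism when vertices stay in the graph
  // and every edge is sent to an edge.
  def isEndomorphism[V](g: Graph[V], m: V => V): Boolean =
    g.vertices.forall(v => g.vertices.contains(m(v))) &&
      g.edges.forall { case (s, t) => g.edges.contains((m(s), m(t))) }

  // An invertible map must at least keep distinct vertices distinct.
  def isInjective[V](g: Graph[V], m: V => V): Boolean =
    g.vertices.map(m).size == g.vertices.size

  val g = Graph(Set("a", "b"), Set(("a", "b"), ("b", "b")))

  // Sends every vertex to "b".
  val squash: String => String = _ => "b"
}
```

Here isEndomorphism(g, squash) holds while isInjective(g, squash) does not, so squash is an endomorphism that is not an automorphism.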

RE:  Once you have a real measure of the symmetries you know a lot more about how not to repeat yourself because then you can see when something is really a reflection of a property rather than a property itself. - As often, the question is: when do you know you have a "real measure" of something?  I guess that comes with experience in analyzing problems in a certain way.
--
Jim Powers
