Thanks
--Zach
Just call (read <port>) in a loop until you get the EOF object. The tabs and
newlines will be treated as delimiters.
--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
ZK> OK i am trying to write some software to crunch some data in Gambit
ZK> scheme and I have a pile of data that is tab delimited, is there
ZK> any easy way to just read it into a set of lists or vectors that
ZK> does not involve going character by character over a lot of
ZK> floating point variables? What would be great is some equivalent of
ZK> the perl split function.
This doesn't answer your question directly, but for some inspiration you
might want to read the following article. It may give you some ideas as
to how to process data in Scheme.
http://www.glug.org/docbits/sort-grep-tutorial.txt
-Ben
I believe that won't work, as I think he has several lines of data, each
of which should be treated as a list. The following should do the trick
if my assumptions were correct. Tweak as necessary.
Scott
;; Assumes that the data starts immediately, and the only whitespace
;; is a tab between each value, and a newline to separate each list
;; 3\t4\t5\n6\t7\n becomes '((3 4 5) (6 7))
;;
;; Requires SRFI-23, 8, and escaping continuations
(define split-on-tabs
  (letrec ((read-row
            (lambda (input-port)
              (call/cc
               (lambda (k)
                 (let loop ((acc '()))
                   (let* ((rv (read input-port))
                          (next-char (read-char input-port)))
                     (cond ((or (eqv? next-char #\newline)
                                (eof-object? next-char))
                            (k (reverse (cons rv acc)) next-char))
                           ((eqv? next-char #\tab) (loop (cons rv acc)))
                           (else
                            (error 'split-on-tab
                                   "Invalid data separator."))))))))))
    (lambda (input-port)
      (receive (row end-case) (read-row input-port)
        (if (eof-object? end-case)
            '()
            (cons row (split-on-tabs input-port)))))))
> What would be great is some equivalent of the perl split function.
Yes, that's all you need. Gambit, like probably every Scheme
implementation, has read-line already.
See brl-split in gnu/brl/stringfun.scm in BRL. Not the most efficient
implementation, but it works.
--
<brlewis@[(if (brl-related? message) ; Bruce R. Lewis
"users.sourceforge.net" ; http://brl.codesimply.net/
"alum.mit.edu")]>
If you're reading floating-point numbers, a simple read suffices:
(call-with-input-string
 "1.0\t2.0\t\t 3.0 4.0 12345678901234556789"
 (lambda (port)
   (let loop ((lst '()))
     (let ((item (read port)))
       (if (eof-object? item)
           (reverse lst)
           (loop (cons item lst)))))))
; => (1. 2. 3. 4. 12345678901234556789) on Gambit-C 3.0
> What would be great is some equivalent of the perl split function
http://pobox.com/~oleg/ftp/Scheme/util.html#string-split
which works on every Scheme I tried it on. It certainly works on
Gambit.
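For the plain tab-separated case, the idea can be sketched in a few lines of R5RS. This is only an illustration and is not the string-split from the page above; the names split-on-char and split-numbers are hypothetical, and the sketch assumes a single delimiter character and keeps empty fields.

```scheme
;; Hypothetical helper (not the library string-split): split STR at
;; every occurrence of the character DELIM, keeping empty fields.
(define (split-on-char str delim)
  (let loop ((i (- (string-length str) 1))   ; scan from right to left
             (end (string-length str))       ; end index of current field
             (fields '()))
    (cond ((< i 0)
           ;; reached the start: emit the leftmost field
           (cons (substring str 0 end) fields))
          ((char=? (string-ref str i) delim)
           ;; found a delimiter: emit the field to its right
           (loop (- i 1) i (cons (substring str (+ i 1) end) fields)))
          (else
           (loop (- i 1) end fields)))))

;; Split a tab-delimited line and convert each field to a number.
;; string->number returns #f for fields that are not numbers.
(define (split-numbers line)
  (map string->number (split-on-char line #\tab)))
```

With these definitions, (split-on-char "a,b,,c" #\,) returns ("a" "b" "" "c"). Scanning from the right lets the list be built in order without a final reverse.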
> Gambit, like probably every Scheme implementation, has read-line already.
It does not, but
http://pobox.com/~oleg/ftp/Scheme/parsing.html
does -- again, for any R5RS Scheme. BTW, if you need more complex
parsing, you might want to look into next-token or next-token-of.
Thanks for all the good advice, what I ended up doing (at least for
now) is to simply output the data from the first program as if it was
a scheme s-expression then it was easy to read it in, as scheme will
parse it. I am looking at those libraries.
--Zach
> Thanks for all the good advice, what I ended up doing (at least for
> now) is to simply output the data from the first program as if it was
> a scheme s-expression then it was easy to read it in, as scheme will
> parse it. I am looking at those libraries.
I had assumed the output format of the first program was outside your
control. Definitely the s-expression approach is the way to go.
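A minimal sketch of that approach, assuming the first program can write the whole table as one nested list. The names write-table and read-table are hypothetical, and the writer is only relevant if the producer also happens to be written in Scheme.

```scheme
;; Assumed file contents, as written by the first program:
;;   ((1.0 2.0 3.0) (4.0 5.0 6.0))

;; Hypothetical writer: dump ROWS (a list of lists) as one s-expression.
(define (write-table rows filename)
  (call-with-output-file filename
    (lambda (port)
      (write rows port)
      (newline port))))

;; Hypothetical reader: a single call to read returns the nested list,
;; with the numbers already parsed.
(define (read-table filename)
  (call-with-input-file filename read))
```

The reader does no parsing of its own, which is the whole attraction: the Scheme reader handles whitespace and number syntax.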
I think your assumptions are correct, but my goodness...
> ;; Requires SRFI-23, 8, and escaping continuations
> (define split-on-tabs
> (letrec ((read-row
> (lambda (input-port)
> (call/cc
^^^^^^^^
It's not this hard, especially since he's using Gambit (which has
string-ports).
; assuming read-line, which is trivial to implement (since Gambit
; doesn't have it).
(define (read-tab-delimited port)
  ; slurp the tab-delimited lines from port into a list of lists
  (let get-lines ((line (read-line port)) (lines '()))
    (if (eof-object? line)
        (reverse lines)
        (let ((line-port (open-input-string line)))
          (let get-fields ((datum (read line-port)) (data '()))
            (if (eof-object? datum)
                (get-lines (read-line port)
                           (cons (reverse data) lines))
                (get-fields (read line-port) (cons datum data))))))))
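For systems without read-line, one possible sketch using only read-char and peek-char follows. The name simple-read-line is hypothetical, chosen so it does not clobber a built-in; rename it to read-line to use it with the code above. It assumes lines end in a bare newline and does not strip carriage returns.

```scheme
;; Sketch of a line reader: returns the next line as a string without
;; its trailing newline, or the EOF object when no input remains.
(define (simple-read-line port)
  (let ((first (peek-char port)))
    (if (eof-object? first)
        first                               ; nothing left: return EOF
        (let loop ((chars '()))
          (let ((c (read-char port)))
            (if (or (eof-object? c) (char=? c #\newline))
                (list->string (reverse chars))
                (loop (cons c chars))))))))
```

Peeking first is what distinguishes an empty line (returned as "") from end of input, and a final line with no trailing newline is still returned.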
david rush
--
Scheme: Because closures are cool.
-- Anton van Straaten (the Scheme Marketing Dept from c.l.s)
In this case it was easy enough to make it what I want, but I'm sure
sooner or later it won't be :). So the information was useful.
--Zach
Since you point to your own website, I suppose this is not in SLIB. Any
reason why not? I suppose mentioning SLIB is useful given the discussions
going on about Scheme vs. CL libraries..
..which doesn't mean I don't highly appreciate your productivity, both
constructive (as in writing code) and destructive (as in uncovering
problems/ambiguities) regarding Scheme..
--
Biep
Reply via http://www.biep.org
> <ol...@pobox.com> wrote in message
> news:7eb8ac3e.02071...@posting.google.com...
>> http://pobox.com/~oleg/ftp/Scheme/util.html#string-split
>> http://pobox.com/~oleg/ftp/Scheme/parsing.html
>
> Since you point to your own website, I suppose this is not in SLIB.
> Any reason why not? I suppose mentioning SLIB is useful given the
> discussions going on about Scheme vs. CL libraries..
I'm curious why more SRFIs aren't in slib. In particular 12, 13, and
26, which don't seem so widely supported but should just be a matter
of dropping in the reference code.
adrian
A function find-string-from-port? is in SLIB. The other functions are
not. I asked the maintainer of SLIB a couple of times if he would be
interested in including the rest. I received no reply. I tried to push
string-split into SRFI-13, without success. I understand: different
people have different design and inclusion criteria. Mine is
minimalism: I'd like to be able to remember at least the names of the
library functions. Only when I notice that I have written roughly the
same code three or more times do I start considering it for
inclusion in the library.
Actually, Gambit does not appear to have read-line. Or if it does, I can't
find it in the docs and the system does not appear to see it.
--Zach
> Actually, Gambit does not appear to have read-line. Or if it does, I can't
> find it in the docs and the system does not appear to see it.
Sorry, my google search on "gambit read-line" led me to this:
http://www.iro.umontreal.ca/~feeley/cours/ift2030/doc/demo11.txt
(read-line port) ; extension propre a Gambit 4.0
I read it as read-line being a Gambit 4.0 extension to the Scheme
standard, but maybe it meant something else.