Segfault when using streamize_fileref_char


August Alm

Mar 2, 2017, 5:04:35 PM
to ats-lang-users
Hi!

I'm in over my head and tried writing a CSV-parser using linear lazy streams. My code thus far is 600 lines and almost to my own surprise I get it to compile! However, there is something fishy because I get a segfault when applying my program to an actual CSV-file. I've been trying to debug using gdb but the fault eludes me. Since I don't expect anyone to mull through 600 lines of code, I am hoping these code snippets are enough for one of you guys to give me some advice.

This code executes just fine:

        implement main0 () = {
          
           val test = stream_vt_make_cons(
                            'a', stream_vt_make_cons(
                                    ';', stream_vt_make_sing('b')))          (* the stream ('a', ';', 'b') *)
           val lexed = lex_csv(true, ';', test)
           val h = (lexed.head())
           val- CSV_Field(r) = h
           val a = r.csvFieldContent
           val () = println!(a)
        
         }

Here [lex_csv] is my 600-line algorithm. It reads a [stream_vt(char)] and gives back a [stream_vt(CSVEntry)], where [CSVEntry] is a record type, one of whose fields is [csvFieldContent]. When executing the program I get "a" printed to the console.

This code results in a segfault:

        implement main0 () = {
       
           val inp = fileref_open_exn("small.csv", file_mode_r)
           val ins = streamize_fileref_char(inp)
           val lexed = lex_csv(true, ';', ins)
           val () = fileref_close(inp)
           val h = (lexed.head())
           val- CSV_Field(r) = h
           val a = r.csvFieldContent
           val () = println!(a)
        
         }

The file "small.csv" only contains the string "a;b". Hence I would expect this code to give the same result as the previous one! But it doesn't just return something else, it segfaults.

gdb indicates there is a malloc problem having to do with "GC_clear_stack_inner", in case that's helpful. (I'm a mathematician who recently left academia after postdoc and decided to teach myself programming to become more useful outside of academia; hence I understand type systems and the like--the mathy stuff--a lot better than I understand memory allocation and other stuff that most programmers are supposed to be confident with.)

What could be the problem here?

Best wishes,
August

Hongwei Xi

Mar 2, 2017, 8:31:12 PM
to ats-lan...@googlegroups.com
I suspect some form of infinite recursion.

Using streams can be a bit tricky.

If you use -DATS_MEMALLOC_LIBC (instead of -DATS_MEMALLOC_GCBDW), what will gdb say?

--
You received this message because you are subscribed to the Google Groups "ats-lang-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ats-lang-users+unsubscribe@googlegroups.com.
To post to this group, send email to ats-lang-users@googlegroups.com.
Visit this group at https://groups.google.com/group/ats-lang-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ats-lang-users/81770a76-0bf2-4fc3-84ae-0372b7f94077%40googlegroups.com.

August Alm

Mar 2, 2017, 8:52:38 PM
to ats-lang-users
The file compiles (I've tried a few compiler options) and "gdb run" yields

    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff783eea5 in _int_malloc (av=0x7ffff7b6a620 <main_arena>, bytes=16) at malloc.c:3790

The frames 0-3 involve allocation functions that are not particular to my file. Frame 4 says:

    #4  __patsfun_28__28__14 (arg0=<optimized out>, env1=0x605540, env0=10 '\n') at csv_lexer_dats.c:9023
    9023    ATSINSmove_con1_new(tmpret63__14, postiats_tysum_7) ;

My not-so-educated guess is that this refers to making a cons-cell of a stream.

But: How can my function do just fine when manually fed
 
    cons('a', cons( ';', sing('b'))): stream_vt(char),

but segfault when I use [streamize_fileref_char] to construct the very same stream from the string "a;b" in a file? Where is the room for an infinite recursion in that?

Thank you,
August

Hongwei Xi

Mar 2, 2017, 9:03:08 PM
to ats-lan...@googlegroups.com
I suspect that the file you used contains other characters.

What is in "small.csv"?


August Alm

Mar 2, 2017, 9:13:57 PM
to ats-lang-users
Just "a;b", or? (Attached.)
small.csv

Hongwei Xi

Mar 2, 2017, 9:21:21 PM
to ats-lan...@googlegroups.com
When I tried, I saw the following 5 chars (ASCII codes) in small.csv:

97
59
98
13
10

My testing code:

#include "share/atspre_staload.hats"
#include "share/HATS/atspre_staload_libats_ML.hats"


implement main0 () = {
  val inp = fileref_open_exn("small.csv", file_mode_r)
  val ins = streamize_fileref_char(inp)
  val ins = stream2list_vt(ins)
  val ins = g0ofg1(list_vt2t(ins))
  val ( ) = println! ("length(ins) = ", length(ins))
  val ( ) = (ins).foreach()(lam c => println!(char2int0(c)))
(*

  val lexed = lex_csv(true, ';', ins)
*)
  val () = fileref_close(inp)
(*

  val h = (lexed.head())
  val- CSV_Field(r) = h
  val a = r.csvFieldContent
  val () = println!(a)
*)
}




gmhwxi

Mar 2, 2017, 9:24:40 PM
to ats-lang-users
BTW, I have a csv parser mostly for my own use:

https://www.npmjs.com/package/atscntrb-hx-csv-parse

gmhwxi

Mar 3, 2017, 9:22:00 AM
to ats-lang-users
Now you may do the following tests:

Try:

val ins = streamize_string_char("a;b") // should work

Try:

val ins = streamize_string_char("a;b\n") // may not work

Try:

val ins = streamize_string_char("a;b\015\012") // should cause crash

August Alm

Mar 3, 2017, 5:25:55 PM
to ats-lang-users
Hi!
I had indeed made a logical error that caused any stream with "carriage return" followed by "newline" to recurse indefinitely. Thank you for your patience and pedagogical instincts, Professor! There is still some issue though, one that I believe is more subtle. I fixed the logical error and my algorithm now handles all the test cases you suggested. However, when fed an actual CSV-file with a thousand rows and about 300 columns it still segfaults--unless I manually increase the stack space on my computer! I don't know exactly where the critical limit is, but increasing it from 8192 kbytes to 65536 certainly did the trick. The whole file parsed without problem, and rather quickly at that. It seems my algorithm makes too much use of stack allocation and that I may have to rethink some of my (would-be) optimization choices.
Best wishes,
August

Hongwei Xi

Mar 3, 2017, 5:32:15 PM
to ats-lan...@googlegroups.com
You are welcome!

Since I have not seen your code, I could only guess :)

Usually, what you described can be fixed by using tail-recursion, or
by using lazy-evaluation. The former approach is straightforward. You
just need to identify the function or functions that cause the deep stack
usage. Then try to rewrite using tail-recursion.
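
To make the tail-recursion advice concrete, here is a minimal sketch that is not taken from the CSV lexer. The names [sum_rec] and [sum_iter] are invented for illustration. The first function keeps one stack frame per step, while the second passes an accumulator so that the recursive call is the last action:

```ats
#include "share/atspre_staload.hats"

// Not tail-recursive: the addition runs after the recursive call
// returns, so every level of recursion keeps a stack frame alive.
fun sum_rec (n: int): int =
  if n > 0 then n + sum_rec(n-1) else 0

// Tail-recursive: the recursive call is the last thing [loop] does,
// so it can be compiled into a jump and runs in constant stack space.
fun sum_iter (n: int): int = let
  fun loop (i: int, acc: int): int =
    if i > 0 then loop(i-1, acc+i) else acc
in
  loop(n, 0)
end

implement main0 () =
  println! ("sum_iter(10000) = ", sum_iter(10000))
```

With a large enough argument, [sum_rec] is the kind of function that would need a bigger stack, whereas [sum_iter] should not.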




August Alm

Mar 3, 2017, 5:48:23 PM
to ats-lang-users
How would I best share larger code portions? I have no concerns about making my mistakes public, heh.

I believe everything is lazy as-is (all data is [stream_vt("sometype")]). And I've tried to write tail-recursive functional code. The algorithm is based on two mutually recursive functions, "fun ... and ..", similar to how you did things in your csv-parser (thanks for pointing out that piece of code). However, I cannot set them up with "fn* .. and .." to enforce a local jump because they call each other in too intertwined a way. Might that be it?

Hongwei Xi

Mar 3, 2017, 5:57:54 PM
to ats-lan...@googlegroups.com
One possibility is to build an npm package and then publish it.

If you go to https://www.npmjs.com/ and search for 'atscntrb', you can find
plenty of packages. You may need to install npm first.

If you do build an npm package, I suggest that you choose a namespace for
yourself. E.g., atscntrb-a?a-..., where ? is the first letter of your middle name.


August Alm

Mar 4, 2017, 7:27:03 PM
to ats-lang-users
I've spent a few hours trying to figure out how to make proper use of npm and gave up--for now. If the project turns into something more serious (i.e., useful to others) then I will have another go at it. For now my naive attempts at making effective use of linear streams can be witnessed at GitHub: https://github.com/August-Alm/ats_csv_lexer Any and all comments on how to improve are appreciated.

Best wishes, August.

gmhwxi

Mar 4, 2017, 9:47:07 PM
to ats-lang-users
I took a glance at your code.

I noticed a very common mistake involving the use of
stream (or stream_vt). Basically, the way stream is used
in your code is like the way list is used. This causes the
stack issue you encountered.

Say that you have a function that returns a stream. In nearly
all cases, the correct way to implement such a function should
use the following style:

fun foo(...): stream_vt(...) = $ldelay
(
...
)

The idea is that 'foo' should return in O(1) time. The body of $ldelay
is only evaluated when the first element of the returned stream is needed.
Sometimes, this is called full laziness. Without full laziness, a stream may
behave like a list, defeating the very purpose of using a stream.
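
As a small illustration of this style, here is a sketch that is independent of the CSV lexer. The function name [from] is made up for the example. It returns at once, and each cell is computed only when the stream is forced with the ! operator:

```ats
#include "share/atspre_staload.hats"

// [from] returns immediately: the body of $ldelay is suspended
// until the caller forces the stream with the ! operator.
fun from (n: int): stream_vt(int) =
  $ldelay(stream_vt_cons(n, from(n+1)))

implement main0 () = {
  val xs = from(0)                      // O(1): nothing is evaluated yet
  val- ~stream_vt_cons(x0, xs) = !xs    // forces only the first cell
  val- ~stream_vt_cons(x1, xs) = !xs    // forces only the second cell
  val () = println! ("x0 = ", x0, ", x1 = ", x1)
  val () = ~xs                          // free the unevaluated remainder
}
```

Although the stream is conceptually infinite, only two cells are ever built here, and the recursion depth stays constant because each recursive call sits inside a suspended $ldelay.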

gmhwxi

Mar 4, 2017, 10:07:42 PM
to ats-lang-users
BTW, it seems you don't need to do much to fix the issue.

Basically, you just do

1) Put the body of parse_entry into $ldelay(...)
2) Change stream_vt_make_cons into stream_vt_cons

There may be a few other things but they should all be
very minor.

August Alm

Mar 5, 2017, 5:34:39 PM
to ats-lang-users
Thanks for the tip! I think I understand. I treated $ldelay much as a data constructor, so that all streams are equally lazy, whereas there are in fact many ways to sequence into thunks. Let me give an example to anchor the discussion. Both the following implementations of a map-template for linear streams typecheck:

         fun {a, b: t0ype}
         map_make_cons
         ( xs: stream_vt(a)
         , f: a -> b
         ) : stream_vt(b) =
         case !xs of
         | ~stream_vt_nil() => stream_vt_make_nil()
         | ~stream_vt_cons(x, xs1) =>
           stream_vt_make_cons(f(x), map_make_cons(xs1, f))
        
         fun {a, b: t0ype}
         map_ldelay
         ( xs: stream_vt(a)
         , f: a -> b
         ) : stream_vt(b) =
         $ldelay
         ( case !xs of
           | ~stream_vt_nil() => stream_vt_nil()
           | ~stream_vt_cons(x, xs1) =>
             stream_vt_cons(f(x), map_ldelay(xs1, f))
         , ~xs
         )

The second is maximally lazy. The first, [map_make_cons], is less lazy because checking the case-conditions is not delayed. My code was like the first example, only much more was going on inside the case expressions. Is that a correct assessment?

Hongwei Xi

Mar 5, 2017, 5:58:35 PM
to ats-lan...@googlegroups.com
Yes, you definitely got it :)

Stream_vt is very memory-frugal.

Haskell relies on deforestation (a complex compiler optimization)
to reduce memory usage of lazy evaluation. In ATS, deforestation is
not supported. Instead, the programmer needs to recycle memory explicitly.

Compared to Haskell, corresponding code using stream_vt in ATS can be
much more efficient both time-wise and memory-wise.

For instance, the following example (for computing Mersenne primes) can
run for days without run-time GC:
It convincingly attests to the power of linear streams.

Cheers!



August Alm

Mar 6, 2017, 8:19:34 AM
to ats-lang-users
The points you mention are part of the reason I chose to write the csv lexer the way I did. It follows one of the fastest Haskell csv parsers, and I was curious to see how using linear types could optimize performance.

Regarding your suggestion on how to make better use of $ldelay in my code: I'm stuck on a compiler error that I can't make sense of. The following pseudo-minimal example throws the same kind of errors:
        
         #include "share/atspre_define.hats"
         #include "share/atspre_staload.hats"
         staload UN = "prelude/SATS/unsafe.sats"
         staload SBF = "libats/SATS/stringbuf.sats"
         staload _(*SBF*) = "libats/DATS/stringbuf.dats"
        
         datatype DT = D_T of @{ alpha = char }
         vtypedef llstring = stream_vt(char)
        
         fun
         test (acc: !$SBF.stringbuf, cs: llstring): stream_vt(DT) =
         $ldelay
         ( case !cs of
           | ~stream_vt_nil() =>
             if $SBF.stringbuf_get_size(acc) = i2sz(0) then stream_vt_nil()
             else stream_vt_cons(D_T(@{alpha = 'a'}), stream_vt_make_nil())
           | ~stream_vt_cons(c, cs1) =>
             let val crec = D_T(@{alpha = c})
             in stream_vt_cons(crec, test(acc, cs1))
             end
         , ~cs
         )

The compiler cannot infer the type I want (which is [stream_vt_con(DT)]) for the [stream_vt_nil()] following the first [then] in the function body. The error message says

the dynamic expression cannot be assigned the type [S2EVar(5492)].
[...] mismatch of sorts in unification:
The sort of variable is: S2RTbas(S2RTBASimp(1; t@ype))
The sort of solution is: S2RTbas(S2RTBASimp(2; viewtype))
[...] mismatch of static terms (tyleq):
The actual term is: S2Eapp(S2Ecst(stream_vt_con); S2EVar(5495))
The needed term is: S2EVar(5492)

(There are further errors of the same form.) Is the culprit that [stream_vt] of a nonlinear datatype requires some special care? The version with [stream_vt_make_nil()] instead of explicit [$ldelay] works so the error ought to be subtle.

Best wishes,
August

Hongwei Xi

Mar 6, 2017, 8:30:05 AM
to ats-lan...@googlegroups.com
I forgot to tell you something essential in using stream_vt.
The following interface for 'test' cannot work:


fun test (acc: !$SBF.stringbuf, cs: llstring): stream_vt(DT) =

What you need is

fun test (acc: $SBF.stringbuf, cs: llstring): stream_vt(DT) =

The 'acc' stringbuf needs to be consumed by 'test'. The implementation
of 'test' looks like this:

$ldelay
(
<code for stream construction>
,
(freeing(acc); freeing(cs)) // this part is executed when the stream is freed
)


August Alm

Mar 6, 2017, 11:30:27 AM
to ats-lang-users
Hrrm, I had:

fun
parse_entry
( st: !CSVState >> _
, at: (int, int)
, acc: !$SBF.stringbuf
, cs: llstring
) : stream_vt(CSVEntry)

I gather I have to change not just [!$SBF.stringbuf] but also [!CSVState >> _], right? What about if I did

fun
parse_entry_con
( st: !CSVState >> _
, at: (int, int)
, acc: !$SBF.stringbuf
, cs: llstring
) : stream_vt_con(CSVEntry)

and then put

parse_entry(...) =
$ldelay
( parse_entry_con(...)
, ( free(st)
  ; free(acc)
  ; free(cs)
  )
)

--would that work? Would it be idiomatic and efficient?

Thanks, again,
August

gmhwxi

Mar 6, 2017, 11:39:38 AM
to ats-lang-users
This would not work.

The resources need to be consumed by parse_entry_con as well.

There are two parts in $ldelay. Normally the first part gets executed.
If you free a stream_vt, then the second part is executed (which frees
all the resources stored in the first part, which is a closure). If the stream_vt
is evaluated to the end (stream_vt_nil), then the second part is never executed.
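
The two parts of $ldelay can be seen in a small sketch that is unrelated to the CSV lexer. The function [list2stream] is a made-up name. It takes ownership of a linear list: the first part consumes the list cell by cell, and the second part releases whatever is left if the stream is freed early:

```ats
#include "share/atspre_staload.hats"

// The stream takes ownership of the linear list [xs].
fun list2stream (xs: List_vt(int)): stream_vt(int) =
  $ldelay
  ( ( case+ xs of
      | ~list_vt_nil() => stream_vt_nil()
      | ~list_vt_cons(x, xs1) => stream_vt_cons(x, list2stream(xs1))
    ) : stream_vt_con(int) // first part: runs when the stream is forced
  , list_vt_free<int>(xs)  // second part: runs only if the stream is freed
  )

implement main0 () = {
  val xs = $list_vt{int}(1, 2, 3)
  val ys = list2stream(xs)
  val- ~stream_vt_cons(y, ys) = !ys
  val () = println! ("first = ", y)
  val () = ~ys // freeing the rest releases the cells holding 2 and 3
}
```

Because [xs] is captured by the suspended computation, the function cannot merely borrow it with a ! annotation. Either the first part or the second part must end up consuming it.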

gmhwxi

Mar 6, 2017, 11:43:36 AM
to ats-lang-users
Yes, CSVState needs to be changed as well.

However, your code needs very little change. This is like a
5 minute job to me. I would be happy to give it a try if you say so.
But I thought that you might want to get the thrill of fixing the code :)



August Alm

Mar 6, 2017, 4:06:11 PM
to ats-lang-users
The code now seems to work as intended!

https://github.com/August-Alm/ats_csv_lexer

Thank you for all the help. I still don't fully grok why the function needs to consume each of its arguments--will have to meditate more on that--but at least I know how to write code like this from now on.

gmhwxi

Mar 6, 2017, 8:21:00 PM
to ats-lang-users
Really glad that you got it to work!

I suggest that you make an npm package for the parser and then
publish the package. In this way, other ats-lang users can benefit
from your work easily.

You could try to introduce some abstract types into your code. For
instance, I would suggest that you make CSVState a datavtype (linear datatype)
(a datatype is often referred to as being semi-abstract). Then you can
introduce overloaded symbols for functions processing CSVState, making your code
more accessible.

Also, the following interface:

extern fun
lex_csv(QNLIN: bool, DELIM: char, cs: llstring): CSVEntries

can and probably should be changed into

extern
fun{}
lex_csv(cs: llstring): CSVEntries

The parameters QNLIN and DELIM can be passed via templates:

extern
fun{} lex_csv$QNLIN(): bool
extern
fun{} lex_csv$DELIM(): char

implement{} lex_csv$QNLIN() = false
implement{} lex_csv$DELIM() = ',' // default value

Writing function templates (instead of functions) enables you to move
your code around very conveniently. You can even move template code
into the body of another function.
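
A self-contained miniature of this pattern might look as follows. The names [is_delim] and [is_delim$DELIM] are hypothetical and only mirror the shape of lex_csv$DELIM. The sketch assumes that a locally scoped template implementation takes precedence over the default one:

```ats
#include "share/atspre_staload.hats"

// [is_delim] reads its delimiter from a template, not from an argument.
extern fun{} is_delim$DELIM(): char
extern fun{} is_delim(c: char): bool

implement{} is_delim$DELIM() = ','  // default delimiter
implement{} is_delim(c) = (c = is_delim$DELIM<>())

implement main0 () = {
  // Uses the default implementation of [is_delim$DELIM].
  val () = println! ("default: ", is_delim<>(';'))
  // A locally scoped implementation overrides the default.
  val () = let
    implement is_delim$DELIM<>() = ';'
  in
    println! ("override: ", is_delim<>(';'))
  end
}
```

The body of [is_delim] never changes. Only the implementation of the parameter template in scope at the call site does, which is what makes this style convenient for moving code around.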

That's all for now. Hope you will like ATS and tell/teach it to your friends.

Cheers!

August Alm

Mar 7, 2017, 4:52:58 PM
to ats-lang-users
I'm glad too! I wrote my first "Hello World" program (in Haskell) less than four months ago; before that I was completely illiterate about programming. Writing a linear, lazy CSV-parser in ATS has definitely been my most challenging venture so far. I mean this in a good way. ATS is quickly becoming my favorite language. It is daunting at times, sure, but its unique combination of low-level abilities and functional abstractions makes me feel like the Star Trek idiom "To boldly go where no one has gone before", heh. The ATS sky is so vast I've almost forgotten about monads. And YES!, I do suggest trying ATS to every programmer I meet.

Tangential to the topic of monads: Do you know if someone has thought about the relations between ATS and "enriched effect calculus" (as described in http://homepages.inf.ed.ac.uk/als/Research/Sources/eec.pdf) or "linear state monads" (as mentioned in https://arxiv.org/pdf/1403.1477.pdf)? There is a clear analogy. Implementing a concept such as a linear state monad in ATS would be nice, I think. Monadic programming on an Arduino, anyone? =) It would certainly be a unique selling point.

I do not understand what you're aiming at with your suggestion to make CSVState a datavtype or absvtype. Could you elaborate? I have seen abstract types used as a way to make otherwise allowed operations illegal (there is an example in your book, I think, of how to construct a record type where some fields are mutable and some are not), but not for the sake of overloading symbols.

I will rewrite the code so that DELIM and QNLIN are passed as templates. I also intend to add some further functionality, like functions for filtering out errors, for printing and for collecting the output in tabular form with rows and columns rather than as a single row. When I'm satisfied I will make an npm-package out of it.

Best wishes,
August

gmhwxi

Mar 7, 2017, 8:03:33 PM
to ats-lang-users
I was referring to some kind of code of the following style:

typedef
CSVState_rec =
@{
  tableRow = int,
  tableCol = int,
  textRow = int,
  textCol = int
}

datavtype
CSVState = CSVState of CSVState_rec

extern
fun{}
CSVState_get_tableRow(!CSVState): int
extern
fun{}
CSVState_set_tableRow(!CSVState, int): void
overload .tableRow with CSVState_get_tableRow
overload .tableRow with CSVState_set_tableRow

implement
{}
CSVState_get_tableRow
  (state) = let
//
val+CSVState(x0) = state in x0.tableRow
//
end // end of [CSVState_get_tableRow]
implement
{}
CSVState_set_tableRow
  (state, i0) = let
//
val+@CSVState(x0) = state in x0.tableRow := i0; fold@(state)
//
end // end of [CSVState_set_tableRow]

August Alm

Mar 8, 2017, 4:18:45 AM
to ats-lang-users
I see. Yes, being able to write

st.tableRow(2)

instead of

st.2->tableRow := 2

would make the code a little cleaner, I guess. What would really reduce syntactic noise in my code, though, would be a slick way of writing what I currently have as

st.2->tableRow := st.2->tableRow + 1

Continuing your suggestion, I can write something like

extern fun {}
CSVState_update_tableRow (
  st: !CSVState,
  up: int -> int
) : void

implement {}
CSVState_update_tableRow(st, up) =
let val @CSVState(s) = st
in s.tableRow := up(s.tableRow); fold@(st)
end

fun {}
plus(n: int): (int -> int) = lam(i) => n + i

overload .tableRow with CSVState_update_tableRow

After this I can write

st.tableRow(plus(1))

Is there a way to use overloading that would let me write that instead as

st.tableRow(+1)  ?

I tried "overload + with plus" but get the error "operator fixity cannot be resolved" when used.

August Alm

Mar 8, 2017, 8:11:06 AM
to ats-lang-users
In my original formulation of CSVState (not using a datavtype but a concrete vtypedef) I can add

infix +=
macdef += (x, i) = ,(x) := ,(x) + ,(i)

and then write

st.2->tableRow += 1 

The best I can do with a datavtype is something like

st.tableRow_inc(1)

They are comparable to my eyes, as far as readability of code is concerned.

How does my vtypedef-version compare with your datavtype-version in terms of efficiency? Is there a clear optimization reason to choose one over the other?

gmhwxi

Mar 8, 2017, 9:37:41 AM
to ats-lang-users
'+' is already declared to be infix.

You could try ++:

prefix ++
overload ++ with plus

Is this really a good idea :)

gmhwxi

Mar 8, 2017, 9:44:15 AM
to ats-lang-users
If you use datavtype, then you do not have to deal with
views explicitly. It is just a style.

I myself tend to use datavtype when memory allocation is involved and
record (plus at-view) in an embedded setting where no memory allocation
is involved.

As for efficiency, there is no difference; internally, the same representation is
used.

gmhwxi

Mar 8, 2017, 11:51:36 AM
to ats-lang-users
> Tangential to the topic of monads: Do you know if someone has thought about the relations between ATS and "enriched effect calculus" (as described in http://homepages.inf.ed.ac.uk/als/Research/Sources/eec.pdf) or "linear state monads" (as mentioned in https://arxiv.org/pdf/1403.1477.pdf)? There is a clear analogy. Implementing a concept such as a linear state monad in ATS would be nice, I think. Monadic programming on an Arduino, anyone? =) It would certainly be a unique selling point.

I can't really follow these monad papers. Too much for me :)
Given your background, maybe you could give this a try?

Over the years, I have gradually grown more and more cynical about "theoretic" research
in the area of programming languages. I feel that the most urgent issue in programming is
to find effective approaches to reducing programming complexity.

For instance, in your csv parser, there are a lot of if-then-else's. Maybe you took them from
some Haskell code. The point is that if-then-else's make programs hard to write and harder
to read/follow. I propose the following style:

1) Implementing a csv parser without worrying about quotes (DQUOT). Call this version 1.
2) Using templates to improve version 1 without directly modifying version 1. Another way
    to put it: you still have version 1 available after doing the improvement.

I know that this may sound a bit vague but that is my point. Being vague makes people
think more and more deeply :)

Cheers!

August Alm

Mar 8, 2017, 4:47:03 PM
to ats-lang-users
See inline.


On Wednesday, March 8, 2017 at 17:51:36 UTC+1, gmhwxi wrote:
>> Tangential to the topic of monads: Do you know if someone has thought about the relations between ATS and "enriched effect calculus" (as described in http://homepages.inf.ed.ac.uk/als/Research/Sources/eec.pdf) or "linear state monads" (as mentioned in https://arxiv.org/pdf/1403.1477.pdf)? There is a clear analogy. Implementing a concept such as a linear state monad in ATS would be nice, I think. Monadic programming on an Arduino, anyone? =) It would certainly be a unique selling point.
>
> I can't really follow these monad papers. Too much for me :)
> Given your background, maybe you could give this a try?

I'm tempted but I feel like I have to understand ATS's function tags ("cloref" and the like, the flavours of function) better first, and generally get a more solid footing. I don't want to write something "cool", I want it to be useful, too.

> Over the years, I have gradually grown more and more cynic about "theoretic" research
> in the area of programming languages. I feel that the most urgent issue in programming is
> to find effective approaches to reducing programming complexity.

I take that as being somewhat tongue-in-cheek. ATS is a very theoretical language, after all. To clarify, I think Haskell suffers greatly from having too little focus on efficiency (among many of its users, not among the guys working on the compiler). I heard about ATS about the same time as I heard about Idris (the dependent type thing) and decided to pursue ATS precisely because of its air of efficiency and "real-world-readiness". I do still love my Haskell though, mainly because it is so easy to be productive with it. Scala has a very good no-bs culture and good library hygiene, but I'm not too fond of OOP so...

> For instance, in your csv parser, there are a lot of if-then-else's. Maybe you took them from
> some Haskel code. The point is that if-then-else's make programming hard to write and harder
> to read/follow. I propose the following style:

I first tried to write it using only pattern matching but failed to get it past the typechecker. Maybe I will have another go at it.

> 1) Implementing a csv parser without worrying about quotes (DQUOT). Call this version 1.
> 2) Using templates to improve version 1 without directly modifying version 1. Another way
>     to put it: you still have version 1 available after doing the improvement.

If I were uncertain about the algorithm then such an incremental development style would surely be preferable, but since the code is a port of a tried and tested Haskell library I'm
not very motivated to scrap it and start over. But for my next project(s) I will try to heed your words.

Hongwei Xi

Mar 8, 2017, 5:44:10 PM
to ats-lan...@googlegroups.com
>>I take that as being somewhat tongue-in-cheek. ATS is a very theoretical language, after all.

I know that this sounds very ironic but interesting stuff often does sound ironic :)

My view of programming language research is changing gradually but surely. I now strongly feel
that the most important programming support is to facilitate the need to alter/adapt the behaviour
of a written program without actually requiring direct changes to be made to the program. And the
template system of ATS can be seen as an attempt to provide programming support of this sort.



August Alm

Mar 13, 2017, 4:51:51 PM
to ats-lang-users
So... I added some "second stage" parsing functionality, to get the end result in tabular form rather than as a single [stream_vt], and to check for global errors such as unequal number of columns in the rows, and now I am back to segfaulting! =( However, this time it does not seem to be a stack issue because I run into the segmentation fault already at the compilation stage.

I code in Vim and have it set up so that typing ":make" will run "patsopt -tc -d %", i.e., typechecking only. When I do, everything seems fine--no complaints. I have used this utility for some time now and it has always worked like a charm. Weirdly though, if I issue "$ patsopt -tc -d csv_lexer.dats" in the console instead I get a segfault. The same happens for every other compilation command: I've tried type checking only, compiling just to C, etc. Always a segfault. Compiling with the "-verbose" flag prints

exec(patsopt --output csv_lexer_dats.c --dynamic csv_lexer.dats)
Segmentation fault
exec(patsopt --output csv_lexer_dats.c --dynamic csv_lexer.dats) = 35584

which does not tell me anything.

My code can be found at https://github.com/August-Alm/ats_csv_lexer Note that I have moved some function implementations into a hats-file. Might this be a cause of trouble? Any tips at all on how to debug this are most appreciated. I don't know where to even begin as gdb seems useless as long as I can't even generate the C-code.

Best wishes,
August

Hongwei Xi

Mar 13, 2017, 5:02:57 PM
to ats-lan...@googlegroups.com
I will take a look later. Based on your description, the issue
seems to be caused by not providing certain template arguments
explicitly:

Say, foo is a template. Please use foo<...>(...) instead of foo(...)

Compared to Haskell, type inference in ATS is quite limited :)


Hongwei Xi

Mar 13, 2017, 6:38:42 PM
to ats-lan...@googlegroups.com
I am nearly certain
that the issue was caused by the code in csv_lib.hats.

First, INV(Either(a, b)) should be Either(INV(a), INV(b)) because
the two parameters of Either are declared to be co-variant.

August Alm

Mar 13, 2017, 6:39:23 PM
to ats-lang-users
Thanks for the hint! I added template arguments wherever I could and now I got some error messages that actually say something. However, I do find it a bit disconcerting that the compiler would segfault rather than tell me I need to annotate templates.

Hongwei Xi

Mar 13, 2017, 6:42:03 PM
to ats-lan...@googlegroups.com
Once INV annotation is done properly, template annotation can be pretty much removed.



August Alm

Mar 13, 2017, 7:57:18 PM
to ats-lang-users
Yes, as you guessed I am having problems with covariance. Some I have solved but this one leaves me very puzzled:

I'll copy the code here in the order that it appears in my file. First (in "csv_lib.hats", which is #include'd at the very beginning) I have a template

extern fun {a: vt0ype} extfree(x: a): void

which is used to free variables in some of the functions using the [Either] constructor. Then I have:

vtypedef CSVErrors = List0_vt(CSVError)

where CSVError is a non-linear datatype. If, after that definition, I write

implement {CSVErrors} extfree(errs) = list_vt_free(errs),

then I get a compiler error telling me that [CSVErrors] can't be assigned the type of linear lists. If I try to go explicit and write

implement{CSVErrors} extfree(errs) = let
     val errs: List0_vt(CSVError) = errs
   in case errs of
       | ~list_vt_nil() => ()
       | ~list_vt_cons(er, ers1) => extfree<CSVErrors>(ers1)
   end

then I get roughly the same error, saying:

The actual term is: S2Evar(CSVErrors(8927))
The needed term is: S2Eapp(S2Ecst(list_vt0ype_int_vtype); S2Ecst(CSVError), S2EVar(5476))

How can I help the compiler infer that CSVErrors is indeed a list_vt0ype_int_vtype of CSVError?

August Alm

Mar 13, 2017, 8:10:47 PM
to ats-lang-users
If I write

implement extfree<CSVErrors> = ...

then I get it past the typechecking stage of compilation, but when trying to generate C code I instead get an error saying [extfree] has not been implemented:

"csv_lexer_dats.c:93504:45: error: ‘extfree’ undeclared (first use in this function)"

Hongwei Xi

Mar 13, 2017, 9:09:53 PM
to ats-lan...@googlegroups.com
This is because CSVErrors is a dependent type.

The right way to do it is to make CSVErrors abstract.

If you are using ATS2-0.3.3, then you can use the feature of 'reassume'.
I will show you how to do it in another message.



Hongwei Xi

Mar 13, 2017, 9:39:44 PM
to ats-lan...@googlegroups.com

Please first do this:

(*
vtypedef
CSVErrors = List0_vt(CSVError)
*)
absvtype CSVErrors
local
assume CSVErrors = List0_vt(CSVError)
in (*nothing*) end

Whenever you need the definition of CSVErrors, please do 'reassume CSVErrors' in the
scope where you need it. For instance, I modified some of your code as follows:

implement {} validate(rs: CSVTable): CSVResult =
let
  reassume CSVErrors
in
  $ldelay(
    case !rs of
    | ~nil() => let
        val nodata = list_vt_make_sing(No_Data()): CSVErrors
      in Left(nodata) :: empty()
      end
    | ~stream_vt_cons(r, rs1) => let
        val length_r = list_vt_length(r)
        implement {} current_length() = length_r
      in extract_errs(r) :: stream_vt_usermap(rs1, extract_errs)
      end
    ,
    ~rs
  )
end

After this, your code should be running, and hopefully, running correctly :)




August Alm

Mar 14, 2017, 9:35:36 AM
to ats-lang-users
Great application of "reassume"! =D I had to reinstall ATS2 to make the new syntax available and now I've had a go at it.
I can compile the "reassume"-example with [int2_t0ype] that you posted on this list, so my reinstallation is working.
However, I don't seem to understand how to use "reassume" properly. If I type


         absvtype CSVErrors
         local assume CSVErrors = List0_vt(CSVError) in (* nothing *) end

         implement {CSVErrors} extfree(errs) =
           let reassume CSVErrors

           in case errs of
           | ~list_vt_nil() => ()
           | ~list_vt_cons(er, ers1) => extfree<CSVErrors>(ers1)
           end

then I get an error saying "the identifier [CSVErrors] does not refer to a static constant". Adding [vtypedef CSVErrors = CSVErrors] or
[stadef CSVErrors = CSVErrors] after the initial "absvtype" declaration does not alleviate the problem. What's wrong?
...

gmhwxi

Mar 14, 2017, 10:32:25 AM
to ats-lang-users

CSVErrors in the following syntax
is a bound type variable:

implement {CSVErrors} extfree(errs) = ...

What you need is

implement extfree<CSVErrors>(errs) = free(errs)

August Alm

Mar 14, 2017, 6:05:13 PM
to ats-lang-users
Darn, I was staring myself blind at other stuff.

Ok, I'm down to a single error now. I think I understand it, but I don't know how to best work around it. It occurs in the following portion
of the implementation of [validate] that you showed in a previous email:


| ~stream_vt_cons(r, rs1) => let
        val length_r = list_vt_length(r)
        implement {} current_length() = length_r
      in extract_errs(r) :: stream_vt_usermap(rs1, extract_errs)
      end
  
The compiler says:

error(ccomp): the function is expected to be envless but it is not.
csv_lexer_dats.c: In function ‘__patsfun_65__65__1’:
csv_lexer_dats.c:83362:1: warning: implicit declaration of function ‘ATSERRORnotenvless’ [-Wimplicit-function-declaration]
 ATSINSmove(tmp444__1, stream_vt_usermap_7__7__1(tmp436__1, ATSERRORnotenvless(ATSPMVfunlab(_057_home_057_august_057_Documents_057_programming_057_ATS_057_pearson_057_CSV_lexer_057_csv_lexer_056_dats__extract_errs__71__2)))) ;
 ^
In file included from csv_lexer_dats.c:15:0:
csv_lexer_dats.c:83362:60: warning: passing argument 2 of ‘stream_vt_usermap_7__7__1’ makes pointer from integer without a cast
 ATSINSmove(tmp444__1, stream_vt_usermap_7__7__1(tmp436__1, ATSERRORnotenvless(ATSPMVfunlab(_057_home_057_august_057_Documents_057_programming_057_ATS_057_pearson_057_CSV_lexer_057_csv_lexer_056_dats__extract_errs__71__2)))) ;

My interpretation is that [extract_errs] depends on the implementation of [current_length()] and that the compiler
wants to treat this as being an environmental dependency, whereas I thought I could treat [extract_errs] as envless.
Does dependency on external templates count as environment dependency?
 
Before I had the code above I instead got errors that I figured out had to do with the fact (?) that linear global values
can only be used in the global scope, not inside a function scope. In the implementation

implement current_length() = list_vt_length(r),

r is linear, and I was having problems figuring out where to put it. In Zhiqiang Ren's "ATS Knowledge Documentation" there
is a somewhat analogous example, recommending using [$UNSAFE.castvwtp0] for working around the scope issue.
Should I do something in that vein?

Best wishes,
August

gmhwxi

Mar 14, 2017, 6:57:39 PM
to ats-lang-users

Here is a way to fix it:

fun {a, b: vt0ype}
stream_vt_usermap (
    xs: stream_vt(a),
    f: a -<cloref1> b
  ) : stream_vt(b) = let
    implement stream_vt_usermap$fopr<a, b>(x) = f(x)
  in stream_vt_usermap_aux(xs) end

When using stream_vt_usermap, please pass lam(r) => extract_errs(r)

There are two other places where you need this kind of change. Then you
should be able to compile the entire program. I just did.



August Alm

Mar 14, 2017, 7:12:24 PM
to ats-lang-users
Hooray! So did I. :) Thanks for, well, everything.

Am I correct in believing there is some sort of tutorial on how to best publish ATS packages using NPM in the pipeline?

gmhwxi

Mar 14, 2017, 7:55:42 PM
to ats-lang-users
Good for you!

Here is a short article on building npm-package for ATS:

http://ats-lang.sourceforge.net/EXAMPLE/EFFECTIVATS/DivideConquer/index.html

Building npm-package for ATS is a pretty new thing. There is unfortunately not much
doc at this moment. Maybe you could try to write something after this experience :)

By the way, various stream_vt functions in your code are already declared in the following file:

prelude/SATS/stream_vt.sats

I thought that part of the reason you implemented them was to get yourself familiar with ATS.

For instance, you can use stream_vt_map_cloptr for stream_vt_usermap; the former also frees
the (linear) closure it uses.