Loading CSV & Sliding Window


KLearner

Mar 8, 2021, 7:49:05 PM
to Kona Users
Clearly I'm doing something wrong.
Load delimited text file (with column names):
(types;,delim)0:`f
 / This is a free CSV I downloaded from somewhere, but I don't know how to import it.

type error
(3 3 3 3; ,",") 0: "demo.csv"
               ^
There are 4 columns, and I'm trying to load them all as strings.

====================================================

In Kona you have eachpair
/ F':A
  ,':!10
(1 0
 2 1
 3 2
 4 3
 5 4
 6 5
 7 6
 8 7
 9 8)
Fair enough, but is there a way to do an n-wise sliding window like I might have in APL or ngn/k?

  ⍝ APL    n F/ A
     3,/⍳10
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│0 1 2│1 2 3│2 3 4│3 4 5│4 5 6│5 6 7│6 7 8│7 8 9│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘

/ngn F n':A
 , 3': !10
,(0 1 2;1 2 3;2 3 4;3 4 5;4 5 6;5 6 7;6 7 8;7 8 9)

Bakul Shah

Mar 8, 2021, 10:21:31 PM
to kona...@googlegroups.com
There is likely a better solution but here is a quick hack for an n-wise sliding window:

  {:[x<#y;(,x#y),_f[x;1_ y];,y]}[3;!10]
(0 1 2
 1 2 3
 2 3 4
 3 4 5
 4 5 6
 5 6 7
 6 7 8
 7 8 9)

For CSV loading, I suggest reading the k3 manuals (see the Kona wiki).


douglas mennella

Dec 10, 2021, 9:42:43 AM
to Kona Users
I think this is a little more compact:

  {(!1+y-x)+\:!x}[3;10]

(0 1 2
 1 2 3
 2 3 4
 3 4 5
 4 5 6
 5 6 7
 6 7 8
 7 8 9)
  cds:10 _draw -17
  cds@/:({(!1+y-x)+\:!x}[3;10])
(1 4 3
 4 3 8
 3 8 14
 8 14 5
 14 5 2
 5 2 10
 2 10 13
 10 13 7)
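
For readers who don't read K fluently, the same index trick can be sketched in Python. This is only an illustration and not code from the thread; `window_indices` and `windows` are names made up for the sketch. It builds the start offsets (the `!1+y-x` part), adds `0..n-1` to each (the `+\:!x` part), then uses the index matrix to pick from the data, as `cds@/:` does above.

```python
def window_indices(n, length):
    # Index matrix for all n-wide windows over a list of the given length.
    # Mirrors {(!1+y-x)+\:!x}[n;length]: each start offset plus 0..n-1.
    return [[start + k for k in range(n)] for start in range(1 + length - n)]


def windows(n, data):
    # Apply the index matrix to the data, like data@/:indices in K.
    return [[data[i] for i in row] for row in window_indices(n, len(data))]


print(window_indices(3, 10)[:2])
print(windows(3, list("abcde")))
```

Building the indices first and indexing afterwards means the same matrix can be reused for any list of that length.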


As for CSV loading, kona doesn't support this version of 0:. For cheap CSV loading that ignores quoting, etc., you can simply use split:

  {(","\)/:("\n"\x)} "this,is,a\ncsv,file,here"
(("this"
  "is"
  ,"a")
 ("csv"
  "file"
  "here"))

If you need to convert to numbers at depth, you can do something like this:

  0$''{(","\)/:("\n"\x)} "1,2,3\n4,5,6"
(1 2 3
 4 5 6)
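
As a cross-check of what the two K one-liners do, here is a rough Python equivalent. It is a sketch for illustration, not part of the original post; `naive_csv` and `to_ints` are hypothetical names. The last call shows the limitation mentioned above: a separator inside quotes is split like any other.

```python
def naive_csv(text, sep=","):
    # Split into lines on "\n", then split each line on the separator.
    # No quote handling: a quoted field containing the separator is broken up.
    return [line.split(sep) for line in text.split("\n")]


def to_ints(table):
    # Convert every field to int, two levels deep (like 0$'' in the post).
    return [[int(field) for field in row] for row in table]


print(naive_csv("this,is,a\ncsv,file,here"))
print(to_ints(naive_csv("1,2,3\n4,5,6")))
print(naive_csv('a,"b,c"'))
```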

FWIW, I tried writing something that would take care of quotes, but am running into issues.  I'll post in a separate thread.  If I can get it working I might be able to mimic the behavior of the 0: in k4 and post it to the wiki.

Kevin Lawler

Dec 10, 2021, 7:32:59 PM
to kona...@googlegroups.com
On Fri, Dec 10, 2021 at 8:42 AM douglas mennella
<douglas....@gmail.com> wrote:
>
> I think this is a little more compact:
> {(!1+y-x)+\:!x}[3;10]

nice. added to wiki

> As for CSV loading,
> FWIW, I tried writing something that would take care of quotes, but am running into issues. I'll post in a separate thread.

there's a standard array language trick of
pointer-chasing/finite-state automaton that should do it fairly
compactly. given that the only wrinkle is going in and out of quotes
and then perhaps escapes and newlines (per RFC 4180), you might be
able to get by with lighter tools
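
To make the state-machine idea concrete, here is a character-at-a-time sketch in Python rather than K. It is an illustration only, not code from this thread, and `parse_csv` is a made-up name. It keeps one bit of state (inside or outside quotes) and treats a doubled quote inside a quoted field as one literal quote, which is how RFC 4180 describes escaping. It assumes a single-character separator and quote.

```python
def parse_csv(text, sep=",", quote='"'):
    rows, row, field = [], [], []
    in_quotes = False
    pending = False          # has the current row consumed any input yet?
    i = 0
    while i < len(text):
        ch = text[i]
        if in_quotes:
            if ch != quote:
                field.append(ch)          # separators and newlines are data here
            elif text[i + 1:i + 2] == quote:
                field.append(quote)       # doubled quote: one literal quote
                i += 1
            else:
                in_quotes = False         # closing quote
        elif ch == quote:
            in_quotes = True
        elif ch == sep or ch == "\n":
            row.append("".join(field))
            field = []
            if ch == "\n":
                rows.append(row)
                row = []
        elif ch != "\r":                  # drop the CR of a CRLF line ending
            field.append(ch)
        pending = in_quotes or ch != "\n"
        i += 1
    if pending:                           # last row had no trailing newline
        row.append("".join(field))
        rows.append(row)
    return rows


print(parse_csv('x,"say ""hi"", ok"\n1,2\r\n'))
```

A loop like this is easy to follow but processes one character per step; the array-style approaches compute the in-quotes state for the whole string at once.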

pahihu

Dec 11, 2021, 8:48:07 AM
to Kona Users
Hi,

I've just tried the following in Kona and it just works:

  ("ICF";",")0:`a.csv
(,1
 ,"alpha"
 ,2.0)

  ("ICF";,",")0:`b.csv
(`num `str `real
 (,1
  ,"alpha"
  ,2.0))

The CSV files are:

===> cat a.csv
1,alpha,2.0

===> cat b.csv
num,str,real
1,alpha,2.0
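
For comparison, a small Python sketch of the same column-wise, typed load. It is illustrative only: `load_columns` is a hypothetical helper, the letter mapping (I to int, C to string, F to float) is inferred from the output shown above and covers only those three letters, and quoting is ignored.

```python
def load_columns(text, types, sep=",", header=False):
    # Assumed mapping for illustration: I -> int, C -> str, F -> float.
    convert = {"I": int, "C": str, "F": float}
    rows = [line.split(sep) for line in text.splitlines() if line]
    names = rows.pop(0) if header else None
    # Transpose rows into columns, then convert each column by its type letter.
    cols = [[convert[t](v) for v in col] for t, col in zip(types, zip(*rows))]
    return (names, cols) if header else cols


print(load_columns("1,alpha,2.0\n", "ICF"))
print(load_columns("num,str,real\n1,alpha,2.0\n", "ICF", header=True))
```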


Regards,
pahihu

douglas mennella

Dec 11, 2021, 5:34:09 PM
to kona...@googlegroups.com

Oh, wow. It works for me too. Thanks for that. I'm still new to kona, so maybe I over-interpreted the errors I ran into. It looks like it's still having issues with quoting, though.

➜  kona git:(master) ✗ cat quote.csv
she,asked,"""why not now?"""
➜  kona git:(master) ✗ rlwrap ./k   
kona      \ for help. \\ to exit.

  ("CCC";",")0:`quote.csv
(,"she"
 ,"asked"
 ,"\"\"\"why not now?\"\"\"\r")
 

FWIW, here's what I have for trying to mask out quotes in order to not split on separators in quotes, but I'll have a look at Kevin's references from another post.   I think this is relatively fast as the number of converge steps is only as long as the longest string of consecutive quotes which shouldn't be long.

  tbl:("CCC";",")0:`quote.csv
  ","/*:'tbl
"she,asked,\"\"\"why not now?\"\"\"\r"
  txt:","/*:'tbl
  q:"\""=txt
  rmpr:{{ls:{{*-2#(|/)({{x _ (b&1!b:(0,y))}[x]y}[x])\y}\:[-1 1;x]}x;c:-/*:'&:'ls;x&~|/((-!(c!2)+c)!\:*|ls)}/x}
  (q; rmpr q)
(0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0
 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0)
  txt:"\"that\",\"guy\",\"said, \"\"hi\"\" to \"\"me\"\"\",\"why?\""
  q:"\""=txt
  (q; rmpr q)
(1 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 1 0 1 0 0 0 0 1
 1 0 0 0 0 1 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1)

The next step is to turn this into a mask of the regions inside and outside quotes.


douglas mennella

Dec 12, 2021, 7:14:26 AM
to kona...@googlegroups.com

FWIW, here's a first cut at quote handling for a CSV. It likely breaks pretty badly when the CSV is not well formed, but I'm not sure what should be expected in that case.

➜  kona git:(master) ✗ rlwrap ./k
kona      \ for help. \\ to exit.

  sp:{(0,(&x)-!#&x) _ y[&~x]}
  rm:{{ls:1_-1!(/|){-1 _ (b&1!b:(0,x))}\x;c:#ls;(x&~|/((c!2)+!c-(c!2))!\:*|ls)}/x}
  ma:{wh:-':(&x),(#x);,/{(x[0]+-1+2*c)#c:~x[1]}'+:(wh;(!#wh)!\:2)}
  uq:{bs:"\""=x;c:|/bs;rm:({1_ b&-1!b:(0,x)}bs);x[&~rm|c*f||f:~!#x]}
  snq:{qs:"\""=y;qm:ma[rm[qs]];s:(~qm)&x=y;sp[s;y]}
  csv:{uq''snq[","]'snq["\n"]x}

  txt:"\"this\",\"guy\",\"said, \"\"hi\"\" to \"\"me\"\"\",\"sad\"\n\"this\",\"guy\",\"said, \"\"hi\"\" to \"\"me\"\"\",\"sad\""
 
  csv txt
(("this"
  "guy"
  "said, \"hi\" to \"me\""
  "sad")
 ("this"
  "guy"
  "said, \"hi\" to \"me\""
  "sad"))


douglas mennella

Dec 12, 2021, 8:35:52 PM
to kona...@googlegroups.com

Better!  I can just take the running sum of the index function of where the quotes are and look at the parity to generate the mask directly instead of my "fancy" construction below.  I also changed split because the previous version didn't handle empty fields.

 qm:{s|((+\s:(x=y))!\:2)}
 sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
 uq:{bs:"\""=x;c:|/bs;rm:({1_ b&-1!b:(0,x)}bs);x[&~rm|c*f||f:~!#x]}
 snq:{m:qm["\""][y];s:(~m)&x=y;sp[s;y]}
 csv:{uq''snq[","]'snq["\n"]x}

 csv txt
(("this"
  "guy"
  "said, \"boo\" to \"me\""
  "why?")
 ("this"
  "guy"
  "said, \"boo\" to \"me\""
  "why?"))
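
The running-sum idea behind qm can be spelled out in Python as follows. This is a sketch for illustration, not a translation of the full K code; `quote_mask` and `split_outside_quotes` are made-up names. The mask is 1 on every quote character and on everything between an opening and a closing quote, so only separators where the mask is 0 are real field boundaries. Empty fields fall out naturally.

```python
from itertools import accumulate


def quote_mask(text, quote='"'):
    is_quote = [int(ch == quote) for ch in text]
    # Parity of the running count of quotes: 1 while inside a quoted region.
    parity = [count % 2 for count in accumulate(is_quote)]
    # Also mark the quote characters themselves, as qm does with s|...
    return [q | p for q, p in zip(is_quote, parity)]


def split_outside_quotes(text, sep=",", quote='"'):
    mask = quote_mask(text, quote)
    cuts = [i for i, ch in enumerate(text) if ch == sep and not mask[i]]
    bounds = [-1] + cuts + [len(text)]
    return [text[a + 1:b] for a, b in zip(bounds, bounds[1:])]


print(quote_mask('a"b"c'))
print(split_outside_quotes('"a,b",c,,"d"'))
```

Escaped quotes come in pairs, so they change the count by two and leave the parity of the surrounding region unchanged.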

FWIW, I'm a little worried about this being spam but the traffic here is pretty low.  Please feel free to redirect me to a more appropriate outlet.

Kevin Lawler

Dec 13, 2021, 7:50:13 AM
to kona...@googlegroups.com
On Sun, Dec 12, 2021 at 7:35 PM douglas mennella
<douglas....@gmail.com> wrote:
>
> FWIW, I'm a little worried about this being spam but the traffic here is pretty low. Please feel free to redirect me to a more appropriate outlet.

this is just good list content. you can do as much as you want.
everyone knows how to filter lists these days if they need to

douglas mennella

Dec 29, 2021, 8:19:50 PM
to kona...@googlegroups.com
Okay.  Thanks.  FWIW, removing quotes after parsing wasn't quite right
when there are more than three consecutive quotes.  I think the
following should be good.  The quote and separator can be made arguments
to the csv function.

I don't quite understand why uq["\""]'' didn't work but {uq["\""]x}''
works as a workaround.  I also don't understand why the behavior changes
when qm[x][z] is replaced by qm[x;z]

$ cat quote.csv
she,asked,"""why not now, steve?"""
"I",said,"""well judy, because """"I"""" am busy.
har har"""%
$ ./k
kona      \ for help. \\ to exit.

   qm:{s|((+\s:(x=y))!\:2)}
   sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
   uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#:l)}
   snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
   ncr:{x[&~("\r"=x)&1!"\n"=x]}
   csv:{{uq["\""]x}''snq["\"";","]'snq["\"";"\n"]x}

   txt:6:`quote.csv
   csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\""))
  ("CCC";",")0:`quote.csv
(("she"
  "\"I\""
  "har har\"\"\"")
 ("asked"
  "said"
)
 ("\"\"\"why not now"
  "\"\"\"well judy"
))

Kevin Lawler

Dec 31, 2021, 2:18:57 PM
to kona...@googlegroups.com
> I also don't understand why the behavior changes when qm[x][z] is replaced by qm[x;z]

generally, a[b][c] is standard recursive indexing whereas a[b;c;...]
is "cross-sectional" indexing with repeatable, multiplicative selects
at depth.
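
a rough Python picture of that difference, for list indexes b and c on a nested list. this is an illustration only: `index_recursive` and `index_cross` are hypothetical helper names, and it models plain list indexing, not function application.

```python
def index_recursive(a, b, c):
    # a[b][c]: select with b first, then index that result with c.
    first = [a[i] for i in b]
    return [first[j] for j in c]


def index_cross(a, b, c):
    # a[b;c]: for every row picked by b, take every column picked by c.
    return [[a[i][j] for j in c] for i in b]


a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(index_recursive(a, [0, 2], [1, 0]))   # reorders the two selected rows
print(index_cross(a, [0, 2], [1, 0]))       # 2x2 selection of rows 0,2 and columns 1,0
```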

without looking more closely, it's possibly a list vs. atom issue,
particularly with strings, where "a" is a char but "aa" is a charvec.
compare symbols. the various results of the indexes depend on how the
arguments are enlisted

> I don't quite understand why uq["\""]'' didn't work but {uq["\""]x}'' works as a workaround.

could be a variety of things. maybe someone else could assist



douglas mennella

Feb 18, 2022, 8:04:13 AM
to Kona Users
I learned a few more tricks!  ~= for binary is basically addition mod 2 so you can roll the two steps into one.

Also, that same mask alternates 1's and 0's for consecutive quotes so can be used for unescaping the quotes after the split.

  txt:"he,asked,\"\"\"why not now, steve?\"\"\"\n\"I\",said,\"\"\"well judy, because \"\"\"\"I\"\"\"\" am busy.\nhar har\"\"\""
 
  uq:{[m;q;x]:[*q;(1-2**q)_ x@&~q&m;x]}
  sp:{[s;m;q;x]+((~~!#l)_')'(l:0,&(~m)&s=x)_/:(m;q;x)}
  csv:{m:(~=)\q:"\""=x;{(x).'y}[uq]'sp[","].'sp["\n"].(m;q;x)}
 
  csv txt
(("he"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\""))


Again, I would have preferred to do uq'' instead of {(x).'y}[uq]', but this workaround seems to work.
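
The two tricks can be checked outside K with a short Python sketch. It is illustrative only; `parity_by_sum`, `parity_by_xor` and `unquote` are made-up names, and `unquote` assumes a well-formed field that starts outside any quoted region. An XOR scan over the quote bits gives the same mask as a running sum taken mod 2, and because that mask alternates on consecutive quotes, dropping the quotes where it is 1 keeps exactly one quote from each escaped pair.

```python
from itertools import accumulate
from operator import xor


def parity_by_sum(bits):
    # Two steps: running sum, then mod 2.
    return [total % 2 for total in accumulate(bits)]


def parity_by_xor(bits):
    # One step: XOR scan, the analogue of (~=)\ on 0/1 values.
    return list(accumulate(bits, xor))


def unquote(field, quote='"'):
    q = [int(ch == quote) for ch in field]
    m = parity_by_xor(q)
    if not q or not q[0]:
        return field                      # unquoted field: leave as is
    kept = [ch for ch, qi, mi in zip(field, q, m) if not (qi and mi)]
    return "".join(kept[:-1])             # the closing quote survives the filter; drop it


bits = [int(ch == '"') for ch in 'he,asked,"""why not now, steve?"""']
print(parity_by_sum(bits) == parity_by_xor(bits))
print(unquote('"""why not now, steve?"""'))
```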

Tom Szczesny

Feb 19, 2022, 3:23:15 PM
to kona...@googlegroups.com
At first, I thought that you preferred to use uq'' instead of {(x).'y}[uq]' in kona because uq'' works in the original k.
However, uq'' doesn't work in k2.8 either.  (Your workaround does.)

K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.

  0: "dg2"
("txt:\"he,asked,\\\"\\\"\\\"why not now, steve?\\\"\\\"\\\"\\n\\\"I\\\",said,\\\"\\\"\\\"well judy, because \\\"\\\"\\\"\\\"I\\\"\\\"\\\"\\\" am busy,\\nhar har\\\"\\\"\\\"\""

 "uq:{[m;q;x]:[*q;(1-2**q)_ x@&~q&m;x]}"
 "sp:{[s;m;q;x]+((~~!#l)_')'(l:0,&(~m)&s=x)_/:(m;q;x)}"
 "csv:{m:(~=)\\q:\"\\\"\"=x;uq''sp[\",\"].'sp[\"\\n\"].(m;q;x)}")
 
  .:' 0: "dg2"
(;;;)
 
  csv txt
valence error
{m:(~=)\q:"\""=x;uq''sp[","].'sp["\n"].(m;q;x)}
                 ^
>


Tom Szczesny

Feb 19, 2022, 3:58:20 PM
to kona...@googlegroups.com
I began to look at
> I don't quite understand why uq["\""]'' didn't work but {uq["\""]x}'' works as a workaround.

=========================================================================
I found that although your example works in kona:

[tom@localhost kona]$ rlwrap -n ./k

kona      \ for help. \\ to exit.

  0: `quote.csv
("she,asked,\"\"\"why not now, steve?\"\"\""

 "\"I\",said,\"\"\"well judy, because \"\"\"\"I\"\"\"\" am busy."
 "har har\"\"\"%")
 
  0: `dg3
("qm:{s|((+\\s:(x=y))!\\:2)}"

 "sp:{(,*l),1_/:1_ l:((0,&x) _ y)}"
 "uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#:l)}"
 "snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}"
 "ncr:{x[&~(\"\\r\"=x)&1!\"\\n\"=x]}"
 "csv:{{uq[\"\\\"\"]x}''snq[\"\\\"\";\",\"]'snq[\"\\\"\";\"\\n\"]x}")
 
  txt: 6:`quote.csv
  .:' 0: `dg3

 
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

\\
  ===========================================================
I get a syntax error in k2.8:

[tom@localhost k2.8]$ rlwrap -n ./k

K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.

  0: `quote.csv
("she,asked,\"\"\"why not now, steve?\"\"\""

 "\"I\",said,\"\"\"well judy, because \"\"\"\"I\"\"\"\" am busy."
 "har har\"\"\"%")
 
  0: `dg3
("qm:{s|((+\\s:(x=y))!\\:2)}"

 "sp:{(,*l),1_/:1_ l:((0,&x) _ y)}"
 "uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#:l)}"
 "snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}"
 "ncr:{x[&~(\"\\r\"=x)&1!\"\\n\"=x]}"
 "csv:{{uq[\"\\\"\"]x}''snq[\"\\\"\";\",\"]'snq[\"\\\"\";\"\\n\"]x}")
 
  txt: 6:`quote.csv
  .:' 0: `dg3
syntax error

l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#:l)
                                                                               ^
parse error
.:' 0: `dg3
^
>
========================================================================

Tom Szczesny

Feb 19, 2022, 8:30:46 PM
to kona...@googlegroups.com
In Kona:
[tom@localhost kona]$ rlwrap -n ./k
kona      \ for help. \\ to exit.

  # 1 2 3
3
  #: 1 2 3
3

****************************************************************************
In k2.8:
[tom@localhost k2.8]$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.

  # 1 2 3
3
  #: 1 2 3
syntax error
#: 1 2 3
 ^
parse error

Tom Szczesny

Feb 19, 2022, 8:41:09 PM
to kona...@googlegroups.com
Easy fix:
[tom@localhost k2.8]$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.

  0: `quote.csv
("she,asked,\"\"\"why not now, steve?\"\"\""
 "\"I\",said,\"\"\"well judy, because \"\"\"\"I\"\"\"\" am busy."
 "har har\"\"\"%")
 
  0: `dg4

("qm:{s|((+\\s:(x=y))!\\:2)}"
 "sp:{(,*l),1_/:1_ l:((0,&x) _ y)}"
 "uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}"

 "snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}"
 "ncr:{x[&~(\"\\r\"=x)&1!\"\\n\"=x]}"
 "csv:{{uq[\"\\\"\"]x}''snq[\"\\\"\";\",\"]'snq[\"\\\"\";\"\\n\"]x}")
 
  txt: 6:`quote.csv
  .:' 0: `dg4
(;;;;;)

 
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

Tom Szczesny

Feb 19, 2022, 8:54:09 PM
to kona...@googlegroups.com
Regarding:
> I don't quite understand why uq["\""]'' didn't work but {uq["\""]x}'' works as a workaround.

It works in k2.8.
This is a BUG in Kona.

[tom@localhost k2.8]$ cat quote.csv

she,asked,"""why not now, steve?"""
"I",said,"""well judy, because """"I"""" am busy.
har har"""%

[tom@localhost k2.8]$ cat dg5

qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
ncr:{x[&~("\r"=x)&1!"\n"=x]}
csv:{uq["\""]''snq["\"";","]'snq["\"";"\n"]x}

[tom@localhost k2.8]$ rlwrap -n ./k
K 2.8 2000-10-10 Copyright (C) 1993-2000 Kx Systems
\ for help. \\ to exit.
 
  txt: 6:`quote.csv
  .:' 0: `dg5

(;;;;;)
 
  csv[ncr txt]
(("she"
  "asked"
  "\"why not now, steve?\"")
 (,"I"
  "said"
  "\"well judy, because \"\"I\"\" am busy.\nhar har\"\"")
 ,"")

Tom Szczesny

Feb 19, 2022, 9:10:26 PM
to kona...@googlegroups.com
Actually, you have identified 2 bugs, since #: should produce a syntax error in Kona.

douglas mennella

Feb 20, 2022, 2:13:19 AM
to Kona Users
It seems to me this last is more likely a bug in k2.8.  It may be unnecessary in this instance, but it should be fair game to indicate that this is a monadic use of #.

I don't have access to k2.8 (can I find a version somewhere?), but it might be worth testing this with other verbs like - or +.  In kona these all seem to work:

kona      \ for help. \\ to exit.

  +:(1 2;3 4)
(1 3
 2 4)
  -:234
-234
  *:!5
0


They all work in ngn/k as well.

douglas mennella

Feb 20, 2022, 2:31:45 AM
to Kona Users
It's not clear to me what's different here.  It might also be a little easier to parse if you use `0:6:`dg4 instead.

  `0:6:`d4

qm:{s|((+\s:(x=y))!\:2)}
sp:{(,*l),1_/:1_ l:((0,&x) _ y)}
uq:{l:{y&~x}':{{-1_(b&1!b:(0,x))}\x}bs:x=y;c:|/bs;(y@&~(c*f||f:~!#y)||/l@1+2*!_.5*#l)}
snq:{m:qm[x][z];s:(~m)&y=z;sp[s;z]}
ncr:{x[&~("\r"=x)&1!"\n"=x]}
csv:{{uq["\""]x}''snq["\"";","]'snq["\"";"\n"]x} 


Oh, I see it now.  It's the superfluous monadizing(TM) :.

Tom Szczesny

Feb 20, 2022, 10:51:12 PM
to kona...@googlegroups.com
They all fail in k3.2.
It doesn't look like a bug in k2.8.

C:\k3.2>k
K 3.2 2005-06-25 Copyright (C) 1993-2004 Kx Systems
WIN32 8CPU 3904MB desktop-fvkenu9.myfiosgateway.com 0


  #: 1 2 3
syntax error
#: 1 2 3
 ^
parse error


  +:(1 2;3 4)
syntax error

+:(1 2;3 4)
 ^
parse error


  -:234
syntax error
-:234
 ^
parse error

  *:!5
syntax error
*:!5
 ^
parse error
