Help with Dataframes

Vikram Rawat

May 29, 2017, 7:23:51 AM

to golang-nuts

Can anybody please tell me how to write GOTA Golang dataframes to a CSV file?

I have been trying for 2 days to find a way to write dataframes to a CSV file. Can anybody please help me understand what io.Writer means and how to use it?

I have given up trying to understand it...

Please, any help will be appreciated.

Jesper Louis Andersen

May 29, 2017, 7:33:00 AM

to Vikram Rawat, golang-nuts

Don't give up! When things become too daunting, go do something else, then come back later. Brains need some processing time.

Your post suggests that you are missing some background information and that you are plunging into deep waters. io.Writer is an interface, which is a concept central to Go. Make sure you have a good understanding of interfaces first. io.Writer is an abstraction over something you can write to, so it generalizes files, memory buffers, network sockets and so on.

GOTA dataframes seem to have some packages written for them already, so if the format is complex it is perhaps better to use a library which is already written.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Sebastien Binet

May 29, 2017, 7:33:54 AM

to Vikram Rawat, golang-nuts

Vikram,

On Mon, May 29, 2017 at 1:23 PM, Vikram Rawat <vikram...@gmail.com> wrote:

> Can anybody please tell me how to write GOTA Golang dataframes to a CSV file?
>
> I have been trying for 2 days to find a way to write dataframes to a CSV file. Can anybody please help me understand what io.Writer means and how to use it?

I am not an expert wrt gota/dataframe but here is what I got:

$> go run ./main.go
$> cat out.csv
COL.1,COL.2,COL.3
b,1,3.000000
a,2,4.000000

with the following main.go file:

package main

import (
	"log"
	"os"

	"github.com/kniren/gota/dataframe"
	"github.com/kniren/gota/series"
)

func main() {
	// Build a small dataframe with three typed columns.
	df := dataframe.New(
		series.New([]string{"b", "a"}, series.String, "COL.1"),
		series.New([]int{1, 2}, series.Int, "COL.2"),
		series.New([]float64{3.0, 4.0}, series.Float, "COL.3"),
	)

	// os.Create returns an *os.File, which implements io.Writer.
	o, err := os.Create("out.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer o.Close()

	// WriteCSV accepts any io.Writer, here the file.
	err = df.WriteCSV(o)
	if err != nil {
		log.Fatal(err)
	}

	// Close explicitly so that a failure to flush to disk is reported.
	err = o.Close()
	if err != nil {
		log.Fatal(err)
	}
}

here is the doc+examples for gota/dataframe:

https://godoc.org/github.com/kniren/gota/dataframe#pkg-examples

hth,

-s

Ayan George

May 29, 2017, 7:42:30 AM

to golan...@googlegroups.com

On 05/29/2017 07:23 AM, Vikram Rawat wrote:
> Can anybody please tell me

io.Writer is an interface that matches any concrete type that implements
the Write() method:

https://golang.org/pkg/io/#Writer

os.Create returns an *os.File, which implements io.Writer, so you can use it to write to a file like:

w, err := os.Create("myfile.csv")

and you can write your CSV to it using the WriteCSV() method described
below:

https://godoc.org/github.com/kniren/gota/dataframe

So based on the documentation, something like the code below should work:

df := dataframe.LoadRecords(
	[][]string{
		{"A", "B", "C", "D"},
		{"a", "4", "5.1", "true"},
		{"b", "4", "6.0", "true"},
		{"c", "3", "6.0", "false"},
		{"a", "2", "7.1", "false"},
	},
)

w, err := os.Create("myfile.csv")
if err != nil {
	/* handle os.Create() error here. */
	log.Fatal(err)
}
defer w.Close()

// WriteCSV returns an error, so check it as well.
if err := df.WriteCSV(w); err != nil {
	log.Fatal(err)
}

...

Vikram Rawat

May 29, 2017, 7:48:20 AM

to golang-nuts

Thank You
Thank You
Thank You
Thank You
Thank You
Thank You
Thank You
Thank You
Thank You

Thank You

Very very very MUCH

My brain was about to bleed to death... I am not a programmer, but somebody suggested Go to me and I started with it just a month ago.

It's quite different and hard to grasp, but if it has an active group like you guys, it will surely not die a slow death.

Thanks again, everybody.

Jesper, Sebastien and Ayan, thanks again guys...

Pee Jai

Jul 26, 2020, 5:59:50 AM

to golang-nuts

I created https://github.com/rocketlaunchr/dataframe-go to make dealing with data much easier.

It has an example code snippet in the docs on how to write dataframes to a CSV file.

I created it because I found gota to be very cumbersome to use.

Yassine KICH

Sep 24, 2025, 3:14:43 PM

to golang-nuts

I recommend using https://github.com/kishyassin/goframe

Jason E. Aten

Sep 24, 2025, 6:19:23 PM

to golang-nuts

Hi Vikram,

Sounds like you got it working--great! Also the LLMs are terrific for explaining language concepts if you are stuck conceptually.

If you need a dataframe package that scales to big data (as it turns out, parsing floating point numbers is a very slow operation), I wrote a use-all-cores, fast, parallel-loading dataframe for Go called SlurpDF. I was envious of how fast R's data.table could read in CSV files in parallel. See

https://github.com/glycerine/slurpdf

See slurp_test.go for an example of writing back to CSV on disk.

(This was in service of a little Xgboost-like gradient boosted decision tree ensemble machine learner, e.g. https://github.com/glycerine/gocortado)

Enjoy,

Jason

robert engels

Sep 24, 2025, 6:31:17 PM

to Jason E. Aten, golang-nuts

As an aside, your slurp isn’t really doing what you think.

The line byby := bytes.Split(buf, newline) is causing the entire file to be read into memory on a single core, which is unnecessary.

You need to modify the code a bit to get the optimum performance.

You should calculate a base offset which is (total file size / number of cores).

Then calculate the actual offsets by seeking to that point and advancing to the next newline, then do the same for the rest, so that you end up with an array of slices, each of which is a portion of the file.


Jason E. Aten

Sep 24, 2025, 11:18:42 PM

to golang-nuts

Thanks Robert. Of course you are right, and a pull request would be welcome :)

Seriously though -- I do appreciate the comment. At the time, if I remember -- this was 2 years ago when I wrote it -- I recall not wanting to complicate the code by having to deal with the CSV file lines that got split between two goroutines if I didn't find the newlines first. Once you do that you need more locking to resolve the conflict and not step on the same memory another goroutine is using... much more coordination seemed necessary.

That and the true bottleneck usually being the parsing of the floats means once I matched what the C code for data.table was doing, I moved on. So yes, it could be faster, but the simpler code was appealing.

- J
