Best way to read an STL binary file

Guillermo Estrada

Sep 30, 2013, 10:41:59 PM
to golan...@googlegroups.com
Hi, I'm writing an STL parser for a project, and I'd like some opinions on idiomatic style and, above all, performance. An STL binary file is a triangle file and it's pretty simple (http://en.wikipedia.org/wiki/STL_(file_format)). I defined something like this...

type Model struct {
  Header      [80]byte
  Count       uint32
  Triangles   []Triangle
}

type Triangle struct {
  Vertex1    [3]float32
  Vertex2    [3]float32
  Vertex3    [3]float32
  Attribute     uint16
}

My problem, as always, is that Go has never liked byte arrays; it encourages byte slices for everything. I'm used to parsing with bytes.Buffer and encoding/binary, but as far as I have tested they have "invalid type" issues with byte arrays. Usually all my structs for this kind of work have ints, floats, and all that "defined size" stuff, so I thought a [80]byte would be the same. I like byte arrays because you can be sure how much memory you are using (no memory leaks, garbage collection, memory copying and stuff), and I need this part to perform as tightly as possible. Right now I'm using ioutil to read the whole file and fill my structs, but I am pretty sure the optimal way would be using bufio in case of large files.

So... what would be the optimal (as in least memory and parse time) method to store my model using slices? Do I preallocate slice capacity in every single one? What buffer size? (I'm pretty sure reading one triangle (14 bytes) at a time would be painfully slow.) And then again... wouldn't using slices incur the usual memory copy operations when they have to grow? The good thing is that I have the triangle count in bytes [80:84], so I can allocate the length/capacity of my Triangle slice up front. Any method is welcome.

Ty Gophers,

Guillermo Estrada

Sep 30, 2013, 10:57:27 PM
to golan...@googlegroups.com
I screwed up the endianness on one try, this seems to work though:

  data, err := ioutil.ReadFile(filename)
  if err != nil { 
    panic(err)
  }
  fmt.Println(len(data))
  buf := new(bytes.Buffer)
  m := new(Model)
  buf.Write(data[0:84])
  err = binary.Read(buf, binary.LittleEndian, m)
  if err != nil {
    panic(err)
  }
  fmt.Println("Triangles:", m.Count)

The triangle count seems to work, but I had to modify my Model struct because of the Triangle slice; I guess I'll have to read that separately. Any other thoughts...

Ian Lance Taylor

Sep 30, 2013, 11:03:33 PM
to Guillermo Estrada, golang-nuts
On Mon, Sep 30, 2013 at 7:41 PM, Guillermo Estrada <phro...@gmail.com> wrote:
>
> My problem as always is that Go has never liked byte arrays, it encourages
> byte slices for everything, and I'm used to parsing with bytes.buffer and
> encoding/binary libraries, but they have issues with byte arrays as in
> "invalid type" issues as far as I have tested, usually all my structs when
> dealing with this have ints, floats, and all that "defined size" stuff, I
> though a [80]byte would be the same.

If a function takes a slice, and you have an array, you can always
pass the array to the function by using [:] to slice the array.

http://play.golang.org/p/ZEZObSrqyW

Ian

Elliott Polk

Sep 30, 2013, 11:04:33 PM
to golan...@googlegroups.com
Unrelated to Go, just remember there are 2 types of STL formatting. I work for a 3D printer manufacturer and we're also working with STL files. I wish you the best if you're writing a Go-based slicer. This is currently on my list as well.

Guillermo Estrada

Sep 30, 2013, 11:10:54 PM
to golan...@googlegroups.com
Elliott, now that you mention it... It is a Go-based slicer for a 3D printer preview program. What two types are you referring to? ASCII and binary?

Erwin

Oct 1, 2013, 7:12:07 AM
to Guillermo Estrada, golang-nuts
I think for optimal read performance you'll need to use package unsafe and avoid encoding/binary. After you have read in the header (which you can do with encoding/binary because it's not a lot), you can allocate a byte slice the size of the triangle data that follows. Just read this with one call to the os.File Read method, which should be fast. Once you have your byte slice, you can use package unsafe to convert the slice to a []Triangle without copying anything. It's like C, where you cast a void * to the expected data type.

something like this (untested):

const triSize = int(unsafe.Sizeof(Triangle{}))

// one buffer for all the triangle data (named buf so it does not shadow package bytes)
buf := make([]byte, int(model.Count)*triSize)
_, err := file.Read(buf)
...

// create a []Triangle from []byte
var slice = struct {
    addr uintptr
    len  int
    cap  int
}{
    addr: uintptr(unsafe.Pointer(&buf[0])),
    len:  len(buf) / triSize,
    cap:  cap(buf) / triSize,
}
triangles := *(*[]Triangle)(unsafe.Pointer(&slice))


I haven't done the above with .stl files yet, but with other types of files that have large chunks of similar data, and it turns out to be much faster than using encoding/binary. Ideally encoding/binary would be optimized to use a similar technique (when the file byte order matches the machine byte order), so one could simply use encoding/binary and read the []Triangle in a single call as well and it would fly like the code above.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Guillermo Estrada

Oct 1, 2013, 11:28:20 AM
to golan...@googlegroups.com
I still have to test using package unsafe to read all the data. But while using encoding/binary I have a problem like this:

Reading from the buffer data[0:84] works while using encoding/binary

type Model struct {
  Header      [80]byte
  Count       uint32
}

Reading from the buffer data[84:98] panics with an unexpected EOF, I think it has something to do with the 8 byte allocation on 64 bit systems, but I have not had this problem before. Panics the same with the float arrays or 9 floats, just in case you were wondering.
type Triangle struct {
  Vertex1    [3]float32
  Vertex2    [3]float32
  Vertex3    [3]float32
  Attribute     uint16
}

Any ideas on this? Why is this happening? 

Ian Lance Taylor

Oct 1, 2013, 11:39:55 AM
to Guillermo Estrada, golang-nuts
On Tue, Oct 1, 2013 at 8:28 AM, Guillermo Estrada <phro...@gmail.com> wrote:
>
> Reading from the buffer data[0:84] works while using encoding/binary
>
>> type Model struct {
>> Header [80]byte
>> Count uint32
>> }
>
>
> Reading from the buffer data[84:98] panics with an unexpected EOF, I think
> it has something to do with the 8 byte allocation on 64 bit systems, but I
> have not had this problem before. Panics the same with the float arrays or 9
> floats, just in case you were wondering.
>>
>> type Triangle struct {
>> Vertex1 [3]float32
>> Vertex2 [3]float32
>> Vertex3 [3]float32
>> Attribute uint16
>> }
>
>
> Any ideas on this? Why is this happening?

Show us the code.

Ian

Guillermo Estrada

Oct 1, 2013, 11:44:50 AM
to golan...@googlegroups.com, Guillermo Estrada
@Ian

// READ THE WHOLE FILE
  data, err := ioutil.ReadFile(filename)
  if err != nil { 
    panic(err)
  }

//CREATE A BUFFER
  buf := new(bytes.Buffer)
  m := new(Model)

//WRITING 84 bytes OF DATA TO A BUFFER AND DECODING IN MODEL WORKS
  buf.Write(data[0:84])
  err = binary.Read(buf, binary.LittleEndian,m)
  if err != nil {
    panic(err)
  }
  buf.Reset()

  tri := new(Triangle)

//WRITING 14 bytes OF DATA AND DECODING INTO TRIANGLE PANICS WITH UNEXPECTED EOF
  buf.Write(data[84:98])
  err = binary.Read(buf, binary.LittleEndian, tri)
  if err != nil {
    panic(err)
  }
  buf.Reset()

Guillermo Estrada

Oct 1, 2013, 11:55:36 AM
to golan...@googlegroups.com, Guillermo Estrada
I am realizing the Triangle struct is actually 38 bytes long... Silly me. Anyway, I'll do a quick benchmark using ioutil, bytes.Buffer, and encoding/binary to check speed and memory usage, and then proceed to use unsafe to compare the options. Also, I guess bufio is a better way to read the file. Any recommended approach for large files for optimal parsing?

Guillermo Estrada

Oct 1, 2013, 4:58:35 PM
to golan...@googlegroups.com, Guillermo Estrada
Ok I did what you proposed! I used nitro to get a simple profile. I exported the result of both methods to an ASCII version of the STL and both are identical, so... Here is what I got.

Parsing STL:
        7.0004ms (8.0005ms)         1.24 MB     33347 Allocs
Parsing Unsafe STL:
        1.0001ms (13.0008ms)        0.26 MB     27 Allocs

That is a pretty small file with just 4.7K Triangles, I will do more testing with different file sizes. On the other hand...

@Elliott what is the biggest size you have seen on a STL for a 3D Printer? with unsafe method, bufio might not be needed after all...

Erwin

Oct 1, 2013, 5:59:42 PM
to Guillermo Estrada, golang-nuts

> @Elliott what is the biggest size you have seen on a STL for a 3D Printer? with unsafe method, bufio might not be needed after all...


I've been printing files up to 350MB; larger than that, the printer software balks... I suppose you'll find gigabyte .stl files in the wild.
 

布艾德 - Elliott Polk

Oct 1, 2013, 9:02:54 PM
to Erwin, golang-nuts, Guillermo Estrada

The biggest we're playing with is around 9.5MB, tho we're just looking at things from thingiverse at the moment.

Have a look for a large desk tidy or even the Yoda is a decent stress test.

Is this a personal project? Feel free to ping me outside of this channel. Would be interested to follow the progress.


Guillermo Estrada

Oct 1, 2013, 11:17:24 PM
to golan...@googlegroups.com, Guillermo Estrada
Apparently although everything "looked" fine, I am unable to get it right using the unsafe method described by @notnot
The first normal and vertex are all good, but I'm having different data after the first one...
Does anyone have an example?

My code looks like this...

type Triangle struct {
  Normal      [3]float32
  Vertex1     [3]float32
  Vertex2     [3]float32
  Vertex3     [3]float32
  Attribute   uint16
}

func ParseUnsafeSTL(filename string) *Model {
  data, err := ioutil.ReadFile(filename)
  if err != nil { 
    panic(err)
  }
  m := new(Model)
  binary.Read(bytes.NewBuffer(data[80:84]), binary.LittleEndian, &m.Length)
  // create a []Triangle from []byte
  triSize := int(unsafe.Sizeof(Triangle{}))
  var slice = struct {
      addr uintptr
      len int
      cap int
  }{
      addr: uintptr(unsafe.Pointer(&data[84])),
      len: len(data[84:]) / triSize,
      cap: cap(data[84:]) / triSize,
  }
  m.Triangles = *(*[]Triangle)(unsafe.Pointer(&slice))
  return m
}

Only the first one read is fine, I suppose it has something to do with the sizes, but I'm unsure as I have never used unsafe before. Any help is welcome.

Erwin

Oct 2, 2013, 12:09:50 AM
to Guillermo Estrada, golang-nuts

> Only the first one read is fine, I suppose it has something to do with the sizes, but I'm unsure as I have never used unsafe before. Any help is welcome.

What is the value of triSize exactly? I suspect it is 52 instead of 50, for alignment purposes. Too bad, because then the above unsafe conversion from []byte to []Triangle won't work. In memory a Triangle is 52 bytes, in the file it is 50 bytes. So it looks like you'll have to decode triangle after triangle...
 

Guillermo Estrada

Oct 2, 2013, 1:07:58 AM
to golan...@googlegroups.com, Guillermo Estrada
Yeah I was expecting something like that to be troublesome...
This is the best implementation I have so far. It loads everything into memory in a single go; I guess larger files will be troublesome and I must do a bufio implementation, but I'm already expecting it to be kinda slow, because I guess I will be reading a triangle at a time? Anyway, thoughts on optimizing this code, or on a performant bufio implementation, are always welcome.

func ParseSTL(filename string) *Model {

  // Reading entire STL file into memory
  data, err := ioutil.ReadFile(filename)
  if err != nil { 
    panic(err)
  }
  
  m := new(Model)

  // Parsing Header first 80 bytes.
  err = binary.Read(bytes.NewBuffer(data[0:80]), binary.LittleEndian, &m.Header)
  if err != nil {
    panic(err)
  }

  // Parsing triangle count uint32 at byte 80
  err = binary.Read(bytes.NewBuffer(data[80:84]), binary.LittleEndian, &m.Length)
  if err != nil {
    panic(err)
  }

  // Allocating enough memory for all the triangles in the slice
  m.Triangles = make([]Triangle, m.Length)

  // Parsing the Triangle slice on byte 84 onwards, 50 bytes per triangle
  err = binary.Read(bytes.NewBuffer(data[84:]), binary.LittleEndian, &m.Triangles)
  if err != nil {
    panic(err)
  }
  
  return m
}

Jesse van den Kieboom

Oct 2, 2013, 3:14:25 AM
to golan...@googlegroups.com
Wrt the alignment problem, you might be able to make a special io.Reader which emits bytes for the padding as you read from an underlying io.Reader. Not sure if it will be faster though.

Guillermo Estrada

Oct 2, 2013, 3:18:21 PM
to golan...@googlegroups.com


On Wednesday, October 2, 2013 2:14:25 AM UTC-5, Jesse van den Kieboom wrote:
> Wrt the alignment problem, you might be able to make a special io.Reader which emits bytes for the padding as you read from an underlying io.Reader. Not sure if it will be faster though.

Yeah, although the idea behind using ioutil to read the entire file at once relies on being able to reuse that already reserved memory and data with unsafe, casting it to my Triangle slice to make it as efficient as possible.

Once I have to actually decode the data I read, it becomes an exercise in optimizing the memory allocations for my structs. Either way I'm better off with bufio there, because my struct slice cannot reuse the memory I read the file into in the first place. I'll post the bufio implementation and some numbers to get ideas on optimizations. Thnx!

Erwin

Oct 2, 2013, 3:54:17 PM
to Guillermo Estrada, golang-nuts
Perhaps a reasonable compromise is to read the file in smaller chunks, using a smaller read buffer (say 50000 bytes -> 1000 triangles) and allocate the full triangle slice once you know the number of them, then decode the triangles one by one into your triangle slice. That way you are keeping the number of file read calls low, and you don't have to allocate twice the memory that you need for the model?

Guillermo Estrada

Oct 2, 2013, 4:05:23 PM
to golan...@googlegroups.com, Guillermo Estrada
That not only sounds plausible, but looks like it is the way to go. Problem is... I already did!! This is the profiling of the loading of the XYZ Stanford Dragon (352MB STL binary file)

Parsing STL:
        3.8892224s (3.8892224s)  1097.35 MB     57 Allocs
File has 7219045 triangles.

Parsing STL (buffered):
        3.8702214s (7.7594438s)  1097.32 MB     41 Allocs
File has 7219045 triangles.

Numbers are almost identical! And I found no gain in the buffered reading! Something must be wrong, but I'm pretty sure I'm reading a triangle at a time with a 50 bytes buffer! Can someone help me with this? 

Code:

Seems like memory usage in the buffered version is the same, either that or the profiler is reporting it wrong... Can someone else runs tests?

Guillermo Estrada

Oct 2, 2013, 4:18:07 PM
to golan...@googlegroups.com, Guillermo Estrada
Sorry, wrong data posted; this is the real one, and it's even worse. It seems that the memory allocation alone when going triangle by triangle (I could read 1000 at a time with the buffer, but it would be kinda the same) slows the process and consumes more memory than loading the whole file with ioutil. I'm pretty sure my bufio version is not great, but I didn't expect it to be so bad. Every time you call encoding/binary Read on a Triangle, more memory is used, and as far as I know, the garbage collector won't work until the function returns, making the ioutil reader a lot better in almost all cases.

Parsing STL:
        4.0122294s (4.0122294s)  1097.35 MB     57 Allocs
File has 7219045 triangles.

Parsing STL (buffered):
        11.1656386s (15.1788681s)        2315.18 MB     58100620 Allocs
File has 7219045 triangles.


Anyway, I hope someone can come with a better solution. Help and code examples appreciated.
 

Guillermo Estrada

Oct 2, 2013, 4:53:29 PM
to golan...@googlegroups.com, Guillermo Estrada
Ok, getting some progress with the buffered version...

Parsing STL:
        3.8682212s (3.8692213s)  1097.35 MB     57 Allocs
File has 7219045 triangles.

Parsing STL (buffered) 100:
        566.0324ms (4.4352537s)   395.42 MB     652648 Allocs
File has 7219045 triangles.

Parsing STL (buffered) 1000:
        142.0081ms (4.5782618s)   377.46 MB     65311 Allocs
File has 7219045 triangles.

Parsing STL (buffered) 10000:
        102.0058ms (4.6812677s)   376.12 MB     6547 Allocs
File has 7219045 triangles

Parsing STL (buffered) 100000:
        110.0063ms (4.8582779s)   380.42 MB     699 Allocs
File has 7219045 triangles.

Now this is a lot of progress. Tests are run with 100, 1K, 10K, and 100K triangle (*50 bytes) buffers, assigning each chunk in one go. As we can see, the difference in memory usage between 1K and 10K is not much, but in this case the speed difference is. 376 MB of RAM used in parsing a 352MB file works for me. Also, 100ms for parsing the file is great!

I hope people trying to parse binary file formats in an optimal way find this thread. Anyway. comments are still welcome! Ty Gophers.

Erwin

Oct 2, 2013, 5:22:03 PM
to Guillermo Estrada, golang-nuts
excellent progress! what exactly did you change from the previous version?

Guillermo Estrada

Oct 2, 2013, 6:08:49 PM
to golan...@googlegroups.com, Guillermo Estrada
First of all I increased the buffer size, but checking the result files converted to STL ASCII format, only the first Triangle is fine. So I guess I'll have to find a way to read 1000 triangles and assign them to the slice in a single memory operation, because binary.Read() only plays well with fixed-size data. I guess I will have to create a secondary slice of that size and copy the data over to the Triangle slice (the copy built-in func maybe); that will surely put more overhead on memory, but at least won't send memory allocation calls through the roof.

I'll keep testing code, profiling and I'll post back the results with the code. If anyone knows a way to use a buffered read of binary file and do large chunks assignment to a slice would be GREAT!

Guillermo Estrada

Oct 2, 2013, 6:49:09 PM
to golan...@googlegroups.com, Guillermo Estrada
OK, I'm officially lost. I'm trying the copy-slice approach and it kinda works (and performs OK), but sometimes I get the same values and sometimes different ones. No errors at all. Dunno if padding has something to do with it. Here is the code if someone wants to try it out. Just grab any STL file out there.

Guillermo Estrada

Oct 2, 2013, 6:53:41 PM
to golan...@googlegroups.com, Guillermo Estrada
Here is some simple output of the program if it helps. I still dunno why the first read is always length 80! (I already read 84 bytes out of the reader.) I dunno if that is the problem, but as you can see, sometimes I get good results and sometimes bad. Maybe I have to set the buffer size of the reader? Ty.

>STL -file="GEAR.STL" -stepAnalysis
GEAR.STL
Parsing STL:
        5.0042ms (8.0069ms)         0.76 MB     56 Allocs
File has 4748 triangles.
true 0 4012 80 80
true 80 5000 100 100
true 180 5000 100 100
true 280 5000 100 100
true 380 5000 100 100
false 480 5000 100 100
true 580 5000 100 100
true 680 5000 100 100
false 780 5000 100 100
true 880 5000 100 100
false 980 5000 100 100
true 1080 5000 100 100
true 1180 5000 100 100
false 1280 5000 100 100
false 1380 5000 100 100
false 1480 5000 100 100
true 1580 5000 100 100
true 1680 5000 100 100
true 1780 5000 100 100
false 1880 5000 100 100
true 1980 5000 100 100
true 2080 5000 100 100
false 2180 5000 100 100
true 2280 5000 100 100
true 2380 5000 100 100
false 2480 5000 100 100
true 2580 5000 100 100
false 2680 5000 100 100
true 2780 5000 100 100
true 2880 5000 100 100
true 2980 5000 100 100
true 3080 5000 100 100
false 3180 5000 100 100
true 3280 5000 100 100
false 3380 5000 100 100
true 3480 5000 100 100
true 3580 5000 100 100
true 3680 5000 100 100
true 3780 5000 100 100
true 3880 5000 100 100
true 3980 5000 100 100
true 4080 5000 100 100
true 4180 5000 100 100
true 4280 5000 100 100
false 4380 5000 100 100
true 4480 5000 100 100
false 4580 5000 100 100
true 4680 3388 67 67
Parsing STL (buffered) 100:
        18.0157ms (26.0226ms)       0.61 MB     666 Allocs
File has 4748 triangles.
true 0 4012 80 80
false 80 50000 1000 1000
false 1080 50000 1000 1000
false 2080 50000 1000 1000
false 3080 50000 1000 1000
false 4080 33388 667 667
Parsing STL (buffered) 1000:
        5.0034ms (34.029ms)         0.69 MB     111 Allocs
File has 4748 triangles.
Saving ASCII:
        81.068ms (115.097ms)        0.28 MB     33263 Allocs
Saving ASCII:
        77.0633ms (193.1612ms)      0.27 MB     33254 Allocs
Saving ASCII:
        77.065ms (270.2262ms)       0.27 MB     33256 Allocs

Guillermo Estrada

Oct 2, 2013, 7:16:27 PM
to golan...@googlegroups.com, Guillermo Estrada
Ok, so I wish there was a more "elegant" way of doing this, and maybe someone can point it out for me. Here is what I added to the code.

  rs :=  io.ReadSeeker(file)
  rs.Seek(84,0)
  b = bufio.NewReaderSize(rs, 50*size)

1) Open the file and create a bufio.Reader on it to read the first 84 bytes, with an 84-byte buffer ([]byte slice).
2) Afterwards I had to create a new io.ReadSeeker on the same file, and seek to position 84 from the start of the file.
3) Create a new bufio.Reader with a buffer size of at least 50 bytes * TriangleBufferSize, to ensure I'll always read full triangles.
4) Profit...

Thing is, to optimize as much as possible I'm reading 1000 triangles at a time, but that buffer size is variable depending on the number of triangles in the file. But for that... I have to read the first 84 bytes of the file. And then create another 2 Readers on it? If I use the first bufio.Reader, I always start with a used buffer of X size and problems start. Is there a way to clean up this mess in an idiomatic, elegant, non-c-monkey-hack way?

Still, numbers aren't THAT great, but certainly are better than the alternatives. One thing that pops out is that 1000 Triangles seems to be the best choice in almost any situation. That checks out with something I read somewhere in this group about keeping buffers as close to but below 65,535 bytes (50*1000 is close) to optimize them, something to do with register size on CPU and RAM and stuff.

STL -file="Yoda_fixed.stl" -stepAnalysis
Yoda_fixed.stl
Parsing STL:
        220.1787ms (222.1846ms)    73.15 MB     60 Allocs
File has 480844 triangles.
Parsing STL (buffered) 100:
        225.2346ms (447.4192ms)    50.70 MB     43513 Allocs
File has 480844 triangles.
Parsing STL (buffered) 1000:
        213.1773ms (660.5965ms)    50.90 MB     4375 Allocs
File has 480844 triangles.
Parsing STL (buffered) 10000:
        216.1788ms (876.7753ms)    51.25 MB     483 Allocs
File has 480844 triangles.

Thanks everyone, code is still here: https://gist.github.com/phrozen/6799694
If someone with more experience with Readers and Buffers can give it a try and make observations on it, I will appreciate it a lot.

Kyle Lemons

Oct 2, 2013, 10:27:11 PM
to Guillermo Estrada, golang-nuts
a few things.  Sorry I'm so late to the game.  If I'm suggesting a repeat, let me know.

Just start out by reading from the file directly.  You don't know your buffer size yet, so hold off on the bufio.

use io.ReadFull

bytes.NewReader

however, you should be able to read these straight out of the file, you shouldn't need to read it in and then make a bytes.Reader

if you don't make a bufio.Reader above, you don't need to seek, and can just make the ReaderSize here.

You've got some crazy double-buffering going on here.  Get rid of "buf" and just read straight out of the bufio.Reader.  When it empties, it will fill itself up, and then you can binary.Read straight out of it.

I would be very surprised if this is helping you.  Read straight into the triangle you're currently decoding.  If your profile shows that you're spending a lot of time in reflect, unroll that binary.Read into a ReadFull and then the right sequence of calls to binary.LittleEndian.*



Jesse van den Kieboom

Oct 3, 2013, 5:00:18 AM
to golan...@googlegroups.com, Guillermo Estrada
I tried writing some code to parse binary STL files. The biggest bottleneck (on my PC) is reading from the harddisk. Reading it all at once seems to give the best performance I can measure. Assuming you are on a little endian machine, and that alignment causes the STL triangle struct to be 52 bytes, you can try this code:


It's unsafe, of course, but should be pretty fast:

63.628341ms (63.79001ms)   49.16 MB 15 Allocs

That's of course on my PC and the disk is hot, so can't really compare those numbers with yours (it's on the Yoda_fixed.stl model).

Guillermo Estrada

Oct 3, 2013, 10:40:57 AM
to golan...@googlegroups.com, Guillermo Estrada
Thanks Kyle! I was expecting your take on this; I have yet to implement those changes once I read about them again. Meanwhile I tried Jesse's code and... here are some results. It uses the same memory as the buffered version, but it's faster! (without Kyle's corrections yet...) I'll post more results once I correct the buffered one. Kudos!

>STL -file="xyzrgb_dragon.stl" -stepAnalysis
C:\Users\highurz1\Desktop\dragon_recon\xyzrgb_dragon.stl

Parsing STL:
        4.0142296s (4.0142296s)  1097.35 MB     57 Allocs
File has 7219045 triangles.

Parsing STL (buffered) 10000:
        3.8372194s (7.851449s)    740.84 MB     6561 Allocs
File has 7219045 triangles.

Parsing STL (unsafe):
        460.0264ms (8.3114754s)   736.37 MB     37 Allocs
File has 7219045 triangles.

roger peppe

Oct 3, 2013, 11:48:30 AM
to Jesse van den Kieboom, golang-nuts, Guillermo Estrada
That will give the wrong result if run on a big-endian machine, no?
Or if the Go compiler starts to use different alignment rules.

Please let's not go this way

Jesse van den Kieboom

Oct 3, 2013, 12:30:44 PM
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
On Thursday, October 3, 2013 5:48:30 PM UTC+2, rog wrote:
> That will give the wrong result if run on a big-endian machine, no?
> Or if the Go compiler starts to use different alignment rules.
>
> Please let's not go this way

The standard library does the same thing where performance matters. What I do when I really care enough to use unsafe is to test at runtime for endianness and alignment and choose the fast path only when possible. What's wrong with that?

Jan Mercl

Oct 3, 2013, 12:48:59 PM
to Jesse van den Kieboom, golang-nuts, Guillermo Estrada
On Thu, Oct 3, 2013 at 6:30 PM, Jesse van den Kieboom
<jess...@gmail.com> wrote:
> The standard library does the same thing where performance matters.

I'm not aware of any single place where any stdlib code is not
endianness agnostic. Can you please point out such a place? I'll file a
bug in that case.

-j

PS: Really good reading is
http://commandcenter.blogspot.de/2012/04/byte-order-fallacy.html

Jesse van den Kieboom

Oct 3, 2013, 12:55:01 PM
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
On Thursday, October 3, 2013 6:48:59 PM UTC+2, Jan Mercl wrote:
> On Thu, Oct 3, 2013 at 6:30 PM, Jesse van den Kieboom
> <jess...@gmail.com> wrote:
> > The standard library does the same thing where performance matters.
>
> I'm not aware of any single place where any stdlib code is not
> endianness agnostic. Can you please point out such a place? I'll file a
> bug in that case.

crypto/md5/gen.go

Jesse van den Kieboom

Oct 3, 2013, 1:03:22 PM
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
On Thursday, October 3, 2013 6:48:59 PM UTC+2, Jan Mercl wrote:
Sure, I read that already, and I don't disagree that you should normally not care about endianness. But when your code ends up processing a file 4 or 5 times faster and you _care_ about this performance, then in my opinion you can justify caring. 

Jan Mercl

Oct 3, 2013, 1:12:54 PM
to Jesse van den Kieboom, golang-nuts, Guillermo Estrada
On Thu, Oct 3, 2013 at 6:55 PM, Jesse van den Kieboom
<jess...@gmail.com> wrote:
> crypto/md5/gen.go

gen.go produces the same result regardless of endianness. The file this
tool generates, md5block.go, produces the same results regardless of
endianness.

Your code works for one specific endianness and fails for the other one.

-j

roger peppe

Oct 3, 2013, 1:19:33 PM
to Jan Mercl, Jesse van den Kieboom, golang-nuts, Guillermo Estrada
Not even mentioning potential alignment issues, which md5block.go
copes with ok. Doing this right is awkward.

It would be nice to have a version of the encoding/binary package
that could optionally generate custom Go code for marshalling
and unmarshalling given types. Hopefully the API would not
need to change at all.

Jesse van den Kieboom

Oct 3, 2013, 1:26:19 PM
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
I also said, and I quote "What I do when I really care enough to use unsafe is to test at runtime for endianness and alignment and choose the fast path only when possible"

Guillermo Estrada

Oct 3, 2013, 8:15:12 PM
to golan...@googlegroups.com, Guillermo Estrada
Hey Kyle, I'm trying to go through all these changes, but I'm not used to the Reader interface. Should I create a bytes.NewReader to feed into the io.Reader with a set size? One other thing: you mention the copy operation is useless and that I should read directly into the Triangle slice... thing is, how do I use a slice of my slice as a param to the binary.Read function? I already tried that, and only the first Triangle was copied correctly when I sent the pointer to the first element of the subslice.

If you have an example of code with a bytes.Reader to send into my io.ReaderSize with a set sized buffer, that would be awesome! (and I can get rid of buf, and the bytes.Buffer I'm used to) Then I can just figure out the best way to copy the data to the Triangle slice with binary.Read or else. Thing is, I'm concerned how parsing a file requires at least twice the memory even while buffering the read... I suppose that should not happen and I hope you changes help me fix that.

Kyle Lemons

Oct 4, 2013, 2:40:13 AM10/4/13
to Guillermo Estrada, golang-nuts
On Thu, Oct 3, 2013 at 5:15 PM, Guillermo Estrada <phro...@gmail.com> wrote:
Hey Kyle, I'm trying to go through all this changes, but I'm not used to the Reader Interface. Should I create a bytes.NewReader to feed into the io.Reader with a set size? One other thing, you mention the copy operation is useless and that I should read directly to the Triangle slice... thing is, how do I use a slice of my slice as a param in the binary.Read function. I already tried that, and only the first Triangle was copied correctly when I sent the pointer to the first element of the subslice.

To feed into which io.Reader?  The one for binary.Read?  I didn't look at the code again, but I think it should look like this:

f, err := os.Open(filename)
if err != nil { ... }
defer f.Close()

var header Header
if err := binary.Read(f, binary.LittleEndian, &header); err != nil { ... }

triangles := make([]Triangle, header.Triangles)

r := bufio.NewReaderSize(f, BufferTriangles*binary.Size(Triangle{}))
for i := range triangles {
  if err := binary.Read(f, binary.LittleEndian, &triangles[i]); err != nil { ... }
}
 
If you have an example of code with a bytes.Reader to send into my io.ReaderSize with a set sized buffer, that would be awesome! (and I can get rid of buf, and the bytes.Buffer I'm used to) Then I can just figure out the best way to copy the data to the Triangle slice with binary.Read or else. Thing is, I'm concerned how parsing a file requires at least twice the memory even while buffering the read... I suppose that should not happen and I hope you changes help me fix that.


On Wednesday, October 2, 2013 9:27:11 PM UTC-5, Kyle Lemons wrote:
a few things.  Sorry I'm so late to the game.  If I'm suggesting a repeat, let me know.

Just start out by reading from the file directly.  You don't know your buffer size yet, so hold off on the bufio.

use io.ReadFull

bytes.NewReader

however, you should be able to read these straight out of the file, you shouldn't need to read it in and then make a bytes.Reader

if you don't make a bufio.Reader above, you don't need to seek, and can just make the ReaderSize here.

You've got some crazy double-buffering going on here.  Get rid of "buf" and just read straight out of the bufio.Reader.  When it empties, it will fill itself up, and then you can binary.Read straight out of it.

I would be very surprised if this is helping you.  Read straight into the triangle you're currently decoding.  If your profile shows that you're spending a lot of time in reflect, unroll that binary.Read into a ReadFull and then the right sequence of calls to binary.LittleEndian.*

--

Kyle Lemons

Oct 4, 2013, 2:40:57 AM10/4/13
to Guillermo Estrada, golang-nuts
On Thu, Oct 3, 2013 at 11:40 PM, Kyle Lemons <kev...@google.com> wrote:
To feed into which io.Reader?  The one for binary.Read?  I didn't look at the code again, but I think it should look like this:

f, err := os.Open(filename)
if err != nil { ... }
defer f.Close()

var header Header
if err := binary.Read(f, binary.LittleEndian, &header); err != nil { ... }

triangles := make([]Triangle, header.Triangles)

r := bufio.NewReaderSize(f, BufferTriangles*binary.Size(Triangle{}))
for i := range triangles {
  if err := binary.Read(f, binary.LittleEndian, &triangles[i]); err != nil { ... }

Whoops, obviously that should be "r" instead of "f"

Jesse van den Kieboom

Oct 4, 2013, 5:52:01 AM10/4/13
to golan...@googlegroups.com, Guillermo Estrada
Just to put some measurements on my claims, I compared both reading unsafe and reading using standard reflect. The two functions I used are on http://play.golang.org/p/HlAFRnxQ-K

STL unsafe:
49.527777ms (49.69955ms)   25.05 MB 10 Allocs

STL reflect:
974.764166ms (974.900398ms)   99.02 MB 3371574 Allocs

This is on the Yoda model. Assuming I didn't make any mistakes in the implementation, using reflect is about 20 times slower on my machine and uses about 4 times more memory.
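
For anyone who wants to collect numbers in the same shape (time, MB allocated, allocation count), one possible harness is sketched below. How the figures above were actually measured is not shown in this thread, so this is an assumption: it uses runtime.ReadMemStats, and measure and sink are hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// measure runs fn once and reports the elapsed time, the bytes allocated
// on the heap and the number of heap allocations made while it ran.
func measure(fn func()) (elapsed time.Duration, bytes, allocs uint64) {
	var before, after runtime.MemStats
	runtime.GC() // start from a settled heap
	runtime.ReadMemStats(&before)
	start := time.Now()
	fn()
	elapsed = time.Since(start)
	runtime.ReadMemStats(&after)
	// TotalAlloc and Mallocs are cumulative counters, so the difference
	// covers only what happened during fn.
	return elapsed, after.TotalAlloc - before.TotalAlloc, after.Mallocs - before.Mallocs
}

// sink keeps the allocation below alive so it ends up on the heap.
var sink []byte

func main() {
	d, b, n := measure(func() { sink = make([]byte, 25<<20) })
	fmt.Printf("%v   %.2f MB %d Allocs\n", d, float64(b)/(1<<20), n)
}
```

Replace the closure in main with a call to the parser being compared.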

Guillermo Estrada

Oct 4, 2013, 10:22:20 AM10/4/13
to golan...@googlegroups.com, Guillermo Estrada
Hey Kyle, I implemented your ideas and I really like them; the code looks awesome, but after testing, it seems kinda suboptimal. I think that iterating over each triangle is pretty slow compared to copying chunks of triangle slices. Hope to hear your thoughts on this cause your method is hurting on RAM a LOT.

C:\Users\highurz1\Desktop\dragon_recon\xyzrgb_dragon.stl
Parsing STL:
        3.9622266s (3.9622266s)  1097.35 MB     57 Allocs
File has 7219045 triangles.

Parsing STL (buffered) Kyle:
        9.7515577s (13.7137843s)         1486.41 MB     50625332 Allocs
File has 7219045 triangles.

Parsing STL (buffered):
        3.986228s (17.7000123s)   752.17 MB     70691 Allocs
File has 7219045 triangles.

Parsing STL (unsafe):
        447.0256ms (18.148038s)   736.35 MB     26 Allocs
File has 7219045 triangles.

And here is the code I used (love the code though, clean and simple)...

func ParseBufferedSTLKyle(filename string) *Model {
  f, err := os.Open(filename)
  if err != nil { panic(err) }
  defer f.Close()

  m := new(Model)

  if err := binary.Read(f, binary.LittleEndian, &m.Header); err != nil { panic(err) }
  if err := binary.Read(f, binary.LittleEndian, &m.Length); err != nil { panic(err) }

  m.Triangles = make([]Triangle, m.Length)

  r := bufio.NewReaderSize(f, 1024*binary.Size(Triangle{}))

  for i := range m.Triangles {
    if err := binary.Read(r, binary.LittleEndian, &m.Triangles[i]); err != nil { panic(err) }
  }

  return m
}

Guillermo Estrada

Oct 4, 2013, 10:42:38 AM10/4/13
to golan...@googlegroups.com, Guillermo Estrada
Switched the main loop for something like this:

for i := 0; i < length; i = i + bufferSize {
  if length - i < bufferSize {
    bufferSize = length - i
  }
  if err := binary.Read(r, binary.LittleEndian, m.Triangles[i:i+bufferSize]); err != nil { panic(err) }
}

And got WAY better results!!!

Parsing STL (buffered) Kyle:
        3.9282246s (8.0194587s)   751.77 MB     63565 Allocs
File has 7219045 triangles.

Pretty much the same as my buffered version but using less RAM! I have to play with the buffer size to determine what is the best size depending on the file. But I don't see any other way to optimize this further.

Jesse's version with unsafe pkg is FAST! but uses a lot of RAM too. I would like to parse it slow and then do something like binary.Marshal and save the optimized format for future use.

Does anybody know if a Marshalling system for binary structs exists? I think notnot or someone mentioned it here earlier.

Jan Mercl

Oct 4, 2013, 12:38:30 PM10/4/13
to Jesse van den Kieboom, golang-nuts, Guillermo Estrada
On Fri, Oct 4, 2013 at 11:52 AM, Jesse van den Kieboom
<jess...@gmail.com> wrote:
> Just to put some measurements on my claims, I compared both reading unsafe
> and reading using standard reflect. The two functions I used are on
> http://play.golang.org/p/HlAFRnxQ-K
>
> STL unsafe:
> 49.527777ms (49.69955ms) 25.05 MB 10 Allocs
>
> STL reflect:
> 974.764166ms (974.900398ms) 99.02 MB 3371574 Allocs

Comparing the performance against a solution using reflect makes no
sense as you can write code running correctly regardless of endianness
without using anything whatsoever from the reflect package, which is
okay per se, but inevitably principally slow in many, if not most cases.

-j

Jesse van den Kieboom

Oct 4, 2013, 4:11:28 PM10/4/13
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada


On Friday, October 4, 2013 6:38:30 PM UTC+2, Jan Mercl wrote:
> On Fri, Oct 4, 2013 at 11:52 AM, Jesse van den Kieboom
> <jess...@gmail.com> wrote:
> > Just to put some measurements on my claims, I compared both reading unsafe
> > and reading using standard reflect. The two functions I used are on
> > http://play.golang.org/p/HlAFRnxQ-K
> >
> > STL unsafe:
> > 49.527777ms (49.69955ms)   25.05 MB 10 Allocs
> >
> > STL reflect:
> > 974.764166ms (974.900398ms)   99.02 MB 3371574 Allocs
>
> Comparing the performance against a solution using reflect makes no
> sense

Oh come on, of course it makes sense. Using reflect would be the cleanest way to implement this, and it's also been suggested in this topic. It is both valid and valuable to evaluate how much performance difference there is between the cleanest and fastest code to accomplish reading a binary file.

> as you can write code running correctly regardless of endianness
> without using anything whatsoever from the reflect package, which is
> okay per se, but inevitably principally slow in many, if not most cases.

Less talk, more numbers.

-j

Jan Mercl

Oct 4, 2013, 4:18:31 PM10/4/13
to Jesse van den Kieboom, golang-nuts, Guillermo Estrada
On Fri, Oct 4, 2013 at 10:11 PM, Jesse van den Kieboom
<jess...@gmail.com> wrote:
> Oh come on, of course it makes sense. Using reflect would be the cleanest
> way to implement this, and it's also been suggested in this topic.

Please define what you mean by "cleanest". Reflect is guaranteed to be
the slowest option in your use case. Some careless programmers abuse
both computer's resources and the reflect package - instead of writing
sometimes a bit more code.

Everyone is free to use what she wants, but if the topic is reaching
maximum possible speed/performance then using reflect is the wrong
answer in about every case.

That said, it's not reflect's fault when used where not appropriate.

-j

Kyle Lemons

Oct 4, 2013, 4:24:45 PM10/4/13
to Guillermo Estrada, golang-nuts
Write your own Read that doesn't call binary.Read.  It's not hard, just use io.ReadFull and binary.LittleEndian.*.

If you want more guidance on my solution, you'll have to post a profile of where it's spending time and memory.  I suspect the garbage is coming from the reflection and any internal buffering that binary.Read is doing.


Jesse van den Kieboom

Oct 4, 2013, 4:44:43 PM10/4/13
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
Fine, so I also implemented using just binary.LittleEndian.* and here are the results:

STL reflect:
981.812881ms (981.975427ms)   99.02 MB 3371579 Allocs
STL binary:
886.004605ms (1.868050023s)   25.02 MB 14 Allocs
STL unsafe:
42.3406ms (1.910452496s)   25.04 MB 9 Allocs

The memory usage is now down, so no more allocations (the second case), but the time for parsing hasn't decreased much. The unsafe method is still an order of magnitude faster...

Jesse van den Kieboom

Oct 4, 2013, 4:58:46 PM10/4/13
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
I realized that my implementation of the binary case was not completely comparable because it didn't use the same batching of reading from the io.Reader. I thought that since I'm using a buffered io, it wouldn't make a big difference, but it seems doing small io.ReadFull can hurt a lot. In any case, with the same batching the results are:

STL reflect:
983.321431ms (983.47328ms)   99.02 MB 3371579 Allocs
STL binary:
115.144052ms (1.098696975s)   25.10 MB 18 Allocs
STL unsafe:
39.730116ms (1.138491161s)   25.04 MB 9 Allocs

So no longer an order of magnitude of difference, but unsafe is still ~3 times faster.

Guillermo Estrada

Oct 4, 2013, 5:05:04 PM10/4/13
to golan...@googlegroups.com, Jesse van den Kieboom, Guillermo Estrada
Hey Jesse, I'm keeping an eye on that binary implementation, I will try what Kyle suggests and write my own Read func with ReadFull and binary.LittleEndian, and compare results.
Just one thing... are you checking the data being parsed? One of my implementations ran like hell with almost no memory overhead (much like unsafe), until I realized... not all memory was being copied correctly. I mean, less RAM, more speed, etc... but in the end you need the right data...

Paul Gruenbacher

Aug 6, 2015, 10:17:10 PM8/6/15
to golang-nuts, jess...@gmail.com, phro...@gmail.com
Hi Guillermo,
Came across this thread, looking to implement an stl binary/ascii converter. What implementation did you end up using?
Paul