gz compress a file in Elixir


Edward Stembler

Nov 9, 2015, 10:36:19 AM
to elixir-lang-talk
I'm converting a Ruby Map/Reduce process to Elixir.  One of the steps in my pipeline is to gz compress a TSV file.  Here's what my Ruby code looks like:

require 'zlib'

def gzip_file(input_filename, output_filename, chunk_size = 16 * 1024)
  fail ArgumentError, 'input_filename is nil' unless input_filename
  fail ArgumentError, 'output_filename is nil' unless output_filename

  Zlib::GzipWriter.open(output_filename) do |gz|
    File.open(input_filename) do |f|
      chunk = f.read(chunk_size)
      while chunk do
        gz.write chunk
        chunk = f.read(chunk_size)
      end
    end
    gz.close
  end
end

I started writing an Elixir version, until I realized there's no while loop.

def gzip_file(input_filename, output_filename, chunk_size \\ 16*1024) when is_binary(input_filename) and is_binary(output_filename) do
  unless input_filename, do: raise ArgumentError, message: "input_filename is nil"
  unless output_filename, do: raise ArgumentError, message: "output_filename is nil"

  File.stream(input_filename, [:read], chunk_size), fn(input_file) ->
    File.open(output_filename, [:write, :compressed]), fn(output_file) ->
      chunk = IO.read(input_file)

      # TODO: Finish
      # Oops, no while loop in Elixir

      File.close(output_file)
    end)
    File.close(input_file)
  end)
end

So, surely someone has had to gz compress a file in Elixir or Erlang before?  Does anyone know of any example code out there?  I couldn't find anything via Google...

Edward Stembler

Nov 9, 2015, 11:16:16 AM
to elixir-lang-talk
I was hoping to avoid shelling out; however, if no one knows of a way, I can always resort to:

def gzip_file(filename) when is_binary(filename) do
  {result, 0} = System.cmd("gzip", [filename])
  result
end

Peter Hamilton

Nov 9, 2015, 11:19:07 AM
to elixir-lang-talk
http://www.erlang.org/doc/man/zlib.html#gzip-1 is probably your friend here.
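For anyone landing here later, a minimal sketch of what calling that function from Elixir might look like. It only relies on :zlib.gzip/1 and :zlib.gunzip/1 from Erlang's zlib module; the sample data is made up for illustration.

```elixir
# Erlang modules are called from Elixir as atoms, so zlib becomes :zlib.
# The TSV-like sample data below is invented for this sketch.
original = String.duplicate("col_a\tcol_b\n", 1_000)

# gzip/1 returns a binary with gzip headers and checksum around the data.
compressed = :zlib.gzip(original)

# gunzip/1 reverses it.
restored = :zlib.gunzip(compressed)

IO.puts("original:   #{byte_size(original)} bytes")
IO.puts("compressed: #{byte_size(compressed)} bytes")
IO.puts("round trip ok? #{restored == original}")
```

Note that this works on data already in memory, so it suits small inputs rather than chunked processing.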

--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elixir-lang-talk/0323bef9-5361-4718-80c8-001cc3f163af%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

José Valim

Nov 9, 2015, 11:19:38 AM
to elixir-l...@googlegroups.com
There is a zlib module in Erlang:


Remember that double-quoted strings in Erlang are char lists, which are written with single quotes in Elixir.
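A small sketch of that difference, assuming nothing beyond the standard library. It uses String.to_charlist/1 and the ~c sigil rather than single quotes, since both spell the same char list value.

```elixir
# An Erlang "hello" is a list of code points; an Elixir "hello" is a binary.
charlist = String.to_charlist("hello")
binary = "hello"

IO.inspect(is_list(charlist), label: "charlist is a list")
IO.inspect(is_binary(binary), label: "Elixir string is a binary")
IO.inspect(charlist == ~c"hello", label: "same as the ~c sigil form")

# :zlib.gzip/1 accepts iodata, so either form can be compressed,
# and both decompress to the same binary.
from_charlist = :zlib.gunzip(:zlib.gzip(charlist))
from_binary = :zlib.gunzip(:zlib.gzip(binary))
IO.inspect(from_charlist == from_binary, label: "same payload")
```

The distinction matters more for Erlang functions that insist on one form or the other than it does for zlib.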

Here is an example of a module using it in Phoenix for assets digesting: https://github.com/phoenixframework/phoenix/blob/master/lib/phoenix/digester.ex#L92-L93



José Valim
Skype: jv.ptec
Founder and Director of R&D

--
You received this message because you are subscribed to the Google Groups "elixir-lang-talk" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-ta...@googlegroups.com.

Edward Stembler

Nov 9, 2015, 11:37:55 AM
to elixir-lang-talk, jose....@plataformatec.com.br
Okay thanks Peter & José!  I'll check that out...

Booker Bense

Nov 9, 2015, 1:01:25 PM
to elixir-lang-talk
You might find this post useful. 


There is a while of sorts in Elixir: just about anything you can write in a while loop,
you can write as an Enum.reduce function call. You just have to think about things
in a different way.
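As a rough sketch of that idea applied to the original question: the module name GzipSketch is hypothetical, and IO.binstream/2 is my choice for producing the chunks, not something proposed in the thread. The accumulator in Enum.reduce/3 carries the state a while loop would keep in a mutable variable.

```elixir
defmodule GzipSketch do
  # Hypothetical module for illustration. Returns the number of
  # uncompressed bytes that were read from the input file.
  def gzip_file(input_filename, output_filename, chunk_size \\ 16 * 1024)
      when is_binary(input_filename) and is_binary(output_filename) do
    File.open!(input_filename, [:read], fn input_file ->
      # The :compressed mode makes the file process write gzip data.
      File.open!(output_filename, [:write, :compressed], fn output_file ->
        input_file
        |> IO.binstream(chunk_size)
        |> Enum.reduce(0, fn chunk, bytes_read ->
          # Loop body: write the chunk, then update the "loop variable".
          :ok = IO.binwrite(output_file, chunk)
          bytes_read + byte_size(chunk)
        end)
      end)
    end)
  end
end

# Usage with a made-up sample file in the system temp directory.
input = Path.join(System.tmp_dir!(), "gzip_sketch_input.tsv")
File.write!(input, String.duplicate("a\tb\tc\n", 10_000))

bytes = GzipSketch.gzip_file(input, input <> ".gz")
IO.puts("read #{bytes} bytes")
```

File.open!/3 with a function closes each file when the function returns, so no explicit File.close calls are needed.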

As an aside, the bigger the chunks you do your I/O in, the faster it will run. I've found that for
anything involving file processing, if the file is a megabyte or less, you can just slurp it into a
single binary and process the binary. It's machine and code dependent of course, but
the threshold below which smaller chunks are faster is much higher than in
other languages, in my experience.
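A minimal sketch of the slurp approach, assuming the whole file fits comfortably in memory. The module name GzipSlurp is hypothetical; the calls are File.read!/1, :zlib.gzip/1 and File.write!/2.

```elixir
defmodule GzipSlurp do
  # Hypothetical helper: read everything, compress once, write once.
  def gzip_file(input_filename, output_filename)
      when is_binary(input_filename) and is_binary(output_filename) do
    compressed =
      input_filename
      |> File.read!()
      |> :zlib.gzip()

    File.write!(output_filename, compressed)
  end
end

# Usage with a made-up sample file in the system temp directory.
slurp_input = Path.join(System.tmp_dir!(), "gzip_slurp_input.tsv")
File.write!(slurp_input, String.duplicate("x\ty\n", 5_000))

:ok = GzipSlurp.gzip_file(slurp_input, slurp_input <> ".gz")
```

Memory use grows with the file size here, so this trades safety on large inputs for simplicity.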

If you do use File.stream!, be sure to set the chunk size to something large if you don't
need line-by-line processing.
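To illustrate the difference, a small sketch with a made-up three-line file and a deliberately tiny chunk size of 8 bytes so the output is easy to read. It uses the argument order from the era of this thread; as far as I know, Elixir 1.16 and later prefer the byte count before the modes list and warn about this form.

```elixir
path = Path.join(System.tmp_dir!(), "stream_chunks_demo.txt")
File.write!(path, "one\ntwo\nthree\n")

# Default: the stream yields one element per line.
lines = path |> File.stream!() |> Enum.to_list()

# With a byte count: fixed-size binary chunks (the last may be shorter).
# Thread-era argument order; newer Elixir prefers File.stream!(path, 8, []).
chunks = path |> File.stream!([], 8) |> Enum.to_list()

IO.inspect(lines, label: "by line")
IO.inspect(chunks, label: "by 8 bytes")
```

In real use the byte count would be much larger, for example tens of kilobytes or more.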

- Booker C. Bense