seqls: A tool for listing file sequences

Justin Israel

Dec 13, 2014, 5:31:40 PM
to python_in...@googlegroups.com
Hey all,

This isn't directly Python related, but I thought I would share anyway. If anyone has been using the Python library fileseq, which I help maintain, to work with file sequence and frame range patterns, I've built a tool around this functionality (kind of similar to rvls).

gofileseq is a port of fileseq to the Go language. Included in this repo is a utility called seqls, which uses gofileseq to list files on the filesystem rolled up into sequence formats. So far it can do short and long listings, rolling up the file sizes and showing the latest mod time. I've cross-compiled it for win/linux/osx.
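
To make "rolled up into sequence formats" concrete, here is a rough Python sketch of the idea using only the standard library. It is not how fileseq or gofileseq are implemented. The "<base>.<frame>.<ext>" naming pattern, the helper names and the use of "#" as a frame placeholder in the output are all assumptions made for illustration, and padding width is not tracked.

```python
import re
from collections import defaultdict

# Assumed naming convention for this sketch: "<base>.<frame>.<ext>",
# e.g. "shot.0001.exr". Real tools handle many more layouts.
FRAME_RE = re.compile(r"^(?P<base>.*?)\.(?P<frame>\d+)\.(?P<ext>[^.]+)$")


def frames_to_range(frames):
    """Collapse a sorted, non-empty list of ints into e.g. '1-3,7'."""
    parts = []
    start = prev = frames[0]
    for f in frames[1:]:
        if f == prev + 1:
            prev = f
            continue
        parts.append(str(start) if start == prev else "%d-%d" % (start, prev))
        start = prev = f
    parts.append(str(start) if start == prev else "%d-%d" % (start, prev))
    return ",".join(parts)


def roll_up(names):
    """Group file names into sequence strings (hypothetical helper).

    Names that do not match the assumed pattern are passed through as-is.
    """
    groups = defaultdict(set)
    singles = []
    for name in names:
        m = FRAME_RE.match(name)
        if m:
            groups[(m.group("base"), m.group("ext"))].add(int(m.group("frame")))
        else:
            singles.append(name)
    rolled = [
        "%s.%s#.%s" % (base, frames_to_range(sorted(frames)), ext)
        for (base, ext), frames in groups.items()
    ]
    return sorted(rolled + singles)
```

Calling roll_up() on the names from os.listdir() gives one line per sequence instead of one line per frame, which is the kind of listing seqls produces.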

If you find it useful or have suggestions, please let me know! Also feel free to register issues for it on GitHub if needed. I develop on OSX and Linux, so I have only briefly tested it on Windows.

Enjoy!

Justin

Marcus Ottosson

Dec 15, 2014, 7:54:34 AM
to python_in...@googlegroups.com
I haven't used the Python library, but what would be the advantage(s) of a Go version compared with the Python version?


--
You received this message because you are subscribed to the Google Groups "Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python_inside_m...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/python_inside_maya/CAPGFgA2KpONX8cs9%3Df%2B4zzQx11yo1e7qV09bNTrT9zwudEfTRA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.



--
Marcus Ottosson
konstr...@gmail.com

Justin Israel

Dec 15, 2014, 1:35:11 PM
to python_in...@googlegroups.com

The Go version was getting about a 25x speedup when using FindSequencesOnDisk (which makes use of all of the other facilities). The parsing and string building of the paths, frame ranges, and results are faster. I reuse buffers instead of constantly allocating new strings. Also, the find operation is concurrent.
As an example, my co-worker was using the Python version to drive a graphical file sequence browser. He saw a very noticeable speedup in the interface when he switched to the Go version.


Marcus Ottosson

Dec 16, 2014, 2:57:20 AM
to python_in...@googlegroups.com
Sounds good, no benchmarks?



Justin Israel

Dec 16, 2014, 3:18:08 AM
to python_in...@googlegroups.com

I can work up some benchmarks against the python fileseq.


Justin Israel

Dec 16, 2014, 5:32:30 AM
to python_in...@googlegroups.com
Here are some benchmarks:

I don't have a case on hand to reproduce where my co-worker was getting a 25x speedup. I will have to see if I can find that. It was more a case of "What were you getting before? OK, what are you getting now?" :-)
These were done on a local SSD drive, whereas his tests were across an NFS-mounted filesystem.

Also, the gofileseq library itself isn't yet using any concurrency for its logic. Only the seqls tool uses concurrency for walking directories, and processing lists of frames into sequences. I will probably do another pass on gofileseq to get the FindSequencesOnDisk function processing files in parallel. Haven't done much of any profiling yet.
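
For anyone wondering what "concurrency for walking directories" buys you, here is a small Python sketch of the general idea: issue several directory listings at once so that slow I/O (NFS, for example) overlaps. This is not the seqls code. The function name and the worker count are made up for illustration, and it uses threads from the standard library rather than goroutines.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_dirs_concurrently(paths, workers=8):
    """List several directories in parallel (hypothetical helper).

    Returns a dict mapping each path to its sorted entry names.
    Threads are a reasonable fit here because the work is I/O bound.
    """
    paths = list(paths)  # we iterate over the paths twice below
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with paths.
        listings = pool.map(lambda p: sorted(os.listdir(p)), paths)
        return dict(zip(paths, listings))
```

Whether this helps at all depends on the filesystem. On a local SSD the listings may already be fast enough that the thread overhead dominates.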



Anthony Tan

Dec 16, 2014, 7:26:32 AM
to python_in...@googlegroups.com
Wildish speculation on my part: I wouldn't be surprised if the Go version had a better equivalent implementation of the os.listdir() call giving you a lot of that speedup - esp. if you're using NFS/CIFS - which would make the Go version better-er even without implementing any swanky concurrency stuff. Specifically, I'm thinking about PEP-471. https://www.python.org/dev/peps/pep-0471/

Some notes that show a rewritten os.walk is going to be a tad faster (http://www.gossamer-threads.com/lists/python/dev/1153484) so 25x doesn't sound immediately improbable either. (Would be curious to see what the speed up is from python 2.7 vs python 3.5.. maybe I'll run a benchmark post Christmas for my own curiosity..)
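
For reference, the difference PEP 471 is about can be sketched like this. os.scandir() landed in the standard library in Python 3.5 (the with-statement form needs 3.6+). The two function names are just illustrative, and how much faster the second one is will depend on the platform and filesystem, so no numbers are claimed here.

```python
import os
import stat


def list_files_listdir(path):
    """Classic approach: one explicit lstat() call per directory entry."""
    names = []
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        if stat.S_ISREG(st.st_mode):
            names.append(name)
    return sorted(names)


def list_files_scandir(path):
    """PEP 471 approach: each DirEntry can often answer is_file() from
    data returned with the directory listing, avoiding an extra stat call.
    """
    with os.scandir(path) as entries:
        return sorted(
            e.name for e in entries if e.is_file(follow_symlinks=False)
        )
```

Both return the same names. The second simply gives the OS a chance to skip the per-file stat, which is where the savings on network filesystems would come from.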

Marcus Ottosson

Dec 16, 2014, 9:13:42 AM
to python_in...@googlegroups.com

Specifically, i’m thinking about PEP-471. https://www.python.org/dev/peps/pep-0471/

That’s some really interesting stuff! Looks like it’s available for 2.7 too.




Justin Israel

Dec 16, 2014, 1:28:30 PM
to python_in...@googlegroups.com

If you are curious, the source for Go's walk is here:
https://golang.org/src/path/filepath/path.go#L389

It isn't quite what the Python PEP describes, as it uses an lstat in combination with readdirnames. It also isn't really the fastest of walk implementations, as some have pointed out. I'm actually using someone's modified implementation which does the walk concurrently (although it doesn't change the way it stats):
https://github.com/MichaelTJones/walk/blob/master/walk.go#L140

I was actually thinking about this last night, now that you mention it, because my walk process in seqls doesn't need the stats for all the files. I could probably get improved results if I just did a recursive Readdirnames, as the names are all I am interested in on each path.

Also, in my benchmarks, just to mention, there wasn't much walking going on except for the "all" test. The tests were all doing a readdir for the listing of a given dir, to process the files into sequences.


Justin Israel

Dec 16, 2014, 1:52:41 PM
to python_in...@googlegroups.com

Oh, whoops. You were talking about the listdir impl in Python. The equivalent is readdir, and it's here:
https://golang.org/src/os/file_unix.go#L154

They are reading the names of the location and doing an lstat on each.



Anthony Tan

Dec 16, 2014, 5:13:11 PM
to python_in...@googlegroups.com
Hm, interesting, I'll definitely have to have a poke when I get some time - been looking for an excuse to do more Go but I can't justify any of it here at work. Homework! :D

Justin Israel

Dec 16, 2014, 6:45:00 PM
to python_in...@googlegroups.com

Seems like if I wanted to do what is proposed in that PEP, I would have to drop down to the platform-specific syscalls. Although at least it is already working like a "generator" and streaming the listing on each subsequent call.


Justin Israel

Dec 17, 2014, 12:29:28 AM
to python_in...@googlegroups.com
This was an interesting tidbit I got from the Go mailing list, when I asked about the availability of the readdir system call directly in a generalized sense:


The only fields in the dirent structure that are mandated by POSIX.1
are: d_name[], of unspecified size, with at most NAME_MAX characters
preceding the terminating null byte ('\0'); and (as an XSI extension)
d_ino.  The other fields are unstandardized, and not present on all
systems; see NOTES below for some further details.
...
Other than Linux, the d_type field is available mainly only on BSD
systems.  This field makes it possible to avoid the expense of
calling lstat(2) if further actions depend on the type of the file.

So I suppose the reason that there isn't a fully abstracted, non-platform-specific approach is that it could only reliably cover windows/linux/bsd, with no similar offering for any other supported platform. I assume Python has to deal with that same situation in that PEP. I would have to create build constraints that use this faster path for windows/linux/bsd, and fall back on the higher-level Readdir/walk for anything else.




Justin Israel

Jun 10, 2015, 7:30:59 PM
to python_in...@googlegroups.com
Hi All,

If anyone has found my seqls tool useful for listing file sequences (and files), I have some new updates to announce since 0.9.1 was first mentioned.


Changes:

1.0.0
cmd/seqls - Large refactor; added support for passing sequence patterns (path/files.#.jpg)
cmd/seqls - Skip hidden directories (in addition to hidden files), unless -a flag is used
Docstring corrections

0.9.9
Add support for reverse frame ranges (10-1)
Improve the logic for parsing non-sequence single file paths
Improve the logic for parsing frame numbers
Add options to FindSequencesOnDisk for showing hidden files
Fix various parsing conditions that could crash FindSequencesOnDisk
seqls: Expose options for all / hidden files in results

0.9.2
seqls: Buffer stdout
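
To illustrate what supporting reverse frame ranges such as "10-1" (the 0.9.9 entry above) means, here is a deliberately simplified Python sketch. It is not the gofileseq parser. It assumes non-negative frame numbers and only handles comma-separated "start-end" or single-frame parts, and the function name is made up.

```python
def expand_range(text):
    """Expand a simple frame range string into a list of ints.

    Handles '5', '1-3', reverse ranges like '10-1', and comma-separated
    mixes of those. Malformed parts raise ValueError from int().
    """
    frames = []
    for part in text.split(","):
        start_s, sep, end_s = part.strip().partition("-")
        start = int(start_s)
        end = int(end_s) if sep else start
        # Count downwards when the end is before the start.
        step = 1 if end >= start else -1
        frames.extend(range(start, end + step, step))
    return frames
```

With this, expand_range("3-1") walks the frames backwards instead of yielding nothing, which is the behaviour a naive range(start, end + 1) would give.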

Enjoy!
Justin