[Nitro] question for unix wizards


George Moschovitis

Feb 4, 2008, 11:09:14 AM
to General discussion about Nitro
Dear devs,

2 small unix related questions.

does anyone know about:

- a quick and easy way to remove duplicate lines from a text file?
- a quick and easy way to decide if two image (binary) files are the same image/picture?

thanks in advance for your help!

-g.

--
http://gmosx.me.gr
http://joy.gr
http://cull.gr
http://nitroproject.org
http://phidz.com
http://joyerz.com

Jonathan Buch

Feb 4, 2008, 11:31:05 AM
to General discussion about Nitro
Hi,

> - a quick and easy way to remove duplicate lines from a text file?

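sort file | uniq > file.new
# should work (if the order of the lines in the file doesn't matter)

uniq file > file.new
# works if the duplicate lines are next to each other

# In both cases write to a new file: "> file" would empty the input
# file before it is read.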
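
To update the file in place, note that a `> file` redirection empties the file before any command reads it. A minimal sketch that avoids this, assuming a POSIX `sort`; the function name `dedupe_sorted` is made up for illustration. `sort -u` does the sort and the uniq step in one go, and `-o` is allowed to name one of the input files.

```shell
# dedupe_sorted FILE: hypothetical helper that replaces FILE with its
# sorted, de-duplicated contents.
dedupe_sorted() {
    # sort reads all of its input before writing, so the -o output file
    # may safely be the same as the input file.
    sort -u -o "$1" "$1"
}
```

Usage would be `dedupe_sorted file`. The resulting order depends on the current locale, and the original line order is lost.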

> - a quick and easy way to decide if two image (binary) files are the same
> image/picture?

findimagedupes: search for that name on Google. I'm using one of the
Perl scripts that go by it.

> thanks in advance for your help!

Hope that helps,

Jo


Eivind Eklund

Feb 4, 2008, 11:31:35 AM
to General discussion about Nitro
On Feb 4, 2008 5:09 PM, George Moschovitis <george.mo...@gmail.com> wrote:
> Dear devs,
>
> 2 small unix related questions.
>
> does anyone know about:
>
> - a quick and easy way to remove duplicate lines from a text file?

uniq < filename (assuming that the duplicate lines are consecutive;
otherwise, you need to sort first, or keep the lines in a hash in
either Ruby or Perl)
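
The hash idea can also be sketched without Ruby or Perl, using an awk associative array instead (a swapped-in tool, not what was suggested above). The function name `dedupe_keep_order` is made up for illustration. Unlike the sort-based route, this keeps the first occurrence of each line in its original position.

```shell
# dedupe_keep_order FILE: hypothetical helper that prints FILE with every
# repeated line dropped, keeping the first occurrence in place.
dedupe_keep_order() {
    # seen[] is an awk associative array keyed by the whole line. The
    # counter is 0 (false) the first time a line appears, so !seen[$0]++
    # is true only then, and awk's default action prints the line.
    awk '!seen[$0]++' "$1"
}
```

Usage would be `dedupe_keep_order file > file.new`. The array holds every distinct line in memory, so very large files with few duplicates may need the sort-based approach instead.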

> - a quick and easy way to decide if two image (binary) files are the same
> image/picture?

Do you mean if they are identical? cmp should do it. Another
approach is to get a checksum of the files (e.g. using "openssl rmd160
<filename>") and see if the checksum is the same.
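
Both checks can be wrapped for use in scripts. This is a minimal sketch, assuming a POSIX shell; the function names are made up for illustration, and the checksum variant uses POSIX `cksum` in place of the openssl digest mentioned above.

```shell
# same_bytes A B: hypothetical helper that succeeds (exit status 0) if the
# two files are byte-for-byte identical.
same_bytes() {
    # -s silences cmp; only the exit status is used.
    cmp -s "$1" "$2"
}

# same_checksum A B: hypothetical helper that compares the CRC and byte
# count reported by cksum. Reading from stdin keeps file names out of
# the output being compared.
same_checksum() {
    [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}
```

Usage would be `same_bytes a.png b.png && echo identical`. A CRC match is a strong hint rather than proof, so `cmp` is the safer check when both files are at hand. Neither test recognises the same picture saved in a different format or at a different size.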

Eivind.

Reid Thompson

Feb 4, 2008, 11:49:55 AM
to General discussion about Nitro

On Mon, 2008-02-04 at 18:09 +0200, George Moschovitis wrote:
> Dear devs,
>
> 2 small unix related questions.
>
> does anyone know about:
>
> - a quick and easy way to remove duplicate lines from a text file?
from http://www.student.northpark.edu/pemente/sed/sed1line.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'

> - a quick and easy way to decide if two image (binary) files are the
> same image/picture?

diff will also tell you whether the files differ; it prints nothing when they are identical:

rthompso@raker ~ $ diff 10000_Galaxies,_HST_Ultra_Deep.png Deathvalleysky_nps_big.png
Files 10000_Galaxies,_HST_Ultra_Deep.png and Deathvalleysky_nps_big.png differ
rthompso@raker ~ $ cp 10000_Galaxies,_HST_Ultra_Deep.png junk.png
rthompso@raker ~ $ diff 10000_Galaxies,_HST_Ultra_Deep.png junk.png

George Moschovitis

Feb 5, 2008, 3:35:42 AM
to Reid.T...@ateb.com, General discussion about Nitro
Thanks for the help, everyone!