Thanks for your help!
Regards,
Zhengquan
Maximum compression with bzip2:
find /dir -type f -size +100M|tar cjvf /tmp/file.tar.bz2 -T -
To list content:
tar tjvf /tmp/file.tar.bz2
Then, if it looks OK, deleting the files is easy.
If you have GNU find, xargs and tar, you can do something like
find /srcdir -type f -size +100M -print0 | xargs -r0 \
tar --remove-files -cSjvf tarfile.tbz2
I suggest you try first without the "--remove-files" option.
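A dry run along the following lines lists what would be selected before anything is archived or removed. It is only a sketch: preview_large is a made-up helper name, and size suffixes such as k and M are extensions that not every find supports.

```shell
# preview_large DIR SIZE: list the regular files that find would select.
# DIR and SIZE are placeholders, for example /srcdir and +100M.
preview_large() {
    # "{} +" passes many names to one ls; with no matches ls is not run.
    find "$1" -type f -size "$2" -exec ls -l {} +
}

# Example call (not executed here): preview_large /srcdir +100M
```

If the listing looks right, the same find expression can be fed to tar.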
--
D.
I was not clear in the original post. Can I have a separate tarball
for each file?
Thanks!
Zhengquan
Can I have the individual files tarred to separate tarballs in the
same directory as the original ones?
Thanks!
Zhengquan
> I want to recursively tar the files that are bigger than 100M in a
> directory and delete the original files.
find /srcdir -type f -size +100M -print0 |
tar -cjf archive.tar.bz2 --remove-files --null -T -
This works with GNU tar; don't ask me about others.
> find /srcdir -type f -size +100M -print0 | xargs -r0 \
> tar --remove-files -cSjvf tarfile.tbz2
If the file list is too long for one command line,
xargs runs tar several times, and each run overwrites tarfile.tbz2
again and again; very smart :(
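The overwrite problem can be seen without touching any real files. In the sketch below, echo stands in for tar, and -n 2 is an artificial limit chosen only so that the split shows up with a handful of names; in real use xargs splits when the command line gets too long. count_tar_runs is a hypothetical helper name.

```shell
# count_tar_runs NAME...: print how many times xargs would start tar.
# Each output line of xargs is one would-be tar invocation, and every one
# of them would create tarfile.tbz2 from scratch.
count_tar_runs() {
    printf '%s\0' "$@" |
        xargs -r0 -n 2 echo tar -cSjvf tarfile.tbz2 |
        wc -l
}

count_tar_runs a b c d e
```

With five names and two names per run, this reports three runs, so only the files from the last run would survive in the archive.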
> Can I have the individual files tarred to separate tarballs?
Yes
find "$dir" -type f -size +100M -exec tar --remove-files -cjf {}.tar.bz2 {} \;
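For anyone who would rather not rely on --remove-files, a more cautious variant might look like the sketch below. It is an assumption-laden illustration, not the command above: tar_one is an invented name, and the original is removed only after the new archive has been listed successfully.

```shell
# tar_one FILE: archive FILE as FILE.tar.bz2 in the same directory, then
# remove FILE only if the new archive can be read back.
tar_one() {
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    # -C makes tar store the bare file name instead of the full path.
    tar -cjf "$1.tar.bz2" -C "$dir" "./$base" &&
        tar -tjf "$1.tar.bz2" >/dev/null &&
        rm -- "$1"
}
```

It handles one path per call, so it would be run once per file, in the same way as the -exec ... \; form.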
Correct. That's why I suggested that the OP try the command without
the --remove-files option first.
--
D.
> > > Hello,
> > > I want to recursively tar the files that are bigger than 100M in a
> > > directory and delete the original files.
> > > Can any one give me a hint how to do it?
>
> > If you have GNU find, xargs and tar, you can do something like
>
> > find /srcdir -type f -size +100M -print0 | xargs -r0 \
> > tar --remove-files -cSjvf tarfile.tbz2
>
> > I suggest you try first without the "--remove-files" option.
>
> Can I have the individual files tarred to separate tarballs in the
> same directory as the original ones?
Wait: do you want to make tar archives with one file each? Or do you just
want to compress the large files in your directory tree?
Matteo
Thank you, that is exactly what I had hoped.
Zhengquan
The simulation data files are huge, so I just want to compress them in
their original directory and delete the original huge files.
Zhengquan
OK, but in this case just compress the file (gzip) instead of creating
a tar file with one file and compressing it.
Matteo
> The simulation data files are huge, so I just want to compress them in
> their original directory and delete the original huge files.
>
> Zhengquan
Ah ok,
for that I think the best solution is something like:
find $dir -type f -size +100M -exec bzip2 {} \;
See:
bzip2 --help
Consider your priority: compression time or file size.
Thanks. Is gzip faster at compressing the files? I know
bzip2 produces smaller files.
Zhengquan
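gzip is usually the faster of the two and bzip2 usually gives the smaller output, but how much depends on the data, so it may be worth measuring on one representative file. The sketch below does that; compare_tools is a hypothetical name, and date +%s (whole seconds) is assumed to be available, which is not guaranteed on every system.

```shell
# compare_tools FILE: report compressed size and elapsed whole seconds for
# gzip and bzip2 at their default levels. FILE itself is left untouched
# because both tools read it on stdin and write to a pipe.
compare_tools() {
    for tool in gzip bzip2; do
        start=$(date +%s)
        size=$("$tool" -c < "$1" | wc -c)
        end=$(date +%s)
        echo "$tool: $size bytes in $((end - start)) s"
    done
}

# Example call (not executed here): compare_tools /srcdir/run1/output.dat
```

The path in the example call is a placeholder; on large simulation files the timing difference should be easy to see even at one-second resolution.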
Note that you don't need to execute a bzip2 for each file, as
bzip2 can compress more than one file at once.
You forgot to quote $dir.
Note that the above assumes that $dir doesn't start with "-".
Note that the M in +100M is not standard.
K=1024 M=$((1024*$K)) G=$((1024*$M))
find "$dir" -type f -size +"$((100*$M))c" -exec bzip2 {} +
With zsh:
bzip2 ./**/*(D.LM+100)
--
Stéphane
Well, it seems that there is no --remove-files mechanism in gzip and
bzip2, so I can only use find $dir -type f -size +100M -exec rm -i {}
\; to delete the original files. I added -i in case it deletes files I
want to keep.
Thanks.
Zhengquan
>> Ah ok,
>> for that I think the best solution is something like:
>> find $dir -type f -size +100M -exec bzip2 {} \;
>>
>> See:
>> bzip2 --help
>>
>> Consider your priority: compression time or file size.
>
> Well, it seems that there is no --remove-files mechanism in gzip and
> bzip2, so I can only use find $dir -type f -size +100M -exec rm -i {}
> \; to delete the original files. I added -i in case it deletes files I
> want to keep.
AFAIK, bzip2 operates directly on the file, so no need to remove it
afterwards (you would get an error if you tried).
$ ls foo
file1 file2
$ find foo -type f -exec bzip2 {} \;
$ ls foo
file1.bz2 file2.bz2
As Stéphane already mentioned, you can use + instead of \; to terminate the
bzip2 command, to get an increase in efficiency.
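If you would rather keep each original until its compressed copy has been checked, a cautious variant might look like this. It is a sketch and not something bzip2 requires, since bzip2 already removes the input only after compressing it successfully. compress_checked is an invented name; -k (keep the input) and -t (test integrity) are standard bzip2 options.

```shell
# compress_checked FILE: compress FILE to FILE.bz2, verify the result,
# and only then delete FILE.
compress_checked() {
    bzip2 -k -- "$1" &&            # writes FILE.bz2, leaves FILE in place
        bzip2 -t -- "$1.bz2" &&    # integrity check of the compressed copy
        rm -- "$1"
}
```

It works on one file per call, so it gives up the efficiency of the + form in exchange for the extra check.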
--
D.
I was just thinking of how to remove the original file. Now it seems
there is no need for that.
Thanks!
Zhengquan
Thanks, Stéphane. On my CentOS 4 machine there is no M option for the
size, so I use k. I was just using 100M as a rough divider between large
and small files. I wonder if there is any efficiency difference
between using 100M and a power of 2.
Zhengquan
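On the efficiency question: as far as find is concerned the threshold is just an integer that each file's size is compared against, so a round figure like 100M and a power of two should cost the same. What matters is getting the unit right. The conversions for a 100 MiB threshold can be sketched as follows (the variable names are only for illustration):

```shell
# Express a 100 MiB threshold in the units find understands.
mib=100
kib=$((mib * 1024))      # for: find -size +102400k   (k is an extension)
bytes=$((kib * 1024))    # for: find -size +104857600c (c means bytes)
blocks=$((bytes / 512))  # for: find -size +204800    (512-byte blocks)
echo "$kib k, $bytes c, $blocks blocks"
```

A plain number with no suffix is counted in 512-byte blocks, which is why the byte figure needs the c suffix.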
Hi,
no, bzip2 is fine; I was just wondering about the tar, not the
compression :-)
Matteo