[CS-FSLUG] [Linux4christians] Question: Working With Compressed Files in Brash

Chris Debenham chris at adebenham.com
Mon Apr 23 17:34:34 CDT 2012


On 24 April 2012 05:14, Don Parris <parrisdc at gmail.com> wrote:

> Hi all,
>
> In my "Brash" shell script, I want to compress the files into an archive
> to save space.  What I am curious about is the best approach to this.  I
> see at least 2 options:
> <> compress each file or set of files at the time the user tosses them
> into the trash (1 compressed archive for each time files are tossed)
> <> create an archive on the initial toss action, and append future files
> to the one archive
>
> From a "safety" standpoint, it seems best to create a new archive for each
> time the user tosses files into the trash:
> <> Don't put all the eggs in one basket - if one archive is somehow lost,
> the remaining archives should remain (apart from the user emptying the
> trash)
> <> Avoids one monolithic archive
>
> Ok, nothing prevents the files from being deleted when the user empties the
> trash.  They *will* be gone, recoverable only by forensics tools.  Even so,
> I lost a lot of data once when I created a monolithic archive and copied it
> to a CD-ROM.  When I later tried to open it, gzip basically just churned
> away - never actually opening my files.  I had help from various lists (I
> think this one included) and still never recovered my files.  I have not
> forgotten that.
>
> Anyway, just curious as to whether anyone else sees something I should
> think about.  I am considering combining tar and bzip.  Tar will let one
> view a list of files in an archive - something that will be useful when the
> user wants to see what is in the trash.
>
> Thoughts?
> Don


I would think that compressing each file individually is best/safest.
Adding files to a compressed archive is slow - and removing one file from
it is even worse (i.e. when restoring a single file from the trash).
If you use tar+bzip2 (or similar), then each operation requires
decompressing the entire archive first.
This is made even worse if you delete a few large binary files that don't
compress well (e.g. you delete 2 x 1G video files which take up 1.9G total
compressed.  When you go to undelete one of them, for a short while you
will have the original 1.9G compressed archive + a 2G tarball + the 1G
file being restored.  If you use pipes then the 2G tarball may sit in
memory/swap, but it will still use a fair bit of memory).
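Restoring just one of those files ends up looking something like this
(names purely illustrative) - the whole archive still has to be
decompressed and streamed through just to pull out the single file:

    bzcat trash.tar.bz2 | tar -xf - video1.avi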
Finally, due to the way tar works, reading the archive to find out what is
in it requires a full decompression of the file - which could be slow due
to IO, CPU and memory usage.
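i.e. even just listing the trash means something like this (archive name
hypothetical):

    tar -tjf trash.tar.bz2   # decompresses the entire archive
                             # just to print the file list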
If you keep to just using bzip2 (or 7z or gz or lz ;) ) on each individual
file then figuring out what is in the trash becomes a much easier "ls |
sed -e 's/\.bz2$//'" :)
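A very rough sketch of how the per-file approach could look (the trash
directory and file names here are just assumptions, not anything Brash
actually uses):

    TRASH=~/.brash-trash
    mkdir -p "$TRASH"

    # toss: compress each file individually into the trash directory
    bzip2 -c somefile > "$TRASH/somefile.bz2" && rm somefile

    # list: just strip the extension
    ls "$TRASH" | sed -e 's/\.bz2$//'

    # restore: decompress only the one file, nothing else is touched
    bunzip2 -c "$TRASH/somefile.bz2" > somefile && rm "$TRASH/somefile.bz2"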