[tz] Inappropriate project direction

Bryan Smith b.j.smith at ieee.org
Sun Feb 11 10:15:53 UTC 2018

Not weighing in on the overall debate, just adding some technical bits ...

Steffen Nurpmeso <steffen at sdaoden.eu> wrote:
> Stephen Colebourne <scolebourne at joda.org> wrote:
>  |3) Move to a niche archive format
>  |
>  |The main archive format was changed from the well-known and widely
>  |supported gz to the niche lz format. There was and is no justification
>  |for using a niche format on a project as important as this one.
>  |
>  |- downstream consumers need to find and use the niche format
>  |- Windows does not have proper support for the niche format
> Yes i do not like that either, given that xz is far more common
> and zstd is seeing more usage over time.  It may be that it has
> archive format advantages over the former.

Ignoring the compressor and/or archiver aspect**, and ignoring whether
adding a second format alongside the LZ77-based compressor (gzip) was
justified ...

If one is going to add an LZMA-based container format (e.g., 7z, xz,
lzip), and portability and longevity are paramount, then lzip is
probably best.  Although xz finally added some POSIX metadata that 7z
lacked, it's still not something I'd recommend when exchanging data
between various POSIX (let alone non-POSIX) systems.

E.g., I only use xz with tar when I'm sending data to be used on the
same POSIX architecture/platform, and not for long-term retention.
lzip is really the first real attempt -- I'm not saying it is one yet,
but it is the first real attempt -- at a universal POSIX archive
standard.

Which brings me to ...

> That is a GNU project decision and Paul Eggert is one of _the_
> GNU contributors.

There is still a heavy copyleft (e.g., GPL) v. non-copyleft FLOSS
debate that rages on, with various people putting various values --
from nothing to everything -- on that.  lzip -- or, more
problematically, its lzlib -- is GPLv2, not even LGPL, which has some
asterisks for commercial developers.  I don't know the state of public
domain lzip (pdlzip), but it seems to be good enough for any
decompression (and most general compression).

Lastly ...

> Also, .gz is supported, it is just the big all-in-one ball which uses that format.

I don't see projects dropping LZ77-based compressors anytime soon.  If
they want to add a 2nd format option, then that's going to happen, and
debates will rage.

However, I also expect the GNU Project or pro-copyleft maintainers to
push a GNU solution over others.  So lzip isn't surprising.

- bjs

**P.S.  The other bit is that lzip -- like everything from legacy
LZ77-based PKZip to LZMA2 7-Zip -- actually focuses on compressing a
file into an archive, instead of being a streaming compressor for
either files or an archive.  Yes, it's not very UNIX-like to do that,
but there are too many advantages with it to ignore.  It's a common
approach on Windows, since Windows never provided a native archiving
(w/o compression) approach, unlike UNIX systems.

E.g., years ago, when linear access (tape) was common, I used to use
afio for per-file compression, rather than a compressed cpio stream
(of files in the archive), to back up to tape.  It offered too many
advantages,
all while being cpio compatible (if cpio was used to extract, then it
was a stream of compressed files).  Several of these advantages still
exist, like better handling of multi-volume archives (no different
than in the days of tape).
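
To make the per-file versus whole-stream distinction concrete, here is
a minimal Python sketch.  It is not how afio is implemented (afio
writes cpio-format archives); tar and gzip from the Python standard
library stand in for cpio and the external compressor, and the
function names are made up for illustration.

```python
import gzip
import io
import tarfile


def per_file_archive(files):
    """Compress each member on its own, then store it in a plain tar.

    Illustrative stand-in for the afio-style layout: the archive itself
    is uncompressed, and every member is an independent .gz blob.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            blob = gzip.compress(data)
            info = tarfile.TarInfo(name + ".gz")
            info.size = len(blob)
            tar.addfile(info, io.BytesIO(blob))
    return buf.getvalue()


def whole_stream_archive(files):
    """Build a tar and compress the whole stream (the usual tar.gz)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_member(archive, name):
    """Fetch one member from the per-file archive.

    The tar layer is read without any decompression ("r:"); only the
    requested member is passed through gzip.
    """
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        return gzip.decompress(tar.extractfile(name + ".gz").read())
```

In the per-file layout a plain, uncompressed tar reader still sees
every member boundary and gets back a set of .gz files, which mirrors
the "stream of compressed files" behaviour described above.  In the
whole-stream layout, nothing can be listed or extracted until the
single compressed stream has been decoded up to that point.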

I was hopeful the "Austin Group" of the IEEE POSIX + X/Open SUS
lineage/workgroup would have addressed this for the 21st century, or
at least defined an archiver that could do compression on a per-file
basis, even if the compressor choice was still external.  But instead,
we got 'pax', which just 'punted' (I'll use the popular American
reference) on the problem, which is why 'tar' is still popular.

Bryan J Smith  -  http://www.linkedin.com/in/bjsmith
E-mail:  b.j.smith at ieee.org  or  me at bjsmith.me
