I have released a version of Sawzall and numbered it 1.0.1.
It is a guerilla release, meaning that it is not approved by the official
maintainers. Still, I think it is worthwhile. This build of szl seems to work
correctly and it is easier to maintain than the 1.0 release.
The source tarball can be downloaded from the following URL:
The source repository can be cloned with:
The release notes are below (they are also in the file NEWS).
--Adrian.
[shuffling intensifies]
Possibly controversial things:
- szl is now linked using -rpath, by default.
This means that the installed szl binary contains within it a hint for the
loader about where to find the Sawzall shared libraries. This departs from
a popular convention of letting the system's shared library cache
(ldconfig) find the libraries. However, a bug in libtool means that the
cache is not updated on installation (at least on Debian); this use of
-rpath is a workaround. Give the --disable-rpath flag to ./configure if you
want to suppress this workaround.
- public/ header files have moved to google/szl/
This is for symmetry with protobuf, and it also fixes the problems in
issue 13.
Maintainability improvements:
- The platform is now assumed to be compatible with POSIX.1
(IEEE Std 1003.1-2001 or later). The compiler is also assumed to be
compatible with C++11 (ISO/IEC 14882:2011). These assumptions are
valid on all modern systems. They enable the elimination of many
unnecessary configuration tests.
- Using-directives (e.g. "using namespace std;") have been eliminated, and
replaced with specific using-declarations (e.g. "using std::string") in .cc
source files. Header files contain neither using-directives nor
using-declarations. The benefit is that the namespace in each .cc file is
predictable.
- Includes of standard C headers (such as <stdio.h>) have been replaced
by their C++ counterparts (such as <cstdio>) as far as possible.
(It's not possible where the code uses a name only found in a POSIX
extension to a C standard header, such as popen.) Using-declarations
are used to make the non-macro identifiers visible. The benefit of
this is that it reduces namespace pollution, making name clashes less
likely and making it easier to add or remove #include directives.
- The giant, opaque test-runner szl_regtest.sh has been replaced with a
wrapper testszl.sh. The individual test cases (*.testszl) are visible
to the "make check" test-runner, and they can now be run individually.
- Header files now have proper multiple-inclusion #define guards.
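
To illustrate the using-declaration and C++ header conventions described
above, here is a minimal sketch of what a .cc file looks like under them.
The file name and the FormatCount helper are hypothetical and do not come
from the Sawzall sources.

```cpp
// format_count.cc (hypothetical file, for illustration only)
#include <cstdio>   // C++ counterpart of <stdio.h>
#include <string>

// Specific using-declarations instead of "using namespace std;".
// Only these two names become visible without qualification.
using std::snprintf;
using std::string;

// Hypothetical helper: returns "<name>=<count>".
string FormatCount(const string& name, int count) {
  char buf[32];  // ample room for "=" plus any int in decimal
  snprintf(buf, sizeof buf, "=%d", count);
  return name + buf;
}
```

For example, FormatCount("rows", 3) yields "rows=3". A header file would
spell out std::string in full rather than carry a using-declaration.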
Bugs fixed:
- Static Initialization Order Fiasco.
The several pieces of code which run at dynamic initialization time, but
which assume incorrectly that some other dynamic initialization has
completed, have been corrected. The Google C++ Style Guide
prohibits the definition of static or global variables of class type
exactly to prevent this problem arising in the first place.
- Segmentation fault when processing "proto" with nonexistent file.
Now you get an error message instead.
- Memory corruption in native code generation, due to use of a pointer to a
stack variable after its lifetime ended. This is issue 33.
- Buffer overrun in protoc_plugin, apparently hidden by some
difficult-to-read C++ templatology. Fixed by Matt Proud.
- Command-line flag parsing used memcpy with overlapping arrays, leading
to memory corruption when compiled with -O2. Fixed by using memmove
instead.
- Issue 31: FrameIterator stack underflow
- Issue 30: Array overrun scanning opcodetab (memory corruption)
- "proto" declarations with more than one slash in the pathname were
mishandled. This was a strchr/strrchr confusion in the source.
- PATH_MAX was set to a guessed value in porting.h.
It's a dangerous constant to guess wrong, because some functions (such
as realpath) assume you have passed a buffer of at least PATH_MAX
bytes. The references to PATH_MAX have been removed from the
global headers, and realpath() is now wrapped in RealPath() where
the platform's correct value for PATH_MAX is used appropriately.
- szl --print_code used to use the fixed name "/tmp/funcode" for a
temporary file. Now it creates and deletes a temporary directory.
- Similarly, compilation of szl code with proto declarations no longer
uses fixed names in /tmp, but now uses fixed names in a temporary
directory.
- Hardcoded /usr/local paths.
The ./configure script tells us where protoc is and where protoc_gen_szl
will be installed, and the code now uses those paths instead of guessing.
- Output to "proc" tables did not close all file descriptors.
The previous code closed the first 100 file descriptors after forking.
The current code closes all of them.
- Issue 29: ICU4C API compatibility problems
- GPL'd code.
I have removed all the code which looked like it was covered by the
GNU General Public License, and replaced it with something compatible
with the Apache license. Specifically, this applies to the autoconf
macros ACX_PTHREAD and AX_CHECK_LINKER_FLAGS.
- The configure flags --enable-debug and --disable-debug now work properly.
The debug options are more aggressive when debug is enabled. The code
has been fixed so it now compiles OK with -Wall -Werror.
- Code to measure the virtual process size on Mac OS X has been included
in VirtualProcessSize, meaning that this functionality is no longer
broken on OS X. The tests that were disabled on OS X for this reason
are now running and passing.
- GOOGLE_DISALLOW_EVIL_CONSTRUCTORS missing.
Some class definitions in protoc_plugin attempted to use this obsolete
Google macro. Interestingly, the compiler accepted it as a valid
member function declaration.
- RandFloat and RandDouble made unwarranted assumptions about the
implementation of floats (namely, IEEE 754). They now run correctly on all
machines and efficiently on IEEE 754 machines.
- SZL_BYTE_ORDER has been eliminated, so that code which wants to embed
libszl no longer has to define it. The more-standard (less nonstandard?)
WORDS_BIGENDIAN/WORDS_LITTLEENDIAN pair is used instead, and the code is
now correct even if the endianness is not specified.
- Some tests were locale-sensitive, so LC_MESSAGES=C is now specified
for them.
- Static builds don't work. Actually, this hasn't been fixed. Instead,
static builds are disabled by default. Use the --enable-static flag
to ./configure if you want to waste time building them. To see them
fail, use --enable-static --disable-shared.
- An unexplained "assert(false);" blocked processing of recordio input.
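
The memcpy/memmove item in the list above can be sketched as follows. This
is not the actual szl flag-parsing code; RemoveArg is a hypothetical helper
that shows why shifting entries within one array needs memmove.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: deletes argv[index] by shifting the later entries
// down one slot, then returns the new argument count. The source and
// destination ranges overlap, so memmove is required here; memcpy has
// undefined behaviour on overlapping ranges.
int RemoveArg(int argc, const char** argv, int index) {
  if (index < 0 || index >= argc) return argc;  // nothing to remove
  const std::size_t tail = static_cast<std::size_t>(argc - index - 1);
  std::memmove(&argv[index], &argv[index + 1], tail * sizeof(argv[0]));
  argv[argc - 1] = nullptr;  // clear the now-unused last slot
  return argc - 1;
}
```

Called on the array {"szl", "--flag", "prog.szl"} with index 1, this
leaves {"szl", "prog.szl", nullptr} and returns 2.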
Optimization:
- The checks in google/szl/logging.h (including assert) still call abort()
as before, but now the abort() call is in inline code. This allows the
compiler to deduce that an asserted condition holds in the code following
the check, because it can see that the failing path ends in abort(),
which does not return.
- The gcc compiler flags -fwrapv -fno-strict-aliasing -fno-tree-vrp are
set, if the compiler supports them. Without those flags, the -O2 level
of optimization is unsafe for this code.
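
As a rough illustration of the inline abort() point above, here is a
minimal sketch of a check macro whose failure path is visible to the
compiler. SKETCH_CHECK and SafeDivide are hypothetical names, and this is
not the actual definition in google/szl/logging.h.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical check macro. The failing branch ends in std::abort(),
// which never returns, and the call is expanded inline at the use site,
// so the compiler can treat `cond` as true in the code that follows.
#define SKETCH_CHECK(cond)                                 \
  do {                                                     \
    if (!(cond)) {                                         \
      std::fprintf(stderr, "check failed: %s\n", #cond);   \
      std::abort();                                        \
    }                                                      \
  } while (0)

// Hypothetical caller: after the check, the optimizer may assume
// that b is nonzero.
int SafeDivide(int a, int b) {
  SKETCH_CHECK(b != 0);
  return a / b;
}
```

If the abort() call were hidden inside an out-of-line function not known
to be non-returning, the compiler could not draw that conclusion.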
Minor changes for convenience or style:
- A new MD5 implementation has been included, replacing the previous
dependency on OpenSSL.
- "assert" now refers to the assert macro in google/szl/logging.h, and
never to the one in assert.h/cassert. This means the assertion error
message will be written to the file named in the --logfile flag, if any.
- The dependency on "objdump" (a tool from the GNU binutils package) has
been relaxed. If there is no objdump, the sole test that relies on it
(elfgen_unittest) will be skipped, and "szl --native --print_code" will
fail at runtime (so don't do that).
- The doc/ and wiki/ directories have been imported, and broken links
have been fixed (for now).
- The installed header files hash_map.h and hash_set.h are now generated from
templates so that they contain the values found by ./configure's search for
a usable hash_map implementation. Previously, those files were not usable
unless szl's config.h had been #included first.
- SzlMutex has been replaced by the more-capable Mutex, imported from the
googletest project.
- "make check" now works in out-of-source-tree builds.
- porting.h uses int_least64_t, INT64_C etc. for better portability.
However, the bulk of the code still assumes sizeof(int64)==8 etc.
- Tests used to leave litter in /tmp, but now they use $SZL_TMP which is
a temporary directory made just for that test run.
- ./configure uses the config cache much more, so reconfiguration is much
faster if you use --config-cache.
- The output from autogen.sh is now checked in to source code control, so
that it's possible to go straight from "hg clone" to "./configure"
without needing to install a new toolchain.
- szl --help now alphabetically sorts the options.
- Several szl regression tests by Matt Proud are included, including a
much-needed test of recordio input.
- The UNUSED macro expands to __attribute__((__unused__)) if the compiler
supports it, or to nothing if the compiler doesn't. It helps suppress
compiler warnings about unused declarations.
- The Makefile rule for compiling sawzall.proto to .pb.h and .pb.cc files
is generalized to arbitrary .proto files.
- Pkg-config files for sawzall's libraries and headers are generated and
installed. This is superdupont's fix for issue 12.
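
The UNUSED macro item above can be sketched like this. The macro name
SKETCH_UNUSED and the __GNUC__ test are assumptions made for illustration;
the real macro may be driven by a ./configure check instead.

```cpp
// Assumption: compiler support is detected via __GNUC__ in this sketch.
#if defined(__GNUC__)
#define SKETCH_UNUSED __attribute__((__unused__))
#else
#define SKETCH_UNUSED  // expands to nothing on other compilers
#endif

// Hypothetical function: `reserved` and `debug_copy` are deliberately
// unused, and the macro suppresses the warnings they would otherwise
// trigger under -Wall -Wextra.
int Scale(int value, int reserved SKETCH_UNUSED) {
  int debug_copy SKETCH_UNUSED = value;
  return value * 10;
}
```

Here Scale(4, 0) returns 40, and the build stays warning-free whether or
not the attribute is available.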