See
http://www.gnu.org/software/pspp/pspp-dev/html_node/System-File-Format.html#System-File-Format
and
http://www.wotsit.org/download.asp?f=spssdata&sc=288712674
--
I love deadlines.
I love the whooshing noise they make as they go by.
--Douglas Adams
Please note. These are not official SPSS format definitions. The
first document does not include recent changes to the format. Even
some features introduced in V14 seem not to be mentioned. The second
link points to a file with a date of 1998, so it is utterly obsolete.
In addition, you should be aware that the format is rather complex and
is easy to get wrong.
While you may not need every newer feature, some of them, such as very
long strings and Unicode, matter in many cases.
It would probably be easier to wrap the I/O DLL for your favorite
language (at least on Windows) than to write your own creator.
Regards,
Jon Peck
> Please note. These are not official SPSS format definitions. The
> first document does not include recent changes to the format. Even
> some features introduced in V14 seem not to be mentioned. The second
> link points to a file with a date of 1998, so it is utterly obsolete.
> In addition, you should be aware that the format is rather complex and
> is easy to get wrong.
Jon, it's within your power to release a complete specification.
You have not chosen to do so.
--
Ben Pfaff
http://benpfaff.org
Do you work for SPSS? I am curious as to why the file specification
is not made available. I understand that the file structure is
complex, but there is more than one kind of complexity. I am working
in a LAMP environment. Creating and testing a DLL wrapper, with moving
parts spread across several languages, doesn't seem simpler than
generating a .sav file from scratch. I'd love to get the
official SPSS spec if it is available. I am, after all, using it to
promote use of your program.
Thanks,
-- Steve Bearman
Ben,
Thanks so much for your speedy reply. These resources are just what I
needed, and I would not have known to look where you pointed me!
-- Steve Bearman
> Thanks so much for your speedy reply. These resources are just what I
> needed, and I would not have known to look where you pointed me!
You are welcome.
You mentioned elsewhere in the thread that you are working in a
LAMP environment. If the P in your LAMP stands for Perl, then
you might take a look at the Perl module for working with SPSS
.sav files that is included in the Git repository for GNU PSPP.
With this module, it is pretty easy to work with .sav files. For
example, you can build a very simple-minded web service for
dumping out .sav files as comma-separated fields with the
following Perl program (which can be tried out at
http://pspp.benpfaff.org/conversion.html):
#! /usr/bin/perl
use strict;
use warnings;
use CGI;
use PSPP;
use Digest::MD5;

$CGI::POST_MAX = 10 * 1024 * 1024;    # max 10 MB posts

my $q = CGI->new;
if ($q->param ('file')) {
    # Name the saved copy after the MD5 digest of the uploaded data.
    my $temp = $q->upload ('file');
    binmode ($temp);
    my $ctx = Digest::MD5->new;
    $ctx->addfile ($temp);
    my $digest = $ctx->hexdigest;
    my $file = "/home/www-pspp/input/$digest.sav";

    # Rewind the upload and copy it to disk in binary mode.
    seek ($temp, 0, 0);
    open (my $out, '>', $file) or die "cannot create $file: $!";
    binmode ($out);
    my $s;
    while (sysread ($temp, $s, 4096)) {
        syswrite ($out, $s);
    }
    close ($out) or die "cannot close $file: $!";

    # Read the saved file back and print each case as comma-separated fields.
    my $reader = PSPP::Reader->open ($file);
    my $dict = $reader->get_dict ();
    print "Content-type: text/plain\r\n\r\n";
    while (my @case = $reader->get_next_case ()) {
        my @values;
        for (my $i = 0; $i < $dict->get_var_cnt (); $i++) {
            push (@values, PSPP::format_value ($case[$i], $dict->get_var ($i)));
        }
        print join (',', @values), "\n";
    }
}
The Perl module is pretty experimental, so it quite likely has bugs
that could cause problems in production use. We do
accept bug reports, though.
Information about the PSPP Git repository is available at:
http://savannah.gnu.org/git/?group=pspp
--
"GNU does not eliminate all the world's problems,
only some of them."
--Richard Stallman
This looks cool. I am in need of a .sav file writer, not a reader. I
haven't hunted through the code in the repository yet. Do you have
methods defined for constructing .sav (or equivalent) files?
> This looks cool. I am in need of a .sav file writer, not a reader. I
> haven't hunted through the code in the repository yet. Do you have
> methods defined for constructing .sav (or equivalent) files?
Yes, the PSPP Perl module (and the C code that it is based on)
can also write .sav files.
The C code can also read and write .por files, but the Perl
module hooks for that haven't been written.
--
"Premature optimization is the root of all evil."
--D. E. Knuth, "Structured Programming with go to Statements"
> This looks cool. I am in need of a .sav file writer, not a reader. I
> haven't hunted through the code in the repository yet. Do you have
> methods defined for constructing .sav (or equivalent) files?
The way EpiData (www.epidata.dk) exports to .sav files is by writing
two text files. The first contains the data, and the second contains
the SPSS syntax to import the data (plus assign variable and value
labels, etc). Depending on your needs, that approach might be easier
than trying to write a .sav file directly.
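A minimal sketch of that two-file approach, in Python. This is not EpiData's actual code; the function name, file names, and variable definitions are made up for illustration. It assumes numeric data only (no quoting of string values) and that the data file will sit at the path named in the syntax when the user runs it.

```python
def spss_quote(text):
    """Quote a string for SPSS syntax, doubling embedded apostrophes."""
    return "'" + str(text).replace("'", "''") + "'"


def build_export(data_path, sav_path, variables, rows, var_labels, value_labels):
    """Return (data_text, syntax_text) for a two-file SPSS export.

    variables:    list of (name, format) pairs, e.g. ("id", "F8.0")
    rows:         iterable of cases; assumed to hold numeric values only
    var_labels:   {variable name: label}
    value_labels: {variable name: {value: label}}
    """
    # Data file: one case per line, values separated by commas.
    data_text = "".join(",".join(str(v) for v in row) + "\n" for row in rows)

    # Syntax file: read the data, attach labels, save a .sav file.
    # Continuation lines are indented so the command spans several lines.
    syntax = [f'DATA LIST FILE={spss_quote(data_path)} LIST (",")']
    specs = [f"{name} ({fmt})" for name, fmt in variables]
    syntax.append("  / " + "\n    ".join(specs) + ".")
    for name, label in var_labels.items():
        syntax.append(f"VARIABLE LABELS {name} {spss_quote(label)}.")
    for name, labels in value_labels.items():
        pairs = " ".join(f"{value} {spss_quote(label)}"
                         for value, label in labels.items())
        syntax.append(f"VALUE LABELS {name} {pairs}.")
    syntax.append(f"SAVE OUTFILE={spss_quote(sav_path)}.")
    return data_text, "\n".join(syntax) + "\n"


if __name__ == "__main__":
    data, syntax = build_export(
        "survey.dat", "survey.sav",
        [("id", "F8.0"), ("sex", "F1.0")],
        [(1, 2), (2, 1)],
        {"sex": "Respondent's sex"},
        {"sex": {1: "Male", 2: "Female"}},
    )
    print(data)
    print(syntax)
```

The two returned strings would be written to the data file and the syntax file respectively; string variables would need a quoting rule that this sketch deliberately leaves out.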
--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/
"When all else fails, RTFM."
Thanks for the suggestion. Currently, like EpiData, I give my users a
data file and a syntax file along with instructions for where to save
these on their computers and how to assemble them into an SPSS .sav
file. This is more complicated than I would like for my users. If
SPSS had a way to include arbitrary numbers of variables and arbitrary
numbers of cases in the DATA section of a syntax file, it might be
easy enough for users to work with this single file. As it is, I want
to be more user-friendly for non-computer-savvy users, which, of
course, is most of them.
> Thanks for the suggestion. Currently, like EpiData, I give my users a
> data file and a syntax file along with instructions for where to save
> these on their computers and how to assemble them into an SPSS .sav
> file. This is more complicated than I would like for my users. If
> SPSS had a way to include arbitrary numbers of variables and arbitrary
> numbers of cases in the DATA section of a syntax file, it might be
> easy enough for users to work with this single file. As it is, I want
> to be more user-friendly for non-computer-savvy users, which, of
> course, is most of them.
You can actually do that, I think. BEGIN DATA...END DATA will
allow you to include data along with syntax in a single file. I
think that SPSS has a line length limitation, but you should be
able to get around that by specifying more than one line per
case.
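A rough sketch of how such a single file could be generated, in Python. It uses DATA LIST FREE, where values are read in sequence, so one case can continue across several lines. The function names and the max_width default are made up for illustration, not a documented SPSS limit; check the actual line length limit for your SPSS version. Escaping of quotes inside string values is not assumed, so the sketch simply picks a quote character that does not occur in the value.

```python
def format_value(value):
    """Render one value for free-format inline data."""
    if isinstance(value, str):
        # Use whichever quote character does not occur in the value.
        if '"' not in value:
            return f'"{value}"'
        if "'" not in value:
            return f"'{value}'"
        raise ValueError(f"value contains both quote types: {value!r}")
    return str(value)


def build_inline_syntax(variables, cases, max_width=72):
    """Return SPSS syntax with the data embedded between BEGIN DATA and END DATA.

    variables: non-empty list of (name, format) pairs, e.g. ("name", "A20")
    cases:     iterable of sequences, one value per variable
    max_width: hypothetical line length limit; a single value longer than
               this still gets a line of its own, because values are not split
    """
    if not variables:
        raise ValueError("at least one variable is required")
    # One variable per indented continuation line keeps the command short.
    lines = ["DATA LIST FREE /"]
    lines += [f"  {name} ({fmt})" for name, fmt in variables]
    lines[-1] += "."
    lines.append("BEGIN DATA")
    for case in cases:
        if len(case) != len(variables):
            raise ValueError("case length does not match variable count")
        current = ""
        for token in map(format_value, case):
            if current and len(current) + 1 + len(token) > max_width:
                lines.append(current)   # line is full: continue the case below
                current = token
            else:
                current = f"{current} {token}" if current else token
        lines.append(current)           # each case starts on a new line
    lines.append("END DATA.")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_inline_syntax(
        [("id", "F8.0"), ("name", "A20"), ("score", "F8.2")],
        [(1, "Ann Lee", 3.5), (2, "Bob", 4.25)],
        max_width=12,
    ))
```

With the tiny max_width used in the demo, the first case wraps onto a second line while the second case fits on one, which is the "more than one line per case" idea applied only where it is needed.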