Importing Fat SAVs


SAV files with many tens of thousands of variables occasionally cause memory problems on import. These are more common on Win32/Ruby32, but they can also occur on Win64 if the machine has little RAM. Here are some strategies for handling them.

  1. Check your SAV file for excessive variable widths

In this example, Q1 is a text variable with a single case whose response is 39 characters long, but its width is set to allow up to 2000 characters.

If I now save, Untitled.sav is just under 9 KB. If I reduce the width to 40, the file size drops to a mere 620 bytes, nearly 15 times smaller. I have seen SAV files with hundreds of verbatim variables, all at width=2048 or worse, when the expected responses were actually just brand names, with the longest at a couple of hundred characters and most under 20.

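You can get SPSS to reduce all widths to their minimums by running syntax. One command that does this, assuming your version of SPSS supports it, is ALTER TYPE with the AMIN width, for example: ALTER TYPE ALL (A=AMIN).
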
SPSS stores text in hidden variables, one for each eight-character group, so a width of 2000 requires 2000/8 = 250 variables that you never see. Ruby must still process every one of them, because a string could be all whitespace up to its last character.

You should do this for all SAV files with verbatims regardless, since no file should ever be larger than it needs to be for the information contained therein.
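
To see how much slack a file is carrying, you can compare each text variable's declared width with the longest response it actually holds. The following is a minimal Python sketch, not part of Ruby or SPSS: the function names and the sample data are hypothetical, and the slot count uses the simplified model described above of one hidden variable per eight characters.

```python
import math

# Assumption: the simplified model from this post, one hidden
# variable for each group of eight characters.
SLOT_CHARS = 8


def slots_for_width(width):
    """Number of eight-character groups needed for a declared width."""
    return math.ceil(width / SLOT_CHARS)


def minimal_width(values, encoding="utf-8"):
    """Smallest width (in bytes) that still holds every response.

    Trailing whitespace is ignored; the result is never below 1.
    """
    longest = max((len(v.rstrip().encode(encoding)) for v in values), default=0)
    return max(1, longest)


def width_report(declared, responses):
    """Compare declared widths with what each variable actually needs.

    declared:  hypothetical dict of variable name -> declared width
    responses: hypothetical dict of variable name -> list of text responses
    """
    report = {}
    for name, width in declared.items():
        needed = minimal_width(responses.get(name, []))
        report[name] = {
            "declared": width,
            "needed": needed,
            "slots_saved": slots_for_width(width) - slots_for_width(needed),
        }
    return report
```

For the Q1 example, width_report({"Q1": 2000}, {"Q1": ["x" * 39]}) reports that 39 characters are needed and that 245 hidden variables would be saved (250 at width 2000 against 5 at width 39).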

  2. Upgrade to Ruby 64 bit (version 4.x)

If you are running Win64, then use Ruby version 4, which is 64 bit. 64 bit applications can use nearly all installed RAM, whereas 32 bit applications (such as Ruby version 3.x) have a 2 gigabyte ceiling, and in practice memory problems usually start at about 1.75 gig. You need at least 8 gig of RAM to see the 64 bit benefits. 16 gig is better, and 32 gig or more if you can get it.

  3. Optimise your memory write threshold on the Import form.

See earlier posts here:

http://redcentresoftware.com/memory-import-threshold-with-more-than-4-gig-ram/

http://redcentresoftware.com/when-my-imports-go-one-case-at-a-time-and-take-forever/

http://redcentresoftware.com/eefface/

 

  4. Try No Dup(licate Code)Frames ON

This prevents duplicate codeframes from each taking up their own space, by pointing all the duplicates at the first-encountered master. Given 100 main brand variables that share a codeframe, the codeframe is stored once with 99 references, rather than 100 times.
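
The idea can be sketched in a few lines of Python. This is an illustration of the principle only, not how Ruby implements it internally: the function name, the dictionary layout and the brand data are all made up for the example.

```python
def dedupe_codeframes(codeframes):
    """Store each distinct codeframe once and point duplicates at it.

    codeframes: hypothetical dict of variable name -> {code: label}
    Returns (masters, refs):
      masters maps the first-encountered variable name to its codeframe
      refs maps every variable name to the name of its master
    """
    masters = {}
    refs = {}
    seen = {}  # codeframe content -> name of the first variable that had it
    for name, frame in codeframes.items():
        # Two codeframes count as duplicates only if they hold the
        # same codes and labels in the same order.
        key = tuple(frame.items())
        master = seen.setdefault(key, name)
        if master == name:
            masters[name] = frame
        refs[name] = master
    return masters, refs


# 100 brand variables that all share one codeframe (made-up data).
brands = {1: "Brand A", 2: "Brand B", 3: "Brand C"}
codeframes = {f"BRAND{i}": dict(brands) for i in range(1, 101)}
masters, refs = dedupe_codeframes(codeframes)
```

Here masters holds a single entry, BRAND1, and the other 99 variables in refs simply point back to it.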

  5. If the SAV is just too humungous…

It is possible for other software to write a SAV file which is too big for SPSS 64 bit on 16 gig RAM to even open, in which case Ruby could have problems too. If none of the above work for you, contact us. We have yet to encounter a SAV which could not be imported.
