Issue 63 in rpostgresql: Could not allocate memory in C function 'R_AllocStringBuffer' Error


rpost...@googlecode.com

Jul 16, 2014, 2:31:42 PM
to rpostgr...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 63 by rmcge...@gmail.com: Could not allocate memory in C
function 'R_AllocStringBuffer' Error
http://code.google.com/p/rpostgresql/issues/detail?id=63

Hello,
I am trying to save R objects into a PostgreSQL table by using
rawToChar(serialize(obj, NULL, TRUE)) to store an ASCII representation of the
object, and unserialize(charToRaw(txt)) to convert the ASCII text back into an R
object. Unfortunately, for sufficiently large objects, I get the
error "could not allocate memory (2246 Mb) in C
function 'R_AllocStringBuffer'". This error seems incorrect to me as I am
running a 64-bit version of R and PostgreSQL on a server with 256GB of
(mostly free) RAM. There should not be any difficulty allocating memory.
Moreover, I only encounter the error if I call dbWriteTable from within
another function. Calling it directly inside a script works fine. This
leads me to believe there is perhaps a subtle bug here. I am running
RPostgreSQL 0.4 on R 3.0.3 / Linux x86_64. Please let me know if there is a
problem on my end or a better way for me to do this.
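
For reference, the round trip described above can be sketched in base R as follows. This is a minimal illustration on a small stand-in object, not the code from the report; the names obj, txt and restored are chosen here for illustration only.

```r
# Small stand-in object; the report itself uses a much larger list.
obj <- list(1:3, c("A", "B"), "x")

# Serialize to an ASCII raw vector, then view it as one character string.
txt <- rawToChar(serialize(obj, NULL, ascii = TRUE))

# Reverse the steps: string -> raw vector -> R object.
restored <- unserialize(charToRaw(txt))

# TRUE when the restored object matches the original.
round_trip_ok <- identical(restored, obj)
```

Note that the whole serialized object ends up in a single character string, so the size of that one string grows with the size of the object.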

Here is a simple script to reproduce the error:

library(RPostgreSQL)

## This function creates a large object, serializes it, and saves it to the
## table '_test' with an associated unique 'id' column.
saveTest <- function(id) {
    ## Make a large object
    obj <- list(1:3e7, sample(LETTERS, 3e7, replace=TRUE), a~b)
    ## Serialize to ASCII text and wrap it in a one-row data frame
    txt <- rawToChar(serialize(obj, NULL, TRUE))
    x <- data.frame(id=id, txt=txt)
    con <- dbConnect(dbDriver("PostgreSQL"), host=Sys.getenv("PGHOST"),
                     user=Sys.getenv("PGUSER"),
                     password=Sys.getenv("PGPWD"),
                     dbname="template1")
    dbSendQuery(con, "SET CLIENT_ENCODING TO 'UTF8';")
    dbWriteTable(con, "_test", x, append=TRUE, row.names=FALSE)
    dbDisconnect(con)
}

## Create the _test table
> con <- dbConnect(dbDriver("PostgreSQL"), host=Sys.getenv("PGHOST"),
user=Sys.getenv("PGUSER"),
password=Sys.getenv("PGPWD"),
dbname="template1")
> dbSendQuery(con, "CREATE TABLE _test (id INTEGER PRIMARY KEY, txt TEXT);")
> dbDisconnect(con)

> saveTest(1) # Wait ~30 seconds
Error in postgresqlCopyInDataframe(new.con, value) (from
20140509.production.R@32#10) :
could not allocate memory (2246 Mb) in C function 'R_AllocStringBuffer'

> R.version
               _
platform       x86_64-pc-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          3
minor          0.3
year           2014
month          03
day            06
svn rev        65126
language       R
version.string R version 3.0.3 (2014-03-06)
nickname       Warm Puppy

Thanks,
Robert McGehee


rpost...@googlecode.com

Jul 16, 2014, 3:01:42 PM
to rpostgr...@googlegroups.com

Comment #1 on issue 63 by rmcge...@gmail.com: Could not allocate memory in
C function 'R_AllocStringBuffer' Error
http://code.google.com/p/rpostgresql/issues/detail?id=63

I just checked out the most recent RPostgreSQL code from svn and confirmed
that the behavior is the same in the devel version as well
(0.5-1, 30-Jan-2014).

rpost...@googlecode.com

Jul 16, 2014, 6:30:55 PM
to rpostgr...@googlegroups.com

Comment #2 on issue 63 by rmcge...@gmail.com: Could not allocate memory in
C function 'R_AllocStringBuffer' Error
http://code.google.com/p/rpostgresql/issues/detail?id=63

Ok, I now suspect that this may not be a bug, but caused by strangeness (or
my lack of understanding) in how R environments work. Read on if you want
details, but no action is required.

It seems that inside a function, R sometimes returns both the object
and a reference to the environment in which it was created (the function's
own evaluation frame). In the global environment, no such environment is
shown. If that frame has a lot of other objects in it, I suspect that they
get serialized as well when serialize is run on an object inside a function,
making the serialized version (much?) bigger than it would otherwise
be, and thus causing the memory error.

Notice the different behavior of returning list(a~b) in the global
environment versus inside a function. In the second case, the function's
environment is also returned (as a pointer, I believe). I suspect that this
causes objects in that environment to be pulled into the serialize
output inside a function, but not in the global environment.

> list(a~b)
[[1]]
a ~ b

> test <- function() list(a~b)
> test()
[[1]]
a ~ b
<environment: 0x1a788f80>

## Note that the character representation is about 10% larger when the
## object is returned inside a function.
> nchar(rawToChar(serialize(list(a~b), NULL, TRUE)))
[1] 183
> ncharTest <- function() nchar(rawToChar(serialize(list(a~b), NULL, TRUE)))
> ncharTest()
[1] 199

Thus, a possible explanation for the "bug" is that serialize is returning
both the large object and a large environment of other objects, and that the
result is legitimately larger than my available contiguous memory or larger
than some SQL configuration parameter. I therefore withdraw the previous report.
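
One way to check this hypothesis is to compare the serialized size of a formula that still points at a function's frame with a copy whose environment has been reset. The sketch below is illustrative only: make_formula, big, f_local and f_global are hypothetical names, and it assumes the formula is the only element carrying an environment.

```r
# Hypothetical helper: builds a formula inside a function whose frame also
# holds a large, unrelated object.
make_formula <- function() {
  big <- numeric(1e5)  # roughly 800 KB of doubles in the function's frame
  a ~ b
}

f_local <- make_formula()

# The formula remembers the frame it was created in, including `big`.
captured <- ls(environment(f_local))

# Copy the formula and point the copy at the global environment instead.
f_global <- f_local
environment(f_global) <- globalenv()

# Serialized sizes in bytes; the first includes the captured frame.
size_local  <- length(serialize(f_local,  NULL))
size_global <- length(serialize(f_global, NULL))

c(local = size_local, global = size_global)
```

If the hypothesis holds, the first size includes the contents of the captured frame while the second does not, because the global environment is written as a reference and not by value.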

Thanks, Robert