The problem I'm trying to solve right now is being able to efficiently load 70 million chemical compounds into postgres. I know there are other avenues for accomplishing this, but using R is the best solution in this case.
--
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan
The problem I have though is that I need to leave the primary key
field un-specified so that it will fill in the key from a sequence.
So, right now it executes "COPY tablename FROM", but could it be changed to grab
the list of fields from the given data frame and then add them to the copy command,
like so: "COPY tablename (col1, col2, ... ) FROM"?
--
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan
--
You received this message because you are subscribed to the Google Groups "RPostgreSQL Development and Discussion List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rpostgresql-d...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Dear Kevin,
The problem I have though is that I need to leave the primary keyfield un-specified so that it will fill in the key from a sequence.I do not understand why you need to leave the primary key unspecified.You can change the field characteristics with ALTER TABLE, ALTER SEQUENCE,and so on. To see what operation is needed you can prepare a smallexample table and use pg_dump."row.names" are copied by dbWriteTableto ensure that every record is distinguishable.It is by default a text field.If you need a serial primary key, you may create a new column, and computethe initial values and add constraints and index.(or just alter the column may work depending on the data.frame)
I want to use the serial primary key provided by postgres. I intend to do parallel inserts into the table and it is not practical to have the application generate unique ids. Thus, I need to leave it un-specified and let postgres generate valid ids. This is not really a "one time load". It needs to work under normal database usages and be very fast at the same time, to the extent possible.
Note that pg_dump/restore should have been very well tested byPostgreSQL developers to ensure the reliability and the speed.So, it would be a very hard to invent a better method.So, right now it executes "COPY tablename FROM", but could it be changed to grabthe list of fields from the given data frame and then add them to the copy command,like so: "COPY tablename (col1, col2, ... ) FROM"?What would be the calling convention of dbWriteTable, then?Current one is very simple, that is,dbWrtieTable(con, tablename, data.frame)If you want to write the list of columns, then wouldn'tit easier to make a data.frame that have only those columns?
Yes, that was my intention. The calling convention would be the same, just see what column names are in the given data frame
( sorry for the delay, I actually sent this from my email client 2 days ago and then the bounce got sent to my junk bin, just now found it)
Thus, I need to leave it un-specified and let postgres generate valid ids.
--
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan
--
西山智明
金沢大学学際科学実験センター
ゲノム機能解析分野
(920-0934 金沢市宝町13−1)
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi
Kanazawa, 920-0934 Japan
The given patch also should not break anything because prior to the patched version, all columns in the table must have been given in the data frame, so simply listing the columns in the COPY statement will not change anything.
--
西山智明
金沢大学学際科学実験センター
ゲノム機能解析分野
(920-0934 金沢市宝町13−1)
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi
Kanazawa, 920-0934 Japan
The given patch also should not break anything because prior to the patched version, all columns in the table must have been given in the data frame, so simply listing the columns in the COPY statement will not change anything.This patch, in fact, destroys many of the existing use of dbWriteTable.(you can run the tests with sh check_with_vars.sh after installing modified RPostgreSQL)Your usecase doesnot seem to fit the concept of dbWriteTable, which isintended to write the R data.frame to a table whose structure is definedby the driver rather than a table that preexisted.Perhaps, you are better coping the function under different namethat represent what you want to do.
You state that this use case doesn't fit the concept of dbWriteTable (the intention being to write to a new table, although append parameter contradicts this),
Tomoaki NISHIYAMA
Advanced Science Research Center,
Kanazawa University,
13-1 Takara-machi,
Kanazawa, 920-0934, Japan