So I have a postgres database on a live server and I am trying to copy all of the data from the database into a new object oriented database (ZODB).
I started by doing a data dump and storing it all in csv files. I have then used pandas to open the csv files;store the data in a dataframe; manipulate the data as needed and then create the objects in the new database. On the whole this has been successful but there are a few data issues which have been introduced by creating the csv files.
So, I was thinking it might make more sense to connect to the postgres database and then read the data straight from there into a dataframe. I tried to do this today and hit problems.
First I used psycopg2 to connect, got a connection but then realised this way was deprecated and I should use a sqlalchemy connection. I set this up but I can't get anything out of it.
I have something along the lines of the code below but I get an error saying ''psycopg2 not installed'. Do I need to have psycopg2 installed as well? I can't see why from the code. Does it make sense to do this or should I stick with downloading to csv and reading from there? I have heard that can be quicker. I will have tens of thousands of records to deal with. Thanks in advance.
import pandas as pd
import sqlalchemy as sq
engine = sq.create_engine("postgresql+psycopg2://username:password@host:port/database")
the_frame = pd.read_sql_table(name_of_table, engine)