Hi,
I was asked to post this here by Mark Needham (@markhneedham), who thought the LOAD CSV import below took longer than it should.
USING PERIODIC COMMIT 500
LOAD CSV
FROM "file://path/to/csv/Active_Corporations___Beginning_1800__without_header__wonky_characters_fixed.csv"
AS company
CREATE (:DataActiveCorporations
  {
    DOS_ID: company[0],
    Current_Entity_Name: company[1],
    Initial_DOS_Filing_Date: company[2],
    County: company[3],
    Jurisdiction: company[4],
    Entity_Type: company[5],
    DOS_Process_Name: company[6],
    DOS_Process_Address_1: company[7],
    DOS_Process_Address_2: company[8],
    DOS_Process_City: company[9],
    DOS_Process_State: company[10],
    DOS_Process_Zip: company[11],
    CEO_Name: company[12],
    CEO_Address_1: company[13],
    CEO_Address_2: company[14],
    CEO_City: company[15],
    CEO_State: company[16],
    CEO_Zip: company[17],
    Registered_Agent_Name: company[18],
    Registered_Agent_Address_1: company[19],
    Registered_Agent_Address_2: company[20],
    Registered_Agent_City: company[21],
    Registered_Agent_State: company[22],
    Registered_Agent_Zip: company[23],
    Location_Name: company[24],
    Location_Address_1: company[25],
    Location_Address_2: company[26],
    Location_City: company[27],
    Location_State: company[28],
    Location_Zip: company[29]
  }
);
Each CSV row becomes one node, so the graph stays as close to the raw data as possible. The loose idea is that these nodes will later be linked to new nodes representing people and addresses verified by reporters.
This is what I got:
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 1964486
Properties set: 58934580
Labels added: 1964486
4550855 ms
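To put those numbers in perspective, here is a quick back-of-the-envelope calculation in Python. It is only a sketch: the inputs are the figures printed above, the variable names are mine, and the commit count assumes one commit per 500 rows, which is how I read USING PERIODIC COMMIT 500.

```python
import math

# Figures copied from the query output above.
nodes_created = 1964486
properties_set = 58934580
elapsed_ms = 4550855

# Batch size taken from the USING PERIODIC COMMIT 500 clause.
batch_size = 500

elapsed_s = elapsed_ms / 1000
elapsed_min = elapsed_s / 60
nodes_per_s = nodes_created / elapsed_s
props_per_node = properties_set / nodes_created

# Assumption: one commit per full or partial batch of rows.
commits = math.ceil(nodes_created / batch_size)

print(f"elapsed: {elapsed_min:.1f} min")
print(f"throughput: {nodes_per_s:.0f} nodes/s")
print(f"properties per node: {props_per_node:.0f}")
print(f"commits (assumed): {commits}")
```

So the import ran for roughly 76 minutes, at about 432 nodes per second, with 30 properties per node (matching the 30 columns in the CREATE clause).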
Some context information:
Neo4j Milestone Release 2.1.0-M01
Windows 7
java version "1.7.0_03"
Best,
Aram