matchBlocks error: A 0-dimensional array is not permitted

Alexandre Rodrigues

Jan 24, 2015, 1:05:31 PM
to open-source-...@googlegroups.com
Hello again,

I am testing dedupe with a script similar to mysql_example.py, but reading from CSV files instead of a database connection.
When I run the script, the call to matchBlocks on the StaticDedupe class fails with this error:

hcluster/distance.py", line 1382, in squareform
-> The first argument must be one or two dimensional array. A 0-dimensional array is not permitted

If I only use blocks with more than 5 elements, it works.

Any ideas why this is happening?

Could this be happening because I am testing with a small amount of data (4000 lines)?
I also build the block records like this: [ ( id, row, set([]) ), .... ]. I don't yet understand what the third element is for. :/
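
For illustration, here is a minimal sketch of how blocks in the `(id, row, set([]))` layout could be assembled from CSV data without a database. It is written for Python 3 and makes several assumptions: `build_blocks`, `key_field`, and `SAMPLE_CSV` are hypothetical names, and grouping on a raw column stands in for dedupe's learned blocking rules. As far as I understand mysql_example.py, the third element holds the ids of other blocks that already cover the record; the sketch leaves it empty, as in the layout above. Blocks with a single record are skipped because they contain no pair to compare.

```python
import csv
import io
from collections import defaultdict


def build_blocks(rows, key_field, id_field="id", min_size=2):
    """Group rows by a blocking key into lists of (id, row, set()) tuples.

    Hypothetical helper: key_field is a stand-in for a real blocking rule.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key_field]].append(row)

    blocks = []
    for key in sorted(grouped):
        members = grouped[key]
        # A block with fewer than min_size records has nothing to compare.
        if len(members) < min_size:
            continue
        # Third element left as an empty set (assumption: no overlapping
        # blocks need to be tracked in this toy example).
        blocks.append([(row[id_field], row, set()) for row in members])
    return blocks


# Made-up sample data standing in for a real CSV file.
SAMPLE_CSV = """id,name,city
1,Ana Silva,Lisboa
2,Ana Silva,Lisboa
3,Rui Costa,Porto
4,Ana Sylva,Lisboa
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
blocks = build_blocks(rows, key_field="city")
```

With the sample data, only the "Lisboa" group survives as a block; the single "Porto" record is dropped.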

My operating system is CentOS 6.6 with Python 2.7.6 and the following packages:

affinegap (1.0)
BTrees (4.0.8)
categorical-distance (1.3)
coverage (3.7.1)
Cython (0.21.2)
dedupe (0.7.6.9)
dj-database-url (0.3.0)
fastcluster (1.1.13)
haversine (0.1)
hcluster (0.2.0)
MySQL-python (1.2.5)
nose (1.3.4)
numpy (1.9.1)
persistent (4.0.8)
pip (1.5.6)
psycopg2 (2.5.4)
pudb (2014.1)
Pygments (2.0.1)
rlr (1.0)
setuptools (3.6)
six (1.8.0)
Unidecode (0.04.17)
urwid (1.3.0)
wsgiref (0.1.2)
zope.index (4.1.0)
zope.interface (4.1.2)


Thanks in advance,

Alex

Alexandre Rodrigues

Jan 25, 2015, 5:16:09 PM
to open-source-...@googlegroups.com
I think I found the problem.

I had done the training with the RecordLink class and then loaded the settings file with the StaticDedupe class.
After redoing the training with the Dedupe class, the problem disappeared.

Cheers,

Alex