Having problems with the cell.cross() function


Antoine Beaubien

Mar 19, 2020, 11:47:19 PM
to OpenRefine
Hi users of OR,

   I'm not sure whether I'm using the cell.cross() function correctly; I can't seem to get it to work in many situations. Here are a few examples.

I have these 2 projects, pretty simple. 
#1 TEST-Provinces : Prov_ID, Prov_DIM, Prov_Lfr, Prov_WDID, Prov_CoordGeo
#2 TEST-Personnes : Pers_ID, Pers_Lfr, Pers_Ville, Prov_DIM

Using the column Prov_DIM as the key, I want to cross some data from one table to the other. But I can't, with either « Add column based on this column... » or « Transform... ».

Here's my small data sample and a couple of screenshots.


#1 : trying to get data from an external project (fails), using « Add column based on this column... »:
cell.cross("TEST-Provinces", "Prov_DIM")

Fonction_Cross_1.png

#2 : trying to get data from the same project using the same key as #1 (succeeds), using « Add column based on this column... »:
cell.cross("TEST-Personnes", "Prov_DIM")

Fonction_Cross_2.png


Just for context, here is a screenshot of the other project:

Fonction_Cross_3.png


Am I doing something wrong? The documentation does not indicate anything problematic with what I want to do.

Regards,
   Antoine

TEST-Personnes.openrefine_v1.1.tar.gz
TEST-Provinces.openrefine_v1.1.tar.gz

Antoine Beaubien

Mar 20, 2020, 12:15:04 AM
to OpenRefine
Here's a better version that corrects some data integrity issues (2 bad values).


Regards, A.
TEST-Personnes.openrefine_v1.2.tar.gz

Antoine Beaubien

Mar 20, 2020, 12:49:18 AM
to OpenRefine
It also doesn't work when using a string value (cell.value, or just "QC").

Fonction_Cross_4.png

Fonction_Cross_5.png




Lu Liu

Mar 20, 2020, 1:34:13 AM
to OpenRefine
Hey Antoine, you need to trim "Prov_DIM" in both tables first.

You can trim as follows:
1. Select the "Prov_DIM" column
2. Edit cells
3. Transform
4. Change the GREL expression from "value" to "trim(value)"


After this, you can cross successfully.
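
The effect of stray whitespace on the key can be modelled outside OpenRefine. What follows is a minimal Python sketch, not OpenRefine's implementation: it assumes a cross-style lookup compares keys by exact string equality, and the helper name `cross_lookup` and the sample rows are hypothetical.

```python
# Hypothetical model of a cross-style lookup: exact string match on a key column.
# This is NOT OpenRefine's code; the rows below are made-up sample data.

def cross_lookup(key, rows, key_column):
    """Return every row whose key_column equals key exactly."""
    return [row for row in rows if row[key_column] == key]

# Keys in the "provinces" table carry a trailing space, as untrimmed data might.
provinces = [
    {"Prov_DIM": "QC ", "Prov_Lfr": "Québec"},
    {"Prov_DIM": "ON ", "Prov_Lfr": "Ontario"},
]

# Exact matching finds nothing because "QC" != "QC ".
before_trim = cross_lookup("QC", provinces, "Prov_DIM")

# Trimming the key column (the counterpart of trim(value)) lets the match succeed.
trimmed = [{**row, "Prov_DIM": row["Prov_DIM"].strip()} for row in provinces]
after_trim = cross_lookup("QC", trimmed, "Prov_DIM")

print(len(before_trim), len(after_trim))
```

The same reasoning applies to the key column on both sides of the lookup, which is why the trim is done in both projects.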

捕获.PNG

Notice that the first two errors occurred because there is no "Prov_DIM" with the value "PQ" in "TEST-Provinces".

Hope this is helpful!


Lu Liu

Mar 20, 2020, 1:42:19 AM
to OpenRefine

Oh, the new data for Personnes you provided solves the "IndexOutOfBoundsException":

捕获1.PNG
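
An exception of this kind is consistent with taking the first element of an empty list of matches, which is presumably what happened for keys with no counterpart in the other project. Below is a minimal Python sketch of that failure mode and of one way to guard against it. It is a model only: the match list is a plain Python list, `first_match_value` is a hypothetical helper, the sample row uses a placeholder ID, and Python raises `IndexError` where the thread reports an IndexOutOfBoundsException.

```python
# Hypothetical model: a lookup result is a plain list of matching rows.
# This is NOT OpenRefine's code; the data below is made up.

def first_match_value(matches, column, default=None):
    """Return `column` from the first matching row, or `default` if nothing matched."""
    if not matches:  # guard: never index into an empty list
        return default
    return matches[0][column]

no_match = []  # e.g. a key such as "PQ" with no counterpart in the other table
one_match = [{"Prov_DIM": "QC", "Prov_WDID": "Q0000"}]  # placeholder ID

try:
    no_match[0]  # unguarded access to the first match fails on an empty list
    unguarded_failed = False
except IndexError:
    unguarded_failed = True

print(unguarded_failed)
print(first_match_value(no_match, "Prov_WDID"))
print(first_match_value(one_match, "Prov_WDID"))
```

Fixing the source data, as the corrected Personnes file does, removes the unmatched keys; a guard of this shape is the alternative when unmatched keys are expected.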

Antoine Beaubien

Mar 20, 2020, 2:15:34 AM
to OpenRefine
Yes! Thank you, Lu, it was an obvious mistake. Thanks for spotting it.

Now, the real questions are coming… ;-)

So, if I create a new column with « Add column based on this column... », I get my data, using your corrections to my dataset and this GREL expression: cross(value, "TEST-Provinces", "Prov_DIM")[0].cells["Prov_WDID"].value

Fonction_Cross_6.png

Now, if I clear the row afterwards and try to get the same data again with the same query (but in the context of « Transform... »), I get null values.

#1 :
cells.Prov_DIM.cross("TEST-Provinces", "Prov_DIM")

Fonction_Cross_8.png



#2 :  
cells.Prov_DIM.value.cross("TEST-Provinces", "Prov_DIM")[0].cells["Prov_WDID"] or even cells.Prov_DIM.value.cross("TEST-Provinces", "Prov_DIM") 

Fonction_Cross_7.png

I think I've read something from Antonin about the cell context, but there is nothing in the documentation about that.

What do you think of that?
 
Regards, Antoine

Owen Stephens

Mar 20, 2020, 9:29:32 AM
to OpenRefine
Yes, unfortunately the cross function does not work if you use `cells` to reference another cell. I'm not sure exactly why this fails, although there is discussion at https://github.com/OpenRefine/OpenRefine/issues/1950

Owen

Lu Liu

Mar 20, 2020, 9:41:12 AM
to OpenRefine
I am looking at the code to find out why this fails; your post is really helpful. I believe I can get it fixed :)

Antoine Beaubien

Mar 20, 2020, 2:23:52 PM
to OpenRefine
@Owen, 

   I really wonder: why can we provide only a string value as the first parameter if that function is bound to a cell? Shouldn't cross, the way it's documented, handle cross("QC", "TEST-Provinces", "Prov_DIM")?

   Also, if what you say is true, then this cross function can never be used anywhere other than in « Add column based on this column... »? Because a Transform is not bound to a particular cell…

@Lu: glad my information is helpful. Thanks for your involvement.

Thanks for your input, regards,
   Antoine