OpenRefine project not uploading, java?


Iñaki LL

Apr 4, 2022, 5:09:26 AM
to OpenRefine
Hi again,

I tried to upload a simple OpenRefine project, the one I mentioned before, after reconciling all the values. When I thought I was finally done with the schema, I pressed the upload button (Upload edits to Wikibase) and it seemed to start. However, the progress indicator stayed at "0% upload" and nothing happened.

On one of the upload attempts, a notice popped up with a list of Java-related issues. (I installed the exe provided in the OpenRefine download.) See also the attached screenshot taken while the upload was "in progress". I can extract the operation history if needed. Many thanks in advance

Iñaki
2022-03-31 Test schema issues.PNG
2022-03-31 Test schema not uploading.PNG

Owen Stephens

Apr 4, 2022, 10:54:26 AM
to OpenRefine
Hi Iñaki

I'm afraid I don't know what is causing the issue you are seeing (and I'm not sure how to diagnose it), but I do see in the screenshot of your schema that the "Argitaratze data" field still has a green underline at the top of the schema, which suggests that the schema still sees this as a reconciled value. Since the "Publication date" field needs a simple date string, not a reconciled value, this could be causing a problem.

If you haven't already done it, the first thing to try is clearing all the reconciliation data from that column - but just experimenting with this I'm not sure the schema is correctly realising that this has been done. An alternative approach might be to duplicate that column (using 'add column based on this column') to get a new, never reconciled, column with the date in, and then try using that in the schema instead of your existing column.

This may have nothing to do with your problem - but it was the one thing I noticed from the screens you've shared.

Best wishes

Owen

Iñaki LL

Apr 9, 2022, 5:47:50 AM
to OpenRefine
No progress here, and I do not know whether it might have something to do with Java, since Pattypan is not opening either, despite my having updated the version (OpenJDK 17, confirmed in cmd, and Adoptium) and my numerous attempts on different computers. Best regards

Iñaki

Iñaki LL

Apr 9, 2022, 6:36:21 AM
to OpenRefine
Hi Antoine, following up on this thread: I cannot find JSON under the Export button. Would it work if I exported using another option? (There is a range of them.)

Iñaki

Antoine Beaubien

Apr 10, 2022, 5:26:00 AM
to OpenRefine
You can export your schema with the menu Wikidata -- Export schema. This will save a JSON file. You can post it here, or even just copy the JSON text.

Regards,
   Antoine

Iñaki LL

Apr 11, 2022, 5:22:21 AM
to OpenRefine
Hi Antoine, here you are, any hints welcome. Regards

Iñaki
schema.json

Antoine Beaubien

Apr 12, 2022, 6:16:46 AM
to openr...@googlegroups.com
Hi Iñaki and Owen,

  So, to clarify the situation:
Argitalpena is the name of your publication, which DOES NOT exist in Wikidata yet...

  If that is the case, you have a problem, because I really doubt Wikidata will let you have an item with such a long name.
And, to be honest, it can't be the name of the publication. It's a mix of part of a name and some description.

   A title would be "Collection of XYZ from ABC" or something like a science article, long and boring, but still, not what you have. Try to do at least one by hand.

   I think we may have valid material to create an issue for "impossible to create items because of their name, instead of waiting forever...".
On my machine (a Chromebook at this time), it just passes through: no error, but no element creation either.

   So Iñaki, can you simplify the name of the publication, and really put a title there? Also, I highly recommend that you do at least one creation manually with the most complex name you have. The error handling on Wikidata is a bit better than ours... ;-)

   Also, since you have a creation problem, you might also have some statement creation problems once the main problem is resolved.

Regards,
   Antoine

--
You received this message because you are subscribed to a topic in the Google Groups "OpenRefine" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/openrefine/t45RQjluL4M/unsubscribe.
To unsubscribe from this group and all its topics, send an email to openrefine+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openrefine/a968ca01-7392-40c2-b5b6-3847f5a51ca3n%40googlegroups.com.

Iñaki Lopez de Luzuriaga

Apr 12, 2022, 9:58:30 AM
to openr...@googlegroups.com
Thanks Antoine for checking!

Antoine Beaubien (ant...@beaubien.qc.ca) wrote (Tue, Apr 12, 2022, 12:16):
> Hi Iñaki and Owen,
>
>   So, to clarify the situation:
> Argitalpena is the name of your publication, which DOES NOT exist in Wikidata yet...

You are right.

>   If that is the case, you have a problem, because I really doubt Wikidata will let you have an item with such a long name.
> And, to be honest, it can't be the name of the publication. It's a mix of part of a name and some description.

Actually, during "Cluster and edit..." (as far as I remember), I got a message in my other, larger project saying that OpenRefine could not deal with 255 characters or more. At that stage, I do not think it was referring to Wikidata yet.

>    A title would be "Collection of XYZ from ABC" or something like a science article, long and boring, but still, not what you have. Try to do at least one by hand.

The full title is longer than what we can see now, but I shortened it to get it under 255 characters. It now has 95 characters, far from 255, so it should not pose a problem, but I will create it manually in Wikidata. My challenge is to insert a valid title without altering it significantly, especially with regard to searches. That is difficult. Also, works of this period usually have really long titles. How does WD cope with them?

>    I think we may have valid material to create an issue for "impossible to create items because of their name, instead of waiting forever...".
> On my machine (a Chromebook at this time), it just passes through: no error, but no element creation either.
>
>    So Iñaki, can you simplify the name of the publication, and really put a title there? Also, I highly recommend that you do at least one creation manually with the most complex name you have. The error handling on Wikidata is a bit better than ours... ;-)

Many thanks, Antoine/Owen. Hope I get some help there too :) 

Iñaki LL

Apr 12, 2022, 11:00:22 AM
to OpenRefine
For the time being, no progress with the upload. I manually created a Wikidata item for the first record in OpenRefine and that went all right. It allows a long title (173 characters including spaces). I reconciled the publication name (argitalpena) value in OpenRefine with it. I did not reconcile the publication date, leaving it as plain text (screenshot attached), but I also tried reconciling it (calendar year) later, which again resulted in an unsuccessful upload. Regards

Iñaki

2022-04-12 Probakoa gora kargatzen % 0.PNG

Antoine Beaubien

Apr 12, 2022, 8:09:55 PM
to OpenRefine
Hi Iñaki,

   So I got to the bottom of your problems.
1) So, like I said, the first item had a title of 271 characters, which is above the 255 limit.
 OR SHOULD have returned an error for that.
Create a column based on the title, and use this GREL expression: cells.Argitalpena.value.length()
Rerun the GREL expression each time you modify your title cells.

2) The mul language is not working with OR. You didn't do your manual test with that language, did you? ;-)
OR SHOULD NOT show this language in the language menu.

3) Even though I discarded the reconciliation data for the publication date column, it still appears in the schema as a reconciled column.
OR SHOULD NOT display it as a reconciled column, because the recon data has been removed.
For me, there were NO PROBLEMS with the date (after clearing the recon data).

So, please go check the 2 Wikidata items I created for you:
https://www.wikidata.org/wiki/Q111591404

Regards,
   Antoine

Iñaki LL

Apr 13, 2022, 4:22:51 AM
to OpenRefine
Thanks Antoine!

On Wednesday, April 13, 2022 at 02:09:55 (UTC+2), ant...@beaubien.qc.ca wrote:
> Hi Iñaki,
>
>    So I got to the bottom of your problems.
> 1) So, like I said, the first item had a title of 271 characters, which is above the 255 limit.
>  OR SHOULD have returned an error for that.

The whole title does have a lot of characters, but the shortened title, without author names and descriptions, has 173 characters including spaces.

> Create a column based on the title, and use this GREL expression: cells.Argitalpena.value.length()
> Rerun the GREL expression each time you modify your title cells.

Um. What does that function do exactly? Does it fit the title in the cell so that it is valid?

> 2) The mul language is not working with OR. You didn't do your manual test with that language, did you? ;-)
> OR SHOULD NOT show this language in the language menu.

I tested one OpenRefine record, the one with the longest title. The manual test? As posted above, I created that element manually in WD, where I could be more precise with regard to the language; I removed the last part of the title to avoid length issues. In OpenRefine's schema I added multiple languages (mul), because there are a number of languages in the list, especially ES and EU.

> 3) Even though I discarded the reconciliation data for the publication date column, it still appears in the schema as a reconciled column.
> OR SHOULD NOT display it as a reconciled column, because the recon data has been removed.
> For me, there were NO PROBLEMS with the date (after clearing the recon data).

OK, I understand you did not reconcile the publication year and the upload worked fine.

> So, please go check the 2 Wikidata items I created for you:
> https://www.wikidata.org/wiki/Q111591404

I actually left the second record's title (Guiristinoen doctrina laburra : haur-gaztei irakhasteco) untouched (unreconciled) and did not create it manually, in order to see how the upload performed and what WD element resulted.

Antoine Beaubien

Apr 13, 2022, 3:57:33 PM
to openr...@googlegroups.com
Hi,

On Wed, Apr 13, 2022 at 04:22, Iñaki LL <inaki.l...@gmail.com> wrote:
> Thanks Antoine!

You're welcome.

> On Wednesday, April 13, 2022 at 02:09:55 (UTC+2), ant...@beaubien.qc.ca wrote:
>> Hi Iñaki,
>>
>>    So I got to the bottom of your problems.
>> 1) So, like I said, the first item had a title of 271 characters, which is above the 255 limit.
>>  OR SHOULD have returned an error for that.
>
> The whole title does have a lot of characters, but the shortened title, without author names and descriptions, has 173 characters including spaces.

Well, in the project I imported, one of the titles was longer than 255 characters.

>> Create a column based on the title, and use this GREL expression: cells.Argitalpena.value.length()
>> Rerun the GREL expression each time you modify your title cells.
>
> Um. What does that function do exactly? Does it fit the title in the cell so that it is valid?

It counts the number of characters in the title.
image.png
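
For anyone who prefers to run the same check outside OpenRefine, here is a minimal Python sketch of the idea behind the GREL expression quoted above. The 255 figure is simply the limit quoted in this thread, not a verified Wikidata constant, and the function name and sample data are hypothetical.

```python
# Limit quoted in this thread; an assumption, not a verified Wikidata constant.
MAX_LABEL_LENGTH = 255


def find_overlong_titles(titles, limit=MAX_LABEL_LENGTH):
    """Return (row_index, length) pairs for titles longer than `limit` characters."""
    return [
        (index, len(title))
        for index, title in enumerate(titles)
        if len(title) > limit
    ]


if __name__ == "__main__":
    sample_titles = [
        "Guiristinoen doctrina laburra : haur-gaztei irakhasteco",
        "x" * 271,  # placeholder standing in for a 271-character title
    ]
    for index, length in find_overlong_titles(sample_titles):
        print(f"row {index}: {length} characters (limit {MAX_LABEL_LENGTH})")
```

Running it on an exported title column before uploading would flag the rows that need shortening, which is what the length column does inside OpenRefine.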

 
>> 2) The mul language is not working with OR. You didn't do your manual test with that language, did you? ;-)
>> OR SHOULD NOT show this language in the language menu.
>
> I tested one OpenRefine record, the one with the longest title. The manual test? As posted above, I created that element manually in WD, where I could be more precise with regard to the language; I removed the last part of the title to avoid length issues. In OpenRefine's schema I added multiple languages (mul), because there are a number of languages in the list, especially ES and EU.

What I meant is that in the element you created in Wikidata, you didn't enter the language MUL, but EN and Basque. I don't think you can create a Wikidata item with only a label in the MUL language.

In OR, only use ES, FR, EN, etc., not mul. It is a bug in OR, as it should refuse to create an item with only that language, or something like that. There is an error that is not displayed.

>> 3) Even though I discarded the reconciliation data for the publication date column, it still appears in the schema as a reconciled column.
>> OR SHOULD NOT display it as a reconciled column, because the recon data has been removed.
>> For me, there were NO PROBLEMS with the date (after clearing the recon data).
>
> OK, I understand you did not reconcile the publication year and the upload worked fine.

Correct. It took the string value, 4 digits.

>> So, please go check the 2 Wikidata items I created for you:
>
> I actually left the second record's title (Guiristinoen doctrina laburra : haur-gaztei irakhasteco) untouched (unreconciled) and did not create it manually, in order to see how the upload performed and what WD element resulted.

Well, to test, I had to create it! ;-) Anyway, you can now add more info. The usage of the P1476 (Title) property is good because this can take a longer title, so you can put the original one here.

You should now be set to go on! If you have other problems, we'll be happy to assist you.

Regards,
   Antoine

Thad Guidry

Apr 13, 2022, 5:11:31 PM
to openr...@googlegroups.com
> I don't think you can create a Wikidata item with only a label in MUL language.

A label needs to be applied, however, it can be in any human language.
MUL is the ISO 639-3 designation for "multiple languages", and has no good usage for a Wikidata item as it's already implied that an item will have a label in multiple languages (named differently in different languages).

You'll need to create an item with at least 1 valid human language code in Wikidata.
If the language code for labeling is not available in Wikidata (as "frc" was not a few years ago!), you can request that it be added.

In OpenRefine you will need to supply at least 1 language code within the schema for the labeling of a column of entities.
If your column contains a mixture of different languages, then you might wait for this issue to be resolved, or use Python/Jython libraries to detect the language and filter the dataset.
Or filter and partition the dataset into the various languages outside of OpenRefine in whatever tool works well and then import into OpenRefine, one language set at a time, to create the Wikidata entities and schema.
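
As an illustration of the "partition outside OpenRefine" option, here is a minimal Python sketch. It assumes the dataset already has a column holding a language code per row; the column name `language` and the sample rows are hypothetical. It does not attempt language detection, which would need a third-party library.

```python
import csv
import io
from collections import defaultdict


def partition_by_language(rows, language_column="language"):
    """Group row dicts by the value of `language_column` (hypothetical column name)."""
    groups = defaultdict(list)
    for row in rows:
        code = (row.get(language_column) or "").strip().lower()
        # Rows without a code are collected separately for manual review.
        groups[code or "unknown"].append(row)
    return dict(groups)


if __name__ == "__main__":
    # Illustrative sample data; the Spanish title is a made-up placeholder.
    sample_csv = io.StringIO(
        "title,language\n"
        "Guiristinoen doctrina laburra : haur-gaztei irakhasteco,eu\n"
        "Ejemplo de título,es\n"
    )
    for code, rows in partition_by_language(csv.DictReader(sample_csv)).items():
        print(code, len(rows))
```

Each group could then be written to its own file and imported into OpenRefine one language at a time.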

Antoine Beaubien

Apr 13, 2022, 5:18:26 PM
to openr...@googlegroups.com
Hi Thad,

> If your column contains a mixture of different languages, then you might wait for this issue to be resolved, or use Python/Jython libraries to detect the language and filter the dataset.

   I think we need to identify what the bug is here. Our user is not using multiple languages. Only one.

   Should we prevent the MUL language in the Wikidata schema? Or should we allow it to be added, but only if another language is also defined?
In our user's example, he only has ONE language, but he tried to apply the MUL language, and it failed, with no error reporting or warning.

   To be honest, in this user's experience, OR is largely to blame for not reporting the right errors. It would have spared us this support thread.

Regards,
   Antoine




Thad Guidry

Apr 13, 2022, 5:34:05 PM
to openr...@googlegroups.com
Ah, I see what you mean now.

Yes, a bug.  Specifically, I would say that there are certain validation rules that should be in place against any particular service.
In this case, that is the Wikidata extension with its upload capability.
We should add some additional constraint checks into the extension for acceptable language codes for a label field.
Or maybe instead of hardcoding into the extension, it can be deduced from the service, whatever some think best.
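
To make the kind of constraint check discussed here concrete, the following is a rough Python sketch. It is not code from the Wikidata extension (which is not written in Python) and not an OpenRefine API; the function name and the hardcoded set of rejected codes are assumptions based only on the advice given in this thread.

```python
# Codes to reject, per the advice in this thread; an assumption, not a list
# taken from OpenRefine or Wikidata.
DISALLOWED_LABEL_LANGUAGES = {"mul"}


def check_label_languages(codes):
    """Return a list of human-readable problems for the label languages in a schema."""
    normalized = [code.strip().lower() for code in codes]
    problems = []
    for code in normalized:
        if code in DISALLOWED_LABEL_LANGUAGES:
            problems.append(f"language '{code}' is not accepted for labels")
    # At least one specific language must remain once rejected codes are ignored.
    if not any(code and code not in DISALLOWED_LABEL_LANGUAGES for code in normalized):
        problems.append("at least one specific language code is required")
    return problems
```

Reporting these problems before the upload starts, rather than stalling at 0%, is the behaviour being asked for. Deducing the accepted codes from the service instead of hardcoding them would replace the constant above.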

Iñaki Lopez de Luzuriaga

Apr 14, 2022, 4:59:56 AM
to openr...@googlegroups.com
Thanks Antoine and Thad,

On the language label, got it. I added MUL in the schema because the dataset has documents in different languages, in this case Basque and Spanish. However, I think the language label belongs to the Wikidata element (description, alias) and not to the title/content of the work, while the OpenRefine language column seems to refer to the title/content of the work.

I attempted to upload the schema, even though the items had already been added manually to Wikidata. For whatever reason, it does not upload (progress 0%). I hope this has nothing to do with Java; it is giving me a headache, and not only with OpenRefine. I will try to upload another item and see what the result is. Best regards

Iñaki


Antoine Beaubien

Apr 14, 2022, 6:13:08 AM
to OpenRefine
Hi Iñaki,

  From there, you cannot just upload to Wikidata, because the elements now exist there (thanks to my tests). So you have to reconcile, and reconciliation will find the new Wikidata items. Is the problem with new publications (that have a new name) or with the 2 in your dataset?

   As for multiple languages, I'm not sure you understood me. Yes, you can use multiple languages at the same time. Just create a row for each. I often use French, English and Spanish. This is true for labels, descriptions and aliases. Just DON'T use the MUL language.

   If you want to know if you have a java problem, it's easy. Just open a terminal, and type: java -version. If you don't see something between v8 and v17, you may have a problem. 
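
The same check can be scripted. The sketch below is one possible way to do it in Python: it runs `java -version`, extracts the major version, and compares it with the 8 to 17 range mentioned above. The helper names are hypothetical, and the parsing assumes the usual `version "..."` line printed by common JDK builds.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None if not found."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_321" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major():
    """Run `java -version` and parse it; returns None if Java cannot be started."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    # The version banner is usually written to stderr; check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java not found or version not recognised")
    elif 8 <= major <= 17:
        print(f"Java {major}: within the 8 to 17 range")
    else:
        print(f"Java {major}: outside the 8 to 17 range")
```

Keeping the parsing separate from the subprocess call means the version logic can be checked against sample strings without Java installed.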

Regards,
   Antoine

Iñaki Lopez de Luzuriaga

Apr 14, 2022, 11:06:23 AM
to openr...@googlegroups.com
I tested an OR project with just one record, and uploaded the schema. I started to be happy when I saw that the upload progressed at last, but it stopped at 80%. I do not know why. I checked that the year was not plain text (perhaps date type) and was not unreconciled either, yet it did generate the year in Wikidata: when I checked WD, I could see that the element had been created, along with the other items planned in the schema. I attach the exported project in case you can check it for possible flaws in its structure. Many thanks

Iñaki

2022-04-14 Probako errenkada.PNG

Owen Stephens

Apr 21, 2022, 5:43:10 AM
to OpenRefine
Hi Iñaki,

When a process in OpenRefine stops like this, the first thing I do is check for any messages in the console/terminal where OpenRefine is running. Can you still recreate this issue, and if so, are there any messages in the console?

Best wishes

Owen

Iñaki LL

Apr 21, 2022, 5:35:50 PM
to OpenRefine
Um, do not know, where is the console? Recreate, you mean reproduce? I cannot remember, but I could send you the history. Best regards

Iñaki

Owen Stephens

Apr 22, 2022, 6:03:34 AM
to OpenRefine
On Thursday, April 21, 2022 at 10:35:50 PM UTC+1 Iñaki LL wrote:
> Um, do not know, where is the console?

When you run OpenRefine on Windows, a small window with text in it pops up and remains there while OpenRefine is running. That's what I mean by the console/terminal.
If you are running OpenRefine on a Mac there is no console by default, but if you go to the instructions for Mac at https://docs.openrefine.org/manual/running and follow the steps listed under "To run OpenRefine using Terminal:", you will get a console window.

It will look something like this:
Screenshot 2022-04-22 at 11.02.06.png

> Recreate, you mean reproduce? I cannot remember, but I could send you the history. Best regards

Yes, I mean reproduce.
Unfortunately, if you can't reproduce the problem, it's unlikely we can work out what went wrong.

Best wishes
Owen

 

Iñaki Lopez de Luzuriaga

Apr 22, 2022, 11:00:35 AM
to openr...@googlegroups.com
That is fine, Owen. I will keep this in mind when I upload something next. Thank you

Iñaki
