Hello,
I'm trying to interface with OpenRefine 2.6 (as it comes from GitHub at the moment) from a Scala webapp.
I need to create a project and import JSON files through one or more HTTP requests. I've been digging around the OpenRefine Java and JavaScript code to determine how to do this.
To simplify the problem, I am:
- using the command: /command/core/create-project-from-upload
- sending only one JSON file as binary body of an HTTP entity
- setting properties as text parameters on the HTTP entity
For example, using the Java Apache HTTP library:
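import java.io.File
import org.apache.http.client.methods.HttpPost
import org.apache.http.entity.ContentType
import org.apache.http.entity.mime.MultipartEntityBuilder
import org.apache.http.impl.client.HttpClientBuilder
import org.apache.http.util.EntityUtils

val file = new File("myFile.json")
// Multipart body: the JSON file plus the project properties as text parts
val httpEntity = MultipartEntityBuilder.create()
  .addBinaryBody("project-file", file, ContentType.APPLICATION_JSON, file.getName)
  .addTextBody("project-name", "testing")
  .addTextBody("format", "text/json")
  .build()
// url points at /command/core/create-project-from-upload on the OpenRefine server
val post = new HttpPost(url)
post.setEntity(httpEntity)
val client = HttpClientBuilder.create().build()
val response = client.execute(post)
EntityUtils.toString(response.getEntity())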
After this, OpenRefine returns a 500 error: "Failed to import file: java.lang.ArrayIndexOutOfBoundsException: 0"
The method parses the JSON file, including its content (for the preview). However, no "recordPath" field is specified, and that field is required when parsing the content into the OpenRefine tree. The importer looks for the field, and the following call to XmlImportUtilities.importTreeData fails because the recordPath array is empty.
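To illustrate the failure mode I think I'm hitting, here is a minimal sketch. It is not OpenRefine's code: EmptyRecordPathDemo and firstSegment are hypothetical names, and the only assumption is that the importer reads the first path segment without checking that the array is non-empty.

```scala
import scala.util.Try

object EmptyRecordPathDemo {
  // Hypothetical stand-in for an importer that assumes at least one path segment.
  def firstSegment(recordPath: Array[String]): Try[String] =
    Try(recordPath(0))

  def main(args: Array[String]): Unit = {
    // Empty path: a Failure wrapping ArrayIndexOutOfBoundsException
    println(firstSegment(Array.empty[String]))
    // Non-empty path: a Success holding the first segment
    println(firstSegment(Array("_", "_")))
  }
}
```

Indexing position 0 of an empty array is consistent with the "ArrayIndexOutOfBoundsException: 0" message in the 500 response.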
If I supply recordPath as an option in an HTTP request parameter, then CreateProjectCommand does not parse the content of my JSON file and only continues with the provided fields. So the project gets created, but it's obviously empty.
Is there anyone with more experience of the code that can help me figure out what's missing? My hypothesis at this point is that the JavaScript client manages this in two steps because the user has to provide a recordPath through the UI (i.e. by clicking on the JSON object property that they want). Would it be possible to emulate this programmatically?
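For reference, this is roughly how I would build the recordPath value programmatically. It is only a sketch under assumptions: RecordPathOptions is a hypothetical helper, the "_" segments for anonymous JSON nodes and the idea of sending the result as an "options" text part are my guesses from reading the code, and whether the server honours it is exactly what I'm asking.

```scala
object RecordPathOptions {
  // Escape backslashes and double quotes only; enough for simple path segments.
  private def quote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Serialize the path segments as {"recordPath":[...]}.
  def toJson(segments: Seq[String]): String =
    segments.map(quote).mkString("{\"recordPath\":[", ",", "]}")
}
```

With this, RecordPathOptions.toJson(Seq("_", "_")) yields {"recordPath":["_","_"]}, which could then be attached with something like .addTextBody("options", ...) on the multipart entity (the parameter name being an assumption).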
Finally, this problem does not exist with CSV files; they are imported correctly. I assume this is why the Python client only supports CSV input?
Thanks for any help you can provide!
Raff