Dear All,
The CAAL group is running an Arches Version 5.0 / 5.1 system for the display of cultural heritage data from Central Asia.
We are uploading a large number of records into our Arches system through CSVs. We have 20k+ records already in the system, using both point data and polygons, and they display correctly. We are now adding roughly another 20k records and have suddenly run into problems with Elasticsearch:
1) Our Palaeolithic data is being rejected because of the very early start date (-3000000).
2) We are receiving a large number of geometry errors (unable to tessellate shape, duplicate consecutive coordinates, etc.).
To my knowledge, we haven't seen these errors before, and we have an imminent deadline to honour.
Has anyone else come across these problems, and if so, do you have a solution? We are investigating whether we can configure the tolerance of Elasticsearch, and we are also pursuing geometry corrections through QGIS geometry simplification (a rough sketch of the kind of clean-up we are trying is below). If we find anything we will keep you posted, but in the meantime any help would be appreciated.
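In case it is useful, this is roughly the sort of clean-up we are experimenting with alongside the QGIS simplification; a minimal Python/Shapely sketch, where the WKT input and the function names are just examples from our own pre-processing rather than anything built into Arches:

    # Drop consecutive duplicate vertices and repair a polygon ring before the
    # CSV goes anywhere near the Arches importer (requires Shapely >= 1.8 for
    # make_valid).
    from shapely import wkt
    from shapely.geometry import Polygon
    from shapely.validation import make_valid

    def drop_duplicate_coords(coords):
        # Remove consecutive duplicate coordinate pairs from a ring or line.
        cleaned = [coords[0]]
        for pt in coords[1:]:
            if pt != cleaned[-1]:
                cleaned.append(pt)
        return cleaned

    def clean_polygon(wkt_string):
        geom = wkt.loads(wkt_string)
        exterior = drop_duplicate_coords(list(geom.exterior.coords))
        interiors = [drop_duplicate_coords(list(ring.coords)) for ring in geom.interiors]
        repaired = Polygon(exterior, interiors)
        if not repaired.is_valid:
            repaired = make_valid(repaired)
        return repaired.wkt

    # Example: a square with one duplicated vertex in its exterior ring.
    print(clean_polygon("POLYGON ((0 0, 0 1, 0 1, 1 1, 1 0, 0 0))"))

The idea is simply to strip duplicate consecutive coordinates and repair any invalid rings before import, which is what the tessellation errors seem to be complaining about.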
Does anyone know the smallest (most negative) number that Arches will accept as a date before it falls over?
Best wishes,
Bryan Alvey
Hi Alexei,
Thank you for taking the time to help me with this!
I am uploading 13k+ records for a resource type via CSV.
My problem occurred when I was trying to describe a site with Palaeolithic origins, which therefore has a start date of around -3,000,000 (3,000,000 BCE). When uploading the data, I was getting the following error:
    'type': 'mapper_parsing_exception',
    'reason': "failed to parse field [date_ranges.date_range] of type [integer_range] in document with id '1ec942a7-e76c-4ea6-8e21-9e39ed6fd4b9'. Preview of field's value: '-12999999899'",
    'caused_by': {
        'type': 'json_parse_exception',
        'reason': 'Numeric value (-12999999899) out of range of int\n at [Source: org.elasticsearch.common.bytes.AbstractBytesReference$MarkSupportingStreamInputWrapper@1a41a2d0; line: 1, column: 237
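As an aside, the 'out of range of int' part appears to come from the Elasticsearch integer_range mapping, which holds signed 32-bit integers, so any indexed value outside that window is rejected whatever calendar year it represents; purely for illustration:

    # Elasticsearch integer_range fields hold signed 32-bit integers, so any
    # packed date value outside [-2**31, 2**31 - 1] fails to parse at index time.
    INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

    def fits_es_integer_range(value):
        return INT32_MIN <= value <= INT32_MAX

    print(fits_es_integer_range(-12999999899))  # False, hence the mapper_parsing_exception
    print(INT32_MIN, INT32_MAX)                 # -2147483648 2147483647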
What I didn't outline in my plea for help (mea culpa) was that dates are stored in our Arches application as EDTF fields. Buried in the standards document on the Library of Congress website (https://www.loc.gov/standards/datetime/) is the section below:
Letter-prefixed calendar year
'Y' may be used at the beginning of the date string to signify that the date is a year, when (and only when) the year exceeds four digits, i.e. for years later than 9999 or earlier than -9999.
So when loading date data where the year has more than four digits, you must prefix it with the letter Y.
When I add the Y prefix to start and end dates whose years have more than four digits (i.e. are less than -9999 or greater than 9999), the data uploads correctly – and the response times in Arches improve markedly!
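For anyone else preparing CSVs, this is roughly the conversion we now apply to the year values before upload; a sketch only, assuming plain integer years in the spreadsheet, with a hypothetical helper name:

    # Render a year as an EDTF string, adding the 'Y' prefix the LoC spec
    # requires once the year has more than four digits (earlier than -9999 or
    # later than 9999). Ordinary years are written with exactly four digits.
    def to_edtf_year(year):
        if year < -9999 or year > 9999:
            return f"Y{year}"               # e.g. Y-3000000
        sign = "-" if year < 0 else ""
        return f"{sign}{abs(year):04d}"     # e.g. -0800, 2021

    print(to_edtf_year(-3000000))  # Y-3000000
    print(to_edtf_year(-800))      # -0800
    print(to_edtf_year(2021))      # 2021

Anything in the ordinary -9999 to 9999 window stays a normal four-digit EDTF year; only the longer years get the Y prefix.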
Result!
Thanks guys for all your help (and thanks Mahmoud!)
Best wishes,
Bryan