I am trying to add data to an rtree virtual table in sqlite but it changes the number from the 6th decimal place. I am saving a 7 decimal place number in the x values and it is not committing the exact number I tell it to.
RTREE was originally envisioned as an index, not as the primary data store.The idea is that if you have high-precision location information, you canstore that separately, either in a separate table or inAuxiliary Columns of theRTREE. The RTREE helps you to narrow down your search for records to asmall subset of the total table, but you might still need to check all the values in the results of an RTREE query to ensure that each row reallydoes meet your constraints.
DOWNLOAD ✒ https://t.co/yNRvmnWzXt
RTREE is designed such that the precision limitations of 32-bit floatsmight cause extra rows to appear in the result, if those rows are righton the edge, but rounding errors willnever exclude rows that ought to be in the result.
I doubt this is an SQLite problem, but from a whole lot of experience doing this sort of thing (one of our major systems do human and vehicle tracking), I can tell you that most GPS systems are not that accurate (7th decimal defines a negligible earth distance near the equator, if I recall correctly
I think a normal 32-bit Float (also called "Single") is only accurate to about 7/8 digits, and this translates badly to LAT/LON storage due to the degree range - that's 7 decimals for a value < 10, 6 decimals for a value < 100 and only 5 decimals accurate above 100. I'm quoting from memory so forgive me if one of these statements are not exact.
Latitude only goes from -90 to 90 (though anything over +/-87.5 becomes nonsense on the Google Mercator) so it's always 6 decimals+ accurate, Longitude is accurate as long as you are close to Greenwhich, but over 100deg or before -100 it gets only 5 decimals of accuracy, etc. 64-bit Floats give about 16 digits precision I think, which obliterates the problem, and as far as I know SQLite will store floats as 64bit whenever needed, so this sounds to me like a 32-bit Float problem on the reading side - what API are you using to read the values, and what size float do you store the value in within your application? Are any of these done in a 32bit Float variable?
as far as I know SQLite will store floats as 64bit whenever needed, so this sounds to me like a 32-bit Float problem on the reading side - what API are you using to read the values, and what size float do you store the value in within your application? Are any of these done in a 32bit Float variable?
The min/max-value pair columns are stored as 32-bit floating point values for "rtree" virtual tables or as 32-bit signed integers in "rtree_i32" virtual tables. Unlike regular SQLite tables which can store data in a variety of datatypes and formats, the R*Tree rigidly enforce these storage types. If any other type of value is inserted into such a column, the r-tree module silently converts it to the required type before writing the new record to the database.
Would a 64-bit version of the RTREE module ever be on the cards? While GPS measurement devices are rarely more accurate (as mentioned), we've found trying to place things in a World map on small scale (within a room, say) to be quite handy, and that wouldn't be possible at 32-Bit accuracy.
It's not a strong feature request - We do not use SQLite for the system I mentioned (as my ignorance w.r.t. rtree format clearly shows), but I was imagining an exportable SQLite file format, which in turn would benefit from rtree, but none of that left the drawing board yet. The OP's situation may also benefit.
RTREE might still do calculations as floats, losing a part of that precision,
but as Richard explained, you can post-process the matches to use the full 32-bit precision,
instead of the 23-fraction-bits that remain from the IEEE float.
I don't think it translates like that exactly, depends on the minimum change in value that can be had from a Float32, and if memory serves it was around 3cm (rather than your 1.1cm), however, that is close enough for happiness and suffices for the next point.
3cm (or 1.1) that can be had using all 7 digits in the decimals only is plenty good enough at human scales [7 digits = .1234567], which is why 7 decimals is accurate enough and used almost universally when representing LAT/LONs. This also holds for longitude at the equator.
Apologies, I thought you had meant to try store Integer values into the current Float32 format (which won't work)... I did indeed not realise you advocated for changing the RTREE format to store 32bit Integers rather than floats (which would work, but if you going to go that far, why not just upgrade it to 64-bit floats?).
I'm not advocating to change anything, I'm working around the limited precision of 32-bits floats in this case,
to use the existing rtree_i32 variant, that uses integers instead of floats for storage (but not computation).
Which I explicitly linked to in my previous post. Sorry, I was apparently not clear enough.
The "location precision" of "decimal degrees at the equator" (this is the minimum -- as you get further away from the equator the "distance" covered by a degree diminishes from 60 nautical miles at the equator to 0 angstroms at the poles) looks like the following:
Here is the "minimum epsilon" values for the various IEEE754 floating point types (note that the epsilon for binary128 and binary256 show as zero because the number is smaller than can be represented in binary64 output to 17 decimal places).
This would indicate that binary16 can hold a value that is accurate to 27 kilometres at the equator; binary32 can hold a value that is accurate to 111 meters at the equator; binary64 and longer can hold a value accurate to less than a millimeter. This, of course, depends on how many guard bits you need. These numbers are based on having a "guard decimal 0" (about 3.6 bits I believe).
Anything more accurate than a CA precise coordinates will require at least binary64 storage (note that this is easily obtained with pseudorange equipment that was available before the turn of the century when the appropriate "error dispersion" corrections are applied).
It sounds like it is BETTER to use an RTREE_i32 instead of float-32 RTREE by multiplying by some power of 2? or 10,000,000? And, (in almost all cases?) one will get more accuracy by multiplying by a number to turn it into an int instead of using a 32-bit float (and then dividing by the same number when taking it out of the database).
@ddevienne points out an issue that I'm not sure how it affects things (specifically about the 23-fraction bits): "RTREE might still do calculations as floats, losing a part of that precision,but as Richard explained, you can post-process the matches to use the full 32-bit precision, instead of the 23-fraction-bits that remain from the IEEE float." I'm not sure what the implication of this is for using Geo / world coordinates.
Also, Keith Medcalf (kmedcalf) and Ryan Smith (cuz) makes a point about accuracy, but I'm not sure I grasp the implications for int32/float32 (which may only have 23 bits for -180/+180 lng and +90/-90 lat?)? ... I think the take away is still that one gets more accuracy by using RTREE_i32.
... and, it sounds like you get good (down to a few mm?) coverage of the entire world (lat/lng) using an RTREE_i32 (with multiplying/dividing by 10,000,000 (or some other number) when one writes/reads a value from the RTREE table.
That said, ALL of the storage stores exactly 32 bits, the difference is only in how those 32 bits are mapped. For instance a Float32 (IEEEE754) floating point format maps 1 sign bit, 23 bits used for the mantissa (significant digits) and 8 bits for the exponent. That is, only 23 of the bits really carry information on the mantissa or precision of the value, the other bits move the floating point or change the sign. 23 bits allow us 7 decimal digits of precision.
With a 32-bit Integer, there's still 1 sign bit, but all the other 31 bits carry precision information, allowing us 9 decimal digits of precision, and if we stick to below 2.1Bil, then we can use 10 digits.
What Richard suggests is that you could use the RTREE index to simply narrow down your search and then look up the actual (very precise) values in an adjacent table. i.e. use the RTREE as an index next to your data table. This means you can save lots of other fields also, and as high precision as you want, while still having superfast lookups using the index (even if it may hit one or two extra references), so this brings a lot of advantages, but with more data and an extra step in all lookups.
What Dominique suggests is that knowing the the underlying value is indeed 32 bits, design a translation function that converts the stored 32 bits to another more accurate format than FLoat32, much like a type-cast, but of your own design.More precisely, using the i32 version of RTREE in which all 32 bits (1 sign + 31 data) in storage are already in a format that is clear and easy to work with (Integer) and for the conversion function simply multiply the coordinate by 107 (or 10,000,000) to ensure you push all 7 useful significant LAT/LON decimals into the full Integer value, which, as long as you stay under 2.1 Billion (which +/-1.8Billion does), will fit snugly in those 32 bits.
This way you will have pinpoint accuracy (better than 3cm) over the total of Earth's surface at the average ground-level, but you will need those extra conversions everywhere you reference or add the data points.
Note that the accuracy is a rough estimation because the area marked out by 4 LAT/LON coordinates will grow in size with height, because projection lines from the center of a sphere will forever grow further apart as they continue outward, ditto for the quoted accuracy - it gets better as you come closer to ground/sea-level, and better still as you go underground.
IF you are okay with "casting" your LAT/LON values into an 32-bit Integer, by multiplying each by 10^7 (10,000,000) when you insert into the Sqlite RTREE_32 Table, and dividing by 10^7 when you read the values out...
795a8134c1