type BAST interface {
    AddRow() error
    IsLeft(interface{}, Cursor) bool
    IsRight(interface{}, Cursor) bool
    IsEqual(interface{}, Cursor) bool
    IsEmpty(Cursor) bool
    …
But I don't want to tie my users down to integers and floats. What if they want to sort strings? What if they want to sort complex numbers, matrices, or what have you?
type Node struct {
    Element
    Left  *Node
    Right *Node
}

func (a Node) LeftEqual() bool {
    if a.Left == nil {
        return false
    }
    return a.Equal(a.Left.Element)
}

type Element interface {
    Equal(Element) bool
    Null() bool
}
-j
On 23 April 2018 at 19:58, Louki Sumirniy
<louki.sumir...@gmail.com> wrote:
> I set many of those identifiers to export because I wanted to keep separate
> derived libraries that replaced the storage types embedded into the main
> library. These limitations would not exist were the 'back store' over the
> wire or on disk. They only exist for the storage that is most accessible to
> the program, the main memory.
I don't understand this. If you're replacing the underlying storage
types, why would you need to export details of an algorithm that the
storage types shouldn't need to have anything to do with?
> Regarding the use of zero as a nil, that is already accepted by numerous
> built-in types (error, for example), and in this case the reason I have
> put that into the *abstract* type is that the data I am writing this
> algorithm for is hashes, which are never zero in practice: over a
> period of decades, no hash function has produced zero from any significant
> amount of known data.
Why does your algorithm need to know about the possible nil-ness of
the stored items? It doesn't use that property now, at any rate.
> This idea that I should be hiding all the functions smells like OOP to me.
This isn't OOP - it's just good code hygiene. By hiding implementation
details, you make the public interface clear and you prevent clients
from relying on internal implementation details that you might want to
change.
> As it stands with how it has to be implemented in Go, accessing these
> would-be-private-in-C++ members is very cumbersome due to the need to use
> type assertions. I have tinkered with this for educational value but I don't
> want to have to deal with it in the actual implementation. A key insight
> of Schneier's work is that 'security by obscurity' is weak. Private
> functions fit that maxim pretty well.
Private functions are not "security by obscurity". Private functions
aren't "obscure" (you can see them in the source perfectly well) but
they are "secure" because you cannot access them.
> I think that there is no real benefit in doing this,
> since after all, this is the source, and if you can be told you can't access
> that thing that is 'private' then it's not terribly private in reality; it
> is only excluded from your programmatic access, for arbitrary philosophical
> reasons, unless you fork the code.
As above, doing this is considered good code hygiene. If you do decide
to fork the code and make something public, then it's obvious that
you're breaking the API boundaries.
> The search functions are already in there, it's a piece of cake, given the
> ability to test for equality, lesser and greater. That was probably one of
> the first things I implemented, because it's so simple. Nobody has ever
> implemented a binary search tree with a dense store like this before, and so
> the insert and delete functions are not there yet. Just use your imagination
> for a little bit, even with a 4 row *dense* tree and consider what happens
> when I delete a node with children, let's say, at row 3, with two children.
> Where do those children go? They are now orphans, as their parent has been
> cut out of the tree. So, and critical to the design of BAST, is that it be
> possible to perform short optimisations during delete/insert write
> operations that do not cost too much time, but at the same time, do not
> cause the search cost to become excessively spread out, and specifically
> with delete, you can't have orphans. They have to be pushed up and inwards,
> it's not complicated, but it gets more complicated the more nodes are
> children of the severed node.
>
> It's not a model for tree structures that ANYONE is familiar with.
I think you overestimate the novelty of your algorithm. It sounds to
me like it bears significant resemblance to a "complete binary tree"
(see https://xlinux.nist.gov/dads/HTML/completeBinaryTree.html).
> Unlike simple, stupid newbie mistakes, this is intentional. I am writing this
> algorithm because it, by its architecture, reduces random access of memory,
> and in this, it gains a benefit of reduced cache misses in the CPU and thus
> will have, potentially, a serious performance advantage compared to any
> search-and-sort algorithm whether it be hashtables or vector reference based
> binary search trees.
You keep on making references to the efficiency of this algorithm, but
I don't see any benchmarks or algorithm analysis that supports your
viewpoint here.
I'd suggest starting with the basic algorithm without any abstraction
(just hard-code in the type you want to store), then benchmark/tweak
the algorithm, and only then try to make it general.
--
I worked for a little while on the C++ server application for the Steem network node, intending to remove a whole swathe of code relating to protocol changes at various hard forks. I repeatedly ran across poorly ordered if/then chains (not even using switch!) that performed unnecessary comparisons in more cases than not; to me, that explained the ballooning amount of time required to replay the blockchain into the shared database file.
Any code that keeps data aligned to memory-page and disk-page sizes is significantly faster, because misalignment can double the amount of memory that has to be accessed to satisfy a request. This is why binary heaps are way slower than B-heaps.