Arc Forumnew | comments | leaders | submitlogin
4 points by almkglor 6158 days ago | link | parent

Suppose we have two f<, f<old and f<new, to be used in a binary search tree b.

f<new may safely be used to replace f<old iff:

  all x in b: all y in b: (f<old x y) == (f<new x y)
This stems from the binary sort tree invariants:

  all x in b!l: (f<old x b!v)
  all x in b!r: (f<old b!v x)
And the definition of all x of b:

  all x of b = {all x in b!l, b!v, all x in b!r}
  all x of nil = {}
This is because if even a single case is wrong, that part of the binary search tree concerned with keeping track of the correct order becomes wrong. In such a case, it is necessary to rebuild the tree.

Of course, note that the above condition is necessary only for all current existing nodes, i.e. if we have a new m:

  all x in b: m != x
then it is okay that:

  all x in b: (f<old m x) != (f<new m x)
While your style allows such an edge case, it is arguably such an edgy edge case that you might as well rebuild the tree if f<old != f<new; this is largely because you have to check that f<new == f<old for all current members of the tree, and that checking is O(N^2), whereas rebuilding the tree is O(N) taking O(2N) space (and your nondestructive bst takes O(N) space for each operation anyway).


3 points by pg 6157 days ago | link

I was thinking about cases where you'd be willing to tolerate a small amount of misordering in the tree (since you were till then using the old scoring function, which presumably misordered things or you wouldn't have improved it). But deletion wouldn't work if you changed the scoring fn, and the number of cases where you both want to change the ordering function and never need to do deletions must be small enough that a general-purpose bst lib doesn't need to support that.

-----

3 points by almkglor 6157 days ago | link

I don't think you quite understand. A misordered binary search tree will fail lookups - (bst-find ...) will fail, even if that object is returned by (bst-elts ...). If you're going to create a general-purpose bst lib with "you can change the sort function without rebuilding the tree!" then you'd better make plenty damn sure there's a big warning saying "...but if you do (bst-find ...) might fail." Personally, I'd rebuild the tree anyway, because if the change is significant enough to change the sort function, it's big enough to completely destroy the meaning of the binary search tree.

-----

3 points by pg 6156 days ago | link

I get that. When I think of using a bst I think of what News.YC needs: the top-ranked 100 or so stories out of 100,000, and no need for find or delete. (Though currently News.YC just truncates the list and sorts it, which would stop working at very high submission rates.)

-----

1 point by almkglor 6156 days ago | link

c/ref Mythical Man-Month, Frederick P. Brooks.

A programming systems product takes about nine times as much effort as the component programs written separately for private use. I estimate that productizing imposes a factor of three; and that designing, integrating, and testing components into a coherent system imposes a factor of three; and that these cost components are essentially independent of each other.

-----