B-Tree

liuxinyu

unread,

May 30, 2010, 3:28:02 AM5/30/10

to TopLanguage

先粘一段CLRS上的chapter notes:
"B-trees were introduced by Bayer and McCreight in 1972 [32]; they did
not explain their choice of name."

WIKI上：
" Douglas Comer explains:
The origin of "B-tree" has never been explained by the authors. As we
shall see, "balanced," "broad," or "bushy" might apply. Others suggest
that the "B" stands for Boeing. Because of his contributions, however,
it seems appropriate to think of B-trees as "Bayer"-trees. (Comer
1979, p. 123 footnote 1)"

由于复杂和篇幅原因，CLRS上没有给Delete部分的算法pseudo code。而是作为作业题18.3-2留下来了。
扩展一下，可以不仅仅implement B-Tree，而且实现下B+-Tree和B*-Tree。

先开个头，后继跟帖给实现(C++, Haskell, Scheme/Lisp和Python)

--
刘新宇

liuxinyu

unread,

Jun 1, 2010, 3:41:18 AM6/1/10

to TopLanguage

继续：

CLRS上关于B-Tree的章节，是从磁盘存取这样的问题开始导入B-Tree的概念的。
当然B-Tree的确大量应用于文件系统和数据库。但是如果从另一个角度理解B-Tree，也很有意思：

B-Tree可以认为是排序二叉树(BST: Binary search tree)的扩展，换言之，平衡的排序二叉树是一种特殊的B-Tree。

从BST来理解B-Tree就忒别容易和自然了。

回顾下排序二叉树的定义(前序pre-order定义)：
一个排序二叉树要么为空，要么包含：
一个可比较的数值key；
一个左子树，其中所有左子树中的值都小于key
一个右子树，其中所有右子树中的值都大于key
（我们忽略相等的情况）
一句话概括为，1个值2个子树。

现在我们扩充下二叉树的前序定义：
1，存在不止一个数值，而是一组数值key_1, key_2, ..., key_n
2，自然而然，子树也不止2个，而是t_1, t_2, ..., t_{n+1}
3，类似关系仍然满足
any_value(t_1)<key1<any_value(t_2)<key2<...<any_value(t_n)<key_n<any_value(t_{n
+1})

这就初步得到了一个B-tree的部分定义（不完全）。
然后再加上平衡限制就好了：
1，每个节点的子树最少0个（叶子节点），最多2*t个，其中t是一个固定的常量
2，所有叶子节点的深度一样。

那么如果在插入（生成）过程中，万一超过2*t的限制了，咱么半？答案是分裂：
插入时自root开始，按照类似二叉树插入的路径，检查每个节点，如果路径上任何一个节点的子树达到了2*t，就先分裂之，一变二。
CLRS给出的方法是沿着路径预先分裂，所以不用回溯回去。这样每个节点也不用存储父指针parent。
（对比"Data Structures and Algorithms with Object-Oriented Design
Patterns in Python"中的实现
http://www.brpreiss.com/books/opus7/html/page344.html#SECTION0010720000000000000000）
就会发现他们的不同之处，CLRS的实现特别方便使用函数式编程语言实现，这个我以后会再讨论。

现在先给出一个Python版本的insert。这次，我特别想到用c++的class template特别有利于定义B-tree的
minimum degree。
大体如下：
template<int t> struct BTree{
//t is minumum degree
};
这样，可以这样在client program中使用：
BTree t = new BTree<3>(...);

C++的代码我以后会给出。

TREE_2_3_4 = 2 #by default, create 2-3-4 tree

class BTreeNode:
def __init__(self, t=TREE_2_3_4, leaf=True):
self.leaf = leaf
self.t = t
self.keys = [] #self.data = ...
self.children = []

# self: (...x, key[i]=k, ...) ==>
# self: (...x, key'[i], y, key'[i+1]=k...)
def split_child(self, i):
t = self.t
x = self.children[i]
y = BTreeNode(t, x.leaf)
self.keys.insert(i, x.keys[t-1])
self.children.insert(i+1, y)
y.keys = x.keys[t:]
x.keys = x.keys[:t-1]
if not y.leaf:
y.children = x.children[t:]
x.children = x.children[:t]
#disk_write(y)
#disk_write(z)
#disk_write(x)

def is_full(self):
return len(self.keys) == 2*self.t-1

以上是BTree节点的定义。我把节点分裂做为成员方法提供。其含义是，本节点的第i个子树已满，将其对半分裂之。
缺省情况下，如果不提供minimum degree，则默认是2-3-4树。

对于磁盘的读写，暂时都注释掉了，后面我会尝试从文件系统读写节点的内容。

下面是建立一个空树：
def B_tree_create(t=TREE_2_3_4):
x = BTreeNode(t)
#disk_write(x)
return x

插入实现如下：
def B_tree_insert(tr, key): # + data parameter
root = tr
if root.is_full():
s = BTreeNode(root.t, False)
s.children.insert(0, root)
s.split_child(0)
root = s
B_tree_insert_nonfull(root, key)
return root

这是CLRS上的方法，也就是自顶向下，预先分裂法。分裂好后，就可以确信任何插入都不会造成节点子树的数目过大了。所以可以放心调用
B_tree_insert_nonfull

def B_tree_insert_nonfull(tr, key):
if tr.leaf:
orderred_insert(tr.keys, key)
#disk_write(tr)
else:
i = len(tr.keys)
while i>0 and key < tr.keys[i-1]:
i=i-1
#disk_read(tr.children[i])
if tr.children[i].is_full():
tr.split_child(i)
if key> tr.keys[i-1]:
i = i+1
B_tree_insert_nonfull(tr.children[i], key)

这里面orderred_insert是实现，向一个有序列表插入数值，这是由于Python标准库（最小的那个）不提供orderred
collection
def orderred_insert(lst, x):
i = len(lst)
lst.append(x)
while i>0 and lst[i]<lst[i-1]:
(lst[i-1], lst[i]) = (lst[i], lst[i-1])
i=i-1
这个算法也挺有意思，我把带插入值直接放到列表最后，因为一般append的代价最小。若用link list实现，代价为O(1)
然后我从后向前，如果发现违反有序条件，则依次交换相邻元素。

为了方面输出，我写了一个B-tree到字符串的变换函数，是中序方式：
def B_tree_to_str(tr):
res = "("
if tr.leaf:
res += ", ".join(tr.keys)
else:
for i in range(len(tr.keys)):
res+= B_tree_to_str(tr.children[i]) + ", " + tr.keys[i] +
", "
res += B_tree_to_str(tr.children[len(tr.keys)])
res += ")"
return res

然后有一个方便从list生成B-tree的auxilary函数（从key-value对生成B-tree，从而实现类似电子词典的功能后继会给
出）
def list_to_B_tree(l, t=TREE_2_3_4):
tr = B_tree_create(t)
for x in l:
tr = B_tree_insert(tr, x)
return tr

下面就可以测试了：
class BTreeTest:
def __init__(self):
print "B-tree testing"

def run(self):
self.test_insert()

def test_insert(self):
lst = ["G", "M", "P", "X", "A", "C", "D", "E", "J", "K", \
"N", "O", "R", "S", "T", "U", "V", "Y", "Z"]
print "2-3-4 tree of", lst
tr = list_to_B_tree(lst)
print B_tree_to_str(tr)
print "B-tree with t=3 of", lst
print B_tree_to_str(list_to_B_tree(lst, 3))

输出结果如下：
# ./btree.py
B-tree testing
2-3-4 tree of ['G', 'M', 'P', 'X', 'A', 'C', 'D', 'E', 'J', 'K', 'N',
'O', 'R', 'S', 'T', 'U', 'V', 'Y', 'Z']
(((A, D), C, (E)), G, ((J, K), M, (N)), P, ((O), R, (S)), T, ((U), V,
(X, Y, Z)))
B-tree with t=3 of ['G', 'M', 'P', 'X', 'A', 'C', 'D', 'E', 'J', 'K',
'N', 'O', 'R', 'S', 'T', 'U', 'V', 'Y', 'Z']
((A, C, J), D, (E, G, K), M, (N, O), P, (R, S), T, (U, V, X, Y, Z))

暂时到这里，后继跟帖我会继续写。

--
刘新宇
http://sites.google.com/site/algoxy/

archer

unread,

Jun 1, 2010, 6:34:52 PM6/1/10

to pon...@googlegroups.com

Linux最近一个新的文件系统btrfs，就是基于btree的：http://en.wikipedia.org/wiki/Btrfs

2010/6/1 liuxinyu <liuxi...@gmail.com>

liuxinyu

unread,

Jun 1, 2010, 9:44:45 PM6/1/10

to TopLanguage

Hi,

修正昨天post中的一些错误

1.
昨天贴的代码中的一个bug:

<<<before<<<
def B_tree_insert_nonfull(tr, key):
#...

if key>tr.keys[i-1]:
i = i+1

#...

fix后的运行结果为：

# ./btree.py
B-tree testing
2-3-4 tree of ['G', 'M', 'P', 'X', 'A', 'C', 'D', 'E', 'J', 'K', 'N',
'O', 'R', 'S', 'T', 'U', 'V', 'Y', 'Z']

(((A), C, (D)), E, ((G, J, K), M, (N, O)), P, ((R), S, (T), U, (V), X,
(Y, Z)))

B-tree with t=3 of ['G', 'M', 'P', 'X', 'A', 'C', 'D', 'E', 'J', 'K',
'N', 'O', 'R', 'S', 'T', 'U', 'V', 'Y', 'Z']

((A, C), D, (E, G, J, K), M, (N, O), P, (R, S), T, (U, V, X, Y, Z))

中序输出为字母顺序(alphabetic order)

2. 错误的表述：
B-Tree的平衡限制为：
除叶子节点外，任何中间节点的子树数目为t到2*t之间，也就是说，最少有t棵子树，最多有2t棵子树。
根节点不在此限制之列。

故而，对于t=2时，构成2-3-4树，亦即任何中间几点的子树数目为2或3或4。

--
刘

>>>fixing>>>
def B_tree_insert_nonfull(tr, key):
#...
if key>tr.keys[i]:
i = i+1
#...

On Jun 1, 3:41 pm, liuxinyu <liuxiny...@gmail.com> wrote:
> 继续：
>
> CLRS上关于B-Tree的章节，是从磁盘存取这样的问题开始导入B-Tree的概念的。
> 当然B-Tree的确大量应用于文件系统和数据库。但是如果从另一个角度理解B-Tree，也很有意思：
>
> B-Tree可以认为是排序二叉树(BST: Binary search tree)的扩展，换言之，平衡的排序二叉树是一种特殊的B-Tree。
>
> 从BST来理解B-Tree就忒别容易和自然了。
>
> 回顾下排序二叉树的定义(前序pre-order定义)：
> 一个排序二叉树要么为空，要么包含：
> 一个可比较的数值key；
> 一个左子树，其中所有左子树中的值都小于key
> 一个右子树，其中所有右子树中的值都大于key
> （我们忽略相等的情况）
> 一句话概括为，1个值2个子树。
>
> 现在我们扩充下二叉树的前序定义：
> 1，存在不止一个数值，而是一组数值key_1, key_2, ..., key_n
> 2，自然而然，子树也不止2个，而是t_1, t_2, ..., t_{n+1}
> 3，类似关系仍然满足

> any_value(t_1)<key1<any_value(t_2)<key2<...<any_value(t_n)<key_n<any_value( t_{n

> +1})
>
> 这就初步得到了一个B-tree的部分定义（不完全）。
> 然后再加上平衡限制就好了：
> 1，每个节点的子树最少0个（叶子节点），最多2*t个，其中t是一个固定的常量
> 2，所有叶子节点的深度一样。
>
> 那么如果在插入（生成）过程中，万一超过2*t的限制了，咱么半？答案是分裂：
> 插入时自root开始，按照类似二叉树插入的路径，检查每个节点，如果路径上任何一个节点的子树达到了2*t，就先分裂之，一变二。
> CLRS给出的方法是沿着路径预先分裂，所以不用回溯回去。这样每个节点也不用存储父指针parent。
> （对比"Data Structures and Algorithms with Object-Oriented Design

> Patterns in Python"中的实现http://www.brpreiss.com/books/opus7/html/page344.html#SECTION00107200...

liuxinyu

unread,

Jun 3, 2010, 5:25:02 AM6/3/10

to TopLanguage

Hi,

今天给出B-Tree的删除程序。CLRS上没有给出这段程序的伪代码。但是有一个很清晰的描述。

删除时如果做到希望一次自顶向下搞定，而不需要回溯来保持B-Tree的特性，主要采取了如下的策略：
自顶向下检查，如果当前检查的中间节点的子树数目过少（<t，其中t是minimum degree）就通过一系列变换来增加节点的子树数目，然后继续
向下删除之。

这一系列变换无非包括以下几种：
1. 将两子节点和其中的key值合并成一个节点：(...child_x, key, child_y...)==>(...(child_x,
key, childy)...)
2, 向左侧或者右侧兄弟节点借一个key和child过来

下面是python代码，为了方便操作，增加了一些辅助成员函数：
class BTreeNode:
#...
def merge_children(self, i):
#merge children[i] and children[i+1] by pushing keys[i] down
self.children[i].keys += [self.keys[i]]+self.children[i
+1].keys
self.children[i].children += self.children[i+1].children
self.keys.pop(i)
self.children.pop(i+1)
#disk_write(self)
#disk_write(self.children[i])

def replace_key(self, i, key):
self.keys[i] = key
#disk_write(self)
return key

def can_remove(self):
return len(self.keys) >= self.t

其中merge_children用来将第i个child,第i个key和第i+1个child合并成一个B-Tree节点。
replace_key很简单，用来将第i个key替换成新值, can_remove用来判断一个节点的子树数目是否过少。

删除的主函数比较长，基本上就是CLRS上描述的一个实现。希望有人能指出如何简化：
def B_tree_delete(tr, key):
i = len(tr.keys)
while i>0:
if key == tr.keys[i-1]:
if tr.leaf: # case 1 in CLRS
tr.keys.remove(key)
#disk_write(tr)
else: # case 2 in CLRS
if tr.children[i-1].can_remove(): # case 2a
key = tr.replace_key(i-1,
tr.children[i-1].keys[-1])
B_tree_delete(tr.children[i-1], key)
elif tr.children[i].can_remove(): # case 2b
key = tr.replace_key(i-1, tr.children[i].keys[0])
B_tree_delete(tr.children[i], key1)
else: # case 2c
tr.merge_children(i-1)
B_tree_delete(tr.children[i-1], key)
if tr.keys==[]: # tree shrinks in height
tr = tr.children[i-1]
return tr
elif key > tr.keys[i-1]:
break
else:
i = i-1
# case 3
if tr.leaf:
return tr #key doesn't exist at all
if not tr.children[i].can_remove():
if i>0 and tr.children[i-1].can_remove(): #left sibling
tr.children[i].keys.insert(0, tr.keys[i])
tr.keys[i] = tr.children[i-1].keys.pop()
if not tr.children[i].leaf:

tr.children[i].children.insert(tr.children[i-1].children.pop())
elif i<len(tr.children) and tr.children[i+1].can_remove():
#right sibling
tr.children[i].keys.append(tr.keys[i])
tr.keys[i]=tr.children[i+1].keys.pop(0)
if not tr.children[i].leaf:
tr.children[i].children.append(tr.children[i
+1].children.pop(0))
else: # case 3b
if i>0:
tr.merge_children(i-1)
else:
tr.merge_children(i)
B_tree_delete(tr.children[i], key)
if tr.keys==[]: # tree shrinks in height
tr = tr.children[0]
return tr

测试一下，我按照CLRS中图18.8制作了一颗B-tree，然后顺序删除一些节点：
def test_delete(self):
print "test delete"
t = 3
tr = BTreeNode(t, False)
tr.keys=["P"]
tr.children=[BTreeNode(t, False), BTreeNode(t, False)]
tr.children[0].keys=["C", "G", "M"]
tr.children[0].children=[BTreeNode(t), BTreeNode(t),
BTreeNode(t), BTreeNode(t)]
tr.children[0].children[0].keys=["A", "B"]
tr.children[0].children[1].keys=["D", "E", "F"]
tr.children[0].children[2].keys=["J", "K", "L"]
tr.children[0].children[3].keys=["N", "O"]
tr.children[1].keys=["T", "X"]
tr.children[1].children=[BTreeNode(t), BTreeNode(t),
BTreeNode(t)]
tr.children[1].children[0].keys=["Q", "R", "S"]
tr.children[1].children[1].keys=["U", "V"]
tr.children[1].children[2].keys=["Y", "Z"]
print B_tree_to_str(tr)
lst = ["F", "M", "G", "D", "B"]
reduce(self.__test_del__, lst, tr)

def __test_del__(self, tr, key):
print "delete", key
tr = B_tree_delete(tr, key)
print B_tree_to_str(tr)
return tr

输入结果和CLRS上描述完全一致：
test delete
(((A, B), C, (D, E, F), G, (J, K, L), M, (N, O)), P, ((Q, R, S), T,
(U, V), X, (Y, Z)))
delete F
(((A, B), C, (D, E), G, (J, K, L), M, (N, O)), P, ((Q, R, S), T, (U,
V), X, (Y, Z)))
delete M
(((A, B), C, (D, E), G, (J, K), L, (N, O)), P, ((Q, R, S), T, (U, V),
X, (Y, Z)))
delete G
(((A, B), C, (D, E, J, K), L, (N, O)), P, ((Q, R, S), T, (U, V), X,
(Y, Z)))
delete D
((A, B), C, (E, J, K), L, (N, O), P, (Q, R, S), T, (U, V), X, (Y, Z))
delete B
((A, C), E, (J, K), L, (N, O), P, (Q, R, S), T, (U, V), X, (Y, Z))

未完。
后继我会给出search实现，然后给出Haskell的FP实现，希望能给出一个简洁些的程序

--
刘新宇
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 3, 2010, 11:04:04 PM6/3/10

to TopLanguage

Hi,

B-Tree的python version删除程序。下一步给出Haskell的纯FP实现。
def B_tree_search(tr, key):
for i in range(len(tr.keys)):
if key<= tr.keys[i]:
break
if key == tr.keys[i]:
return (tr, i)
if tr.leaf:
return None
else:
if key>tr.keys[-1]:
i=i+1
#disk_read
return B_tree_search(tr.children[i], key)

测试：
class BTreeTest:
def test_search(self):

lst = ["G", "M", "P", "X", "A", "C", "D", "E", "J", "K", \
"N", "O", "R", "S", "T", "U", "V", "Y", "Z"]

tr = list_to_B_tree(lst, 3)
print "test search\n", B_tree_to_str(tr)
for i in lst:
self.__test_search__(tr, i)
self.__test_search__(tr, "W")

def __test_search__(self, tr, k):
res = B_tree_search(tr, k)
if res is None:
print k, "not found"
else:
(node, i) = res
print "found", node.keys[i]

输出结果：
test search

((A, C), D, (E, G, J, K), M, (N, O), P, (R, S), T, (U, V, X, Y, Z))

found G
found M
found P
found X
found A
found C
found D
found E
found J
found K
found N
found O
found R
found S
found T
found U
found V
found Y
found Z
W not found

--
刘
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 7, 2010, 9:56:12 PM6/7/10

to TopLanguage

今天给出Haskell的B-Tree insert实现。

为了增加趣味性，我采取了和Python版本实现完全不同的策略。
Python版本的实现，基本上是原样根据CLRS上的算法，在插入时，先检查路径上经过的节点是否已经满了，若满则进行分裂操作。
这样，整个算法自顶向下仅仅traverse一遍，不需要回溯。

Haskell算法，我采取了类似Chris Okasaki在红黑树实现时的策略：直接插入，然后递归修复，以保持树的平衡特性。
这样算法看起来特别一致简洁。

首先是树的定义：
data BTree a = Node{ keys :: [a]
, children :: [BTree a]
, degree :: Int} deriving (Eq, Show)

empty deg = Node [] [] deg
一个节点包含n个key和n+1个child，并且还有一个整形值，表示B-tree的minimum degree
然后我提供了一个empty函数，供方便生成一个空的B-tree

我同事还提供了一个full函数，用以判断一个节点是否已经满了：
full::BTree a -> Bool
full tr = (length $ keys tr) >= 2*(degree tr)-1

然后我们看最核心的插入函数：
insert :: (Ord a)=> BTree a -> a -> BTree a
insert (Node ks [] t) x = fix $ Node ([a|a<-ks, a<x]++[x]++[b|b<-ks,
b>x]) [] t
insert (Node ks cs t) x = fix $ merge left (insert c x) right
where
left = (ks' , cs' )
right = (ks'', cs'')
(ks', ks'') = partition (< x) ks
cs' = take (length ks') cs
cs''= drop (length cs' +1) cs
c = head $ drop (length cs') cs
这个函数接受一个待插入的B-tree和一个值，然后尝试把这个值插入。总共只有两种情况：
1，当前的节点是一个叶子节点，所以代表children的列表为空。这是我们只要把key按大小顺序插入即可，当然，插入后可能不再满足B-tree
不能满的约束，所以用一个fix函数来修正之，fix的定义我后面给出。注意其中的list comprehension，极大地简化了程序；
2，当前节点为普通节点，我们只要根据待插入值x的大小，找到合适的位置，然后向此位置的子节点递归插入即可。找到此位置后，真个节点就分成了三部分，
左侧部分，待插入子节点和右侧部分；递归调用后我们需要把此3部分合并，如果出现了过满的情况，还要通过fix进行分裂操作。

其中，左侧部分，右侧部分的寻找我使用了partition函数，其定义在Data.List中，当然，也可以用类似的list
comprehension来实现之。
这里特别受益于Haskell库中的take和drop的特性：
take n [] == drop n[] == []，所以我不需要进行额外的是否为空的判断。

用来修正b-tree过满的函数定义如下：
fix :: BTree a -> BTree a
fix tr = if full tr then split tr else tr

split :: BTree a -> BTree a
split (Node ks cs t) = Node [k] [c1, c2] t where
c1 = Node (take (t-1) ks) (take t cs) t
c2 = Node (drop t ks) (drop t cs) t
k = head (drop (t-1) ks)
也就是说，如果一个节点的key的个数超过2*t-1，则说明需要分裂，然后split函数会把节点对称一分为二。

将左右节点合并的函数实现如下：

merge :: ([a], [BTree a]) -> BTree a -> ([a], [BTree a]) -> BTree a
merge (ks', cs') (Node [k] cs t) (ks'', cs'') = Node (ks'++[k]++ks'')
(cs'++cs++cs'') t
merge (ks', cs') c (ks'', cs'') = Node (ks'++ks'') (cs'++[c]++cs'')
(degree c)

合并时有两种情况：
1，中间的待插入节点，插入后发生了分裂，这时，我们需要把分裂后提升上来的新key合并进去；
2，中间待插入及诶单插入后没有发生分裂，这是仅仅需要把新的子节点更新即可。

为了方便打印输出，我给了一个把B-tree按照中序转换成字符串的函数：

toString :: (Show a)=>BTree a -> String
toString (Node ks [] _) = "("++(concat $ intersperse "," (map show ks))
++")"
toString tr = "("++(toStr (keys tr) (children tr))++")" where
toStr (k:ks) (c:cs) = (toString c)++", "++(show k)++", "++(toStr
ks cs)
toStr [] [c] = toString c

其中使用了intersperse函数，它类似python的join，定义在Data.List中。

然后可以测试了：
listToBTree::(Ord a)=>[a]->Int->BTree a
listToBTree lst t = foldl insert (empty t) lst

--test
testInsert = toString $ listToBTree "GMPXACDEJKNORSTUVYZ" 3

GHCi输出如下：
*BTree> testInsert
"((('A','C'), 'D', ('E','G','J','K'), 'M', ('N','O')), 'P',
(('R','S'), 'T', ('U','V'), 'X', ('Y','Z')))
请注意，结果和Python的略有不同，这是因为插入策略的不同。但是仔细验证，这一结果符合B-tree特性，结果是正确的。

--
刘
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 9, 2010, 1:37:54 AM6/9/10

to TopLanguage

几处简化：

使用Data.List可以将上次给出的Haskell程序简化一下。

首先，为了避免名字空间冲突：
import qualified Data.List as L

简化1：
before:

Node ([a|a<-ks, a<x]++[x]++[b|b<-ks, b>x]) [] t

after:
Node (L.insert x ks) [] t

简化2:
before:

cs''= drop (length cs' +1) cs
c = head $ drop (length cs') cs

after:

cs' = take (length ks') cs

(c:cs'')= cs L.\\ cs'

简化3：
before:

"("++(concat $ intersperse "," (map show ks)) ++")"

after:
"("++(L.intercalate "," (map show ks))++")"

后继我会给出FP实现的remove
--
刘
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 21, 2010, 1:30:19 AM6/21/10

to TopLanguage

Hi,

前一阵太忙，今天我继续给出B-tree的纯函数式解法。由Haskell实现。

我对Insert做了大幅改动。以前给出的merge函数含义不明，并且比较tricky。现在Insert部分结合Delete的需要改动如下。

首先，我显示给出了B-tree的限制条件，除根节点以外的其他节点，最多可以有2*t-1个key，最少可以有t-1个key

full::BTree a -> Bool

full tr = (length $ keys tr) > 2*(degree tr)-1

low::BTree a -> Bool
low tr = (length $ keys tr) < (degree tr)-1

插入函数变为如下：

insert :: (Ord a)=> BTree a -> a -> BTree a

insert tr x = fixRoot $ ins tr x

ins :: (Ord a) => BTree a -> a -> BTree a
ins (Node ks [] t) x = Node (L.insert x ks) [] t
ins (Node ks cs t) x = make (ks', cs') (ins c x) (ks'', cs'')
where
(ks', ks'') = L.partition (<x) ks
(cs', (c:cs'')) = L.splitAt (length ks') cs

插入实际上是调用ins这个函数后对root进行fix；其中fixRoot定义为：
fixRoot :: BTree a -> BTree a
fixRoot (Node [] [tr] _) = tr -- shrink height
fixRoot tr = if full tr then Node [k] [c1, c2] (degree tr)
else tr
where
(c1, k, c2) = split tr
也就是说，如果root节点仅仅指向一个唯一的子节点。则shrink整个B-tree的高度。若root节点含有太多的子节点。则对它进行split
操作。

split操作定义如下：
split :: BTree a -> (BTree a, a, BTree a)
split (Node ks cs t) = (c1, k, c2) where

c1 = Node (take (t-1) ks) (take t cs) t
c2 = Node (drop t ks) (drop t cs) t
k = head (drop (t-1) ks)

再把目光回到ins函数，若待插入节点为叶子节点，则直接插入之，否则，要找到待插入的位置。所谓的待插入位置，就是需要把一个
B-Tree节点分成三部分，左侧是所有小于待插入值的key和children，右侧是所有大于待插入值的key和children，中间剩余的
children
就是待插入位置。程序递归对其插入，插入好后，再用make函数将这三部分合并成一个节点。

make函数定义为：

make :: ([a], [BTree a]) -> BTree a -> ([a], [BTree a]) -> BTree a
make (ks', cs') c (ks'', cs'')
| full c = fixFull (ks', cs') c (ks'', cs'')
| low c = fixLow (ks', cs') c (ks'', cs'')
| otherwise = Node (ks'++ks'') (cs'++[c]++cs'') (degree c)

make函数接受3个参数，左侧的key和children，右侧的key和children，以及中间的子节点，若中间节点有太多的子节点，则调用
fixFull,
反之，若含有的子节点太少则调用fixLow，其他情况则仅仅把他们合并成一个B-Tree节点。

我们稍后再讨论delete时再详细解释fixLow，先看fixFull，它定义为：
fixFull :: ([a], [BTree a]) -> BTree a -> ([a], [BTree a]) -> BTree a
fixFull (ks', cs') c (ks'', cs'') = Node (ks'++[k]++ks'')
(cs'++[c1,c2]++cs'') (degree
c)
where
(c1, k, c2) = split c
其定义和fixRoot非常像，我就不多解释了。

现在可以测试插入操作了：
testInsert = do
putStrLn $ toString $ listToBTree "GMPXACDEJKNORSTUVYZBFHIQW" 3
putStrLn $ toString $ listToBTree "GMPXACDEJKNORSTUVYZ" 2
这个程序输出：

((('A','B'), 'C', ('D','E','F'), 'G', ('H','I','J','K')), 'M',
(('N','O'), 'P', ('Q','R','S'), 'T', ('U','V'), 'W', ('X','Y','Z')))
((('A'), 'C', ('D')), 'E', (('G','J','K'), 'M', ('N')), 'O', (('P'),
'R', ('S'), 'T', ('U'), 'V', ('X','Y','Z')))

注意后者是个2-3-4树。

下个帖子是delete的定义
--
刘新宇
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 21, 2010, 1:41:10 AM6/21/10

to TopLanguage

现在给Delete的定义，这个定义和insert比起来，很一致：

delete :: (Ord a) => BTree a -> a -> BTree a
delete tr x = fixRoot $ del tr x

del:: (Ord a) => BTree a -> a -> BTree a
del (Node ks [] t) x = Node (L.delete x ks) [] t
del (Node ks cs t) x =
case L.elemIndex x ks of
Just i -> merge (Node (take i ks) (take (i+1) cs) t)
(Node (drop (i+1) ks) (drop (i+1) cs) t)
Nothing -> make (ks', cs') (del c x) (ks'', cs'')

where
(ks', ks'') = L.partition (<x) ks
(cs', (c:cs'')) = L.splitAt (length ks') cs

从B-tree删除一个值时，其实就是调用内部函数del，然后对结果的根节点进行修正。这里主要是修正空root问题，也就是会造成树的高度
shrink。
仔细观察del，分两种情况，一种是叶子节点，我们直接删除之。另一种是非叶子节点。由细分两种情况：
1，待删除值就在这个节点的key中，我们移除这个key，然后将剩余的左右部分merge起来。
2，待删除值不再这个节点的key中，我们找到合适位置，认定它在某个子树种，递归调用delete，然后将结果合并成新的B-tree节点

其中merge的定义为：
merge :: BTree a -> BTree a -> BTree a
merge (Node ks [] t) (Node ks' [] _) = Node (ks++ks') [] t
merge (Node ks cs t) (Node ks' cs' _) = make (ks, init cs)
(merge (last cs) (head
cs'))
(ks', tail cs')

合并两个叶子节点，就是简单将key合并成一个新列表。合并两个非叶子节点则是递归调用merge，然后利用make将左侧，递归合并结果和右侧组合起
来。

我们回头看看make中的fixLow函数：
fixLow :: ([a], [BTree a]) -> BTree a -> ([a], [BTree a]) -> BTree a
fixLow (ks'@(_:_), cs') c (ks'', cs'') = make (init ks', init cs')
(unsplit (last cs')
(last ks') c)
(ks'', cs'')
fixLow (ks', cs') c (ks''@(_:_), cs'') = make (ks', cs')
(unsplit c (head ks'')
(head cs''))
(tail ks'', tail cs'')
fixLow _ c _ = c

如果一个子节点含有的key太少，我们要么从左侧借一个key，一个child，要么从右侧借，然后调用split的相反过程unsplit，其中
unsplit定义为：
unsplit :: BTree a -> a -> BTree a -> BTree a
unsplit c1 k c2 = Node ((keys c1)++[k]++(keys c2))
((children c1)++(children c2)) (degree c1)

OK，现在可以测试delete了：
testDelete = foldM delShow (listToBTree "GMPXACDEJKNORSTUVYZBFHIQW" 3)
"EGAMU" where
delShow tr x = do
let tr' = delete tr x
putStrLn $ "delete "++(show x)
putStrLn $ toString tr'
return tr'

结果为：
*BTree> testDelete
delete 'E'
((('A','B'), 'C', ('D','F'), 'G', ('H','I','J','K')), 'M', (('N','O'),

'P', ('Q','R','S'), 'T', ('U','V'), 'W', ('X','Y','Z')))

delete 'G'
((('A','B'), 'C', ('D','F'), 'H', ('I','J','K')), 'M', (('N','O'),

'P', ('Q','R','S'), 'T', ('U','V'), 'W', ('X','Y','Z')))

delete 'A'
(('B','C','D','F'), 'H', ('I','J','K'), 'M', ('N','O'), 'P',

('Q','R','S'), 'T', ('U','V'), 'W', ('X','Y','Z'))

delete 'M'
(('B','C','D','F'), 'H', ('I','J','K','N','O'), 'P', ('Q','R','S'),

'T', ('U','V'), 'W', ('X','Y','Z'))

delete 'U'
(('B','C','D','F'), 'H', ('I','J','K','N','O'), 'P',
('Q','R','S','T','V'), 'W', ('X','Y','Z'))

仔细对比Haskell这个FP版本和前面的Python版本，就发现思路的完全不同。Haskell版本更显得暴力brute-force，但是这
种“操作-修复”的approach，显得非常的
一致。

好了，下面我抽时间开始写B-tree的文章放到algoxy上，并给出C++和Scheme/Lisp的实现。

--
刘新宇
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jun 30, 2010, 11:14:32 PM6/30/10

to TopLanguage

Hi,

前一阵太忙了。今天继续。
为了写B-tree算法的文章，我需要一个工具能够方便的画B-tree。并且能方便放到我的Latex manuscript里。
由于懒得学习，我继续用dot工具(Graphviz)

为此我写了一个程序把B-tree方便转换成dot。下面共享并解释一下这个工具的代码。

首先介绍一下使用方法：
在命令行运行：
bt2dot foo.dot "(((A, B), C, (D, E, F), G, (H, I, J, K)), M, ((N, O),
P, (Q, R, S), T, (U, V), W, (X, Y, Z)))"
这样就会输出一个foo.dot文件，里面绘制了一个B-tree，内容就是后面的A..Z这串字符表示的（中序遍历）

同时程序会在屏幕上打印foo.dot文件的内容：
digraph G{
node[shape=record]
tM[label="<C0>|M|<C1>"];
tMCG[label="<C0>|C|<C1>|G|<C2>"];
tMCGAB[label="A|B"];
tMCGDEF[label="D|E|F"];
tMCGHIJK[label="H|I|J|K"];
tMCG:C0->tMCGAB;
tMCG:C1->tMCGDEF;
tMCG:C2->tMCGHIJK;
tMPTW[label="<C0>|P|<C1>|T|<C2>|W|<C3>"];
tMPTWNO[label="N|O"];
tMPTWQRS[label="Q|R|S"];
tMPTWUV[label="U|V"];
tMPTWXYZ[label="X|Y|Z"];
tMPTW:C0->tMPTWNO;
tMPTW:C1->tMPTWQRS;
tMPTW:C2->tMPTWUV;
tMPTW:C3->tMPTWXYZ;
tM:C0->tMCG;
tM:C1->tMPTW;
}

此后运行dot就会生成图像，格式可以为png或者ps等等。
下面我来解释下bt2dot的程序，以前这种脚本我一般用Python写，这次我用Haskell写来增加趣味性。

程序首先import了如下的库：

import System.Environment (getArgs)
import Text.ParserCombinators.Parsec
import Control.Monad (mapM_)
import Data.List (intercalate, zipWith)
import System.IO (writeFile)

其中getArgs用来获取命令行的参数，我使用了Haskell的Text.Parsec来方便进行parsing。

由于命令行中描述的B-tree不用规定minimum degree故而我简化了B-tree的定义如下：
data Node a = Node [a] [Node a] deriving (Eq, Show)

然后是parsing部分：
node = (try leaf) <|> branch

这句parsing的含义是：一个B-tree的node要么是一个叶子leaf，要么是branch。然后parsing leaf的函数定义为：

leaf = do
char '('
ks <- key `sepBy` (string "," >> spaces)
char ')'
return (Node ks [])

其含义是：一个叶子节点首先是一个左括号，然后是一组用逗号（以及若干空格）分割的keys，最后是右括号，例如：(A, Bar, Foo)

Branch节点的parser定义如下：

branch = do
char '('
(cs, ks)<- everyOther node key (string "," >> spaces)
char ')'
return (Node ks cs)

一个Branch通常是如下形式(子节点，key，子节点，key, ...子节点)。也就是子节点比key多一个，交替出现，用逗号(以及若干空格)
分隔。
这种形式我定义了everyOther函数，后面会介绍。

然后我们看看key的parser
key = many (noneOf ", ()")
其含义是key是若干文字，但是不能包含逗号，空格以及左右括号。

现在我们来看看everyOther函数是如何定义的：
-- input: a1, b1, a2, b2, ... an, bn, a{n+1}
-- output: ([a1, a2, ..., an, a{n+1}], [b1, b2, ..., bn])
everyOther pa pb sep = do{ x <- pa
; sep
; (ys, xs)<- every2 pb pa sep
; return (x:xs, ys)}
everyOther接受3个参数，解析1,3,5...奇数位置上的parser pa，和解析2,4,6...偶数位置上的parser pb，以
及分隔符parser sep。
他首先利用pa提取出第一个位置上的值，然后调用every2去把剩余的内容解析出来。

其中every2定义为：
-- input a1, b1, a2, b2, ... an, bn
-- output: ([a1, ..., an], [b1, ..., bn])
every2 pa pb sep = scan where
scan = do { x<-pa
; y<-(sep>>pb)
; do {
; (xs, ys)<-(sep>>scan)
; return (x:xs, y:ys)
}
<|> return ([x], [y])
}
这个函数调用了内部的scan函数，它首先利用pa获取x，然后拿到一个分隔符并丢弃，之后利用pb获取一个y，此后如果有剩余则递归调用scan，如
果没有
剩余则将x, y封装成仅有一个元素的list的pair。然后扔到monad的结果里。

以上就完成了parse，有兴趣的读者可以用测试下，例如：
testParse = mapM_ (\x->putStrLn $ show $ (parse node "unknown" x))
["((A, B), C, (D, E))",
"((A,B), C, (D,E))",
"(((A, B), C, (D, E, F), G, (H, I, J, K)), M, ((N, O), P,
(Q, R, S), T, (U, V), W, (X, Y, Z)))"]

这个测试会生成结果：
*Main> testParse
Loading package parsec-2.1.0.1 ... linking ... done.
Right (Node ["C"] [Node ["A","B"] [],Node ["D","E"] []])
Right (Node ["C"] [Node ["A","B"] [],Node ["D","E"] []])
Right (Node ["M"] [Node ["C","G"] [Node ["A","B"] [],Node
["D","E","F"] [],Node ["H","I","J","K"] []],Node ["P","T","W"] [Node
["N","O"] [],Node ["Q","R","S"] [],Node ["U","V"] [],Node
["X","Y","Z"] []]])

parse完成后，就需要根据dot语法，将Node变换成dot了。

我们看核心转换函数：
toDot (Node ks []) prefix = prefix++(concat ks)++"[label=\""++
(intercalate "|" ks)++"\"];\n"
toDot (Node ks cs) prefix = prefix'++"[label=\""++(defNode ks)++"\"];
\n" ++
(concat $ map (\c->toDot c prefix') cs) ++
(defCons cs prefix')
where prefix' = prefix ++ (concat ks)
如果是个叶子节点，例如(A, B, C)我们输出prefixABC = [label="A|B|C"];
intercalate函数可以帮助完成这个任务

如果是branch节点，例如(子节点0, A, 子节点1, B, 子节点2, C，子节点3）则首先需要转换成：
prefixABC = [label="<c0>|A|<c1>|B|<c2>|C|<c3>"];
然后递归定义子节点1到3的dot语句，之后再把他们连接起来。

其中defNode定义为：
defNode :: [String] -> String
defNode ks = "<C0>"++ (concat $ zipWith (\i k->"|"++k++"|<C"++(show i)+
+">") [1..] ks)

这里使用了无穷序列[1..]。这是个小技巧。

defCons用来定义连接：
defCons cs prefix = concat $ zipWith f [0..] cs where
f i c = prefix++":C"++(show i)++"->"++prefix++(nodeName c)++";\n"
nodeName (Node ks _) = concat ks

然后，写文件的函数定义为：
genDot fname (Right node) = writeFile fname dots >> putStrLn dots
where
dots = "digraph G{\n\tnode[shape=record]\n"++(addTab $ toDot
node "t")++"}"
addTab s = unlines $ map ("\t"++) (lines s)
注意，我在{}里加了tab用于缩进。

主函数，以及如何获取命令行参入定义如下：
parseArgs :: [String] -> (String, String)
parseArgs [fname, s] = (fname, s)
parseArgs _ = error "wrong usage\nexample:\nbt2dot output.dot \"((B,
C), A, (D, E))\""

main = do
args <- getArgs
let (fname, s) = parseArgs args
genDot fname (parse node "unknown" s)

--
刘新宇
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jul 1, 2010, 5:30:34 AM7/1/10

to TopLanguage

同志们，俺们这个可是技术论坛！
我也学习鲍尔默一把：

for(i=1; i<=14; ++i)
shout "技术！";

mapM_ (shout "技术") [1..14]

for i in range(14)
shout "技术"

(define (repeat i)
(if (> i 1)
(begin (shout "技术") (repeat (- i 1)))))

(repeat 14)

--
刘
http://sites.google.com/site/algoxy/

Alecs King

unread,

Jul 1, 2010, 6:02:29 AM7/1/10

to pon...@googlegroups.com

On Wed, Jun 30, 2010 at 08:14:32PM -0700, liuxinyu wrote:
> Hi,

hello,

>
> 前一阵太忙了。今天继续。
> 为了写B-tree算法的文章，我需要一个工具能够方便的画B-tree。并且能方便放到我的Latex manuscript里。
> 由于懒得学习，我继续用dot工具(Graphviz)
>
> 为此我写了一个程序把B-tree方便转换成dot。下面共享并解释一下这个工具的代码。

nice. would be better if it was written in literate style or a link to
the code was provided.

>
> 首先介绍一下使用方法：
> 在命令行运行：
> bt2dot foo.dot "(((A, B), C, (D, E, F), G, (H, I, J, K)), M, ((N, O),
> P, (Q, R, S), T, (U, V), W, (X, Y, Z)))"

me thinks "cmd outfile inputstring" not a good idea. how about getting
input from a file:

$ cmd inputfile outfile

or better, just read from stdin and write to stdout. we can redirect
files if needed. and we can easily do something like this:

$ universe | bt2dot | dot -Tsvg | tee 42.svg | viewsvg

>
> leaf = do
> char '('
> ks <- key `sepBy` (string "," >> spaces)
> char ')'
> return (Node ks [])
>
> 其含义是：一个叶子节点首先是一个左括号，然后是一组用逗号（以及若干空格）分割的keys，最后是右括号，例如：(A, Bar, Foo)
>
> Branch节点的parser定义如下：
>
> branch = do
> char '('
> (cs, ks)<- everyOther node key (string "," >> spaces)
> char ')'
> return (Node ks cs)
>

we can filter out spaces for once and all at the start and then do not
bother parsing it in between. this also looses the rule where a space
may appear -- " ( Foo , Bar, Baz ) ".

> 现在我们来看看everyOther函数是如何定义的：
> -- input: a1, b1, a2, b2, ... an, bn, a{n+1}
> -- output: ([a1, a2, ..., an, a{n+1}], [b1, b2, ..., bn])
> everyOther pa pb sep = do{ x <- pa
> ; sep
> ; (ys, xs)<- every2 pb pa sep
> ; return (x:xs, ys)}
> everyOther接受3个参数，解析1,3,5...奇数位置上的parser pa，和解析2,4,6...偶数位置上的parser pb，以
> 及分隔符parser sep。
> 他首先利用pa提取出第一个位置上的值，然后调用every2去把剩余的内容解析出来。
>
> 其中every2定义为：
> -- input a1, b1, a2, b2, ... an, bn
> -- output: ([a1, ..., an], [b1, ..., bn])
> every2 pa pb sep = scan where
> scan = do { x<-pa
> ; y<-(sep>>pb)
> ; do {
> ; (xs, ys)<-(sep>>scan)
> ; return (x:xs, y:ys)
> }
> <|> return ([x], [y])
> }
> 这个函数调用了内部的scan函数，它首先利用pa获取x，然后拿到一个分隔符并丢弃，之后利用pb获取一个y，此后如果有剩余则递归调用scan，如
> 果没有
> 剩余则将x, y封装成仅有一个元素的list的pair。然后扔到monad的结果里。
>

looks a little verbose. i for one cant easily understand the 'scan' part.
how about this:

everyOther pa pb sep = do

a <- pa
r <- many (liftM2 (,) (sep >> pb) (sep >> pa))
let (bs, as) = unzip r
return (a:as, bs)

> 以上就完成了parse，有兴趣的读者可以用测试下，例如：
> testParse = mapM_ (\x->putStrLn $ show $ (parse node "unknown" x))
> ["((A, B), C, (D, E))",
> "((A,B), C, (D,E))",
> "(((A, B), C, (D, E, F), G, (H, I, J, K)), M, ((N, O), P,
> (Q, R, S), T, (U, V), W, (X, Y, Z)))"]

FYI, parsec has a builtin 'parseTest'.

havent very much looked into the writing part. maybe we can tidy it a
bit. hlint is always your friend. =)

Regards,

--
Alecs King

liuxinyu

unread,

Jul 1, 2010, 6:55:10 AM7/1/10

to TopLanguage

Great!

Your comment is so helpful! I'll modify the code according to your
valuable suggestion.

I really love the idea of using liftM2 (,).

Thanks a lot!

--
Liu Xinyu
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jul 1, 2010, 10:26:43 PM7/1/10

to TopLanguage

Hi,

多谢Alecs King的帮助，程序得到大幅简化。

我又import了Control.Applicative中的(<$>)，这样everyOther就进一步缩减为：

everyOther pa pb sep = do
a <- pa

(bs, as) <- unzip <$> many(liftM2 (,) (sep>>pb) (sep>>pa))
return (a:as, bs)

然后，在调用parser前，filter掉了所有的space
parse node "unknown" (filter (not.isSpace) s)
其中isSpace在Data.Char中定义。

另外，毕竟这里的情况比较简单，如果Key允许使用引号的话，就需要较复杂的处理了。例如("Hello World", "Foo",
"Bar")

再次感谢Alecs King！
--
刘新宇
http://sites.google.com/site/algoxy/

liuxinyu

unread,

Jul 2, 2010, 7:42:06 AM7/2/10

to TopLanguage

开始在Algoxy上写了：
http://sites.google.com/site/algoxy/btree
预计三个月左右脱稿。

liuxinyu

unread,

Jul 21, 2010, 6:59:48 AM7/21/10

to TopLanguage

Hi,

今天给出B-tree插入算法的C++实现：

由于有模板这个工具，每一个不同degree的B-tree都可以定义为不同的类型：
// t: minimum degree of B-tree
template<class K, int t>
struct BTree{
typedef K key_type;
typedef std::vector<K> Keys;
typedef std::vector<BTree*> Children;

BTree(){}

~BTree(){
for(typename Children::iterator it=children.begin();
it!=children.end(); ++it)
delete (*it);
}

//...略

Keys keys;
Children children;
};

由于后面我需要random access一个B-tree的key和child，所以选用了vector座位底层的数据结构。
为了方便，在dtro里使用了递归的方式来释放子节点。

另外还定义了一些方便的成员函数来判断一个节点是否满了，以及是否是叶子节点：
template<class K, int t>
struct BTree{

//...略...

bool full(){ return keys.size() == 2*t-1; }

bool leaf(){
return children.empty();
}

//...略...

};

最后一个比较复杂的成员函数是split，这个函数的前置条件是第i个子节点已经满了需要分割：
template<class K, int t>
struct BTree{

//...略...

// k0, k1, ... ki, k{i+1}, ...
//c0, c1, ..., ci, c{i+1}, ...
// perform split on ci
//
void split_child(int i){
BTree<K, t>* x = children[i];
BTree<K, t>* y = new BTree<K, t>();
keys.insert(keys.begin()+i, x->keys[t-1]);
children.insert(children.begin()+i+1, y);
y->keys = Keys(x->keys.begin()+t, x->keys.end());
x->keys = Keys(x->keys.begin(), x->keys.begin()+t-1);
if(!x->leaf()){
y->children = Children(x->children.begin()+t, x-
>children.end());
x->children = Children(x->children.begin(), x->children.begin()
+t);
}
}

分割算法就是若第i个节点满，则其含有2t-1个key，把中间的key提上来然后左右t-1个key和t个children分别独立出2个子节点。

此后，就可以定义insert主函数了：
template<class K, int t>
BTree<K, t>* insert(BTree<K, t>* tr, K key){
BTree<K, t>* root(tr);
if(root->full()){
BTree<K, t>* s = new BTree<K, t>();
s->children.push_back(root);
s->split_child(0);
root = s;
}
return insert_nonfull(root, key);
}
这个函数上来先判断root是否满，若满的话则新建一个root，然后将原来的root变为此新root的子节点并进行split。分割后调用
insert_nonfull来进行真正的插入。

insert_nonfull的CLRS课本上给出的实现的最大区别在于，CLRS给出是一个递归实现，我这里给出一个消除了递归的纯
imperative实现：
template<class K, int t>
BTree<K, t>* insert_nonfull(BTree<K, t>* tr, K key){
typedef typename BTree<K, t>::Keys Keys;
typedef typename BTree<K, t>::Children Children;

BTree<K, t>* root(tr);
while(!tr->leaf()){
unsigned int i=0;
while(i < tr->keys.size() && tr->keys[i] < key)
++i;
if(tr->children[i]->full()){
tr->split_child(i);
if(key > tr->keys[i])
++i;
}
tr = tr->children[i];
}
orderred_insert(tr->keys, key);
return root;
}
这个实现很简单，如果待插入的节点是个叶子节点，则调用orderred_insert将key插入到节点内并保持有序；
如果待插入节点不是叶子节点，则找到待插入位置的子节点，然后判断该子节点是否满
若满：则split，由于split会引入新子节点和key，所以需要再额外比较下待插入值和新key的的大小关系。
此时更新当前指针指向待插入节点，并循环检查其是否是子节点。
程序最后返回根节点。

其中orderred_insert的实现如下：
template<class Coll>
void orderred_insert(Coll& coll, typename Coll::value_type x){
typename Coll::iterator it = coll.begin();
while(it != coll.end() && *it < x)
++it;
coll.insert(it, x);
}
它和我以前给出的python的实现的区别在于，这个版本更加暴力。

为了方便构造B-tree,我定义了一个list_to_btree的函数模板：

template<class Iterator, class T>
T* list_to_btree(Iterator first, Iterator last, T* t){
return std::accumulate(first, last, t,
std::ptr_fun(insert_key<T>));
}

其本质相当于函数式编程中的fold-left

其中insert_key定义为：
template<class T>
T* insert_key(T* t, typename T::key_type x){
return insert(t, x);
}

为了方便输出，打印函数被递归定义如下：

首先是一个通用的to_str
//generic auxiliary functions
template<class T>
std::string to_str(T x){
std::ostringstream s;
s<<x;
return s.str();
}

然后针对B-tree的节点，定义打印函数：
template<class T>
std::string btree_to_str(T* tr){
typename T::Keys::iterator k;
typename T::Children::iterator c;

std::ostringstream s;
s<<"(";
if(tr->leaf()){
k=tr->keys.begin();
s<<*k++;
for(; k!=tr->keys.end(); ++k)
s<<", "<<*k;
}
else{
for(k=tr->keys.begin(), c=tr->children.begin();
k!=tr->keys.end(); ++k, ++c)
s<<btree_to_str(*c)<<", "<<*k<<", ";
s<<btree_to_str(*c);
}
s<<")";
return s.str();
}

我没有采用前端实现给出的有争议的join或者说intersperse实现
（参见：http://groups.google.com/group/pongba/browse_thread/thread/
50eae1868caca01e/d04103875b91139f?show_docid=d04103875b91139f）

最后的test case如下，我用2-3-4树smoke test了一把：
const char* ss[] = {"G", "M", "P", "X", "A", "C", "D", "E", "J",
"K", \
"N", "O", "R", "S", "T", "U", "V", "Y", "Z"};
BTree<std::string, 2>* tr234=list_to_btree(ss, ss+sizeof(ss)/
sizeof(char*),
new BTree<std::string,
2>);
std::cout<<"2-3-4 tree of ";
std::copy(ss, ss+sizeof(ss)/sizeof(char*),
std::ostream_iterator<std::string>(std::cout, ", "));
std::cout<<"\n"<<btree_to_str(tr234)<<"\n";
delete tr234;

这段程序输出：
liuuuxin@wl025474 ~/temp/tree/B-tree/src
# ./cpptest
B-tree testing
2-3-4 tree of G, M, P, X, A, C, D, E, J, K, N, O, R, S, T, U, V, Y,
Z,

(((A), C, (D)), E, ((G, J, K), M, (N, O)), P, ((R), S, (T), U, (V), X,
(Y, Z)))

先到这里，我后面给出scheme/lisp的的实现。和C++版本的btree delete实现。

--
刘
https://sites.google.com/site/algoxy/btree

On Jul 2, 7:42 pm, liuxinyu <liuxiny...@gmail.com> wrote:
> 开始在Algoxy上写了：http://sites.google.com/site/algoxy/btree
> 预计三个月左右脱稿。

liuxinyu

unread,

Aug 22, 2010, 2:21:36 AM8/22/10

to TopLanguage

Hi,

今天给出C++版本的B-tree删除程序。

在此之前，先fix一个以前给的python程序的bug:

此bug发生在CLRS描述的case 3的left sibling处:
before:

if i>0 and tr.children[i-1].can_remove(): #left sibling
tr.children[i].keys.insert(0, tr.keys[i])
tr.keys[i] = tr.children[i-1].keys.pop()

修改后：
after:

if i>0 and tr.children[i-1].can_remove(): #left sibling

tr.children[i].keys.insert(0, tr.keys[i-1])
tr.keys[i-1] = tr.children[i-1].keys.pop()

下面正式给出这个C++的程序，和CLRS教科书上给出的算法描述不同，CLRS上的算法实际上是递归调用的，这里我完全消除了递归。

有一些非常generic的函数，STL没有提供，为此我手工实现了一下，例如concat:
//generic auxilary functions
// x ++ y in Haskell
template<class Coll>
void concat(Coll& x, Coll& y){
std::copy(y.begin(), y.end(),
std::insert_iterator<Coll>(x, x.end()));
}

为了方面操作，我给BTree增加了一些成员函数：

template<class K, int t>
struct BTree{

//...

// merge_children函数可以看做是split函数的相反操作，他把第i个子节点，
// 第i个key和第i+1个子节点合成一个节点。
// 这样，就使得第i个子节点所含有的key增多了一倍以上，从而可以进行B-tree的平衡性调整。
// 请注意和python等带有GC的版本不同，这里需要仔细回收内存。
void merge_children(int i){

BTree<K, t>* x = children[i];

BTree<K, t>* y = children[i+1];
x->keys.push_back(keys[i]);
concat(x->keys, y->keys);
concat(x->children, y->children);
keys.erase(keys.begin()+i);
children.erase(children.begin()+i+1);
y->children.clear();
delete y;
}

// replace_key函数主要用户将children的某个key， pull上来，取代节点的第i个key。
// 这一操作会返回拉上来的新key的值。
key_type replace_key(int i, key_type key){
keys[i]=key;
return key;
}

//这一函数用于判断一个节点是否含有足够的key以支持在其上的删除操作，从而
//保持B-tree的平衡性。
bool can_remove(){ return keys.size() >=t; }

有了这些辅助设施后，就可以实现del的核心算法了。程序略长，基本是对CLRS上描述的翻译。但是注意递归被消除了。

template<class T>
T* del(T* tr, typename T::key_type key){
T* root(tr);

while(!tr->leaf()){
unsigned int i = 0;

bool located(false);
while(i < tr->keys.size()){
if(key == tr->keys[i]){
located = true;
if(tr->children[i]->can_remove()){ //case 2a
key = tr->replace_key(i, tr->children[i]->keys.back());
tr->children[i]->keys.pop_back();

tr = tr->children[i];
}

else if(tr->children[i+1]->can_remove()){ //case 2b
key = tr->replace_key(i, tr->children[i+1]->keys.front());
tr->children[i+1]->keys.erase(tr->children[i+1]-
>keys.begin());
tr = tr->children[i+1];
}
else{ //case 2c
tr->merge_children(i);
if(tr->keys.empty()){ //shrinks height
T* temp = tr->children[0];
tr->children.clear();
delete tr;
tr = temp;
}
}
break;
}
else if(key > tr->keys[i])
i++;
else
break;
}
if(located)
continue;
if(!tr->children[i]->can_remove()){ //case 3
if(i>0 && tr->children[i-1]->can_remove()){
// case 3a: left sibling
tr->children[i]->keys.insert(tr->children[i]->keys.begin(),
tr->keys[i-1]);
tr->keys[i-1] = tr->children[i-1]->keys.back();
tr->children[i-1]->keys.pop_back();
if(!tr->children[i]->leaf()){
tr->children[i]->children.insert(tr->children[i]-
>children.begin(),
tr->children[i-1]-
>children.back());
tr->children[i-1]->children.pop_back();
}
}
else if(i<tr->children.size() && tr->children[i+1]->can_remove())
{
// case 3a: right sibling
tr->children[i]->keys.push_back(tr->keys[i]);
tr->keys[i] = tr->children[i+1]->keys.front();
tr->children[i+1]->keys.erase(tr->children[i+1]-
>keys.begin());
if(!tr->children[i]->leaf()){
tr->children[i]->children.push_back(tr->children[i+1]-
>children.front());
tr->children[i+1]->children.erase(tr->children[i+1]-
>children.begin());
}
}
else{
if(i>0)
tr->merge_children(i-1);
else
tr->merge_children(i);

}
}
tr = tr->children[i];
}

tr->keys.erase(remove(tr->keys.begin(), tr->keys.end(), key),
tr->keys.end());
if(root->keys.empty()){ //shrinks height
T* temp = root->children[0];
root->children.clear();
delete root;
root = temp;
}
return root;
}

消除递归的关键在于，当前节点只要不是叶子节点，就要在处理后，更新tr和key，然后循环。
有兴趣的读者，可以对比我以前给出的python算法和Haskell算法。结论很有趣。

为了验证算法，我给出了一个quick and dirty的parser，它可以把描述B-tree的字符串转换成B-tree。

// quick and dirty helper function to change a B-tree description
// string to a B-tree.
template<class T>
T* parse(std::string::iterator& first, std::string::iterator last){
T* tr = new T;
++first; //'('
while(first!=last){
if(*first=='('){ //child
tr->children.push_back(parse<T>(first, last));
}
else if(*first == ',' || *first == ' ')
++first; //skip deliminator
else if(*first == ')'){
++first;
return tr;
}
else{ //key
typename T::key_type key;
while(*first!=',' && *first!=')')
key+=*first++;
tr->keys.push_back(key);
}
}
//should never runs here
return 0;
}
注意，它是一个递归程序，还配有一个辅助函数，如下：
template<class T>
T* str_to_btree(std::string s){
std::string::iterator first(s.begin());
return parse<T>(first, s.end());
}

然后就可以测试了：

void test_delete(){
std::cout<<"test delete...\n";
const char* s="(((A, B), C, (D, E, F), G, (J, K, L), M, (N, O)), "
"P, ((Q, R, S), T, (U, V), X, (Y, Z)))";
typedef BTree<std::string, 3> BTr;
BTr* tr = str_to_btree<BTr>(s);
std::cout<<"before delete:\n"<<btree_to_str(tr)<<"\n";
const char* ks[] = {"F", "M", "G", "D", "B", "U"};
for(unsigned int i=0; i<sizeof(ks)/sizeof(char*); ++i)
tr=__test_del__(tr, ks[i]);
delete tr;
}

template<class T>
T* __test_del__(T* tr, typename T::key_type key){
std::cout<<"delete "<<key<<"==>\n";
tr = del(tr, key);
std::cout<<btree_to_str(tr)<<"\n";
return tr;
}

运行测试函数，屏幕输出如下：
test delete...
before delete:
(((A, B), C, (D, E, F), G, (J, K, L), M, (N, O)), P, ((Q, R, S), T,
(U, V), X, (Y, Z)))
delete F==>
(((A, B), C, (D, E), G, (J, K, L), M, (N, O)), P, ((Q, R, S), T, (U,
V), X, (Y, Z)))
delete M==>
(((A, B), C, (D, E), G, (J, K), L, (N, O)), P, ((Q, R, S), T, (U, V),
X, (Y, Z)))
delete G==>
(((A, B), C, (D, E, J, K), L, (N, O)), P, ((Q, R, S), T, (U, V), X,
(Y, Z)))
delete D==>
((A, B), C, (E, J, K), L, (N, O), P, (Q, R, S), T, (U, V), X, (Y, Z))
delete B==>
((A, C), E, (J, K), L, (N, O), P, (Q, R, S), T, (U, V), X, (Y, Z))
delete U==>
((A, C), E, (J, K), L, (N, O), P, (Q, R), S, (T, V), X, (Y, Z))

结果和Python程序的完全一致。

今天心情非常好，因为这段C++程序，我写好后，一次编译测试通过。连typo都没有。颇有当年上学到机房的感觉了。

过些天，我给出scheme/lisp的B-tree算法实现。

--
刘
http://sites.google.com/site/algoxy/btree

liuxinyu

unread,

Aug 31, 2010, 6:31:15 AM8/31/10

to TopLanguage

Hi,

今天给出Scheme/Lisp版本的B-tree插入程序。我数了数，这已经是B-tree这个thread的第19个post了。

和前面C++, Python, Haskell的基本定义不同，我打算在Scheme中直接使用下述形式来定义B-tree

(c1 k1 c2 k2 ... cn kn c[n+1])
其中ci表示第i个子树,ki表示第i个key，之所以可以这样定义，是由于Scheme/Lisp是动态类型的，一个列表中可以同时放入这两种类型的
元素。

例如：
(("A" "B") "C" ("D" "E"))

为了方便访问所有的key和children，我定义了如下的accessor函数：

(define (keys tr)
(if (null? tr)
'()
(if (list? (car tr))
(keys (cdr tr))
(cons (car tr) (keys (cdr tr))))))

(define (children tr)
(if (null? tr)
'()
(if (list? (car tr))
(cons (car tr) (children (cdr tr)))
(children (cdr tr)))))

用来判断一个B-tree是否已经满了的函数如下：

(define (full? tr t) ;; t: minimum degree
(> (length (keys tr))
(- (* 2 t) 1)))

其中t是CLRS中定义的minimum degree，根据B-tree的平衡性要求，节点不得拥有超过2*t-1个key。

如果一个节点拥有过多的key，就需要split，split函数定义如下：

(define (split tr t)
(if (null? (children tr))
(list (list-head tr (- t 1))
(list-ref tr (- t 1))
(list-tail tr t))
(list (list-head tr (- (* t 2) 1))
(list-ref tr (- (* t 2) 1))
(list-tail tr (* t 2)))))

一共有两大类情况，若传入的节点tr是叶子节点，则进入情况一的处理("A" "B" "C" "D" "E" "F"), t=3会被分裂为
(("A" "B") "C" ("D" "E" "F"))
另外的情况是非叶子节点，分裂时还要附带上子几点。所以分裂位置是2*t-1

然后我定义了一个方便把一个 children-key对附加到一个B-tree节点内容前的函数：

(define (cons-pair c k lst)
(cons c (cons k lst)))

接下来是parition-by函数，它接受一个树tr和一个key，然后把树种所有小于key的部分（左半部分），所有大于key的部分（右半部
分），以及可能含有key的子节点这三部分分割出来：

;; (c1 k1 c2 k2 ... c[i-1] k[i-1] ci ki ... cn kn c[n+1])
;; k[i-1] < k < k[i]
;; return:
;; left <- (c1 k1 c2 k2 ... c[i-1] k[i-1])
;; c <- ci
;; right <- (ki c[i+1] ... cn kn c[n+1])
(define (partition-by tr x)
(define (part-by pred tr x)
(if (= (length tr) 1)
(list '() (car tr) '())
(if (pred (cadr tr) x)
(let* ((res (part-by pred (cddr tr) x))
(left (car res))
(c (cadr res))
(right (caddr res)))
(list (cons-pair (car tr) (cadr tr) left) c right))
(list '() (car tr) (cdr tr)))))
(if (string? x)
(part-by string<? tr x)
(part-by < tr x)))

我做了一点特殊处理，对于字符串的情况，比较大小用string<?，而对于其他情况，则直接用<来比较。

最后一个辅助函数是orderred-insert他把一个元素，按照大小顺序插入一个有序序列中。

(define (orderred-insert lst x)
(define (insert-by less-p lst x)
(if (null? lst)
(list x)
(if (less-p x (car lst))
(cons x lst)
(cons (car lst) (insert-by less-p (cdr lst) x)))))
(if (string? x)
(insert-by string<? lst x)
(insert-by < lst x)))

同样，我分别处理了字符串和普通情况。

然后我们进入B-tree插入算法的主函数：

(define (insert tr x t)
(define (ins tr x)
(if (null? (children tr))
(orderred-insert (keys tr) x) ;;leaf
(let* ((res (partition-by tr x))
(left (car res))
(c (cadr res))
(right (caddr res)))
(make-btree left (ins c x) right t))))
(fix-root (ins tr x) t))

它调用一个内部的ins函数来实现插入，并调用fix-root来处理root分裂的问题。

我们来看fix-root的定义：

(define (fix-root tr t)
(cond ((full? tr t) (split tr t))
(else tr)))

如果root含有过多key，则分裂之。

最后一个黑盒是make-btree，它定义如下：

(define (make-btree l c r t)
(cond ((full? c t) (fix-full l c r t))
(else (append l (cons c r)))))

它检查左，子，右这三部分中的子是否包含过多key，若是，则调用fix-full来修理了。
fix-full实现为：

(define (fix-full l c r t)
(append l (split c t) r))

为了方便构造b-tree，我定义了两个帮助函数：

(define (list->btree lst t)
(fold-left (lambda (tr x) (insert tr x t)) '() lst))

(define (str->slist s)
(if (string-null? s)
'()
(cons (string-head s 1) (str->slist (string-tail s 1)))))

然后就可以测试了：
(define (test-insert)
(list->btree (str->slist "GMPXACDEJKNORSTUVYZBFHIQW") 3))

执行输出结果为：
((("A" "B") "C" ("D" "E" "F") "G" ("H" "I" "J" "K")) "M" (("N" "O")
"P" ("Q" "R" "S") "T" ("U" "V") "W" ("X" "Y" "Z")))

这和前面Haskell程序的输出完全一致。

今天先到这里，明天我给出Scheme/Lisp的删除程序。

--
刘
https://sites.google.com/site/algoxy/btree

Yongwei Wu

unread,

Aug 31, 2010, 11:24:01 AM8/31/10

to pon...@googlegroups.com

一直不明白你贴这些干吗？给我们上课吗？我觉得你贴在你自己的博客就好。

2010/8/31 liuxinyu <liuxi...@gmail.com>:

--
Wu Yongwei
URL: http://wyw.dcweb.cn/

liuxinyu

unread,

Aug 31, 2010, 9:42:25 PM8/31/10

to TopLanguage

Hi,

我比较out，所以不用blog。
如果觉得这样的post doesn't make any sense，可以直接m掉。

groups的文化兼收并蓄，虽说TL的规定不让乱灌水，可是现在每天rss更新看下，水文仍是大行其道。
我这也算是某种特殊的灌水吧。

--
刘
https://sites.google.com/site/algoxy/home

On Aug 31, 11:24 pm, Yongwei Wu <wuyong...@gmail.com> wrote:
> 一直不明白你贴这些干吗？给我们上课吗？我觉得你贴在你自己的博客就好。
>

> 2010/8/31 liuxinyu <liuxiny...@gmail.com>:

居振梁

unread,

Aug 31, 2010, 9:56:48 PM8/31/10

to pon...@googlegroups.com

汗，破窗效应也可以这么用？

2010/9/1 liuxinyu <liuxi...@gmail.com>

Hi,

我比较out，所以不用blog。
如果觉得这样的post doesn't make any sense，可以直接m掉。

groups的文化兼收并蓄，虽说TL的规定不让乱灌水，可是现在每天rss更新看下，水文仍是大行其道。
我这也算是某种特殊的灌水吧。

--
御剑乘风来，除魔天地间。有酒乐逍遥，无酒我亦颠。
http://www.xspirithack.org
一饮尽江河，再饮吞日月。千杯醉不倒，唯我酒剑仙。

u20024804

unread,

Aug 31, 2010, 9:23:14 PM8/31/10

to TopLanguage

有人分享不是很好吗，干嘛打击人家的热情

On Aug 31, 11:24 pm, Yongwei Wu <wuyong...@gmail.com> wrote:

> 一直不明白你贴这些干吗？给我们上课吗？我觉得你贴在你自己的博客就好。
>
> 2010/8/31 liuxinyu <liuxiny...@gmail.com>:

smart.mk1

unread,

Sep 1, 2010, 4:52:40 AM9/1/10

to pon...@googlegroups.com

如果出发点是分享知识的, 建议整理过再发出来, 这样大家不用每个人再整理一遍.
如果目的是讨论, 问题更具体一点会更好吧. 呵呵,个人观点.

HaoPeiQiang

unread,

Sep 1, 2010, 8:30:17 AM9/1/10

to pon...@googlegroups.com

建议你反过来，在别处做草稿，在这里放整理过的

2010/9/1 liuxinyu <liuxi...@gmail.com>

Hi,

我基本上先在TL这里打草稿，这样发现了问题可以随时修改，并且有些朋友会给出不错的改进意见。
一般最后我都把结果整理到algoxy上：
https://sites.google.com/site/algoxy/home

之前的suffix tree/trie系列就是这么做的

今天给出Scheme/Lisp的B-tree删除程序。

这次采用直指问题中心的顺序，delete主函数定义如下：

;; delete a key from btree
(define (btree-delete tr x t)
(define (del tr x)
(if (leaf? tr)
(delete x tr)

(let* ((res (partition-by tr x))
(left (car res))
(c (cadr res))
(right (caddr res)))

(if (equal? (first right) x)
(merge-btree (append left (list c)) (cdr right) t)
(make-btree left (del c x) right t)))))
(fix-root (del tr x) t))

这个函数接受3个参数，tr是一个B-tree，x是待删除的key，t是算法导论中定义的minimum degree
这个函数的最核心部分就是(fix-root (del tr x) t)，也就是其调用内部的del来实现删除，然后fix一下根节点，使之符合B-
tree的特性。

fix-root昨天在insert时给出了一部分，今天给出全部：

(define (fix-root tr t)

(cond ((null? tr) '()) ;; empty tree

((full? tr t) (split tr t))

((null? (keys tr)) (car tr)) ;; shrink height
(else tr)))
如果删除的结果是把树删空了，则什么也不做，直接返回空树；过满的情况昨天解说过了；如果删除的结果是根节点一个key都没有了，只有一个孤零零的子节
点，
则把子几点提上来。于是树的高度缩短1.其他情况则返回。

现在回过头看内部的del函数。如果是叶子节点则直接删除。leaf?的定义为：
(define (leaf? tr)
(or (null? tr)
(not (list? (car tr)))))
一个空节点可看做叶子节点，另外如果第一个元素不是list，而是一个key，则其也是叶子节点。

如果不是叶子节点，则用昨天定义的partition-by函数，根据待删除的x值，将树分成3个部分，left, c和right
如果right的第一个元素的值恰好等于x，则说明x存在于本节点中，于是将值为x的key剔除，将剩余的两部分merge起来
否则，则递归地在c中删除x并将结果合并修复。

现在说merge,其定义如下：
(define (merge-btree tr1 tr2 t)
(if (leaf? tr1)
(append tr1 tr2)
(make-btree (inits tr1)
(merge-btree (last tr1) (car tr2) t)
(cdr tr2)
t)))
该函数merge两个tree成为一个，如果两个tree都是叶子节点，则直接合并之，否则递归合并第一棵树的最后一个子节点和第二棵树的第一个子节
点。

昨天说insert时，给出过一个make-btree，今天加入进去针对delete的部分：

(define (make-btree l c r t)
(cond ((full? c t) (fix-full l c r t))

((low? c t) (fix-low l c r t))

(else (append l (cons c r)))))

针对delete的部分就是第3行，如果c含有的key太少，则调用fix-low来修复。

(define (fix-low l c r t)
(cond ((not (null? (keys l)))
(make-btree (except-rest l 2)
(un-split (append (rest l 2) (list c)))
r t))
((not (null? (keys r)))
(make-btree l
(un-split (cons c (list-head r 2)))
(list-tail r 2) t))
(else c)))
这个修复过程先看看左侧有没有富裕的key-child，有的话借过来，如果左侧不足，则向右侧借，如果左右都不足，则把自己提升上来。
其中except-rest, rest, first last等等都是我定义的辅助函数：
(define (rest lst k)
(list-tail lst (- (length lst) k)))

(define (except-rest lst k)
(list-head lst (- (length lst) k)))

(define (first lst)
(if (null? lst) '() (car lst)))

(define (last lst)
(if (null? lst) '() (car (last-pair lst))))

(define (inits lst)
(if (null? lst) '() (except-last-pair lst)))
其含义比较明显，就不解释了。

最后的黑盒是un-split函数，他是昨天我给出的split函数的逆过程：
(define (un-split lst)
(let ((c1 (car lst))
(k (cadr lst))
(c2 (caddr lst)))
(append c1 (list k) c2)))

现在就可以测试了：
(define (test-delete)
(define (del-and-show tr x)
(let ((r (btree-delete tr x 3)))
(begin (display r) (display "\n") r)))
(fold-left del-and-show

(list->btree (str->slist "GMPXACDEJKNORSTUVYZBFHIQW") 3)

(str->slist "EGAMU")))
我们从昨天构造的B-tree中一次删除EGAMU这5个字符，其输出结果如下：

(((A B) C (D F) G (H I J K)) M ((N O) P (Q R S) T (U V) W (X Y Z)))
(((A B) C (D F) H (I J K)) M ((N O) P (Q R S) T (U V) W (X Y Z)))
((B C D F) H (I J K) M (N O) P (Q R S) T (U V) W (X Y Z))
((B C D F) H (I J K N O) P (Q R S) T (U V) W (X Y Z))
((B C D F) H (I J K N O) P (Q R S T V) W (X Y Z))

和我前面给出的Haskell版本的程序比较，结果完全一致

OK, 看来这个thread是太长了，我下面整理下，把algoxy上的btree一页收个尾吧。

--
刘
https://sites.google.com/site/algoxy/btree

--
Tinyfool的开发日记 http://www.tinydust.net/dev/
代码中国网 http://www.codechina.org
myTwitter: http://twitter.com/tinyfool

liuxinyu

unread,

Sep 1, 2010, 8:24:14 AM9/1/10

to TopLanguage

Hi,

我基本上先在TL这里打草稿，这样发现了问题可以随时修改，并且有些朋友会给出不错的改进意见。
一般最后我都把结果整理到algoxy上：
https://sites.google.com/site/algoxy/home

之前的suffix tree/trie系列就是这么做的

今天给出Scheme/Lisp的B-tree删除程序。

这次采用直指问题中心的顺序，delete主函数定义如下：

;; delete a key from btree
(define (btree-delete tr x t)
(define (del tr x)
(if (leaf? tr)
(delete x tr)

(let* ((res (partition-by tr x))
(left (car res))
(c (cadr res))
(right (caddr res)))

(if (equal? (first right) x)
(merge-btree (append left (list c)) (cdr right) t)
(make-btree left (del c x) right t)))))
(fix-root (del tr x) t))

这个函数接受3个参数，tr是一个B-tree，x是待删除的key，t是算法导论中定义的minimum degree
这个函数的最核心部分就是(fix-root (del tr x) t)，也就是其调用内部的del来实现删除，然后fix一下根节点，使之符合B-
tree的特性。

fix-root昨天在insert时给出了一部分，今天给出全部：
(define (fix-root tr t)

(cond ((null? tr) '()) ;; empty tree

((full? tr t) (split tr t))

((null? (keys tr)) (car tr)) ;; shrink height
(else tr)))
如果删除的结果是把树删空了，则什么也不做，直接返回空树；过满的情况昨天解说过了；如果删除的结果是根节点一个key都没有了，只有一个孤零零的子节
点，
则把子几点提上来。于是树的高度缩短1.其他情况则返回。

现在回过头看内部的del函数。如果是叶子节点则直接删除。leaf?的定义为：
(define (leaf? tr)
(or (null? tr)
(not (list? (car tr)))))
一个空节点可看做叶子节点，另外如果第一个元素不是list，而是一个key，则其也是叶子节点。

如果不是叶子节点，则用昨天定义的partition-by函数，根据待删除的x值，将树分成3个部分，left, c和right
如果right的第一个元素的值恰好等于x，则说明x存在于本节点中，于是将值为x的key剔除，将剩余的两部分merge起来
否则，则递归地在c中删除x并将结果合并修复。

现在说merge,其定义如下：
(define (merge-btree tr1 tr2 t)
(if (leaf? tr1)
(append tr1 tr2)
(make-btree (inits tr1)
(merge-btree (last tr1) (car tr2) t)
(cdr tr2)
t)))
该函数merge两个tree成为一个，如果两个tree都是叶子节点，则直接合并之，否则递归合并第一棵树的最后一个子节点和第二棵树的第一个子节
点。

昨天说insert时，给出过一个make-btree，今天加入进去针对delete的部分：

(define (make-btree l c r t)
(cond ((full? c t) (fix-full l c r t))

((low? c t) (fix-low l c r t))

(else (append l (cons c r)))))

liuxinyu

unread,

Sep 1, 2010, 8:33:45 AM9/1/10

to TopLanguage

那你一定没有看过我整理过的，放到这里不合适。

例如：https://sites.google.com/site/algoxy/stree
你看这个在mail-list里发是啥效果？

--
刘

On Sep 1, 8:30 pm, HaoPeiQiang <HaoPeiQi...@gmail.com> wrote:
> 建议你反过来，在别处做草稿，在这里放整理过的
>

> 2010/9/1 liuxinyu <liuxiny...@gmail.com>

HaoPeiQiang

unread,

Sep 1, 2010, 8:38:05 AM9/1/10

to pon...@googlegroups.com

那就贴个链接吧，这里毕竟是供大家讨论的，作为草稿对大家有点不够尊重。此外除了你自己的留言以外参与度太低，哪怕内容真的好，实际上也是过于水了。

2010/9/1 liuxinyu <liuxi...@gmail.com>

Zhang Jiawei

unread,

Sep 1, 2010, 9:12:03 AM9/1/10

to pon...@googlegroups.com

曲高和寡，阳春白雪，liuxinyu 我顶你一下。其实我也不知道是你的帖子水了，还是这里水了。

您继续贴，我收着，博客上最好也留一份，放这里过几年也不知道被压到那里去了。

liuxinyu

unread,

Sep 1, 2010, 10:18:29 AM9/1/10

to TopLanguage

Hi,

>>>
作为草稿对大家有点不够尊重
<<<
这个帽子有点大了，我可不敢戴。罪过罪过。

"xxx飘过"这种灌水式的post没人拍砖，竟然拍到我头上来了。老夫在TL上因为技术争吵被拍砖自是乐呵呵的接受，这个帽子可够冤的。

另，7,8年前我看天涯那种八卦论坛上贴网络小说连载的也是这个路数，也没见过这种说法，今天也算开眼了。

--
刘

On Sep 1, 8:38 pm, HaoPeiQiang <HaoPeiQi...@gmail.com> wrote:
> 那就贴个链接吧，这里毕竟是供大家讨论的，作为草稿对大家有点不够尊重。此外除了你自己的留言以外参与度太低，哪怕内容真的好，实际上也是过于水了。
>

> 2010/9/1 liuxinyu <liuxiny...@gmail.com>

HaoPeiQiang

unread,

Sep 1, 2010, 10:23:58 AM9/1/10

to pon...@googlegroups.com

这不是帽子吧，我只是在跟你商量，什么叫帽子呢，如果我根据这个说法对你做了处罚才叫帽子

2010/9/1 liuxinyu <liuxi...@gmail.com>

Yongwei Wu

unread,

Sep 1, 2010, 10:32:40 AM9/1/10

to pon...@googlegroups.com

论坛里的东西不会自动跑进我的邮箱。

2010/9/1 liuxinyu <liuxi...@gmail.com>:

--

机械唯物主义 : linjunhalida

unread,

Sep 1, 2010, 10:51:09 AM9/1/10

to pon...@googlegroups.com

maillist是交流的地方,这个帖子发出来给大家看,到底是为了什么?

如果大家不知道这个帖子的目的, 那么这个帖子又有什么意义? 感到很奇怪.

2010/9/1 Yongwei Wu <wuyo...@gmail.com>

居振梁

unread,

Sep 1, 2010, 11:29:51 AM9/1/10

to pon...@googlegroups.com

2010/9/1 Yongwei Wu <wuyo...@gmail.com>

> "xxx飘过"这种灌水式的post没人拍砖，竟然拍到我头上来了。

哇，这种帖子貌似我发的最多，
给大家添乱了⋯⋯
对不起啦。

ps：最后一次公开飘过。
大家晚安！！！

windstorm

unread,

Sep 1, 2010, 11:38:02 AM9/1/10

to pon...@googlegroups.com

如果觉得污染了自己的邮箱，可以mute嘛。

我比较喜欢这个系列的post，而且不管怎么说，都比TL上别的很无厘头的post好多了。

另外，贴草稿在很多技术maillist上并不是新鲜事，只不过别人那里可以讨论起来，我们这里讨论不起来而已。

----------------------------------------------------------------------------------
Yours Sincerely
Kun

www.kunli.info
http://twitter.com/cnbuff

2010/9/1 居振梁 <juzhe...@gmail.com>:

机械唯物主义 : linjunhalida

unread,

Sep 1, 2010, 11:44:25 AM9/1/10

to pon...@googlegroups.com

前提是"有意思的问题"

比如某人提出: 在研究如何干掉gfw, 然后指出几个方向,最后: "先开个头，后继跟帖给实现"

这个绝对讨论得起来.

2010/9/1 windstorm <likunar...@gmail.com>

Samuel Ji

unread,

Sep 1, 2010, 11:44:45 AM9/1/10

to pon...@googlegroups.com

在 2010年9月1日下午11:38，windstorm <likunar...@gmail.com>写道：

如果觉得污染了自己的邮箱，可以mute嘛。

同意

我比较喜欢这个系列的post，而且不管怎么说，都比TL上别的很无厘头的post好多了。

这种技术性含量较高的即使一时半会看不了，也可以mark住留着以后看

另外，贴草稿在很多技术maillist上并不是新鲜事，只不过别人那里可以讨论起来，我们这里讨论不起来而已。

”“”

maillist是交流的地方,这个帖子发出来给大家看,到底是为了什么?

如果大家不知道这个帖子的目的, 那么这个帖子又有什么意义? 感到很奇怪.

“”“

你的思想也挺奇怪的：有人发文，其他人不懂或者懒得回，不能怪别人水吧？

机械唯物主义 : linjunhalida

unread,

Sep 1, 2010, 11:52:00 AM9/1/10

to pon...@googlegroups.com

"由于复杂和篇幅原因，CLRS上没有给Delete部分的算法pseudo code。而是作为作业题18.3-2留下来了。
扩展一下，可以不仅仅implement B-Tree，而且实现下B+-Tree和B*-Tree。"

个人不觉得这个有发出来的价值.所以感到奇怪, 不是说这个是水.

Zhang Jiawei

unread,

Sep 1, 2010, 11:02:07 PM9/1/10

to pon...@googlegroups.com

就是因为这样，想发点技术含量的最后都打消了念头，一石激起千层浪的那种，发的起劲。

水就是这么来的。

HaoPeiQiang

unread,

Sep 1, 2010, 11:16:10 PM9/1/10

to pon...@googlegroups.com

你的东西要是真好，就不用担心这个，liuxinyu的东西我兴趣不大，所以内容关注不多，只是其实一直有些人质疑他的东西，我才和他商量一下方式而已，也没有人说不让他发布。

组是一个需要大家互相沟通的东西，东西好，当然好，但是形式如果干扰了大家，会有质疑也没有什么稀奇，修正一下形式就可以了。

2010/9/2 Zhang Jiawei <gho...@gmail.com>

Samuel Ji

unread,

Sep 1, 2010, 11:18:23 PM9/1/10

to pon...@googlegroups.com

在 2010年9月2日上午11:16，HaoPeiQiang <HaoPe...@gmail.com>写道：

你的东西要是真好，就不用担心这个，liuxinyu的东西我兴趣不大，所以内容关注不多，只是其实一直有些人质疑他的东西，我才和他商量一下方式而已，也没有人说不让他发布。

说实话,很多人发的东西(包括你的apple软广告)我更加不感兴趣(甚至反感).

HaoPeiQiang

unread,

Sep 1, 2010, 11:21:12 PM9/1/10

to pon...@googlegroups.com

你当然可以说，你甚至可以要求ban我，这没什么的。

如果要求的人多，大不了我走人，这也没什么的。

下次我发软广告的时候，你调查一下有多少人想赶我走吧，多的话，我自动。

我自己对liuxinyu的东西谈不上喜好，或者说讨厌，只是尽管理员的义务相应一下大家的质疑而已。

2010/9/2 Samuel Ji <princeofd...@gmail.com>

tony.jiao

unread,

Sep 1, 2010, 11:24:47 PM9/1/10

to pon...@googlegroups.com

我觉得挺好的.而且我把这个thread还标星了.

在 2010年9月2日上午11:16，HaoPeiQiang <HaoPe...@gmail.com>写道：

HaoPeiQiang

unread,

Sep 1, 2010, 11:27:32 PM9/1/10

to pon...@googlegroups.com

如果你们觉得好，那就多参与一下，就算平时不参与也罢，有人说他的内容水的时候，你们也不冒出来参与的话，更多的人就会更觉得这个东西水了。

liuxinyu我建议你给你的这个系列，加一个tag，喜欢的人，写个过滤器加星方便，不喜欢的人，写个过滤器mute也方便。

2010/9/2 tony.jiao <baojin...@gmail.com>

Wang Feng

unread,

Sep 1, 2010, 11:28:26 PM9/1/10

to pon...@googlegroups.com

2010/9/2 HaoPeiQiang <HaoPe...@gmail.com>:

> 你当然可以说，你甚至可以要求ban我，这没什么的。
> 如果要求的人多，大不了我走人，这也没什么的。
> 下次我发软广告的时候，你调查一下有多少人想赶我走吧，多的话，我自动。
> 我自己对liuxinyu的东西谈不上喜好，或者说讨厌，只是尽管理员的义务相应一下大家的质疑而已。

喜欢苹果向别人推介, 只要不是太频繁, 我倒不是很介意.
我介意的是一个手里挥舞着 "ban" / "处罚" 各种大棒的管理员时不时出来耀武扬威一番.
今天你威胁(或者涉嫌威胁)的网友就有两个, 即使他们真的错了, 你就不能跟他们发私信说下么? 非要让别人都看到?

机械唯物主义 : linjunhalida

unread,

Sep 1, 2010, 11:29:44 PM9/1/10

to pon...@googlegroups.com

toplanguage里面一起学习如何实现一些算法还是挺好的.

建议这样的内容名称改成: 大家一起来实现XXX吧

这样号召大家参与更有料些.

如果是只是自己练习的话,还是放在自己博客里面好些.

2010/9/2 tony.jiao <baojin...@gmail.com>

Zhang Jiawei

unread,

Sep 2, 2010, 12:19:44 AM9/2/10

to pon...@googlegroups.com

我来说两句，这里的管理员虽然也ban ban ban，比起华蟒好多了。

我昨天晚上就是被人挑逗了一下回一句话，结果第二天就被ban了，顺便菊花也被问候了。今天要求解封，干脆让我装逼滚了。

管理员还是找淡定点的好。

OxFAN

unread,

Sep 2, 2010, 12:37:46 AM9/2/10

to pon...@googlegroups.com

其实我觉得liuxinyu的这些贴都挺好的，我还mark收藏了。只要有看的人，有人觉得有价值就留着呗，能占多少地方？

要是按个人喜好来决定哪些贴该发还是不该发，就有点个人主义到味道了。

最后希望liuxinyu能让自己的技术贴更互动一些，让大家来参与讨论而不是只有一个人发言，毕竟这是讨论组，不是blog。

Dexter Xiong

unread,

Sep 2, 2010, 12:50:56 AM9/2/10

to pon...@googlegroups.com

帖子没有人互动，不代表它不该发，也不能代表它没有价值。大家看看，喜欢的就mark，不喜欢的就跳过。

我倒是觉得这个帖子是linxinyu自己思考的一个整理，有时候有价值的不是结果，而是思考的过程，这样的草稿才是更值得我们讨论的，即使我们无法参与讨论也可以学习到他思考问题的一个过程，这样的帖子才是真的有价值有意义的。

这里是个技术论坛，应该都是欢迎这样的技术帖，没有人互动的原因有很多，帖子有没有价值也完全和互动没有关系，有价值的讨论互动才是有意义的，灌水之类的互动是毫无意义的，又不是来混脸熟的～

随着我们回的这么多的帖，让我觉得这个本来很技术的帖子也已经变成水贴了，我相信liuxinyu看到这样的情况，再出门看看外面那么多水贴，也一定会心寒的

我加入这里时间并不是很长，但是也能明显的感觉到这里越来越水了，有价值的东西越来越少。

大家应该都不会希望看到这样的结果吧。

2010/9/2 OxFAN <oday...@gmail.com>:

Zhang Jiawei

unread,

Sep 3, 2010, 4:05:58 AM9/3/10

to pon...@googlegroups.com

不太懂 B Tree 刚刚开始看算导的这部分。就问一句： B Tree 每个节点包含 Key 的数量必须一样多吗？

liuxinyu

unread,

Sep 3, 2010, 6:02:28 AM9/3/10

to TopLanguage

Hi,

B-tree每个节点包含的key的数量不一定相等。除根节点外，所有节点的拥有的key的数量在一个范围以内。
如果我们定义一个t叫做minimum degree，那么，出了根节点外，所有的节点拥有的key的数量在t-1到2t-1之间。

之所以这样限制，是为了保证B-tree的平衡特性，避免出现二叉树中瘸腿的极端情况，如
（null A (null B (null C (...)))))

平衡的B-tree带来的好处是，对于磁盘访问这种操作，保证代价都比较接近O(ln N)而不会退化到接近O(N)的情况
（上述瘸腿二叉树的复杂度就退化到了O(N)）

另外，我没有仔细google,但是我很有可能是第一个给出B-tree纯函数实现（FP）中文解释的。

把CLRS上的伪代码翻译成C/Python/Java等imperative语言的并不难，用ML\Haskell\Scheme提供纯FP实现
很挑战。有些数据结构几乎google不到任何FP结果。这方面成就最大的恐怕就是Chris Okasaki了。

--
刘新宇
https://sites.google.com/site/algoxy/btree

jigsaw

unread,

Sep 4, 2010, 4:05:10 PM9/4/10

to pon...@googlegroups.com

问一下，in place的sort，比如heap sort，用纯fp来做是不是无解？我的意思是，无法达到相应的BigO

2010/9/3 liuxinyu <liuxi...@gmail.com>

liuxinyu

unread,

Sep 6, 2010, 4:19:29 AM9/6/10

to TopLanguage

Hi,

脱稿。

PDF全文：
https://sites.google.com/site/algoxy/btree/btree-en.pdf

HTML全文：
https://sites.google.com/site/algoxy/btree

HTML的缺点是pseudo code的排版htlatex没有处理好，推荐pdf的。

全部源代码：
https://sites.google.com/site/algoxy/btree/btree.zip

文档是FDL协议，源代码是GPL协议。

另外，完整book也更新了这一部分：
https://sites.google.com/site/algoxy/home/main-en.pdf

book大小有2M，有些大。

--
刘新宇
https://sites.google.com/site/algoxy

liuxinyu

unread,

Sep 6, 2010, 4:22:31 AM9/6/10

to TopLanguage

Hi,

正巧我下一个题目是binary heap。现在先休息一下:)

有些case的确纯FP无解，典型的是Hash。
Heap Sort我觉得改变堆的数据结构也许能有什么新的思路。其实binary heap本质使用array来模拟tree，用FP挑战下应该比较
有趣。
--
刘
https://sites.google.com/site/algoxy

On Sep 5, 4:05 am, jigsaw <jig...@gmail.com> wrote:
> 问一下，in place的sort，比如heap sort，用纯fp来做是不是无解？我的意思是，无法达到相应的BigO
>

> 2010/9/3 liuxinyu <liuxiny...@gmail.com>

Mafish Liu

unread,

Sep 6, 2010, 4:50:38 AM9/6/10

to pon...@googlegroups.com

谢谢楼主，很细致的工作，收藏了。

liuxinyu

unread,

Sep 6, 2010, 5:31:37 AM9/6/10

to TopLanguage

另开了一个thread:
Heap Sort in FP
http://groups.google.com/group/pongba/browse_thread/thread/45a66cbd77919414/4d06ba84af5fd4b9?show_docid=4d06ba84af5fd4b9

On Sep 5, 4:05 am, jigsaw <jig...@gmail.com> wrote:

> 问一下，in place的sort，比如heap sort，用纯fp来做是不是无解？我的意思是，无法达到相应的BigO
>

> 2010/9/3 liuxinyu <liuxiny...@gmail.com>

Fei Yan

unread,

Sep 6, 2010, 7:57:11 AM9/6/10

to pon...@googlegroups.com

google sites需要翻墙才能看。

一点小小的建议，行文语法似乎有不少值得推敲，貌似Chinglish 有点多；如果愿意花时间润色一下，效果会好很多。

Zhang Jiawei

unread,

Sep 6, 2010, 10:19:51 AM9/6/10

to pon...@googlegroups.com

既然是 B Tree 帝，请教一个问题，算法导论里这个章节麻烦看一下：
18.3 Deleting a key from a B-tree
在删除 Key 的几个 case 里， D deleted: case 3b, 那个示意图里， D 所在的节点有 D E J K 这几个元素，但是下面对 case 3b 的分析里又说 b. If c i [x] and both of c i [x]'s immediate siblings have t - 1 keys 明显这里的 t - 1 = 2 所以 case 3b 只适用于 2 个元素的节点。貌似和示意图里4个节点矛盾。如果如示意图里所画，是否可以直接删除 D ？

不知是 errata 还是我没搞懂。望赐教。

liuxinyu

unread,

Sep 6, 2010, 10:31:37 PM9/6/10

to TopLanguage

Hi,

前两天有人说修改hosts，然后https能访问。我也是没有办法，太多东西在google sites上，结果突然就给禁掉了...

我的英文的确是很差，多谢批评。
我想将来等基本算法和数据结构都陆续弄好了后，放到github上请大家随便修改。也许有native speaker斧正，那就很幸运了。
反正是GPL+FDL，几乎没有任何限制。不过现在这上面的东西太少了，实在不好意思现在就call for review.

--
刘
https://sites.google.com/site/algoxy

On Sep 6, 7:57 pm, Fei Yan <skyscribe...@gmail.com> wrote:
> google sites需要翻墙才能看。
> 一点小小的建议，行文语法似乎有不少值得推敲，貌似Chinglish 有点多；如果愿意花时间润色一下，效果会好很多。
>

liuxinyu

unread,

Sep 6, 2010, 10:25:41 PM9/6/10

to TopLanguage

Hi，

略微解释下：
请先看figure 18.8 (d)，现在要从这个图示的树中删除"D"这个key；首先检查根节点，发现"D"不在root中，所以进入case
3；

由于"D"<"P"，故而根节点的左子节点("C","L")为待查节点，这个节点只有两个key，"C"和"L"，t-1=2,故而检查其
sibling；
这个节点没有左侧sibling，故而只能检查其右侧sibling，也就是("T", "X")这个节点。发现其也只有两个key。

因此进入case 3b，于是将("C", "L")，还有"P"，以及("T", "X")进行merge，然后再删除"D"，这样就出现
figure 18.8 (e)
所示的样子，由于这时root不含任何key，而只有一个child，所以要进行shrink。

请特别注意，''introduction to algorithm''中的B-tree插入和删除算法的策略，用一句话概括就是：
防患于未然

插入时，不是发现插入后key的数目超了才去调整，而是预先调整(split)；
删除时也不是发现删除后发现key的数目不够了才去调整，而是预先调整(向左右借，或者merge)；
这样的好处是，自root向叶子，go down一次，而不用回溯。

请对比我给出的FP的算法，采用的策略正好相反：
亡羊补牢

插入时，随便插，发现key超了后，递归调整(split）
删除时，随便删，发现key不够了后，递归调整(向左右借）

--
刘
https://sites.google.com/site/algoxy/btree

On Sep 6, 10:19 pm, Zhang Jiawei <ghos...@gmail.com> wrote:
> 既然是 B Tree 帝，请教一个问题，算法导论里这个章节麻烦看一下：
> 18.3 Deleting a key from a B-tree
> 在删除 Key 的几个 case 里， D deleted: case 3b, 那个示意图里， D 所在的节点有 D E J K 这几个元素，但是下面对
> case 3b 的分析里又说 b. If c i [x] and both of c i [x]'s immediate siblings have
> t - 1 keys 明显这里的 t - 1 = 2 所以 case 3b 只适用于 2 个元素的节点。貌似和示意图里4个节点矛盾。
> 如果如示意图里所画，是否可以直接删除 D ？
>
> 不知是 errata 还是我没搞懂。望赐教。
>

> 在 2010年9月6日下午5:31，liuxinyu <liuxiny...@gmail.com>写道：
>
> > 另开了一个thread:
> > Heap Sort in FP
>

> >http://groups.google.com/group/pongba/browse_thread/thread/45a66cbd77...

OxFAN

unread,

Sep 6, 2010, 10:39:22 PM9/6/10

to pon...@googlegroups.com

推荐你试一下ipv6反向代理访问google服务，windows下用teredo，*nix下用miredo 客户端，强烈建议对google服务依赖的用户使用此方法..

配置好teredo或安装miredo后改hosts，hosts文件在此 https://docs.google.com/Doc?docid=0ARhAbsvps1PlZGZrZG14bnRfNjFkOWNrOWZmcQ&hl=zh_CN

wing fire

unread,

Sep 6, 2010, 11:03:35 PM9/6/10

to pon...@googlegroups.com

一个可行的办法是，在hosts文件里，让你需要访问的google站点ip指向www.google.com的ip。

例如，nslookup得到地址：

docs.google.com 74.125.43.102 ...

www.google.com 173.194.35.104

在hosts里面，添加

173.194.35.104 docs.google.com

google.com会自动帮你转发到正确的服务器上。我这样设置后，https 访问 encrypted.google.com, google doc都没有问题了。

但是这个只是有些站点可以，picasaweb.google.com，www.youtube.com www.blogger.com都不行

---------------------------------------
绝圣弃知，大盗乃止；擿玉毁珠，小盗不起；

OxFAN

unread,

Sep 6, 2010, 11:09:45 PM9/6/10

to pon...@googlegroups.com

我推荐的方法可以访问youtube.com,blogger.com等所有google服务，不用https也可以，而且youtube缓冲速度很快。

貌似gfw目前过滤不了ipv6的数据.

Zhang Jiawei

unread,

Sep 8, 2010, 3:44:50 AM9/8/10

to pon...@googlegroups.com

噢，懂了。

原来是自顶向下开始依次按照算法导论里的cases规则来处理数据。

多谢。

Reply all

Reply to author

Forward