Q: O(T^2/n) @ 4.5 dynamic indexing of IIR

24 views
Skip to first unread message

Hongfei Yan

unread,
Mar 20, 2013, 1:03:38 AM3/20/13
to cs41...@googlegroups.com
Introduction to Information Retrieval, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schuetze, 2009.

第四章的4.5节讲 Dynamic indexing
In this scheme, we process each posting ⌊T/n⌋ times because we touch it
during each of ⌊T/n⌋ merges where n is the size of the auxiliary index and T
the total number of postings. Thus, the overall time complexity is O(T^2/n).

Question: 对于O(T^2/n)的理解?

answer:  T是指整个索引包含的posting数目,它是由n逐渐合并得来的,就是T是包含n的。
因此需要 ⌊T/n⌋合并。每次合并需要扫描T中每一个,所以是T*T/n,是 O(T^2/n)

Han Jiang

unread,
Mar 20, 2013, 10:22:31 AM3/20/13
to cs41...@googlegroups.com
老师,我的理解是这样:
书中的T是按图中的形式由叶节点合并来的:
因此,对于每个倒排记录平均处理是:
(n*T/n + n*(T/n-1) + n*(T/n-2) + ... + n * 1) / T = [T/2n]吧?

--
You received this message because you are subscribed to the Google Groups "cs410pku" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cs410pku+u...@googlegroups.com.
To post to this group, send email to cs41...@googlegroups.com.
Visit this group at http://groups.google.com/group/cs410pku?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Han Jiang

Team of Search Engine and Web Mining,
School of Electronic Engineering  and Computer Science
,
Peking University, China
图片1.png

Hongfei Yan

unread,
Mar 21, 2013, 12:53:19 AM3/21/13
to cs41...@googlegroups.com
(n*T/n + n*(T/n-1) + n*(T/n-2) + ... + n * 1) 
设T/n=a,则上式等于 n[a(a+1)/2] = n[T/n(T/n+1)/2] = T^2/2n + T/2,则 overall time complexity is O(T^2/2n)

Han Jiang

unread,
Mar 21, 2013, 7:23:11 AM3/21/13
to cs41...@googlegroups.com
嗯,觉得79页他说平均复杂度时高估了。

Li Yechen

unread,
Mar 23, 2013, 8:52:37 PM3/23/13
to cs41...@googlegroups.com
为什么要顺序merge 而不是等n个文件都生成了再归并把复杂度降成Tlog(n)?
上课的时候貌似想到为什么 现在又忘了..

Li Yechen

unread,
Mar 23, 2013, 9:14:16 PM3/23/13
to cs41...@googlegroups.com
哦是那个二进制合并的方法。。
Reply all
Reply to author
Forward
0 new messages