If you want "Computer Organization & Design", 3rd edition, I have it.

aleph

Jul 31, 2006, 11:22:53 PM

to csarch

I have this one.

yuxing tang

Jul 31, 2006, 11:26:19 PM

to alep...@gmail.com, csa...@googlegroups.com

Is it an electronic copy?
Could you give me one? Send it to my Gmail,
or put it on the newsmth-csarch FTP?

On 8/1/06, aleph <alep...@gmail.com> wrote:
> I have this one.

aleph

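Jul 31, 2006, 11:37:17 PM

to csarch

May I ask: what is the newsmth-csarch FTP? I'll upload it there.

Why are there only posts from 2005 here? Do you all discuss on newsmth-csarch now?
What is the address of the new newsmth-csarch?
I used to follow this topic on CLF, but few people there talk about it; the focus is mostly on compilers.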

Yubo Xie

Jul 31, 2006, 11:44:10 PM

to csa...@googlegroups.com

Download it from here:
http://purec.binghua.com/viewthread.php?tid=1490

--
Best Regards

Xie Yubo
Email: xie...@gmail.com Website: http://xieyubo.com/
Harbin Institute of Technology
Phone: 86-451-86416614 Fax: 86-451-86413309

yuxing tang

Aug 1, 2006, 12:15:36 AM

to csa...@googlegroups.com, xie...@gmail.com

The link you gave is for
Operating.Systems.Design.and.Implementation.3rd
not
Computer Organization & Design.3rd

Those are very different.
Did you post the wrong one?

yuxing tang

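Aug 1, 2006, 12:24:45 AM

to csa...@googlegroups.com, alep...@gmail.com

Navyant set up this csarch group while smth was in transition.
The smth CSArch board used this newsgroup temporarily.
After newsmth was established,
everyone moved back to newsmth:
http://www.newsmth.org/bbsdoc.php?board=CSArch

The CSArch FTP has been shut down. If the file is not too large, please send it to my Gmail.
The 3rd edition of the "column book" (圆柱, i.e. Computer Architecture: A Quantitative Approach) is only a bit over 10 MB, so Computer Organization & Design should not be much bigger.
The accompanying simulator and Verilog code should not be large either.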

On 8/1/06, aleph <alep...@gmail.com> wrote:

aleph

Aug 1, 2006, 1:03:00 AM

to csarch

I've sent it to you. Please take a look.

yuxing tang

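Aug 1, 2006, 1:23:00 AM

to csa...@googlegroups.com, alep...@gmail.com

What you sent is not Computer Organization & Design 3rd
but
Computer Architecture: A Quantitative Approach 2nd
i.e. the 2nd edition of the "column book" (圆柱). The 3rd edition is the current one, and the 4th edition comes out next year.
Still, the 2nd edition covers pipelining in considerable depth and is well worth reading even today.
Organization & Design is an introductory architecture text;
the column book goes deeper.

A distinguishing feature of O&D 3rd edition is that it adds a small processor implementation, with Verilog code.
Both books are by John L. Hennessy and David A. Patterson.

Anyway, thanks.

On 8/1/06, aleph <alep...@gmail.com> wrote:

> I've sent it to you. Please take a look.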

aleph

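Aug 1, 2006, 1:47:40 AM

to csarch

I'm reading the paper copy of O&D 2nd edition. I haven't read C&A yet.
People online say there is no need to finish O&D before reading C&A; is that right?
As for processors written in Verilog,
there are free ones to download online, but few come with a matching book and documentation. I saw online that there is a book that implements an ARM
processor in Verilog,
but that book is not available in China and I can't find an e-book of it.
DSPs are very popular now, so moving toward DSP might be an option.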

yuxing tang

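Aug 1, 2006, 1:59:57 AM

to csa...@googlegroups.com, alep...@gmail.com

Most DSPs use a VLIW architecture, with added acceleration units and instructions for algorithms such as the FFT.
The fundamentals still come from processor architecture research. If you read the 3rd edition of the column book (圆柱), it includes
case studies of network processors and media processors (similar to DSPs).
Personally, I think the strength of a DSP lies not in its microarchitecture but in the support from its development environment and libraries, and in the accompanying development boards.

For someone who already has a basic grounding in architecture (a computer organization course), there really is no need to read O&D in order to understand the column book.
The 2nd edition of the column book is also written in considerable detail.
But with a weak foundation, going straight to the 3rd edition will leave many concepts muddled.

Navyant also told me that O&D 3rd edition is actually not as clearly written as the 2nd.
I have long wanted to look at it myself. I'm especially interested in the example in it,
to see how large a processor is appropriate as an undergraduate project.
The code in Patterson and Hennessy's book should stand up to scrutiny.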

On 8/1/06, aleph <alep...@gmail.com> wrote:

aleph

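Aug 1, 2006, 2:42:05 AM

to csarch

yuxing tang, you really know processor architecture well;
you have read widely in this area.
Would you still recommend that I finish O&D 2nd edition before reading the column book (圆柱) 3rd edition?
I've read a few chapters of O&D 2nd edition.
It has already taken me a long time, but I have really learned a lot.
I've also been wanting to do a small processor project, written in Verilog
and implemented on an FPGA, but I'm only a beginner in both architecture and Verilog,
so for now I can only watch from the outside.

On the compiler side, a new architecture would also be a chance to learn porting, e.g. GCC.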

yuxing tang

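Aug 1, 2006, 2:51:16 AM

to csa...@googlegroups.com, alep...@gmail.com

This is my field, so I've read quite a bit.
Many people on newsmth are studying for an MS or PhD in computer architecture abroad, and you can discuss things with them.
But many of them are busy, so basic questions may not get noticed,
and advanced questions may not get answered. That's how it is in China...
If you have a real question, post it to comp.arch; as long as it is asked properly, you will usually get a satisfactory answer.

If you want a small processor to practice on, have a look at www.opencores.org,
in particular openrisc1k.
It provides the whole flow, from the chip to the board to GCC and the GNU toolchain.

Porting GNU CC to a new instruction set architecture is not very hard.
If you don't care about optimization, then according to zeal on newsmth, an undergraduate can get it done in a month.
Only the back end needs to be modified.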

On 8/1/06, aleph <alep...@gmail.com> wrote:

aleph

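Aug 1, 2006, 3:14:42 AM

to csarch

A while ago I was learning Verilog and came across a very simple
RISC
design in a book. I got hooked and started reading introductory architecture books. At work I use processors such as DSPs and ARM, so I really want to understand them.
As you say, porting GNU CC
is not hard; if you read the documentation carefully you can usually manage it.
But jobs in compilers and architecture are hard to find.
I have several friends in China working in this area, and it is very hard for them to move to a similar job, because there are too few IC companies in China. Foreign companies generally offer only maintenance and support positions in China, not development ones.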

yuxing tang

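Aug 1, 2006, 3:39:15 AM

to csa...@googlegroups.com, alep...@gmail.com

There are basically no companies in China with the strength to really do this work.
But architecture and compilers can be considered the foundation of all of computer engineering (not computer science).
With that foundation, everything else is easy.
Changing research direction is also relatively easy; you can do systems software research, such as OS and compiler optimization.

The Intel and IBM research centers in China do offer some development positions.
If you are strong enough to work on a leading GNU project, such as LVS,
or have research papers published at ISCA, Micro, HPCA, IPDPS, CGO, or ASPLOS, you won't need to worry about finding a job.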

On 8/1/06, aleph <alep...@gmail.com> wrote:

yao gang

Aug 1, 2006, 4:08:42 AM

to csa...@googlegroups.com

Please send me a copy too...

On 8/1/06, aleph <alep...@gmail.com> wrote:

I have this one.

aleph

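Aug 1, 2006, 4:12:20 AM

to csarch

Very insightful!
In processor design, writing the CPU in Verilog
is only a small part of the whole process; there is a lot more work to do before actual tape-out.
The real difficulty in CPU design is in the back end, which seems to involve analog stuff that I don't understand either. ^_^
So architecture is hard to classify; I don't know whether it counts as CS or microelectronics.
FPGA-based soft cores are popular now. Altera has released one called Nios, which is very well known.
But I think it is more concept than practical use; not many are really in commercial use.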

aleph

Aug 1, 2006, 4:16:30 AM

to csarch

What is your email address?

yao gang

Aug 26, 2008, 1:37:59 PM

to csa...@googlegroups.com

I find the following paragraph hard to understand.

>>>>

The following results are seen from a simulation study of five floating-point benchmarks and two integer benchmarks from the SPEC92 suite. The branch misprediction rate nearly doubles from 5% to 9.1% going from 1 thread to 8 threads in an SMT processor. However, the wrong-path instructions fetched (on a misprediction) drops from 24% on a single-threaded processor to 7% on an 8-thread processor.

<<<<<

I had assumed each thread has its own branch predictor. Or is the predictor normally shared across the whole processor?

Navy Ant

Aug 26, 2008, 5:03:18 PM

to csa...@googlegroups.com

From the context, a shared branch predictor is assumed. POWER5 uses a
shared one (only the RAS is separate), for example. I would assume
that an SMT processor normally uses a shared predictor.
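
One plausible reason a shared predictor loses accuracy as threads are added is aliasing: branches from different threads land on the same table entry and overwrite each other's history. The following is only a minimal sketch of that effect, not a model of any real processor or of the simulated machine in the quoted study. The class `BimodalPredictor`, the 16-entry table, and the two branch addresses are all made up for illustration.

```python
class BimodalPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the PC."""

    def __init__(self, entries=16):
        self.entries = entries
        self.table = [2] * entries  # every counter starts at "weakly taken"

    def predict(self, pc):
        return self.table[pc % self.entries] >= 2

    def update(self, pc, taken):
        i = pc % self.entries
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def count_mispredictions(predictor_for, streams, rounds=100):
    """Interleave the threads round-robin and count wrong predictions."""
    misses = 0
    for _ in range(rounds):
        for tid, (pc, taken) in enumerate(streams):
            predictor = predictor_for(tid)
            if predictor.predict(pc) != taken:
                misses += 1
            predictor.update(pc, taken)
    return misses


# Hypothetical workload: thread 0 has an always-taken branch at 0x40,
# thread 1 a never-taken branch at 0x50. Both map to table index 0.
streams = [(0x40, True), (0x50, False)]

shared = BimodalPredictor()
shared_misses = count_mispredictions(lambda tid: shared, streams)

private = [BimodalPredictor() for _ in streams]
private_misses = count_mispredictions(lambda tid: private[tid], streams)

print(shared_misses, private_misses)
```

Under these assumptions the shared table mispredicts thread 1's branch in every round (100 misses), while private tables settle after a single miss. Real predictors and real workloads alias far less severely than this contrived pair.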

2008/8/26 yao gang <nobo...@gmail.com>:

yao gang

Aug 27, 2008, 6:28:43 PM

to csa...@googlegroups.com

Thanks, Navy Ant. Two further thoughts and questions:

a. What do you mean by RAS?

b. For the second fact, that SMT fetches fewer wrong-path instructions, I am not sure I understand it correctly, but here is a naive example. If I have 8 threads (e.g. assume they are exactly the same), they will fetch 8N wrong instructions. However, in an 8-thread processor, if I ignore the pipeline setup cost, I only fetch N wrong instructions. Is this scenario correct?

Navy Ant

Aug 29, 2008, 1:01:56 PM

to csa...@googlegroups.com

RAS = Return Address Stack. It is used to predict the targets of
return instructions.
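
As a rough illustration of the idea: the front end pushes the fall-through address on every call and pops it to predict the target of the matching return. This is a sketch only. The class name, the depth of 8, and the drop-oldest overflow policy are assumptions, not taken from any particular processor.

```python
class ReturnAddressStack:
    """Small fixed-depth stack that predicts where a return will jump to."""

    def __init__(self, depth=8):
        self.depth = depth
        self.stack = []

    def on_call(self, return_addr):
        # On overflow, drop the oldest entry (one possible policy).
        if len(self.stack) == self.depth:
            self.stack.pop(0)
        self.stack.append(return_addr)

    def predict_return(self):
        # No prediction is available if the stack is empty.
        return self.stack.pop() if self.stack else None


ras = ReturnAddressStack()
ras.on_call(0x1004)   # outer call, returns to 0x1004
ras.on_call(0x2008)   # nested call, returns to 0x2008
inner = ras.predict_return()
outer = ras.predict_return()
print(hex(inner), hex(outer))
```

Because calls and returns nest, the most recent call is the first to return, which is why a stack fits. It also suggests why sharing one between threads would be harmful: interleaved pushes and pops from different threads would scramble the order.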

An SMT processor normally fetches instructions in a round-robin way
from all ready threads. Those N threads are typically different. (Your
assumption is not true in most cases). So the wrongly fetched
instructions are only 1/N of that in the single thread case.
Considering the prediction accuracy is compromised a little bit, it makes
sense that the wrongly fetched instruction rate drops from 24% to 7%,
rather than 3%.
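
The arithmetic behind that argument can be written out as a back-of-the-envelope estimate. It assumes, purely for illustration, that wrong-path fetch scales as 1/N under round-robin fetch and linearly with the misprediction rate. Only the four percentages come from the quoted passage; the scaling model is an assumption.

```python
threads = 8
wrong_path_1t = 0.24   # wrong-path fetch share, single thread (quoted)
mispredict_1t = 0.05   # misprediction rate, 1 thread (quoted)
mispredict_8t = 0.091  # misprediction rate, 8 threads (quoted)
reported_8t = 0.07     # wrong-path fetch share, 8 threads (quoted)

# Ideal case: each misprediction wastes only 1/N as many fetch slots.
ideal_8t = wrong_path_1t / threads

# Crude correction: more mispredictions, proportionally more wrong-path fetch.
adjusted_8t = ideal_8t * (mispredict_8t / mispredict_1t)

print(f"ideal 1/N share:               {ideal_8t:.1%}")
print(f"scaled by misprediction ratio: {adjusted_8t:.1%}")
print(f"reported:                      {reported_8t:.1%}")
```

The crude estimate (about 5.5%) lands between the ideal 3% and the reported 7%, which is at least consistent with the explanation above. The remaining gap is not explained by this sketch.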

2008/8/27 yao gang <nobo...@gmail.com>:

--
Yixin Shi

yao gang

Sep 15, 2008, 2:10:18 PM

to csa...@googlegroups.com

Another conceptual question, regarding multi-core processors.

Obviously a multi-core processor needs much higher memory bandwidth than a conventional single-core one.

E.g. the Sun T1 has 25.6 GB/s (1.2 GHz) and the Pentium has 6.4 GB/s (3.2 GHz).

I am wondering whether the power savings gained in the processor will be offset by the higher memory power.

I am asking because GPUs are squeezing their memory bandwidth to lower power. Is the CPU world following the opposite trend?
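
To put the two figures on a common footing, one can normalize bandwidth by clock rate. This small sketch assumes the quoted numbers are peak memory bandwidths in GB/s and the clocks are in GHz. The function name is made up, and the result says nothing about power by itself.

```python
def bytes_per_cycle(bandwidth_gb_per_s, clock_ghz):
    """GB/s divided by GHz gives bytes per clock cycle (the 1e9 factors cancel)."""
    return bandwidth_gb_per_s / clock_ghz


t1 = bytes_per_cycle(25.6, 1.2)       # Sun T1 figures from the post
pentium = bytes_per_cycle(6.4, 3.2)   # Pentium figures from the post

print(f"T1:      {t1:.1f} bytes/cycle")
print(f"Pentium: {pentium:.1f} bytes/cycle")
```

On these numbers the T1 moves roughly ten times as many bytes per clock cycle as the Pentium, which is the scale of the gap the question is about.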