I plan to deploy a WebAnywhere server in China.

Cameron Wong

Dec 12, 2009, 12:28:55 AM
to webanyw...@googlegroups.com
Hello all,

I plan to deploy a WebAnywhere server in China, but I am not sure about
its capability and performance. Here is some information. Any
suggestions are highly appreciated!

Hardware of the first server:
CPU: Xeon 2400MHz, 4 core
RAM: 2 GB or 4 GB
Hard disk: 500 GB (1 or 2 drives)

Software:
OS: Redhat EL 5
Web server: apache2 or Nginx
TTS engine: Ekho (1 process for Cantonese, 1 process for Mandarin),
Festival (1 process for English), eSpeak (just for demo purposes, for
other languages)
WebAnywhere
System monitor (I will write a small script to monitor the status of the server)
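
A minimal sketch of what such a monitoring script could look like, assuming a Linux host. The function name `server_status` and both thresholds are hypothetical and chosen only for illustration; it reports the 1-minute load average and the free space on the disk that holds the voice cache.

```python
import os
import shutil


def server_status(cache_dir="/", load_limit=4.0, min_free_fraction=0.10):
    """Report load and disk headroom; the thresholds are illustrative assumptions."""
    # 1-, 5- and 15-minute load averages (available on Unix-like systems only).
    load1, _load5, _load15 = os.getloadavg()
    # Total/used/free bytes for the filesystem that contains cache_dir.
    usage = shutil.disk_usage(cache_dir)
    free_fraction = usage.free / usage.total
    return {
        "load1": load1,
        "free_fraction": free_fraction,
        "overloaded": load1 > load_limit,            # e.g. more load than cores
        "disk_low": free_fraction < min_free_fraction,
    }


if __name__ == "__main__":
    print(server_status())
```

Run from cron every few minutes, the output could be logged or mailed whenever `overloaded` or `disk_low` is true.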

I suppose this server can serve at least 10 clients at the same time.
It would be great if the number can reach 30 or even more. I am not
sure whether this is too optimistic.

When the number of users grows, I will add standalone TTS servers. But
I am not sure what to do when one web server is not enough.

I am not sure whether 2G RAM is enough.

Is the hard disk a bottleneck, given that there is a lot of reading of
the voice cache files? If so, will 2 hard disks run much faster than 1,
since they can be read at the same time (if I put cache files on
both disks)?
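
For illustration, if the cache were split across two disks, one simple way to spread the reads is to hash each cache key and pick a disk from the hash. This is only a sketch: the mount points, the `.mp3` suffix and the `cache_path` helper are hypothetical, and whether two disks actually beat one depends on the workload and on how much of the cache the OS already keeps in RAM.

```python
import hashlib
import os

# Hypothetical mount points for the two disks.
CACHE_ROOTS = ["/disk1/tts-cache", "/disk2/tts-cache"]


def cache_path(text, voice, roots=CACHE_ROOTS, suffix=".mp3"):
    """Map a (voice, text) pair to a stable file path on one of the disks."""
    key = hashlib.sha256(f"{voice}:{text}".encode("utf-8")).hexdigest()
    # The hash value decides which disk holds this sound file.
    root = roots[int(key, 16) % len(roots)]
    # A two-character subdirectory keeps any single directory from growing huge.
    return os.path.join(root, key[:2], key + suffix)
```

The same text and voice always map to the same file, so a lookup never has to search both disks.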

I would like to set up the hardware only once (I mean it would be
expensive and troublesome to add RAM or hard disks in the future,
because I will only have remote access to the server).

Currently, eSpeak does not run in a server-client mode. The espeak
program is invoked for every request, so the dictionary is loaded again
and again. This needs to be improved (by writing a wrapper that
implements the server-client mode).

Any comments? Thanks a lot!

Cameron Wong

Jeffrey Bigham

Dec 12, 2009, 1:58:39 PM
to webanywhere-dev
Hi Cameron,

Great to hear you'll be hosting a version in China!

Currently, we are hosting WebAnywhere with just a single machine. It
has the capability to add additional machines for TTS, but in practice
we haven't yet found this to be necessary.

I believe the machine we have has similar specifications to what you
mentioned. The bigger the hard drive the better, although 500 GB will
store a whole lot of speech. What I suspect is that after 10 GB or
so, you're mostly storing the long tail of speech sounds, which are
unlikely to be used again anyway. You'll want to write a script that
evicts the least recently used speech sounds from the speech cache.
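
A minimal sketch of such an eviction script, assuming the goal is to keep sounds that were accessed recently and to delete the ones that have gone unused the longest. The function name and the cache directory in the usage note are hypothetical. It relies on file access times, so it only works as intended if the filesystem records them (a `noatime` mount would defeat it).

```python
import os


def evict_least_recently_used(cache_dir, max_bytes):
    """Delete the least recently accessed files until the cache fits in max_bytes."""
    entries = []
    total = 0
    for dirpath, _dirnames, filenames in os.walk(cache_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            entries.append((st.st_atime, st.st_size, path))
            total += st.st_size

    entries.sort()  # oldest access time first
    removed = []
    for _atime, size, path in entries:
        if total <= max_bytes:
            break
        os.remove(path)
        total -= size
        removed.append(path)
    return removed
```

For example, `evict_least_recently_used("/var/cache/webanywhere", 10 * 1024**3)` would trim a cache at that (made-up) path down to roughly 10 GB.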

This setup has allowed us to scale to more than 30 simultaneous users,
which is actually a lot when you consider that most users are not
constantly making requests. They may spend only 20 minutes using
WebAnywhere, and much of that time is spent reading or repeating sounds
that the browser has already downloaded.

I think the biggest bottleneck is bandwidth. We are lucky to be
hosting WebAnywhere at the University of Washington, which is
connected to what is called "Internet 2," which gives it an extremely
high bandwidth connection to many locations around the US (and maybe
the world).

I think you will definitely see performance gains if you can make the
TTS engines work as a server, but we don't have terrible performance
with them now running as separate processes, so this seems like
something you could delay until it becomes necessary.

I hope this helps...in general, it has seemed in practice that
WebAnywhere doesn't require as many resources as people assume, but of
course the more power you put behind it the more likely it is to keep
up with demand.

Let me know if there are any other questions that I can answer for you.

Thanks,
Jeff

--
Jeffrey P. Bigham, Ph.D.
Assistant Professor, University of Rochester, Computer Science
Visiting Scientist, MIT CSAIL
http://www.jeffreybigham.com

Cameron Wong

Dec 13, 2009, 5:22:40 AM
to webanyw...@googlegroups.com
Hi Jeff,

Thank you very much for the comments! They are very valuable to me.

I am glad to hear that the setup can have more simultaneous users than
I expected.

You reminded me that bandwidth is a bottleneck. I will pay more
attention to this when choosing the IDC.

Thanks again!

Cameron
