Looking for important Confluence requirements

JK

Aug 12, 2009, 10:54:06 PM

to Confluence in the (real) Enterprise

Dear All,

I have been working on Confluence development for 4 months for our
collaboration & KM needs, but before going ahead I have some very
important queries on collaboration, search, storage & media file
streaming, email alerts, integration, forums, blogs, workflow etc.

What is the maximum size limit that we can set for attachments and even
for pages (2 GB for pages, as page content gets stored in a database
cell)?

Can a single Space in Confluence store content or attachments of size
more than 3-5 GB, or is there any limitation on storage? Can
Confluence cope with a huge content size of 1-3 TB (1,000-3,000 GB)
including text, images, attachments etc.?

Is it advisable to store them in Confluence wiki pages, or should we
store them on another server, say a media server, and provide the
links in Confluence?

Do you have any case studies from existing customers who have rolled
out Confluence to a very large audience of more than 30,000 users?
Please let me know whether we can go ahead with a Confluence
implementation for such a huge audience, as I am looking for evidence
of its performance, or of issues if anyone has faced them. Does a
bigger audience affect the Confluence deployment?

Search

Apart from the existing search features, please let me know about:

How many million documents Confluence search can support in a single
index.

What is the maximum number of indexes supported.

Any case study showing whether we can display search results in
Confluence from different databases, i.e. federated searching.

Ability to start from one single starting point and search through the
entire hierarchy parent to child pages in single or multiple spaces.

Ability to filter search based on parameters such as date created/
updated.

Are there any performance requirements for search?

Please let me know where I can find more details on Confluence's
search architecture.

Are there any unique search features other than tag/label based
search?

I would like to have your response within the next few hours, as I did
yesterday.

Appreciate your help and support.

Thanks & regards
JK

Igor Minar

Aug 13, 2009, 1:37:34 AM

to enterprise...@googlegroups.com

On Aug 12, 2009, at 7:54 PM, JK wrote:

>
> Dear All,
>
> I am 4 months old in Confluence development for our collaboration &
> KM needs but to go ahead have very important queries on Collaboration,
> Search, Storage & Media file streaming, email alerts, integration,
> forum, blog, Workflow etc.
>
> What is the maximum size that we could limit for attachment and even
> pages (2 GB for Pages as pages get stored into database cell record)

I just saw a bug report recently that the editor stops working
properly when the content reaches ~500KB.

When it comes to attachments, you should know ahead of time whether you
are going to use a cluster or just a single node. With a cluster,
attachments currently _must_ be stored in the database, which is a big
issue for most of us (large backups, bad performance, etc). If you
don't need a cluster, then attachments will be stored on the filesystem
by default.

>
> Can a single Space in Confluence store content or attachments of size
> more than 3-5 GB or do we have any limitation on storage? Can
> Confluence sustain with the huge content size of 1-3 TB (1,000-3,000 GB)
> including text, images, attachments etc

Theoretically yes, but with 30k users you'll most likely kill your
container, especially if some of the clients have slow connections and
are downloading large files. The reason is that all attachment
downloads are handled via blocking IO within Confluence, which means
that each download occupies a worker thread in your container for the
whole duration of the request. That will very likely result in all of
your worker threads being used, and your server will appear to be
unresponsive to any new requests. Increasing the number of worker
threads is a possibility, but it is a very expensive workaround
(cpu-wise, memory-wise, db-connection-pool-wise).
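
To put rough numbers on the thread exhaustion described above, here is a
minimal sketch using Little's law (threads busy = download arrival rate x
download duration). The function name and all load figures are hypothetical
and chosen only for illustration; they are not measurements from any
Confluence instance.

```python
import math


def busy_worker_threads(downloads_per_second: float,
                        file_size_mb: float,
                        client_speed_mbit: float) -> int:
    """Estimate how many worker threads are tied up by downloads.

    With blocking IO each download holds one thread until it finishes,
    so by Little's law: busy threads = arrival rate * download duration.
    """
    # Megabytes -> megabits, divided by the client's link speed in Mbit/s.
    seconds_per_download = file_size_mb * 8 / client_speed_mbit
    return math.ceil(downloads_per_second * seconds_per_download)


# Hypothetical load: one 100 MB download starts every 2 seconds,
# and each client is on a 2 Mbit/s connection.
print(busy_worker_threads(0.5, 100, 2))  # prints 200
```

In this made-up scenario 200 threads are permanently busy serving downloads,
which is why slow clients and large files are the dangerous combination.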

>
> Is it advisable to store them into Confluence wiki pages or should we
> store them into other server say media server and provide the links in
> confluence.

That's what I would strongly encourage you to do. One of the many
items on my todo list is to integrate confluence with a storage
service similar to Amazon's S3.
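
As a rough illustration of the "store elsewhere, link from the wiki"
approach, the sketch below builds link text for a file hosted on a separate
server. The host name, the helper name and the file name are hypothetical,
and the `[label|url]` form is assumed to be the wiki-markup link syntax of
your Confluence version; adjust it if yours differs.

```python
from urllib.parse import quote

# Hypothetical media server; replace with wherever the files actually live.
MEDIA_BASE_URL = "http://media.example.com/files"


def media_link(label: str, filename: str) -> str:
    """Return wiki-markup link text pointing at an externally hosted file."""
    # quote() percent-encodes spaces and other unsafe characters in the name.
    return f"[{label}|{MEDIA_BASE_URL}/{quote(filename)}]"


print(media_link("Training video", "intro session.mp4"))
# prints [Training video|http://media.example.com/files/intro%20session.mp4]
```

The download then goes through the media server, so it never occupies a
worker thread in the Confluence container.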

>
> Do you have any case studies from existing customers to see if they
> have implemented

not that I know of

>
> Confluence to very large audience above 30,000 users. Please can you
> let me know again if we can go ahead Confluence implementation for
> such huge audience as i am looking for some evidences on its
> performance or issues if any one faced the same. Does bigger audience
> implementation affects the Confluence deployment?

According to Atlassian, the primary target for Confluence is
customers with up to 1000 users. We are close to 115k users now (those
are registered users, not all of them are active) and we see all kinds
of weird issues.

In my experience big confluence instances need good hw with lots of
memory, fast db and network, lots of gc tuning and occasional
confluence patches.

>
> Search
>
> Apart for existing available Search feature please do let me know on:
>
> How many million documents search Confluence can support in a single
> index.

my guess is that you will kill your db before you kill the search index.

>
> What is the maximum number of indexes supported.
>
> Any case study to see if we can display the search results in
> confluence from different
> databases i.e. federated searching.

I haven't heard of anyone doing that, but maybe Dan from Adaptavist
has some experience with that.

>
> Ability to start from one single starting point and search through the
> entire hierarchy parent to child pages in single or multiple spaces.

As far as I know, you can search only in one space or in all spaces
accessible to the current user. That means you can't search in two or
three specific spaces.

>
> Ability to filter search based on parameters such as date created/
> updated.

maybe

>
> Do we have performance requirements for search.

I don't know if you do :). In general you want search to be really
fast because in many cases Confluence is using the search index
instead of the database to speed things up.

>
> Please can you let me know where i can find more details on
> Confluence's Search Architecture.

Confluence uses Apache Lucene (http://lucene.apache.org/java/docs/).
Each node in the cluster has its own index, and the nodes are
synchronized via events broadcast through Oracle Coherence.

>
> Do we have any unique Search feature other than Tag/label based
> search.

The search is ACL sensitive, so only items that the current user has
access to will be displayed in the results.
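
The effect can be pictured with a small sketch that filters hits by the
spaces a user may read. The `Hit` type, the space keys and the
space-level-only check are simplifying assumptions; real Confluence
permissions are finer grained (pages can carry their own restrictions) and
the check happens inside the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    title: str
    space_key: str


def visible_hits(hits, readable_spaces):
    """Keep only the hits that live in spaces the current user may read."""
    return [hit for hit in hits if hit.space_key in readable_spaces]


all_hits = [Hit("Roadmap", "ENG"), Hit("Salaries", "HR"), Hit("Runbook", "OPS")]

# A user with read access to ENG and OPS never sees the HR page.
for hit in visible_hits(all_hits, {"ENG", "OPS"}):
    print(hit.title)
```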

cheers,
Igor

Matt Hodges [Atlassian]

Aug 13, 2009, 5:28:01 AM

to Confluence in the (real) Enterprise

> Confluence to very large audience above 30,000 users. Please can you
> let me know again if we can go ahead Confluence implementation for
> such huge audience as i am looking for some evidences on its
> performance or issues if any one faced the same. Does bigger audience
> implementation affects the Confluence deployment?

You may wish to take a look at our documentation on large instances of
Confluence - http://confluence.atlassian.com/x/4AiyCg. We've also
collated some resources on wiki best practices and adoption tips -
http://confluence.atlassian.com/x/7AKgCg.

For this project I would recommend that you explore Confluence
Clustered which offers additional scalability and higher availability.
You can read more about Confluence Clustered here -
http://www.atlassian.com/software/confluence/clustered.jsp.

I hope this helps.

Cheers,

Matt

Jeevan Kamble

Aug 13, 2009, 5:30:21 AM

to enterprise...@googlegroups.com

Thanks Matt,

I have been looking for you for a long time... All the queries were being answered by Krishna.

Looking for important Confluence requirements - Pls could you reply
