When to use varchar vs. clob

5,355 views
Skip to first unread message

Marcel

unread,
May 21, 2008, 8:25:59 PM5/21/08
to H2 Database
Hi!

Under data types 'varchar' I read that 'for large text data CLOB
should be used'. After approx. how many chars is a String considered
to be large?

Thanks,
Marcel

Thomas Mueller

unread,
May 22, 2008, 5:37:47 PM5/22/08
to h2-da...@googlegroups.com
Hi,

There is no hard limit, it depends on the use case. I will update the
CLOB documentation as follows:

CLOB is like VARCHAR, but intended for very large values. Unlike when
using VARCHAR, large CLOB objects are not kept fully in-memory. CLOB
should be used for documents and texts with arbitrary size such as XML
or HTML documents, text files, or memo fields of unlimited size.
VARCHAR should be used for text with relatively short average size
(for example shorter than 200 characters). Short CLOBs are stored
inline, but there is an
overhead compared to VARCHAR. Use PreparedStatement.setCharacterStream
to store values.

Regards,
Thomas

Marcel

unread,
May 22, 2008, 7:06:02 PM5/22/08
to H2 Database
OK, thanks Thomas!

Since I've Strings with more than 10k chars, I guess I should switch
to clob.

Cheers,
Marcel

nkp

unread,
May 23, 2008, 10:45:48 AM5/23/08
to H2 Database
My general rule of thumb is that if it's ONE line of text, it's a
VARCHAR, and if it's multiple lines, it's a CLOB.

One word on CLOBs and XML: It's my experience that although XML is
considered text, it should always go into a BLOB. This is because it's
not just text, but encoded text, i.e. it's essentially binary text. A
CLOB can get converted if client and server have different character
encodings, which would invalidate the XML; a BLOB is not converted,
preserving the correctness of the character encoding in the XML
declaration.

Thomas Mueller

unread,
May 24, 2008, 7:52:31 AM5/24/08
to h2-da...@googlegroups.com
Hi,

> My general rule of thumb is that if it's ONE line of text, it's a
> VARCHAR, and if it's multiple lines, it's a CLOB.

Yes, this is a simple and good rule. There are some exceptions, for
example an address could be stored using multiple lines, but it
usually short enough for a VARCHAR.

> One word on CLOBs and XML: It's my experience that although XML is
> considered text, it should always go into a BLOB.

Maybe you are right. It seems that most XML APIs in Java use InputStream.

In many cases the encoding is hardcoded as UTF-8, and XML is processes
as a String in the application, or a Reader is used. In such cases
CLOB is OK.

Regards,
Thomas

Nils Kilden-Pedersen

unread,
May 30, 2008, 9:30:04 AM5/30/08
to H2 Database
On May 24, 6:52 am, "Thomas Mueller" <thomas.tom.muel...@gmail.com>
wrote:
> In many cases the encoding is hardcoded as UTF-8, and XML is processes
> as a String in the application, or a Reader is used. In such cases
> CLOB is OK.

Probably, but I think that's only accidental. I've worked with
relational databases hosted on servers with EBCDIC code pages, and
when accessing CLOBs over JDBC from a non-EBCDIC client (like a PC),
the CLOBs (and CHAR, VARCHAR, etc) where all converted (as they
should), however this obviously corrupted the XML.
So I think the reason people generally use CLOBs, is because generally
people run (at least test) their software in fairly homogeneous
environments, and may not ever see the potential problem.

Just a word up.

Marcel

unread,
Jun 3, 2008, 8:40:13 PM6/3/08
to H2 Database
Hi!

nkp wrote:
> My general rule of thumb is that if it's ONE line of text, it's a
> VARCHAR, and if it's multiple lines, it's a CLOB.
Hmm, I have only one line. Every time. But the line usually has about
30,000 chars in it.
VARCHAR or CLOB?
In MySQL I used column types like 'tinytext' and 'longtext'. Does H2
use/know these types, too?

Thanks,
Marcel

Thomas Mueller

unread,
Jun 5, 2008, 2:37:59 PM6/5/08
to h2-da...@googlegroups.com
Hi,

> Hmm, I have only one line. Every time. But the line usually has about
> 30,000 chars in it.
> VARCHAR or CLOB?

CLOB.

> In MySQL I used column types like 'tinytext' and 'longtext'. Does H2
> use/know these types, too?

H2 maps the data types to BLOB. For the list of supported data types I
suggest you have a look at the H2 documentation.

Regards,
Thomas

Thomas Mueller

unread,
Jun 5, 2008, 2:43:21 PM6/5/08
to h2-da...@googlegroups.com
>> In MySQL I used column types like 'tinytext' and 'longtext'. Does H2
>> use/know these types, too?

Sorry, to CLOB of course ;-)

Reply all
Reply to author
Forward
0 new messages