Error message org.xml.sax.SAXParseException: Parser has reached th...
From: tiny <tinyf...@gmail.com>
Date: Wed, 15 Aug 2007 11:43:36 -0700 (PDT)
Local: Wed, Aug 15 2007 2:43 pm
Subject: [Tinyfool的开发日记(blog)] Error message org.xml.sax.SAXParseException: Parser has reached th...

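Recently a program of mine that processes a very large XML file ran into
the following exception:

    org.xml.sax.SAXParseException: Parser has reached the entity
    expansion limit "64,000" set by the Application.

After some digging, it turned out that the number of entity references in
a single XML file had exceeded the default limit of 64,000. You can run
into this whether you parse the XML with DOM or with SAX, which supports
my guess that Java's DOM is implemented on top of SAX. The fix is simple:
when you run Java, add the parameter -DentityExpansionLimit=xxxxx, where
xxxxx is the maximum number of entity references allowed in a single file.

So how should you choose xxxxx? That is simple too: pick the largest value
you think could occur. As long as it is larger than the number of entities
in your file, there will be no problem.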
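
As a rough sketch of the same setting applied from inside the program instead of on the command line, the example below sets the entityExpansionLimit system property before creating a DOM parser. The class name EntityLimitDemo, the helper parseWithLimit and the sample document are made up for illustration. It is an assumption that the JDK in use reads the property when the parser is created; some newer JDKs document the name jdk.xml.entityExpansionLimit instead, so check the documentation for your version.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class EntityLimitDemo {

    // Hypothetical helper: sets the limit, parses the XML text with DOM,
    // and returns the text content of the root element.
    static String parseWithLimit(String xml, int limit) {
        // Same property that -DentityExpansionLimit=xxxxx sets on the command line.
        // Assumption: the parser picks it up when it is created afterwards.
        // Note that this changes a JVM-wide setting.
        System.setProperty("entityExpansionLimit", Integer.toString(limit));
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A SAXParseException about the expansion limit would surface here.
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // Tiny sample document: one declared entity, referenced three times.
        String xml = "<!DOCTYPE r [<!ENTITY e \"x\">]><r>&e;&e;&e;</r>";
        System.out.println(parseWithLimit(xml, 200000));
    }
}
```

The command-line form from the post (java -DentityExpansionLimit=xxxxx ...) has the same intent and does not require touching the code.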
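
What if you want to know how many entity references a particular file
contains (don't worry, I am certainly not going to suggest counting them
by hand)? That is simple as well. We know that every entity reference
starts with "&" and ends with ";", so we can count them with:

    grep -o "&[^;]*;" yourfile.xml | wc -l

(grep -c would count matching lines, not individual references.) In fact,
a literal & is written as &amp; in XML, so a valid XML file contains as
many entity references as it contains & characters. So a more efficient
version of the command above is:

    grep -o "&" yourfile.xml | wc -l

Why is there a limit on the number of entity references at all? This
puzzles me. Is it to reserve buffer space for resolving entity references?
But a buffer that grows automatically would not be impossible to build.
The problem with the -DentityExpansionLimit parameter is: what do you do
about XML files whose size you cannot predict? If you set it to 1 million
and the XML file contains 2 million entity references, what can you do?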
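
The counting step described in this post can also be done in Java, for example to estimate a limit before parsing a file of unknown size. This is only a sketch: the class name EntityRefCounter and its methods are made up for illustration, and like the grep approach it simply counts & characters, so any & inside a CDATA section or a comment is counted even though it is not a reference.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class EntityRefCounter {

    // Counts '&' characters while streaming, so a large file
    // does not have to fit in memory.
    static long countAmpersands(Reader in) throws IOException {
        char[] buf = new char[8192];
        long count = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '&') {
                    count++;
                }
            }
        }
        return count;
    }

    // Convenience overload for text that is already in memory.
    static long countAmpersands(String text) {
        try (Reader in = new StringReader(text)) {
            return countAmpersands(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Usage: java EntityRefCounter yourfile.xml
            try (Reader in = Files.newBufferedReader(Paths.get(args[0]))) {
                System.out.println(countAmpersands(in));
            }
        } else {
            // Small built-in sample with two references (&amp; and &lt;).
            System.out.println(countAmpersands("<r>a &amp; b &lt; c</r>"));
        }
    }
}
```

Whether every & counts toward the parser's limit depends on the parser, so treat the result as an upper-bound estimate rather than the exact number the parser will see.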

--
Posted by tiny at 8/16/2007 02:13:00 AM on Tinyfool的开发日记(blog)

