Issue on how H2 handles storing LOBs

18 views
Skip to first unread message

Khairul Anwar

unread,
Jul 9, 2024, 9:33:08 AM (7 days ago) Jul 9
to H2 Database
Hi all, we are using the default H2 database that came with the open source WSO2 Enterprise Service Bus (ESB) application. Our H2 DB is setup in embedded mode as follows:
````xml
<datasource>
  <name>WSO2_CARBON_DB</name>
  <description>The datasource used for registry and user manager</description>
  <jndiConfig>
    <name>jdbc/WSO2CarbonDB</name>
  </jndiConfig>
  <definition type="RDBMS">
    <configuration>
      <url>jdbc:h2:repository/database/WSO2CARBON_DB;DB_CLOSE_ON_EXIT=FALSE;LOCK_TIMEOUT=60000</url>
      <username>wso2carbon</username>
      <password>wso2carbon</password>
      <driverClassName>org.h2.Driver</driverClassName>
      <maxActive>50</maxActive>
      <maxWait>60000</maxWait>
      <testOnBorrow>true</testOnBorrow>
      <validationQuery>SELECT 1</validationQuery>
      <validationInterval>30000</validationInterval>
      <defaultAutoCommit>false</defaultAutoCommit>
    </configuration>
  </definition>
</datasource>
````



We use this WSO2 ESB application to transform the format of XML messages between our customer's northbound system and our service provisioning system. That XML message is then saved to the H2 DB as a CLOB named `payload`, in a table of the following format:
````sql
CREATE TABLE CUSTOM_ORDER (
  id bigint,
  payload CLOB,
  state varchar2(20),
  category varchar2(100),
  description varchar2(100),
  externalid varchar2(100),
  priority varchar2(100),
  requestedCompletionDate timestamp,
  requestedStartDate timestamp,
  orderdate timestamp,
  startdate timestamp,
  completiondate timestamp
);
````



Our application handles on average 5 transactions per second, so this would mean up to 5 CLOBs being written to the DB every second, using the following command:
````xml
<config enableOData="false" id="default">
      <property name="carbon_datasource_name">WSO2_CARBON_DB</property>
   </config>
   <query id="insert_query" useConfig="default">
      <sql>INSERT INTO CUSTOM_ORDER(ID,PAYLOAD,STATE,CATEGORY,DESCRIPTION,EXTERNALID,PRIORITY,REQUESTEDCOMPLETIONDATE,REQUESTEDSTARTDATE) VALUES(?,?,?,?,?,?,?,?,?)</sql>
      <param name="ID" ordinal="1" sqlType="STRING"/>
      <param name="PAYLOAD" ordinal="2" sqlType="CLOB"/>
      <param name="STATE" ordinal="3" sqlType="STRING"/>
      <param name="CATEGORY" ordinal="4" sqlType="STRING"/>
      <param name="DESCRIPTION" ordinal="5" sqlType="STRING"/>
      <param name="EXTERNALID" ordinal="6" sqlType="STRING"/>
      <param name="PRIORITY" ordinal="7" sqlType="STRING"/>
      <param name="REQUESTEDCOMPLETIONDATE" ordinal="8" sqlType="TIMESTAMP"/>
      <param name="REQUESTEDSTARTDATE" ordinal="9" sqlType="TIMESTAMP"/>
   </query>
````



However, it seems that our embedded H2 setup is having trouble with it. The WSO2 ESB app is able to start and process messages normally, until it reaches a certain point and then it crashes. If we try to restart it, it will hang at startup when it tries to connect to the embedded H2 database.


We did some digging using the H2 console – it seems there is a parameter called h2.lobFilesPerDirectory, that we think is causing the issue. This param had a default value of `256`, and reaching it seems to break our H2 DB:
image


When we checked the system directory where the embedded H2 DB is stored (repository/database/WSO2CARBON_DB.db), we found a file next to it called WSO2CARBON_DB.lobs.db that was having 255 directories inside it when the issue happened:
image


When we changed the value of the h2.lobFilesPerDirectory parameter to 512 (ie. increasing the per directory limit) by including it as a Java option during WSO2 app startup (eg. ./wso2server.sh -Dh2.lobFilesPerDirectory=512 --start), we were able to work around the issue, and the WSO2 app manages to access the H2 DB and starts up normally again. Otherwise, we had to wipe and re-initialise the H2 DB in order to start the WSO2 application.


So my questions are:
1. How are these LOB directories created? As per our observation, the WSO2CARBON_DB.lobs.db has up to 255 parent *.lobs.db directories in it, and each of those directories has multiple child *.lobs.db directories. Each child *.lobs.db directory has one single XML message payload in it. However, the amount of child *.lobs.db directories is different in each parent *.lobs.db directory. (FYI, when we wiped the DB and re-initialised, in the last 2 days we noticed over 50 parent *.lobs.db directories were created)
2. What should be the recommended value of this h2.lobFilesPerDirectory parameter? Can we remove/disable it entirely so that we'll have an unlimited number of LOB files per directory? (Eg. by setting it to -1)
3. Is it possible to remove the old LOBs created in the H2 DB directory?. For example, setting up a cron job to proactively monitor the directory count and delete the old LOBs created on the fly (in case the count is reaching the threshold) to overcome this issue. What would happen if the WSO2 app asks the H2 database for a specific payload but the LOB file containing that data was deleted? Or can we just periodically run DELETE statements to remove old rows where REQUESTEDSTARTDATE is older than 7 days, for example?


Knowing the answer to Q1 will help us determine if Q2/Q3 would be a suitable fix, or if we need to recommend our customer use an external RDBMS instead where there won't be a limit to the number of LOB files per directory.


Thanks, and I appreciate your time. :)
Reply all
Reply to author
Forward
0 new messages