Storage concerns


ste...@activitystream.com

Apr 13, 2015, 8:13:13 AM
to orient-...@googlegroups.com
Hi,

I'm running Orientdb for a relatively "small" project and the storage space taken by OrientDB puzzles me quite a bit.

I have 12 million "empty (not light)" edges (12.271.168) taking up 850mb of storage space.
I also have 25G worth of .sbc files even though the data files are "only" about 1G.

Am I doing something obviously wrong?

Regards,
 -Stefán



ste...@activitystream.com

Apr 14, 2015, 3:14:58 AM
to orient-...@googlegroups.com
Hi,

Can you at least tell me what these .sbc files are? 

Regards,
 -Stefan 

ky...@ovguideinc.com

Apr 14, 2015, 2:15:11 PM
to orient-...@googlegroups.com
This part of the docs details what the various extensions mean; however, I do not see ".sbc" listed.


This part of the javadoc lists ".sbc" as the default extension for
"com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerAbstract"

I'm still not sure what ".sbc" means, but it looks like it represents some sort of index structure. Hopefully those links are helpful.

ste...@activitystream.com

Apr 15, 2015, 4:53:56 AM
to orient-...@googlegroups.com
Hi,

Thank you for the links.

I'm running a test now to see how these accumulate, and the .sbc and .wal files take up 2/3 of the total storage space (1/3 each).

Viewing the content of the .sbc files reveals nothing, as they seem to contain the same "garbage" pattern repeatedly.

I'm quite concerned because this (alleged) storage inefficiency is affecting our plans.

Can someone who knows please chime in? (I'm hoping this is a settings/configuration mistake on my part.)

Regards,
 -Stefán

ste...@activitystream.com

Apr 16, 2015, 5:20:14 AM
to orient-...@googlegroups.com
Hi,

I think this is quite a serious matter that demands attention from those in-the-know.

Can someone please shed light on what these .sbc files are and what can be done to keep the .wal and .sbc files in check?

Regards,
 -Stefán

ste...@activitystream.com

Apr 16, 2015, 6:04:41 PM
to orient-...@googlegroups.com

No one?

Kyle

Apr 16, 2015, 9:17:25 PM
to orient-...@googlegroups.com
I don't think I am someone "in-the-know" but here are a few more points:

1)
Including more of the info specified in "IMPORTANT: Improving issue management" would probably be helpful.

At a minimum, say which version of OrientDB you are using.
Telling us more about the shape of your data, properties and indices could be helpful too.

2) 
This page of the doc describes the WAL stuff and says how to disable it.

This page of the doc has more performance help

3) 
My previous post established that the .sbc files are probably indices of some sort.

Looking at the files in a database, I see examples of these .sbc files:
./multi/collections_13.sbc

Inspecting the schema for that db shows I have some EMBEDDEDSET properties and an index on some of them.

I do not want to do this now, but you could experiment with creating different dbs, for example:
i) create one without collection-like properties (no EMBEDDEDSET, etc)
ii) create one with collection-like properties (EMBEDDEDSET, etc)
iii) create one with collection-like properties and create an index
iv) add a large amount of data to some of the previous dbs

and see how that affects which files exist and how large they are. Do some debugging/science!
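
For comparing the experimental dbs, a small script can total the on-disk size per file extension in each database directory. This is only a sketch in Python: it assumes each database is a plain directory of files on disk, and the function names and any path you pass in are placeholders rather than anything OrientDB provides.

```python
import os
from collections import defaultdict


def size_by_extension(db_dir):
    """Return {extension: total_bytes} for every file under db_dir."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            # Lower-case the extension so ".SBC" and ".sbc" are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[ext] += os.path.getsize(os.path.join(root, name))
    return dict(totals)


def report(db_dir):
    """Print extensions sorted by total size, largest first."""
    totals = size_by_extension(db_dir)
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ext:>8}  {size / 2**20:10.1f} MiB")
```

Calling `report()` on each test database directory after every step should show whether the .sbc or .wal totals grow out of proportion to the data files.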

4)
You can take a look at the code to try to figure more stuff out, this file might be relevant:

5) double/triple posting rapidly is rude!

Colin

Apr 16, 2015, 10:52:18 PM
to orient-...@googlegroups.com
Hi Stefán,

I'm working on getting you an answer.

Best regards,

-Colin

Orient Technologies

The Company behind OrientDB

ste...@activitystream.com

Apr 17, 2015, 3:33:02 AM
to orient-...@googlegroups.com

Thank you for the pointers Kyle.

The amount of storage used makes no sense given the data that I have stored, and the .sbc files "seem" to contain only a repeated pattern, in no "sensible" proportion to that data.

Regards,
 -Stefán

ste...@activitystream.com

Apr 17, 2015, 3:33:26 AM
to orient-...@googlegroups.com

Thank you Colin, I do appreciate that.

ste...@activitystream.com

Apr 17, 2015, 6:54:10 AM
to orient-...@googlegroups.com
Hi,

This is looking more and more like a "corrupt" database even though it's fully functional.

I have not been able to replicate this behavior by importing the same data again.

I'm not as concerned now as I was, but I would like to get a copy of it to you for analysis (if you like).

I really have no idea what happened but this instance is radically different from the other two that I have created using the same data.

I would still like to know what the .sbc files are to try to understand what went wrong with this instance.

Regards,
 -Stefan

Andrey Lomakin

Apr 21, 2015, 2:43:21 AM
to orient-database
Hi Stefan.
About your question:
.sbc files are used to store relations (edges) between vertexes.
In .sbc files, space occupied by removed edges is reused only once it makes up half of the total space. This makes it possible to keep all edges related to a single vertex close to each other on disk.
Could you set the parameter sbtreebonsai.freeeSpaceReuseTrigger to 0 and check the result?


--
Best regards,
Andrey Lomakin.

Luca Garulli

Apr 21, 2015, 4:44:06 AM
to orient-...@googlegroups.com
Hi guys,
We have a typo: the setting really is called "freee", with three e's! We are fixing it in the current SNAPSHOTs, so Stefan, use "sbtreebonsai.freeeSpaceReuseTrigger" for now, but remember to switch to:

sbtreebonsai.freeSpaceReuseTrigger

if you use 2.0.9-SNAPSHOT or 2.1-rc2 or later.
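
One way to apply the setting is as a JVM system property on the command line that starts the server or the embedding application. The sketch below, written in Python only so it can be run on its own, builds that `-D` flag for either spelling. It assumes OrientDB picks up its global settings from JVM system properties, which is worth checking against the configuration docs for your version; `jvm_flag` and its `typo_fixed` argument are made-up names for illustration.

```python
OLD_KEY = "sbtreebonsai.freeeSpaceReuseTrigger"  # three e's, before the fix
NEW_KEY = "sbtreebonsai.freeSpaceReuseTrigger"   # 2.0.9-SNAPSHOT, 2.1-rc2 or later


def jvm_flag(value, typo_fixed):
    """Build a -D flag for the free-space reuse trigger.

    typo_fixed: True when the running version already uses the renamed key.
    Assumption: the setting is read from a JVM system property.
    """
    key = NEW_KEY if typo_fixed else OLD_KEY
    return f"-D{key}={value}"


print(jvm_flag(0, typo_fixed=False))
```

The resulting string would be added to the Java options of whatever script launches the JVM.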

Lvc@

Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB

ste...@activitystream.com

Apr 29, 2015, 10:53:14 AM
to orient-...@googlegroups.com
Hi,

and thank you for the answers.

I will certainly try this. For some strange reason this database still consists of 36G worth of stb files.
FYI, I have not been able to replicate this behavior.

I will report my findings.

Regards,
 -Stefan