Description
What kinds of storage (storage backends) ZODB has and how to use them.
This page explains in detail how ZODB stores data. This information helps you understand Plone’s database behavior and optimize your application.
ZODB is an object-oriented database. All data in ZODB is stored as pickled Python objects <http://docs.python.org/library/pickle.html>. Pickle is the object serialization module for Python.
The pickle format is a series of bytes. Here is an example of what it looks like (Python 2 session):
>>> import pickle
>>> data = { "key" : "value" }
>>> pickled = pickle.dumps(data)
>>> print pickled
(dp0
S'key'
p1
S'value'
p2
s.
It is not a very human-readable format.
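The session above uses Python 2 syntax. For readers on Python 3, a minimal sketch of the same round trip follows. It assumes protocol 0 (the text-based protocol) is requested explicitly, because newer Python versions default to a binary protocol. The printed bytes differ slightly from the Python 2 output.

```python
import pickle

data = {"key": "value"}

# Protocol 0 is the original ASCII-based pickle format; newer Python
# versions default to a more compact binary protocol.
pickled = pickle.dumps(data, protocol=0)
print(pickled.decode("ascii"))

# Unpickling rebuilds an equal, but distinct, Python object.
restored = pickle.loads(pickled)
print(restored == data, restored is data)
```

The point is only that a pickle is an opaque byte string from which the original object can be rebuilt.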
Even if you use the SQL-based RelStorage backend for ZODB, objects are still stored in the database as pickles: an SQL table has a fixed schema that cannot vary per row, whereas Python objects have no fixed schema.
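The idea of keeping schema-less objects in a fixed SQL schema can be sketched with the standard library alone. This is not RelStorage's actual table layout; the `pickled_objects` table and the `load` helper are hypothetical names used only for illustration.

```python
import pickle
import sqlite3

# Hypothetical single-table layout: one row per object, state as opaque bytes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pickled_objects (oid INTEGER PRIMARY KEY, state BLOB)"
)

# Two objects with completely different "schemas" fit the same table,
# because the database only ever sees a byte string.
objects = {
    1: {"title": "Front page", "tags": ["news", "home"]},
    2: {"width": 640, "height": 480},
}
for oid, obj in objects.items():
    conn.execute(
        "INSERT INTO pickled_objects (oid, state) VALUES (?, ?)",
        (oid, pickle.dumps(obj)),
    )


def load(oid):
    """Fetch the pickled state for one object id and unpickle it."""
    row = conn.execute(
        "SELECT state FROM pickled_objects WHERE oid = ?", (oid,)
    ).fetchone()
    return pickle.loads(row[0])


print(load(1))
print(load(2))
```

A consequence of this design is that the SQL server cannot query inside the objects; it only stores and returns their pickled state.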
Data is usually organized into BTrees, which are balanced trees rather than binary trees. More specifically, data is usually stored in an OOBTree (object-object BTree), which maps Python object keys to Python object values. The key is the object id in the parent container as a string, and the value is any pickleable Python object or primitive you store in your database.
A BTree stores data in buckets (OOBucket <http://docs.zope.org/zope3/Code/BTrees/OOBTree/OOBucket/index.html>).
A bucket is the smallest unit of data that is written to the database at once. Buckets are loaded lazily: a BTree loads only the buckets that store the keys being accessed.
A BTree tries to pack as much data as possible into one bucket. When one value in a bucket changes, the whole bucket must be rewritten to disk.
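The bucket behavior described above can be illustrated with a toy model. This is not the BTrees implementation: `ToyBucketTree`, `BUCKET_SIZE` and the in-memory "disk" list are hypothetical stand-ins chosen to make lazy loading and whole-bucket rewrites visible.

```python
from bisect import bisect_right

BUCKET_SIZE = 3  # hypothetical; chosen small to keep the demo readable


class ToyBucketTree:
    """Toy model: sorted keys split into fixed-size buckets stored as records."""

    def __init__(self, items):
        pairs = sorted(items.items())
        # "Disk": one record per bucket, each a dict of key -> value.
        self.disk = [
            dict(pairs[i:i + BUCKET_SIZE])
            for i in range(0, len(pairs), BUCKET_SIZE)
        ]
        # Index: first key of every bucket, used to pick the right bucket.
        self.first_keys = [min(bucket) for bucket in self.disk]
        self.loaded = {}   # buckets currently in memory
        self.writes = []   # log of (bucket number, keys written)

    def _bucket_no(self, key):
        return max(bisect_right(self.first_keys, key) - 1, 0)

    def _load(self, no):
        # Lazy loading: a bucket is read only when one of its keys is touched.
        if no not in self.loaded:
            self.loaded[no] = dict(self.disk[no])
        return self.loaded[no]

    def get(self, key):
        return self._load(self._bucket_no(key))[key]

    def set(self, key, value):
        no = self._bucket_no(key)
        bucket = self._load(no)
        bucket[key] = value
        # The whole bucket is written back, not just the changed value.
        self.disk[no] = dict(bucket)
        self.writes.append((no, sorted(bucket)))


tree = ToyBucketTree({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6, "g": 7})
print(tree.get("e"))        # touches only the bucket holding d, e, f
print(sorted(tree.loaded))  # bucket numbers loaded so far
tree.set("e", 50)
print(tree.writes)          # the full d, e, f bucket was written back
```

In this model, changing the single key `e` writes back all three keys of its bucket, which is why frequently changing small values that share a bucket with large ones can be costly.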
Plone has two fundamental ways to store data on a content object: as an attribute of the object, or in annotation storage.
When objects are stored in annotation storage, reading their values needs at least one extra database lookup to load the first bucket of the OOBTree.
If the value is going to be used frequently, and especially if it is read when viewing the content object, storing it in an attribute is more efficient than storing it in an annotation. This is because the __annotations__ BTree is a separate persistent object which has to be loaded into memory, and may push something else out of the ZODB cache.
If the attribute stores a large value, it will increase memory usage, as it will be loaded into memory each time the object is fetched from the ZODB.
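The trade-off between the two storage styles can be sketched by counting record loads in a toy object store. `ToyDB`, the record ids and the `annotations_ref` key are hypothetical and only model the idea that annotations live in a separate persistent record; this is not ZODB's API.

```python
import pickle


class ToyDB:
    """Toy object store: record id -> pickled state, counting every load."""

    def __init__(self):
        self.records = {}
        self.loads = 0

    def store(self, oid, state):
        self.records[oid] = pickle.dumps(state)

    def load(self, oid):
        self.loads += 1
        return pickle.loads(self.records[oid])


db = ToyDB()
# The content object's own record holds its attributes inline, plus a
# reference (here just a record id) to a separate annotations record.
db.store("doc", {"title": "Hello", "summary": "stored as attribute",
                 "annotations_ref": "doc.annotations"})
db.store("doc.annotations", {"my.addon.summary": "stored as annotation"})

# Reading an attribute: one load brings in the object and all its attributes.
db.loads = 0
doc = db.load("doc")
attr_value = doc["summary"]
attr_loads = db.loads

# Reading an annotation: the object is loaded, then the annotations record.
db.loads = 0
doc = db.load("doc")
annotations = db.load(doc["annotations_ref"])
ann_value = annotations["my.addon.summary"]
ann_loads = db.loads

print(attr_loads, ann_loads)
```

The same model also shows the cost of attributes: every attribute, large or small, is unpickled whenever the `doc` record is loaded.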
BLOBs are large binary objects like files or images.
BLOBs have been supported since ZODB 3.8.x. Plone 3.x still uses ZODB 3.7.x by default; ZODB 3.8.x works with it but is not officially supported.
When you use the BLOB interface to store and retrieve data, the data is physically stored as files on your file system. A file system, as the name says, is designed to handle files and performs far better on large binary data than storing that data inside ZODB.
BLOBs are streamable, which means that you can start sending the file to the HTTP client from its beginning without first buffering the whole file in memory (which is slow).
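Streaming from a file can be sketched with a plain generator. This is a general illustration using the standard library, not the ZODB BLOB API; `iter_file_chunks` and `CHUNK_SIZE` are hypothetical names, and a temporary file stands in for a BLOB file on disk.

```python
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size


def iter_file_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file's content piece by piece, starting from the beginning."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Create a 200 KiB stand-in for a BLOB file on the file system.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"x" * (200 * 1024))

# A consumer (for example an HTTP response) receives chunks one at a time;
# the generator itself never holds more than one chunk in memory.
sizes = [len(chunk) for chunk in iter_file_chunks(path)]
print(sizes)
os.remove(path)
```

Because the first chunk is available as soon as it is read, the response can start before the rest of the file has been touched.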
Plone’s Archetypes subsystem supports storing individual Archetypes fields in an SQL database. This is mainly an integration feature. Read more about this in the Archetypes manual.