lmdb

This is a universal Python binding for the LMDB ‘Lightning’ Database. Two variants are provided and automatically selected during install: a CFFI variant that supports PyPy and all versions of CPython >=2.6, and a C extension that supports CPython 2.5-2.7 and >=3.3. Both variants provide the same interface.

LMDB is a tiny database with some excellent properties:

  • Ordered-map interface (keys are always sorted)
  • Reader/writer transactions: readers don’t block writers and writers don’t block readers. Each environment supports one concurrent write transaction.
  • Read transactions are extremely cheap.
  • Environments may be opened by multiple processes on the same host, making it ideal for working around Python’s GIL.
  • Multiple named databases may be created with transactions covering all named databases.
  • Memory mapped, allowing for zero copy lookup and iteration. This is optionally exposed to Python using the buffer() interface.
  • Maintenance requires no external process or background threads.
  • No application-level caching is required: LMDB relies entirely on the operating system’s buffer cache.

Installation

For convenience, a supported version of LMDB is bundled with the binding and built statically by default. If your system distribution includes LMDB, set the LMDB_FORCE_SYSTEM environment variable, and optionally LMDB_INCLUDEDIR and LMDB_LIBDIR prior to invoking setup.py.

The CFFI variant depends on CFFI, which in turn depends on libffi, which may need to be installed from a package. On CPython, both variants additionally depend on the CPython development headers. On Debian/Ubuntu:

apt-get install libffi-dev python-dev build-essential

To install the C extension, ensure a C compiler and pip or easy_install are available and type:

pip install lmdb
# or
easy_install lmdb

The CFFI variant may be used on CPython by setting the LMDB_FORCE_CFFI environment variable before installation, or before module import with an existing installation:

>>> import os
>>> os.environ['LMDB_FORCE_CFFI'] = '1'

>>> # CFFI variant is loaded.
>>> import lmdb

Named Databases

Named databases require the max_dbs= parameter to be provided when calling lmdb.open() or lmdb.Environment. This must be done by the first process or thread opening the environment.

Once a correctly configured Environment is created, new named databases may be created via Environment.open_db().

Storage efficiency & limits

LMDB groups records into pages matching the operating system’s page size, usually 4096 bytes. After a 16 byte page header, each page must hold at least 2 records, and each record carries 8 bytes of overhead. Due to this the engine is most space-efficient when the combined size of any (8+key+value) combination does not exceed 2040 bytes.

When an attempt to store a record would exceed the maximum size, its value part is written separately to one or more dedicated pages. Since the trailer of the last page containing the record value cannot be shared with other records, it is more efficient when large values are an approximate multiple of 4096 bytes, minus 16 bytes for an initial header.
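The thresholds above follow directly from the page geometry. A quick sanity check of the arithmetic, assuming the usual 4096-byte page:

```python
PAGE_SIZE = 4096      # operating system page size
PAGE_HEADER = 16      # per-page header
RECORD_OVERHEAD = 8   # per-record overhead

# A page must hold at least two records after its header.
max_inline = (PAGE_SIZE - PAGE_HEADER) // 2
print(max_inline)  # 2040

def fits_inline(key, value):
    """True if (key, value) can share a page rather than spill to overflow pages."""
    return RECORD_OVERHEAD + len(key) + len(value) <= max_inline

print(fits_inline(b'k' * 16, b'v' * 2016))  # True: 8 + 16 + 2016 == 2040
print(fits_inline(b'k' * 16, b'v' * 2017))  # False: one byte over
```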

Space usage can be monitored using Environment.stat():

>>> from pprint import pprint
>>> pprint(env.stat())
{'branch_pages': 1040L,
 'depth': 4L,
 'entries': 3761848L,
 'leaf_pages': 73658L,
 'overflow_pages': 0L,
 'psize': 4096L}

This database contains 3,761,848 records and no values were spilled (overflow_pages).

By default record keys are limited to 511 bytes in length, however this can be adjusted by rebuilding the library. The compile-time key length can be queried via Environment.max_key_size().

Memory usage

Diagnostic tools often overreport the memory usage of LMDB databases, since the tools poorly classify that memory. The Linux ps command’s RSS measurement may report a process as having an entire database resident, causing user alarm. While the entire database may really be resident, that is only half the story.

Unlike heap memory, pages in file-backed memory maps, such as those used by LMDB, may be efficiently reclaimed by the OS at any moment so long as the pages in the map are clean. Clean simply means that the resident pages’ contents match the associated pages that live in the disk file that backs the mapping. A clean mapping works exactly like a cache, and in fact it is a cache: the OS page cache.

On Linux, the /proc/<pid>/smaps file contains one section for each memory mapping in a process. To inspect the actual memory usage of an LMDB database, look for a data.mdb entry, then observe its Dirty and Clean values.

When no write transaction is active, all pages in an LMDB database should be marked clean, unless the Environment was opened with sync=False, and no explicit Environment.sync() has been called since the last write transaction, and the OS writeback mechanism has not yet opportunistically written the dirty pages to disk.
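The Dirty and Clean figures can also be pulled out of smaps programmatically. A sketch (Linux-only, returning an empty dict elsewhere; the parsing is simplified and may stop early on attribute lines lacking a kB suffix):

```python
import re

def lmdb_map_stats(pid='self'):
    """Return the kB-valued fields (Rss, Shared_Clean, Private_Dirty, ...)
    of the first data.mdb mapping in /proc/<pid>/smaps, or {} if unavailable."""
    stats = {}
    try:
        with open('/proc/%s/smaps' % pid) as fp:
            in_map = False
            for line in fp:
                if line.rstrip().endswith('data.mdb'):
                    in_map = True          # found the mapping header
                    continue
                if in_map:
                    m = re.match(r'(\w+):\s+(\d+) kB', line)
                    if m:
                        stats[m.group(1)] = int(m.group(2))
                    else:
                        break              # ran past the attribute lines
    except IOError:
        pass
    return stats

print(lmdb_map_stats())
```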

Bytestrings

This documentation uses bytestring to mean either the Python<=2.7 str() type, or the Python>=3.0 bytes() type, depending on the Python version in use.

Due to the design of Python 2.x, LMDB will happily accept Unicode instances where str() instances are expected, so long as they contain only ASCII characters, in which case they are implicitly encoded to ASCII. You should not rely on this behaviour! It results in brittle programs that often break the moment they are deployed in production. Always explicitly encode and decode any Unicode values before passing them to LMDB.

This documentation uses bytes() in examples. In Python 3.x this is a distinct type, whereas in Python 2.6 and 2.7 it is simply an alias for str(). Since Python 2.5 does not have this alias, you should substitute str() for bytes() in any code examples below when running on Python 2.5.

Buffers

Since LMDB is memory mapped it is possible to access record data without keys or values ever being copied by the kernel, database library, or application. To exploit this the library can be instructed to return buffer() objects instead of bytestrings by passing buffers=True to Environment.begin() or Transaction.

In Python buffer() objects can be used in many places where bytestrings are expected. In every way they act like a regular sequence: they support slicing, indexing, iteration, and taking their length. Many Python APIs will automatically convert them to bytestrings as necessary:

>>> txn = env.begin(buffers=True)
>>> buf = txn.get('somekey')
>>> buf
<read-only buffer ptr 0x12e266010, size 4096 at 0x10d93b970>

>>> len(buf)
4096
>>> buf[0]
'a'
>>> buf[:2]
'ab'
>>> value = bytes(buf)
>>> len(value)
4096
>>> type(value)
<type 'bytes'>

It is also possible to pass buffers directly to many native APIs, for example file.write(), socket.send(), zlib.decompress() and so on. A buffer may be sliced without copying by passing it to buffer():

>>> # Extract bytes 10 through 210:
>>> sub_buf = buffer(buf, 10, 200)
>>> len(sub_buf)
200

In both PyPy and CPython, returned buffers must be discarded once their producing transaction has completed or been modified in any way. To preserve a buffer’s contents, copy it using bytes():

with env.begin(write=True, buffers=True) as txn:
    buf = txn.get('foo')           # only valid until the next write.
    buf_copy = bytes(buf)          # valid forever
    txn.delete('foo')              # this is a write!
    txn.put('foo2', 'bar2')        # this is also a write!

    print('foo: %r' % (buf,))      # ERROR! invalidated by write
    print('foo: %r' % (buf_copy,)) # OK

print('foo: %r' % (buf,))          # ERROR! also invalidated by txn end
print('foo: %r' % (buf_copy,))     # still OK

writemap mode

When Environment or open() is invoked with writemap=True, the library will use a writeable memory mapping to directly update storage. This improves performance at a cost to safety: it is possible (though fairly unlikely) for buggy C code in the Python process to accidentally overwrite the map, resulting in database corruption.

Caution

This option may cause filesystems that don’t support sparse files, such as OS X, to immediately preallocate map_size= bytes of underlying storage.

Transaction management

MDB_NOTLS mode is used exclusively, which allows read transactions to freely migrate across threads and for a single thread to maintain multiple read transactions. This enables mostly care-free use of read transactions, for example when using gevent.

Caution

While any reader exists, writers cannot reuse space in the database file that has become unused in later versions. Due to this, continual use of long-lived read transactions may cause the database to grow without bound. A lost reference to a read transaction will simply be aborted (and its reader slot freed) when the Transaction is eventually garbage collected. This should occur immediately on CPython, but may be deferred indefinitely on PyPy.

However the same is not true for write transactions: losing a reference to a write transaction can lead to deadlock, particularly on PyPy, since if the same process that lost the Transaction reference immediately starts another write transaction, it will deadlock on its own lock. Subsequently the lost transaction may never be garbage collected (since the process is now blocked on itself) and the database will become unusable.

These problems are easily avoided by always wrapping Transaction in a with statement somewhere on the stack:

# Even if this crashes, txn will be correctly finalized.
with env.begin() as txn:
    if txn.get('foo'):
        function_that_stashes_away_txn_ref(txn)
        function_that_leaks_txn_refs(txn)
        crash()

Interface

lmdb.open(path, **kwargs)

Shortcut for Environment constructor.

lmdb.version()

Return a tuple of integers (major, minor, patch) describing the LMDB library version that the binding is linked against. The version of the binding itself is available from lmdb.__version__.

Environment class

class lmdb.Environment(path, map_size=10485760, subdir=True, readonly=False, metasync=True, sync=True, map_async=False, mode=493, create=True, readahead=True, writemap=False, meminit=True, max_readers=126, max_dbs=0, max_spare_txns=1, max_spare_cursors=32, max_spare_iters=32)

Structure for a database environment. An environment may contain multiple databases, all residing in the same shared-memory map and underlying disk file.

To write to the environment a Transaction must be created. One simultaneous write transaction is allowed, however there is no limit on the number of read transactions even when a write transaction exists.

Equivalent to mdb_env_open()

path:
Location of directory (if subdir=True) or file prefix to store the database.
map_size:

Maximum size the database may grow to; used to size the memory mapping. If the database grows larger than map_size, an exception will be raised and the user must close and reopen the Environment. On 64-bit there is no penalty for making this huge (say 1TB). Must be <2GB on 32-bit.

Note

The default map size is set low to encourage a crash, so users can figure out a good value before learning about this option too late.

subdir:
If True, path refers to a subdirectory to store the data and lock files in, otherwise it refers to a filename prefix.
readonly:
If True, disallow any write operations. Note the lock file is still modified. If specified, the write flag to begin() or Transaction is ignored.
metasync:
If False, never explicitly flush metadata pages to disk. OS will flush at its discretion, or user can flush with sync().
sync:
If False, never explicitly flush data pages to disk. OS will flush at its discretion, or user can flush with sync(). This optimization means a system crash can corrupt the database or lose the last transactions if buffers are not yet flushed to disk.
mode:
File creation mode.
create:
If False, do not create the directory path if it is missing.
readahead:
If False, LMDB will disable the OS filesystem readahead mechanism, which may improve random read performance when a database is larger than RAM.
writemap:
If True LMDB will use a writeable memory map to update the database. This option is incompatible with nested transactions.
meminit:
If False LMDB will not zero-initialize buffers prior to writing them to disk. This improves performance but may cause old heap data to be written to disk in the unused portion of the buffer. Do not use this option if your application manipulates confidential data (e.g. plaintext passwords) in memory. This option is only meaningful when writemap=False; new pages are always zero-initialized when writemap=True.
map_async:
When writemap=True, use asynchronous flushes to disk. As with sync=False, a system crash can then corrupt the database or lose the last transactions. Calling sync() ensures on-disk database integrity until next commit.
max_readers:
Maximum number of simultaneous read transactions. Can only be set by the first process to open an environment, as it affects the size of the lock file and shared memory area. Attempts to simultaneously start more than this many read transactions will fail.
max_dbs:
Maximum number of databases available. If 0, assume environment will be used as a single database.
max_spare_txns:

Read-only transactions to cache after becoming unused. Caching transactions avoids two allocations, one lock and linear scan of the shared environment per invocation of begin(), Transaction, get(), gets(), or cursor(). Should match the process’s maximum expected concurrent transactions (e.g. thread count).

Note: ignored on CFFI.

max_spare_cursors:

Read-only cursors to cache after becoming unused. Caching cursors avoids two allocations per Cursor or cursor() or Transaction.cursor() invocation.

Note: ignored on CFFI.

max_spare_iters:

Iterators to cache after becoming unused. Caching iterators avoids one allocation per Cursor iter* method invocation.

Note: ignored on CFFI.

begin(db=None, parent=None, write=False, buffers=False)

Shortcut for lmdb.Transaction

close()

Close the environment, invalidating any open iterators, cursors, and transactions. Repeat calls to close() have no effect.

Equivalent to mdb_env_close()

copy(path)

Make a consistent copy of the environment in the given destination directory.

Equivalent to mdb_env_copy()

copyfd(fd)

Copy a consistent version of the environment to file descriptor fd.

Equivalent to mdb_env_copyfd()

flags()

Return a dict describing Environment constructor flags used to instantiate this environment.

info()

Return some nice environment information as a dict:

map_addr Address of database map in RAM.
map_size Size of database map in RAM.
last_pgno ID of last used page.
last_txnid ID of last committed transaction.
max_readers Maximum number of threads.
num_readers Number of threads in use.

Equivalent to mdb_env_info()

max_key_size()

Return the maximum size in bytes of a record’s key part. This matches the MDB_MAXKEYSIZE constant set at compile time.

max_readers()

Return the maximum number of readers specified during open of the environment by the first process. This is the same as max_readers= specified to the constructor if this process was the first to open the environment.

open_db(name=None, txn=None, reverse_key=False, dupsort=False, create=True)

Open a database, returning an opaque handle. Repeat Environment.open_db() calls for the same name will return the same handle. As a special case, the main database is always open.

Equivalent to mdb_dbi_open()

Named databases are implemented by storing a special descriptor in the main database. All databases in an environment share the same file. Because the descriptor is present in the main database, attempts to create a named database will fail if a key matching the database’s name already exists. Furthermore the key is visible to lookups and enumerations. If your main database keyspace conflicts with the names you use for named databases, then move the contents of your main database to another named database.

>>> env = lmdb.open('/tmp/test', max_dbs=2)
>>> with env.begin(write=True) as txn:
...     txn.put('somename', 'somedata')

>>> # Error: database cannot share name of existing key!
>>> subdb = env.open_db('somename')

A newly created database will not exist if the transaction that created it aborted, nor if another process deleted it. The handle resides in the shared environment, it is not owned by the current transaction or process. Only one thread should call this function; it is not mutex-protected in a read-only transaction.

Preexisting transactions, other than the current transaction and any parents, must not use the new handle, nor must their children.

name:
Database name. If None, indicates the main database should be returned, otherwise indicates a named database should be created inside the main database. In other words, a key representing the database will be visible in the main database, and the database name cannot conflict with any existing key.
txn:
Transaction used to create the database if it does not exist. If unspecified, a temporary write transaction is used. Do not call open_db() from inside an existing transaction without supplying it here. Note the passed transaction must have write=True.
reverse_key:
If True, keys are compared from right to left (e.g. DNS names).
dupsort:

Duplicate keys may be used in the database. (Or, from another perspective, keys may have multiple data items, stored in sorted order.) By default keys must be unique and may have only a single data item.

dupsort is not yet fully supported.

create:
If True, create the database if it doesn’t exist, otherwise raise an exception.

path()

Directory path or file name prefix where this environment is stored.

Equivalent to mdb_env_get_path()

reader_check()

Search the reader lock table for stale entries, for example due to a crashed process. Returns the number of stale entries that were cleared.

readers()

Return a multi-line Unicode string describing the current state of the reader lock table.

stat()

Return some nice environment statistics as a dict:

psize Size of a database page in bytes.
depth Height of the B-tree.
branch_pages Number of internal (non-leaf) pages.
leaf_pages Number of leaf pages.
overflow_pages Number of overflow pages.
entries Number of data items.

Equivalent to mdb_env_stat()

sync(force=False)

Flush the data buffers to disk.

Equivalent to mdb_env_sync()

Data is always written to disk when Transaction.commit() is called, but the operating system may keep it buffered. MDB always flushes the OS buffers upon commit as well, unless the environment was opened with sync=False or metasync=False.

force:
If True, force a synchronous flush. Otherwise if the environment was opened with sync=False the flushes will be omitted, and with map_async=True they will be asynchronous.

Transaction class

class lmdb.Transaction(env, db=None, parent=None, write=False, buffers=False)

A transaction object. All operations require a transaction handle; transactions may be read-only or read-write. Write transactions may not span threads. Transaction objects implement the context manager protocol, so that reliable release of the transaction happens even in the face of unhandled exceptions:

# Transaction aborts correctly:
with env.begin(write=True) as txn:
    crash()

# Transaction commits automatically:
with env.begin(write=True) as txn:
    txn.put('a', 'b')

Equivalent to mdb_txn_begin()

env:
Environment the transaction should be on.
db:
Default named database to operate on. If unspecified, defaults to the environment’s main database. Can be overridden on a per-call basis below.
parent:
None, or a parent transaction (see lmdb.h).
write:
Transactions are read-only by default. To modify the database, you must pass write=True. This flag is ignored if Environment was opened with readonly=True.
buffers:

If True, indicates buffer() objects should be yielded instead of bytestrings. This setting applies to the Transaction instance itself and any Cursors created within the transaction.

This feature significantly improves performance, since MDB has a zero-copy design, but it requires care when manipulating the returned buffer objects. The benefit of this facility is diminished when using small keys and values.

abort()

Abort the pending transaction. Repeat calls to abort() have no effect after a previously successful commit() or abort(), or after the associated Environment has been closed.

Equivalent to mdb_txn_abort()

commit()

Commit the pending transaction.

Equivalent to mdb_txn_commit()

cursor(db=None)

Shortcut for lmdb.Cursor(db, self)

delete(key, value='', db=None)

Delete a key from the database.

Equivalent to mdb_del()

key:
The key to delete.
value:
If the database was opened with dupsort=True and value is not the empty bytestring, then delete elements matching only this (key, value) pair, otherwise all values for key are deleted.

Returns True if at least one key was deleted.

drop(db, delete=True)

Delete all keys in a named database and optionally delete the named database itself. Deleting the named database causes it to become unavailable, and invalidates existing cursors.

Equivalent to mdb_drop()

get(key, default=None, db=None)

Fetch the first value matching key, returning default if key does not exist. A cursor must be used to fetch all values for a key in a dupsort=True database.

Equivalent to mdb_get()

pop(key, db=None)

Use a temporary cursor to invoke Cursor.pop().

db:
Named database to operate on. If unspecified, defaults to the database given to the Transaction constructor.

put(key, value, dupdata=True, overwrite=True, append=False, db=None)

Store a record, returning True if it was written, or False to indicate the key was already present and overwrite=False.

Equivalent to mdb_put()

key:
Bytestring key to store.
value:
Bytestring value to store.
dupdata:
If True and database was opened with dupsort=True, add pair as a duplicate if the given key already exists. Otherwise overwrite any existing matching key.
overwrite:
If False, do not overwrite any existing matching key.
append:
If True, append the pair to the end of the database without comparing its order first. Appending a key that is not greater than the highest existing key will cause corruption.
db:
Named database to operate on. If unspecified, defaults to the database given to the Transaction constructor.
replace(key, value, db=None)

Use a temporary cursor to invoke Cursor.replace().

db:
Named database to operate on. If unspecified, defaults to the database given to the Transaction constructor.

stat(db)

Return statistics like Environment.stat(), except for a single DBI. db must be a database handle returned by open_db().

Cursor class

class lmdb.Cursor(db, txn)

Structure for navigating a database.

Equivalent to mdb_cursor_open()

db:
Database to navigate.
txn:
Transaction to navigate.

As a convenience, Transaction.cursor() can be used to quickly return a cursor:

>>> env = lmdb.open('/tmp/foo')
>>> child_db = env.open_db('child_db')
>>> with env.begin() as txn:
...     cursor = txn.cursor()           # Cursor on main database.
...     cursor2 = txn.cursor(child_db)  # Cursor on child database.

Cursors start in an unpositioned state. If iternext() or iterprev() are used in this state, iteration proceeds from the start or end respectively. Iterators directly position using the cursor, meaning strange behavior results when multiple iterators exist on the same cursor.

Note

From the perspective of the Python binding, cursors return to an ‘unpositioned’ state once any scanning or seeking method (e.g. next(), prev_nodup(), set_range()) returns False or raises an exception. This is primarily to ensure safe, consistent semantics in the face of any error condition.

When the Cursor returns to an unpositioned state, its key() and value() return empty strings to indicate there is no active position, although internally the LMDB cursor may still have a valid position.

This may lead to slightly surprising behaviour when iterating the values for a dupsort=True database’s keys, since methods such as iternext_dup() will cause Cursor to appear unpositioned, despite it returning False only to indicate there are no more values for the current key. In that case, simply calling next() would cause iteration to resume at the next available key.

This behaviour may change in future.

Iterator methods such as iternext() and iterprev() accept keys and values arguments. If both are True, then the value of item() is yielded on each iteration. If only keys is True, key() is yielded, otherwise only value() is yielded.

Prior to iteration, a cursor can be positioned anywhere in the database:

>>> with env.begin() as txn:
...     cursor = txn.cursor()
...     if not cursor.set_range('5'): # Position at first key >= '5'.
...         print('Not found!')
...     else:
...         for key, value in cursor: # Iterate from first key >= '5'.
...             print((key, value))

Iteration is not required to navigate, and sometimes results in ugly or inefficient code. In cases where the iteration order is not obvious, or is related to the data being read, use of set_key(), set_range(), key(), value(), and item() may be preferable:

>>> # Record the path from a child to the root of a tree.
>>> path = ['child14123']
>>> while path[-1] != 'root':
...     assert cursor.set_key(path[-1]), \
...         'Tree is broken! Path: %s' % (path,)
...     path.append(cursor.value())

count()

Return the number of values (“duplicates”) for the current key.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_count()

delete(dupdata=False)

Delete the current element and move to the next, returning True on success or False if the database was empty.

If dupdata is True, delete all values (“duplicates”) for the current key, otherwise delete only the currently positioned value. Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_del()

first()

Move to the first key in the database, returning True on success or False if the database is empty.

If the database was opened with dupsort=True and the key contains duplicates, the cursor is positioned on the first value (“duplicate”).

Equivalent to mdb_cursor_get() with MDB_FIRST

first_dup()

Move to the first value (“duplicate”) for the current key, returning True on success or False if the database is empty.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_FIRST_DUP

get(key, default=None)

Equivalent to set_key(), except value() is returned when key is found, otherwise default.

item()

Return the current (key, value) pair.

iternext(keys=True, values=True)

Return a forward iterator that yields the current element before calling next(), repeating until the end of the database is reached. As a convenience, Cursor implements the iterator protocol by automatically returning a forward iterator when invoked:

>>> # Equivalent:
>>> it = iter(cursor)
>>> it = cursor.iternext(keys=True, values=True)

If the cursor is not yet positioned, it is moved to the first key in the database, otherwise iteration proceeds from the current position.

iternext_dup(keys=False, values=True)

Return a forward iterator that yields the current value (“duplicate”) of the current key before calling next_dup(), repeating until the last value of the current key is reached.

Only meaningful for databases opened with dupsort=True.

if not cursor.set_key("foo"):
    print("No values found for 'foo'")
else:
    for idx, data in enumerate(cursor.iternext_dup()):
        print("%d'th value for 'foo': %s" % (idx, data))

iternext_nodup(keys=True, values=False)

Return a forward iterator that yields the current value (“duplicate”) of the current key before calling next_nodup(), repeating until the end of the database is reached.

Only meaningful for databases opened with dupsort=True.

If the cursor is not yet positioned, it is moved to the first key in the database, otherwise iteration proceeds from the current position.

for key in cursor.iternext_nodup():
    print("Key '%s' has %d values" % (key, cursor.count()))

iterprev(keys=True, values=True)

Return a reverse iterator that yields the current element before calling prev(), until the start of the database is reached.

If the cursor is not yet positioned, it is moved to the last key in the database, otherwise iteration proceeds from the current position.

>>> with env.begin() as txn:
...     for i, (key, value) in enumerate(txn.cursor().iterprev()):
...     print('%dth last item is (%r, %r)' % (1+i, key, value))

iterprev_dup(keys=False, values=True)

Return a reverse iterator that yields the current value (“duplicate”) of the current key before calling prev_dup(), repeating until the first value of the current key is reached.

Only meaningful for databases opened with dupsort=True.

iterprev_nodup(keys=True, values=False)

Return a reverse iterator that yields the current value (“duplicate”) of the current key before calling prev_nodup(), repeating until the start of the database is reached.

If the cursor is not yet positioned, it is moved to the last key in the database, otherwise iteration proceeds from the current position.

Only meaningful for databases opened with dupsort=True.

key()

Return the current key.

last()

Move to the last key in the database, returning True on success or False if the database is empty.

If the database was opened with dupsort=True and the key contains duplicates, the cursor is positioned on the last value (“duplicate”).

Equivalent to mdb_cursor_get() with MDB_LAST

last_dup()

Move to the last value (“duplicate”) for the current key, returning True on success or False if the database is empty.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_LAST_DUP

next()

Move to the next element, returning True on success or False if there is no next element.

For databases opened with dupsort=True, moves to the next value (“duplicate”) for the current key if one exists, otherwise moves to the first value of the next key.

Equivalent to mdb_cursor_get() with MDB_NEXT

next_dup()

Move to the next value (“duplicate”) of the current key, returning True on success or False if there is no next value.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_NEXT_DUP

next_nodup()

Move to the first value (“duplicate”) of the next key, returning True on success or False if there is no next key.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_NEXT_NODUP

pop(key)

Fetch a record’s value then delete it. Returns None if no previous value existed. This uses the best available mechanism to minimize the cost of a delete-and-return-previous operation.

For databases opened with dupsort=True, the first data element (“duplicate”) for the key will be popped.

key:
Bytestring key to delete.
prev()

Move to the previous element, returning True on success or False if there is no previous item.

For databases opened with dupsort=True, moves to the previous data item (“duplicate”) for the current key if one exists, otherwise moves to the previous key.

Equivalent to mdb_cursor_get() with MDB_PREV

prev_dup()

Move to the previous value (“duplicate”) of the current key, returning True on success or False if there is no previous value.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_PREV_DUP

prev_nodup()

Move to the last value (“duplicate”) of the previous key, returning True on success or False if there is no previous key.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_PREV_NODUP

put(key, val, dupdata=True, overwrite=True, append=False)

Store a record, returning True if it was written, or False to indicate the key was already present and overwrite=False. On success, the cursor is positioned on the key.

Equivalent to mdb_cursor_put()

key:
Bytestring key to store.
val:
Bytestring value to store.
dupdata:
If True and database was opened with dupsort=True, add pair as a duplicate if the given key already exists. Otherwise overwrite any existing matching key.
overwrite:
If False, do not overwrite the value for the key if it exists, just return False. For databases opened with dupsort=True, False will always be returned if a duplicate key/value pair is inserted, regardless of the setting for overwrite.
append:
If True, append the pair to the end of the database without comparing its order first. Appending a key that is not greater than the highest existing key will cause corruption.
replace(key, val)

Store a record, returning its previous value if one existed. Returns None if no previous value existed. This uses the best available mechanism to minimize the cost of a set-and-return-previous operation.

For databases opened with dupsort=True, only the first data element (“duplicate”) is returned if one existed; all data elements are removed and the new (key, data) pair is inserted.

key:
Bytestring key to store.
val:
Bytestring value to store.
set_key(key)

Seek exactly to key, returning True on success or False if the exact key was not found. It is an error to set_key() the empty bytestring.

For databases opened with dupsort=True, moves to the first value (“duplicate”) for the key.

Equivalent to mdb_cursor_get() with MDB_SET_KEY

set_key_dup(key, value)

Seek exactly to (key, value), returning True on success or False if the exact key and value were not found. It is an error to set_key_dup() the empty bytestring.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_GET_BOTH

set_range(key)

Seek to the first key greater than or equal to key, returning True on success, or False to indicate key was past end of database. Behaves like first() if key is the empty bytestring.

For databases opened with dupsort=True, moves to the first value (“duplicate”) for the key.

Equivalent to mdb_cursor_get() with MDB_SET_RANGE

set_range_dup(key, value)

Seek to the first key/value pair greater than or equal to key, returning True on success, or False to indicate (key, value) was past end of database.

Only meaningful for databases opened with dupsort=True.

Equivalent to mdb_cursor_get() with MDB_GET_BOTH_RANGE

value()

Return the current value.

Exceptions

class lmdb.Error

Raised when an LMDB-related error occurs, and no more specific lmdb.Error subclass exists.

class lmdb.KeyExistsError

Key/data pair already exists.

class lmdb.NotFoundError

No matching key/data pair found.

class lmdb.PageNotFoundError

Request page not found.

class lmdb.CorruptedError

Located page was of the wrong type.

class lmdb.PanicError

Update of meta page failed.

class lmdb.VersionMismatchError

Database environment version mismatch.

class lmdb.InvalidError

File is not an MDB file.

class lmdb.MapFullError

Environment map_size= limit reached.

class lmdb.DbsFullError

Environment max_dbs= limit reached.

class lmdb.ReadersFullError

Environment max_readers= limit reached.

class lmdb.TlsFullError

Thread-local storage keys full - too many environments open.

class lmdb.TxnFullError

Transaction has too many dirty pages - transaction too big.

class lmdb.CursorFullError

Internal error - cursor stack limit reached.

class lmdb.PageFullError

Internal error - page has no more space.

class lmdb.MapResizedError

Database contents grew beyond environment map_size=.

class lmdb.IncompatibleError

Operation and DB incompatible, or DB flags changed.

class lmdb.BadRslotError

Invalid reuse of reader locktable slot.

class lmdb.BadTxnError

Transaction cannot recover - it must be aborted.

class lmdb.BadValsizeError

Too big key/data, key is empty, or wrong DUPFIXED size.

class lmdb.ReadonlyError

An attempt was made to modify a read-only database.

class lmdb.InvalidParameterError

An invalid parameter was specified.

class lmdb.LockError

The environment was locked by another process.

class lmdb.MemoryError

Out of memory.

class lmdb.DiskError

No more disk space.

Threading control

lmdb.enable_drop_gil()

Arrange for the global interpreter lock to be released during database IO. This flag is ignored and always assumed to be True on CFFI. Note this can only be set once per process.

Continually dropping and reacquiring the GIL may incur unnecessary overhead in single-threaded programs. Since Python intra-process concurrency is already limited, and LMDB supports inter-process access, programs using LMDB will achieve better throughput by forking rather than using threads.

Caution: this function should be invoked before any threads are created.

Command line tools

A rudimentary interface to most of the binding’s functionality is provided. These commands are useful for e.g. backup jobs.

$ python -mlmdb --help
Usage: python -mlmdb [options] <command>

Basic tools for working with LMDB.

    copy: Consistent high-speed backup of an environment.
        python -mlmdb copy -e source.lmdb target.lmdb

    copyfd: Consistent high-speed backup of an environment to stdout.
        python -mlmdb copyfd -e source.lmdb > target.lmdb/data.mdb

    drop: Delete one or more named databases.
        python -mlmdb drop db1

    dump: Dump one or more databases to disk in 'cdbmake' format.
        Usage: dump [db1=file1.cdbmake db2=file2.cdbmake]

        If no databases are given, dumps the main database to 'main.cdbmake'.

    edit: Add/delete/replace values from a database.
        python -mlmdb edit --set key=value --set-file key=/path \
                   --add key=value --add-file key=/path/to/file \
                   --delete key

    get: Read one or more values from a database.
        python -mlmdb get [<key1> [<keyN> [..]]]

    readers: Display readers in the lock table
        python -mlmdb readers -e /path/to/db [-c]

        If -c is specified, clear stale readers.

    restore: Read one or more databases from disk in 'cdbmake' format.
        python -mlmdb restore db1=file1.cdbmake db2=file2.cdbmake

        The special db name ":main:" may be used to indicate the main DB.

    rewrite: Re-create an environment using MDB_APPEND
        python -mlmdb rewrite -e src.lmdb -E dst.lmdb [<db1> [<dbN> ..]]

        If no databases are given, rewrites only the main database.

    shell: Open interactive console with ENV set to the open environment.

    stat: Print environment statistics.

    warm: Read environment into page cache sequentially.

    watch: Show live environment statistics

Options:
  -h, --help            show this help message and exit
  -e ENV, --env=ENV     Environment file to open
  -d DB, --db=DB        Database to open (default: main)
  -r READ, --read=READ  Open environment read-only
  -S MAP_SIZE, --map_size=MAP_SIZE
                        Map size in megabytes (default: 10)
  -a, --all             Make "dump" dump all databases
  -T TXN_SIZE, --txn_size=TXN_SIZE
                        Writes per transaction (default: 1000)
  -E TARGET_ENV, --target_env=TARGET_ENV
                        Target environment file for "rewrite"
  -x, --xxd             Print values in xxd format
  -M MAX_DBS, --max-dbs=MAX_DBS
                        Maximum open DBs (default: 128)
  --out-fd=OUT_FD       "copyfd" command target fd

  Options for "edit" command:
    --set=SET           List of key=value pairs to set.
    --set-file=SET_FILE
                        List of key pairs to read from files.
    --add=ADD           List of key=value pairs to add.
    --add-file=ADD_FILE
                        List of key pairs to read from files.
    --delete=DELETE     List of key=value pairs to delete.

  Options for "readers" command:
    -c, --clean         Clean stale readers? (default: no)

  Options for "watch" command:
    --csv               Generate CSV instead of terminal output.
    --interval=INTERVAL Interval size (default: 1sec)
    --window=WINDOW     Average window size (default: 10)

Implementation Notes

Iterators

It was tempting to make Cursor directly act as an iterator, however that would require overloading its next() method to mean something other than the natural definition of next() on an LMDB cursor. It would additionally introduce unintuitive state tied to the cursor that does not exist in LMDB, such as the iteration direction and the type of value yielded.

Instead a separate iterator is produced by __iter__(), iternext(), and iterprev(), with easily described semantics regarding how they interact with the cursor.

Memsink Protocol

If the memsink package is available during installation of the CPython extension, then the resulting module’s Transaction object will act as a source for the Memsink Protocol. This is an experimental protocol to allow extension of LMDB’s zero-copy design outward to other C types, without requiring explicit management by the user.

This design is a work in progress; if you have an application that would benefit from it, please leave a comment on the ticket above.

Deviations from LMDB API

mdb_dbi_close():
This is not exposed since its use is perilous at best. Users must ensure all activity on the DBI has ceased in all threads before closing the handle. Failure to do this could result in “impossible” errors, or the DBI slot becoming reused, resulting in operations being serviced by the wrong named database. Leaving handles open wastes a tiny amount of memory, which seems a good price to avoid subtle data corruption.
Cursor.replace(), Cursor.pop():
There are no native equivalents to these calls; they simply implement common operations in C, avoiding the chunk of error-prone boilerplate Python that would otherwise be needed to do the same.

Technology

The binding is implemented twice: once using CFFI, and once as a native C extension. This is because a CFFI binding is necessary for PyPy, but its performance on CPython is very poor. For good performance on CPython, only Cython and a native extension are viable options. Initially Cython was used, however it was abandoned due to the effort and relative mismatch involved compared to writing a native extension.

Cython offers no lightweight ability to track object dependencies, and so the best method to ensure crash-safety is managing a dict/list of weakrefs. While it is technically possible to maintain inline lists with Cython, the result is unnatural and much of the original benefit of Cython is lost.

Another problem with Cython is that good performance requires static typing, and frequent visits to the autogenerated C files to figure out a performance problem. In some places Cython makes it difficult to avoid conversions that produce new objects, even though CPython provides macros for conversion-free access. The choice is paying a needless performance cost, or intermixing Cython code with chunks of C/Python API calls.

Invalidation lists

Much effort has gone into avoiding crashes: when some object is invalidated (e.g. due to Transaction.abort()), child objects are updated to ensure they don’t access memory of the no-longer-existent resource, and that they correspondingly free their own resources. On CPython this is accomplished by weaving a linked list into all PyObject structures. This avoids the need to maintain a separate heap-allocated structure, or produce excess weakref objects (which internally simply manage their own lists).

With CFFI this isn’t possible. Instead each object has a _deps dict that maps dependent object IDs to corresponding weakrefs. Prior to invalidation _deps is walked to notify each dependent that the resource is about to disappear.

Finally, each object may either store an explicit _invalid attribute and check it prior to every operation, or rely on another mechanism to avoid the crash resulting from using an invalidated resource. Instead of performing these explicit tests continuously, on CFFI a magic Some_LMDB_Resource_That_Was_Deleted_Or_Closed object is used. During invalidation, all native handles are replaced with an instance of this object. Since CFFI cannot convert the magical object to a C type, any attempt to make a native call will raise TypeError with a nice descriptive type name indicating the problem. Hacky but efficient, and mission accomplished.

Argument parsing

The CPython module's parse_args() may look “special”, at best. The alternative, PyArg_ParseTupleAndKeywords, performs continuous heap allocations and string copies, resulting in a slowdown of 10,000 lookups/sec in a particular microbenchmark. The 10k/sec slowdown could potentially disappear given a sufficiently large application, so this decision needs to be revisited at some stage.

ChangeLog

2014-03-17 v0.79

* CPython Cursor.delete() lacked dupdata argument, fixed.

* Fixed minor bug where CFFI _get_cursor() did not note that its idea of
  the current key and value was up to date.

* Cursor.replace() and Cursor.pop() updated for MDB_DUPSORT databases. For
  pop(), the first data item is popped and returned. For replace(), the first
  data item is returned, and all duplicates for the key are replaced.

* Implement remaining Cursor methods necessary for working with MDB_DUPSORT
  databases: next_dup(), next_nodup(), prev_dup(), prev_nodup(), first_dup(),
  last_dup(), set_key_dup(), set_range_dup(), iternext_dup(),
  iternext_nodup(), iterprev_dup(), iterprev_nodup().

* The default for Transaction.put(dupdata=...) and Cursor.put(dupdata=...) has
  changed from False to True. The previous default did not reflect LMDB's
  normal mode of operation.

* LMDB 0.9.11 is bundled along with extra fixes from upstream Git.


2014-01-18 v0.78

* Patch from bra-fsn to fix LMDB_LIBDIR.

* Various improvements to inaccurate documentation.

* Initial work towards Windows/Microsoft Visual C++ 9.0 build.

* LMDB 0.9.11 is now bundled.

* To work around install failures, the minimum CFFI version is now >=0.8.0.

* ticket #38: remove all buffer object hacks. This causes a ~50% slowdown
  for cursor enumeration, but results in far simpler object lifetimes. A
  future version may introduce a better mechanism for achieving the same
  performance without loss of sanity.


2013-11-30 v0.77

* Added Environment.max_key_size(), Environment.max_readers().

* CFFI now raises the correct Error subclass associated with an MDB_* return
  code.

* Numerous CFFI vs. CPython behavioural inconsistencies have been fixed.

* An endless variety of Unicode related 2.x/3.x/CPython/CFFI fixes were made.

* LMDB 0.9.10 is now bundled, along with some extra fixes from Git.

* Added Environment(meminit=...) option.


2013-10-28 v0.76

* Added support for Environment(..., readahead=False).

* LMDB 0.9.9 is now bundled.

* Many Python 2.5 and 3.x fixes were made. Future changes are automatically
  tested via Travis CI <https://travis-ci.org/dw/py-lmdb>.

* When multiple cursors exist, and one cursor performs a mutation,
  remaining cursors may have returned corrupt results via key(), value(),
  or item(). Mutations are now explicitly tracked and cause the cursor's
  data to be refreshed in this case.

* setup.py was adjusted to ensure the distutils default of '-DNDEBUG' is never
  defined while building LMDB. This caused many important checks in the engine
  to be disabled.

* The old 'transactionless' API was removed. A future version may support the
  same API, but the implementation will be different.

* Transaction.pop() and Cursor.pop() helpers added, to complement
  Transaction.replace() and Cursor.replace().