You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The TreeReader trait defines the method get_value_opt(max_version: Version, key_hash: KeyHash) -> Result<Option<OwnedValue>>. This method supports the get and get_with_proof methods of the JMT.
To satisfy this interface, an underlying RocksDB implementation has several options:
Map (Version, KeyHash) to Value
The simplest option is to store a mapping from (Version, KeyHash) to Value. This approach allows efficient writes with low compaction overhead, since items are naturally sorted by version. However, it has very poor read efficiency. To satisfy a get_value_opt query, the DB has to iterate backwards over all versions until the key_hash is found. To avoid iteration on get queries, the DB implementation needs to store a second table mapping KeyHashes to Versions. Since this table will be written in random order, it will have high compaction.
In addition, a DB with this setup can't support queries on (unhashed) keys without an adding an extra table to track which keys are present. And even with this additional table, range queries on raw keys are extremely inefficient - since the hash for each key is random.
Map (Key, Version) to Value
A second potential database layout is to store a mapping from (Key, Version) to Value. This schema has the drawback that writes happen in un-sorted order, which adds significant compaction overhead. But, it supports very efficient range queries on keys, and it allows easy retrieval of the history for a given key. So, for read-heavy workloads, this layout is much more efficient.
However, the current TreeReader interface requires apps which pick this layout to add an additional table mapping KeyHashes back to raw Keys.
Potential Change
We should consider changing the TreeReader (and the get, get_with_proof methods) to work on Keys rather than KeyHashes. This would enable implementers to pick either database layout with no extra overhead.
The text was updated successfully, but these errors were encountered:
Background
The
TreeReader
trait defines the methodget_value_opt(max_version: Version, key_hash: KeyHash) -> Result<Option<OwnedValue>>
. This method supports theget
andget_with_proof
methods of the JMT.To satisfy this interface, an underlying RocksDB implementation has several options:
Map
(Version, KeyHash)
toValue
The simplest option is to store a mapping from
(Version, KeyHash)
toValue
. This approach allows efficient writes with low compaction overhead, since items are naturally sorted by version. However, it has very poor read efficiency. To satisfy aget_value_opt
query, the DB has to iterate backwards over all versions until thekey_hash
is found. To avoid iteration onget
queries, the DB implementation needs to store a second table mappingKeyHash
es toVersion
s. Since this table will be written in random order, it will have high compaction.In addition, a DB with this setup can't support queries on (unhashed) keys without an adding an extra table to track which keys are present. And even with this additional table, range queries on raw keys are extremely inefficient - since the hash for each key is random.
Map
(Key, Version)
toValue
A second potential database layout is to store a mapping from
(Key, Version)
toValue
. This schema has the drawback that writes happen in un-sorted order, which adds significant compaction overhead. But, it supports very efficient range queries on keys, and it allows easy retrieval of the history for a given key. So, for read-heavy workloads, this layout is much more efficient.However, the current
TreeReader
interface requires apps which pick this layout to add an additional table mappingKeyHash
es back to rawKey
s.Potential Change
We should consider changing the
TreeReader
(and theget
,get_with_proof
methods) to work onKey
s rather thanKeyHash
es. This would enable implementers to pick either database layout with no extra overhead.The text was updated successfully, but these errors were encountered: