Distributed Hash Tables

The Distributed Hash Table (DHT) is the central asynchronous data communication service provided by Veilid. As the network's distributed, mutable data store, it is the most basic discoverability layer for communication between nodes.

Some existing uses for the DHT are:

  • Text message communication (in VeilidChat and in the python-demo)
  • Communicating private route blobs between nodes you have already communicated with

It is just a search algorithm

Yes, the heading has it right: the DHT is literally just a search algorithm. To explain how it works, though, we're going to step back to a rough overview of how non-distributed hash tables work.

Simple hash tables

A hash table is a key-value store. What this means is that you know how you want to refer to a piece of information, but you don't know what that information is going to be, so you create a lookup key (akin to a variable name within the storage facility) that is used to find the location where the value is stored.

This is kind of like a dictionary, except that it's a lot more efficient to find where something is. (In the worst case, a plain dictionary requires a number of lookup-key comparisons equal to the number of entries within it to return the value, or to report that the value isn't there.) The idea is that the lookup key gets "hashed" (transformed in a one-way process) into a number, and that number is taken modulo the size of a memory area where pointers are stored (called the "hash table"). The pointer to the value is then stored at that slot in the hash table.
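To make that concrete, here is a minimal sketch of the idea in Python (illustrative only, not Veilid code): the key is hashed, the hash is reduced modulo the table size, and the value is stored at the resulting slot.

```python
TABLE_SIZE = 8
table = [None] * TABLE_SIZE   # fixed-size array of slots; no collision handling yet

def slot_for(key: str) -> int:
    # hash() stands in for any one-way hash function
    return hash(key) % TABLE_SIZE

def put(key: str, value) -> None:
    table[slot_for(key)] = value

def get(key: str):
    return table[slot_for(key)]

put("alice", "some value")
assert get("alice") == "some value"
```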

So, it can be said to be a search algorithm: Searching for the correct data out of a sea of pointers, based on its name.

There are a couple of caveats to that description, though. First, the size of the hash table needs to be chosen well for the most efficient use of resources. If the table is large but you only put a few values in it, all of the unused memory in the table is wasted. If the table is small and you put a lot of values in it, you end up with "hash collisions" (multiple keys hashing to the same slot modulo the table size), and multiple values whose pointers need to live at the same hash table location; that requires a second search, perhaps into a linked list or another hash table, to finally identify the correct pointer. Second, because the lookup key is lost in the hashing process, the hash table can't simply be resized.

These caveats can be addressed, though, by storing the name of the key alongside the value at the pointer the hash table stores, and then either creating a new, smaller hash table for each colliding slot, keeping a linked list of key-value pairs rooted at each colliding slot, or resizing the hash table entirely (allocating a completely new memory area of a different size, reinserting all of the entries from the old hash table, and then deallocating the old entries and the old table). Yes, these resizing operations, secondary hash table lookups, and linked list traversals are less efficient than a one-to-one correspondence, but they still make the lookup much more efficient than potentially scanning the entire list of lookup keys.
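As a sketch of those fixes (again illustrative, not Veilid's implementation), the table below keeps the key alongside the value in a per-slot chain, and resizes by reinserting every entry into a new table:

```python
class ChainedHashTable:
    def __init__(self, size: int = 4):
        self.slots = [[] for _ in range(size)]   # each slot is a chain of (key, value) pairs

    def _slot(self, key) -> int:
        return hash(key) % len(self.slots)

    def put(self, key, value) -> None:
        bucket = self.slots[self._slot(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite its value
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # new key (or collision): extend the chain

    def get(self, key):
        for k, v in self.slots[self._slot(key)]:
            if k == key:
                return v
        return None

    def resize(self, new_size: int) -> None:
        old_slots = self.slots
        self.slots = [[] for _ in range(new_size)]
        for bucket in old_slots:         # reinsert every entry into the new table
            for k, v in bucket:
                self.put(k, v)
```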

Distributing the hash table and the XOR metric

With a distributed hash table, there are many individual locations (nodes) where the value can be stored. Fortunately, in Veilid, every individual node has its own public cryptographic key (called its "identity"), and all of these identities are the same length.

Now, unlike a simple hash table, we actually want hash collisions (multiple entries going to the same machine), because every node can store multiple DHT entries. We also want to maximize the availability of a DHT record, even when one or more of the nodes holding that entry disconnect from the network, whether briefly or forever.

So, to find the best places to store the value, we compare the lookup key to the node identities, because they are the same length. Then, we find nodes whose identities are "near" the lookup key, and send the DHT request (create, update, retrieve) to them.

What defines "near"? It's the number of matching bits between the hashed lookup key and the node identity, starting the comparison from the left. When we XOR the lookup key with each node identity, the result with the longest run of leading zero bits marks the nearest node. (Or, to put it another way for people used to big-endian C-style numeric representation: if the outputs of the XORs are interpreted as numbers, the smallest result is the nearest.)
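Here is a small sketch of that metric, treating keys and identities as same-length byte strings (which is all it needs):

```python
def xor_distance(key: bytes, node_id: bytes) -> int:
    # XOR the two byte strings and read the result as a big-endian integer;
    # more shared leading bits means more leading zero bits, i.e. a smaller number
    assert len(key) == len(node_id)
    return int.from_bytes(bytes(a ^ b for a, b in zip(key, node_id)), "big")

# Toy example with 4-byte identities (real keys are much longer):
key    = bytes.fromhex("a1b2c3d4")
node_a = bytes.fromhex("a1b2c300")   # shares many leading bits with key
node_b = bytes.fromhex("01b2c3d4")   # differs in the very first bits

assert xor_distance(key, node_a) < xor_distance(key, node_b)   # node_a is "nearer"
```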

To help maintain availability if the nearest node to the value goes down or otherwise disconnects from the network, the value is also sent to the 4 nearest neighbors of that lookup key (2 above, 2 below). The DHT implementation uses the routing table (since it uses the same "nearness" metric) to route any request for the value to the node nearest the lookup key hash.
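Conceptually, picking the storage targets for a record can be thought of as sorting candidate node identities by their distance to the lookup key and keeping the closest few; the sketch below shows that idea only, and the real selection and replication logic in Veilid is more involved.

```python
def xor_distance(key: bytes, node_id: bytes) -> int:
    # same metric as in the previous sketch
    return int.from_bytes(bytes(a ^ b for a, b in zip(key, node_id)), "big")

def closest_nodes(lookup_key: bytes, node_ids: list[bytes], count: int = 5) -> list[bytes]:
    # sort candidates by XOR distance to the lookup key and keep the nearest `count`
    return sorted(node_ids, key=lambda nid: xor_distance(lookup_key, nid))[:count]
```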

The Veilid DHT Schema

In Veilid, DHT entries are composed of:

  1. a lookup key, which is the hash of the public key of the writer and the digest of the schema in #2
  2. a value, which is limited to 1MiB in size, and which is composed of:
    • an owner-defined schema, used by the node where the value is stored to validate writes and updates
    • a list of subkey values, with ordinal numbers as their keys (0, 1, 2, etc.)

All DHT updates are signed with a keypair generated by the app that creates them.
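As a purely conceptual sketch of those pieces (not Veilid's actual data structures; blake2b stands in for whatever hash the network really uses, and signing of updates is omitted):

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class DHTRecordSketch:
    writer_public_key: bytes
    schema: bytes                                        # owner-defined schema (immutable)
    subkeys: list[bytes] = field(default_factory=list)   # subkey values 0, 1, 2, ...

    def lookup_key(self) -> bytes:
        # the lookup key is a hash of the public key plus the digest of the schema
        schema_digest = hashlib.blake2b(self.schema, digest_size=32).digest()
        return hashlib.blake2b(self.writer_public_key + schema_digest,
                               digest_size=32).digest()
```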

Rules for schemas

The most important rule for value schemas in Veilid is this: once a schema is created, it is immutable and cannot ever be changed. This is because a digest of the schema is part of the lookup key of the entry it applies to; if the schema were to change, the lookup key for the value would change too.

The second most important rule: A Veilid DHT value, including the schema, cannot be larger than 1 mebibyte (1MiB).

Implemented value schema types

There are two types of schema implemented as of v0.2.6, called "DFLT" (default) and "SMPL" (simple). Both allow the creator ("owner") of the DHT entry to specify up to one writer for each subkey, though each writer may write to more than one subkey.

  • DFLT or "default": Only updates to any of the allowable subkeys that are signed by the creator ("owner") identity can successfully modify the value.
    • A DFLT schema has an owner key, and a number of allowable, allocated subkeys (called the "o_cnt", or "count of subkeys available to be written by the owner").
  • SMPL or "simple": Updates signed by the creator ("owner") or by specifically-identified delegates ("writers"), and directed to their specifically-allocated subkeys, can successfully modify the value.
    • The owner key may have 0 subkeys allocated to it, in which case only the writers will be able to write to their specific subkey ranges.
    • Each writer has a count of subkeys it exclusively can write to.
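A conceptual sketch of the two schema types above, using plain dataclasses rather than Veilid's actual API types (field names other than o_cnt, which appears in the description above, are illustrative):

```python
from dataclasses import dataclass

@dataclass
class DFLTSchema:
    o_cnt: int                  # number of subkeys writable by the owner

@dataclass
class SMPLWriter:
    writer_key: bytes           # a delegate ("writer") public key
    subkey_count: int           # number of subkeys this writer may exclusively write

@dataclass
class SMPLSchema:
    o_cnt: int                  # may be 0, in which case only the writers can write
    writers: list[SMPLWriter]
```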

If you find that you need a schema with different behavior, please see the Internals Book section on DHT schema behavior requirements, consider how the behavior you need can be met while still fulfilling those requirements, and file a ticket at the Veilid issue tracker.