
bitdb

Bitcoin as Decentralized NoSQL Database



Introduction

Bitdb is a NoSQL database powered by Bitcoin.

It's an autonomous public MongoDB instance which self-updates by crawling and indexing Bitcoin OP_RETURN transactions, a special type of transaction that can include up to 220 bytes of arbitrary data.

Using these 220 bytes as a database insertion command, we can construct an entire database system from Bitcoin.

Because it's backed by Bitcoin as canonical storage, it comes with all the benefits of Bitcoin's decentralization.

Because it's powered by MongoDB as its index, it comes with all the benefits of NoSQL databases, including unlimited insertion of unstructured data and a highly flexible and portable query interface.

Rules

  1. You can't write to Bitdb directly.
  2. Bitdb writes itself. Bitdb is an autonomous daemon that constantly synchronizes itself with Bitcoin.
  3. The only way to write to Bitdb is by making a Bitcoin OP_RETURN transaction.

How is it different?


1. Before Bitdb

Most attempts at building a global decentralized database have looked like this:

  1. take an existing legacy database system that supports replication
  2. implement the replication feature
  3. sprinkle some magical distributed consensus and governance rules to tackle the decentralized replication problem.

The big challenge with this approach is that the network needs to get not only the tech right but also the incentive structure.

This is not a trivial problem. There's a reason why we don't have a mainstream decentralized database with such architecture.

2. Bitdb is Bitcoin.

Bitdb takes a different approach. Instead of trying to implement decentralization on top of a legacy database, Bitdb utilizes Bitcoin as the single source of truth. In this approach, the "database" is merely an index built from the canonical Bitcoin blockchain.

Because Bitcoin has already solved the decentralized consensus problem through Proof of Work, Bitdb works right out of the box.

Bitdb does not invent a new experimental consensus algorithm, nor does it try to bootstrap a new network from scratch starting from zero liquidity, both of which are huge risk factors for such initiatives.

Bitdb is Bitcoin. And Bitcoin works. As long as Bitcoin exists, Bitdb will scale along with Bitcoin.

Features

  • Simple Usage: There is no complicated setup process. You can get started using Bitdb in seconds. Querying Bitdb is as easy as a single MongoDB query (or a JavaScript function call), or a single REST API call.
  • Censorship resistance: Because the canonical data is stored entirely on the Bitcoin blockchain and Bitdb is merely a user-facing index, no one can censor what gets stored on Bitdb.
  • Trust, but verify: Bitdb works because it takes a "trust, but verify" approach. Each document contains the hash of the Bitcoin transaction that caused the insertion, so anyone can verify the document by checking it against any Bitcoin node.
  • Powerful Queries: You can use Bitdb to store anything in an unstructured manner and query it later any way you want. Since it utilizes a document database, you can apply all kinds of advanced queries such as regular expression matching, aggregation, projection, etc.
  • Create your own protocol: You no longer need to build out your own infrastructure to create your own OP_RETURN protocol. Bitdb indexes everything.
  • Designed to last forever: Bitdb uses a document database (MongoDB) rather than a relational database, which would require a rigid schema. The schema-less approach means you can insert data into the database without having to worry about how it will be used 1000 years from now.
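
The "trust, but verify" check can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the raw OP_RETURN output script is assumed to have been fetched from a Bitcoin node already (using the document's transaction hash; that step is not shown), and only direct pushes plus OP_PUSHDATA1 and OP_PUSHDATA2 are handled.

```python
import base64


def parse_op_return_pushes(script_hex: str) -> list:
    """Split an OP_RETURN output script into its push data items."""
    script = bytes.fromhex(script_hex)
    if not script or script[0] != 0x6A:  # 0x6a is the OP_RETURN opcode
        raise ValueError("not an OP_RETURN script")
    pushes, i = [], 1
    while i < len(script):
        opcode = script[i]
        i += 1
        if 1 <= opcode <= 75:      # direct push: the opcode is the length
            size = opcode
        elif opcode == 0x4C:       # OP_PUSHDATA1: 1-byte length follows
            size = script[i]
            i += 1
        elif opcode == 0x4D:       # OP_PUSHDATA2: 2-byte little-endian length
            size = int.from_bytes(script[i:i + 2], "little")
            i += 2
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02x}")
        if i + size > len(script):
            raise ValueError("push data runs past the end of the script")
        pushes.append(script[i:i + size])
        i += size
    return pushes


def document_matches_script(doc: dict, script_hex: str) -> bool:
    """Compare a document's b1, b2, ... attributes with the on-chain push data."""
    pushes = parse_op_return_pushes(script_hex)
    expected = {f"b{n}": base64.b64encode(p).decode("ascii")
                for n, p in enumerate(pushes, start=1)}
    # Keep only keys like b1, b2 (this skips block_index, block_hash, etc.).
    actual = {k: v for k, v in doc.items()
              if k.startswith("b") and k[1:].isdigit()}
    return actual == expected
```

A document passes the check only when every "b" prefixed attribute matches the push data recovered from the script, so an index that altered or invented a document would be detected.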

How it works


1. How to write to Bitdb

Writing to Bitdb is done by making Bitcoin OP_RETURN transactions.

  1. Each OP_RETURN transaction represents a document.
  2. Each push data in a transaction represents an attribute in the document.
  3. The attributes follow a naming convention: a "b" or "s" prefix followed by the index of the corresponding push data (for example s1 or b2).
  4. "s" prefixed attributes are the utf8 encoded representation of the push data.
  5. "b" prefixed attributes are the base64 encoded representation of the push data.

For example, when you make two OP_RETURN transactions from your wallet that look like this:

Bitdb populates itself with following entries:

First document:

  • s1: m (utf8 encoding for 0x6d02)
  • s2: Hello
  • b1: bQI= (base64 encoding for 0x6d02)
  • b2: aGVsbG8= (base64 encoding for 'Hello')

Second document:

  • s1: m (utf8 encoding for 0x6d03)
  • s2: fe32a4bc5a52ce9b861725462ad7d5d223d3554532eb172c7d29feca5722d44c
  • s3: This is a reply
  • b1: bQM= (base64 encoding for 0x6d03)
  • b2: ZmUzMmE0YmM1YTUyY2U5Yjg2MTcyNTQ2MmFkN2Q1ZDIyM2QzNTU0NTMyZWIxNzJjN2QyOWZlY2E1NzIyZDQ0Yw== (base64 encoding for "fe32a4bc5a52ce9b861725462ad7d5d223d3554532eb172c7d29feca5722d44c")
  • b3: InRoaXMgaXMgYSByZXBseSI= (base64 encoding for "This is a reply")
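
The naming convention can be illustrated with a short Python sketch. The function name is hypothetical, and the handling of bytes that are not valid UTF-8 (replacing them) is an assumption; the source does not say what Bitdb does in that case.

```python
import base64


def push_data_to_attributes(pushes: list) -> dict:
    """Map each push data item to its b<n> (base64) and s<n> (utf8) attribute."""
    attributes = {}
    for n, data in enumerate(pushes, start=1):
        attributes[f"b{n}"] = base64.b64encode(data).decode("ascii")
        # Assumption: invalid UTF-8 bytes are replaced rather than rejected.
        attributes[f"s{n}"] = data.decode("utf-8", errors="replace")
    return attributes


attrs = push_data_to_attributes([bytes.fromhex("6d02"), b"hello"])
print(attrs["b1"], attrs["b2"])  # bQI= aGVsbG8=
```

Base64 output depends on the exact bytes of the input, including letter case: the lowercase `hello` encodes to aGVsbG8=, while a capitalized `Hello` would encode to SGVsbG8=.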

2. How data is stored in Bitdb

The table above was a simplified version. Here's an example of a full document object:



  • index: the index of the OP_RETURN output within its parent transaction
  • tx: transaction hash
  • senders: An array of sender objects. Each sender object can have an "a" attribute (address).
  • receivers: An array of receiver objects. This is normally an empty array for simple OP_RETURN transactions, but it is populated when the parent transaction also contains another output that transfers money. Each receiver object can have an "a" attribute (address) and a "v" attribute (the value sent in satoshis).
  • block_index: block index of the parent transaction
  • block_hash: block hash of the parent transaction
  • block_time: block time of the parent transaction
  • b1,b2,...: base64 representation of each push data
  • s1,s2,...: utf8 representation of each push data. Useful for queries like $regex (see example)
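
As a rough illustration of the attributes listed above, here is a placeholder document written as a Python dictionary, with a small hypothetical helper that reads the receivers array. Every value is made up, and the value types (for instance integers for block_index and block_time) are assumptions rather than something the source specifies.

```python
def total_received(doc: dict) -> int:
    """Sum the satoshi values ("v") of all receivers in a document."""
    return sum(receiver.get("v", 0) for receiver in doc.get("receivers", []))


# Placeholder document: all values below are invented for illustration.
example_doc = {
    "index": 0,
    "tx": "<transaction hash>",
    "senders": [{"a": "<sender address>"}],
    "receivers": [{"a": "<receiver address>", "v": 1000}],
    "block_index": 0,
    "block_hash": "<block hash>",
    "block_time": 0,
    "b1": "bQI=",      # base64 of the first push data (0x6d02)
    "s1": "m\u0002",   # utf8 of the same bytes
}

print(total_received(example_doc))  # 1000
```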


3. How to query Bitdb

  • Bitdb internally uses MongoDB to index the OP_RETURN transactions in a structured manner. This means you can query it just like any regular MongoDB instance.
  • Bitdb supports most MongoDB methods such as find, aggregate, sort, limit, project, etc.
  • To query the database, you simply make a request with a JSON payload that describes the request and the response.

The "request" part of the JSON payload is run as a MongoDB query. The "response" part then transforms the result so that it is rendered correctly on the client side.

Bitdb Query Language

You can query Bitdb using a JSON-based query language derived from MongoDB queries.

1. Syntax

The query consists of request (required) and response (optional) attributes, which define how the request and response should be carried out.

  1. request
    • encoding: Key-value pairs describing the encoding of each attribute used in the query (within the find or aggregate attributes below). Attributes that are not listed are treated as utf8.
    • find: MongoDB query filter object. Learn more about MongoDB query filters
    • aggregate: MongoDB aggregation pipeline stages array. Learn more about MongoDB aggregation stages
    • project: MongoDB project operator for selectively returning attributes. Learn more about MongoDB projection
    • sort: MongoDB sort operator. Learn more about the MongoDB sort operator
    • limit: MongoDB limit operator. Limits the number of results to return
  2. response
    • encoding: How to interpret the response. All "b" prefixed attributes are stored in base64 encoding. You can decode them before they are returned by specifying the original encoding (for example "utf8" or "hex"). Otherwise all "b" prefixed attributes will be in base64 format.
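
The syntax above can be captured in a small Python helper that assembles the JSON payload. This is a sketch, not part of Bitdb: `build_query` is a hypothetical name, and rejecting unknown request attributes is a choice made here for illustration. How the payload is then sent to a Bitdb endpoint is not covered.

```python
import json
from typing import Optional

# Request attributes named in the syntax description above.
REQUEST_KEYS = {"encoding", "find", "aggregate", "project", "sort", "limit"}


def build_query(request: dict, response_encoding: Optional[dict] = None) -> str:
    """Assemble a query payload with a required request and optional response."""
    unknown = set(request) - REQUEST_KEYS
    if unknown:
        raise ValueError(f"unknown request attributes: {sorted(unknown)}")
    query = {"request": request}
    if response_encoding is not None:
        query["response"] = {"encoding": response_encoding}
    return json.dumps(query)


payload = build_query(
    {"encoding": {"b1": "hex"}, "find": {"b1": "6d02"}, "limit": 10},
    response_encoding={"b1": "hex"},
)
print(payload)
```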

2. Example Query

Find all transactions where the first push data is "hello"

{
  "request": {
    "find": { "b1": "hello" }
  }
}


Find all transactions where the first push data is "0x6d02"

{
  "request": {
    "encoding": { "b1": "hex" },
    "find": { "b1": "6d02" }
  }
}
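
Because "b" prefixed attributes are stored as base64, a value given in the query has to be brought into that form before it can match. The Python sketch below shows one plausible way the request encoding could be applied; the function name is hypothetical and the source does not confirm that Bitdb does this internally in the same way. Only flat string values and the "utf8" and "hex" encodings are handled.

```python
import base64


def to_stored_form(find: dict, encoding: dict) -> dict:
    """Convert values of b<n> attributes in a find filter to base64."""
    converted = {}
    for key, value in find.items():
        is_b_attribute = key.startswith("b") and key[1:].isdigit()
        if is_b_attribute and isinstance(value, str):
            declared = encoding.get(key, "utf8")  # utf8 when not specified
            if declared == "hex":
                raw = bytes.fromhex(value)
            elif declared == "utf8":
                raw = value.encode("utf-8")
            else:
                raise ValueError(f"unsupported encoding: {declared}")
            converted[key] = base64.b64encode(raw).decode("ascii")
        else:
            converted[key] = value  # s<n>, senders.a, operators, etc.
    return converted


print(to_stored_form({"b1": "6d02"}, {"b1": "hex"}))  # {'b1': 'bQI='}
print(to_stored_form({"b1": "hello"}, {}))            # {'b1': 'aGVsbG8='}
```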


Find all transactions where the first push data is "0x6d02" and the second push data matches "bet"

{
  "request": {
    "encoding": { "b1": "hex" },
    "find": {
      "b1": "6d02",
      "s2": {
        "$regex": "bet", "$options": "i"
      }
    }
  }
}


Find all transactions with the sender qq4kp3w3yhhvy4gm4jgeza4vus8vpxgrwc90n8rhxe

{
  "request": {
    "find": {
      "senders.a": "qq4kp3w3yhhvy4gm4jgeza4vus8vpxgrwc90n8rhxe"
    }
  }
}


More complex query

{
  "request": {
    "encoding": { "b1": "hex" },
    "find": {
      "b1": "6d02"
    },
    "project": {
      "b1": 1, "b2": 1, "_id": 0, "block_time": 1
    },
    "sort": {
      "block_time": 1
    },
    "limit": 100
  },
  "response": {
    "encoding": { "b1": "hex", "b2": "utf8" }
  }
}
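
The "response" part of the query above asks for b1 as hex and b2 as utf8. A minimal Python sketch of that decoding step follows; `decode_response` is a hypothetical name, and replacing invalid UTF-8 bytes is an assumption.

```python
import base64


def decode_response(doc: dict, encoding: dict) -> dict:
    """Decode base64 attributes of one result document as requested."""
    decoded = dict(doc)
    for key, target in encoding.items():
        if key not in doc:
            continue  # attribute was not returned, e.g. it was projected out
        raw = base64.b64decode(doc[key])
        if target == "hex":
            decoded[key] = raw.hex()
        elif target == "utf8":
            # Assumption: invalid UTF-8 bytes are replaced rather than rejected.
            decoded[key] = raw.decode("utf-8", errors="replace")
        else:
            raise ValueError(f"unsupported encoding: {target}")
    return decoded


stored = {"b1": "bQI=", "b2": "aGVsbG8=", "block_time": 1}
print(decode_response(stored, {"b1": "hex", "b2": "utf8"}))
# {'b1': '6d02', 'b2': 'hello', 'block_time': 1}
```

Attributes without an entry in the response encoding are left untouched, so they stay in base64 format.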


3. Demos

Here are some example queries you can try:

Explore demos

FAQ

  • Supported Blockchains?
    • Bitdb can technically work on any Bitcoin-like blockchain that supports OP_RETURN. It can even be implemented as a sidechain. The current implementation of Bitdb lives on the Bitcoin Cash blockchain.
  • Why Bitcoin Cash? The current implementation of Bitdb is powered by Bitcoin Cash (BCH) because it is better suited to data storage.
    1. Larger document size: BCH has a 220-byte OP_RETURN limit compared to BTC's 80 bytes, so each document can hold 2.75 times as much data.
    2. Higher throughput: BCH's block size limit is 32 MB versus BTC's 1 to 4 MB, which makes it better suited to storing large volumes of data.
    3. Predictable insertion cost: Transaction fees vary much less on BCH than on BTC, which makes the cost of each document insertion more predictable. A predictable fee is important for money transfers, and it becomes significantly more important when Bitcoin is used as a database.