How to use blockchain to build a database solution, ZDNet
Why would you want to use blockchain to build a database solution? And how would you actually do that? BigchainDB has answers.
First Wall Street, then the database world. While most people are still trying to wrap their heads around blockchain and its difference from Bitcoin, others are using it in a broad range of domains. Is it hype, a case of having a hammer and seeing every problem as a nail, or could blockchain actually have a purpose in the database world?
BigchainDB’s creators argue there is a reason, and a way, for blockchain and databases to live happily ever after.
Blockchain technology, put simply, is a type of digital ledger which records transactions, agreements, contracts and sales. The technology is decentralized, which means that information is stored in computers around the world, and is continuously updated in real time to reflect changes in stock, sales and accounts by bringing records together into blocks before algorithms ‘chain’ these data stores together chronologically.
Silicon Valley is hot on blockchain, the technology behind the Bitcoin cryptocurrency, and its many potential uses. Blockchain’s economic influence could be as significant as the Internet.
Blockchain was introduced by Bitcoin, which despite its oft-discussed issues has illustrated a novel set of benefits: decentralized control, where “no one” owns or controls the network; immutability, where written data is “forever” tamper-resistant; and the ability to create and transfer assets on the network, without reliance on a central entity.
The initial excitement surrounding Bitcoin stemmed from its use as a token of value, for example as an alternative to government-issued currencies. Now that the separation between Bitcoin and the underlying blockchain technology is better understood, the scope of the technology itself and of its applications is being extended.
With this increase in scope, single monolithic blockchain technologies are being re-framed into building blocks at four levels of the stack:
1. Applications
2. Decentralized (blockchain) computing platforms
3. Decentralized processing (smart contracts), decentralized storage (file systems, databases), and decentralized communication
4. Cryptographic primitives, consensus protocols, and other algorithms
Blockchain operations work with data, and that data is also stored as part of the blockchain. For example, when transferring assets from one node to another, the amounts transferred as well as the sender, receiver, and time of transfer are stored. So the option to leverage the benefits blockchain brings by using it as a database is tempting.
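To make the idea of transfer records stored in chained blocks concrete, here is a minimal Python sketch. It is not how Bitcoin or BigchainDB encode their data: the field names, the JSON encoding, and the helper functions are all illustrative assumptions. It only shows why changing an old record becomes detectable once each block commits to the hash of the one before it.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical (sorted-key) JSON encoding of the block."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, transfers: list) -> dict:
    """Append a block whose prev_hash commits to the block before it."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "prev_hash": prev_hash, "transfers": transfers}
    chain.append(block)
    return block


def is_consistent(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical transfers: sender, receiver, amount and time, as in the text.
chain = []
append_block(chain, [{"sender": "alice", "receiver": "bob",
                      "amount": 5, "time": "2017-03-01T12:00:00Z"}])
append_block(chain, [{"sender": "bob", "receiver": "carol",
                      "amount": 2, "time": "2017-03-01T12:10:00Z"}])
print(is_consistent(chain))  # True

chain[0]["transfers"][0]["amount"] = 500  # rewrite history in the first block
print(is_consistent(chain))  # False: block 1 no longer matches block 0's hash
```

Note that this toy check cannot detect tampering with the most recent block, since nothing points at it yet; real systems rely on consensus among many nodes for that.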
The problem is, the blockchain as a database is awful, measured by traditional database standards: throughput is just a few transactions per second (tps), latency before a single confirmed write is ten minutes, and capacity is a few dozen GB. Furthermore, adding nodes causes more problems: with a doubling of nodes, network traffic quadruples with no improvement in throughput, latency, or capacity. Plus, the blockchain essentially has no querying abilities.
How could that possibly ever work? Trent McConaghy and his co-founders at BigchainDB have tackled this issue by turning it on its head: instead of using blockchain as a database, they are taking a database and adding blockchain features to it. Originally they started working with RethinkDB, the reason being that RethinkDB leveraged a clean and efficient node update protocol.
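The quadrupling of traffic can be illustrated with a back-of-the-envelope model. The article does not say where the figure comes from; the sketch below assumes an all-to-all scheme in which every node relays each transaction to every other node, so the message count grows with n(n - 1). The function names are made up for this example.

```python
def broadcast_messages(nodes: int, transactions: int = 1) -> int:
    """Messages sent if every node relays each transaction to every other node."""
    if nodes < 1:
        raise ValueError("need at least one node")
    return transactions * nodes * (nodes - 1)


def traffic_growth(nodes: int) -> float:
    """Traffic after doubling the node count, relative to traffic before."""
    if nodes < 2:
        raise ValueError("need at least two nodes to have any traffic")
    return broadcast_messages(2 * nodes) / broadcast_messages(nodes)


for n in (10, 100, 1000):
    print(n, round(traffic_growth(n), 3))
```

Under this assumption the ratio approaches 4 as the network grows, while each node still has to process every transaction, which is why throughput does not improve.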
BigchainDB works by building blockchain features on top of a DB, rather than using blockchain as a DB. Photo: BigchainDB
Under the hood, BigchainDB uses two distributed databases, S (transaction set or “backlog”) and C (blockchain), connected by the BigchainDB Consensus Algorithm (BCA). The BCA runs on each signing node, with signing nodes forming a federation. Non-signing clients may connect to BigchainDB, and depending on permissions they may be able to read, issue assets, transfer assets, and more.
Each of the distributed DBs, S and C, is an off-the-shelf big data DB. BigchainDB does not interfere with their internal workings, so it gets to leverage their scalability properties, as well as features like revision control and benefits like battle-tested code. Each DB is running its own internal consensus algorithm for consistency.
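The relationship between S, C, and the federation can be pictured with a toy in-memory model. This is only a sketch of the data flow described above, not the BCA itself: the article does not spell out how signing nodes reach a decision, so the strict-majority rule, the block size, and every class and method name here are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyFederation:
    """Toy model: backlog S, chain C, and a voting step between them."""
    signing_nodes: list
    backlog: deque = field(default_factory=deque)  # S: transactions awaiting a block
    chain: list = field(default_factory=list)      # C: blocks the federation accepted

    def submit(self, transaction: dict) -> None:
        """A client hands a transaction to the federation; it waits in S."""
        self.backlog.append(transaction)

    def propose_block(self, max_size: int = 2) -> list:
        """Move up to max_size transactions out of S into a candidate block."""
        block = []
        while self.backlog and len(block) < max_size:
            block.append(self.backlog.popleft())
        return block

    def vote(self, block: list, validator) -> bool:
        """Each signing node checks the block; a strict majority writes it to C.

        A rejected block is simply discarded in this toy model.
        """
        approvals = sum(1 for node in self.signing_nodes if validator(node, block))
        accepted = approvals * 2 > len(self.signing_nodes)
        if accepted:
            self.chain.append(block)
        return accepted


def amounts_positive(node: str, block: list) -> bool:
    """Hypothetical validity rule applied by every signing node."""
    return all(tx["amount"] > 0 for tx in block)


federation = ToyFederation(signing_nodes=["node-1", "node-2", "node-3"])
federation.submit({"sender": "alice", "receiver": "bob", "amount": 5})
federation.submit({"sender": "bob", "receiver": "carol", "amount": 2})
candidate = federation.propose_block()
print(federation.vote(candidate, amounts_positive), len(federation.chain))
```

In BigchainDB both S and C live in the underlying distributed database rather than in process memory, which is where the scalability properties mentioned above come from.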
At this point BigchainDB has moved towards using MongoDB, and is in fact in a partnership with them. But why MongoDB? It could have been any other open source distributed database. “We did consider a number of DBs, but we wanted a document DB to begin with as we’re working with JSON at this point, and MongoDB is an obvious choice.”
But, again, isn’t BigchainDB afraid that combining the well-known blockchain with the recently targeted MongoDB could raise numerous red flags in terms of security? McConaghy has openly acknowledged that the underlying DB may be a security vulnerability at this point, but is neither critical of MongoDB nor apologetic.
“MongoDB has been clear about providing ease of access by removing hard security, so it’s not their fault if people left their installations on the internet unsecured. As for us, at this point we are no better or worse than a centralized solution, and we will undoubtedly add improved security features before moving to production,” he says.
BigchainDB promises blockchain advantages, plus scalability. See also addendum. Pic: BigchainDB
BigchainDB works by offering an API on top of the underlying database, with the aim of acting as a substrate-agnostic layer that adds the key blockchain features of decentralization, immutability, and asset transferability. But that leads to some interesting issues.
For example, what if for some reason users would like to use a different database as a substrate? BigchainDB offers a Service Provider Interface that can be used to plug in other databases. It is what has been used to integrate and operate on top of MongoDB, and according to McConaghy could also be used to do the same with any other database, be it relational or key-value or anything else.
Of course, that is easier said than done, and brings up another issue: querying. Although BigchainDB’s querying support is not fully operational at this point, the aim is to offer one unified querying interface over whatever underlying database nodes BigchainDB may be using. That is a hard problem to solve, as not all databases have the same query languages or capabilities.
However, the current trend towards feature convergence in the database world, and in particular the renewed interest in and return to SQL as the standard for querying, may suggest a way out of this. Even so-called NoSQL databases like MongoDB offer SQL capabilities these days, so this is the most promising way forward for BigchainDB as well: a SQL interface.
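The general shape of such a pluggable layer can be sketched in Python. This is not BigchainDB’s actual Service Provider Interface, whose methods are not described here; the class names, method names, and the append-only rule are hypothetical, chosen only to show how the upper layer can stay the same while the substrate is swapped.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical contract a substrate database would have to satisfy."""

    @abstractmethod
    def write(self, record_id: str, record: dict) -> None:
        """Store a new record; existing records must not be overwritten."""

    @abstractmethod
    def read(self, record_id: str) -> dict:
        """Return a previously stored record."""


class InMemoryBackend(StorageBackend):
    """Stand-in for a real adapter (document, relational, key-value, ...)."""

    def __init__(self) -> None:
        self._records = {}

    def write(self, record_id: str, record: dict) -> None:
        if record_id in self._records:
            raise ValueError(f"record {record_id!r} already exists")
        self._records[record_id] = dict(record)

    def read(self, record_id: str) -> dict:
        return dict(self._records[record_id])


class Ledger:
    """Upper layer that only ever talks to the StorageBackend interface."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def record_transfer(self, tx_id: str, sender: str, receiver: str, asset: str) -> None:
        self._backend.write(tx_id, {"sender": sender, "receiver": receiver, "asset": asset})

    def lookup(self, tx_id: str) -> dict:
        return self._backend.read(tx_id)


ledger = Ledger(InMemoryBackend())
ledger.record_transfer("tx1", "alice", "bob", "artwork-42")
print(ledger.lookup("tx1"))
```

Supporting another database would then mean writing one more subclass of the backend interface. Queries are the harder part, because a unified query method has to be expressed in terms every backend can answer.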
At this point, BigchainDB queries are mostly done by directly using MongoDB’s API, but this is a sort of hack that tightly couples BigchainDB to MongoDB, so it is seen as an interim solution that will eventually give way to querying via BigchainDB’s own API.
As should be evident by now, BigchainDB is not a typical database by any measure. It is also not a typical startup run by a typical founder. McConaghy has a rich background in AI before it was cool and a hacker ethos: “doing AI in the 90s was one of the least popular things one could possibly do, so I certainly didn’t do it for the hype.”
McConaghy could have been part of the Facebooks of the world had he chosen to, as he has actually turned down such offers. This is not what drives him, and by extension BigchainDB. The drive behind BigchainDB is not getting to a successful exit or IPO, but rather reshaping the internet and the world at large.
McConaghy believes that centralization leads to concentration of power, citing examples such as social media ownership and control of data or the conundrum that both creators and consumers of art, and content in general, face on the internet.
This is what McConaghy’s previous venture, Ascribe, was about: helping digital artists transfer ownership of their work to customers. Although it is unclear whether this is applicable to everyday art like music or movies, Ascribe aims to provide a solution for digital artists with unique creations and collectors that want to own them, and uses decentralization to achieve this. At some point Ascribe’s evolution gave birth to BigchainDB.
Some might say this is an overly complicated solution, but McConaghy is not one to shy away from complexity. When asked for his take on Numerai and the criticism that has been voiced towards it, for example, he is adamant: “I don’t think it’s overly complicated, on the contrary, I think it’s brilliant, maybe the best combination of blockchain and AI out there. I think they are doing a really good job of aligning incentives for founders, employees and users. Think of Facebook, what if it operated on the basis of providing its users a stake in the value it generates? This is what Numerai is doing, and in the process it is bringing a shift in the power structure and creating incentives for cooperation. So it is turning a zero-sum game to a positive-sum game.”
So where on that long and winding road is BigchainDB at the moment? Berlin-based BigchainDB has raised a total of five million euros, with a latest series A of three million. It is working in close collaboration with a number of early adopter clients, including the likes of RWE and Internet Archive.
The Internet Archive, along with other organizations such as Open Media or the Human Data Commons Foundation, is also among the caretakers of IPDB, or Inter-Planetary DB: a public instance of BigchainDB, used to collectively store and manage content in a safe and decentralized way. IPDB has an equally grand vision: its aim is to be a database for the internet.
For the Internet Archive, for example, it would mean moving away from traditional storage technology and towards the decentralized and cooperative storage model that BigchainDB stands for. As the Internet Archive is looking into options such as moving its data to Canada to avoid data sovereignty issues, the potential of adding immutability on top of decentralized storage is appealing.
For RWE, on the other hand, the stakes are a bit different. Traditionally, large electric utilities would connect the energy producers with the energy consumers. Deregulation changes things, as anyone can now connect to anyone. RWE is getting in front of that by exploring several blockchain projects, such as energy exchanges, electric car charging, and billing.
BigchainDB has recently released version 0.9, and its roadmap for 2017 is to reach a stable version 1.0 in the summer and to have fully operational, production-ready open-source and enterprise versions available by the end of the year.
Whether that aim is feasible, or whether its grand vision is likely to be achieved remains to be seen. It certainly does not lack in ambition or abilities however.
Addendum, March 8th 2017: After the article was published, we received the following clarification from Bigchain’s CEO regarding scalability:
“When we first released BigchainDB, we gave too strong of an impression that it was *already* doing 1M writes/s whereas that was actually just in the underlying database (RethinkDB at the time), though we had designed the algorithm such that BigchainDB could eventually hit that (after more hardening and optimizations).
After feedback, we revised things to set a more adequate expectation: *towards* 1M writes/s. And we also discovered that users didn’t care as much about that benefit compared to other benefits, like high capacity and usability; so we spent more of our resources towards user asks than towards 1M writes/s so far. (That is however still in the roadmap; it’s just not a priority).
I wrote a blog post last May describing this journey; including an apology for setting the wrong expectations; and a commitment to be better about it, which I’m proud to say we’ve kept. It was the first time in my career that I’d had misaligned expectations compared to what I was shipping; never again! :)”