When we describe our overall service architecture to smart people who have been involved in other big services, the two most common questions are:
- Why is your structured data stored in SQL databases instead of something like [big-data, web-scale, No-SQL platform X]?
- Why are you running your own hardware instead of hosting Evernote in [cloud service provider Y]?
These are both valid and interesting questions. I’ll start with #1 and save #2 for a future post.
For the right application, a modern key-value storage engine may offer significant performance or scalability advantages in comparison to a single SQL instance. There are a few reasons that we’ve decided to store all of your account metadata within a single (replicated) MySQL instance instead.
First, the ACID properties of a transactional database like MySQL’s InnoDB are important for our application and synchronization model.
Here’s a little snippet of the database tables for storing “notebooks” and “notes” within a shard’s SQL database:
CREATE TABLE notebooks ( id int UNSIGNED NOT NULL PRIMARY KEY, guid binary(16) NOT NULL, user_id int UNSIGNED NOT NULL, name varchar(100) COLLATE utf8_bin NOT NULL, ... ) ENGINE=InnoDB DEFAULT CHARSET=utf8; CREATE TABLE notes ( id int UNSIGNED NOT NULL PRIMARY KEY, guid binary(16) NOT NULL, user_id int UNSIGNED NOT NULL, notebook_id int UNSIGNED NOT NULL, title varchar(255) NOT NULL, ... FOREIGN KEY (notebook_id) REFERENCES notebooks(id) ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
If you create a notebook named “Cooking” on your Windows client and then immediately clip a recipe for “Quick Tomato Sauce” into that notebook, your client will do something like this on the next sync:
- Call NoteStore.createNotebook() to ask the server to make the Notebook. This will return the GUID of the created notebook.
- Call NoteStore.createNote() to ask the server to make a Note in that Notebook, by specifying the Notebook’s GUID.
Each of these coarse-grained API calls is implemented through single SQL transaction, which ensures that a client can completely trust any reply given by the server. The ACID-compliant database ensures, for example:
Atomicity: If an API call succeeds, then 100% of the changes are completed, and if an API call fails, then none of them are committed. This means that if we fail trying to store the fourth image in your Note, there isn’t a half-formed Note in your account and incorrect monthly upload allowance calculations to charge you for the broken upload.
Consistency: At the end of any API call, the account is in a fully usable and internally consistent state. Every Note has a Notebook and none of them are “dangling.” The database won’t let us delete a Notebook that still has Notes within it, thanks to the FOREIGN KEY constraint.
Durability: When the server says that a Notebook was created, the client can assume that it actually exists for future operations (like the createNote call). The change is durable so that the client knows that it has a consistent reflection of the state of the service at all times.
The Durability property is the most important for our synchronization protocol … if the client can’t assume that changes made on the server will be Durable, then the protocol would become much more complex and inefficient. Each synchronizing client would need to constantly double-check whether the state of each server object matched the local state. Maintaining absolute consistency for an account with 20k Notes, 40k Resources, and 10k Tags would be very expensive if changes couldn’t assume Durability.
The ACID benefits of a transactional database make it very hard to scale out a data set beyond the confines of a single server. Database clustering and multi-master replication are scary dark arts, and key-value data stores provide a much simpler approach to scale a single storage pool out across commodity boxes.
Fortunately, this is a problem that Evernote doesn’t currently need to solve. Even though we have nearly a billion Notes and almost 2 billion Resource files within our servers, these aren’t actually a single big data set. They’re cleanly partitioned into 20 million separate data sets, one per user.
This extreme locality means that we don’t have one “big data” storage problem, but rather we have a lot of “medium data” storage problems that partition neatly into a sharded architecture.
But maybe later…
We’re very interested in all of the cool new data storage systems for future projects that don’t require strong ACID transactionality and do require horizontal scalability. For example, our reporting and analytics system has gradually outgrown its current MySQL platform and needs to be replaced with something bigger/faster/cooler.
But we’re relatively satisfied with sharded MySQL storage for Evernote user account metadata, even though that’s not going to win any style points from the cool kids.