So for a while now I have been wrestling with the idea of horizontally scaling a (for the purposes of this article) MySQL db. Not how to actually do it, but how to do it better. I want a system that can grow and scale as a product does, allowing small operations to build on a powerful stack from the beginning with super low overhead. I should take a moment here to note that I am sure I am not the first to come to this idea, but I have not located another source that talks about it. If you do know of some good info please pass it along to firstname.lastname@example.org and I will link to it from here.
I guess it would be good to note that MySQL has a tendency to perform badly with super large table sizes. To counter this, a standard approach is to shard the db. This basically means breaking the database into smaller databases and using a hash algorithm, usually keyed on the user's id, to make sure we always query the same db instance for a particular user. This is great: if I have a db that is too big, I split it in two and now I have 2 that are much faster. Easy enough, right? Ok, so let's continue this example by having our 2 shards reach capacity. Then what do we do? You would probably say "add two more nodes", but this is a little more difficult: now we have to take the two existing nodes offline, re-shard their data, update our hash function, then bring the whole thing back online… Is there a better way?
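To make the standard approach concrete, here is a minimal sketch of fixed modulo sharding; the shard count and the choice of MD5 as the hash are my own assumptions for illustration:

```python
import hashlib

NUM_SHARDS = 2  # hypothetical fixed shard count


def shard_for(user_id: int) -> int:
    """Map a user id to one of NUM_SHARDS database instances."""
    # Stable hash of the user id, reduced modulo the shard count.
    digest = hashlib.md5(str(user_id).encode()).digest()
    return int.from_bytes(digest, "big") % NUM_SHARDS
```

The problem the article describes falls out of that last line: the moment `NUM_SHARDS` changes, almost every id maps to a different shard, which is why growing the cluster forces a full re-shard.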
So let's start with the hash algorithm. Knowing that the space the hash maps to could potentially grow, we need to consider a space not bound by integer values. Instead, let's consider the continuous space between 0.0 and 1.0. Targeting this space allows us to add an arbitrary set of indexable db nodes without further modifications to the hash function. A key consideration is that a hash lookup COULD fail, meaning the db node indicated by the hash function does not contain the target record, and the system will have to recover properly.
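A hash into the unit interval might look like the following sketch; the use of MD5 and the 8-byte truncation are my assumptions, any stable hash scaled into [0, 1) would do:

```python
import hashlib


def unit_hash(user_id: int) -> float:
    """Hash a user id to a stable float in [0.0, 1.0)."""
    digest = hashlib.md5(str(user_id).encode()).digest()
    # Treat the first 8 bytes as a 64-bit integer and scale into [0, 1).
    return int.from_bytes(digest[:8], "big") / 2**64
```

The point is that this function never changes: adding nodes only changes which decimal ids exist in the cluster, not where a user hashes to.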
Let's put this in a concrete example…
Let's say I started with a single db instance and a hash function that takes in a user id and spits out a floating point number between 0 and 1. A user record is queried and the hash function returns 0.6. Well, we don't have a db node at 0.6, only one at 0. We search our list of db nodes and their decimal ids, and take the node with the closest index, in this case 0, so we ask db 0 for the user record. For the base case it's pretty simple: there is only one node, so every hash and distance operation points back to the single db node at 0.

Now let's say that db is headed towards becoming overloaded and you decide to expand your db shards, but your product requires really high up-time and taking the db down to bisect it is out of the question. So what do we do? We simply bring a new, empty node online, say at id 1.0, and look at the hash algorithm again. Say we need to get the record for the same user as before. The hash function points us to 0.6, whose closest node is now 1.0 (we know from before that the user record is located on node 0), so we consider this a MISS. This is where the auto-balancing part comes into play: after a miss, the system searches the other db nodes in an expanding radius from the hashed value. Once the record is found (in shard 0), we move it to the node we first checked. We now only have to search N/2 db nodes, where N is the total number of nodes in the system, to either find the record or be guaranteed it does not exist. This allows us to dynamically scale up the number of db nodes as demand increases and have the load automatically re-distribute itself to the new shards.
- Hash space between 0.0 and 1.0
- Keep a list of the db nodes and their decimal id
- Look up record in closest db id to hash result
- Search radially outward to a max of N/2 for record if needed
- Move record to the first db searched (if needed)
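The steps above can be sketched as follows. This is a toy model, not a real MySQL setup: each "node" is just a dict keyed by its decimal id, and for simplicity the miss path walks all remaining nodes in order of distance from the hashed value rather than stopping at N/2:

```python
import hashlib


def unit_hash(user_id: int) -> float:
    """Hash a user id to a stable float in [0.0, 1.0)."""
    digest = hashlib.md5(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class ShardedStore:
    """Toy auto-balancing store; each node is a dict standing in for a db shard."""

    def __init__(self, node_ids):
        # Keep a list of db nodes keyed by their decimal id.
        self.nodes = {nid: {} for nid in node_ids}

    def _closest(self, h):
        # The node whose decimal id is nearest the hashed value.
        return min(self.nodes, key=lambda nid: abs(nid - h))

    def put(self, user_id, record):
        self.nodes[self._closest(unit_hash(user_id))][user_id] = record

    def get(self, user_id):
        h = unit_hash(user_id)
        home = self._closest(h)
        if user_id in self.nodes[home]:
            return self.nodes[home][user_id]  # hit on the first node checked
        # MISS: search the other nodes radially outward from the hashed value.
        for nid in sorted(self.nodes, key=lambda nid: abs(nid - h)):
            if user_id in self.nodes[nid]:
                # Found elsewhere: move the record to its new home node.
                record = self.nodes[nid].pop(user_id)
                self.nodes[home][user_id] = record
                return record
        return None  # searched every node; guaranteed absent
```

Usage mirrors the worked example: start with one node at 0, add an empty node at 1.0, and the first read after expansion migrates any record whose hash now lands closer to the new node.

```python
store = ShardedStore([0.0])
store.put(7, "alice")
store.nodes[1.0] = {}  # bring a new, empty shard online
store.get(7)           # hit or miss-then-move, depending on unit_hash(7)
```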
In conclusion, database management at a large scale can be a giant pain, so building in transparent flexibility at the beginning can be the difference between a hyper-viral hit and a missed opportunity.