When we decided a few weeks ago to give the service away for free, one of our goals was to stress our architecture to see what it could withstand. Code has a funny way of not always bottlenecking where you think it will, and as you scale a system, some things always surprise you. Now that we are adding roughly 1,000 new users every day, we are starting to see bottlenecks and system problems where we did not expect them. The “scalability” of cloud computing is not always as simple and easy as it sounds, and we are at the point where adding more servers doesn’t really help.
For most of you, things seem to be working fine. About 10% of you, though, are seeing almost nothing back up at all. The difference comes down to where you are in the system, how much data is in your account, and how much data belongs to the users around you in the backup queue.
The two biggest problem areas are Gmail and Twitter backups. Gmail backups cause problems because initial backups are often very large, frequently multiple gigabytes, and they clog things up for other users. Twitter causes problems not because of tweet volume, but because of backing up friends and followers. The Twitter API methods for these are relatively slow, so if you have millions of followers, or are in the queue near someone who does, the entire system gets bogged down.
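To illustrate the queueing effect described above, here is a toy sketch. It is not the actual backup code: the function name, the job sizes, and the size threshold are all invented for illustration, and it assumes a single worker that handles jobs strictly in arrival order.

```python
def wait_times(job_sizes):
    """Return how long each job waits before it starts, assuming one worker
    processes jobs in arrival order and one size unit takes one time unit."""
    waits = []
    clock = 0
    for size in job_sizes:
        waits.append(clock)  # this job starts once everything ahead of it is done
        clock += size
    return waits


# Hypothetical shared queue: one very large initial backup ahead of three small ones.
shared_queue = [5000, 10, 10, 10]
print(wait_times(shared_queue))  # [0, 5000, 5010, 5020]

# Same jobs, with large ones routed to their own lane (threshold is an assumption).
THRESHOLD = 1000
small_lane = [s for s in shared_queue if s < THRESHOLD]
large_lane = [s for s in shared_queue if s >= THRESHOLD]
print(wait_times(small_lane))  # [0, 10, 20]
print(wait_times(large_lane))  # [0]
```

In the shared queue, every small job waits behind the large one. With separate lanes, the small jobs are no longer affected by it, which is the general idea behind keeping heavy accounts from slowing down everyone near them.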
On top of all this, the process that sweeps through, collects your history, and emails you about it was not written in a way that lets us run multiple instances of it at once. That was a lack of foresight on our part, and we are really surprised it became a bottleneck so quickly.
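For readers curious what "safe to run as multiple instances" usually means, here is a minimal sketch of one common approach: each worker claims an account atomically before processing it, so no account is handled twice. This is an assumption-laden illustration, not the sweeper itself. `ClaimTable`, `sweep`, and `run_sweepers` are hypothetical names, and the in-memory lock stands in for what would need to be an atomic claim in a shared database or queue when workers run on separate servers.

```python
import threading


class ClaimTable:
    """In-memory stand-in for a shared store that hands each account
    to exactly one worker (hypothetical, for illustration only)."""

    def __init__(self, account_ids):
        self._pending = list(account_ids)
        self._lock = threading.Lock()

    def claim_next(self):
        # The lock makes "check and take" a single atomic step.
        with self._lock:
            return self._pending.pop() if self._pending else None


def sweep(table, processed):
    """Keep claiming accounts until none are left."""
    while True:
        account = table.claim_next()
        if account is None:
            return
        # Placeholder for the real work: collect history, send the email.
        processed.append(account)


def run_sweepers(account_ids, workers=4):
    """Run several sweeper instances side by side over the same accounts."""
    table = ClaimTable(account_ids)
    processed = []
    threads = [
        threading.Thread(target=sweep, args=(table, processed))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return processed
```

Calling `run_sweepers(range(100))` returns all 100 account IDs in some order, each exactly once, regardless of how many workers are used.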
The good news is that this is all easy to fix. Now that we understand the issues, we have made some quick changes that will improve performance. We have switched friend and follower backups from daily to weekly, and we may move Twitter backups to every other day for a week or so, to make sure they actually complete.
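A schedule change like this can be expressed as a small table of per-task intervals. The sketch below is only an illustration under assumptions: the task names, the `INTERVALS` table, and the `is_due` helper are hypothetical, the every-other-day value reflects the change that is still only being considered, and the dates in the usage example are arbitrary.

```python
from datetime import datetime, timedelta

# Hypothetical task names; intervals follow the changes described above.
INTERVALS = {
    "twitter_friends_followers": timedelta(days=7),  # was daily, now weekly
    "twitter_tweets": timedelta(days=2),             # possible temporary setting
    "gmail": timedelta(days=1),                      # unchanged, daily
}


def is_due(task, last_run, now):
    """Return True if the task has never run or its interval has elapsed."""
    if last_run is None:
        return True
    return now - last_run >= INTERVALS[task]


# Usage with arbitrary example dates.
now = datetime(2024, 1, 8)
yesterday = datetime(2024, 1, 7)
print(is_due("gmail", yesterday, now))                      # True
print(is_due("twitter_friends_followers", yesterday, now))  # False
```

Keeping the intervals in one place means going back to daily backups later is a one-line change per task.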
Our new architecture is almost in place, but moving multiple terabytes of data will take a week or more. You should see improvements in backup performance starting in a few days, and in 3-4 weeks things should be humming along just fine. By then you will be back to daily backups of everything in your account.
As a final note, we just want to thank you for breaking things. We are cloud geeks who love tackling challenging problems and because there are so many of you with so much data, we get to work on some really interesting things and implement some innovative solutions that make our programmer friends jealous. Our new architecture changes should get us through early summer, and we are already starting to work on the next architecture that will take us to 1 million users.
Thank you for your patience and support. If you have particular questions or concerns about your account, feel free to email us or tweet to @backupify.