Monday, September 19, 2011

Notes from "Scaling MongoDB for Real-Time Analytics"

* "Scaling MongoDB for Real-Time Analytics, a Shortcut Around the Mistakes I've Made" - Theo Hultberg, Chief Architect, Burt

40 GB, 50 million documents per day

uses of Mongo
- virtual memory (throwaway)
- short-term storage (throwaway)
- full-time storage

why Mongo
- sharding makes writes scale
- secondary indexes

using JRuby

MongoDB 1.8.1
- updates (incrementing a single number): near 50% write lock
- pushes (appending to an array): near 80% write lock

MongoDB 2.0
- updates: near 15% write lock
- pushes: near 50% write lock
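
As a rough illustration of the two write patterns being compared, here is a minimal sketch using the current Ruby driver (the collection, ids, and field names are made up, not Burt's schema):

    require 'mongo'

    client = Mongo::Client.new(['127.0.0.1:27017'], database: 'analytics')
    sessions = client[:sessions]

    # In-place increment of a single number: the cheaper kind of update.
    sessions.update_one(
      { _id: 'session-1' },
      { '$inc' => { 'pageviews' => 1 } },
      upsert: true
    )

    # Appending to an array: the document grows, which is the more
    # write-lock-hungry pattern in the numbers above.
    sessions.update_one(
      { _id: 'session-1' },
      { '$push' => { 'events' => { 'type' => 'click', 'at' => Time.now.utc } } },
      upsert: true
    )

Pushing grows the document, which can force the storage engine of that era to move it on disk, one likely reason the array-append pattern held the lock so much longer.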

iterations
- 1
-- one document per session
-- update as new data comes along (the $push pattern in the sketch above)
-- 1000% write lock!
-- lesson: everything is about working around the global write lock
- 2
-- multiple documents per session, using the same id prefix (see the prefix-key sketch after this list)
-- less lock contention, but still not great performance; couldn't remove data as fast as it was being inserted
-- lesson: everything is about working around the global write lock
-- put more thought into designing the primary key (same prefix for a group)
- 3
-- wrote to a new collection every hour (see the collection-rotation sketch after this list)
-- lots of complicated code, bugs
-- still left fragmented database files on disk
-- lesson: make sure you can remove old data
- 4
-- sharding (setup commands sketched after this list)
-- higher write performance
-- lots of problems (in 1.8) and ops time spent debugging
-- lesson: everything is about working around the global write lock; sharding is a way around this (4 shards means 4 separate write locks)
-- lesson: sharding is not a silver bullet (it's buggy; avoid it if you can)
-- lesson: it will fail, so design the infrastructure for failure
- 5
-- moved different usage patterns to separate clusters (one for high-volume writes that are later dropped, one for incrementing documents)
-- lesson: one database with one usage pattern per cluster
-- lesson: monitor everything (found a replica that was 12 hours behind - useless!)
- 6
-- upgraded to monster servers (AWS High-Memory Quadruple Extra Large)
-- later downgraded to Extra Large machines: 6 machines running mongod with 12 GB RAM each, 3 machines for mongo config servers, and 3 machines for arbiters, spread across three availability zones
-- writing 2,000 documents per second, reading 4,000 documents per second
-- lesson: call the experts when you're out of ideas
- 7
-- partitioning again, plus pre-chunking (you need to know your data and the range of your keys); see the pre-splitting sketch after this list
-- partitioned by database, with a new db each day
-- decreased the size of documents (so less RAM is used)
-- no more problems removing data
-- lesson: smaller objects mean smaller documents (and less RAM used)
-- lesson: think about your primary key
-- lesson: everything is about working around the global write lock
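
A sketch of the iteration-2 key design, assuming a composite string _id where a shared session-id prefix groups related documents (the ids, names, and hourly split are illustrative, not Burt's actual schema):

    require 'mongo'

    client = Mongo::Client.new(['127.0.0.1:27017'], database: 'analytics')
    chunks = client[:session_chunks]

    session_id = 'c0ffee'                 # hypothetical session id
    hour = Time.now.utc.strftime('%Y%m%d%H')

    # One document per (session, hour) instead of one ever-growing document
    # per session; the shared "c0ffee:" prefix keeps a session's documents
    # together in the _id index.
    chunks.update_one(
      { _id: "#{session_id}:#{hour}" },
      { '$inc' => { 'pageviews' => 1 } },
      upsert: true
    )

    # Everything for one session can still be fetched with an anchored
    # prefix match on _id.
    chunks.find(_id: /^#{session_id}:/).to_a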
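
A sketch of the iteration-3 collection-per-hour rotation; the sessions_YYYYMMDDHH naming and the 24-hour retention window are assumptions for illustration. Expiry becomes dropping whole collections rather than removing documents one by one:

    require 'mongo'

    client = Mongo::Client.new(['127.0.0.1:27017'], database: 'analytics')

    # Writes go to a collection named after the current hour.
    current = client["sessions_#{Time.now.utc.strftime('%Y%m%d%H')}"]
    current.update_one({ _id: 'c0ffee' }, { '$inc' => { 'pageviews' => 1 } }, upsert: true)

    # Old data is expired by dropping whole collections; the fixed-width
    # timestamp suffix makes a plain string comparison good enough here.
    cutoff = (Time.now.utc - 24 * 3600).strftime('%Y%m%d%H')
    client.database.collection_names.each do |name|
      stamp = name[/\Asessions_(\d{10})\z/, 1]
      client[name].drop if stamp && stamp < cutoff
    end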
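
The iteration-4 sharding setup reduced to the standard admin commands; the mongos address, database, collection, and shard key here are assumptions, not what Burt ran:

    require 'mongo'

    # These commands go to a mongos router, with the admin database selected.
    admin = Mongo::Client.new(['mongos.example.com:27017'], database: 'admin')

    # Turn on sharding for the database, then shard the collection on _id;
    # with N shards, each shard has its own global write lock.
    admin.database.command(enableSharding: 'analytics')
    admin.database.command(shardCollection: 'analytics.session_chunks', key: { _id: 1 })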
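
A sketch of iteration 7's database-per-day partitioning plus pre-chunking; the split points assume you already know your key range, and every name and boundary here is illustrative:

    require 'mongo'

    admin = Mongo::Client.new(['mongos.example.com:27017'], database: 'admin')

    today = Time.now.utc.strftime('%Y%m%d')
    db_name = "analytics_#{today}"        # a fresh database every day
    ns = "#{db_name}.session_chunks"

    admin.database.command(enableSharding: db_name)
    admin.database.command(shardCollection: ns, key: { _id: 1 })

    # Pre-split the empty collection at known key-range boundaries so the
    # day's writes spread across shards immediately instead of piling onto
    # a single chunk (in practice you would also moveChunk the pieces onto
    # different shards before traffic starts).
    %w[4 8 c].each do |boundary|          # assumes hex-like string ids
      admin.database.command(split: ns, middle: { _id: boundary })
    end

    # Removing old data is now just dropping yesterday's database.
    yesterday = (Time.now.utc - 24 * 3600).strftime('%Y%m%d')
    admin.use("analytics_#{yesterday}").database.drop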

Q&A
- would you recommend MongoDB?
-- best for high read loads; high write load is not a solved problem yet
- EC2
-- since you already have replicas, why also use EBS?
-- use ephemeral disks: they come included and give predictable performance (with caveats, like spreading replicas across availability zones to avoid losing a data centre's worth of data)
-- use RAID 10 if you do use EBS
- monitoring
-- mongostat
-- the monitoring plugins that are available
-- MMS from 10gen
-- Server Density (here at the conference)
- map reduce
-- moved away from it to be more real-time (had been using Hadoop)
