Here are a few interesting articles I read on the interweb over the past few weeks about Twitter, LinkedIn, eBay and Google.
Improving running components at Twitter describes the evolution of Twitter's technology and their new message queue server, Kestrel, which is written in approximately 1.5K lines of Scala.
LinkedIn Communication Architecture details the heavy use of Java, Tomcat, Jetty, Lucene, Spring and ActiveMQ at LinkedIn. Oracle and MySQL are used for data storage. They have customized Lucene heavily for their near real-time indexing needs and have open-sourced those modifications as Zoie on Google Code. The upcoming second edition of Lucene in Action has a case study on how Zoie builds upon Lucene.
The eBay way is a presentation on eBay's real-time personalization system. This mammoth system handles 4 billion reads/writes per day. The interesting thing about it is that it uses the MySQL MEMORY engine as a caching tier in front of a persistent tier. Some critical data is replicated (presumably on the cache tier, since they talk about doubling memory needs). They ran into problems with MySQL's single-threaded replication, so replication is handled through dual writes instead (the second write can be asynchronous). The system can automatically redistribute data if a node goes down.
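To make the dual-write idea concrete, here is a minimal sketch in Python. It is not eBay's code: the CacheNode and DualWriteCache classes are hypothetical names of my own, and plain dicts stand in for the MySQL MEMORY-engine nodes. The first write is synchronous, the second is handed to a background worker, and reads fall back to the second copy if the first node is down.

```python
from concurrent.futures import Future, ThreadPoolExecutor


class CacheNode:
    """Hypothetical stand-in for one cache-tier node (a dict, not real MySQL)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.alive = True

    def put(self, key, value):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.rows[key] = value

    def get(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.rows.get(key)


class DualWriteCache:
    """Writes every value to two nodes instead of relying on replication."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        # A single worker keeps the asynchronous writes in submission order.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def write(self, key, value) -> Future:
        # First write: synchronous, the caller waits for it.
        self.primary.put(key, value)
        # Second write: asynchronous, the caller may ignore the returned Future.
        return self._pool.submit(self.secondary.put, key, value)

    def read(self, key):
        try:
            return self.primary.get(key)
        except ConnectionError:
            # Primary is down: serve the copy made by the second write.
            return self.secondary.get(key)

    def close(self):
        self._pool.shutdown(wait=True)


cache = DualWriteCache(CacheNode("cache-a"), CacheNode("cache-b"))
cache.write("user:42", {"theme": "dark"}).result()  # wait only for the demo
cache.primary.alive = False                         # simulate a node failure
print(cache.read("user:42"))                        # prints {'theme': 'dark'}
cache.close()
```

The cost of this approach is visible in the sketch: every replicated value lives on two nodes, which matches the doubling of memory needs mentioned in the presentation.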
Jeff Dean's WSDM keynote slides on the evolution of Google's search infrastructure are perhaps the most interesting of all. The infrastructure has gone through a number of iterations over the years. I was surprised to learn that their complete index is served out of memory. It makes sense, though: as they increased the number of nodes, they crossed a point where the combined memory was enough to hold the entire index.
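The crossover point is simple arithmetic, sketched below in Python. The function names and all the numbers are made up for illustration; the slides do not give these figures.

```python
import math


def min_nodes_for_in_memory_index(index_gb, usable_ram_gb_per_node):
    """Smallest node count whose combined RAM can hold the whole index."""
    return math.ceil(index_gb / usable_ram_gb_per_node)


def fits_in_memory(nodes, index_gb, usable_ram_gb_per_node):
    """True once the cluster's combined RAM covers the index size."""
    return nodes * usable_ram_gb_per_node >= index_gb


# Hypothetical figures: a 2000 GB index and 4 GB of usable RAM per node.
print(min_nodes_for_in_memory_index(2000, 4))  # prints 500
print(fits_in_memory(499, 2000, 4))            # prints False
print(fits_in_memory(500, 2000, 4))            # prints True
```

In other words, a cluster that grows for throughput reasons eventually passes the node count at which disk is no longer needed to serve the index.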
Thoughts about technology, business and all that's life.
This blog has moved to http://shal.in.
Wednesday, April 01, 2009
About Me
- Shalin Shekhar Mangar
- Committer on Apache Solr. Principal Software Engineer at AOL.