Title: Building large-scale distributed applications on top of self-managing transactional stores
Author: Peter Van Roy, Université catholique de Louvain
Abstract: More and more large-scale Internet applications rely on horizontal scalability (adding more nodes to the system) instead of vertical scalability (adding resources to a single node). Vertical scalability is limited by how big one node can get, whereas horizontal scalability has no such limit. Moreover, vertical scalability needs a large up-front investment to handle a projected peak, whereas horizontal scalability can be pay-per-use on a cloud. This talk explains the key ideas of two recent systems that achieve horizontal scalability, Scalaris and Beernet, and how we use them to build scalable self-managing applications. Both systems provide robust atomic transactions at high performance with strong consistency. They extend Distributed Hash Tables (DHTs) with symmetric data replication and with enhanced Paxos commit and consensus protocols to support transactions over the replicated data. Both Scalaris and Beernet were implemented as part of the European SELFMAN project. Scalaris achieves 4000 read-modify-write transactions per second on two dual-core Intel Xeon processors at 2.66 GHz, with a data replication factor of 4. Scalaris works well in both tightly and loosely coupled settings: we have implementations running on cluster computers and on PlanetLab. Our Distributed Wikipedia application written in Scalaris won first prize in the IEEE International Scalable Computing Challenge 2008. Beernet achieves a major simplification of DHT self-organization by using a novel relaxed ring structure that does not need periodic stabilization, greatly reduces lookup inconsistency in the face of high node turnover ("churn"), and does not rely on transitive connectivity. The functionality provided by systems such as Scalaris and Beernet makes it trivially easy to build large-scale Internet applications.
Students used Beernet to build a Distributed Wiki in our undergraduate course at UCL and complained that it was too easy: the system hid the difficulties of distributed programming too well. We conclude that these systems are the first steps towards a new generation of programming platforms for the Internet, one that will make obsolete the current platforms, which are still focused on single machines.
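
To make the idea of symmetric data replication more concrete, here is a minimal Python sketch of how the replicas of one item can be placed at evenly spaced identifiers around a DHT ring. It is an illustration under stated assumptions, not code from Scalaris or Beernet: the ring size `ID_SPACE`, the SHA-1 based `key_id`, and the helper names `replica_ids` and `majority` are all hypothetical. Only the replication factor of 4 comes from the abstract. The majority quorum is likewise an assumption about how a Paxos-style commit might decide on replicated data.

```python
import hashlib

# Assumed size of the identifier ring (hypothetical; chosen small for readability).
ID_SPACE = 2 ** 16
# Replication factor quoted in the abstract for the Scalaris measurement.
REPLICATION_FACTOR = 4


def key_id(key: str) -> int:
    """Map a key to an identifier on the ring (assumed hashing scheme)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


def replica_ids(key: str, f: int = REPLICATION_FACTOR, n: int = ID_SPACE) -> list[int]:
    """Return the f identifiers that hold replicas of `key`.

    Symmetric placement: replicas sit n/f apart, so each one lands in a
    different region of the ring and can be located without a directory.
    """
    if n % f != 0:
        raise ValueError("ring size must be a multiple of the replication factor")
    base = key_id(key) % n
    step = n // f
    return [(base + i * step) % n for i in range(f)]


def majority(f: int = REPLICATION_FACTOR) -> int:
    """Smallest number of replicas that forms a majority (assumed quorum rule)."""
    return f // 2 + 1


if __name__ == "__main__":
    ids = replica_ids("wiki:Main_Page")
    print("replica identifiers:", ids)
    print("replicas needed for a majority:", majority())
```

With a replication factor of 4, a majority is 3 replicas, so under this assumed quorum rule a transaction could still make progress if one of the four replicas were unreachable.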