I present two recommended reads. Oddly enough, both are written by Microsoft employees.

  • “On Designing and Deploying Internet-Scale Services” (PDF link) by James Hamilton is essentially an end-to-end checklist for making an application work, scale, and stay manageable. The paper, presented at LISA ’07, is 12 pages of bulleted recommendations from the author and from members of various large application teams within Microsoft, organized around ten themes:

    • overall service design
    • designing for automation and provisioning
    • dependency management
    • release cycle and testing
    • hardware selection and standardization
    • operations and capacity planning
    • auditing, monitoring, and alerting
    • graceful degradation and admission control
    • customer and press communications plan
    • customer self-provisioning and self-help.

    It is a concise way to check that you’ve thought through how your service works, behaves, and operates, and where you can improve it.