You can buy this book on amazon.com.
Release It! is a book about how to architect, design, and build software. Reading these notes will not replace reading the book. I collect them so I can come back later and apply them in current projects.
Chapter 1 Living in Production
Part I — Create Stability
Chapter 2 Case Study: The Exception That Grounded an Airline
The cause of the problem: stmt.close() can throw an exception, and when it did, the connection was never closed.
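A minimal sketch of that failure mode, written in Python rather than the book's Java. `FakeStatement` and `FakeConnection` are hypothetical stand-ins for the real database objects. The sketch shows how an exception from the first cleanup call skips the second one, and how nesting the cleanup avoids the leak.

```python
class FakeStatement:
    """Hypothetical stand-in for a statement whose close() fails."""

    def close(self):
        raise RuntimeError("close failed")


class FakeConnection:
    """Hypothetical stand-in for a pooled connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def leaky_cleanup(stmt, conn):
    # If stmt.close() raises, conn.close() is never reached.
    try:
        pass  # the query would run here
    finally:
        stmt.close()
        conn.close()


def safe_cleanup(stmt, conn):
    # Nested finally: the connection is closed even if stmt.close() raises.
    try:
        pass  # the query would run here
    finally:
        try:
            stmt.close()
        finally:
            conn.close()
```

In both functions the exception still reaches the caller; the difference is only whether the connection goes back to the pool.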
Chapter 3 Stabilize Your System
Systems should also be tested over long runs without rebooting.
Think about how to stop cracks from propagating, for example by setting timeouts.
The more tightly coupled the architecture, the greater the chance that a coding error can propagate.
Fault => Error => Failure
Chapter 4 Stability Anti-patterns
Tight coupling ruins stability.
There are two extreme styles of integration: monolith vs. spiderweb.
A new architect focuses on components; an experienced one focuses on interconnections.
Oracle has a feature called dead connection detection.
Treat a response as data, and start parsing it only once you know you need it.
Stability patterns to make integration points safer: Circuit Breaker and Decoupling Middleware.
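Chain reactions in horizontally scaled architectures: when one machine fails, its load is redistributed to the remaining machines, which makes each of them more likely to fail too.
Keep as little in the user session as possible. For example, you can hold large session objects through weak references so that they can be reclaimed.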
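A small sketch of the weak-reference idea in Python, using the standard `weakref` module. `SearchResults`, `session_cache`, and `get_results` are hypothetical names for illustration. Note that Python drops a weak value as soon as nothing else references it, so the code must always be ready to rebuild the object.

```python
import weakref


class SearchResults:
    """Hypothetical large per-user object that can be recomputed on demand."""

    def __init__(self, rows):
        self.rows = rows


# Values disappear from this cache once nothing else references them.
session_cache = weakref.WeakValueDictionary()


def get_results(user_id, compute):
    results = session_cache.get(user_id)
    if results is None:
        results = compute()  # rebuild when the cached copy is gone
        session_cache[user_id] = results
    return results
```

The session only ever keeps a key; the expensive payload lives as long as some request is actually using it.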
Move data out of the current process's memory, for example into Redis.
Memory speed is an important factor: compare registers, cache, local memory, disk, remote memory, and so on.
Think about sockets too, both open and closed.
Pay attention to user behavior patterns: valuable users, accidental users, unwanted users.
Pay attention to blocked threads and synchronized functions.
Avoid deadlocks using timeouts.
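One way to apply a timeout to locking, sketched with Python's `threading.Lock`. The function name `with_both_locks` and the timeout values are assumptions for illustration. A thread that cannot get a lock in time backs off instead of waiting forever.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()


def with_both_locks(first, second, timeout=0.5):
    """Take two locks, giving up instead of waiting forever."""
    if not first.acquire(timeout=timeout):
        return False
    try:
        if not second.acquire(timeout=timeout):
            return False  # back off; the caller may retry or report an error
        try:
            return True  # the protected work would happen here
        finally:
            second.release()
    finally:
        first.release()
```

A `False` result tells the caller the work did not happen, so it can retry later or fail fast.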
Shared-nothing architecture is the ideal case for horizontal scaling; it is the most scalable architecture.
Prepare for the expected load by using autoscaling.
Point-to-point communication can be dangerous as the system grows. You can avoid it by grouping instances into farms with load balancers between them, or by using broadcast, publish/subscribe, or messaging patterns.
Shared common services can become a bottleneck.
To avoid a dogpile, spread the start of demand randomly over time instead of starting everything at once.
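A sketch of randomized scheduling, assuming each client computes its own delay. The helper name `jittered_delay` and its parameters are hypothetical.

```python
import random


def jittered_delay(base_seconds, spread_seconds, rng=random):
    """Return the base interval plus a random offset in [0, spread_seconds]."""
    return base_seconds + rng.uniform(0, spread_seconds)
```

For example, if every instance refreshes its cache after `jittered_delay(3600, 300)` seconds, the refreshes spread over five minutes instead of all landing in the same second.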
If the system has an automated observer, the observer should distinguish between the true state of the system and the state it can currently see. This helps prevent automated services from crashing the system.
Always specify in the request the maximum number of rows to retrieve.
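A minimal sketch of a bounded query using Python's built-in `sqlite3` module. The `orders` table, the `fetch_orders` helper, and the cap of 100 rows are assumptions for illustration.

```python
import sqlite3

MAX_ROWS = 100  # assumed cap; pick one that fits the caller


def fetch_orders(conn, customer_id, limit=MAX_ROWS):
    """Fetch at most `limit` rows, no matter how many match."""
    limit = min(limit, MAX_ROWS)  # callers cannot raise the ceiling
    cur = conn.execute(
        "SELECT id FROM orders WHERE customer_id = ? ORDER BY id LIMIT ?",
        (customer_id, limit),
    )
    return [row[0] for row in cur.fetchall()]
```

The limit is enforced in the query itself, so the database never sends more rows than the caller is prepared to hold.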
List of anti-patterns:
- Integration Points
- Chain Reactions
- Cascading Failures
- Blocked Threads
- Self-Denial Attacks
- Scaling Effects
- Unbalanced Capacities
- Force Multiplier
- Slow Responses
- Unbounded Result Sets
Chapter 5 Stability Patterns
List of stability patterns:
- Circuit Breaker
- Steady State
- Fail Fast
- Let It Crash
- Test Harnesses
- Decoupling Middleware
- Shed Load
- Create Back Pressure
Well-placed timeouts help with fault isolation.
A circuit breaker checks its state before each operation: if the operation is allowed (closed state), it executes it; otherwise (open state) it does nothing. A closed circuit breaker counts failed operations and switches to open once they pass a threshold.
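A simplified sketch of such a breaker in Python. The class and parameter names are hypothetical, and two details are assumptions beyond these notes: an open breaker raises an error instead of silently doing nothing, and after `reset_after` seconds it lets one trial call through (the half-open state) so that it can recover.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is refused because the breaker is open."""


class CircuitBreaker:
    def __init__(self, threshold=3, reset_after=30.0, clock=time.monotonic):
        self.threshold = threshold      # failures before opening
        self.reset_after = reset_after  # seconds to stay open
        self.clock = clock
        self.failures = 0
        self.opened_at = None           # None means closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit is open")
            # Otherwise half-open: let one trial call through.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the breaker
        return result
```

Wrapping each integration point in its own breaker, for example `breaker.call(lambda: fetch_inventory())` with a hypothetical `fetch_inventory`, keeps one failing dependency from tying up every caller.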
In a ship, bulkheads are partitions that, when sealed, divide the ship into separate, watertight compartments. You can partition your system in the same way.
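One possible way to partition capacity in code, sketched with a semaphore per consumer. The `Bulkhead` class and the partition names are hypothetical; real systems often partition at the level of thread pools, processes, or servers instead.

```python
import threading


class Bulkhead:
    """Caps concurrent calls for one consumer so it cannot starve the others."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, operation):
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("bulkhead full")
        try:
            return operation()
        finally:
            self._slots.release()


# Hypothetical partitions: each consumer gets its own compartment.
bulkheads = {"checkout": Bulkhead(2), "reports": Bulkhead(1)}
```

If the "reports" compartment floods, its calls are refused while "checkout" keeps its own slots.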
Steady State: data purging, log files, in-memory caching.
Fail Fast: an immediate failure is better than a slow response.
Let It Crash (Akka): limited granularity, fast replacement, supervision, reintegration.
Handshaking lets a server reject incoming work when it is already at full load.
Decoupling Middleware: choosing it is an irreversible decision.
Shed Load: refuse new requests and signal that the system is overloaded.
Create Back Pressure: block producers from adding new items when the queue is full.
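Back pressure can be sketched with a bounded queue from Python's standard library. The queue size, the `submit` helper, and the timeout are assumptions for illustration.

```python
import queue

work = queue.Queue(maxsize=2)  # bounded: the limit is what creates the pressure


def submit(item, timeout=0.1):
    """Block the producer while the queue is full; give up after `timeout`."""
    try:
        work.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False
```

A producer that gets `False` knows the consumers are behind and can slow down or shed the request, instead of letting the queue grow without bound.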
Governor: limits the speed of automated actions.
Part II — Design for Production
Chapter 6 Case Study: Phenomenal Cosmic Powers, Itty-Bitty Living Space
What follows explains the design-for-production principles.
Chapter 7 Foundations
Concerns and levels of responsibility:
- Operations – Security, availability, capacity, status, communication
- Control Plane – System monitoring, deployment, anomaly detection, features
- Interconnect – Routing, load balancing, failover, traffic management
- Instances – Services, processes, components, instance monitoring
- Foundation – Hardware, VMs, IP addresses, physical network
One machine can have several network interfaces and, as a result, several names.
A physical host's resources are typically oversubscribed by the VMs running on it.
The system clock may be unstable while a VM migrates from one host to another.
Some notes on designing containerized applications: containers have no identity, startup and shutdown should be quick, networking should be externalized, and so on.
The 12-Factor App
- Codebase – Track one codebase in revision control. Deploy the same build to every environment.
- Dependencies – Explicitly declare and isolate dependencies.
- Config – Store config in the environment.
- Backing services – Treat backing services as attached resources.
- Build, release, run – Strictly separate build and run stages.
- Processes – Execute the app as one or more stateless processes.
- Port binding – Export services via port binding.
- Concurrency – Scale out via the process model.
- Disposability – Maximize robustness with fast startup and graceful shutdown.
- Dev/prod parity – Keep development, staging, and production as similar as possible.
- Logs – Treat logs as event streams.
- Admin processes – Run admin/management tasks as one-off processes.
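As an illustration of the Config factor above, a minimal sketch that reads settings from environment variables. The variable names `DATABASE_URL`, `LOG_LEVEL`, and `PORT`, and their defaults, are hypothetical.

```python
import os


def load_config(env=os.environ):
    """Read settings from the environment, with defaults for local runs."""
    return {
        "database_url": env.get("DATABASE_URL", "sqlite:///local.db"),
        "log_level": env.get("LOG_LEVEL", "WARN"),
        "port": int(env.get("PORT", "8080")),
    }
```

The same build can then run in every environment; only the variables around it change.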
Chapter 8 Processes on Machines
Code, config, and connection.
Be careful with log messages. The build process should set the log level to WARN automatically so that debug messages do not appear in production.
Chapter 9 Interconnect
DNS for service discovery
Load balancing: software load balancing (reverse proxy), hardware load balancing (F5), health checks, stickiness (repeated requests, stateful), partitioning request types (content-based routing).
"Too busy, try later"; residence time.
Migratory Virtual IP Addresses
Chapter 10 Control Plane
- Explain what happened
- Commit to improvement
The goal of the platform team is to enable its customers.
Provisioning and Deployment Services – pull vs push
List of services that might be needed:
- Log collection and search
- Metrics collection and visualization
- Configuration service
- Instance placement
- Instance and system visualization
- IP, overlay network, firewall, and route management
- Alerting and notification
Chapter 11 Security
Part III Deliver Your System
Chapter 12 Case Study: Waiting for Godot
Chapter 13 Design for Deployment
Key concerns: automation, orchestration, and zero-downtime deployment.
An ideal deployment tool compares the current state with the desired state and makes the changes needed to bring them in line.
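A toy sketch of that comparison, assuming both states are plain mappings from service name to version. The `plan` function and the action names are hypothetical, not taken from any real deployment tool.

```python
def plan(current, desired):
    """Return the actions that turn `current` into `desired`.

    Both arguments map a service name to its deployed version.
    """
    actions = []
    for name, version in sorted(desired.items()):
        if name not in current:
            actions.append(("install", name, version))
        elif current[name] != version:
            actions.append(("upgrade", name, version))
    for name in sorted(set(current) - set(desired)):
        actions.append(("remove", name, current[name]))
    return actions
```

When the two states already match, the plan is empty, so running the tool again changes nothing.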
Zero downtime; smaller, more frequent deployments.
Chapter 14 Handling Versions
Compatible vs incompatible API changes.
Part IV Solve Systemic Problems
Chapter 15 Case Study: Trampled by Your Own Customers
Chapter 16 Adaptation
Trade off efficiency for flexibility.
Evolutionary Architecture: Microservices, Microkernel and plugins, Event-based.
Six modular operators: Splitting, Substituting, Augmenting and Excluding, Inversion, Porting.
Create options for the future.
Messages, Events, and Commands
- Event notification
- Event-carried state transfer
- Event sourcing
- Command-query responsibility segregation (CQRS)
Treat messages as data rather than objects to support schema evolution.
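A small sketch of a reader that treats a message as data. The message shape and the `currency` field are hypothetical. The reader takes only the fields it needs, supplies a default for a field that older senders omit, and ignores fields it does not know.

```python
import json


def read_order(raw):
    """Parse an order message, tolerating unknown and missing optional fields."""
    msg = json.loads(raw)
    return {
        "id": msg["id"],                         # required in every version
        "currency": msg.get("currency", "USD"),  # hypothetical field added later
    }
```

Because nothing is deserialized into a fixed class, old and new message versions can flow through the same reader.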
There is no such thing as a natural data model. It’s important to make deliberate choices about when to use relational, document, graph, key-value, or temporal databases.
Chapter 17 Chaos Engineering
- Limit of capacity
- Limit of safety
- Limit of economy
William Kent. Data and Reality. 1st Books, Bloomington, IL, 1998.
Neal Ford, Rebecca Parsons, and Pat Kua. Building Evolutionary Architectures. O’Reilly & Associates, Inc., Sebastopol, CA, 2017.