One way I like to deal with many of the issues that come with designing a microservice system is to use a service chassis. Some might argue that microservices should not share code, but I find that in practice this is unsustainable and leads to too much inconsistency between services, vastly increasing the maintenance burden. So when I expect to develop all or at least several services in the same language, I start by creating a chassis.

What is a service chassis?

A service chassis is a reusable component (or set of components) that encapsulates how a service should interoperate with its platform and dependencies, and provides shared logic used when implementing a service. In general it should provide the following generic features:

  • Host HTTP endpoints
  • Handle inter-service messaging
  • Output metrics and logs to an observability platform
  • Store and retrieve data
  • Handle scaling and concurrency
  • Emit data changes to a Datasink for reporting tools
  • Provide abstractions over frameworks and 3rd-party packages/libraries
  • Provide centralised configuration of 3rd-party packages/libraries
  • Define common exceptions
  • Implement the circuit-breaker pattern to wrap calls to 3rd-party services

Advantages of a service chassis

  • Standardisation: A service chassis provides a standardised way to build and deploy microservices. It ensures consistency across different services in terms of how they handle cross-cutting concerns like logging, security, and communication.
  • Faster Development: By handling common tasks and infrastructure concerns, a service chassis allows developers to focus on the unique business logic of each microservice. This results in faster development and reduced time to market.
  • Enhanced Security: A service chassis can provide a unified approach to security, ensuring that all microservices adhere to the same security protocols and standards.
  • Simplified Maintenance: With a chassis handling common infrastructure elements, maintenance becomes more straightforward. Updates to shared components can be rolled out across all services seamlessly.
  • Easier Integration and Testing: The consistency offered by a service chassis simplifies the integration and testing of different microservices, as they share a common framework for communication and data exchange.
  • Technology Agnosticism: A well-designed service chassis can be technology-agnostic, allowing developers to use different technologies for different services as appropriate.

By using a shared package to take responsibility for the common cross-cutting concerns that every microservice must deal with, we can ensure that all microservices in a system conform to a predictable set of patterns and behaviours. It also ensures that a microservice’s repository only contains code relevant to the service’s business domain or bounded context. All of this contributes to more robust and maintainable software systems.

Chassis responsibilities

Logging

A microservice chassis should incorporate a comprehensive logging framework for use by all implemented services. This framework should structure logs as key/value pairs, optimising them for straightforward ingestion into various log aggregation tools. A best practice is to direct logs to the console. This approach ensures the logs’ availability to developers during service development and their accessibility in containerised environments managed by platforms like Docker or Kubernetes.

Console-based logging eliminates reliance on specific logging systems, thereby simplifying the substitution of different logging solutions. To facilitate the transition of logs from the console to aggregation systems, tools like Filebeat or Logstash can be employed. These tools enable efficient shipping of logs to various log aggregation systems, including Elasticsearch, Splunk, and Datadog.

In addition to basic logging, the chassis should support structured and contextual logging, enabling more granular and meaningful log analysis. It should allow for varying log levels (e.g., DEBUG, INFO, WARN, ERROR) to control the verbosity and detail in different environments or situations. This flexibility is crucial for effective debugging and monitoring.

Furthermore, integrating correlation IDs into the logging process is essential. These IDs help in tracing requests across various microservices, significantly aiding in debugging complex workflows that span multiple service boundaries. This traceability is particularly vital in distributed systems where understanding the interaction between different services is key to identifying and resolving issues.
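To make this concrete, here is a minimal sketch of what a chassis logging module might look like in Python, using only the standard library. The names (`get_logger`, the `context` extra field) are illustrative rather than taken from any particular framework; the point is structured key/value console output with a correlation ID picked up from the current request context.

```python
import json
import logging
import sys
import uuid
from contextvars import ContextVar

# Correlation ID propagated per request/message; set by HTTP or messaging middleware.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="")

class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on the console."""
    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        # Merge structured key/value context supplied via `extra={"context": {...}}`.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)

def get_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)  # console only; shipping is Filebeat's job
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger

# Usage inside a service built on the chassis:
correlation_id.set(str(uuid.uuid4()))  # normally done by incoming-request middleware
log = get_logger("orders-service")
log.info("order created", extra={"context": {"order_id": 42}})
```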

Finally, ensuring that the logging framework is lightweight and has minimal impact on the performance of microservices is crucial. Overhead from logging operations should not adversely affect the response times or resource usage of the services. This balance is necessary to maintain the efficiency and responsiveness of the microservices architecture.

Metrics Monitoring and Management

Effective metrics monitoring is critical in microservices architecture, and a robust chassis should facilitate this seamlessly. Primarily, the chassis ought to handle the collection and exposure of key operational metrics. These include, but are not limited to:

  • Traffic Metrics: Tracking all incoming and outgoing network requests.
  • Dependency Metrics: Monitoring interactions with external dependencies like databases, message queues, and caches, to identify performance bottlenecks or failures.

For the metrics format, leveraging industry standards like the Prometheus Exposition Format is advisable due to its widespread adoption, especially in cloud-native environments. Alternatively, OpenTelemetry offers a vendor-neutral approach, aligning with modern interoperability standards.

A crucial aspect of this setup is the ease of access and integration:

  • Endpoint Exposure: Ensuring metrics are accessible via a standardised endpoint (e.g., /metrics), simplifying parsing and ingestion processes for monitoring tools.
  • Custom Metrics: The chassis should not only cater to generic operational metrics but also allow developers to inject custom, business-specific metrics. This flexibility enables more nuanced monitoring tailored to the service’s unique operational aspects.

By adhering to these practices, the chassis ensures a unified, efficient approach to metrics monitoring, greatly aiding in maintaining service health and performance transparency.
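As an illustration, a chassis built in Python might wrap the prometheus_client library along these lines. The metric names, labels, and port are assumptions made for the sketch; the chassis HTTP layer would call `observe_request` around every request so services get traffic metrics for free.

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

# Generic traffic metrics every service gets for free from the chassis.
REQUESTS = Counter(
    "http_requests_total", "Total HTTP requests", ["method", "path", "status"]
)
REQUEST_LATENCY = Histogram(
    "http_request_duration_seconds", "HTTP request latency", ["method", "path"]
)

def observe_request(method: str, path: str, status: int, started: float) -> None:
    """Called by the chassis HTTP layer around every request."""
    REQUESTS.labels(method=method, path=path, status=str(status)).inc()
    REQUEST_LATENCY.labels(method=method, path=path).observe(time.time() - started)

if __name__ == "__main__":
    # Expose everything on a standard /metrics endpoint (Prometheus exposition format).
    start_http_server(9102)  # port is illustrative
    observe_request("GET", "/orders", 200, time.time() - 0.05)
    time.sleep(60)
```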

Caching

Incorporating caching functionality within the microservice chassis offers significant benefits. A well-implemented caching layer in the chassis not only enhances performance but also provides a consistent caching strategy across different services. This unified approach ensures that any caching technology, whether it’s Redis, Memcached, or Amazon ElastiCache, can be seamlessly integrated and, if needed, replaced without impacting the service logic.

To maximise the effectiveness of the caching layer, the chassis should support configurable caching policies. These policies can dictate the lifespan of cache entries, eviction strategies, and cache invalidation mechanisms, tailored to the varying needs of different services. For instance, a time-to-live (TTL) policy might be suitable for data that changes infrequently, whereas a more dynamic eviction strategy could be applied to rapidly changing data.

Additionally, the chassis should provide hooks or callbacks for services to influence the caching behaviour. This allows services to implement custom logic for cache population, updating, or invalidation, based on their specific domain logic. By enabling this level of control, services can optimise cache usage, ensuring that the cached data is always relevant and up-to-date.
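A rough sketch of what such a backend-agnostic caching layer could look like in Python is shown below. The `Cache` interface and the read-through `cached` helper are illustrative; a Redis or Memcached adapter would implement the same interface.

```python
import time
from abc import ABC, abstractmethod
from typing import Any, Callable, Optional

class Cache(ABC):
    """Backend-agnostic cache interface exposed by the chassis."""
    @abstractmethod
    def get(self, key: str) -> Optional[Any]: ...
    @abstractmethod
    def set(self, key: str, value: Any, ttl_seconds: int) -> None: ...
    @abstractmethod
    def invalidate(self, key: str) -> None: ...

class InMemoryCache(Cache):
    """Simple TTL cache; a RedisCache could implement the same interface."""
    def __init__(self) -> None:
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None or entry[0] < time.time():
            self._store.pop(key, None)  # expired or missing
            return None
        return entry[1]

    def set(self, key: str, value: Any, ttl_seconds: int) -> None:
        self._store[key] = (time.time() + ttl_seconds, value)

    def invalidate(self, key: str) -> None:
        self._store.pop(key, None)

def cached(cache: Cache, key: str, ttl_seconds: int, load: Callable[[], Any]) -> Any:
    """Read-through helper: services pass a `load` callback to populate misses."""
    value = cache.get(key)
    if value is None:
        value = load()
        cache.set(key, value, ttl_seconds)
    return value
```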

Moreover, it’s essential to incorporate monitoring and analytics capabilities into the caching layer. This would allow services to gain insights into cache hit rates, average retrieval times, and other key performance indicators. These metrics can be invaluable for fine-tuning caching strategies and identifying potential bottlenecks or inefficiencies.

Lastly, consider the implications of distributed caching in a microservices architecture, especially when dealing with a cluster of service instances. The chassis should facilitate the synchronisation of cache state across different instances, ensuring data consistency and reliability. This might involve implementing distributed caching solutions or ensuring that the chosen caching technology supports clustering natively.

Data access

A service chassis should provide a comprehensive data access layer that services can utilise for data persistence. This layer can be based on a versatile repository pattern, allowing services to define their own models for data interaction. Emphasising simplicity, this layer should offer an easy-to-use interface for performing CRUD (Create, Read, Update, Delete) operations.

To cater to various storage needs, the chassis can offer multiple implementations of this interface. This approach ensures flexibility and choice for developers, allowing them to select the storage type most suitable for their service’s domain. Consider including up to five implementations to cover a broad range of data storage paradigms:

  1. Relational Data Persistence: Ideal for services requiring structured data storage with relationships.
  2. Document Data Persistence: Suitable for services that deal with semi-structured data, like JSON or XML documents.
  3. Graph Data Persistence: Best for services needing to model and query complex relationships between data points.
  4. Wide-Column Data Persistence: Efficient for services handling large volumes of data with variable attributes.
  5. Spatial Data Persistence: Essential for services that manage geospatial data.

Developers can choose the desired data store by including a specific library in their project or setting an environment variable. The aim is to ensure that the service code remains consistent, irrespective of the database technology, allowing for seamless transitions between different database systems.
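As a sketch, the repository interface the chassis exposes might look like the following Python, with an in-memory implementation standing in for whichever store a service selects. The interface and class names are illustrative.

```python
import uuid
from abc import ABC, abstractmethod
from typing import Dict, Generic, Iterable, Optional, TypeVar

T = TypeVar("T")

class Repository(ABC, Generic[T]):
    """CRUD interface services depend on, regardless of the underlying store."""
    @abstractmethod
    def add(self, item: T) -> str: ...
    @abstractmethod
    def get(self, item_id: str) -> Optional[T]: ...
    @abstractmethod
    def update(self, item_id: str, item: T) -> None: ...
    @abstractmethod
    def delete(self, item_id: str) -> None: ...
    @abstractmethod
    def list(self) -> Iterable[T]: ...

class InMemoryRepository(Repository[T]):
    """Trivial implementation; relational, document or graph backends would swap in here."""
    def __init__(self) -> None:
        self._items: Dict[str, T] = {}

    def add(self, item: T) -> str:
        item_id = str(uuid.uuid4())
        self._items[item_id] = item
        return item_id

    def get(self, item_id: str) -> Optional[T]:
        return self._items.get(item_id)

    def update(self, item_id: str, item: T) -> None:
        self._items[item_id] = item

    def delete(self, item_id: str) -> None:
        self._items.pop(item_id, None)

    def list(self) -> Iterable[T]:
        return list(self._items.values())
```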

Concurrency

In microservice architectures, running multiple instances of a service is common, which can introduce concurrency challenges. For instance, keeping HTTP connections open with clients, or using patterns like MVVM over websockets, often results in a single service instance communicating directly with a particular client. If another instance updates data, the instance holding the open connection may not receive that update. A solution to this is employing shared resources such as caches, databases, or private message queues to ensure synchronisation across service instances.

Moreover, multiple instances sharing a database can lead to concurrency issues like race conditions, deadlocks, and lost updates. To address these, the service chassis should implement robust concurrency control mechanisms. This includes utilising database transactions and employing strategies like pessimistic or optimistic locking, which help manage concurrent access to shared resources effectively. By incorporating these features, the chassis ensures that services handle concurrent operations reliably, maintaining data integrity and consistency across the system.
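A minimal illustration of optimistic locking, using an in-memory stand-in for a table with a version column, might look like the following. In a real chassis the version check would be pushed into the database (e.g. an UPDATE ... WHERE version = ? statement) rather than done in application code.

```python
from dataclasses import dataclass

class StaleUpdateError(Exception):
    """Raised when another instance has modified the record since it was read."""

@dataclass
class VersionedRecord:
    data: dict
    version: int = 0

class OptimisticStore:
    """In-memory stand-in for a table with a `version` column."""
    def __init__(self) -> None:
        self._records: dict[str, VersionedRecord] = {}

    def read(self, key: str) -> VersionedRecord:
        return self._records[key]

    def write(self, key: str, data: dict, expected_version: int) -> None:
        current = self._records.get(key, VersionedRecord(data={}, version=0))
        if current.version != expected_version:
            # Equivalent to `UPDATE ... WHERE id = ? AND version = ?` matching zero rows.
            raise StaleUpdateError(f"{key}: expected v{expected_version}, found v{current.version}")
        self._records[key] = VersionedRecord(data=data, version=expected_version + 1)

# Usage: read, modify, write back with the version you read; retry on conflict.
store = OptimisticStore()
store.write("order-1", {"status": "new"}, expected_version=0)
record = store.read("order-1")
store.write("order-1", {**record.data, "status": "paid"}, expected_version=record.version)
```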

HTTP Server

In the realm of microservices, the provision and configuration of an HTTP server is a crucial component, ideally integrated into the service chassis. This integration accomplishes multiple objectives:

  1. Standardisation: By handling common configurations centrally within the chassis, such as error handling and responding with standard HTTP error codes, we avoid redundancy. Each microservice doesn’t need to individually tackle these common tasks.
  2. Predefined Endpoints: The chassis can define standard endpoints, such as /health, /metrics, and /readiness, which are essential for service monitoring and management. This ensures consistency across all services and facilitates easier monitoring and maintenance.
  3. Traffic Management: Wrapping all HTTP traffic with appropriate metrics and logging within the chassis provides a unified approach to monitoring and analysing service requests and responses. This not only aids in troubleshooting but also in optimising performance.
  4. Request Validation: By incorporating request validation at the chassis level, services can offload the initial processing of requests. This ensures that only well-formed requests are processed by individual services, thereby enhancing efficiency and security.
  5. Middleware Integration: The chassis can facilitate the integration of various middleware components, like CORS handling, request logging, and others. This provides a streamlined approach to enhancing the capabilities of each service without cluttering their individual codebases.

By incorporating these elements into the HTTP server functionality of the service chassis, microservices can operate more efficiently, securely, and consistently, allowing developers to focus on the unique business logic of each service.
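As a sketch of how the chassis might wire this up, assuming Flask purely for illustration (any framework would do): the chassis builds the app with the standard endpoints and middleware, and the service only registers its domain routes on top.

```python
import logging
import time
from flask import Flask, jsonify, request

def create_chassis_app(service_name: str) -> Flask:
    app = Flask(service_name)
    log = logging.getLogger(service_name)

    @app.get("/health")
    def health():
        # Predefined endpoint every service exposes for free.
        return jsonify(status="ok", service=service_name)

    @app.before_request
    def start_timer():
        request.environ["chassis.start"] = time.time()

    @app.after_request
    def log_request(response):
        # Unified traffic logging wrapped around every request.
        duration = time.time() - request.environ.get("chassis.start", time.time())
        log.info("request %s %s -> %s (%.3fs)", request.method, request.path,
                 response.status_code, duration)
        return response

    return app

# A service then adds only its domain endpoints:
app = create_chassis_app("orders-service")

@app.get("/orders/<order_id>")
def get_order(order_id: str):
    return jsonify(order_id=order_id, status="pending")

if __name__ == "__main__":
    app.run(port=8080)
```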

There are other HTTP concerns that could be handled by the chassis, but in practice I prefer to deal with these using API gateways such as Traefik, Nginx, and Kong:

  1. Security: Implementing security protocols, like SSL/TLS encryption, at the chassis level ensures that all communications are secure by default. This approach centralises security management, reducing the risk of individual services overlooking critical security measures.
  2. Load Balancing and Routing: The chassis can offer functionalities for load balancing and intelligent routing of requests.

Messaging

In a microservice architecture, handling messaging patterns consistently is crucial. The chassis plays a pivotal role in this process by managing the intricacies of publishing, subscribing, and validating messages. This arrangement allows service developers to concentrate solely on providing messages and defining message handlers, without getting entangled in the complexities of the messaging system.

To achieve this, the chassis should offer a robust abstraction layer over the messaging infrastructure. This layer ensures that microservices can interact with messaging systems without being tightly coupled to their specifics. Consequently, it becomes feasible to switch out the underlying messaging system without necessitating code changes in the microservices.
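A sketch of such an abstraction layer in Python might look like the following. The `MessageBus` interface and in-memory implementation are illustrative; Kafka, RabbitMQ or SQS adapters would implement the same interface behind the scenes.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]

class MessageBus(ABC):
    """Broker-agnostic interface; Kafka, RabbitMQ or SQS adapters implement it."""
    @abstractmethod
    def publish(self, topic: str, message: dict) -> None: ...
    @abstractmethod
    def subscribe(self, topic: str, handler: Handler) -> None: ...

class InMemoryBus(MessageBus):
    """Synchronous stand-in, handy for tests and local development."""
    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def publish(self, topic: str, message: dict) -> None:
        for handler in self._handlers[topic]:
            handler(message)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

# A service only registers handlers; the chassis owns the broker connection.
bus = InMemoryBus()
bus.subscribe("order.created", lambda msg: print("reserve stock for", msg["order_id"]))
bus.publish("order.created", {"order_id": 42})
```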

The chassis should also streamline message handling by standardising the acknowledgement and validation of messages. This standardisation ensures uniformity and reliability across all services, enhancing overall system resilience.

Furthermore, the chassis should include features that simplify the development process, such as pre-built templates for common messaging scenarios (for example, the Decorated Saga pattern) and utilities for debugging and monitoring message flows. These tools can significantly reduce development time and improve the effectiveness of the microservices in handling messaging.

Reporting

In microservice architectures, running reporting tools directly on services is considered bad practice. Instead, reports should be generated from a dedicated reporting database. Adhering to good microservice practice, each service should exclusively own its database, preventing external systems from accessing it directly. The challenge then lies in efficiently transferring data from individual microservices to a centralised reporting platform.

A practical solution is the implementation of a “Datasink” (as in heatsink, not “sync”) mechanism. Given that the chassis includes a data access layer, it can be equipped to duplicate data modifications to an alternate location. This can be achieved by serialising creations, deletions, or updates of data into JSON format and storing these in a communal area, such as a shared container volume or an AWS S3 bucket.
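To illustrate, the hook the data access layer calls might look something like this sketch. The directory path and field names are assumptions; in practice the events would land in an S3 bucket or shared volume rather than a local folder.

```python
import json
import time
import uuid
from pathlib import Path

# Shared drop zone for change events; illustrative path only. In production this
# would be a shared container volume or an S3 bucket.
DATASINK_DIR = Path("datasink-drop/orders-service")

def emit_change(entity: str, operation: str, payload: dict) -> None:
    """Called by the chassis data access layer after every create/update/delete."""
    event = {
        "entity": entity,
        "operation": operation,          # "created" | "updated" | "deleted"
        "payload": payload,
        "emitted_at": time.time(),
    }
    DATASINK_DIR.mkdir(parents=True, exist_ok=True)
    (DATASINK_DIR / f"{uuid.uuid4()}.json").write_text(json.dumps(event))

# e.g. after the repository persists an order:
emit_change("order", "created", {"order_id": 42, "status": "new", "total": 99.50})
```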

Subsequently, tools like Amazon Athena and Glue, or custom scripts, can be utilised to parse these JSON files. They reconstruct the data changes from various services into a database format suitable for reporting tools. This approach requires caution, as services should be free to update their internal data structures without impacting their external interfaces. Such changes would not be considered breaking changes in terms of service functionality, yet they can affect the data schema output.

To address this, the data parsing mechanism should be robust against schema changes. A simple method involves reading JSON files as-is and transferring the data to a wide-column store. In this store, existing fields are matched with JSON properties, with new columns created as needed. This allows for flexibility in accommodating data structure changes. Setting up alerts for new column creation can help data-focused teams to adapt to these changes efficiently.

Overall, this Datasink approach ensures that microservices maintain their autonomy and agility while providing a reliable and adaptable means of gathering data for comprehensive reporting.

Common Exceptions

Services utilising a microservice chassis primarily focus on domain-specific logic, leading to a scenario where many potential exceptions are rooted in the chassis itself. To effectively handle these exceptions, the chassis can define a suite of custom, common exceptions that are likely to occur due to the shared infrastructure or service integration issues. Examples include:

  • “RecordNotFound”: This exception could be thrown when a requested data record is not found in the database.
  • “UnableToProcessMessage”: This might occur in scenarios involving message parsing or handling failures in the messaging system.
  • “ServiceUnavailable”: Used when an external service or a critical internal component is not reachable or not responding.
  • “InvalidRequestException”: Thrown when incoming requests have invalid or malformed data.
  • “DependencyFailureException”: Used to encapsulate errors arising from failures in dependent services or systems, such as databases or caches.
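A sketch of what such a hierarchy could look like, with a base class that the chassis HTTP layer can map to status codes (the `http_status` attribute is an illustrative convention, not a standard):

```python
class ChassisError(Exception):
    """Base class for all chassis-defined exceptions."""
    http_status = 500  # default mapping used by the chassis HTTP error handler

class RecordNotFound(ChassisError):
    http_status = 404
    def __init__(self, entity: str, entity_id: str) -> None:
        super().__init__(f"{entity} '{entity_id}' not found")

class UnableToProcessMessage(ChassisError):
    pass

class ServiceUnavailable(ChassisError):
    http_status = 503

class InvalidRequestException(ChassisError):
    http_status = 400

class DependencyFailureException(ChassisError):
    http_status = 502
```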

Additionally, defining chassis-level exceptions for anticipated business logic errors can streamline error handling across different services. These exceptions, while generic, should be flexible enough to allow services to provide specific error details relevant to their domain. For example:

  • “BusinessValidationException”: This could be used when a service’s business logic validation fails.
  • “QuotaExceededException”: This exception can be thrown when a user or system exceeds a predefined usage or resource quota. This is particularly relevant in services where resource consumption, data usage, or API call limits are monitored and restricted. By implementing this at the chassis level, uniform handling of quota violations across different services can be ensured.

These defined exceptions should be accompanied by appropriate logging and metrics to ensure that they are trackable and can be analysed for patterns and root causes. Moreover, it’s crucial for the chassis to provide mechanisms that enable services to define and throw custom exceptions specific to their domain needs, maintaining flexibility while ensuring consistency in error handling across the microservice ecosystem.

Circuit Breaker

Microservices often rely on external third-party APIs for essential operations. While the integration of these APIs is typically highly specific to the domain of the service and thus not ideal for inclusion within the service chassis, managing the risks associated with these external calls is critical for maintaining service availability. To address this, the chassis can be equipped with a sophisticated “Circuit Breaker” pattern.

This pattern functions by monitoring the health and response patterns of third-party API calls. If a certain threshold of failures or timeouts is reached, the circuit breaker trips, temporarily halting further attempts to call the problematic API. This prevents a cascade of failures and excessive load on both the service and the third-party system, allowing for a controlled recovery and maintenance of overall service availability.

In practice, this could involve implementing a mechanism within the chassis that accepts endpoint details and request parameters from the microservice. The chassis would then manage these API interactions, utilising strategies like exponential backoff and retry logic to mitigate the risks associated with unreliable external systems. By integrating this functionality directly into the chassis, it offloads a significant burden from service developers, enabling them to concentrate on core business logic without compromising on resilience and reliability.

Furthermore, the circuit breaker’s operational parameters (like thresholds for tripping, recovery time, and logging details of failed interactions) can be made configurable, offering adaptability to different use cases and external dependencies. This ensures a balance between robustness and flexibility, tailored to the specific needs of each microservice in the ecosystem.
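A minimal, illustrative circuit breaker might look like the following; the failure threshold and recovery window shown are exactly the kind of parameters that would be made configurable.

```python
import time
from typing import Any, Callable

class CircuitOpenError(Exception):
    """Raised when calls are rejected because the circuit is open."""

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_seconds: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_seconds = recovery_seconds
        self._failures = 0
        self._opened_at: float | None = None

    def call(self, fn: Callable[[], Any]) -> Any:
        if self._opened_at is not None:
            if time.time() - self._opened_at < self.recovery_seconds:
                raise CircuitOpenError("third-party call suppressed; circuit is open")
            self._opened_at = None  # half-open: allow one trial call through
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = time.time()  # trip the breaker
            raise
        self._failures = 0  # success resets the failure count
        return result

# Usage: wrap every third-party API call supplied by the service.
breaker = CircuitBreaker(failure_threshold=3, recovery_seconds=10)
# breaker.call(lambda: requests.get("https://api.example.com/rates", timeout=2))
```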

Automated Documentation

While the microservice chassis handles various aspects like logging, metrics, caching, etc., it’s crucial to ensure that both the chassis and the services implementing it are well-documented. This not only aids in maintainability but also enhances the understanding for developers working on or with these services. To achieve this, automated documentation should be a fundamental feature of the service chassis.

For the Chassis Itself:

  1. Self-Documenting Code: The chassis should be developed with self-documenting code practices. This means using clear, descriptive naming conventions, and including comments and docstrings where necessary. This makes the chassis codebase more understandable and accessible.
  2. Automated API Documentation: Even though direct service-to-service API calls are discouraged to prevent tight coupling, the chassis itself may expose certain endpoints (like /health, /metrics, etc.). These should be automatically documented. Tools like Swagger or Redoc can be integrated into the chassis to auto-generate documentation for these endpoints. This documentation can include details about the endpoint’s purpose, input parameters, and the structure of the response.

For Implementing Services:

  1. Endpoint Documentation Generation: For services implementing the chassis, it’s essential to have their endpoints automatically documented. This can be achieved by reusing tools integrated into the chassis (tools like Swagger or Redoc). This approach ensures consistency in how endpoints are documented across different services.
  2. Service-Level Documentation: Apart from endpoint documentation, the chassis can facilitate the generation of broader service-level documentation. This includes aspects like data flow diagrams, architecture overviews, and dependency graphs. Tools like Doxygen or Sphinx can be integrated to create comprehensive documentation sets that update automatically as the service evolves.
  3. Change Management Documentation: The chassis could include tools for automatically tracking and documenting changes in services. This can involve integrating with version control systems to generate change logs and release notes, aiding in understanding the evolution of each service.

By embedding these automated documentation features into the chassis, it ensures that both the chassis and the services built upon it are consistently and comprehensively documented. This approach not only saves time but also enhances the clarity and maintainability of the microservice ecosystem, making it more resilient to the complexities that often arise with distributed systems.

Security and Compliance

The final concern that can be dealt with in the chassis is security. Although some of this burden can be delegated to platform components such as an API gateway or service mesh, it is still critical for the chassis to embed fundamental security and compliance measures. These aspects ensure that each service adheres to essential security standards and legal regulations, thereby reducing vulnerabilities and ensuring data integrity.

  1. Authentication and Authorisation: Implement robust authentication and authorisation mechanisms. This could involve integrating with identity providers and ensuring that each service can verify and understand tokens and control access to resources.
  2. Data Encryption: The chassis should enforce encryption for data at rest and in transit. This can be achieved by implementing encryption libraries and ensuring that all data, whether stored in databases or transmitted over networks, is encrypted using industry-standard protocols like TLS for in-transit data and AES for data at rest.
  3. Input Validation: To prevent common security vulnerabilities such as SQL injection or cross-site scripting (XSS), the chassis should include robust input validation mechanisms. These ensure that all incoming data is properly sanitised and validated before being processed or stored.
  4. Audit Logging: Beyond standard logging, the chassis should implement audit logging capabilities. This involves recording who accessed what data and when, which is crucial for tracing activities in the event of a security breach and for compliance with regulations like GDPR and HIPAA.
  5. Vulnerability Scanning and Patch Management: Integrate tools for regular vulnerability scanning into the chassis. This will help in identifying and addressing security flaws promptly. Additionally, implement a patch management process to ensure all components of the microservice ecosystem are up-to-date with the latest security patches.
  6. Rate Limiting and Throttling: To protect against DoS (Denial of Service) attacks, the chassis should incorporate rate limiting and throttling mechanisms. This limits the number of requests a user can make in a given timeframe, thus preventing overloading of services (a minimal sketch follows this list).
  7. Compliance Checks: Embed compliance check tools or modules that ensure services adhere to relevant industry standards and legal requirements. This is especially important for services dealing with sensitive data, ensuring they comply with standards like PCI-DSS for payment processing or HIPAA for healthcare information.
  8. Error Handling: Implement secure error handling that avoids exposing sensitive information in error messages. This includes customising error responses to ensure they do not reveal internal system details that could be exploited by malicious actors.

By incorporating these security and compliance measures into the chassis, you create a more robust and secure foundation for all services that are built on it. This approach not only enhances overall system security but also streamlines compliance processes, making it easier for each microservice to adhere to necessary standards and best practice.

Conclusion

In summary, the microservice chassis provides a foundational framework for developing and deploying microservices, addressing many critical concerns. By encapsulating these aspects, the chassis simplifies the development process, allowing service developers to concentrate on business logic without being bogged down by underlying infrastructure and cross-cutting concerns.

However, it’s essential to recognise that not all responsibilities need to be embedded within the microservice chassis. Certain aspects can be more effectively managed by other architectural components:

  1. API Gateway: An API gateway is a vital component in a microservices architecture, ideally situated to handle responsibilities such as authentication, TLS termination, and rate limiting. By centralising these functions, the API gateway provides a unified layer of control, enhancing security and reducing the complexity within each microservice.
  2. Service Mesh: For managing the secure communication between services, a service mesh is an invaluable tool. It can handle the encryption of data in transit, ensuring that communications within the microservice ecosystem are secure. This offloads a significant security burden from the services and the chassis, allowing services to communicate over a trusted network without the need to implement their own security measures.

In conclusion, while the microservice chassis provides a comprehensive and robust framework for developing microservices, it is part of a larger ecosystem of tools and components. Integrating an API gateway and service mesh into the architecture can further streamline operations, enhance security, and ensure that each component of the system is utilised according to its strengths. This holistic approach to architecture not only optimises performance but also aligns with best practices in microservices development, resulting in a more resilient, scalable, and maintainable system.

