Each micro-service is Docker-based (dependencies and communication ports are orchestrated via docker-compose) and is 100% Rust.
For this exercise, at the time of design/writing, I believe there will be four core micro-services (plus a few future/optional ones listed below).
- Google OAuth2 - an OAuth2 relay service to validate/verify that a user is who they claim to be. This service is responsible for validating the user's identity and then generating a JWT for the user. This is mainly useful for session recovery on disconnect and reconnect: we first validate that the reconnecting user really is who they claim to be, so when the client offers its last-used session token, we can assume it's not a MITM spoof. Just keep in mind that the session token being offered may be from a session that occurred 3 months (ok, maybe 3 days) ago!
- Generator - this service simply generates the starting sudoku playfield based on difficulty. As a very basic generator, all it will do is auto-generate a solved grid and then (randomly?) remove cells. From what I understand, a sudoku puzzle can only have ONE UNIQUE solution if the starting clues contain at least 8 of the 9 digits. Mathematically, if 2 digits (out of the 9) are both absent from the starting clues, their positions are interchangeable, so there would be at least two solutions. In any case, there will be more details on its own README page.
- Solver/Evaluator - this service will be able to solve a given sudoku playfield, and to evaluate whether a given playfield is solved. There are a few approaches, such as brute-force (backtracking) as well as perhaps ML agents that solve it.
- Game - this is the main game service and the main entry point for the client. It communicates with the Generator and Solver/Evaluator services to generate a new game and to evaluate whether the game is solved. It can also save and load game state. Whether requests arrive via Google Cloud Load Balancer or are routed via Nginx/IIS/Apache, this is the single entry point for the client, and the only service that is synchronous with the client.
- Trainer - this is a micro-service that probably will never be available on a production server; it basically handles all the game training for the Solver to use.
- Leaderboard - this is a feature that may be added in the future, but it's here to emphasize that because the system will (hopefully, IMHO) be designed correctly, adding a new micro-service will be as easy as adding a new docker-compose service and a new Rust crate.
- Lobby/Chat - yet another micro-service that should be trivial to add in the future.
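The unique-solution constraint on the Generator above can be sanity-checked cheaply: if the clues contain fewer than 8 distinct digits, the two (or more) missing digits are interchangeable and the puzzle cannot have a unique solution. A minimal sketch of that necessary (not sufficient) check; the flat `[u8; 81]` representation and the `has_enough_distinct_digits` name are my assumptions, not the repo's actual API:

```rust
/// Returns true if the clue grid contains at least 8 of the 9 digits.
/// 0 represents an empty cell. A grid failing this check cannot have a
/// unique solution: the two absent digits can be swapped everywhere in
/// any solution to produce a second valid solution.
fn has_enough_distinct_digits(grid: &[u8; 81]) -> bool {
    let mut seen = [false; 10];
    for &cell in grid.iter() {
        if (1..=9).contains(&cell) {
            seen[cell as usize] = true;
        }
    }
    seen[1..=9].iter().filter(|&&s| s).count() >= 8
}

fn main() {
    // Clues using only digits 1..=7: uniqueness is impossible.
    let mut grid = [0u8; 81];
    for (i, d) in (1..=7).enumerate() {
        grid[i] = d as u8;
    }
    assert!(!has_enough_distinct_digits(&grid));

    // Add an 8th distinct digit; the necessary condition now holds.
    grid[9] = 8;
    assert!(has_enough_distinct_digits(&grid));
    println!("ok");
}
```

Passing this check does not prove uniqueness, of course; the Generator would still need a real uniqueness test (e.g. asking the Solver whether a second solution exists) after removing cells.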
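For the Solver/Evaluator, the "is it solved?" half is straightforward: every row, column, and 3x3 box must contain each digit 1-9 exactly once. A rough sketch of such an evaluator (again assuming a flat `[u8; 81]` grid; this is illustration, not the service's actual interface):

```rust
/// Checks that a group of 9 cells contains the digits 1..=9 exactly once.
fn group_is_valid(cells: impl Iterator<Item = u8>) -> bool {
    let mut seen = [false; 10];
    for d in cells {
        if d < 1 || d > 9 || seen[d as usize] {
            return false; // empty cell, out-of-range, or duplicate
        }
        seen[d as usize] = true;
    }
    seen[1..=9].iter().all(|&s| s)
}

/// Returns true if the 81-cell grid is a fully solved sudoku.
fn is_solved(grid: &[u8; 81]) -> bool {
    for i in 0..9 {
        // Row i, column i, and box i (boxes numbered left-to-right, top-to-bottom).
        let row = (0..9).map(|c| grid[i * 9 + c]);
        let col = (0..9).map(|r| grid[r * 9 + i]);
        let (br, bc) = ((i / 3) * 3, (i % 3) * 3);
        let boxed = (0..9).map(|k| grid[(br + k / 3) * 9 + bc + k % 3]);
        if !group_is_valid(row) || !group_is_valid(col) || !group_is_valid(boxed) {
            return false;
        }
    }
    true
}

fn main() {
    // Build a known-valid solved grid via the classic cyclic-shift pattern.
    let mut grid = [0u8; 81];
    for r in 0..9 {
        for c in 0..9 {
            grid[r * 9 + c] = (((r * 3 + r / 3 + c) % 9) + 1) as u8;
        }
    }
    assert!(is_solved(&grid));

    grid[0] = grid[1]; // introduce a duplicate in row 0
    assert!(!is_solved(&grid));
    println!("ok");
}
```

The brute-force solving side would be a backtracking search layered on a partial-validity version of the same check.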
In a nutshell, the Game service is the only service that may have an IP address and/or hostname exposed to the WAN (because I'll be using gRPC for the client rather than another proxy such as Nginx). Note that I do not have to write my own proxy-router/load-balancer, nor do I need to set up a load-balancing or routing service (including Nginx), because from the client's point of view it connects to ONE single endpoint, and that endpoint responds from the same IP address back to the requesting IP:port pair. The Game micro-service likewise assumes the connected host:port is where it'll respond. Because of this, if I do not have a proxy/routing service visible to the client (or WAN), I just make the Game service's IP address visible, and we should be good to go.
In a Docker-based environment, the Game service hosted in a Docker container sits behind NAT, so its IP address is local and inaccessible to other hosts even on the same subnet. The host running the Docker container is the one accessible by other hosts on the SAME subnet (or on a subnet determined by the netmask). So we'll use docker-compose to route (port-forward) requests from the client to the Game service; what gets exposed is the host that runs the Game service's Docker container.
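That port-forwarding is just a `ports:` mapping on the one exposed service in docker-compose. A hypothetical fragment (service names and the gRPC port number are made up for illustration, not the repo's actual compose file):

```yaml
services:
  game:
    build: ./game
    ports:
      # Publish the Game service's gRPC port on the Docker host;
      # the host's IP is what the WAN/client actually connects to.
      - "50051:50051"
  generator:
    build: ./generator
    # No ports: entry - reachable only on the internal compose
    # network, as hostname "generator".
  solver:
    build: ./solver
```

Internal services resolve each other by compose service name, so only `game` ever touches the host's network edge.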
Though it feels (definitely is) overdesigned and overkill, as a practice exercise, with the tasks/purposes/actions broken down into individual micro-services, at this level of discussion (each micro-service has its own README for more details), what we want to be aware of is:
- All micro-services handle stateless requests. Inputs that relate to current state are either passed as parameters on the request (just like REST and gRPC) or supplied as some session-based ID used for a persisted lookup. It's probably wrong to assume that actions posted to Kafka will immediately be available for others to poll.
- Each request to any/all micro-services is idempotent.
- Each request (on any/all micro-services) is a synchronous (hence gRPC) blocking call, so it should be processed as quickly as possible to unblock the caller. Because it's synchronous, it is also assumed that if the endpoint crashes or the socket disappears (either way, an abnormal disconnection), the requester should be configurable to reconnect to another micro-service up to N times. In case an endpoint falls into an infinite-loop tarpit, there should also be a configurable TTL (timeout) threshold (this is to make sure DoS is handled even at the non-exposed level). For any critical message that needs state rolled back when the caller has disconnected by the time the service finishes, that service MUST NOT update any persisted state until we get clear confirmation that the call was successful (in HTTP-speak, anything other than 200 should not update persisted state).
- Any abnormal disconnections, failures, and/or errors, whether associated with persisted state (critical) or not (warning), should be posted to Kafka.
- Optional, and probably not needed - when a micro-service spins up, the first thing it should do is announce itself to Kafka, and micro-services interested in newly available services should update their local service list. Every N seconds (configurable), it should re-announce to Kafka that it is still alive. On the subscriber side, the list should be refreshed so that stale micro-services are pruned. I do NOT need to do this when using Kubernetes, since you can use Kubernetes DNS to discover services. As for docker-compose, unless I create multiple hosts with the same services, I usually assume there is only one instance of each service running, so I can just locate that service by name via the Docker host NAT'ing them. In any case, I need a way to deal with fault tolerance, but maybe this is over-over-engineering...
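The reconnect-up-to-N-trials and TTL-threshold behaviour in the list above can be sketched as a small generic helper. Everything here (the `call_with_retry` name, the thread-based timeout) is my own illustration, not the repo's code; a real gRPC client such as tonic would use its own deadline and retry configuration instead:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Errors the caller distinguishes: gave up after N failed attempts,
/// or a single attempt that exceeded its TTL (tarpit protection).
#[derive(Debug, PartialEq)]
enum CallError {
    ExhaustedRetries,
    TimedOut,
}

/// Calls `f` up to `max_trials` times. Each attempt runs on its own
/// thread so a tarpitted endpoint cannot block the caller past `ttl`.
fn call_with_retry<T, F>(f: F, max_trials: u32, ttl: Duration) -> Result<T, CallError>
where
    T: Send + 'static,
    F: Fn() -> Result<T, String> + Send + Sync + Clone + 'static,
{
    for _ in 0..max_trials {
        let (tx, rx) = mpsc::channel();
        let attempt = f.clone();
        thread::spawn(move || {
            let _ = tx.send(attempt());
        });
        match rx.recv_timeout(ttl) {
            Ok(Ok(value)) => return Ok(value), // success: unblock caller
            Ok(Err(_)) => continue,            // endpoint errored: retry
            Err(_) => return Err(CallError::TimedOut), // TTL exceeded: treat as DoS guard
        }
    }
    Err(CallError::ExhaustedRetries)
}

fn main() {
    // Always-failing endpoint: exhausts all trials.
    let failing = || -> Result<u32, String> { Err("socket gone".into()) };
    assert_eq!(
        call_with_retry(failing, 3, Duration::from_millis(100)),
        Err(CallError::ExhaustedRetries)
    );

    // Healthy endpoint: returns immediately.
    let healthy = || -> Result<u32, String> { Ok(42) };
    assert_eq!(call_with_retry(healthy, 3, Duration::from_millis(100)), Ok(42));
    println!("ok");
}
```

Note the design choice of returning `TimedOut` immediately rather than retrying: per the list above, a timeout suggests a tarpit rather than a dead socket, so retrying the same call would only tie up more resources.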
There may be a few others that I've mentioned on my other blog (README) in this repo, but as mentioned, for this exercise I do not think I need to cover any other design requirements. Though it would be very simple to make this a monolithic server by combining all 3 modules into one, and there is nothing wrong with such a design as long as it doesn't need to scale, as mentioned, this is a practice exercise, and my past experiences with monolithic servers attempting to become micro-services have been ... not so good. So I'm trying to stay away from that pattern nowadays... In fact, if I were going to make the service monolithic, I'd rather completely toss the need for a server and embed the whole logic into the client.