Will BackgroundService play nicely on a Kubernetes cluster?

I have a Kubernetes cluster into which I'm intending to deploy a service in a pod - the service will accept a gRPC request, start a long-running process, but return to the caller indicating the process has started. Investigation suggests that IHostedService (BackgroundService) is the way to go for this.

My question is, will use of BackgroundService behave nicely with various neat features of ASP.NET and Kubernetes:

  • Will horizontal scaling understand that a service is getting overloaded and spin up a new instance, even though the service will appear to have no pending gRPC requests because all the work is in the background? (I appreciate there are probably hooks that can be implemented; I'm wondering what the default behaviour is.)
  • Will the notion of awaiting (allowing the current piece of work to be swapped out so another can run) work okay with background services? (I've only experienced it where one received message hits an await and so allows another message to be processed, but background services are not a messaging context.)
  • I think ASP.NET will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes?
  • What's the best method to mitigate against overloading the service (if horizontal scaling is not an option)? I can have the gRPC call return 'too busy', but I'd need to detect it (not quite sure if that's CPU-bound, memory, or just the number of background services).
  • Should I be considering something other than BackgroundService for this task?

I'm hoping the answer is that "it all just works" but feel it's better to have that confirmed than to just hope...



Solution 1:[1]

Investigation suggests that IHostedService (BackgroundService) is the way to go for this.

I strongly recommend using a durable queue with a separate background service. It's not that difficult to split this into two images: one serving the ASP.NET gRPC requests, and the other processing the durable queue (this can be a console app - see the Worker Service template in VS). Note that solutions using non-durable queues are not reliable (i.e., work may be lost whenever a pod restarts or is scaled down). This includes in-memory queues, which are commonly suggested as a "solution".

If you do make your own background service in a console app, I recommend applying a few tweaks (noted on my blog); a sketch applying them follows the list:

  • Wrap ExecuteAsync in Task.Run.
  • Always have a top-level try/catch in ExecuteAsync.
  • Call IHostApplicationLifetime.StopApplication when the background service stops for any reason.
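
For illustration, here's a minimal sketch of a worker console app that applies all three tweaks. The QueueWorker name and the loop body are placeholders (not from the answer), since the real dequeue/process calls depend on which durable queue you choose:

```csharp
// Program.cs for the worker console app (the shape the Worker Service template
// produces), plus a hypothetical QueueWorker applying the three tweaks above.
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

var builder = Host.CreateApplicationBuilder(args);
builder.Services.AddHostedService<QueueWorker>();
await builder.Build().RunAsync();

public sealed class QueueWorker : BackgroundService
{
    private readonly IHostApplicationLifetime _lifetime;
    private readonly ILogger<QueueWorker> _logger;

    public QueueWorker(IHostApplicationLifetime lifetime, ILogger<QueueWorker> logger)
    {
        _lifetime = lifetime;
        _logger = logger;
    }

    protected override Task ExecuteAsync(CancellationToken stoppingToken) =>
        // Tweak 1: wrap the work in Task.Run so host start-up is never blocked,
        // even if the loop does synchronous work before its first await.
        Task.Run(async () =>
        {
            try
            {
                // Tweak 2: a top-level try/catch so failures are logged rather
                // than silently terminating the background service.
                while (!stoppingToken.IsCancellationRequested)
                {
                    // ... dequeue a message from the durable queue and process it ...
                    await Task.Delay(TimeSpan.FromSeconds(1), stoppingToken);
                }
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                // Normal shutdown requested by the host.
            }
            catch (Exception ex)
            {
                _logger.LogCritical(ex, "Queue worker failed unexpectedly.");
            }
            finally
            {
                // Tweak 3: stop the whole process when the worker exits for any
                // reason, so Kubernetes restarts the pod rather than leaving a
                // host running with no worker inside it.
                _lifetime.StopApplication();
            }
        }, stoppingToken);
}
```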

Will horizontal scaling understand that a service is getting overloaded and spin up a new instance, even though the service will appear to have no pending gRPC requests because all the work is in the background? (I appreciate there are probably hooks that can be implemented; I'm wondering what the default behaviour is.)

One reason I prefer using two different images is that they can scale on different triggers: GRPC requests for the API and queued messages for the worker. Depending on your queue, using "queued messages" as the trigger may require a custom metric provider. I do prefer using "queued messages" because it's a natural scaling mechanism for the worker image; out-of-the-box solutions like CPU usage don't always work well - in particular for asynchronous processors, which you mention you are using.
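
The answer doesn't prescribe a particular metric pipeline; as one possible shape (an assumption, not from the answer), the worker could export the queue depth as a Prometheus gauge that a custom-metrics adapter or KEDA then feeds to the autoscaler. This sketch assumes the prometheus-net.AspNetCore package, and getQueueDepthAsync is a hypothetical call on whatever queue client you use:

```csharp
// Illustrative only: publish "outstanding queue messages" so an external
// custom-metrics adapter can drive scaling of the worker image.
using System;
using System.Threading;
using System.Threading.Tasks;
using Prometheus;

public static class QueueDepthExporter
{
    private static readonly Gauge QueueDepth = Metrics.CreateGauge(
        "worker_queue_depth", "Messages waiting in the durable queue.");

    public static async Task RunAsync(Func<Task<long>> getQueueDepthAsync, CancellationToken token)
    {
        // Expose /metrics for Prometheus to scrape.
        using var server = new KestrelMetricServer(port: 9090);
        server.Start();

        while (!token.IsCancellationRequested)
        {
            QueueDepth.Set(await getQueueDepthAsync()); // publish the current depth
            await Task.Delay(TimeSpan.FromSeconds(15), token);
        }
    }
}
```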

Will the notion of awaiting (allowing the current piece of work to be swapped out so another can run) work okay with background services? (I've only experienced it where one received message hits an await and so allows another message to be processed, but background services are not a messaging context.)

Background services can be asynchronous without any problems. In fact, it's not uncommon to grab messages in batches and process them all concurrently.
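
As a sketch of that batch pattern (the queue abstraction and the processing delegate below are hypothetical stand-ins for whatever durable queue client you actually use):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstraction over a durable queue client; Azure Service Bus,
// RabbitMQ, SQS, etc. each have their own equivalent calls.
public interface IDurableQueue<TMessage>
{
    Task<IReadOnlyList<TMessage>> ReceiveBatchAsync(int maxMessages, CancellationToken token);
    Task CompleteAsync(TMessage message, CancellationToken token);
}

public static class BatchProcessing
{
    // Receive a batch and process the messages concurrently. While one message
    // awaits its I/O, the others keep running - the same "yield at await"
    // behaviour the question describes, just without an ASP.NET request around it.
    public static async Task DrainAsync<TMessage>(
        IDurableQueue<TMessage> queue,
        Func<TMessage, CancellationToken, Task> processAsync,
        CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            var batch = await queue.ReceiveBatchAsync(maxMessages: 16, token);

            await Task.WhenAll(batch.Select(async message =>
            {
                await processAsync(message, token);
                await queue.CompleteAsync(message, token); // ack only after success
            }));
        }
    }
}
```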

I think ASP.NET will normally manage throttling too many requests, backing off if the server is too busy, but will any of that still work if the 'busy' is background processes?

No. ASP.NET only throttles requests. Background services do register with ASP.NET, but only to provide a best-effort graceful shutdown. ASP.NET has no idea how busy the background services are in terms of pending queue items, CPU usage, or outgoing requests.

What's the best method to mitigate against overloading the service (if horizontal scaling is not an option)? I can have the gRPC call return 'too busy', but I'd need to detect it (not quite sure if that's CPU-bound, memory, or just the number of background services).

Not a problem if you use the durable queue + independent worker image solution. gRPC calls can pretty much always stick another message in the queue (very simple and fast), and Kubernetes can autoscale based on your (possibly custom) metric of "outstanding queue messages".
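
As an illustrative sketch of the API image's side: the gRPC method only enqueues the request and acknowledges immediately. ProcessRunner.ProcessRunnerBase, StartRequest and StartReply stand in for types generated from a hypothetical .proto, and IProcessQueue stands in for the real queue client:

```csharp
using System.Threading;
using System.Threading.Tasks;
using Grpc.Core;

// Hypothetical enqueue-only view of the durable queue.
public interface IProcessQueue
{
    Task EnqueueAsync(StartRequest request, CancellationToken token);
}

public class ProcessService : ProcessRunner.ProcessRunnerBase // generated base class
{
    private readonly IProcessQueue _queue;

    public ProcessService(IProcessQueue queue) => _queue = queue;

    public override async Task<StartReply> StartProcess(StartRequest request, ServerCallContext context)
    {
        // Enqueueing is quick and cheap, so the call returns almost immediately;
        // the worker image picks the message up and runs the long process.
        await _queue.EnqueueAsync(request, context.CancellationToken);
        return new StartReply { Accepted = true };
    }
}
```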

Solution 2:[2]

Generally, "it all works".
For automatic horizontal scaling you need an autoscaler - read this: Horizontal Pod Autoscaler.
But you can also just scale it yourself (kubectl scale deployment yourDeployment --replicas=10).

Let's assume you have a deployment of your backend which starts with one pod. Your autoscaler will watch your pod (e.g. CPU usage) and will start a new pod for you when the load is high.

A second pod will be started, and each new request will be sent to a different pod (round-robin).
There is no need for your backend to throttle calls; it should just handle as many calls as possible.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1: Stephen Cleary
[2] Solution 2: akop