Long-running ASP.NET tasks

I know there’s a bunch of APIs out there that do this, but I also know that the hosting environment (being ASP.NET) puts restrictions on what you can reliably do in a separate thread.

I could be completely wrong, so please correct me if I am, this is however what I think I know.

  • A request typically times out after 120 seconds (this is configurable), and eventually the ASP.NET runtime will kill a request that’s taking too long to complete.
  • The hosting environment, typically IIS, employs process recycling and can at any point decide to recycle your app. When this happens, all threads are aborted and the app restarts. I’m not sure how aggressive recycling is; it would be unreasonable for it to abort a normal ongoing HTTP request, but I would expect it to abort a background thread, because it doesn’t know anything about that thread’s unit of work.

If you had to create a programming model that could easily and reliably run a long-running task (one that might have to run for days), how would you accomplish this from within an ASP.NET application?

The following are my thoughts on the issue:

I’ve been thinking along the lines of hosting a WCF service in a Win32 service and talking to that service through WCF. This is however not very practical, because the only reason I would choose to do so is to send tasks (units of work) from several different web apps. I’d then periodically ask the service for status updates and act accordingly. My biggest concern with this is that it would NOT be a particularly great experience if I had to deploy every task to the service for it to be able to execute some instructions. There’s also the issue of input: how would I feed this service with data if I had a large data set and needed to chew through it?

What I typically do right now is this

SELECT TOP 10 * 
FROM WorkItem WITH (ROWLOCK, UPDLOCK, READPAST)
WHERE WorkCompleted IS NULL

It allows me to use a SQL Server database as a work queue: I periodically poll the database with this query for work. If a work item completes successfully, I mark it as done and proceed until there’s nothing more to do. What I don’t like is that I could theoretically be interrupted at any point, and if that happens between finishing the work and marking it as done, I could end up processing the same work item twice. I might be a bit paranoid and this might all be fine, but as I understand it there’s no guarantee that it won’t happen…
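(One common way to shrink that window is to claim items atomically rather than selecting them first and marking them later. A rough sketch, assuming a hypothetical extra WorkStarted column on the same WorkItem table:)

```sql
-- Claim up to 10 unclaimed items in a single atomic statement.
-- The UPDATE and the read of the claimed rows (OUTPUT) happen
-- together, so two workers cannot claim the same row.
-- WorkStarted is an assumed column recording when a row was claimed.
UPDATE TOP (10) WorkItem WITH (ROWLOCK, READPAST)
SET WorkStarted = GETUTCDATE()
OUTPUT INSERTED.*
WHERE WorkCompleted IS NULL
  AND WorkStarted IS NULL;
```

This doesn’t eliminate the problem entirely: an item claimed by a worker that then dies stays marked as started, so you’d still need a sweep (e.g. reset rows whose WorkStarted is older than some timeout) or human intervention.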

I know there have been similar questions on SO before, but none really offers a definitive answer. This is a really common requirement, yet the ASP.NET hosting environment is ill-equipped to handle long-running work.

Please share your thoughts.

Answers:


Method 1

Have a look at NServiceBus

NServiceBus is an open source communications framework for .NET with built-in support for publish/subscribe and long-running processes.

It is built upon MSMQ, which means that your messages don’t get lost, since they are persisted to disk. The framework nevertheless has impressive performance and an intuitive API.
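(For a flavour of that API, a handler in the classic NServiceBus style looks roughly like this; the message and handler names are made up, and the exact interfaces vary between NServiceBus versions:)

```csharp
using System;
using NServiceBus;

// Hypothetical message describing one unit of work.
public class ProcessInvoicingRun : IMessage
{
    public Guid RunId { get; set; }
}

// NServiceBus dispatches each message from the queue to this handler.
public class ProcessInvoicingRunHandler : IHandleMessages<ProcessInvoicingRun>
{
    public void Handle(ProcessInvoicingRun message)
    {
        // Do one durable unit of work here. If this throws, the
        // message transaction rolls back and the message is retried,
        // so the work should be written to be idempotent.
    }
}
```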

Method 2

John,

I agree that ASP.NET is not suitable for async tasks as you have described them, nor should it be. It is designed as a web hosting platform, not a back-of-house processor.

We have had similar situations in the past and we have used a solution similar to what you have described. In summary, keep your WCF service under ASP.NET, use a “Queue” table with a Windows Service as the “QueueProcessor”. The client should poll to see if work is done (or use messaging to notify the client).

We used a table that contained the process and its information (e.g. InvoicingRun). That table had a status (Pending, Running, Completed, Failed). The client would submit a new InvoicingRun with a status of Pending. A Windows service (the processor) would poll the database for any runs in the Pending state (you could also use SQL notifications so you don’t need to poll). If a pending run was found, the service would move it to Running, do the processing and then move it to Completed or Failed.

In the case where the process failed fatally (e.g. DB down, process killed), the run would be left in the Running state, and human intervention was required. If the process failed in a non-fatal way (exception, error), the run would be moved to Failed, and you could choose to retry or have human intervention.

If there were multiple processors, the first one to move a run to the Running state got that job; you can use this method to prevent the job being run twice. An alternative is to do the select and then the update to Running inside a transaction. Either way, make sure this happens outside of any larger transaction. Sample (rough) SQL:

UPDATE InvoicingRun
SET Status = 2 -- Running
WHERE ID = 1
    AND Status = 1 -- Pending

IF @@RowCount = 0
    SELECT Cast(0 as bit)
ELSE
    SELECT Cast(1 as bit)
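(The processor side of this pattern can be sketched roughly as follows. The table and column names follow the sample SQL above; the status values, connection string and work itself are illustrative:)

```csharp
using System;
using System.Data.SqlClient;
using System.Threading;

class QueueProcessor
{
    const int Pending = 1, Running = 2, Completed = 3, Failed = 4;
    const string ConnectionString = "..."; // illustrative

    static void Main()
    {
        while (true)
        {
            int? runId = ClaimPendingRun();
            if (runId == null)
            {
                // Nothing pending; wait before polling again.
                Thread.Sleep(TimeSpan.FromSeconds(10));
                continue;
            }
            try
            {
                // ... do the actual long-running work for runId ...
                SetStatus(runId.Value, Completed);
            }
            catch (Exception)
            {
                SetStatus(runId.Value, Failed); // non-fatal failure
            }
        }
    }

    // Atomically move one Pending run to Running and return its ID,
    // or null if there is no pending work. The single UPDATE plays
    // the same role as the "first to move it to Running wins" rule.
    static int? ClaimPendingRun()
    {
        using (var conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            var cmd = new SqlCommand(
                @"UPDATE TOP (1) InvoicingRun
                  SET Status = @running
                  OUTPUT INSERTED.ID
                  WHERE Status = @pending", conn);
            cmd.Parameters.AddWithValue("@running", Running);
            cmd.Parameters.AddWithValue("@pending", Pending);
            object id = cmd.ExecuteScalar();
            return id == null ? (int?)null : (int)id;
        }
    }

    static void SetStatus(int id, int status)
    {
        using (var conn = new SqlConnection(ConnectionString))
        {
            conn.Open();
            var cmd = new SqlCommand(
                "UPDATE InvoicingRun SET Status = @status WHERE ID = @id", conn);
            cmd.Parameters.AddWithValue("@status", status);
            cmd.Parameters.AddWithValue("@id", id);
            cmd.ExecuteNonQuery();
        }
    }
}
```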

Rob

Method 3

Have you thought about using Workflow Foundation instead of your custom implementation? It also allows you to persist state; tasks could be defined as workflows in this case.

Just some thoughts…

Michael

Method 4

Use a simple background task/job framework like Hangfire and apply these best-practice principles to the design of the rest of your solution:

  • Keep all actions as small as possible; to achieve this, you should:
  • Divide long-running jobs into batches and queue them (in a Hangfire queue or on a bus of another sort).
  • Make sure your small jobs (the batched parts of long jobs) are idempotent (they carry all the context they need to run in any order). This way you don’t have to use a queue that maintains a sequence, which means you can:
  • Parallelise the execution of jobs in your queue depending on how many nodes you have in your web server farm. You can even control how much load this puts on your farm (as a trade-off against servicing web requests). This ensures that you complete the whole job (all batches) as quickly and efficiently as possible, without compromising your cluster’s ability to serve web clients.
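The batching idea can be sketched like this. BackgroundJob.Enqueue is Hangfire’s actual fire-and-forget API; the job class, method and batch-splitting logic are purely illustrative:

```csharp
using System;
using Hangfire;

public class InvoiceJobs
{
    // Each batch is small and idempotent: it carries everything it
    // needs (an ID range) and can safely run twice or out of order.
    public void ProcessBatch(int firstId, int lastId)
    {
        // ... process work items firstId..lastId ...
    }
}

public static class InvoiceRunner
{
    // Split one long job into many small queued jobs, which Hangfire
    // workers across the farm can then pick up in parallel.
    public static void EnqueueRun(int totalItems, int batchSize)
    {
        for (int start = 1; start <= totalItems; start += batchSize)
        {
            int end = Math.Min(start + batchSize - 1, totalItems);
            BackgroundJob.Enqueue<InvoiceJobs>(j => j.ProcessBatch(start, end));
        }
    }
}
```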


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
