Software Quality

May 31, 2011

How to avoid socket exhaustion on web servers by disabling HTTP Keep Alive

Filed under: Microsoft CRM — David Allen @ 8:48 pm

Problem:

Given

a web server running IIS,

hosting the web services for Microsoft CRM 4.0,

and a console application that runs a conversion program, which tries to update thousands of records per second through the CRM web service,

When

we run the conversion application at its fastest possible speed,

We observe

After only 26 seconds, the web service calls start failing with System.Net.Sockets.SocketException.

If we put a deliberate delay of 100 ms in the loop, the run completes without errors, but at a reduced speed.

Cause:

Exhaustion of available ports (socket addresses).
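
A rough back-of-the-envelope check makes this cause plausible. The sketch below is only an estimate under stated assumptions, none of which come from the measurements in this post: that the client machine uses the Windows XP / Server 2003-era defaults (ephemeral ports 1025 to 5000, and 240 seconds in TIME_WAIT), and that every web service call opens a new connection. The function names are made up for illustration.

```python
# Assumed Windows XP / Server 2003-era defaults (not confirmed for this setup).
FIRST_EPHEMERAL_PORT = 1025
LAST_EPHEMERAL_PORT = 5000   # assumed default MaxUserPort
TIME_WAIT_SECONDS = 240      # assumed default TcpTimedWaitDelay


def ephemeral_port_count(first=FIRST_EPHEMERAL_PORT, last=LAST_EPHEMERAL_PORT):
    """Number of client ports available for outgoing connections."""
    return last - first + 1


def max_sustainable_rate(ports, time_wait_seconds):
    """New connections per second that can be opened indefinitely
    if every closed connection holds its port for time_wait_seconds."""
    return ports / time_wait_seconds


def implied_connection_rate(ports, seconds_until_failure):
    """Connection rate that would use up every port in the given time,
    assuming no port leaves TIME_WAIT during that time."""
    return ports / seconds_until_failure


ports = ephemeral_port_count()
print("ports available:", ports)
print("sustainable connections/sec:", round(max_sustainable_rate(ports, TIME_WAIT_SECONDS), 1))
print("connections/sec implied by failing at 26 s:", round(implied_connection_rate(ports, 26)))
```

Under these assumptions, a loop with a 100 ms delay opens at most 10 connections per second, which stays below the sustainable rate and would explain why the delayed run completes without errors. The pool size is also close to the "over 3,900" TIME_WAIT entries seen in the netstat output further down.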

Solution:

Disable the HTTP Keep Alive setting on the web server that hosts the web services.

For IIS 6:

http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/ea116535-8eb9-4c80-8b14-b34418dbfe42.mspx?mfr=true

For IIS 7:

http://technet.microsoft.com/en-us/library/cc772183(WS.10).aspx

Results

With HTTP Keep Alive disabled on the web services server, we achieved processing rates of 1,000 to 5,000 records / minute against the Microsoft CRM 4.0 web services. Before this change, we could only achieve 200 to 500 records / minute. We had to use parallel programming to reach the higher rates, but for this post the point is that we could operate at higher transaction rates without socket errors. Before we disabled HTTP Keep Alive, we did not even consider parallelism, because simple sequential processing was already overloading the server and exhausting sockets.

Caveat

Certain web applications require HTTP Keep Alives to operate, and in general web applications work most efficiently with this setting enabled. Microsoft CRM 4.0 requires HTTP Keep Alives to be enabled to function. So our solution required us to set up a separate Microsoft CRM 4.0 platform server, where we could disable HTTP Keep Alive. We targeted this separate server for our conversion processes. This left the primary CRM server unaltered, so it could continue to serve requests from users working in their web browsers.

Variations tried and not tried:

I tried altering the authentication mechanism while HTTP Keep Alive was enabled. No combination of authentication prevented socket errors.

Two articles, linked below, suggested that I leave HTTP Keep Alive enabled on the server but disable it for my client calls by altering the web request:
http://stackoverflow.com/questions/2503776/how-can-i-disable-keep-alive-on-asp-net-web-service-client-requests
http://blog.developers.ie/cgreen/archive/2007/04/24/2745.aspx

I did not test this because we had already established a second server, dedicated to handling the data conversion web requests, and disabled HTTP Keep Alive on that server. This second server provided additional opportunities for boosting performance, because we used its processing power exclusively to handle the data conversion calls, and left the main server to handle workflow processing. So we stuck with that model.
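
For anyone who wants to try the client-side variation, the idea at the HTTP level is to send a "Connection: close" header with each request, so the server closes the connection after replying. In .NET the relevant switch is the HttpWebRequest.KeepAlive property. The sketch below shows the same idea in Python purely as an illustration. It has not been run against a CRM server, and the function names, content type, and timeout are assumptions.

```python
import http.client


def headers_without_keep_alive(content_type="text/xml; charset=utf-8"):
    """Request headers that ask the server to close the connection after replying."""
    return {"Connection": "close", "Content-Type": content_type}


def post_once(host, port, path, body, timeout=30.0):
    """POST body (bytes) on a fresh connection and close it once the response is read."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("POST", path, body=body, headers=headers_without_keep_alive())
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        # Always release the socket, even if the request fails.
        conn.close()
```

Whether this avoids the socket errors in the scenario above is an open question, since it was never tested here.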

Additional References

I found plenty of articles on the socket error. But finding one that led me to a solution was difficult. That is why I’m posting this. Here are some references.

http://msdn.microsoft.com/en-us/library/aa560610(v=bts.20).aspx

http://blog.port80software.com/2004/12/07/hurry-up-and-time_wait/

http://blogs.msdn.com/b/dgorti/archive/2005/09/18/470766.aspx

Trying to understand all this information is another challenge. I don’t pretend to understand the underlying mechanisms at work here. If someone understands this better, feel free to comment. I just know what works and what does not.

Tools to diagnose:

On the web server, you can run

netstat -an >connections.txt

and examine the number of ports waiting. In this scenario, the number of ports in use is enormous.
In a sample I ran, there were over 3,900 ports in the TIME_WAIT state. Below is a tiny sample of connections.txt, the output of netstat -an. Honestly, I don’t know what this is telling me: when HTTP Keep Alives are enabled and we are getting socket errors, the list actually appears to be smaller than when the conversion is running at high speed. So I don’t know what to suggest here.

Active Connections
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:445            0.0.0.0:0              LISTENING
  TCP    0.0.0.0:3389           0.0.0.0:0              LISTENING
  TCP    0.0.0.0:47001          0.0.0.0:0              LISTENING
  TCP    0.0.0.0:49152          0.0.0.0:0              LISTENING
  TCP    0.0.0.0:49153          0.0.0.0:0              LISTENING
  TCP    0.0.0.0:49154          0.0.0.0:0              LISTENING
  TCP    0.0.0.0:49157          0.0.0.0:0              LISTENING
  TCP    0.0.0.0:55902          0.0.0.0:0              LISTENING
  TCP    192.168.95.99:80       192.168.95.136:1026    TIME_WAIT
  TCP    192.168.95.99:80       192.168.95.136:1027    TIME_WAIT
  TCP    192.168.95.99:80       192.168.95.136:1028    TIME_WAIT
  TCP    192.168.95.99:80       192.168.95.136:1029    TIME_WAIT
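
Counting several thousand lines by eye is impractical, so a small script can summarize the file. This is a minimal sketch, assuming the four-column layout shown above (Proto, Local Address, Foreign Address, State). The function name and the shortened sample are illustrative only.

```python
from collections import Counter


def count_tcp_states(netstat_text):
    """Count TCP rows in `netstat -an` output, grouped by the State column."""
    states = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # A TCP row has four columns: Proto, Local Address, Foreign Address, State.
        if len(fields) == 4 and fields[0] == "TCP":
            states[fields[3]] += 1
    return states


# Shortened copy of the sample above.
sample = """\
Active Connections
  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  TCP    192.168.95.99:80       192.168.95.136:1026    TIME_WAIT
  TCP    192.168.95.99:80       192.168.95.136:1027    TIME_WAIT
  TCP    192.168.95.99:80       192.168.95.136:1028    TIME_WAIT
"""

for state, count in count_tcp_states(sample).most_common():
    print(state, count)
```

To summarize a real capture, read connections.txt into a string and pass it to count_tcp_states. Comparing the TIME_WAIT count between a failing run and a healthy run is likely more informative than reading the raw list.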

Sample socket error:

Event Type: Error
Event Source: <whatever>
Event Category: None
Event ID: 0
Date:  5/30/2011
Time:  9:23:12 AM
User:  N/A
Computer: <web server computer>
Description:
System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: Only one usage of each socket address (protocol/network address/port) is normally permitted 192.168.95.99:80
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
   at System.Net.HttpWebRequest.GetRequestStream()
   at System.Web.Services.Protocols.SoapHttpClientProtocol.Invoke(String methodName, Object[] parameters)
   at Microsoft.Crm.SdkTypeProxy.CrmService.Execute(Request Request)
   at <our custom code called the stack above>


