Handy references/tutorials
Mail Thread Index
cleaning up TIME_WAIT states
http://www.sunsite.auc.dk/RFC/rfc/
RFC HyperText Archive,
search RFCs for TIME_WAIT
RFC 1337 - TIME-WAIT Assassination Hazards in TCP
RFC 1122 - Requirements for Internet Hosts -- Communication Layers
RFC 793 - Transmission Control Protocol
RFC 1323 - TCP Extensions for High Performance
RFC 916 - RELIABLE ASYNCHRONOUS TRANSFER PROTOCOL (RATP)
RFC 1339 - Remote Mail Checking Protocol
stuck in TIME_WAIT, winsock mailing list
Try setting SO_REUSEADDR before closing the socket.
setting SO_REUSEADDR, SO_REUSEPORT and SO_LINGER
When this is done in the server NFMSrv ... i.e. in the OMC code? The sockets don't linger in TIME_WAIT after client shutdown.
NFMSim sockets are not the problem => setting them there won't help.

NFMSim                  ascOutput
--------------------------------------
socket                  socket
                        set REUSEADDR/PORT LINGER
                        bind(6543)
connect------->accept
recv    <-------send
..                      ..
shutdown                shutdown, TIME_WAIT exits because of REUSE* and LINGER

NFMSim shutdown before ascOutput => no problems, no TIME_WAIT
ascOutput shutdown first => TIME_WAIT without LINGER

    // set server to allow reuse of socket address and port
    int nAllowReuse = 1;
    if (setsockopt(nfmSocketFd, SOL_SOCKET, SO_REUSEADDR,
                   (char*)&nAllowReuse, sizeof(nAllowReuse)) < 0) {
        cerr << "NFMSim:setHost:ERROR: could not set socket to reuse addr." << endl;
    }
    if (setsockopt(nfmSocketFd, SOL_SOCKET, SO_REUSEPORT,
                   (char*)&nAllowReuse, sizeof(nAllowReuse)) < 0) {
        cerr << "NFMSim:setHost:ERROR: could not set socket to reuse port." << endl;
    }

    // l_onoff = 0 leaves lingering switched off (the default)
    linger lingerSet;
    lingerSet.l_onoff = 0;
    lingerSet.l_linger = 0;
    if (setsockopt(nfmSocketFd, SOL_SOCKET, SO_LINGER,
                   (char*)&lingerSet, sizeof(lingerSet)) < 0) {
        cerr << "NFMSim:setHost:ERROR: could not set socket to linger." << endl;
    }
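A rough sketch of a variation on the option-setting code above, for comparison. The code above sets l_onoff = 0, which leaves lingering disabled. The sketch instead sets l_onoff = 1 with l_linger = 0, which is the combination normally described as an abortive close. Whether that is what NFMSrv actually needs is an assumption, and the helper names setReuseAndAbortiveClose and lingerIsAbortive are made up for illustration. The options are meant to be set after socket() and before bind().

```cpp
#include <sys/socket.h>

// Hypothetical helper, not part of the original NFMSim/NFMSrv code.
// Call it after socket() and before bind(); returns true on success.
bool setReuseAndAbortiveClose(int fd)
{
    int allowReuse = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR,
                   &allowReuse, sizeof(allowReuse)) < 0) {
        return false;
    }

    // Assumption: l_onoff = 1 with l_linger = 0 (abortive close) is wanted.
    // close() then resets the connection instead of doing the normal FIN
    // handshake, so no TIME_WAIT entry is left behind, but any data still
    // queued for sending is thrown away.
    linger lingerSet;
    lingerSet.l_onoff = 1;
    lingerSet.l_linger = 0;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER,
                      &lingerSet, sizeof(lingerSet)) == 0;
}

// Read SO_LINGER back to check that the abortive-close setting is in place.
bool lingerIsAbortive(int fd)
{
    linger current;
    socklen_t len = sizeof(current);
    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &current, &len) < 0) {
        return false;
    }
    return current.l_onoff != 0 && current.l_linger == 0;
}
```

The read-back helper is only there so the setting can be checked without waiting for a connection to close. The trade-off of the abortive close is the discarded unsent data, so it probably only suits cases where the peer has already received everything it needs.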
ypcat hosts |grep hostname
arp -a |grep hostname
ip.add.re.ss hostname

netstat
NFMSrv --- nothing
NFMSim ---
tcp 0 0 hostname.6543 hostname.1641 ESTABLISHED
tcp 0 0 hostname.1641 hostname.6543 ESTABLISHED

ctrl-C NFMSrv
$ netstat |grep 6543
tcp 0 0 hostname.6543 hostname.1641 FIN_WAIT_2
tcp 0 0 hostname.1641 hostname.6543 CLOSE_WAIT

quit NFMSim
$ netstat |grep 6543
tcp 0 0 hostname.6543 hostname.1641 TIME_WAIT

run NFMSim, connection fails ... but exits TIME_WAIT ... or timeouts?
quit NFMSim 1st, then NFMSrv and all is fine.
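The close order seen in the netstat output above can be tried out on the loopback interface. This is only a sketch: the function name clientClosesFirst is hypothetical, the port is chosen by the kernel rather than being 6543, and it assumes the usual TCP behaviour that the side which closes first (the active closer) is the one that ends up holding TIME_WAIT. Here the client closes first, so the server only sees end-of-stream and its listening port should stay free.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical demo, not from the original code. Returns what the server's
// last recv() returned: 0 means it saw the client's orderly close.
ssize_t clientClosesFirst()
{
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) return -1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          // kernel picks a free port
    socklen_t len = sizeof(addr);
    if (bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, reinterpret_cast<sockaddr*>(&addr), &len) < 0) {
        close(listener);
        return -1;
    }

    // On loopback the handshake completes in the listen backlog, so a
    // blocking connect() followed by accept() works in a single thread.
    int client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0 ||
        connect(client, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) < 0) {
        if (client >= 0) close(client);
        close(listener);
        return -1;
    }
    int server = accept(listener, nullptr, nullptr);
    if (server < 0) {
        close(client);
        close(listener);
        return -1;
    }

    char buf[16];
    ssize_t n = -1;
    // The client reads everything before closing, so its close() sends a
    // normal FIN rather than a reset.
    if (send(server, "hello", 5, 0) == 5 &&
        recv(client, buf, 5, MSG_WAITALL) == 5) {
        close(client);                          // active close: client side
        client = -1;
        n = recv(server, buf, sizeof(buf), 0);  // 0 == end of stream
    }
    if (client >= 0) close(client);
    close(server);                              // passive close: server side
    close(listener);
    return n;
}
```

Running netstat straight after calling this should show the TIME_WAIT entry with the client's ephemeral port as the local address, not the listener's port.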
Fun with browsers: monitoring the HTTP server side while downloading pages, to see which versions of browsers have bad socket behaviour. Basically: Netscape 4.7 on Solaris was good, Netscape 4.6 on WinNT bad, Opera 3.60 on WinNT good.
Waider's fault:

>
> Hey hey,
>
> doing a PR exercise on friday whereby I go visit a customer and wave a
> dead chicken over their computer and declare the problem solved. Thing
> is, I don't know what the problem is. Here's the gist:
>
> They're on an ISDN pipe, which is not actually being used to download
> megs of pr0n or anything, and connecting to an admin page on
> jobfinder. They get the top few lines of the page, then the connection
> stalls. On hitting 'stop' and then refreshing the page, wammo, full
> page. Now, I don't know how repeatable this is, but I think it's
> repeatable enough that the customer is quite unhappy (hence the PR
> exercise).
>

I have definitely experienced something like the above .... more noticeable if network is slow or site server itself slow .. usually using Netscrape ... and strangely enough often after typing in the url.

As per customer I hit Esc and reload (but in a more knowledgeable way, listening out for any change in karma of the connection or the browser). This definitely helps the user and makes them feel in control, good stress relief. That done, "wammo" (or after maybe a short erk (browser thinking) then "wammo") the page reloads (mostly from the cache from the preceding request).

> Does anyone have suggestions?

It would be nice to know what the browser is thinking exactly. What happens if you leave it for, say, a half hour? Can you magically grep out open sockets and match them with the requests?

It's fun, ... REALLY!
    Stop, reload, netstat |grep and count
    repeat until not fun anymore

My browser esp tells me that it's waiting for one of the sockets to go away ... but it's already gone .... or maybe never was.

It's also fun to look at all the sockets on the server.
These are sometimes just about on the limit (always a few sockets in TIME_WAIT from people hitting that Stop btn). Look, see:

hope.46830  dead.52265     8760 0  8760 0 TIME_WAIT
hope.46831  dead.52425     8760 0  8760 0 TIME_WAIT
hope.46832  dead.52426     8760 0  8760 0 TIME_WAIT
hope.46836  dead.52265     8760 0  8760 0 TIME_WAIT
hope.46837  dead.52431     8760 0  8760 0 TIME_WAIT
hope.46838  dead.52265     8760 0  8760 0 TIME_WAIT
hope.46869  dead.52521     8760 0  8760 0 TIME_WAIT
hope.80     achill.36774   8760 0  8760 0 TIME_WAIT
hope.80     achill.36775   8760 0  8760 0 TIME_WAIT
hope.46875  dead.52475     8760 0  8760 0 ESTABLISHED
hope.46876  dead.52521     8760 0  8760 0 TIME_WAIT
hope.46877  hope.40928    32768 0 32768 0 TIME_WAIT
hope.962    derry.nfsd     8760 0  8760 0 ESTABLISHED
hope.80     zach.1898      8276 0  8760 0 TIME_WAIT
hope.80     zach.1899      8760 0  8760 0 TIME_WAIT
hope.80     achill.36776   8760 0  8760 0 TIME_WAIT
hope.46839  dead.52475     8760 0  8760 0 TIME_WAIT
hope.46840  dead.52431     8760 0  8760 0 TIME_WAIT
hope.80     zach.1900      7963 0  8760 0 TIME_WAIT
hope.80     achill.36778   8760 0  8760 0 TIME_WAIT
hope.80     zach.1901      8266 0  8760 0 TIME_WAIT
hope.80     zach.1902      8760 0  8760 0 TIME_WAIT
hope.80     zach.1903      8359 0  8760 0 TIME_WAIT
hope.80     zach.1904      8760 0  8760 0 TIME_WAIT
hope.80     zach.1905      8618 0  8760 0 TIME_WAIT
hope.80     zach.1906      8618 0  8760 0 TIME_WAIT
hope.80     zach.1907      8760 0  8760 0 TIME_WAIT
hope.80     zach.1908      8760 0  8760 0 TIME_WAIT
hope.46846  dead.52461     8760 0  8760 0 TIME_WAIT
hope.46881  dead.52521     8760 0  8760 0 TIME_WAIT
hope.46882  hope.40232    32768 0 32768 0 TIME_WAIT
hope.46857  dead.52475     8760 0  8760 0 TIME_WAIT
hope.80     tulla.40707    8760 0  8760 0 TIME_WAIT
hope.nfsd   dead.576       8760 0  8760 0 ESTABLISHED
hope.46889  dead.52475     8760 0  8760 0 ESTABLISHED
hope.46894  dead.52265     8760 0  8760 0 ESTABLISHED
hope.80     tulla.40709    8760 0  8760 0 TIME_WAIT
hope.80     tulla.40708    8760 0  8760 0 TIME_WAIT
hope.46895  dead.52431     8760 0  8760 0 ESTABLISHED
hope.46896  dead.52475     8760 0  8760 0 ESTABLISHED
hope.46859  dead.52265     8760 0  8760 0 TIME_WAIT
hope.46860  dead.52265     8760 0  8760 0 TIME_WAIT
hope.46864  dead.52425     8760 0  8760 0 TIME_WAIT
hope.46865  dead.52426     8760 0  8760 0 TIME_WAIT
hope.80     achill.36784   8760 0  8760 0 TIME_WAIT
hope.80     achill.36785   8760 0  8760 0 TIME_WAIT
hope.46900  dead.52425     8760 0  8760 0 ESTABLISHED
hope.46901  dead.52426     8760 0  8760 0 ESTABLISHED
hope.46866  dead.52431     8760 0  8760 0 TIME_WAIT
hope.80     tulla.40712    8760 0  8760 0 TIME_WAIT
hope.80     achill.36786   8760 0  8760 0 TIME_WAIT
hope.46902  dead.52431     8760 0  8760 0 ESTABLISHED
hope.46903  dead.52454     8760 0  8760 0 TIME_WAIT
hope.46904  dead.52454     8760 0  8760 0 TIME_WAIT
hope.80     achill.36788   8760 0  8760 0 TIME_WAIT
hope.80     achill.36789   8760 0  8760 0 TIME_WAIT
hope.80     achill.36790   8760 0  8760 0 TIME_WAIT

waugh! socket wastage

curiously enough ... apache 1.3.9 used here and using netscape 4.6 on winNT to browse ... 1-4 sockets are used when the machine connects and gets a page. These go to ESTABLISHED if a page is downloaded, and they wait in FIN_WAIT_2 after nothing happens for a short while. If there's only html on the page or you view a gif/png/... only one socket goes to ESTABLISHED. Obviously used to download up to four thingies in parallel.

... if you hit "back" one goes to ESTABLISHED too ... but ... well, it doesn't do nothin. If you go away to another server these poor sockets head into TIME_WAIT. So if you oscillate between one server and others you're building up sockets in TIME_WAIT (which should usually last 2 mins yeah).

I played with sockets a little only last year and we managed to avoid TIME_WAIT easily enough, if the client closes down the connection properly.

So ..... *sigh* (fires up MSIE) oops ... scrap that ... can't find MSIE ... must've trashed it >;)

(fires up Opera 3.60) .... downloads something local

netstat -f inet | grep for me => VERY interesting. A few loads and I can see hmmmmmm, no sockets visible :( So I guess it's being shut down properly after use. good Opera, nice Opera *pat* *pat*

aie! Opera just DIED ....
hmmm, there's a socket stuck in FIN_WAIT_2 ... :-7

Hmmmm, INTERESTING .... just discovered netscrape (4.7) on the suns is also well behaved sockets wise.

On the other hand .... why bother (square aura man - not cool). It's the browser, ... it's the connection. Waving dead chickens is probably better than anything else you could do ... but not exactly reassuring to people who are still thinking software is reliable, ... if you can, get them to focus while hitting stop and reload. :)

educational anyway .... you _did_ ask for suggestions ... not sure exactly what I've suggested here though.

James.
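The "netstat |grep and count" step from the mail can be automated with a small tally. What follows is a sketch, not a tool anyone in the thread used: tallyStates is a hypothetical name, and it assumes the connection state is the last whitespace-separated field on each line, as in the listings above. Lines whose last field does not look like a state name (capitals, digits and underscores only) are skipped, which is a heuristic for dropping headers.

```cpp
#include <cctype>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Heuristic: netstat state names such as TIME_WAIT or FIN_WAIT_2 start with
// a capital letter and contain only capitals, digits and underscores.
static bool looksLikeState(const std::string& word)
{
    if (word.empty() || !std::isupper(static_cast<unsigned char>(word[0]))) {
        return false;
    }
    for (char c : word) {
        unsigned char u = static_cast<unsigned char>(c);
        if (!std::isupper(u) && !std::isdigit(u) && c != '_') {
            return false;
        }
    }
    return true;
}

// Hypothetical helper: count input lines by their last field (the state).
std::map<std::string, int> tallyStates(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string field;
        std::string last;
        while (fields >> field) {
            last = field;               // remember the final field on the line
        }
        if (looksLikeState(last)) {
            ++counts[last];
        }
    }
    return counts;
}
```

Called as tallyStates(std::cin) from a small main that prints the map, it could sit at the end of a netstat pipe, so the stop/reload/count loop becomes a single command to repeat.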