From hawkeye@glia  Fri Apr 14 23:37:48 1995
Received: from glia.biostr.washington.edu by betz.biostr.washington.edu via SMTP (931110.SGI/930416.SGI)
	for jsp id AA20451; Fri, 14 Apr 95 23:37:48 -0700
From: hawkeye@glia (Ken Keys)
Posted-Date: Fri, 14 Apr 95 23:43:21 PDT
Message-Id: <9504150643.AA13374@glia.biostr.washington.edu>
Received: by glia.biostr.washington.edu
  (911016.SGI/Eno-0.1) id AA13374; Fri, 14 Apr 95 23:43:21 -0700
Subject: non-blocking connect()
To: jsp@glia (Jeff Prothero)
Date: Fri, 14 Apr 95 23:43:21 PDT
X-Mailer: ELM [version 2.3 PL11]

You asked for it.

Here's Schmidt's article...

> >From noao!math.arizona.edu!news.Arizona.EDU!news.Cerritos.edu!nic-nac.CSU.net!usc!howland.reston.ans.net!news.moneng.mei.com!uwm.edu!psuvax1!news.ecn.bgu.edu!newspump.wustl.edu!bigfoot.wustl.edu!tango.cs.wustl.edu!not-for-mail Sun Jan 29 05:30:26 MST 1995
> Article: 36330 of comp.unix.solaris
> Path: noao!math.arizona.edu!news.Arizona.EDU!news.Cerritos.edu!nic-nac.CSU.net!usc!howland.reston.ans.net!news.moneng.mei.com!uwm.edu!psuvax1!news.ecn.bgu.edu!newspump.wustl.edu!bigfoot.wustl.edu!tango.cs.wustl.edu!not-for-mail
> From: schmidt@tango.cs.wustl.edu (Douglas C. Schmidt)
> Newsgroups: comp.protocols.tcp-ip,comp.unix.internals,comp.unix.solaris
> Subject: Non-blocking connects with BSD sockets
> Date: 29 Jan 1995 00:01:11 -0600
> Organization: Computer Science Department, Washington University.
> Lines: 103
> Distribution: inet
> Message-ID: <3gfav7$chp@tango.cs.wustl.edu>
> NNTP-Posting-Host: tango.cs.wustl.edu
> Xref: noao comp.protocols.tcp-ip:38935 comp.unix.internals:8839 comp.unix.solaris:36330
> 
> Hi,
> 
> 	I've been doing a lot of work with non-blocking connects over
> sockets on Solaris 2.3 lately.  Since it is difficult to find an in-depth
> discussion of non-blocking connects in the literature, I thought
> people might find it useful to see what I've learned.
> 
> 	Non-blocking connect()s are typically used in conjunction with
> select() or poll().  Here's the basic logic from the connection
> initiator's point of view:
> 
> 	1. The socket descriptor is created and set into non-blocking
> 	   mode: 
> 
> 		int fd = socket (PF_INET, SOCK_STREAM, 0);
> 		// Set fd into non-blocking mode via fcntl() or ioctl().
> 
> 	2. Initiate the connection:
> 
> 		if (connect (fd, ....) == -1)
> 		  {
> 		    if (errno == EINPROGRESS)
> 		      // Handle non-blocking connects (see below).
> 		    else
> 		      // An error has occurred, see errno for details. 
> 		  }
> 		else
> 		  // Connected immediately.
> 
> 	3. Sometimes, the connection completes synchronously (e.g.,
> 	   if the peer process is running on the same machine).
> 	   The more interesting cases occur when connect()
> 	   returns -1 and errno == EINPROGRESS.
> 	   
> 	   (BTW, does anyone know why BSD created another errno
> 	    value, rather than using EWOULDBLOCK, which seems much
> 	    more consistent?).
> 
> 	4. On Solaris 2.x, there appear to be several conditions that
> 	   must be tested for to ensure that a non-blocking connect()
> 	   has either succeeded or failed.  In general, when the
> 	   connection is successfully established, the descriptor
> 	   becomes "writable."  Likewise, when it fails (e.g., the
> 	   peer host has refused the connection) the descriptor
> 	   becomes "readable."  
> 	   
> 	   Oddly enough, I've seen cases where the non-blocking
> 	   connect becomes enabled for reading (rather than writing)
> 	   when it succeeds!  This seems to occur if data is
> 	   transmitted rapidly by the peer (perhaps even piggy-backed
> 	   along with the peer's SYN/ACK?). 
> 
> 	   Below is a pseudo-sketch of the algorithm that I'm
> 	   currently using to handle non-blocking connects.  In the
> 	   system I'm working on, everything is nicely encapsulated
> 	   via C++ wrappers and an object-oriented event
> 	   demultiplexing engine.  However, the basic logic is as
> 	   follows:
> 
>             // Check to see if connection has been established:
> 	    { 
> 	      fd_set rd_fds;
> 	      fd_set wr_fds;
> 	      struct timeval tv = ...
> 
> 	      FD_ZERO (&rd_fds);
> 	      FD_ZERO (&wr_fds);
> 	      FD_SET (fd, &wr_fds);
> 	      FD_SET (fd, &rd_fds);
> 
> 	      if (select (fd + 1, &rd_fds, &wr_fds, 0, &tv) <= 0)
> 		// timeout (0) or select() error (-1), bail out...
> 	      else if (FD_ISSET (fd, &wr_fds) || FD_ISSET (fd, &rd_fds))
> 		{
> 		  sockaddr_in addr;
> 		  int len = sizeof addr;
> 
> 		  // Check to see if we can determine our peer's address.
> 		  if (getpeername (fd, (sockaddr *) &addr, &len) == -1)
> 		    {
> 		      // Error, fd is not connected, errno contains reason.
> 		      // ...
> 		      close (fd);
> 		      return -1;
> 		    }
> 		  else // We've successfully connected on fd.
> 		    {
> 		      // ...
> 		    }
> 		}
> 	    }
> 
> 
> I'd be interested to know whether anyone has observed serious
> portability problems with non-blocking connections via sockets?
> 
> 	Doug
> 
> -- 
> Dr. Douglas C. Schmidt 			(schmidt@cs.wustl.edu)
> Department of Computer Science, Washington University
> St. Louis, MO 63130. Work #: (314) 935-7538; FAX #: (314) 935-7302
> http://www.cs.wustl.edu/~schmidt/
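
Steps 1 and 2 of the article above can be sketched in C roughly as
follows.  This is only a minimal sketch, assuming a POSIX system with
fcntl() and O_NONBLOCK and an IPv4 peer; the helper names
set_nonblocking() and start_connect() are made up for illustration and
do not come from the article.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: put fd into non-blocking mode using fcntl().
   Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}

/* Hypothetical helper: start connecting fd to an IPv4 address.
   Returns 1 if the connection completed immediately, 0 if it is
   still in progress (errno == EINPROGRESS), -1 on any other error. */
int start_connect(int fd, const struct sockaddr_in *addr)
{
    assert(fd >= 0 && addr != 0);
    if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0)
        return 1;               /* connected synchronously */
    if (errno == EINPROGRESS)
        return 0;               /* wait for it with select() or poll() */
    return -1;                  /* real failure, see errno */
}
```

A caller would create the socket, call set_nonblocking(), then
start_connect(), and only go on to select() or poll() when the result
is 0.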

And here's my reply...

> >From noao!CS.Arizona.EDU!news.Arizona.EDU!news.Cerritos.edu!nic-nac.CSU.net!usc!howland.reston.ans.net!pipex!uunet!news.u.washington.edu!tcp.com!hawkeye Tue Jan 31 14:56:27 MST 1995
> Article: 36362 of comp.unix.solaris
> Path: noao!CS.Arizona.EDU!news.Arizona.EDU!news.Cerritos.edu!nic-nac.CSU.net!usc!howland.reston.ans.net!pipex!uunet!news.u.washington.edu!tcp.com!hawkeye
> From: hawkeye@tcp.com (Hawkeye)
> Newsgroups: comp.protocols.tcp-ip,comp.unix.internals,comp.unix.solaris
> Subject: Re: Non-blocking connects with BSD sockets
> Date: 30 Jan 95 00:10:20 GMT
> Organization: The Commnet Project
> Lines: 83
> Distribution: inet
> Message-ID: 
> References: <3gfav7$chp@tango.cs.wustl.edu>
> NNTP-Posting-Host: eith.biostr.washington.edu
> X-Newsreader: NN version 6.5.0 #2 (NOV)
> Xref: noao comp.protocols.tcp-ip:38957 comp.unix.internals:8842 comp.unix.solaris:36362
> 
> schmidt@tango.cs.wustl.edu (Douglas C. Schmidt) writes:
> 
> [discussion about non-blocking connect() on Solaris 2.3 deleted]
> 
> >            // Check to see if connection has been established:
> >	    { 
> >	      fd_set rd_fds;
> >	      fd_set wr_fds;
> >	      struct timeval tv = ...
> 
> >	      FD_ZERO (&rd_fds);
> >	      FD_ZERO (&wr_fds);
> >	      FD_SET (fd, &wr_fds);
> >	      FD_SET (fd, &rd_fds);
> 
> >	      if (select (fd + 1, &rd_fds, &wr_fds, 0, &tv) <= 0)
> >		// timeout (0) or select() error (-1), bail out...
> >	      else if (FD_ISSET (fd, &wr_fds) || FD_ISSET (fd, &rd_fds))
> >		{
> >		  sockaddr_in addr;
> >		  int len = sizeof addr;
> 
> >		  // Check to see if we can determine our peer's address.
> >		  if (getpeername (fd, (sockaddr *) &addr, &len) == -1)
> >		    {
> >		      // Error, fd is not connected, errno contains reason.
> >		      // ...
> >		      close (fd);
> >		      return -1;
> >		    }
> 
> I've tried several methods of determining whether the connect() succeeded
> or failed.
> 
> 1) Try read(fd, buf, 0).  If it fails, the connect() failed, and you
>    can examine the errno from the read() to find out why.  But, this does
>    not work on all systems.
> 
> 2) Try getpeername(), as in the example above.  If it works, the connect()
>    worked.  If it fails with errno==ENOTCONN, you know the connect()
>    failed.  This method seems to work on all systems that support
>    nonblocking connect(), but there is no way to find out why it
>    failed.  And it seems a bit kludgy to me.
> 
> 3) Try a second connect().  If it succeeds (unlikely), use it.  More likely,
>    it will fail.  If errno==EISCONN, you know the first connect() worked.
>    Otherwise, use getsockopt(fd, SOL_SOCKET, SO_ERROR, (void*)&err, &len)
>    to find out why the first connect() failed.
> 
> Method 3 seems to be the best, since it is portable _and_ gives the
> reason for failure.  The only portability problem I've found is that
> HP/UX does not have the SO_ERROR option to getsockopt(), but this is
> easily tested with #ifdef SO_ERROR.  I've never had a complaint about
> this method from any user on any system on which EINPROGRESS is defined
> (and my software has run on a lot of platforms, including BSD, IRIX,
> SunOS, Solaris, Linux, OSF/1, AIX, XENIX, SCO UNIX, HP/UX, DYNIX, just
> off the top of my head).  Additionally, it works with the SOCKS proxy
> server (version 4.3 beta and later); the other methods do not.
> 
> >		  else // We've successfully connected on fd.
> >		    {
> >		      // ...
> >		    }
> >		}
> >	    }
> 
> 
> >I'd be interested to know whether anyone has observed serious
> >portability problems with non-blocking connections via sockets?
> 
> >	Doug
> 
> >-- 
> >Dr. Douglas C. Schmidt 			(schmidt@cs.wustl.edu)
> >Department of Computer Science, Washington University
> >St. Louis, MO 63130. Work #: (314) 935-7538; FAX #: (314) 935-7302
> >http://www.cs.wustl.edu/~schmidt/
> 
> -- 
> hawkeye@tcp.com     |  TinyFugue info:  http://www.tcp.com/hawkeye/tf.html
> Ken Keys            |  The latest version of tf is 3.4 alpha 16, available at:
> 1820 Cottonwood Av. |  ftp://tf.tcp.com/pub/tinyfugue/tf.34a16.tar.gz
> Carlsbad CA 92009   |  ftp://ftp.tcp.com/pub/mud/Clients/tf/tf.34a16.tar.gz
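
Method 3 from the reply above (a second connect(), then SO_ERROR)
might look something like this in C.  It is a rough sketch under
assumptions, not the code from any particular program: the helper name
finish_connect() is invented here, an IPv4 peer is assumed, and
SO_ERROR is used only where the headers define it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: call once select() says the socket is ready.
   Returns 0 if the socket is connected, -1 otherwise with errno set
   to the best available reason.  If errno comes back as EALREADY the
   first attempt is still pending and the caller should wait again. */
int finish_connect(int fd, const struct sockaddr_in *addr)
{
    int saved;

    assert(fd >= 0 && addr != 0);
    if (connect(fd, (const struct sockaddr *)addr, sizeof *addr) == 0)
        return 0;               /* second connect() succeeded outright */
    if (errno == EISCONN)
        return 0;               /* first connect() had already succeeded */
    saved = errno;              /* some systems report the failure here */
#ifdef SO_ERROR
    {
        int err = 0;
        socklen_t len = sizeof err;

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, (void *)&err, &len) == 0
            && err != 0)
            saved = err;        /* pending error from the first connect() */
    }
#endif
    errno = saved;
    return -1;
}
```

The errno from the second connect() is kept as a fallback because some
systems hand back the original failure there and leave SO_ERROR
cleared afterwards.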

Bernstein apparently got both those articles from Stevens, and then wrote
to me.  Here's my reply to him:

> From hawkeye  Sun Apr  9 22:48:33 1995
> From: hawkeye (Ken Keys)
> Posted-Date: Sun, 9 Apr 95 22:48:32 PDT
> Received-Date: Sun, 9 Apr 95 22:48:33 -0700
> Message-Id: <9504100548.AA01545@glia.biostr.washington.edu>
> Received: by glia.biostr.washington.edu
>   (911016.SGI/Eno-0.1) id AA01545; Sun, 9 Apr 95 22:48:33 -0700
> Subject: Re: non-blocking connect()
> To: djb@silverton.berkeley.edu (D. J. Bernstein)
> Date: Sun, 9 Apr 95 22:48:32 PDT
> Cc: hawkeye (Ken Keys)
> In-Reply-To: <199504030239.TAA12825@silverton.berkeley.edu>; from "D. J. Bernstein" at Apr 2, 95 7:39 pm
> X-Mailer: ELM [version 2.3 PL11]
> Status: RO
> 
> 
> > The question is what to do after select() returns writability.
> 
> > My favorite solution now is read(,&ch,0). If connect() failed, you get
> > the right errno through error slippage. If connect() succeeded, you get
> > a 0 return value on every system I've tested, though I could also
> > imagine a system returning -1/EWOULDBLOCK.
> > 
> > On the other hand, zero-length read()s are even more poorly documented
> > than errno slippage. Keys says this solution isn't portable; on what
> > systems does it fail?
> 
> The SunOS 5.3 read(2) man page explicitly states "If nbyte is zero,
> read() returns zero and has no other results."  I'm pretty sure I've
> seen this happen on other systems as well (I don't even think Solaris 2
> was around when I first discovered the problem with this method).
> 
> > Another possibility is to select() for readability. But this is wrong,
> > for the reason Schmidt points out, and not just under Solaris: the
> > connect() may have succeeded and data may have arrived before you had
> > time to do a select().
> 
> I think it is wrong for the opposite reason:  the connect() may have
> succeeded but the other end doesn't write anything, so your socket
> will never select() as readable.
> 
> > Another possibility is getpeername(). getpeername() will return
> > ENOTCONN if the socket is not connected. The difficulty here is finding
> > out how connect() failed. SO_ERROR is not portable (it showed up in BSD
> > 4.3, as I recall, and many vendors still haven't figured out socket
> > options), but at least it works. Error slippage will work as long as
> > getpeername() doesn't actively wipe out the socket error: we can just do
> > read(,&ch,1).
> 
> In my experience, getpeername() and SO_ERROR seemed to work best in general.
> SO_ERROR certainly seems to be intended for just this kind of use; the only
> problem is its lack of availability on all systems.  You could easily use
> it conditionally on #ifdef SO_ERROR, and otherwise use a more hackish method
> like read(,&ch,1).
> 
> > Another possibility is a second connect(). This seems to be a strictly
> > worse solution than getpeername(). It doubles network traffic if the
> > connection failed, and it doesn't show the error.
> 
> You can still use SO_ERROR to get the error, as in the getpeername()
> solution.
> 
> BTW, could you send me a copy of my article if you still have it?
> While you're at it, I'd like to see the Stevens and Schmidt articles
> too.
> 
> -- 
> hawkeye@tcp.com     |  TinyFugue info:  http://www.tcp.com/hawkeye/tf.html
> Ken Keys            |  TF 3.5 alpha 3 for UNIX and OS/2 is available at:
> 1820 Cottonwood Av. |  ftp://tf.tcp.com/pub/tinyfugue/tf.35a3.tar.gz
> Carlsbad CA 92009   |  ftp://ftp.tcp.com/pub/mud/Clients/tf/tf.35a3.tar.gz
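
The getpeername() approach discussed above, with SO_ERROR used
conditionally and a one-byte read() as the fallback, could be sketched
like this.  Treat it as an illustration under assumptions: the helper
name connect_result() is hypothetical, an IPv4 peer is assumed, and
the fallback relies on the error slippage behavior described in the
thread rather than on anything documented.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: returns 0 if fd is connected, otherwise -1
   with errno set to the reason, as far as the system will reveal it. */
int connect_result(int fd)
{
    struct sockaddr_in peer;
    socklen_t plen = sizeof peer;

    assert(fd >= 0);
    if (getpeername(fd, (struct sockaddr *)&peer, &plen) == 0)
        return 0;               /* we have a peer, so we are connected */
    if (errno != ENOTCONN)
        return -1;              /* getpeername() itself went wrong */
#ifdef SO_ERROR
    {
        int err = 0;
        socklen_t elen = sizeof err;

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, (void *)&err, &elen) == 0
            && err != 0)
            errno = err;        /* why the connect() failed */
        else
            errno = ENOTCONN;   /* no more specific reason available */
    }
#else
    {
        char ch;

        /* Error slippage: the failed read() should report the
           pending socket error in errno. */
        if (read(fd, &ch, 1) != -1)
            errno = ENOTCONN;
    }
#endif
    return -1;
}
```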

And finally, Bernstein's reply:

> From djb@silverton.berkeley.edu  Mon Apr 10 09:01:51 1995
> Posted-Date: Mon, 10 Apr 1995 09:13:17 -0700
> Received-Date: Mon, 10 Apr 95 09:01:51 -0700
> Date: Mon, 10 Apr 1995 09:13:17 -0700
> From: "D. J. Bernstein" 
> Message-Id: <199504101613.JAA06946@silverton.berkeley.edu>
> To: djb@silverton.berkeley.edu, hawkeye@glia.biostr.washington.edu
> Subject: Re: non-blocking connect()
> 
> > The SunOS 5.3 read(2) man page explicitly states "If nbyte is zero,
> > read() returns zero and has no other results."
> 
> And it's telling the truth?
> 
> > I think it is wrong for the opposite reason:
> 
> What I meant was, select() for readability, and take readability to mean
> error. The problem is that readability doesn't necessarily mean error.
> 
> ---Dan
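
Since readability doesn't necessarily mean error, one option is to
select() for writability only and then decide success or failure with
a separate check such as getpeername() or SO_ERROR.  A minimal sketch
of the waiting step, assuming a POSIX select(); the helper name
wait_writable() and the whole-second timeout are choices made for this
example only:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Hypothetical helper: wait until fd is writable or the timeout
   expires.  Returns 1 if writable, 0 on timeout, -1 if select()
   failed (errno is set, possibly to EINTR). */
int wait_writable(int fd, long timeout_sec)
{
    fd_set wr_fds;
    struct timeval tv;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&wr_fds);
    FD_SET(fd, &wr_fds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    n = select(fd + 1, NULL, &wr_fds, NULL, &tv);
    if (n <= 0)
        return n;               /* 0 = timeout, -1 = error */
    return FD_ISSET(fd, &wr_fds) ? 1 : 0;
}
```

A result of 1 only says the connect() has finished one way or the
other.  On many systems a failed connect() also reports the socket as
writable, so the success check still has to follow.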


-- 
hawkeye@tcp.com     |  TinyFugue info:  http://www.tcp.com/hawkeye/tf.html
Ken Keys            |  TF 3.5 alpha 3 for UNIX and OS/2 is available at:
1820 Cottonwood Av. |  ftp://tf.tcp.com/pub/tinyfugue/tf.35a3.tar.gz
Carlsbad CA 92009   |  ftp://ftp.tcp.com/pub/mud/Clients/tf/tf.35a3.tar.gz