A Better Erlang TCP listening pattern: addressing the fast packet loss problem

9:04 pm Erlang, Programming, Tools and Libraries, Tutorials

I’ve had mixed reactions to this when I’ve discussed it with people on IRC.  This may be well known to graybeard Erlang programmers.  I suppose it’s also possible that I’m wrong, though I’ve talked to several people I respect, and more than one of them has suggested that they were already aware of this problem.  If I’m wrong, please let me know; I’m open to the possibility that there’s a better answer that I just don’t know about.  I’ve never seen it discussed on the web, at least. Update: Serge Aleynikov points out that this TrapExit tutorial documents this approach.

I think this is probably real.

I believe there is a significant defect in the idiomatic listener pattern as discussed on most Erlang websites, as provided by most Erlang example code, and as found in many Erlang applications.  Once noticed, the defect is relatively easy to repair without significant impact on surrounding code, and I have added functionality to my ScUtil library to handle this automatically under the name standard_listener.

The fundamental problem is essentially a form of race condition.  The idiomatic listener is, roughly:

do_listen(Port, Opts, Handler) ->

    case gen_tcp:listen(Port, Opts) of

        { ok, ListeningSocket } ->
            listen_loop(ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.


listen_loop(ListeningSocket, Handler) ->

    case gen_tcp:accept(ListeningSocket) of

        { ok, ConnectedSocket } ->
            spawn(fun() -> Handler(ConnectedSocket) end),
            listen_loop(ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.

Now consider the case of a server under very heavy load.  Further, consider that the listening socket is opened either {active,true} or {active,once}, which is true in the vast bulk of Erlang network applications, meaning that packets are delivered automatically to the socket owner.  The general pattern is that the listening socket accepts a connection, spawns a handling process, passes the connected socket to that handling process, and the handling process takes ownership of the socket.

The problem is that it takes time for all of that to happen, and Erlang doesn’t specify or let you control its timeslicing behavior (nor should it).  Because active sockets are managed by a standalone process, if the connecting client is fast and the network is fast, the first packet (or even the first several, under extreme circumstances) can be delivered before the socket has been taken over by the handling PID, meaning that its contents are dispatched to the wrong process, with no indication of where they were meant to go.  This invalidates connections and fills a non-discarding mailbox, which is a potentially serious memory leak (especially given that Erlang’s response to out-of-memory conditions is to abort the entire VM).
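To make the failure mode concrete: an active-mode socket delivers incoming data as {tcp, Socket, Data} messages to whichever process currently owns the socket.  A minimal sketch (my illustration, not library code; loopback only) showing that delivery goes to the owner, which during the handoff gap is still the accepting process:

```erlang
-module(active_demo).
-export([run/0]).

%% Shows that {active, true} data arrives as {tcp, Socket, Data} messages
%% in the mailbox of the socket's current owner.  In the listener pattern
%% above, that owner is still the accepting process until the handoff
%% completes, which is exactly the window where packets can go astray.
run() ->
    {ok, LSock} = gen_tcp:listen(0, [binary, {active, false}, {reuseaddr, true}]),
    {ok, Port}  = inet:port(LSock),
    {ok, C}     = gen_tcp:connect("localhost", Port, [binary]),
    {ok, S}     = gen_tcp:accept(LSock),
    inet:setopts(S, [{active, true}]),   % we own S, so its messages come to us
    ok = gen_tcp:send(C, <<"hello">>),
    receive
        {tcp, S, Data} -> Data           % delivered to the owner's mailbox
    after 2000 ->
        timeout
    end.
```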

Obviously, this is intolerable.  There are better answers, though, than switching to {active,false}.  One suggestion I heard was to pre-spawn handlers in order to reduce the gap time, but that doesn’t solve the problem; it just makes it less likely.

The approach that I took is to lie.  standard_listener takes the following steps to resolve the problem:

  1. Add the default {active,true} to the inet options list, if it isn’t already present.
  2. Strip the {active,Foo} out of the inet options list, and store it as ActiveStatus.
  3. Add {active,false} to the inet options list, and use that options list to open the listener.
  4. When spawning handler processes, pass a shunt as the starting function, taking the real handling function and the real ActiveStatus as arguments.
  5. The shunt sets the real ActiveStatus from inside the handler process, at which point the socket begins delivering messages.

This neatly closes the problem.  A free, MIT-licensed implementation can be found in ScUtil beginning in version 96.  A simplified, corrected example follows for immediate reference; the version in ScUtil is more feature-complete.

do_listen(Port, SocketOptions, Handler) ->

    ActiveStatus = case proplists:get_value(active, SocketOptions) of
        undefined -> true;
        Other     -> Other
    end,

    FixedOpts = proplists:delete(active, SocketOptions)
             ++ [{active, false}],

    case gen_tcp:listen(Port, FixedOpts) of

        { ok, ListeningSocket } ->
            listen_loop(ActiveStatus, ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.


listen_loop(ActiveStatus, ListeningSocket, Handler) ->

    case gen_tcp:accept(ListeningSocket) of

        { ok, ConnectedSocket } ->
            Pid = spawn(?MODULE, shunt, [ActiveStatus, ConnectedSocket, Handler]),

            % ownership can only be transferred by the current owner,
            % so the accepting process performs the handoff itself
            gen_tcp:controlling_process(ConnectedSocket, Pid),
            Pid ! handoff_done,

            listen_loop(ActiveStatus, ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.


shunt(ActiveStatus, ConnectedSocket, Handler) ->

    % wait for the acceptor to finish the ownership transfer before going
    % active, so no packet can be delivered to the wrong process
    receive handoff_done -> ok end,

    inet:setopts(ConnectedSocket, [{active, ActiveStatus}]),
    Handler(ConnectedSocket).

11 Responses

  1. Scott Parish Says:

    Another approach is to run accept() in the worker process. So your server will prestart a worker which will block on accept(). Once accept() returns, this worker sends a message back out to spawn a new worker and then goes on to handle the newly created socket.

    This is described in “A fast web server demonstrating some undocumented Erlang features” and used by webservers such as crary
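    A minimal sketch of the accept-in-worker pattern Scott describes (illustrative only; the handler fun and socket options are my assumptions, not code from crary).  Because the worker itself calls accept/1, it already owns the connected socket, so there is no ownership handoff and no race window:

```erlang
-module(prespawn).
-export([start/2, acceptor/2]).

%% Sketch of the pre-spawned-acceptor pattern: each worker calls accept/1
%% itself, so the worker owns the connected socket from the moment it
%% exists -- no controlling_process/2 handoff, hence no race.
start(Port, Handler) ->
    {ok, LSock} = gen_tcp:listen(Port, [binary, {active, false}, {reuseaddr, true}]),
    spawn(?MODULE, acceptor, [LSock, Handler]),
    {ok, LSock}.

acceptor(LSock, Handler) ->
    case gen_tcp:accept(LSock) of
        {ok, Sock} ->
            % replace ourselves first, so the listener keeps accepting
            spawn(?MODULE, acceptor, [LSock, Handler]),
            % we called accept/1, so we already own Sock; safe to go active
            inet:setopts(Sock, [{active, true}]),
            Handler(Sock);
        {error, closed} ->
            ok
    end.
```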

  2. Steve Vinoski Says:

    Unless I’m missing something, if active is originally specified as anything but false, then the handler will never receive any messages because you never make it the controlling process.

  3. John Haugeland Says:

    Mr Parish: Yes, I suppose that could work.

  4. John Haugeland Says:

    Mr Vinoski: I was under the impression that setting inet options implicitly made another process the controlling process. It’s worth noting that the code does, in fact, work.

    However, for the sake of writing explicit code, I will update both the example and the library code.

  5. Steve Vinoski Says:

    @John: it doesn’t work for me. If I set {active, false} then my handler works fine, but for once or true settings messages go to the accepting process, which I verified by doing a timed receive at the top of the listen_loop and putting a timeout on accept to force loop recursion. When active is once or true, the listening loop gets the message intended for the handler, while the handler’s receive times out.

    The inet:setopts documentation says nothing about transferring socket ownership, and a quick look through the R12B-5 sources doesn’t seem to show any ownership transfer in inet:setopts or anything it calls.
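    For reference, a sketch of the documented handoff sequence (my illustration; the handler fun is arbitrary): the current owner calls gen_tcp:controlling_process/2, and the new owner only goes active after being told the transfer is complete:

```erlang
-module(handoff).
-export([accept_and_handoff/2]).

%% The accepting process owns the socket after accept/1, so it must be
%% the one to call gen_tcp:controlling_process/2.  The handler waits for
%% an explicit go-ahead before going active, so no packet can arrive
%% while the socket still belongs to the acceptor.
accept_and_handoff(LSock, Handler) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    Pid = spawn(fun() ->
              receive
                  {go, S} ->
                      inet:setopts(S, [{active, true}]),
                      Handler(S)
              end
          end),
    ok = gen_tcp:controlling_process(Sock, Pid),  % we own Sock, so this succeeds
    Pid ! {go, Sock},
    {ok, Pid}.
```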

  6. John Haugeland Says:

    How odd. It works fine in the shell at this end.

    Either way, fixed in Version 99. You’re added to the thanks table in Version 100.

  7. Steve Vinoski Says:

    @John: the fix you made has some problems. First, it doesn’t compile because controlling_process comes from the gen_tcp module, so you need to prefix the call with that module name. Second, the process that controls the socket is the only one that can call controlling_process to change ownership, so it can’t be done where you put it.

    Two other existing issues are: 1) closing the listening socket sends the accepting process into a tight CPU-eating loop because the closed condition is treated just as any other error, looping back to do another accept, and 2) it would be nice to be able to pass port 0 to the listener to have it open an ephemeral port and then have that port number returned to the caller (this helps make it easier to test, for one thing).

    Find a patch for all these issues below.

    Also, here’s a test program. Apply the patch, then comment out the call to gen_tcp:controlling_process/2 to simulate svn version 98 of your code, then compile everything and run the test.

    It will work for the first case (the {active, false} case) only. Restore the call to gen_tcp:controlling_process/2 and recompile, then retry the test. All cases will then work fine. This is why I think your original code never worked. Here’s the test code, feel free to use it as you like.

    -export([test/0, test/1]).
    -author("Steve Vinoski").

    test() ->
        lists:map(fun test/1, [false, once, true]).

    test(Active) ->
        Self = self(),
        {ok, Pid, Port} = scutil:standard_listener(
            fun(S, Opts) ->
                case proplists:get_value(active, Opts) of
                    false ->
                        Msg = gen_tcp:recv(S, 14, 2000),
                        io:format("active(false) handler received ~p~n", [Msg]);
                    A ->
                        receive
                            Msg ->
                                io:format("active(~p) handler msg: ~p~n", [A, Msg])
                        after 2000 ->
                            io:format("active(~p) handler timed out~n", [A])
                        end
                end,
                Self ! done
            end, 0, [{packet, raw}, binary, {active, Active}]),
        {ok, P} = gen_tcp:connect("localhost", Port, [{packet, raw}, binary]),
        gen_tcp:send(P, <<"hello, world!\n">>),  % any 14-byte payload, to match the recv above
        receive
            done -> Pid ! terminate
        after 10000 -> timeout
        end.

    And here’s the patch:

    Index: scutil.erl
    --- scutil.erl (revision 100)
    +++ scutil.erl (working copy)
    @@ -2662,7 +2662,7 @@

    -%% @spec standard_listener(Handler, Port, SocketOptions) -> { ok, WorkerPid } | { error, E }
    +%% @spec standard_listener(Handler, Port, SocketOptions) -> { ok, WorkerPid, ListeningPort } | { error, E }

    %% @doc {@section Network} Listens on a socket and manages the fast packet loss problem.
    @@ -2688,7 +2688,11 @@
    case gen_tcp:listen(Port, FixedOptions) of

    { ok, ListeningSocket } ->
    - { ok, spawn(?MODULE, standard_listener_controller, [Handler, Port, FixedOptions, ListeningSocket, ActiveStatus, 0]) };
    + ListeningPort = case Port of
    + 0 -> {ok, LP} = inet:port(ListeningSocket), LP;
    + _ -> Port
    + end,
    + { ok, spawn(?MODULE, standard_listener_controller, [Handler, Port, FixedOptions, ListeningSocket, ActiveStatus, 0]), ListeningPort };

    { error, E } ->
    { error, E }
    @@ -2741,9 +2745,13 @@

    { ok, ConnectedSocket } ->
    Controller ! serviced,
    - spawn(?MODULE, standard_listener_shunt, [Handler, Port, FixedOptions, ConnectedSocket, ActiveStatus]),
    + Pid = spawn(?MODULE, standard_listener_shunt, [Handler, Port, FixedOptions, ConnectedSocket, ActiveStatus]),
    + gen_tcp:controlling_process(ConnectedSocket, Pid),
    standard_listener_accept_loop(Handler, Port, FixedOptions, ListeningSocket, ActiveStatus, Controller);

    + { error, closed } ->
    + closed;
    { error, _E } ->
    standard_listener_accept_loop(Handler, Port, FixedOptions, ListeningSocket, ActiveStatus, Controller)

    @@ -2757,7 +2765,6 @@

    standard_listener_shunt(Handler, Port, FixedOptions, ConnectedSocket, ActiveStatus) ->

    - controlling_process(ConnectedSocket, self()),
    CollectedOptions = proplists:delete(active, FixedOptions) ++ [{active, ActiveStatus}, {from_port, Port}],

    case ActiveStatus of

  8. Maxim Says:

    To address the described issue (which is real) in my Erlang implementation I removed the possibility of changing the controlling process completely. There is just no controlling_process() call. The process which accepts the socket has to handle it all the way down to closure.

    I think it is intolerable for such a robust system as Erlang to drop a few packets now and then.

  9. John Haugeland Says:

    Maxim: I agree and I disagree.

    I disagree on grounds that it is possible to write software in Erlang without this vulnerability. Primarily I consider this a documentation defect; someone who knew about this problem would not have an issue.

    However, I agree, this is an unacceptable current threat level. That’s why I wrote a library fix. :D ~

  10. John Haugeland Says:

    Steve Vinoski’s fixes are now implemented. Not sure why it took me so long.

  11. links for 2010-05-23 « Donghai Ma Says:

    [...] A Better Erlang TCP listening pattern: addressing the fast packet loss problem | John Haugeland is F… (tags: erlang networking tcp programming pattern protocol) [...]
