A Better Erlang TCP listening pattern: addressing the fast packet loss problem

Erlang, Programming, Tools and Libraries, Tutorials 11 Comments

I’ve had mixed reactions to this when I’ve discussed it with people on IRC.  This may be well known to oldbear Erlang programmers.  I suppose it’s also possible that I’m wrong, though I’ve talked to several people I respect, and more than one of them have suggested that they were already aware of this problem.  If I’m wrong, please let me know; I’m open to the possibility that there’s a better answer that I just don’t know about.  I’ve never seen it discussed on the web, at least. Update: Serge Aleynikov points out that this TrapExit tutorial documents this approach.

I think this is probably real.

I believe there is a significant defect in the idiomatic listener pattern as discussed on most Erlang websites, as provided by most Erlang example code, and as found in many Erlang applications.  This defect is easily repaired once noticed, without significant impact on surrounding code, and I have added functionality to my ScUtil library to handle this automatically under the name standard_listener.

The fundamental problem is essentially a form of race condition.  The idiomatic listener is, roughly:

do_listen(Port, Opts, Handler) ->

    case gen_tcp:listen(Port, Opts) of

        { ok, ListeningSocket } ->
            listen_loop(ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.

listen_loop(ListeningSocket, Handler) ->

    case gen_tcp:accept(ListeningSocket) of

        { ok, ConnectedSocket } ->
            spawn(fun() -> Handler(ConnectedSocket) end),
            listen_loop(ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.

Now consider the case of a server under very heavy load.  Further, consider that the listening socket is opened either {active,true} or {active,once}, which is true in the vast bulk of Erlang network applications, meaning that packets are delivered automatically to the socket owner.  The general pattern is that the listening socket accepts a connection, spawns a handling process, passes the connected socket to that handling process, and the handling process takes ownership of the socket.

The problem is that all of that takes time, and Erlang doesn’t specify or allow you to control its timeslicing behavior (as well it should not).  Because active sockets are managed by a standalone process, if the connecting client is fast and the network is fast, the first packet (or even the first several, under extreme circumstances) can be delivered before the socket has been taken over by the handling PID, meaning that its contents are dispatched to the wrong process, with no indication of where they were meant to go.  This invalidates connections and fills a non-discarding mailbox, which is a potentially serious memory leak (especially given that Erlang’s response to out-of-memory conditions is to abort the entire VM).
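To see the failure concretely, here is a minimal self-contained sketch (the module name and structure are mine, for illustration only): the client sends immediately, the accepted socket is {active,true}, and no handoff to a handler ever happens, so the first packet lands in the acceptor’s own mailbox.

```erlang
-module(race_demo).
-export([run/0]).

%% Illustration only: the accepted socket inherits {active,true} from the
%% listener, the client sends as soon as it connects, and we never hand the
%% socket to a handler -- so the packet is delivered to the acceptor itself.
run() ->
    {ok, Listen} = gen_tcp:listen(0, [binary, {active, true}, {reuseaddr, true}]),
    {ok, Port}   = inet:port(Listen),
    spawn(fun() ->
        {ok, Client} = gen_tcp:connect("localhost", Port, [binary]),
        gen_tcp:send(Client, <<"hello">>)
    end),
    {ok, Accepted} = gen_tcp:accept(Listen),
    timer:sleep(100),   % stands in for the gap before controlling_process/2
    receive
        {tcp, Accepted, Data} ->
            {misdelivered_to_acceptor, Data}   % a handler was supposed to get this
    after 1000 ->
        no_data
    end.
```

Running race_demo:run() returns {misdelivered_to_acceptor, <<"hello">>} — data that was meant for a handler process sitting in the acceptor’s mailbox.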

Obviously, this is intolerable.  There are better answers, though, than switching to {active,false}.  One suggestion I heard was to pre-spawn handlers in order to reduce the gap time, but that doesn’t solve the problem; it just makes it less likely.

The approach that I took is to lie.  standard_listener takes the following steps to resolve the problem:

  1. Add the default {active,true} to the inet options list, if it isn’t already present.
  2. Strip the {active,Foo} out of the inet options list, and store Foo as ActiveStatus.
  3. Add {active,false} to the inet options list, and use that options list to open the listener.
  4. When spawning handler processes, pass a shunt as the starting function, taking the real handling function and the real ActiveStatus as arguments.
  5. The shunt sets the real ActiveStatus from inside the handler process, at which point the socket begins delivering messages.

This neatly closes the problem.  A free, MIT-licensed implementation can be found in ScUtil beginning with version 96.  A simplified, corrected example follows for immediate reference; the version in ScUtil is more feature-complete.

do_listen(Port, SocketOptions, Handler) ->

    ActiveStatus = case proplists:get_value(active, SocketOptions) of
        undefined -> true;
        Other     -> Other
    end,

    FixedOpts = proplists:delete(active, SocketOptions)
             ++ [{active, false}],

    case gen_tcp:listen(Port, FixedOpts) of

        { ok, ListeningSocket } ->
            listen_loop(ActiveStatus, ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.

listen_loop(ActiveStatus, ListeningSocket, Handler) ->

    case gen_tcp:accept(ListeningSocket) of

        { ok, ConnectedSocket } ->
            Pid = spawn(?MODULE, shunt, [ActiveStatus, ConnectedSocket, Handler]),
            gen_tcp:controlling_process(ConnectedSocket, Pid),
            Pid ! socket_ready,
            listen_loop(ActiveStatus, ListeningSocket, Handler);

        { error, E } ->
            { error, E }

    end.

shunt(ActiveStatus, ConnectedSocket, Handler) ->

    % block until the acceptor has made this process the socket owner;
    % only then is it safe to restore the real active status
    receive socket_ready -> ok end,
    inet:setopts(ConnectedSocket, [{active, ActiveStatus}]),
    Handler(ConnectedSocket).

Making significant strides towards documenting ScUtil

Erlang, Programming, Tools and Libraries 6 Comments

My Erlang library ScUtil has been public at the ScUtil page for some time now.  Recently, I’ve been working on documenting it.  It’s about half documented in its current state.

Here’s the prototype ScUtil documentation.  I’m open to commentary.

Prototype: EEP18 Considered Harmful: The problems with Erlang to JSON term translation

ECMA / Javascript, ECMAScript, Erlang, General Interest, Programming, Rants, Tools and Libraries, Web and Web Standards 5 Comments

THIS IS ONLY HALF WRITTEN.  I have been sitting on this post, waiting for the mood to finish it, for months; because EEP18 is now being treated as likely to be implemented, I am publishing the half-written version immediately, because it exposes many (though not all) of the serious, irreconcilable problems with EEP18.

On the mailing list, people are actively trying to bring Erlang up to snuff with regards to web standards.  One of the more unfortunate choices being discussed is JSON as a data notation.  JSON, unfortunately, does not actually map to Erlang in a useful way.  Joe Armstrong has gone as far as to suggest BIFs, which are decidedly unrealistic as well as unnecessary.  My goal is to create a JSON handling library.  However, the mailing list is beginning to put momentum behind an alternative proposal which is currently presented in BIF form.

This post explains why my approach is different.  Many of the issues herein are discussed by the tabled EEP (EEP 18, “JSON BIFs” by Richard O’Keefe), but some are not, and some of these issues are accepted when I believe they should not be.  It is my position that EEP 18 is unacceptably dangerous.  I will explain why.

Read the rest…

Yay for Varun! Open Source Inbound

ECMA / Javascript, ECMAScript, General Interest, Programming, Tools and Libraries No Comments

My boss’ boss, Varun, is letting me open source some of the work I’m doing at Kayako.  I’m not supposed to talk about the interesting three until they’re ready for release, but I can tell you that a JavaScript ISO8601 implementation is among them, and that they’re all going to be MIT licensed, no GPL contamination.

More news as I get my butt in gear and finish the libraries in question.  But, yay Varun!

Moments, Skewness and Kurtosis (Statistics in Erlang part 8)

Erlang, Math, Programming, Statistics, Tools and Libraries, Tutorials No Comments

So, it was pointed out to me that I had the central moments, but not the plain moments – i.e., the ones not normalized against the input’s average.  Also, it was pointed out that most people don’t know that kurtosis and skewness are related to the central moments.  Furthermore, it turns out (and I didn’t know this) that there are in fact meaningful uses of floating-point exponents in moments.

So, I implemented moments, I replaced my central moments implementation, and I gave name wrappers for skewness and kurtosis to make them easier to identify.

This closes issues 169, 170, 171, 172 and 173.  As usual, this code is part of the ScUtil library, which is free and MIT licensed, because the GPL is evil.
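For reference, the k-th plain moment and k-th central moment of a list, in the population form the code below computes (dividing by N):

```latex
m_k = \frac{1}{N}\sum_{i=1}^{N} x_i^{\,k}
\qquad
\mu_k = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^{k}
```

Skewness is built from \mu_3 and kurtosis from \mu_4, which is the relationship to the central moments mentioned above.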

moment(List, N) when is_list(List), is_number(N) ->
    scutil:arithmetic_mean( [ math:pow(Item, N) || Item <- List ] ).

moments(List) ->
    moments(List, [2,3,4]).

moments(List, Moments) when is_list(Moments) ->
    [ moment(List, M) || M <- Moments ].

central_moment(List, N) when is_list(List), is_number(N) ->
    ListAMean = scutil:arithmetic_mean(List),
    scutil:arithmetic_mean( [ math:pow(Item-ListAMean, N) || Item <- List ] ).

central_moments(List) ->
    central_moments(List, [2,3,4]).

central_moments(List, Moments) when is_list(Moments) ->
    [ central_moment(List, M) || M <- Moments ].

skewness(List) -> central_moment(List, 3).
kurtosis(List) -> central_moment(List, 4).
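As a quick sanity check, here is a self-contained re-implementation of the two formulas inline (module and helper names are mine, not the ScUtil API), applied to a small list:

```erlang
-module(moments_check).
-export([run/0]).

mean(L) -> lists:sum(L) / length(L).

%% plain moments are means of powers; central moments subtract the mean first
moment(L, K)         -> mean([ math:pow(X, K)        || X <- L ]).
central_moment(L, K) -> M = mean(L),
                        mean([ math:pow(X - M, K)    || X <- L ]).

run() ->
    L = [1, 2, 3, 4],
    { [ moment(L, K)         || K <- [2, 3, 4] ],
      [ central_moment(L, K) || K <- [2, 3, 4] ] }.
```

For [1,2,3,4] the plain moments come out as [7.5, 25.0, 88.5] and the central moments as [1.25, 0.0, 2.5625]; the zero third central moment reflects the list’s symmetry about its mean.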

Spearman’s Rank Correlation Coefficient (Statistics in Erlang part 7)

Erlang, Math, Programming, Statistics, Tools and Libraries No Comments

Spearman’s Rank Correlation Coefficient – usually just called The Spearman Correlation, sometimes Spearman’s Rank or Spearman’s Rho – is a method of determining the similarity between two numeric sets (we also go over the Pearson and the Kendall).  If you don’t know what a correlation is, start with Statistics in Erlang part 5, which covers the basic idea of correlations.  A good math-based explanation is available at David M Lane’s Hyperstat.

In English: for two lists of length N, you take the square of the difference between the ranks of each matched pair in the two lists, sum them, multiply by six, divide by N cubed minus N, and subtract the result from one.  This provides the rank correlation squared, which is over the interval [-1, 1].  However, with Spearman you usually use the rank squared, so we leave it that way instead of providing the root; the tuple’s atom label makes it clear that’s happening.
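In conventional notation, with d_i the difference between the ranks of the i-th matched pair, that procedure is:

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{N} d_i^{\,2}}{N^3 - N}
```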

Sorry again about the screwy formatting; I’m just getting things to fit on the blog.  The stuff in the library is better formatted.

spearman_correlation(List1, List2) when
    is_list(List1), is_list(List2),
    length(List1) /= length(List2) -> {error, lists_must_be_same_length};

spearman_correlation(List1, List2) when is_list(List1), is_list(List2) ->

    {TR1,_} = lists:unzip(ordered_ranks_of(List1)),
    {TR2,_} = lists:unzip(ordered_ranks_of(List2)),

    Numerator   = 6 * lists:sum([ (D1-D2)*(D1-D2)
                               || {D1,D2} <- lists:zip(TR1,TR2) ]),
    Denominator = math:pow(length(List1),3)-length(List1),

    {rsquared,1-(Numerator/Denominator)}.

Test data is available at Geography Fieldwork.

1> X = [50,175,270,375,425,580,710,790,890,980].
[50,175,270,375,425,580,710,790,890,980]
2> Y = [1.80,1.20,2.00,1.00,1.00,1.20,0.80,0.60,1.00,0.85].
[1.80000, 1.20000, 2.00000, 1.00000, 1.00000,
 1.20000, 0.800000, 0.600000, 1.00000, 0.850000]
3> scutil:spearman_correlation(X,Y).
{rsquared,-0.730303}

This code is part of the ScUtil library.  This code is free and MIT licensed, because the GPL is evil.  This code uses the ordered ranks code from Part 4.  This closes issue 139.

The Pearson Correlation Coefficient (Statistics in Erlang part 6)

Erlang, Math, Programming, Statistics, Tools and Libraries 2 Comments

I went over much of the concept of correlations in Part 5; if you don’t know what statistical correlations are, you should read part 5 first.

The Pearson Correlation Coefficient is one method of generating the correlation of sets.  You can get a good math-based explanation at David Lane’s Hyperstat.

In English, basically, you take two numeric lists of the same length, X and Y, then calculate five sums and a length from them:

  1. The sum of the items in List X (SumX)
  2. The sum of the items in List Y (SumY)
  3. The sum of the squares of the items in List X (SumXX)
  4. The sum of the squares of the items in List Y (SumYY)
  5. The sum of the products of the matched items in Lists X and Y (SumXY)
  6. The length of the lists, which should be the same (N)
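In conventional notation, using the names from the list above, the coefficient being computed is:

```latex
r = \frac{N \cdot SumXY - SumX \cdot SumY}
         {\sqrt{\left(N \cdot SumXX - SumX^2\right)\left(N \cdot SumYY - SumY^2\right)}}
```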

Using those, you can construct a formula which is honestly best expressed in code:

pearson_correlation(List1, List2) when is_list(List1), is_list(List2) ->

    SumXY = lists:sum([ A*B || {A,B} <- lists:zip(List1,List2) ]),

    SumX  = lists:sum(List1),
    SumY  = lists:sum(List2),

    SumXX = lists:sum([L*L || L<-List1]),
    SumYY = lists:sum([L*L || L<-List2]),

    N     = length(List1),

    Numer = (N*SumXY) - (SumX * SumY),
    Denom = math:sqrt(((N*SumXX)-(SumX*SumX)) * ((N*SumYY)-(SumY*SumY))),

    {r, (Numer/Denom)}.

This code is part of the ScUtil Library.  The ScUtil library is free and MIT licensed, because the GPL is evil.

1> X = [1,3,5,6,8,9,6,4,3,2].      
[1,3,5,6,8,9,6,4,3,2]
2> Y = [2,5,6,6,7,7,5,3,1,1].  
[2,5,6,6,7,7,5,3,1,1]
3> scutil:pearson_correlation(X,Y).
{r,0.854706}

Verification of test data is available at Changing Minds.  This closes issue 140.

Statistical Correlations (Statistics in Erlang part 5)

Erlang, Math, Programming, Statistics, Tools and Libraries 2 Comments

Since most of my readers are programmers, I’m going to explain this in programmer-speak.  Also, it’s damned hard to find a non-math explanation of this stuff.

The general idea with correlations is simple: we want to measure how much changes in one set affect the other set – that is, their correlation.  Correlations aren’t about the actual values involved in the two columns, so much as how they seem to affect one another.

A simple example is the edge size and volume of a cube.  As the edge size goes up, so will the volume.  To that end, if you make a two-column table where one column is edge size and the other volume (or, for that matter, face size works too), and then the rows are just a bunch of example data, you would want to see a “perfect correlation” – without fail, the change in column 1 should show a perfect match for changes in column 2.  For a perfect match like that, you get a correlation of 1.0.  Similarly, if you measure the average density of a fixed number of particles in that space, as the edge size goes up, the average density goes down; you would see a “perfect inverse” correlation, or a value of -1.0.  If you measure two values which aren’t correlated – where values in one column don’t seem to affect values in the other – you should get a value at or near zero.

The purpose of the correlation coefficient is to tell you how strongly two columns are correlated, as well as whether their correlation is positive (similar) or negative (inverse).  You can then use such measurements to determine whether sets of measurements are related.

Consider, for example, a table of height and weight among a distribution of people.  One expects a strong correlation, but not perfect; some people are over- and under-weight for their height.  The closer that measurement comes to 1, the less the outside factors matter.  The closer that measurement comes to zero, the less dominant the measured term is in the measured result.  In practical terms, if you see (for example) a stronger correlation between users of Medicine X and outbreaks of Symptom Y than in the general population, it is worth investigating whether Medicine X has Symptom Y as a long-term ramification (correlation alone cannot establish causation).

The way this is achieved is through ranking, which was covered in Statistics in Erlang part 4.  The general idea is straightforward: just make a list of your values’ ranks from most significant (usually largest) to least, starting counting at 1.  Do that for both columns, then sort by the first column (keep the columns correlated of course).  At that point, what you actually do to measure the correlation varies from method to method, but the general landscape of things should now be apparent: we’re just measuring how much the difference in rank varies when sorted by one column.
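To make the ranking idea concrete, here is a small self-contained sketch (module and function names are mine, not ScUtil’s) that ranks the cube example from above and compares the two rank columns using the Spearman-style formula covered in part 7.  A perfect monotone relationship comes out as exactly 1.0:

```erlang
-module(corr_sketch).
-export([run/0]).

%% Rank each value against a descending sort (largest value = rank 1),
%% keeping the list's original order.  Ties are ignored for simplicity.
ranks(L) ->
    Sorted = lists:reverse(lists:sort(L)),
    [ length(lists:takewhile(fun(S) -> S =/= X end, Sorted)) + 1 || X <- L ].

%% Rank comparison of the cube example: edge length versus volume.
run() ->
    Edges   = [1, 2, 3, 4, 5],
    Volumes = [1, 8, 27, 64, 125],
    N  = length(Edges),
    Ds = [ (R1 - R2) * (R1 - R2)
         || {R1, R2} <- lists:zip(ranks(Edges), ranks(Volumes)) ],
    1 - (6 * lists:sum(Ds)) / (math:pow(N, 3) - N).
```

Since edge length and volume always rise together, every rank difference is zero and run() returns 1.0, the “perfect correlation” described above.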

There are several ways to get such correlations.  We’re going to go over the big three – the Pearson Correlation Coefficient, the Kendall Tau Rank Correlation, and the Spearman Rank Correlation Coefficient.  Each one is covered in one of the upcoming tutorials: Pearson Correlation in Erlang (part 6), Spearman Correlation in Erlang (part 7) and Kendall Correlation in Erlang (part 8).

Ranks Of, Ordered Ranks Of and Tied Ranks Of (Statistics in Erlang part 4)

Erlang, Math, Programming, Statistics, Tools and Libraries 8 Comments

The upcoming Statistics in Erlang post, on correlations, requires some groundwork tools.  At this time, I don’t actually know the proper statistical names for these functions, and in fact such names probably don’t exist, so I just made some up; these functions are treated by the texts describing correlations as an inline part of the process, but the process is much easier to understand if they’re separate, so I separated them.

Witness ranks_of(), ordered_ranks_of() and tied_ranks_of().  On their own, their importance isn’t tremendously apparent, but when we get to correlations, it’ll suddenly become obvious.

  • ranks_of()
    • Ranks of takes a list and reverse sorts it, then returns a series of 2-tuples whose first member is the “rank” or 1-offset ordinal of the item’s list position, and whose second member is the original list value.  Erlang sort is used to order the terms, so they don’t actually need to be strictly numeric, though the function is pretty meaningless otherwise.  Soon, a version will be provided that takes a predicate for custom sorting.
    • Tied (equal) values are listed in order as separate ranks.  If you want those ties to be expressed as average values, use tied_ranks_of() below.  Because there are no ties, ranks are expressed as integers.
    • Values are presented with highest rank (rank 1, highest literal value) sorted to the front.
    • 1> scutil:ranks_of([3,8,22,8,535]).
      [{1,535},{2,22},{3,8},{4,8},{5,3}]
  • tied_ranks_of()
    • Very similar to ranks_of(), except that ranks are presented as tied when values are equivalent.  Notice how with ranks_of(), the two 8s are ranked 3 and 4; here, they are both ranked 3.5.
    • Because ties can create half-values, ranks are presented as floats.
    • 86> scutil:tied_ranks_of([3,8,22,8,535]).
      [{1.00000,535},{2.00000,22},{3.50000,8},{3.50000,8},{5.00000,3}]
  • ordered_ranks_of()
    • Correlations frequently need to be column cross-sorted.  The easiest way to deal with this is to provide a ranking function which provides the rankings in the column’s original order.  ordered_ranks_of() is tied_ranks_of(), except that it does not alter the list ordering.
    • 3> scutil:ordered_ranks_of([3,8,22,8,535]).
      [{5.00000,3},{3.50000,8},{2.00000,22},{3.50000,8},{1.00000,535}]

These functions are important building blocks to the correlations, coming up in parts 5 and 6.

As usual, this code is part of the ScUtil library. ScUtil is free and MIT licensed, because the GPL is evil.

This code doesn’t need any of the prior parts 1, 2, or 3 of Statistics in Erlang, but I’m linking them to make them easy to find.

This closes issue 129.

Sorry about the screwy formatting; it’s hard fitting code into a blog.  The formatting in the module is much nicer.

ranks_of(List) when is_list(List) ->
    lists:zip(lists:seq(1,length(List)),lists:reverse(lists:sort(List))).

tied_ranks_of(List) ->
    tied_rank_worker(ranks_of(List), [], no_prev_value).

tied_add_prev(Work, {FoundAt, NewValue}) ->
    lists:duplicate(
        length(FoundAt),
        {lists:sum(FoundAt) / length(FoundAt), NewValue}
    ) ++ Work.

tied_rank_worker([], Work, PrevValue) ->
    lists:reverse(tied_add_prev(Work, PrevValue));

tied_rank_worker([Item|Remainder], Work, PrevValue) ->
    case PrevValue of
        no_prev_value ->
            {BaseRank,BaseVal} = Item,
            tied_rank_worker(Remainder, Work, {[BaseRank],BaseVal});
        {FoundAt,OldVal} ->
            case Item of
                {Id,OldVal} ->
                    tied_rank_worker(
                        Remainder,
                        Work,
                        {[Id]++FoundAt,OldVal});
                {Id,NewVal} ->
                    tied_rank_worker(Remainder,
                        tied_add_prev(Work, PrevValue),
                        {[Id],NewVal})
            end
    end.

ordered_ranks_of(List) when is_list(List) ->
    ordered_ranks_of(List, tied_ranks_of(List), []).

ordered_ranks_of([], [], Work) ->
    lists:reverse(Work);

ordered_ranks_of([Front|Rem], Ranks, Work) ->
    {value,Item} = lists:keysearch(Front,2,Ranks),
    {IRank,Front} = Item,
    ordered_ranks_of(Rem, Ranks--[Item], [{IRank,Front}]++Work).

Standard Deviation, Root Mean Square and the Central Moments (Statistics in Erlang part 3)

Erlang, Math, Programming, Statistics, Tools and Libraries 4 Comments

These are standard statistical tools for measuring differences, drift and error within sets, as well as Kth moments about the mean.  Central moments are also the building blocks of skewness and kurtosis, which are covered in a later post.  Root mean square is particularly useful as a measure of set error.  Standard deviation is great for figuring out how much spread/differentiation there is within a set.
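For reference, the two measures, in the population form the code below uses (dividing by N):

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
\qquad
\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{\,2}}
```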

This code requires the arithmetic mean stuff from Statistics in Erlang part 1.  There’s interesting, unrelated stuff in Part 2.

As usual, this code is part of the ScUtil library.  ScUtil is free and MIT licensed, because the GPL is evil.

This implements issues 128 and 138, and finishes implementing issue 119.

std_deviation(Values) when is_list(Values) ->
    Mean = arithmetic_mean(Values),
    math:sqrt(arithmetic_mean([ (Val-Mean)*(Val-Mean) || Val <- Values ])).

root_mean_square(List) when is_list(List) ->
    math:sqrt(arithmetic_mean([ Val*Val || Val <- List ])).

central_moments(Items) ->
    Cnt  = length(Items),
    Mean = lists:sum(Items) / Cnt,
    TFun = fun(X) ->
        Base = X-Mean, B2=Base*Base, B3=B2*Base, B4=B3*Base, {B2,B3,B4}
    end,
    collapse_central_moments(Cnt, [ TFun(I) || I <- Items ], {0,0,0}).

collapse_central_moments(N, [], {WM2, WM3, WM4}) ->
    { WM2/N, WM3/N, WM4/N };

collapse_central_moments(N, [{I2,I3,I4}|Rem], {WM2, WM3, WM4}) ->
    collapse_central_moments(N, Rem, {I2+WM2, I3+WM3, I4+WM4}).
