Mean, Median, Mode and Histograph (Statistics in Erlang Part 2)
August 10, 2008 5:01 pm Erlang, Math, Programming, Statistics, Tools and Libraries
Mean is a complex topic, and is covered in Statistics in Erlang part 1.
Median and mode are less complex. Note that mode reports its results as a list, because a list can have several modes. Also, mode is not strictly numeric: it works for mixed-type lists too (you can take the mode of a list of atoms, for example).
Mode is really just a reduction of the results of histograph, which is similarly open to arbitrary type list contents. Median relies on a toy function, even_or_odd, which is provided here.
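To make the "mode is a reduction of the histogram" idea concrete, here is a minimal standalone sketch. It is not the ScUtil code: the module name mode_sketch and the functions histogram/1 and modes/1 are made up for illustration, and it counts with a map via maps:update_with/4, which assumes OTP 19 or later.

```erlang
-module(mode_sketch).
-export([histogram/1, modes/1]).

%% Count occurrences of each distinct term.
%% Returns [{Term, Count}] in Erlang term order.
histogram(List) when is_list(List) ->
    Counts = lists:foldl(
        fun(Item, Acc) ->
            %% Bump the count, starting at 1 for a term not seen before.
            maps:update_with(Item, fun(N) -> N + 1 end, 1, Acc)
        end,
        #{},
        List),
    lists:sort(maps:to_list(Counts)).

%% Reduce the histogram to every term that shares the highest count.
modes([]) -> [];
modes(List) when is_list(List) ->
    Histo = histogram(List),
    Max = lists:max([Count || {_Term, Count} <- Histo]),
    [Term || {Term, Count} <- Histo, Count =:= Max].
```

With this sketch, mode_sketch:modes([a,b,a,c,b]) gives [a,b] because both atoms occur twice, and mode_sketch:modes([1, foo, "x", foo]) gives [foo], which shows that nothing here depends on the elements being numbers.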
As usual, this code is part of the ScUtil library. ScUtil is free and MIT licensed, because the GPL is evil.
This closes issue 100. This closes issue 105. This closes issue 134. This closes issue 135.

%% Count how often each distinct term occurs in List.
%% Returns [{Term, Count}], sorted by term.
histograph([]) -> [];
histograph(List) when is_list(List) ->
    [Head|Tail] = lists:sort(List),
    histo_count(Tail, Head, 1, []).

%% Walk the sorted list, counting runs of identical terms.
histo_count([], Current, Count, Work) ->
    lists:reverse([{Current,Count}|Work]);
histo_count([Current|Tail], Current, Count, Work) ->
    histo_count(Tail, Current, Count+1, Work);
histo_count([New|Tail], Current, Count, Work) ->
    histo_count(Tail, New, 1, [{Current,Count}|Work]).

%% Toy helper: classify an integer as even or odd.
even_or_odd(Num) when is_integer(Num) ->
    if
        Num band 1 == 0 -> even;
        true            -> odd
    end.

%% Middle value of the sorted list; for an even-length list,
%% the average of the two middle values.
median(List) when is_list(List) ->
    SList = lists:sort(List),
    Length = length(SList),
    case even_or_odd(Length) of
        even ->
            [A,B] = lists:sublist(SList, Length div 2, 2),
            (A+B)/2;
        odd ->
            lists:nth((Length+1) div 2, SList)
    end.

%% All most-frequent items in List; [] for an empty list.
mode([]) -> [];
mode(List) when is_list(List) ->
    %% Sort the histogram by count, highest first.
    mode_front(lists:reverse(lists:keysort(2, scutil:histograph(List)))).

%% Collect items from the front while they share the top frequency.
mode_front([{Item,Freq}|Tail]) ->
    mode_front(Tail, Freq, [Item]).

mode_front([{Item,Freq}|Tail], Freq, Results) ->
    mode_front(Tail, Freq, [Item|Results]);
mode_front([{_Item,_Freq}|_Tail], _Better, Results) ->
    Results;
mode_front([], _Freq, Results) ->
    Results.
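
As a worked check of the median logic above, here is a small standalone sketch of the same idea. It is an illustration under assumptions, not the ScUtil implementation: the module name median_sketch is hypothetical, it uses rem and div in place of even_or_odd, and it adds a guard that rejects the empty list.

```erlang
-module(median_sketch).
-export([median/1]).

%% Median of a non-empty list of numbers.
median(List) when is_list(List), List =/= [] ->
    Sorted = lists:sort(List),
    Len = length(Sorted),
    Mid = Len div 2,
    case Len rem 2 of
        0 ->
            %% Even length: average the two middle elements
            %% (positions Mid and Mid + 1, counting from 1).
            [A, B] = lists:sublist(Sorted, Mid, 2),
            (A + B) / 2;
        1 ->
            %% Odd length: take the single middle element.
            lists:nth(Mid + 1, Sorted)
    end.
```

For example, median_sketch:median([3,1,2]) sorts to [1,2,3] and returns 2, while median_sketch:median([4,1,3,2]) sorts to [1,2,3,4] and returns (2+3)/2 = 2.5. Note that the even case always returns a float, because / is floating-point division in Erlang.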
