Monday, March 03, 2008

How To Tally A Sorted List


Of all the topics I wish I had more time and more resources to talk about here on this blog, my biggest regret is that I’ve spent so little time with programming, specifically list processing.

There’s something magical about the way the simple structure of a list can be used for almost any data structure imaginable. There’s something magical about the way tools developed for manipulating lists of one kind can be re-used in totally unexpected ways with other lists.

List processing typically implies recursive algorithms but since I do most of my work with low-level embedded systems or high-level scripting languages, stack space typically doesn’t support recursive implementations. However, lists are just as powerful when used with iteration as they are with recursion.

Today’s post isn’t about list processing as a concept or about why I haven’t written more about list processing. Today’s post is a kind of companion piece to my post about bubble sorting.

Someday I hope to do a whole series of posts implementing my list processing library in JavaScript. JavaScript matrices are powerful, but my list processing routines let data get shared reasonably seamlessly with scripting environments like Microsoft Word and Excel macros.

Also someday I hope to do a post explaining why I haven’t written more about programming. I’m waiting to think of some way to write it that is even passably interesting.

Because I work in many different environments, in the course of a year I code and re-code the same algorithms two or three times. (Plus, I actually enjoy re-coding routines because I can experiment with different techniques.)

This version of my ‘tally’ function is one of the most concise I’ve ever coded.

*

The basic idea behind this routine is that when you have a list of dozens or hundreds of elements just sorting the list doesn’t impose a useful organization to the data.

Instead of something like this:

    {0,0,0,1,1,1,1,1,1,2,2,2,3,3,3,3,3,3,3 ...}

You want something like this:

    {003:0, 003:2, 006:1, 007:3 ...}

It’s a very cool feeling to take a list of what appears to be hundreds of, for instance, digit patterns and tally them and realize you only have four or five in very specific ratios.

(‘Tally’ outputs an unsorted list of counted elements. Since the count is fixed length and appended as a header, the tally list simply can be sorted to create an ordered list as above. But sometimes you want an up sort, sometimes a down, so ‘Tally’ leaves that for afters.)

This code is written in Texas Instruments Basic, which is pretty generic. The only custom function I used is td(n,c) which inputs a numeric value, n, and outputs a string that is of length c.

    td(5,3) = “005”

‘@’ is the TI comment character for inline comments.



. . . . . . . . . .



Tally(aList)

Func

Local c,r,q,x,t,p

@ c = count of input list
@ r = return list of tallied values
@ q = count of tallied values
@ x = holding variable for item being tallied
@ t = tally value for the item being tallied
@ p = pointer into the input list

@ this gets the number of input elements

Dim(aList) -> c

@ first we account for an empty input list
@ we just return the empty list

If c=0 : Return aList

@ if the list isn’t empty we prepare
@ a return list big enough for the
@ worst case tally

NewList(c) -> r
0 -> q

@ we get ready to go by setting
@ the pointer to the start
@ of the input list

1 -> p

Loop

    @ here in the outer loop we set up the element
    @ to be tallied (we dereference it because
    @ the TI GetType routine needs a variable)

    aList[p] -> x
    1 -> t
    p+1 -> p

    @ the inner loop looks through the list
    @ finding the end of the list or
    @ finding a dissimilar item or
    @ tallying up the same items

    Loop

        @ these must be separate lines because
        @ if p does point beyond the list end
        @ indexing into aList with p is bad

        If p>c : Exit
        If aList[p] ≠ x : Exit

        @ we know we’re not out of the list
        @ and we know the next element
        @ is not dissimilar so it must be similar
        @ tally it, return to the inner top

        t+1 -> t
        p+1 -> p

        EndLoop

    @ here we’re either outside the list just
    @ saving the final element or we’re
    @ saving the current tally

    If GetType(x) ≠ “STR” : String(x) -> x
    q+1 -> q
    td(t,3) & “:” & x -> r[q]

    @ if we’re outside the list
    @ we exit the outer loop
    @ return the tallies and
    @ end the function

    If p>c : Exit

    @ if we’re not outside the list
    @ we return to the outer loop’s top

    EndLoop

@ we only need the part of the
@ output list with values in it

Return Left(r,q)

EndFunc












No comments: