tachikoma valentines

Another bad sort

Oops, I fell inactive on LJ again.  At least I’m still pretty active on Stack Overflow regardless of what you see here.

Recently a whimsical question there asked for insertion sort in various languages.  Pretty easy stuff, but I chose to answer with J again because my other favorite languages (Haskell, Perl, C) were already implemented.

(([<.{.@]),}.@]$:~[>.{.@])`[@.(0=#@])/  (Yes, it’s actually longer than the quicksort I mentioned a little while ago.)

NB. given two arguments, return the left argument
insert0 =: 4 : 'x'
insert0 =: [
NB. given two arguments, return the lesser of left and head of right
insert1min =: 4 : 'x <. {. y'
insert1min =: [ <. {.@]
NB. given two arguments, return the greater of left and head of right
insert1max =: [ >. {.@]
NB. recurse with insert1max and tail of right, and prepend insert1min
insert1 =: 4 : 'x insert1min y , (x insert1max y) insert }. y'
insert1 =: insert1min , insert1max insert }.@]
insert1 =: insert1min , }.@] insert~ insert1max
NB. if right has zero length, insert0, else insert1
NB. the recursive reference to insert in insert1 can be replaced by $:
insert =: insert1 ` insert0 @. (0 = #@])
NB. apply insert between all elements of argument
sort =: insert/
tachikoma valentines

Paludis crossbuilding

Gentoo has this terrific tool named crossdev.
dtlin@burnup:~$ su -c 'emerge crossdev'
… … …
dtlin@burnup:~$ crossdev
Usage: crossdev [options] --target TARGET

    --b, --binutils ver   Specify version of binutils to use
    --g, --gcc ver        Specify version of gcc to use
    --k, --kernel ver     Specify version of kernel headers to use
    --l, --libc ver       Specify version of libc to use
    -S, --stable          Use latest stable versions as default
    -C, --clean target    Uninstall specified target
    -P, --portage opts    Options to pass to emerge (see emerge(1))
    --with[out]-headers   Build C library headers before C compiler?
Stage Options:
    -s0, --stage0         Build just binutils
    -s1, --stage1         Also build a C compiler (no libc/C++)
    -s2, --stage2         Also build kernel headers
    -s3, --stage3         Also build the C library (no C++)
    -s4, --stage4         Also build a C++ compiler [default]
Extra Fun (must be run after above stages):
    --ex-only             Skip the stage steps above
    --ex-gcc              Build extra gcc targets (gcj/ada/etc...)
    --ex-gdb              Build a cross gdb
    --ex-insight          Build a cross insight
Target (-t) takes a tuple ARCH-VENDOR-OS-LIBC; see 'crossdev -t help'
(Yes, Gentoo can be a bit too… happy with colored terminals sometimes.)

If I want a cross-compiler targeting Windows, all I have to do is ensure a valid $PORTDIR_OVERLAY in /etc/make.conf, which all developers will have already, and run crossdev -t i586-mingw32msvc. Et voilà, a i586-mingw32msvc-gcc appears!

Except it only works with Gentoo’s default package manager, Portage.  On my system, it’s been replaced by Paludis, “the other package mangler”.  And there doesn’t seem to be documentation on how to manually take advantage of the ebuilds’ cross-building features manually.  Well, I think I can remedy that.

First, set up the Paludis equivalent of $PORTDIR_OVERLAY and fill it.
dtlin@burnup:~$ su -
burnup ~ # cat >/etc/paludis/repositories/local.conf
location = /var/paludis/repositories/local
master_repository = gentoo
format = ebuild
burnup ~ # mkdir -p /var/paludis/repositories/local/profiles
burnup ~ # cd /var/paludis/repositories/local
burnup local # mkdir cross-i586-mingw32msvc
burnup local # ls -d *-* >profiles/categories
burnup local # ln -fns ../../gentoo-portage/{sys-devel/{binutils,gcc,gdb},dev-util/{mingw-runtime,w32api}} cross-i586-mingwmsvc/
Next, build the stages in order.
burnup local # echo 'cross-i586-mingw32msvc/* crosscompile_opts: headers-only' >>/etc/paludis/use.conf
burnup local # echo 'cross-i586-mingw32msvc/* -* nocxx' >>/etc/paludis/use.conf
burnup local # paludis -i cross-i586-mingw32msvc/{w32api,mingw-runtime,gcc}
burnup local # sed -i '$d' /etc/paludis/use.conf
burnup local # sed -i '$d' /etc/paludis/use.conf
burnup local # paludis -i cross-i586-mingw32msvc/{w32api,mingw-runtime,gcc}
Most targets will use linux-headers instead of w32api and glibc/dietlibc/uclibc/klibc instead of mingw-runtime, but that’s the general idea.

Ah, I forgot…
burnup local # cat >>/etc/paludis/bashrc
case "${CATEGORY}/${PN}" in
    # cross-build
        CBUILD="${CHOST}" CTARGET="${CATEGORY#cross-}"
        CBUILD="${CHOST}" CTARGET="${CATEGORY#cross-}"
        CC="${CTARGET}-gcc" CXX="${CTARGET}-g++" LD="${CTARGET}-ld"
This may be necessary after stage 2 (but will probably fail before).
tachikoma valentines

Dynamic programming to the rescue

Project Euler, problem 76: “How many different ways can one hundred be written as a sum of at least two positive integers?”

Well, hmm.  There are $\sum_{i=1}^{99}\begin{pmatrix}99\\i\end{pmatrix}$=299-1=633825300114114700748351602687 ways of partitioning 100.  Eliminate duplicates (1+2=2+1) and we’ll have the answer… in a trillion years or so.  That’s not really a good solution.

A better solution would use a better method.  Does f(n,m)=$\left{\begin{matrix}0&,&n<0\\1&,&n=0\\\sum_{i=1}^{m}f(n-i,i)&,&\text{otherwise}\end{matrix}\right.$, g(n)=f(n,n)-1 look usable?  g(n) will take us through a n-depth call stack, forking out into O(2n) different function calls – just like before, the complexity is still too high to fit within Project Euler’s suggested "one-minute" maximum compute time.

p=repeat 1:[[sum[p!!(n-i)!!i|i<-[1..min m n]]|m<-[0..]]|n<-[1..]]
#include <stdio.h>
#define TOP 100
static int ways[TOP];
int main() {
    int i, j;
    for (i = 1; i < TOP; i++) {
        ways[i - 1]++;
        for (j = 0; i + j < TOP; j++)
            ways[j + i] += ways[j];
    printf("%d\n", ways[TOP - 1]);
J, Haskell, and C respectively, all running in under one second.

Both Haskell and C have been memoized manually, while J takes advantage of M., the auto-memoizer.  Now, ain’t that a handy language built-in?
tachikoma valentines

Project Euler

Project Euler is a countable collection of problems, requiring a mix of mathematical reasoning and machine computation to solve.  Every once in a while, I get the urge to try to grow the subset of problems I’ve completed.

At only 85 solved, I’m still solidly within Level 2, which is not enough to make it into their permanent rankings yet.

For the most part, I write my calculations in Haskell, falling back to C when I’m unable to express an algorithm in anything but imperative form.

However, there is this very interesting language out there going by the un-Google-able name of J:

999999 A. '0123456789'

5 solutions to 5 problems in 5 lines, much more succinctly stated than would be possible in any other language.  (Largest prime factor of 600851475143; sum of primes under two milion; millionth permutation of 10 digits by lexical sort; how many distinct numbers n of the form xy where 1<x,y≤100; sum of the first hundred rows of Pascal’s triangle.)

The title of last week’s post, A.~(1:I.~([:*./}:<:}.)"_1)@(i.@!@#A.]), is an implementation of bogosort, which is very much not recommended for actual use.

However, quicksort can be written as (($:@(<#[),(=#[),$:@(>#[)){.)^:(1<#); this reads as “when count greater than 1, recurse on items less than and greater than head, joined by items equal in the middle”.

Not that you’d should use that either; /:~ is the built-in sort.
tachikoma valentines

Sorting factoid #1

Anyone who has gone through any introductory-level computer science course has probably been exposed to a few common sorting algorithms.

#include <stdlib.h>
#include <string.h>

static void swap(int *a, int *b) {
    if (a != b) {
        int c = *a;
        *a = *b;
        *b = c;

void bubblesort(int *a, int l) {
    int i, j;

    for (i = l - 2; i >= 0; i--)
        for (j = i; j < l - 1 && a[j] > a[j + 1]; j++)
            swap(a + j, a + j + 1);

void selectionsort(int *a, int l) {
    int i, j, k;
    for (i = 0; i < l; i++) {
        for (j = (k = i) + 1; j < l; j++)
            if (a[j] < a[k])
                k = j;
        swap(a + i, a + k);

static void hsort_helper(int *a, int i, int l) {
    int j;

    for (j = 2 * i + 1; j < l; i = j, j = 2 * j + 1)
        if (a[i] < a[j])
            if (j + 1 < l && a[j] < a[j + 1])
                swap(a + i, a + ++j);
                swap(a + i, a + j);
        else if (j + 1 < l && a[i] < a[j + 1])
            swap(a + i, a + ++j);

void heapsort(int *a, int l) {
    int i;

    for (i = (l - 2) / 2; i >= 0; i--)
        hsort_helper(a, i, l);

    while (l-- > 0) {
        swap(a, a + l);
        hsort_helper(a, 0, l);

static void msort_helper(int *a, int *b, int l) {
    int i, j, k, m;

    switch (l) {
        case 1:
            a[0] = b[0];
        case 0:

    m = l / 2;
    msort_helper(b, a, m);
    msort_helper(b + m, a + m, l - m);
    for (i = 0, j = 0, k = m; i < l; i++)
        a[i] = b[j < m && !(k < l && b[j] > b[k]) ? j++ : k++];

void mergesort(int *a, int l) {
    int *b;

    if (l < 0)

    b = malloc(l * sizeof(int));
    memcpy(b, a, l * sizeof(int));
    msort_helper(a, b, l);

static int pivot(int *a, int l) {
    int i, j;

    for (i = j = 1; i < l; i++)
        if (a[i] <= a[0])
            swap(a + i, a + j++);

    swap(a, a + j - 1);

    return j;

void quicksort(int *a, int l) {
    int m;

    if (l <= 1)

    m = pivot(a, l);
    quicksort(a, m - 1);
    quicksort(a + m, l - m);

Sadly, unless you were paying a modicum of attention during algorithms, the equivalence between quicksort and binary search trees may have escaped you.

struct node {
    int value;
    struct node *left, *right;

void btreesort(int *a, int l) {
    int i;
    struct node *root = NULL, **ptr;

    for (i = 0; i < l; i++) {
        for (ptr = &root; *ptr;)
            ptr = a[i] < (*ptr)->value ? &(*ptr)->left : &(*ptr)->right;
        *ptr = malloc(sizeof(struct node));
        **ptr = (struct node){.value = a[i]};

    for (i = 0; i < l; i++) {
        struct node *node;
        for (ptr = &root; (*ptr)->left; ptr = &(*ptr)->left);
        a[i] = (*ptr)->value;
        node = (*ptr)->right;
        (*ptr) = node;

Well, it’s true: the two are simply slightly different views of the same idea.

QuicksortBinary tree
Choose a pivot, x.Insert a node, x.
If y is less than x, move it to the left of x.If y is less than x, put it in the left subtree of x.
If y is greater than x, move it to the right of x.If y is greater than x, put it in the right subtree of x.
Sort left of x, sort right of x.Left and right of x are binary search trees.
The resulting array is in sorted order.Follow an in-order traversal of the tree.

And now you, too, know.
tachikoma valentines

Everyone loves music, even those who can’t play

Having recently “acquired” the K-ON! music collections, I’m quite disappointed to be missing (scans of) the official score book with sheet music… it sounds quite amazing.

Ah well.  As I’m not a paying customer, I don’t have much room to complain.  Besides, I can still enjoy videos which some talented netizens have chosen to share:

けいおん!K-ON!OP Cagayake! GIRLS 叩いてみた

けいおん!K-ON!ED Don't say "lazy" 叩いてみた

けいおん!K-ON!ふわふわ時間 叩いてみた


【ギター】Don't say "lazy"【教えてみる】

tachikoma valentines


Verizon Northeast Router Failure

Our household switched over to FiOS® some while ago, for both TV and Internet.  The router hung a few times early on, but that was rare, and despite the lack of certain functionality (why does it answer BOOTP without being able to configure clients?) the Actiontec MI424-WR has been pretty nice.  (Hint: enable administration over port 992, apt-get install telnet-ssl, and you can dive right into the system internals, including a BusyBox shell environment.  The vendor has GPL-mandated sources reasonably accessible; unfortunately the management bits are tucked away inside a monolithic OpenRG binary.)

Today held the first real disappointment in their service for me.  Some major peering node went down, and bam half the Internet is inaccessible for over an hour, and horribly slow thereafter.  I’m still unable to create a stable connection to my corporate VPN even now.

Not that it explains my ¾years-long absence, but even a bad excuse is a good excuse to start back up again, isn’t it?
tachikoma valentines

What kind of a first-world country relies on overhead power lines?

One year ago, Storm Leaves Northeast Buried in Snow:

A pre-winter blend of snow, sleet and freezing rain cut visibility and iced over highways from the Great Lakes to New England, dumping up to a foot-and-a-half of snow, stranding air and road travelers and causing an airliner to skid off a runway.

The storm knocked out power to thousands of homes and businesses Sunday, including 137,000 in Pennsylvania, at least 10,000 in northern New England and 20,000 in eastern Canada, authorities reported.

I’d take snow over ice any day, though.  This past week, Ice storm cripples north-east US:

As many as 1m people have been left without power in the north-eastern US after one of the worst ice storms in a decade crippled the electricity grid.

States of emergency have been declared in Massachusetts, New Hampshire and in parts of Maine and New York state.

About 1.4 million homes and businesses across the four affected states were left without electricity on Friday morning after a widespread overnight ice storm coated power lines, pylons and trees.

<title/> is an overheard line from a conversation between people forced to temporarily relocate due to the icy storm.

I’ve actually returned home just now; power has finally been restored to my neighborhood (although reports are warning that as the ice melts, trees snapping back to their original positions could cause further outages).  Our house’s heat and running water relies on electricity, so my family grabbed our stuff and escaped to a hotel in Waltham.  Boston and its nearby surroundings were lucky enough to escape most of the damage.  Not so much for other population centers like Worchester, Fitchburg, and Lowell, nor the little town where we live.

At least we’re not in New Hampshire.  Their power grid might not be back in service until Thursday or Friday – one whole week later.

tachikoma valentines

Wow, I suck.

I can't say that I'll post more often, because I probably still won’t.  Sorry.

Google Reader - chibiryuu's shared items does get updated frequently, with stuff I find interesting, so check there for updates if you want to poll for my aliveness.  Also, if you publish a public RSS feed, thank you for making it easy to stalk keep in touch; if you don’t, shame on you for being anti-social.

Anyhow, I was pretty happy to find a graph of RPM over MPH for various gears of a 2006 Civic Si while randomly browsing: its data corroborates my experiences driving my car.

No, I don’t drive up to max RPMs.  (Okay, I did that sometimes, but not very often.  Besides, I’ve never brought 4th gear to max rev: that’s over 100mph!)  It means that I now have numeric quantification for an interesting pattern I’ve noticed:
gear         2000 rpm                3000 rpm
----  -----------------------  ---------------------
1st   14 km/h = 10 km/h + 43%  13 mph = 10 mph + 34%
2nd   22 km/h = 20 km/h + 12%  20 mph = 20 mph +  5%
3rd   31 km/h = 30 km/h +  5%  29 mph = 30 mph -  2%
4th   41 km/h = 40 km/h +  4%  38 mph = 40 mph -  3%
5th   51 km/h = 50 km/h +  4%  48 mph = 50 mph -  3%
6th   73 km/h = 70 km/h +  5%  68 mph = 70 mph -  2%
Rev-matching anything outside of 1st gear is so very simple.  It seemed far too coincidental when I discovered it experimentally, but the numbers back it up.

Notice how the gear ratios get tigher and tigher packed from 1st to 5th, and then there’s that big jump from 5th to 6th?  When I shift by ear (RPM is correlated to pitch of engine noise), that always throws me off.  Basically, shifting by a single gear usually involves a nice-sounding change in pitch — a major third or perfect fifth interval — except for that 5/6 transition, which is a dissonant tritone.
tachikoma valentines

Assembly review, part 1 of 3

None of this will be new to most of my readers… because I’m pretty sure I have 3 at most, and 2 of them should know this stuff already.  But!  This is for the sake of the third.

Say you’re given a disassembly.  For example, here’s some that might be outputted by objdump -d fun.o:
fun.o:     file format elf32-i386

Disassembly of section .text:

00000000 <fun>:
   0:        55                           push   %ebp
   1:        89 e5                        mov    %esp,%ebp
   3:        53                           push   %ebx
   4:        83 ec 08                     sub    $0x8,%esp
   7:        83 7d 08 01                  cmpl   $0x1,0x8(%ebp)
   b:        77 08                        ja     15 <fun+0x15>
   d:        8b 45 08                     mov    0x8(%ebp),%eax
  10:        89 45 f8                     mov    %eax,-0x8(%ebp)
  13:        eb 23                        jmp    38 <fun+0x38>
  15:        8b 45 08                     mov    0x8(%ebp),%eax
  18:        83 e8 02                     sub    $0x2,%eax
  1b:        89 04 24                     mov    %eax,(%esp)
  1e:        e8 fc ff ff ff               call   1f <fun+0x1f>
  23:        89 c3                        mov    %eax,%ebx
  25:        8b 45 08                     mov    0x8(%ebp),%eax
  28:        83 e8 01                     sub    $0x1,%eax
  2b:        89 04 24                     mov    %eax,(%esp)
  2e:        e8 fc ff ff ff               call   2f <fun+0x2f>
  33:        01 c3                        add    %eax,%ebx
  35:        89 5d f8                     mov    %ebx,-0x8(%ebp)
  38:        8b 45 f8                     mov    -0x8(%ebp),%eax
  3b:        83 c4 08                     add    $0x8,%esp
  3e:        5b                           pop    %ebx
  3f:        5d                           pop    %ebp
  40:        c3                           ret   
Let’s work step-by-step to figure out what this is.

Assumption: cdecl calling convention.

Assumption: fun is a function.  Stuff other than functions can and do end up in the .text section, but this seems to disassemble just fine, so it’s probably executable code.  So let’s start with a skeleton:
… fun(…);

Bytes 0-6 are the function prelude: saving the old %ebp, setting %ebp to the top of our frame, and growing our stack by 0x8.

Quick, what does the stack look like?
....^-- arguments are above the return address
....<-- return address
    v-- top of our frame -- %ebp now points here
....<-- the eight bytes of our frame
....<-- that we just reserved...
    ^-- bottom of our frame -- %esp now points here

Bytes 7-c compare the literal value 1 to the 4-byte value at an address 8 bytes above %ebp — in other words, the first argument — then if (unsigned)arg1 > 1, we jump to byte 15 (ja is the unsigned version of jg, jump-if-greater.)  Now we know that fun takes at least one argument, the first of which is an unsigned 4-byte value, so:
… fun(unsigned arg1, …);

Suppose we didn’t take the jump, meaning that (unsigned)arg1 ≤ 1.  Bytes d-14 copy arg1 to %eax, then %eax to the bottom of our frame — that’s a local variable.
… fun(unsigned arg1, …) {
    unsigned local1;
    if (arg1 <= 1)
        local1 = arg1;

Alright, so suppose that bytes b-c do jump us to 15.  Bytes 15-24 copy arg1 to %eax, subtract 2, push it on the stack, recursively call fun, then save the return value in %ebx.  Bytes 25-33 copy arg1 to %eax, subtract 1, push it on the stack, recursively call fun, then add the return value to %ebx.  Bytes 35-37 then take %ebx, the sum, and saves it to what we decided to call local1.  From this, we can gather that fun takes a single argument, since that’s all we push onto the stack before calling it, and returns a 4-byte value, since we treat %eax as one:
unsigned fun(unsigned arg1) {
    unsigned local1;
    if (arg1 <= 1)
        local1 = arg1;
        local1 = fun(arg1 - 2) + fun(arg1 - 1);

Bytes 38-40 copy local1 to %eax — the return register — then go through the standard teardown of our frame and return.  We’ve completed analyzing this function:
unsigned fun(unsigned arg1)
    unsigned local1;
    if (arg1 <= 1)
        local1 = arg1;
        local1 = fun(arg1 - 2) + fun(arg1 - 1);
    return local1;

That looks very familiar… almost like a naïve Fibonacci number generator!  In fact, if we write this:
unsigned fib(unsigned n) {
    if (n <= 1)
        return n;
    return fib(n - 2) + fib(n - 1);
and pass it through gcc -S, it outputs this:
        .file  "fib.c"
.globl fib
        .type  fib, @function
        pushl  %ebp
        movl   %esp, %ebp
        pushl  %ebx
        subl   $8, %esp
        cmpl   $1, 8(%ebp)
        ja     .L2
        movl   8(%ebp), %eax
        movl   %eax, -8(%ebp)
        jmp    .L4
        movl   8(%ebp), %eax
        subl   $2, %eax
        movl   %eax, (%esp)
        call   fib
        movl   %eax, %ebx
        movl   8(%ebp), %eax
        subl   $1, %eax
        movl   %eax, (%esp)
        call   fib
        addl   %eax, %ebx
        movl   %ebx, -8(%ebp)
        movl   -8(%ebp), %eax
        addl   $8, %esp
        popl   %ebx
        popl   %ebp
        .size  fib, .-fib
        .ident "GCC: (GNU) 4.2.4 (Gentoo 4.2.4 p1.0)"
        .section       .note.GNU-stack,"",@progbits
So we pretty much nailed it on the head.  (As to be expected from a simple example like this.)