
In Praise of Night-Hacking, or at least in hopes to understand

Photo credit: pantulis 
I code every day. It's what I do, and I'm thankful. But there is a problem with coding from 9am to 5pm.

Other people.

Yeah, you're coding for their benefit, and it is well and good and helpful (and secretly, an ego boost at times when you otherwise feel humbled by the tasks in front of you) to serve as tech support for other people, even when and sometimes especially when the task seems otherwise mundane to you.

But, when you have all that state in your head, the last thing you want to do is fix someone else's thing. It has had me nearly to tears before. When there's nobody else around, you don't have to worry about the house of cards you have in your head. That's when you can relax and get into it, man. You know, like a coding machine?

But that's not it. Or at least not all of it.

At least that's what a geek with a hat has to say. In essence, when you're tired, parts of your brain shut down, and those can be the "squirrel!" aspects that get in the way of the deep concentration you need to build, understand and modify the house of data cards in your head. And, with your focus and the lights of the computer screen, you can keep going until you drop off, and when you do, it's the deep, relaxing sleep that exhaustion brings, not the light, brittle sleep, sometimes broken by insomnia, that you get when you're sleeping because it's time. (I once heard a professor opine that humans were really built for 25-hour days. I do believe that some.)

A big problem, of course, is that if the rest of your life doesn't run on hacker's hours, parts of your life fall away when you start living on hacker's hours. Something to sleep on, so to speak.


Can You Identify The Bug?


use 5.010 ;
use strict ;
use warnings ;
use Carp ;
use Data::Dumper ;
use Getopt::Long ;
use Pod::Usage ;

use lib '/home/ltl/lib' ;
use SecGenTools::Run 'get_run_data' ;
use SecGenTools::Accession 'get_accession_data' ;
use SecGenTools::Request 'get_request_data' ;

my $config      = config() ;
my $run_stub    = get_run_data( $config->{ run_id } ) ;
my $run         = $run_stub->{ $config->{ run_id } } ;
my $run_samples = get_run_sample( $config->{ run_id } ) ;
my %conversion =
    map { $_, $run->{ max_regions } + 1 - $_ } 1 .. $run->{ max_regions } ;

for my $region ( 1 .. $run->{ max_regions } ) {
    my $conversion = $conversion{ $region } ;
    for my $sample_id ( sort keys %$run_samples ) {
        my $sample = $run_samples->{ $sample_id } ;
        next if $sample->{ region } != $region ;
        $sample->{ region } = $conversion ;
        update_run_sample( $sample ) ;
        say join "\n\t" , $sample_id ,
            $sample->{ accession_id } ,
            $region ,
            $conversion ;
        }
    }

exit 1 ;

This is the core of the code. config() puts all the configuration options into a hash and returns it. get_run_data(), get_request_data(), and get_accession_data() are wrappers around database calls.

An accession is an individual sample to be analyzed. A request is one or more accessions. A run is one or more requests, assigning the accessions to one of the regions. And, because this is a mapping between a data structure and a physical object, it is possible for the user to start at the wrong end. This program is supposed to take all the requests and reverse the regions. That is, for an 8-region run, all the accessions in region 1 should be in region 8 and vice versa.
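For an 8-region run, the conversion map the code above builds is just the reversal; a small illustration with the region count hard-coded:

```perl
use strict ;
use warnings ;

my $max_regions = 8 ;
my %conversion = map { $_, $max_regions + 1 - $_ } 1 .. $max_regions ;

# %conversion is ( 1 => 8, 2 => 7, 3 => 6, 4 => 5,
#                  5 => 4, 6 => 3, 7 => 2, 8 => 1 )
```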

This code puts everything in regions 1-4. It took me a morning to figure out why, although part of that was restoring the database to the way it was before.

It took me a while to get it, so stop here if you're still looking.

In a word, persistence.

$sample is a reference to the data in $run_samples. In my mind, my $sample = $run_samples->{ $sample_id } made a local copy, but it didn't. That meant that, when the code hits region 5, it works on the accessions just moved over from region 4, too, and so on.
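The trap, boiled down to a few lines with made-up sample data:

```perl
use strict ;
use warnings ;

my %samples = ( s1 => { region => 4 } ) ;

my $sample = $samples{ s1 } ;   # copies the reference, not the hash
$sample->{ region } = 5 ;       # so this writes through to %samples

print $samples{ s1 }{ region }, "\n" ;   # 5, not 4

# a shallow copy breaks the aliasing:
my %copy = %{ $samples{ s1 } } ;
$copy{ region } = 6 ;
print $samples{ s1 }{ region }, "\n" ;   # still 5
```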

And, in working around that, I decided that looping on region was useless except to massage the retentive control freak in me, and once I convinced myself of that, things got better. So, like the Monster at the End of the Book, the bug is me.


Lots of Small Rants

I got three things on Black Friday, and the first one shouldn't even count. It's the e-book version of Programming Android from O'Reilly. Time was, when I had a technical question, O'Reilly books were the first place I'd look, but now my first checks are Google and StackOverflow, so this is a bit of a nostalgia move, but there's enough of a blank spot in my mind to justify it. Now, just to spend some time and generate something.

The other two are cheap pieces of USB hardware from Newegg. The first is a USB Wifi dongle (shown right), or, as it reads, "802.11b b/g/n Wireless Adatper". I have to say, I like it better so far than the Netgear that has been going back and forth between a few Windows boxes I have. Plus, I like the form. It's nice and small.

The other is also an inexpensive piece of Rosewill kit, a USB dongle connecting a remote to Windows Media Center. Right now, it's nonsensically connected to the laptop I'm writing this on, but I could easily see myself liking this. I have a wireless Logitech keyboard with a gone-AWOL mouse and don't do much with whatever PC I have in the bedroom for lack of a pointer, and while I have been using Boxee for bedroom media PC duties, I could see myself accepting Windows Media Center because of this one. Problem is, the PC I needed the USB dongle for is the same one I needed the remote for, and because it's been repurposed from HTPC to desktop, as long as I need the WiFi dongle, I do not need the remote and vice versa.

I've been doing more with git, as previously mentioned today, and I think I have finally hit the distinction for bare repositories. This has been kicking me most of the day, and it's a bit of "duh" I just had to work through to get into my head. I've been making non-bare repositories and wanting them to behave like bare repositories, then tearing my hair out. I'll have to work out aliases or scripts for this stuff.
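The "duh" boils down to: a bare repository has no working tree, so there is nothing for a push to clobber, while the non-bare clone is where the editing happens. A sketch, with throwaway paths:

```shell
tmp=$(mktemp -d)

# bare: just the repository innards, no working tree, safe push target
git init --bare "$tmp/project.git"

# non-bare: a working clone where files actually get edited
git clone "$tmp/project.git" "$tmp/work"
cd "$tmp/work"
echo "hello" > README
git add README
git -c user.name=me -c user.email=me@example.com commit -m "first commit"

# pushing into the bare repo never stomps on anyone's working tree
git push origin HEAD
```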


Book Review - Version Control by Example, or "Thank you, @eric_sink"

For too long, I've worked in environments without version control. There have been backups, either real or virtual, and for the web part there's the Google cache (which saved my butt once as the webmaster for my LUG), but version control has always been something that I know I should be doing but we've never done. (The one exception was doing temp work at the car parts company. There, we used code reviews and Synergy. Not the good Synergy, which gives you the ability to use one keyboard and mouse across several computers in software, but the bad one, the ancient, slow and ponderous configuration management system, which is similar enough to version control that I cannot express the distinction.)

So, here's something where I know I should be doing it, but I don't really know how to do it. I've heard enough about Git and have signed up with Github, which put it before Subversion and CVS and the rest of the choices. But, unfortunately, I hadn't heard enough about Git to really know what the heck I should be doing with it.

Then I saw a post by my internet friend Funnel Fiasco about Version Control by Example by Eric Sink, saying how good it was. And, as it turns out, Eric is much more about teaching people how not to do stupid things than he is about sales, so you can download digital copies, browse through it online, and he'll even send you a copy free.

I've had it in my bag for several months without cracking it open, which was stupid of me. Right now, I'm in a decent place at work, with little sitting on a tight deadline, so I have time to go through the book and start actually learning from it, and wow! I'm getting it! It makes sense. The core of the book is the same examples expressed in Subversion, Mercurial, Veracity and Git, so it serves as a Rosetta Stone: if you know one, you have a step toward using the others if that's what your workplace or project uses.

Thank you, Eric Sink!


Speaking of git, my script for automating the mounting and unmounting of remote file systems via sshfs is now on github.

Questions about Git

I'm reading Eric Sink's Version Control By Example, starting to hit the examples, and I'm finding that to be a bit of a problem.

Eric's example is one dev in Birmingham, UK and one in Birmingham, AL, writing C code and committing to a server in Cleveland. Right now, the dev team for my office is two coders separated by all of five feet, writing Perl code for one of two servers, one being the web server. I'm easing into git, as you might guess from the reading material, and like most working environments I've been in, our version control system has been "copy a backup before you muck with the file", which I know is dumb and useless (especially in the lab, where we rely on RAID for data protection and thus don't really have backups). I'm kinda taking the lead on this, having been burned enough to want to protect myself, but I don't really know much of it.

I've been using git so far to keep track of changes locally, just doing git init and git commit within the local file system. Is this enough? Or do we really, truly need a server?

And since the code sits in /home/user/web/cgi-bin/foo/bar/blee/quuz.cgi or /home/user/bin/, I'm wondering: should those be the directories to run git in? Or should I do the work in /home/varlogrant/dev/ and copy over to /home/user/bin when I'm happy with it and it has been committed?

I have a bigger question on how to take a large selection of interconnected Perl modules and make them 1) test driven in the real, chromatic-approved way, 2) working with git in a useful way, and 3) usable from ~/lib on several dissimilar systems. I have a cheap hack on #3, but if I break apart and reassemble the modules, I can probably do it cleaner and smarter.


'cat' considered not useless

Consider this:
cat /path/to/my/file.log |
grep filter |
sed 'something, I do not really use sed a lot' |
grep filter_again |
lp

Many would write this off as "useless use of cat". I don't think so. I mean, functionally, sure. But it isn't all functional.
cat /path/to/my/file.log | # read data from file
grep filter |              # run a filter on file contents
sed 'something, I do not really use sed a lot' | # change file contents
grep filter_again |        # run another filter on contents
lp                         # send to printer

Compare to this:
grep filter /path/to/my/file.log | # read data from file AND run a filter on file contents
sed 'something, I do not really use sed a lot' | # change file contents
grep filter_again |        # run another filter on contents
lp                         # send to printer

You have now overloaded grep, and if you want to remove that filter, you have real editing to do instead of just deleting a line.
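That said, when I do want to drop cat without losing the one-step-per-line shape, the input redirection can stand alone at the front of the pipeline. A sketch with a throwaway file standing in for the log:

```shell
# throwaway file standing in for the real log
tmp=$(mktemp)
printf 'alpha filter\nbeta\nfilter filter_again\n' > "$tmp"

# redirection first, then one command per line: no cat process,
# and removing a filter is still just deleting a line
< "$tmp" \
grep filter |
grep filter_again |
wc -l
```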

Friend of the blog Patrick says that this use of cat is just as useless as comments. I tend to agree.



Improving my SSHFS script

I can name more than a dozen machines whose file systems I would want to mount from work. I don't mount those file systems all the time, but I do often enough that I have a Perl script I use to manage my SSHFS mounting and unmounting of them.

Here's a problem. This weekend, the servers for work were taken down for scheduled maintenance. That's fine, but it does mean that I had to remount a bunch of them this AM, and when the script went through the already-mounted filesystems, it would ask for a password and then say "this is already mounted". What I probably want is some quick way of knowing whether something is already mounted, and to do that check before I try the mount.

And this is what initially draws my attention: mount points are directories, and directories can contain files. It is a known trick for hackers to unmount shares, hide files in the mount point, and remount, so sysadmins won't notice them. It strikes me that I can touch a file like .unmounted into each mount point, then look for that file before each remounting, and skip the mount if I can't -f that file.
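A sketch of that sentinel idea (ready_to_mount and the paths are my hypothetical names, not the script's):

```perl
use strict ;
use warnings ;

# one-time setup, while the directory is empty and unmounted:
#     touch /path/to/mount_point/.unmounted

sub ready_to_mount {
    my ( $mount_point ) = @_ ;

    # when something is mounted on top, the sentinel is hidden,
    # so -f failing means "already mounted, skip it"
    return -f "$mount_point/.unmounted" ;
    }
```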

But really, this has to be a solved problem somehow, so there might be a better way. Pointers?

ETA: The code, a few revisions back, is part of a previous post. A twitter response is leading me to start the process of finally using my github account and setting it up there. Will add the link when I get there.


I've Been Doin' Some Hard-Travelin', I Thought You Knowed

Back to the Traveling Salesman.

What I had before was 11298 miles, using the shortest available path to an unconnected state capital, and it had problems, problems where the path already chosen forced a great amount of backtracking. Knots, my friend Mark calls them. The knots are the thing that looked really wrong to me.

So, I added another step.

I modified choose_shortest_path() so that it returned an array with the path. I then do some substitutions. Take two capitals, switch their order, and if that gets us shorter, go with that. Not randomly. Iteratively. First those next to each other, then those separated by one, then by two, up to 5. Then again. Five times.

This gets me to 10886 miles. So far. I'm doing it again, five times going from one to forty, just to see if we can get better than that, because the Washington-to-Arizona knot looks wrong to me, but that's a gut feeling, not a proven issue. That is a near-1100-mile leap, but using it seems to save me 412 miles, so it must work. 

A CS professor once described NP-Complete problems as a license to hack: because there isn't an established best solution, you can play with it. This is a bit of what I'm doing here. Certainly, this won't help you pack your knapsack, but if it helps you visit all the capitals that much faster, I'm happy.


# naive shortest-path determination - A little better

use 5.010 ;
use strict ;
use warnings ;
use Carp ;
use Data::Dumper ;
use DBI ;

use lib '/home/jacoby/lib' ;
use MyDB 'db_connect' ;

my $states    = get_states() ;
my $combos    = get_combos() ;
my $distances = get_distances() ;
my %shortest ;

#for my $start ( 1..48 ) {
#    my $state = $states->{ $start }->{ state } ;
#    my @path = choose_shortest_path( $start ) ;
#    my $dist = find_distance( @path ) ;
#    say join "\t", (sprintf '%02.2f' , $dist), $start, $state ;
#    }
#exit ;

my @path = choose_shortest_path( 23 ) ;
my $distance = find_distance( @path ) ;
say $distance ;
say as_google_url( separate_by_pipes( @path ) ) ;
say '' ;

my $path = \@path ;
for my $pass ( 1 .. 5 ) {
    for my $offset ( 1 .. 40 ) {
        my $start = 0 ;
        $path = massage_path( $start, $offset, $path ) ;
        my $distance = find_distance( @$path ) ;
        say join "\t", $pass , $offset, scalar @$path , $distance ;
        }
    }
say as_google_url( separate_by_pipes( @$path ) ) ;
say separate_by_pipes( @$path ) ;

exit ;

######## ######## ######## ######## ######## ######## ######## ########
sub choose_shortest_path {
    my @path = @_ ;
    return @path if scalar @path == 48 ;
    my $s_id    = shift @path ;
    my $state   = $states->{ $s_id }->{ state } ;
    my @choices = sort { # sort by distance
        $distances->{ $a }->{ distance } <=> $distances->{ $b }->{ distance }
        }
        grep { # haven't been chosen yet
                is_not_in_array( $combos->{ $_ }->{ state_id_1 }, \@path )
            and is_not_in_array( $combos->{ $_ }->{ state_id_2 }, \@path )
        }
        grep { # must contain the current state
               $combos->{ $_ }->{ state_id_1 } == $s_id
            or $combos->{ $_ }->{ state_id_2 } == $s_id
            } keys %$combos ;
    my $c     = shift @choices ; # shortest
    my $c_obj = $combos->{ $c } ;
    my ( $o ) = grep { $_ != $s_id } $c_obj->{ state_id_1 },
        $c_obj->{ state_id_2 } ;
    my $o_state = $states->{ $o }->{ state } ;
    my $d = $distances->{ $c }->{ distance } || 'x' ;
    return choose_shortest_path( $o, $s_id, @path ) ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub massage_path {
    my ( $a, $offset, $path ) = @_ ;
    my $b = $a + $offset ;
    my $alt ;
    @$alt = @$path ;
    if ( $b >= 48 ) { return $path ; }
    $alt->[ $a ] = $path->[ $b ] ;
    $alt->[ $b ] = $path->[ $a ] ;
    my $d1 = find_distance( @$path ) ;
    my $d2 = find_distance( @$alt ) ;
    $path = $alt if $d2 < $d1 ;
    return massage_path( $a + 1, $offset, $path ) ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub find_distance {
    my @path     = @_ ;
    my $distance = 0 ;
    for my $i ( 1 .. 47 ) {
        my ( $s1, $s2 ) = sort { $a <=> $b } $path[ $i ], $path[ $i - 1 ] ;
        my ( $combo ) = grep {
                   $combos->{ $_ }->{ state_id_1 } == $s1
                && $combos->{ $_ }->{ state_id_2 } == $s2
            } sort keys %$combos ;
        $distance += $distances->{ $combo }->{ distance } ;
        }
    return sprintf '%0.02f', $distance ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub is_not_in_array {
    my ( $num, $path ) = @_ ;
    for my $p ( @$path ) {
        return 0 if $num == $p ;
        }
    return 1 ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub get_states {
    my $dbh    = db_connect() ;
    my $sql    = 'SELECT * from state_capitals ORDER BY id' ;
    my $states = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $states ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub get_combos {
    my $dbh    = db_connect() ;
    my $sql    = 'SELECT * from combinations ORDER BY id' ;
    my $combos = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $combos ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub get_distances {
    my $dbh       = db_connect() ;
    my $sql       = 'SELECT * from distances ORDER BY id' ;
    my $distances = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $distances ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub separate_by_pipes {
    return join '|', @_ ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub as_mark_list {
    my ( $path ) = @_ ;
    return join '', map { $states->{ $_ }->{ st } }
        split m{\|}mx, $path ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub as_google_url {
    my ( $path ) = @_ ;
    my $url1 = '' ;   # static-map base URL elided in the original
    my $url2 = '&size=500x400&sensor=false' ;
    my $body = join '|', map {
        join ',', $states->{ $_ }->{ latitude },
            $states->{ $_ }->{ longitude }
        } split m{\|}mx, $path ;
    return join '', $url1, $body, $url2 ;
    }

######## ######## ######## ######## ######## ######## ######## ########
sub key_from_value {
    my ( $v ) = @_ ;
    my %rev = reverse %shortest ;
    return $rev{ $v } ;
    }


Beyond Firebug to NYTProf

Clearly, the problem is in the core application, not the CSS and JS surrounding it. And Firebug only covers the outside of the application.

So I got NYTProf going. A little searching gave me the knowledge that calling a CGI page through the web server with a query string is the same as calling it on the command line as perl myprog.cgi 'foo=bar', which is so good to know, especially since the addition of the NYTProf step is just perl -d:NYTProf myprog.cgi 'foo=bar'.

So, I was able to shave off a second by caching. I could have HOP'd it and just used Memoize, but I like having all the details of a program visible so I don't get bit by something I can't see.
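For the record, the Memoize route would have been a two-liner; a sketch using the core Memoize module, which caches keyed on the arguments:

```perl
use Memoize ;

# wraps get_service_page so repeat calls with the same $pi
# return the cached result instead of hitting pi_Readfile() again
memoize( 'get_service_page' ) ;
```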

    # all lines of code with %url_cache are new
    # URLs have been changed to protect the innocent.
    my %url_cache ;
    sub get_service_page {
        my ( $pi ) = @_ ;
        if ( $url_cache{ $pi } ) {
            return $url_cache{ $pi } ;
            }
        my $readfile = pi_Readfile() ;
        my $url = '' ;   # real URL template (with XXXXX placeholder) elided
        my $alt = '' ;   # fallback URL elided in the original
        my $attr   = 'SGNAME_PUTATIVE' ;
        my $sgname = $readfile->{ $pi }->{ $attr } ;

        if ( ! defined $sgname || '' eq $sgname ) {
            $url_cache{ $pi } = $alt ;
            return $alt ;
            }
        $url =~ s/XXXXX/$sgname/ ;
        $url_cache{ $pi } = $url ;
        return $url ;
        }
So, simply by holding onto that little piece of information instead of checking against the same pi_Readfile() each time, I was able to go from 2-3 seconds to 1.2 seconds. And, now that I'm seeing it, I could hold onto the data structure I get from pi_Readfile() the same way I hold onto the cache, and could probably tighten up even more.

I don't know why it didn't occur to me to do that in the first place....


Just Ran Firebug

Pretty clear where the lag is, isn't it?


Even More Traveling, Even Less Sales

Here are some table descriptions from MySQL, from which you should be able to reverse engineer the table creation.

State Capitals
| Field     | Type        | Null | Key | Default | Extra          |
| id        | int(10)     | NO   | PRI | NULL    | auto_increment |
| state     | varchar(25) | YES  |     | NULL    |                |
| st        | varchar(2)  | YES  |     | NULL    |                |
| city      | varchar(25) | YES  |     | NULL    |                |
| latitude  | float(12,6) | YES  |     | NULL    |                |
| longitude | float(12,6) | YES  |     | NULL    |                |

Combinations - connecting each capital to each other capital
| Field      | Type    | Null | Key | Default | Extra          |
| id         | int(10) | NO   | PRI | NULL    | auto_increment |
| state_id_1 | int(10) | YES  |     | NULL    |                |
| state_id_2 | int(10) | YES  |     | NULL    |                |

Distances - one per combination
| Field    | Type        | Null | Key | Default | Extra          |
| id       | int(10)     | NO   | PRI | NULL    | auto_increment |
| distance | float(12,6) | YES  |     | NULL    |                |
I'll say again, I think I made a mistake by not including distance in the combinations table. I didn't write Perl code to put the state capital information into the database; I copied it from a source and recrafted it into SQL by hand.
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 01 , "Delaware" , "DE" , "Dover" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 02 , "Pennsylvania" , "PA" , "Harrisburg" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 03 , "New Jersey" , "NJ" , "Trenton" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 04 , "Georgia" , "GA" , "Atlanta" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 05 , "Connecticut" , "CT" , "Hartford" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 06 , "Massachusetts" , "MA" , "Boston" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 07 , "Maryland" , "MD" , "Annapolis" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 08 , "South Carolina" , "SC" , "Columbia" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 09 , "New Hampshire" , "NH" , "Concord" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 10 , "Virginia" , "VA" , "Richmond" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 11 , "New York" , "NY" , "Albany" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 12 , "North Carolina" , "NC" , "Raleigh" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 13 , "Rhode Island" , "RI" , "Providence" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 14 , "Vermont" , "VT" , "Montpelier" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 15 , "Kentucky" , "KY" , "Frankfort" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 16 , "Tennessee" , "TN" , "Nashville" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 17 , "Ohio" , "OH" , "Columbus" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 18 , "Louisiana" , "LA" , "Baton Rouge" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 19 , "Indiana" , "IN" , "Indianapolis" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 20 , "Mississippi" , "MS" , "Jackson" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 21 , "Illinois" , "IL" , "Springfield" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 22 , "Alabama" , "AL" , "Montgomery" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 23 , "Maine" , "ME" , "Augusta" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 24 , "Missouri" , "MO" , "Jefferson City" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 25 , "Arkansas" , "AR" , "Little Rock" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 26 , "Michigan" , "MI" , "Lansing" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 27 , "Florida" , "FL" , "Tallahassee" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 28 , "Texas" , "TX" , "Austin" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 29 , "Iowa" , "IA" , "Des Moines" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 30 , "Wisconsin" , "WI" , "Madison" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 31 , "California" , "CA" , "Sacramento" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 32 , "Minnesota" , "MN" , "Saint Paul" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 33 , "Oregon" , "OR" , "Salem" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 34 , "Kansas" , "KS" , "Topeka" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 35 , "West Virginia" , "WV" , "Charleston" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 36 , "Nevada" , "NV" , "Carson City" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 37 , "Nebraska" , "NE" , "Lincoln" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 38 , "Colorado" , "CO" , "Denver" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 39 , "North Dakota" , "ND" , "Bismarck" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 40 , "South Dakota" , "SD" , "Pierre" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 41 , "Montana" , "MT" , "Helena" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 42 , "Washington" , "WA" , "Olympia" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 43 , "Idaho" , "ID" , "Boise" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 44 , "Wyoming" , "WY" , "Cheyenne" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 45 , "Utah" , "UT" , "Salt Lake City" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 46 , "Oklahoma" , "OK" , "Oklahoma City" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 47 , "New Mexico" , "NM" , "Santa Fe" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 48 , "Arizona" , "AZ" , "Phoenix" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 49 , "Alaska" , "AK" , "Juneau" ) ;
INSERT INTO state_capitals ( id , state , st , city ) VALUES ( 50 , "Hawaii" , "HI" , "Honolulu" ) ;
The latitudes and longitudes were also hand-crafted.
UPDATE state_capitals SET latitude="32.361538", longitude="-86.279118" where state = "Alabama" ;
UPDATE state_capitals SET latitude="58.301935", longitude="-134.419740" where state = "Alaska" ;
UPDATE state_capitals SET latitude="33.448457", longitude="-112.073844" where state = "Arizona" ;
UPDATE state_capitals SET latitude="34.736009", longitude="-92.331122" where state = "Arkansas" ;
UPDATE state_capitals SET latitude="38.555605", longitude="-121.468926" where state = "California" ;
UPDATE state_capitals SET latitude="39.7391667", longitude="-104.984167" where state = "Colorado" ;
UPDATE state_capitals SET latitude="41.767", longitude="-72.677" where state = "Connecticut" ;
UPDATE state_capitals SET latitude="39.161921", longitude="-75.526755" where state = "Delaware" ;
UPDATE state_capitals SET latitude="30.4518", longitude="-84.27277" where state = "Florida" ;
UPDATE state_capitals SET latitude="33.76", longitude="-84.39" where state = "Georgia" ;
UPDATE state_capitals SET latitude="21.30895", longitude="-157.826182" where state = "Hawaii" ;
UPDATE state_capitals SET latitude="43.613739", longitude="-116.237651" where state = "Idaho" ;
UPDATE state_capitals SET latitude="39.783250", longitude="-89.650373" where state = "Illinois" ;
UPDATE state_capitals SET latitude="39.790942", longitude="-86.147685" where state = "Indiana" ;
UPDATE state_capitals SET latitude="41.590939", longitude="-93.620866" where state = "Iowa" ;
UPDATE state_capitals SET latitude="39.04", longitude="-95.69" where state = "Kansas" ;
UPDATE state_capitals SET latitude="38.197274", longitude="-84.86311" where state = "Kentucky" ;
UPDATE state_capitals SET latitude="30.45809", longitude="-91.140229" where state = "Louisiana" ;
UPDATE state_capitals SET latitude="44.323535", longitude="-69.765261" where state = "Maine" ;
UPDATE state_capitals SET latitude="38.972945", longitude="-76.501157" where state = "Maryland" ;
UPDATE state_capitals SET latitude="42.2352", longitude="-71.0275" where state = "Massachusetts" ;
UPDATE state_capitals SET latitude="42.7335", longitude="-84.5467" where state = "Michigan" ;
UPDATE state_capitals SET latitude="44.95", longitude="-93.094" where state = "Minnesota" ;
UPDATE state_capitals SET latitude="32.320", longitude="-90.207" where state = "Mississippi" ;
UPDATE state_capitals SET latitude="38.572954", longitude="-92.189283" where state = "Missouri" ;
UPDATE state_capitals SET latitude="46.595805", longitude="-112.027031" where state = "Montana" ;
UPDATE state_capitals SET latitude="40.809868", longitude="-96.675345" where state = "Nebraska" ;
UPDATE state_capitals SET latitude="39.160949", longitude="-119.753877" where state = "Nevada" ;
UPDATE state_capitals SET latitude="43.220093", longitude="-71.549127" where state = "New Hampshire" ;
UPDATE state_capitals SET latitude="40.221741", longitude="-74.756138" where state = "New Jersey" ;
UPDATE state_capitals SET latitude="35.667231", longitude="-105.964575" where state = "New Mexico" ;
UPDATE state_capitals SET latitude="42.659829", longitude="-73.781339" where state = "New York" ;
UPDATE state_capitals SET latitude="35.771", longitude="-78.638" where state = "North Carolina" ;
UPDATE state_capitals SET latitude="48.813343", longitude="-100.779004" where state = "North Dakota" ;
UPDATE state_capitals SET latitude="39.962245", longitude="-83.000647" where state = "Ohio" ;
UPDATE state_capitals SET latitude="35.482309", longitude="-97.534994" where state = "Oklahoma" ;
UPDATE state_capitals SET latitude="44.931109", longitude="-123.029159" where state = "Oregon" ;
UPDATE state_capitals SET latitude="40.269789", longitude="-76.875613" where state = "Pennsylvania" ;
UPDATE state_capitals SET latitude="41.82355", longitude="-71.422132" where state = "Rhode Island" ;
UPDATE state_capitals SET latitude="34.000", longitude="-81.035" where state = "South Carolina" ;
UPDATE state_capitals SET latitude="44.367966", longitude="-100.336378" where state = "South Dakota" ;
UPDATE state_capitals SET latitude="36.165", longitude="-86.784" where state = "Tennessee" ;
UPDATE state_capitals SET latitude="30.266667", longitude="-97.75" where state = "Texas" ;
UPDATE state_capitals SET latitude="40.7547", longitude="-111.892622" where state = "Utah" ;
UPDATE state_capitals SET latitude="44.26639", longitude="-72.57194" where state = "Vermont" ;
UPDATE state_capitals SET latitude="37.54", longitude="-77.46" where state = "Virginia" ;
UPDATE state_capitals SET latitude="47.042418", longitude="-122.893077" where state = "Washington" ;
UPDATE state_capitals SET latitude="38.349497", longitude="-81.633294" where state = "West Virginia" ;
UPDATE state_capitals SET latitude="43.074722", longitude="-89.384444" where state = "Wisconsin" ;
UPDATE state_capitals SET latitude="41.145548", longitude="-104.802042" where state = "Wyoming" ;
The distances themselves were generated mathematically, with the help of Google and Wikipedia to find the how-to.

use 5.010 ;
use strict ;
use warnings ;
use Carp ;
use Data::Dumper ;
use DBI ;

use lib '/home/jacoby/lib' ;
use MyDB 'db_connect' ;

use subs qw{ get_combos get_states set_distance } ;

my $pi = atan2( 1, 1 ) * 4 ;
my $states = get_states() ;
my $combos = get_combos() ;

for my $combo ( sort { $a <=> $b } keys %$combos ) {
    my $c_obj = $combos->{$combo} ;
    my ( $state_1 , $state_2 ) =
        sort { $a <=> $b } $c_obj->{ state_id_1 } , $c_obj->{ state_id_2 } ;
    my $obj_s1 = $states->{ $state_1 } ;
    my $obj_s2 = $states->{ $state_2 } ;
    my $dist = haversine(
            $obj_s1->{ latitude } , $obj_s1->{ longitude } ,
            $obj_s2->{ latitude } , $obj_s2->{ longitude } ) ;
    say $combo ;
    say join ' - ' ,
        ( join ', ' , $obj_s1->{ city } , $obj_s1->{ state } ) ,
        ( join ', ' , $obj_s2->{ city } , $obj_s2->{ state } ) ;
    say join "\t" , '' , $dist . ' miles' ;
    set_distance( $combo , $dist ) ;
    }

sub get_states {
    my $dbh = db_connect() ;
    my $sql = 'SELECT * from state_capitals ORDER BY id' ;
    my $states = $dbh->selectall_hashref( $sql , 'id' ) or croak $dbh->errstr ;
    return $states ;
    }

sub get_combos {
    my $dbh = db_connect() ;
    my $sql = 'SELECT * from combinations ORDER BY id' ;
    my $combos = $dbh->selectall_hashref( $sql , 'id' ) or croak $dbh->errstr ;
    return $combos ;
    }

sub set_distance {
    my ( $combo , $dist ) = @_ ;
    my $dbh = db_connect() ;
    my $sql = "INSERT INTO distances ( id , distance ) VALUES ( $combo , $dist ) " ;
    say $sql ;
    $dbh->do( $sql ) or croak $dbh->errstr ;
    }

# Despite the name, this is the spherical law of cosines rather than the
# haversine formula proper -- close enough at these distances.
sub haversine {
    my ( $lat1, $lon1, $lat2, $lon2 ) = @_ ;

    my $theta = $lon1 - $lon2 ;
    my $dist =
        sin( deg2rad( $lat1 ) ) *
        sin( deg2rad( $lat2 ) ) +
        cos( deg2rad( $lat1 ) ) *
        cos( deg2rad( $lat2 ) ) *
        cos( deg2rad( $theta ) ) ;

    $dist = acos( $dist ) ;
    $dist = rad2deg( $dist ) ;
    $dist = $dist * 60 * 1.1515 ;
    return sprintf '%5.2f' , $dist ;
    }

sub acos {
    my ( $rad ) = @_ ;
    my $ret = atan2( sqrt( 1 - $rad**2 ), $rad ) ;
    return $ret ;
    }

sub deg2rad {
    my ( $deg ) = @_ ;
    return ( $deg * $pi / 180 ) ;
    }

sub rad2deg {
    my ( $rad ) = @_ ;
    return ( $rad * 180 / $pi ) ;
    }

More Details on Traveling Salesman

I have just received mail from a programmer in France who is interested in learning Perl and asked about some of the constructs in my Traveling Salesman code, specifically asking about MyDB and db_connect. That module and that function are about one thing: getting a DBI object without having my login, password and all that within the body of the program, so I can do things like paste it into my blog without worrying. Adapted, it looks like this:

package MyDB ;
use strict ;
use warnings ;
use DBI ;

use Exporter qw(import) ;
our %EXPORT_TAGS = ( 'all' => [ qw( db_connect ) ] ) ;
our @EXPORT_OK   = ( @{ $EXPORT_TAGS{'all'} } ) ;
our $VERSION     = '0.0.1' ;

our %_DB = (
    default => {
        user       => 'YouDontGetThis',
        password   => 'YouDontGetThis',
        host       => 'YouDontGetThis',
        port       => '3306',
        database   => 'YouDontGetThis',
        },
    test => {
        user       => 'YouDontGetThis',
        password   => 'YouDontGetThis',
        host       => 'YouDontGetThis',
        port       => '3306',
        database   => 'YouDontGetThis',
        },
    ) ;

my $_db_params  = '';       # String of current database parameters.
my $_dbh;                   # Save the handle.

sub db_connect {
    my ($param_ptr, $attr_ptr) = @_;

    # If database is already opened then check for a fast return.

    if (defined $_dbh &&
        (!defined $param_ptr || $param_ptr eq ''))    { return $_dbh }

    # Check for a different set of parameters to use via the name (string)
    #   of the parameter set (e.g., 'test').

    my $which_db = 'default';

    if (defined $param_ptr && ref($param_ptr) eq '' && $param_ptr ne '') {
        if (defined $_DB{$param_ptr})   { $which_db = $param_ptr }
        else { return; }
        }

    # Get the base parameters ... copy and flatten from the global hash.

    my %params = ();
    my %attr   = ();

    foreach (keys %{$_DB{$which_db}} ) {
        $params{$_} = $_DB{$which_db}{$_};
        }

    # Add in extra parameters if given and if the database is not the default.

    if (defined $param_ptr
        && ref($param_ptr) eq 'HASH'
        && (!defined $param_ptr->{database} ||
             $param_ptr->{database} ne 'default') ) {

        foreach (keys %{$_DB{default}})  {
            if (defined $param_ptr->{$_}) { $params{$_} = $param_ptr->{$_} }
            }
        }

    if (defined $attr_ptr && ref($attr_ptr) eq 'HASH') {
        foreach (keys %$attr_ptr)  { $attr{$_} = $attr_ptr->{$_} }
        }

    # Now make up an ordered string of the parameters so that we can compare
    #   them to the old ones.

    my $new_db_params = '';
    foreach (sort keys %params)  { $new_db_params .= $params{$_} }

    # Can also do a quick return if params are same as old ones.

    if (defined $_dbh && $new_db_params eq $_db_params)  {
        return $_dbh;
        }

    # At this point either the database has never been opened or
    #   new parameters are to be used. Close database and reopen.

    $_db_params = $new_db_params;

    if (defined $_dbh) { $_dbh->disconnect }    # no error check

    my $source = "dbi:mysql:$params{database}:$params{host}:$params{port}";

    $_dbh = DBI->connect($source, $params{user},
                               $params{password}, \%attr);

    return $_dbh;
    } # End of db_connect

1 ;


For the particulars, I used something, I think WolframAlpha, to get the latitude and longitude of each capital, then looked into some geometry to calculate the distances. Perhaps I should include some database dumps here for that info.

How I create my web applications

I handle our web stuff from soup to nuts, so here's a little bit of my methodology on how I do that work.

  • I think about what we're supposed to do and what we're supposed to store, and try to express it in SQL, culminating in the creation of a table in our MySQL database.
  • I write generalized functions for CRUD (creation, reading, updating and deleting) as needed, and create or add to existing Perl modules. I also make testing scripts for these functions to run on the command line.
  • I write the read functions into a Perl-driven CGI program. I'm old-school enough that each attempt to learn a framework such as Catalyst leaves me frustrated. Full creation of an element is usually handled in this program.
  • I write a jQuery-based Javascript module to run within the CGI that allows me to collect and add to all the information I need to make modifications. 
  • I write an AJAX backend program in Perl that passes JSON back and forth between the client and server.
Right now, I'm in that last step. I know there are a few things that I need to start doing. I need to have development, test and production streams going for all this stuff. I need to have much more git going on. And, often there are small tweaks on the CSS throughout this process.
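That last step, the AJAX backend, can be sketched in miniature. This is not the lab's actual code: the action name and the update_thing handler are hypothetical, and the real version would read its request from POST data and do its work through the CRUD modules.

```perl
use 5.010 ;
use strict ;
use warnings ;
use JSON::PP ;

# Dispatch table; a real one would have an entry per CRUD action.
my %handlers = (
    update_thing => \&update_thing ,
    ) ;

# In the real CGI program the request would come in as POST data;
# here it's a plain hashref so the logic stands on its own.
sub handle_request {
    my ( $request ) = @_ ;
    my $action  = $request->{ action } // '' ;
    my $handler = $handlers{ $action } ;
    return encode_json( { status => 'error', message => "unknown action: $action" } )
        unless $handler ;
    return encode_json( { status => 'ok', result => $handler->( $request ) } ) ;
    }

sub update_thing {
    my ( $request ) = @_ ;
    # stand-in for a DBI UPDATE through the CRUD module
    return { id => $request->{ id }, updated => 1 } ;
    }

say handle_request( { action => 'update_thing', id => 42 } ) ;
```

The client-side jQuery then only has to POST an action name and its arguments and look at status in what comes back.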


Traveling Salesman without Farmer's Daughter

Have posted some related details to the problem, specifically some database access boilerplate.

The challenge is not traditional Traveling Salesman, which brings you back to the start. In this case, it covers all the state capitals in the continental US, and the challenge is to get through all of them in the shortest distance, but without getting back to the start.

This is my naive solution: starting from what I judged to be the furthest-east capital (Maine) and going westward, always choosing the shortest capital-to-capital hop. This is easy, and it avoids the biggest pitfall, the one that makes this a named problem.

Starting at one capital, you have 47 choices. For each of those, there are then 46 choices, and then 45, and so on. The notation for that is n! (as opposed to !n, which means not n) and it is big: 48! is about 1.24x10^61. This means that generating all possible paths would take forever, even on big iron. The CS term is nondeterministic polynomial, or NP. Traveling Salesman is NP-Complete, IIRC. What this means is that, while finding a relatively fast way through this is pretty easy, finding the provably shortest isn't.
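For scale, that factorial can be checked with the core Math::BigInt module:

```perl
use 5.010 ;
use strict ;
use warnings ;
use Math::BigInt ;

# n! as an arbitrary-precision integer, since it won't fit in a double.
sub factorial {
    my ( $n ) = @_ ;
    my $f = Math::BigInt->new( 1 ) ;
    $f->bmul( $_ ) for 2 .. $n ;
    return $f ;
    }

say '10! = ' . factorial( 10 ) ;                                 # 3628800
say '48! has ' . length( factorial( 48 )->bstr ) . ' digits' ;   # 62 digits
```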

But I'm sure I can do better than this naive solution. The backtracking to get Vermont in makes me think that starting with Vermont and New Hampshire might be the better solution, and the jump from Tennessee to Michigan tells me that simple shortest-hop is not the best choice there. I can generate this fast with simple recursion, getting a simple ordered list; doing the transforms that could tweak this into a faster path is something I don't really know how to do, code-wise.

Here's my Perl code, including some dyked-out bits covering some other cases. The first improvement I can think of is to check every pair of edges and, each time two edges cross, swap their endpoints to uncross them. I think that could work, if I could decide how to code it.


# naive shortest-path determination - sucks

use 5.010 ;
use strict ;
use warnings ;
use Carp ;
use Data::Dumper ;
use DBI ;

use lib '/home/jacoby/lib' ;
use MyDB 'db_connect' ;

my $states    = get_states() ;
my $combos    = get_combos() ;
my $distances = get_distances() ;
my %shortest ;

#for my $start ( 1..48 ) {
#    my $state = $states->{ $start }->{ state } ;
#    my $dist  = choose_shortest_path( $start ) ;
#    say join "\t", (sprintf '%02.2f' , $dist), $start, $state ;
#    }
#exit ;

# 23 = maine

##say 'long' ;
##say choose_longest_path( 23 ) ;
##say '' ;

#say 'short' ;
#choose_shortest_path( 23 ) ;
#say Dumper \%shortest ;
#say '' ;
#my ( $s ) = sort { $shortest{ $a } <=> $shortest{ $b } } keys %shortest ;
#say as_mark_list( $s ) ;
#say '' ;
#say as_google_url( $s ) ;
#say '' ;

sub as_mark_list {
    my ( $path ) = @_ ;
    return join '', map { $states->{ $_ }->{ st } }
        split m{\|}mx, $path ;
    }

sub as_google_url {
    my ( $path ) = @_ ;
    # base URL was lost in transcription; the Google Static Maps
    # path parameter is assumed here
    my $url1 = 'http://maps.googleapis.com/maps/api/staticmap?path=' ;
    my $url2 = '&size=500x400&sensor=false' ;
    my $body = join '|', map {
        join ',', $states->{ $_ }->{ latitude },
            $states->{ $_ }->{ longitude }
        }
        split m{\|}mx, $path ;
    return join '', $url1, $body, $url2 ;
    }

sub key_from_value {
    my %rev = reverse %shortest ;
    my ( $v ) = @_ ;
    return $rev{ $v } ;
    }

sub choose_shortest_path {
    my @path = @_ ;
    do {
        #say join '|', reverse map {
        #    join ',',
        #        $states->{ $_ }->{ latitude },
        #        $states->{ $_ }->{ longitude }
        #        } @path ;
        #say as_mark_list( join '|', @path ) ;
        #say as_google_url( join '|', @path ) ;
        return 0 ;
        }
        if scalar @path == 48 ;

    #say join ' ' , scalar @path , '-' , @path ;
    my $s_id    = shift @path ;
    my $state   = $states->{ $s_id }->{ state } ;
    my @choices = sort {
        $distances->{ $a }->{ distance } <=> $distances->{ $b }->{ distance }
        }
        grep {
                is_not_in_array( $combos->{ $_ }->{ state_id_1 }, \@path )
            and is_not_in_array( $combos->{ $_ }->{ state_id_2 }, \@path )
            }
        grep {
               $combos->{ $_ }->{ state_id_1 } == $s_id
            or $combos->{ $_ }->{ state_id_2 } == $s_id
            } keys %$combos ;
    my $c     = shift @choices ;
    my $c_obj = $combos->{ $c } ;
    my ( $o ) = grep { $_ != $s_id } $c_obj->{ state_id_1 },
        $c_obj->{ state_id_2 } ;
    my $o_state = $states->{ $o }->{ state } ;
    my $d = $distances->{ $c }->{ distance } || 'x' ;

    #say $state ;

    #say join "\t", '', $d, $c, $s_id, $state, $o, $o_state ;
    #say join "\t", '', join ' ' , @path ;
    #say '' ;
    my $dist = choose_shortest_path( $o, $s_id, @path ) ;
    return $dist + $d ;
    }

#sub choose_longest_path {
#    my @path = @_ ;
#    do {
#        say join '|', reverse map {
#            join ',',
#                $states->{ $_ }->{ latitude },
#                $states->{ $_ }->{ longitude }
#                } @path ;
#        say as_mark_list( join '|', @path ) ;
#        say as_google_url( join '|', @path ) ;
#        return 0 ;
#        }
#        if scalar @path == 48 ;
#    #say join ' ' , scalar @path , '-' , @path ;
#    my $s_id    = shift @path ;
#    my $state   = $states->{ $s_id }->{ state } ;
#    my @choices = sort {
#        $distances->{ $a }->{ distance } <=> $distances->{ $b }->{ distance }
#        }
#        grep {
#                is_not_in_array( $combos->{ $_ }->{ state_id_1 }, \@path )
#            and is_not_in_array( $combos->{ $_ }->{ state_id_2 }, \@path )
#            }
#        grep {
#               $combos->{ $_ }->{ state_id_1 } == $s_id
#            or $combos->{ $_ }->{ state_id_2 } == $s_id
#            } keys %$combos ;
#    my $c     = pop @choices ;
#    my $c_obj = $combos->{ $c } ;
#    my ( $o ) = grep { $_ != $s_id } $c_obj->{ state_id_1 },
#        $c_obj->{ state_id_2 } ;
#    my $o_state = $states->{ $o }->{ state } ;
#    my $d = $distances->{ $c }->{ distance } || 'x' ;
#    #say $state ;
#    #say join "\t", '', $d, $c, $s_id, $state, $o, $o_state ;
#    #say join "\t", '', join ' ' , @path ;
#    #say '' ;
#    my $dist = choose_longest_path( $o, $s_id, @path ) ;
#    return $dist + $d ;
#    }

sub is_not_in_array {
    my ( $num, $path ) = @_ ;
    for my $p ( @$path ) {
        return 0 if $num == $p ;
        }
    return 1 ;
    }

sub get_states {
    my $dbh    = db_connect() ;
    my $sql    = 'SELECT * from state_capitals ORDER BY id' ;
    my $states = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $states ;
    }

sub get_combos {
    my $dbh    = db_connect() ;
    my $sql    = 'SELECT * from combinations ORDER BY id' ;
    my $combos = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $combos ;
    }

sub get_distances {
    my $dbh    = db_connect() ;
    my $sql    = 'SELECT * from distances ORDER BY id' ;
    my $combos = $dbh->selectall_hashref( $sql, 'id' ) or croak $dbh->errstr ;
    return $combos ;
    }
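The edge-uncrossing fix mentioned above is the classic 2-opt move: pick two edges and reverse the stretch of path between them. Here's a minimal sketch of just the swap, on a path of state IDs; the check that the swap actually shortens the path against the real distances is left out.

```perl
use 5.010 ;
use strict ;
use warnings ;

# Reverse the segment between positions $i and $k, inclusive. The edge
# entering position $i and the edge leaving position $k are replaced,
# which is exactly the swap that uncrosses two crossing edges.
sub two_opt_swap {
    my ( $path, $i, $k ) = @_ ;
    my @new = @$path ;
    @new[ $i .. $k ] = reverse @new[ $i .. $k ] ;
    return \@new ;
    }

say join ' ', @{ two_opt_swap( [ 1, 2, 3, 4, 5 ], 1, 3 ) } ;    # 1 4 3 2 5
```

Run that over every pair of positions, keep any swap that shortens the total, and repeat until nothing improves, and you have a basic 2-opt pass over the naive tour.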


Split Attention

If anyone has a solution, please tell me.

I have some tasks. The one that's almost off my front burner is a conversion job. There's a gob of files that exist in the worst possible XML format, which is to say there's one entry in the XML, and it's a huge compressed BLOB (binary large object) of unknown format. The files have names that say what kind of job created them and on what day, but if you don't know which project went through on which day, that gives you no help. We can export as (a more real) XML, which gives us the sample information we need. Thing is, there are over 1500 samples to be converted, which really cannot be a by-hand operation.

So, I'm working with Sikuli again, this time on Windows, not Ubuntu. I've done the best I can, but this is a program written in Jython (a Java [not my favorite] implementation of Python [not my favorite]) running in Sikuli (taking over the keyboard and mouse), going over Samba to grab the configuration and data files, running in 32-bit Windows 7 on a VirtualBox instance. There's lots and lots of not-necessarily-stable involved here, which means I have to keep checking on it to ensure it keeps running. Which means that, while it is going, I can't really throw my mind at any task that takes real attention, which is to say any task I need to do at work. That makes me feel terribly unproductive, even as I know that it's much more productive than doing all that crud by hand.

So, until I can code up a way that's much more repeatable and stable, or at least until I get well past the 10% completion this project is at about now, I can't really focus elsewhere.


We Solved It! Return of the PAL Problem

Remember the campus networking issue I've been discussing? Long story short: my phone connects to our PEAP- and Thawte-protected campus network just fine, but once I am on, I cannot actually do anything with the network. But only with Android. Windows is happy with PAL2.0. iOS is happy with PAL2.0, and if I recall the last time I booted the poor, battery-starved old Linux laptop I have in my bookcase at work, desktop Linux is happy with PAL2.0.

But Android is not. Which is annoying.

I have of course gone to the helpdesk, and that's where it gets a bit interesting. There are two groups I've hit: the helpdesk, who have found this problem across different Android versions, different carriers and different handset manufacturers, without finding any common indicator, and the networking group, who, without more to go on, have entered the go away kid, you bother me zone, considering it a failure in user configuration.

I feel I should point out that, for the most part, my co-worker Rick keeps his Android phone all but off when he's in the office. I, on the other hand, tend to hook my phone into my Greater Audio System (Windows and another 1/8" cable [either laptop or phone, depending] run into a y-cable plugged into the audio-in of my Linux box, which I config to go direct to the audio out, because I don't have a mixer, and then into a speaker and to my headphones) so I can get my phone's notification beeps and podcasts along with the other audio I have at work. So I use the network and feel it when it isn't available. So, while I was pushed by usage, for Rick it was a question of curiosity. Which seems to be enough.

When you think about networking, if you think about networking, you probably think of your IP address as akin to a phone number, which kinda works and kinda doesn't. Assume we have an address of (which I don't: that's the IP address of Google's open DNS server, which I kinda like). I can directly connect to any machine on my subnet, which could easily include and, and if I can't find what I want on my local network, the traffic goes out the gateway to the hierarchically higher network. (There can also be down, in addition to up, but that's not important right now.)

One way to look around the local network is by using MAC addresses and routing tables, but that's too low level for this discussion. We use subnet masks. A subnet mask is a series of 1s and then 0s, in that order, used to tell whether IP address A is in the same subnet as IP address B. A common netmask would look like 11111111111111111111111100000000. Clearly, that's hard for people to deal with, so we would write it as Each of those four numbers stands for eight of those binary digits, read as an integer between 0 and 255, and since the mask is all 1s followed by all 0s, only a few values can appear in each slot. Here's a table of the allowed numbers:
00000000 0
10000000 128
11000000 192
11100000 224
11110000 240
11111000 248
11111100 252
11111110 254
11111111 255
So, if the subnet mask is, that means the mask in binary is 11111111 11111111 00000000 00000000. The digits that sit over a 1 must match for two addresses to be on the same subnet; the digits over the 0s are free to differ. Google's would be 00001000 00001000 00001000 00001000, and the neighbor would be 00001000 00001000 00001000 00001001. The difference is at the end, in the zero space, and thus the same subnet. would be 00001000 00001001 00001000 00001000, and that difference lands under the ones, and thus a different subnet.
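That whole comparison is one bitwise AND per address. A sketch using the core Socket module, with the addresses from above:

```perl
use 5.010 ;
use strict ;
use warnings ;
use Socket qw( inet_aton ) ;

# Two addresses are on the same subnet when ANDing each with the mask
# gives the same network portion.
sub same_subnet {
    my ( $ip_a, $ip_b, $mask ) = @_ ;
    my $m = inet_aton( $mask ) ;
    # & on the packed four-byte strings ANDs them byte by byte
    return ( inet_aton( $ip_a ) & $m ) eq ( inet_aton( $ip_b ) & $m ) ;
    }

say same_subnet( '', '', '' ) ? 'same' : 'different' ;    # same
say same_subnet( '', '', '' ) ? 'same' : 'different' ;    # different
```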

The subnet mask sent out by DHCP was one that iOS and Windows are just peachy with. Rick noticed that the IP addresses were a little higher than he would expect. He suggested that we enter static IP addresses based upon what we got via DHCP (using a tool like ifconfig to tell you it all) but with 224 as the third octet, and it worked like a charm. Go us!


I don't like PAL2.0

Purdue has a wireless network, called Purdue Air Link, or PAL2.0. In essence, you log on to the wireless the same way you'd log onto most other university computing equipment. 

It works fine on my co-workers' iProducts.

It works fine on my laptop.

In most places, it works fine for me. And not just "at my home". Other places on campus, other places with PAL2.0.

But not in my office. In my office, I can connect -- phone says I have connection to network, phone says I have an IP address -- but cannot do anything with them. And the tools I would normally try to use to get a better idea of what's going on, such as ping, nslookup, traceroute and the like, are not on my phone. (Maybe I need to look into those.)

And, I should say, it affects my co-worker's Android phone, too.

I don't understand an issue that only occurs with Android phones, but it seems I have one. I think I'm getting an IP address, but I can't route.


Mad Mad Mad at Komodo Edit

I program, and for most small jobs, and for most complex find/replace jobs, I prefer vi. I have tried Emacs, but I never really got it. But I've hit a point where, when I'm doing most of my coding, I prefer an editor that behaves a bit more like a word processor. I want a graphical editor. I have tried a few. The one I use these days is Komodo Edit from ActiveState, which I use some on Windows but primarily on Unix.

I develop and maintain a few web apps for the lab, and that's everything from SQL to Perl libraries to Perl applications to HTML to Javascript. I'm often wanting to make sure that the page I'm writing works right with the AJAX Javascript code I'm writing for it, which is connecting to the CGI Perl code I'm writing, and that all of it works correctly with the modules. Since they're at the far ends of the user experience, the Perl module and the HTML are tabs in one editor window, with the client-side Javascript in another window and the server-side Perl in another. Both my journalism-school page-layout background and my history with 80x24 terminals have led me to believe that you don't want to go too wide, so I try to keep my code within 75 to 80 characters per line. My main monitor is 1680x1050, and I can have my code at a readable font size and still have three editor windows on one monitor, leaving my other for terminals (I have four to six open most of the time) and my Windows monitor open to a web browser.

With the latest version, though, I CANNOT have three editor windows open on one screen without them partially hiding each other. This used to be no problem for Komodo Edit, and it is a deal-breaker for me.

So, what other mature modern graphical text editors for Linux are out there?


Who is Dennis Ritchie and Why should I care?

You can check Wikipedia for the fuller explanation, but here's the summary: He was a researcher at Bell Labs. While he was there, he worked on the Multics system, and as an offshoot, created the C language and the Unix operating system. The C language was not tightly tied to the specific system it came from, which meant that things written in it could reasonably be ported from one set of hardware to another.

He, Ken Thompson and Brian Kernighan are among the set of people whose creations made modern computation possible. I throw into that list Bob Metcalfe (ethernet) and Douglas Engelbart (windowing systems and the mouse). Others came along after and made things better, smaller, faster, more integrated and more beautiful, which is all very important, but Ritchie is among those who made things possible. He will be missed.

Dennis MacAlistair Ritchie 1941-2011


Open Computing, Taken To The Next Level

Hat-tip MakerBlog
This is the Doma Pro PCI, a black acrylic "case" for a computer. I say "case" because, while there are enclosures for the optical and hard drive, it's not really a case because it doesn't close. All your electronic bits are open to the world.

Which is neat, if you're in an environment where keeping your computer open to the elements is acceptable. I wouldn't try this with young children.

But, as always, I do like.

During college, friend-of-the-blog Patrick had a working computer stuck to his bulletin board. The power supply sat on a nearby shelf, I think, but otherwise, it all hung off the wall. 

I'd be curious about RF interference in the surrounding area, but this looks like a neat thing.


Trying to install boom

boom is a key-value store, running on the command line and written in ruby. I don't know whether I need a key-value store for my command line, but I remember when I didn't know if I needed a laptop, bash, Linux, VirtualBox, an RSS aggregator, and so many other things. So, I thought I'd try it. And, I of course decided to try it first on the machine I command line the most, my work machine, which runs Ubuntu 10.04, the long-term support version.

jacoby@oz:~$ uname -a
Linux oz 2.6.32-34-generic #77-Ubuntu SMP Tue Sep 13 19:40:53 UTC 2011 i686 GNU/Linux
jacoby@oz:~$ ruby -v
ruby 1.9.1p378 (2010-01-10 revision 26273) [i486-linux]
jacoby@oz:~$ gem -v
jacoby@oz:~$ gem install boom
WARNING:  Installing to ~/.gem since /var/lib/gems/1.9.1 and
	  /var/lib/gems/1.9.1/bin aren't both writable.
WARNING:  You don't have /home/jacoby/.gem/ruby/1.9.1/bin in your PATH,
	  gem executables will not run.
ERROR:  Error installing boom:
	multi_json requires RubyGems version >= 1.3.6
jacoby@oz:~$ gem update --system 1.3.7
ERROR:  While executing gem ... (RuntimeError)
    gem update --system is disabled on Debian. RubyGems can be updated using the official Debian repositories by aptitude or apt-get.

I followed this up by downloading and building ruby and trying to do a gem install from that. Into my second day, I decided that spending more of my precious time trying to get it working on Ubuntu was a waste. So I briefly considered reimplementing it in perl, then found ruby for Windows and installed it on my Win7 box. Then I opened a term and did the gem install. And now it works.

I find this profoundly disappointing.

I don't think this is the fault of Zach Holman, the creator of boom. I have now tried boom on Windows and kinda like it. I suspect (and might pop open VirtualBox and a bleeding-edge distro to check) that the problem is that Ubuntu or just U10.04 is stuck with an ancient RubyGems the way that RHEL is stuck with ancient Perl. I just know that I don't believe it should've been a problem.


I have headphones at work, which I use for the express purpose of filtering out the sounds of my coworkers, and the sound of them talking to each other is distracting enough. Voice control makes all the sense in a car, but on a tablet?

Tablesorter Bug

We use the jQuery module tablesorter to allow for the dynamic sorting of our tables, and I have just now run into a bug.

We have 3 second-generation sequencers: the 454, the SOLiD, and the HiScan. We have a second-generation sequence table, which mingles all three. And sorting on sequence agent is broken, because it tries to sort the column numerically when clearly it needs to treat the values as character strings. Which will monopolize my next few hours.


Learned some Javascript (and maybe more) (I think)

var filename = get_filename_or_something() ; // !important
if ( !! filename ) {
    // do code
    }
It took me a while, but I think I get this. If you try this and filename is undefined, that could be a problem. One bang forces it into the logical realm and flips it, and the other bang flips it back, so you get a clean true for anything defined and non-empty. In perl, we'd just go if ( $filename ) {} , but we're cool like that.

But, as you might imagine, it is absolutely impossible to look this one up on Google.

How a technical presentation Q&A session turned my head around.

Sometimes you learn something beyond what you expect.

Last night, I attended a presentation on Puppet at my local LUG, given by Garrett Honeycutt of Puppet Labs. Garrett had evidently been a PLUG guy when I was, but I didn't really recognize him. I did recognize his friend, so it could be my mind going.

Anyway, Puppet, in a nutshell, is a configuration manager useful for keeping hundreds of machines doing what you want, using the same kinds of tools people use to manage software, including moving from Development to Test to Production. Some people I know who ride the big iron, and who find their current means of managing their hundreds of machines a little too hodge-podge and daunting, were there, and it seems they like the taste of that Kool-Aid. I was interested, and thinking it could be useful, but it's a bit more than anything I need.

That's when it hit me. In our current workplace, we're not really doing that. There's one canonical instance of a thing, be it a Perl program, a Perl library, a Javascript module, a web page, a script, whatever. We have test and production databases, and for some of the modules, we're using tests like in Perl Testing, which I must confess I never really "got". So, while we might be generating decent code, we're not engineering it and maintaining it the way we should.

So, I'm spending much of today trying to wrap my head around that, hoping to establish Dev-Test-Production setup for my code. Thank you, Garrett and Mike and the rest, for kicking me the right direction without even meaning to.


It Exists!

Turns out that "Bluetooth for Location" has been solved already, with BlueProximity. I'm still looking for a way that a recurring crontab thing can be set up, but a precedent has been set and licensed GPL, so I'm optimistic.

Also, apt-getting bluetooth and bluez-utils made the broken little dongle shine blue. Just that easy.


How Can My Computer Know I'm At My Desk?

The problem is exactly as stated. I have my computer do certain alerting things, but unless I'm physically near, I don't want it to do some of 'em. And it would be nice to be able to use it to enable certain security aspects, less "unlock when I show up" and more "lock when I leave". But first, of course, we must have that mechanism.

So what are our choices?

  • Motion Sensor - I like it. I used to have one in my den to control the light. It's neat. For work areas, I strongly suggest them for your lights, and have on this blog in the past. But you don't always have motion, especially when you've messed up line-of-sight or have a moment of stillness. So, it's a factor, but not the factor.
  • Light Sensor - That won't tell me everything, but will tell us something, and that something is that the light switch is on at my desk. Not perfect, because often my coworkers turn the light on when I'm not yet there, but it's certainly better than nothing. 
  • Location Sensing - My phone is the very definition of an Every Day Carry, and being a GPS-enabled Android which talks to Latitude, I can query Google to find out where I am. Except, where I am at work is in a subbasement, with no radio and no cellphone, so the location sensing defaults to a switch on the other side of campus, which hardly works. Maybe I should try it for "at home". 
  • WiFi - If my phone is connected to my home network, my switch knows it and with a little coding, I can use it. Which is great, but when I'm at home, I often have WiFi off, and sometimes I use 4G (in the one corner of town where Sprint has 4G). Sometimes I don't though, which makes it a poor indicator at home. But it gets worse at work. My desktop is behind a firewall and my phone would be on the campus WiFi, so never the twain shall meet. I could write something that announces every 10 min or so that I'm on the campus network, should I ever get far enough into coding Android, but it isn't there yet.
  • Bluetooth - This is the one I'm currently dreaming about. My dongle is Class 2, which means I have about 30 feet of range (and possibly less at the office). That's a nice range.
Without the addition of hardware I don't yet own, I'm thinking that the best solution is Bluetooth. I think that I could have fun making a dingus with Arduino that does motion sensing and light sensing (and maybe a thermometer) so I can get all sorts of data I can play with, but I can't play with that at work tomorrow.
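A sketch of how the Bluetooth check might go, assuming bluez's hcitool and a made-up phone address: hcitool name prints the device's name when it's in range and nothing when it isn't, and the decision is split into its own sub so it can be tried without a dongle.

```perl
use 5.010 ;
use strict ;
use warnings ;

# Poll for the phone; the address passed in would be the phone's
# Bluetooth address (the example below is a placeholder).
sub phone_is_near {
    my ( $bdaddr ) = @_ ;
    my $output = qx( hcitool name $bdaddr 2>/dev/null ) ;
    return is_present( $output ) ;
    }

# hcitool name prints the device name when the device answers,
# nothing when it doesn't -- so any non-whitespace output means "here".
sub is_present {
    my ( $output ) = @_ ;
    return ( defined $output && $output =~ m{\S} ) ? 1 : 0 ;
    }

# Something like phone_is_near( 'AA:BB:CC:DD:EE:FF' ) from a cron job,
# locking the screen when it goes from 1 to 0, is the idea.
```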

Any options I'm missing, besides the USB dongle and smart card, that you can think of?


The Delicious Fruit Has Become Bitter

I just got mail from Yahoo.

It's the opt-out email. YouTube creators Chad Hurley and Steve Chen have bought Delicious from Yahoo! and are going to try to get it going again.

I wish them luck.

I doubt it will go anywhere.

I got into Delicious when I was at the clinic, I think. There were two main selling points for the service:
  1. Your bookmarks were saved in one central location, so that you could share them between your work machine, your home machine, your laptop, etc.
  2. You could see what other people bookmarked and found valuable on various subjects.
Point #1 was crucial to me. You start getting computers, you start getting headaches about knowing where your links are. Point #2? I tried it a few times. It never was too valuable to me. When I set up a new computer, one of the first things I did was sign in with Delicious and download the Firefox plugin.

When Chrome came along with their bookmark system, I jumped. Right now, a few games that don't work right in Chrome are all that keep me going back to Firefox, plus the occasional debugging, and I never ever log into Delicious. A few months ago, I downloaded the bookmarks from Delicious and put them into Evernote. I don't think I've surfed from it yet. I doubt half the content is still there.


Pop Culture Rant, not so much a Tech Rant

George Lucas doesn't give a sh*t about you. It's survival of the fittest, friend, and George has the f*cking editing bay.

I think we saw what we needed to know about George Lucas in Return of the Jedi. We know that Boba Fett's backpack has jets, but he couldn't jet out of the Sarlacc pit. The song is as cool as Fett seemed in The Empire Strikes Back, as cool as MC Chris presents him here, promising to chase you "from Endor to Hoth, from Ripley to Spock", but Fett dropped into the Sarlacc pit like a sucker and couldn't get out even though he had the tools to. Forget, if you can, that the saga begins with Jar Jar and ends with Ewoks, and just focus on how the intense human drama at the end of Empire gave way to the fairly ham-handed beginning of Return.

Watch the movies. Enjoy them. But remember that Lucas has lost the script. Don't invest yourself in Lucas, because he doesn't care about you.


Password Strangeness

I'm in this group. This group has a web tool. This web tool is made with ease of use and trust as cornerstones, so there's one password. That password, for purposes of discussion, is Abc123. Be it noted that I remember the letters and numbers, but not necessarily the cases.
While waiting for the bus to work, another member of the group wanted access to the web tool, and emailed me for the password. I did some checking on my Android phone and told him that the password was ABC123, because that worked with my phone.

When I got to a computer, we did some testing and found that neither abc123 nor ABC123 worked, before I checked the password configuration and found the right one.

So why, when I put the password into this tool on my phone, did it accept ABC123 as Abc123?
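I can only guess, but here's a sketch of one explanation that would fit the symptoms: the phone's path through the tool compares passwords case-insensitively, while the desktop path compares them exactly. This is purely hypothetical code, not the tool's actual logic:

```python
# Hypothetical checks illustrating behavior that would explain the symptom;
# this is NOT the web tool's actual code.

def phone_check(entered: str, stored: str) -> bool:
    # A case-insensitive comparison accepts any casing of the right characters.
    return entered.lower() == stored.lower()

def desktop_check(entered: str, stored: str) -> bool:
    # An exact comparison insists on the stored casing.
    return entered == stored

stored = "Abc123"
phone_ok = phone_check("ABC123", stored)      # the phone said yes to this
desktop_ok = desktop_check("ABC123", stored)  # the desktop said no
exact_ok = desktop_check("Abc123", stored)    # only the exact casing works

print(phone_ok, desktop_ok, exact_ok)  # True False True
```

If that's what's going on, the phone accepting ABC123 and the desktop rejecting it are both exactly what you'd see.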


Configuration Joys

I just got a new remote. I don't suppose I really needed one. I have three in my bedroom (four if you count the wireless keyboard) and I'm more than fine with handling them. It wasn't "Hey, I need a new zapper in the bedroom" that led me to jump on this.

It was my RSS feed.

I have Radio Shack stuck in there, mostly to see when Radio Shack starts selling the Arduino. What I saw was an announcement of Logitech's Harmony line of remotes. I had heard about them. A friend has a fairly advanced one that has a screen and allows you to combine button-presses for several devices into one action, such as "play movie" starting the DVD player and moving to the composite input.

I got the Harmony 200,  a much less complex and cool remote. Also, at $20, much much cheaper. It can control three devices, which works well for the bedroom (TV, DVD, Cable) but the living room has TiVo too and the four-device Harmony 300 is $10 more.

The selling point, the thing that made me go out and get it, was the programming. Well, the price, too, but that just meant it was possible. You plug it into USB and there's a quick-and-easy screen which lets you tell it this is my TV, this is my DVD player, etc. You can then customize the buttons. Then you sync the settings and you're done. No more messing with the "Press the Mode button for 4 seconds, look for the flashing LCD, then punch this four-digit number in" or the worse "Press Power then Down again and again until the TV goes off".

I'll mess with finding different combinations, but this is a solid, well-made remote whose ROM I flash to bend it to my will.

We live in the future.


More Fun With SQL

I've mentioned this bit of code recently:

        INSERT INTO accession_analysis (
            accession_id ,  analysis_id ,
            reference_id ,  status ,
            status_text  ,  extra_parameters )
        SELECT
            accession_id ,
            ? , ? , ? , ? , ?
        FROM accessions
        WHERE request_id = ?
        ORDER BY accession_id
Well, it's slightly different. Before, we were just holding the current state ( waiting, working, success, failure ); now we're holding text information too, and you don't just want the current state, you want to be able to look back. So, in addition to accession_analysis , we're adding accession_analysis_status, which will have, so far, a unique id, an id for the accession_analysis row it connects to, and then the status information.

And now there's the run: I have to add a run number to the accession_analysis schema. I can do that, but it's not germane right now.

What is germane is how to store the status into accession_analysis_status at about the same time as it goes into accession_analysis, getting the id from the accession_analysis table. It would be far easier if I were using an iterative approach, but then I'd be blasting the DB with many connections instead of just one.

An approach would be to find all the accession_analysis rows without a matching accession_analysis_status, and then insert them into accession_analysis_status. Something like

INSERT INTO accession_analysis_status ( 
    aa_id , status , status_text ) 
SELECT id , status , status_text 
FROM accession_analysis 
WHERE there's no accession_analysis_status row whose aa_id matches this id
But clearly, I don't know how to express this as SQL yet.
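For the record, the standard way to say "rows with no matching row over there" in SQL is a LEFT JOIN that keeps only the rows where the join came up empty. Here's a sketch using the post's table names, wrapped in Python's sqlite3 just so it runs; the column layouts are simplified guesses, not the real schema:

```python
import sqlite3

# Toy versions of the post's tables; the real columns are surely richer.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE accession_analysis"
            " ( id INTEGER PRIMARY KEY , status TEXT , status_text TEXT )")
cur.execute("CREATE TABLE accession_analysis_status"
            " ( id INTEGER PRIMARY KEY , aa_id INTEGER ,"
            "   status TEXT , status_text TEXT )")

cur.executemany(
    "INSERT INTO accession_analysis ( id , status , status_text )"
    " VALUES ( ? , ? , ? )",
    [(1, "working", "started"), (2, "success", "done")])
# Row 1 already has a history entry; row 2 does not.
cur.execute("INSERT INTO accession_analysis_status ( aa_id , status , status_text )"
            " VALUES ( 1 , 'working' , 'started' )")

# The anti-join: LEFT JOIN, then keep only rows where the join found nothing.
cur.execute("""
    INSERT INTO accession_analysis_status ( aa_id , status , status_text )
    SELECT aa.id , aa.status , aa.status_text
    FROM accession_analysis aa
    LEFT JOIN accession_analysis_status aas ON aas.aa_id = aa.id
    WHERE aas.aa_id IS NULL
""")
con.commit()

inserted = cur.execute(
    "SELECT aa_id FROM accession_analysis_status ORDER BY aa_id").fetchall()
print(inserted)  # row 2 picked up a status entry; row 1 was not duplicated
```

The nice part is that it's idempotent: run it twice and the second pass finds no unmatched rows, so it inserts nothing.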


Command Line Comments

Just read Life on the Command Line, wherein the author explains how he does everything, everything, everything on the command line these days.

I'm largely sympathetic. Honestly. And I'm close. Look at my work Linux box and you'll see a good half-dozen terminal windows open. I'm a programmer whose two primary programming styles are command-line/batch/crontab and web programming, so I am much more comfortable with those styles and their explicit order than GUI developers are.

But I have sort of a guilty secret.

I spend most of my programming time using KomodoEdit.

I have a fairly custom .vimrc that allows me to do cool things in vi, but beyond crontab editing and other things that look for $EDITOR, I only use vi when I want to do lots of specific and repetitive find-replace stuff or need to do small changes, or when I'm using SSH to connect to a system and not using SSHFS to mount it. My heavy lifting for editors is done via KomodoEdit.

And that isn't it. Not by a long shot.

I have a significant amount of music. I used to use Rhythmbox as my media player because it offered a command-line interface I could alias. On that system, I had rhythmbox-client --play , rhythmbox-client --pause , rhythmbox-client --previous and rhythmbox-client --next aliased to play , pause , prev and next .

I stopped for a few reasons. I now have a Windows machine that I use for testing web dev in different browsers, and, when I play my media, I play via Windows Media Player, in part because playing and queueing media I have on disk but hadn't added to my media library made Rhythmbox unhappy. But, between Amazon, Google, Rdio, Pandora and Spotify (which I love), I hardly listen to that huge library anymore, because either I have much of that library up already or I'm streaming stuff I want but don't have. And those tools don't give me a command-line option.

It gets to what tool is powerful enough to allow me to do what I need to. With music, I need it on and going, and paused on occasion.

Don't get me wrong. I'm with him at least in part about mail sorting. Before web tools and Thunderbird, my preferred mail client was PINE, but there were functional reasons, which I can't remember anymore, that I moved the one remaining command-line mail account from mh/nmh-based sorting to procmail. IIRC, I liked the syntax of mh better, but I found that SpamAssassin worked better with procmail, and I desperately needed SpamAssassin. The sorting choices I get from both Gmail and Zimbra are clunkier than a nice mh or procmail script. But I spend little of my time in email, so that shouldn't make much of a difference.

Anyway, interesting read.


Fun with SQL

If you want to put one thing into a database, that's easy.
 INSERT INTO table_1 (field_1 , field_2 , field_3 ) VALUES ( 4 , 1 , 2 ) ; 
I've been putting lots of things into table_1 that are in table_2, where there are n instances of table_1 for each table_2 row. This led me to something like this.
my $sql ;
$sql = <<'SQL' ;
SELECT id FROM table_2 WHERE field_3 = ?
SQL
my $ptr = db_arrayref( $sql , $param->{ field_3 } ) ;

$sql = <<'SQL' ;
INSERT INTO table_1 ( 
    field_1 , field_2 , field_3 ) 
VALUES ( ? , ? , ? )
SQL

# one INSERT per id pulled from table_2
for my $id ( @$ptr ) {
    my @vals ;
    push @vals , $param->{ a } ;
    push @vals , $param->{ b } ;
    push @vals , $id ;
    $dbh->do( $sql , undef , @vals ) ;    # $dbh: your DBI database handle
    }
That's OK for what it is, but what that ends up meaning is lots of small SQL commands sent to the server. The server handles the commands well, but the opening and closing of network connections is just sort of sucky, so that's not optimal. If you can send just one command, that's what you want. And in this case, the command is this:
INSERT INTO table_1 ( field_1 , field_2 , field_3 ) 
SELECT
    1 ,
    2 ,
    id
FROM table_2 WHERE field_3 = ?
Isn't SQL wonderful?
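A quick sanity check of that shape of statement, using Python's sqlite3 and the post's generic names (the data is made up): the constants ride along once per selected row, so the whole SELECT-then-loop collapses into one round trip.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE table_1 ( field_1 , field_2 , field_3 )")
cur.execute("CREATE TABLE table_2 ( id INTEGER PRIMARY KEY , field_3 )")
cur.executemany("INSERT INTO table_2 ( id , field_3 ) VALUES ( ? , ? )",
                [(10, "x"), (11, "x"), (12, "y")])

# One INSERT ... SELECT instead of a SELECT followed by n little INSERTs.
cur.execute("""
    INSERT INTO table_1 ( field_1 , field_2 , field_3 )
    SELECT 1 , 2 , id
    FROM table_2
    WHERE field_3 = ?
""", ("x",))

rows = cur.execute(
    "SELECT field_1 , field_2 , field_3 FROM table_1 ORDER BY field_3"
).fetchall()
print(rows)  # one row per matching table_2 row, constants repeated
```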


Failure to Plan is Planning to Fail

I have a database table full of requests. When that table is filled, we also put together a wiki page for each request, because the data we generate for each request can be fairly free-form. This means that we effectively divorce the state of the DB for a given request from the state of the wiki.

There are certain fields that are on the wiki page and not the database. Specifically, contact information for the person making the request. It made sense to me, at the time. I think it was based upon the data separation issue, or that we actually hold that in a profile, too, but I can't remember right now.

Now, my boss wants to be able to regenerate the wiki page for quick-and-easy refresh. Regenerate the DB stuff into a new wiki page and cut-and-paste the non-DB stuff from the old page to the new page. Easy-peasy. Except, I don't store all the data in the database.

So, now, I'm looking at adding columns to the table to handle said contact information and spidering the contact info off the wiki pages. If I had just put it into the database in the first place...


More Javascript Didacticisms: Namespaces

For the draft page in the Blogger dashboard, I'm seeing sixteen javascript libraries and five extensions running in Chrome. That's a lot. Think about that next time you type var scratch = "something" .

If you want to use scratch and so does the person who wrote the popup library, your code might not work. I learned this when I was writing test code to properly create objects for jQuery's post(). I was sharing a variable that contained the URL for the JSON I was grabbing, and when I had one version commented out, the other worked perfectly, but otherwise, it all crashed.

The solution? Objects.
var my_ns = {
    foo : 'x' ,
    bar : 14 ,
    blee : function () {
        // everything hangs off my_ns, so nothing leaks into the global scope
        alert( my_ns.foo ) ;
        }
    } ;

$(function() {
    my_ns.blee() ;
    } ) ;


Thank you, Stack Overflow!

I have to thank Stack Overflow again.

I code everything from SQL to Javascript, and that means that I don't really get to become expert in everything. I do alright, but there are occasionally things I just don't get, and I don't really have a community of developers around me to ask.

My most recent problem relates to hashes in Javascript connecting to JSON. You can make arrays behave like hashes in Javascript:

var hash = new Array()

hash['foo'] = 1

alert( hash['foo'] )

My problem was with post, the jQuery method to do AJAX with large data sets. This didn't work.

$.post( url , hash , function() { ... } , 'json' )

Because hash is an abused array, not an object. Until I figured that out today, thanks to SO, I was making a long string that looked like { 'foo' : '1' , 'bar' : '2' } and running eval on it. Bad bad bad bad bad. I now know that it should work like this:

var hash = {}

hash['foo'] = 1

$.post( url , hash , function() { ... } , 'json' )

Thanks again, StackOverflow!


What to do with massive amounts of computrons, plus quitting in front of people

I am in the bioinformatics space. The irony is, there's no way I could've paid to attend OSCON. 


Dialup Slowdown

This is the dialup handshake, slowed down 700x.

I used to have a 2400 baud modem (about 2.4 Kbit/s), and I'm now getting 5 MB/s, which works out to roughly a 16,000x speedup, far more than 700x.


My .perltidyrc

Especially when working with others, it is important to be able to format your code. Perl Best Practices convinced me to start using PerlTidy, and this is my current .perltidyrc.

# my .perltidyrc config file

  --noopening-sub-brace-on-new-line    # -nsbl : keep opening sub braces on the same line
  --indent-closing-brace               # -icb
  --indent-block-comments              # -ibc : indent comment blocks



I Keep Telling People!

This guy on Reddit points out that, in 1996, a simple statement of many of our regular lives would be taken as science fiction.
Mary pulled out her pocket computer and scanned the datastream. It established contact with satellites screaming overhead, triangulated her position, and indicated there was an available car just a few blocks away; she swiped her finger across the glass screen to reserve it. A few minutes later, she spotted the little green hatchback and tapped her bag against the door to unlock it. "Bummer," she said as she glanced at her realtime traffic monitor. "Accident on the Bay Bridge. I'll have to take the San Mateo. Computer, directions to Oakland airport. Fastest route." Meanwhile, she pulled up Kevin's flight on the viewscreen. The plane icon was blipping over the Sierra Nevadas and arrival would be in half an hour. She wrote him a quick message: "Running late. Be there soon. See if you can get a pic of the mountains for our virtual photospace."
This is not a new thought. I worked in a used bookstore during my first sojourn in academia, around the summer of 1991, five years before the point in question. At that time, I was big into cyberpunk science fiction. Bruce Sterling and William Gibson. I was putting things away, cleaning up the place, when I saw a book centered upon a computer hacker. It wasn't in the science fiction section. It wasn't even in the true crime section like Cliff Stoll's The Cuckoo's Egg. It was in mystery. That was the point where science fiction stopped being about futurism for me.


Lovin' Me Some Gingerbread

So, last night, I forgot to plug in my phone, an HTC Evo running Android. I've had it since November, and until now, if I forgot the overnight charging, I'd have a dead phone in the morning.

I had something like 77% charge this morning.

I love the Power Management under Gingerbread.