Cookie Notice

As far as I know, and as far as I remember, nothing on this page does anything with cookies.

2013/11/27

Why Yahoo is Sad for me

In a perfect world, I would be able to wrap up my tools in a little bundle. I can't. We've made a decision to have one lib directory and put everything in it. So, we have our blessed jQuery, jQuery-UI, Mustache.js, etc. And we don't use every library in every web app. So, for one JS-heavy page, I'm calling those three, plus jQuery TableSorter, HighCharts and a theme, plus my own library that calls them.

Well, really, I can and should pull out the HighCharts libs, as we no longer plot on that page. Which makes things even worse: I call seven (now five) libraries to do this work, which YSlow considers excessive. But because I don't always use jQuery-UI and Mustache, I can't just make and minify a std_lib.min.js with everything put together.

This is where I get a B.

Even worse, I don't run everything through gzip. (F.) I also don't use a CDN. I am using SSL now, and it seems that SSL and CDNs are mutually exclusive, as everything that doesn't come from your https server is a potential security issue. Then again, almost everybody who is going to use this data is going to be in the same lab. Still, F.

And we're not going to get into how I'm not going to play with the Apache config to get Expires headers (F) and Entity Tags (F).

And we have our Lab Notebook, the place where we store all our work, in OddMuse. And OddMuse remembers who you are with a Cookie. Not having written OddMuse, I can't change that. Not having a good alternate choice, I can't change to another Wiki. So, there's one cookie. (D)

I know that Yahoo's problems are not my problems. Honestly, while I got a grade of 77 (C) on the page as a whole, I know it's one of my better websites, because it handles only a small chunk of data at a time. And really, I'd rather work with my unminified JS so that if problems arise, the JS console will give me a sane reading of where the problem lies.

So, I'll take that Charlie and proceed to the next problem.

2013/11/07

Counting Hashtag Usage with R

Found this on the Revolution Analytics blog: What does Barack Obama tweet about most? In essence, pull down all the tweets from the official Twitter feed for the President, grab the hashtags and create a bar graph.

I don't often use R interactively. I generally make a script, set it in crontab, and have it run automatically. So, I adapted it to run via Rscript.

2013/09/03

Bug in Perl? Or am I just doing it wrong?

Three modules: MyTools::Foo, MyTools::Bar and MyTools::Blee. Foo primarily exports foo(), which returns "foo". Bar, similarly, exports bar() which returns "bar". Same with Blee.

Foo also exports foo_bar(), which concatenates the output of foo() and bar(), and foo_blee(), which does the same with foo() and blee(). Bar has bar_foo() and bar_blee(), and Blee has blee_foo() and blee_bar().

Here, the library holding MyTools is identified using the PERL5LIB environment variable. I do this to model a problem shared by a suite of modules that have spaghetti-like tendencies to interconnect with each other. Previously, I had used use lib '/path/to/lib' for this purpose, but the need to start using git and the like has pushed me toward a means of identifying library paths without hardcoding them.

I haven't checked whether the test case works when it uses use lib, but my production code uses lib all over the place, and I only noticed the problem when I copied it to a dev directory and pulled those lines.

The problem I'm seeing is that Perl starts to look for foo() in Bar or Blee, when foo() is in Foo. Clearly, it'd be better if Foo didn't include Bar and Blee while Bar included Foo and Blee, etc. It would be nice if I had separated the code better in the first place, but the code base where I started finding this problem is 19 modules with over 8,000 lines and is used across several systems by someone else, so every change has the potential to break my lab's workflow. I can't really disentangle it right now.
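The post doesn't include the modules themselves, so here is a minimal, self-contained sketch of the layout described above (the two-module reduction, the export lists and the temp-directory plumbing are my invention) that reproduces the symptom of foo() being looked up in the wrong package:

```perl
#!/usr/bin/perl
# Hypothetical two-module reduction of the MyTools layout.
# The modules are written to a temp dir and loaded through @INC,
# standing in for PERL5LIB.
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

my $lib = tempdir( CLEANUP => 1 );
make_path("$lib/MyTools");

my %source = (
    'MyTools/Foo.pm' => <<'FOO',
package MyTools::Foo;
use strict; use warnings;
use Exporter qw(import);
our @EXPORT = qw(foo foo_bar);  # this assignment runs at runtime,
use MyTools::Bar;               # AFTER this compile-time use fires
sub foo     { 'foo' }
sub foo_bar { foo() . bar() }
1;
FOO
    'MyTools/Bar.pm' => <<'BAR',
package MyTools::Bar;
use strict; use warnings;
use Exporter qw(import);
our @EXPORT = qw(bar bar_foo);
use MyTools::Foo;               # Foo is mid-compile; nothing imported
sub bar     { 'bar' }
sub bar_foo { bar() . foo() }   # foo() resolves to MyTools::Bar::foo
1;
BAR
);

for my $path ( keys %source ) {
    open my $fh, '>', "$lib/$path" or die "can't write $path: $!";
    print {$fh} $source{$path};
    close $fh;
}

unshift @INC, $lib;             # what PERL5LIB would have done
require MyTools::Foo;
MyTools::Foo->import;

print foo_bar(), "\n";          # works: prints foobar
print eval { MyTools::Bar::bar_foo() } // "bar_foo died: $@";
```

If this matches the real code, the mechanism is that use happens at compile time while the our @EXPORT = ... assignment happens at runtime, so the circular use sees an empty export list and Bar never imports foo(). Common workarounds are populating @EXPORT inside a BEGIN block so it exists before the circular use fires, or fully qualifying cross-module calls as MyTools::Foo::foo(). This is a reconstruction of the symptom, though, not a diagnosis of the actual 19-module code base.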

So, it strikes me that Perl is wrong, too. (I fully admit that my code is an unruly hairball. I'm starting to pay that technical debt right now.) I'm somewhat loath to tag this as an error in Perl and start filing bug reports until someone with more direct experience looks at this and says either "That's odd" or "Dave, you're a dumbass." I would certainly accept either answer.

So, am I wrong?

2013/08/21

Caller ID, Time and Temp with Twilio

The other day, my wife got a new phone, replacing her shattered iPhone 4 with a new LG Optimus. The problem is, by rotation, the upgrade was for me, so the new phone was to have my phone number. They moved it to her phone number, but that left me disconnected from the cell network. At least, I knew at the time I was disconnected, but I was less than sure that her phone was using her number.

My thought was, "It would be nice to be able to call someplace and have it tell you what number your phone is, so we could be sure." So, I wrote something.


This uses Twilio, which is essentially an API into IP telephony. I've used Twilio before, and have two projects on GitHub, Call_Me and SMS_Me, that are generally unidirectional messaging tools, but I haven't done much interactivity with the tool.

It gives the following output, which Twilio then runs through text-to-speech.


The time is currently 4 37 pm

The temperature in West Lafayette is 83 degrees Fahrenheit.

Your phone number is 1 2 3. 4 5 6. 7 8 9 0.

A few notes on the formatting first. I separate the digits of the phone number (and that is not a real number) so that it says the number with an expected cadence. If I just put 1234567890, it'd say one billion, two hundred thirty-four million, five hundred sixty-seven thousand, eight hundred ninety. If I broke it up to 123 456 7890, it'd say one hundred twenty-three, four hundred fifty-six, seven thousand eight hundred ninety. To break up the numbers so it just says the digits, you need the spaces. To break it up so the area code, exchange and line number are distinct, you need the punctuation.

Times are weird, too. If the time is 4:37, just having the time as 4 37 works, but if the time is 4:05 and broken into 4 05, it'd read it as four five. This is why I use digit, letter "o", digit for times like that.
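The post doesn't include the formatting code itself; a sketch of helpers implementing the rules above might look like this (speakable_number and speakable_time are names I made up for the illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';

# Space the digits so TTS reads them individually, and end each
# group with a period so it pauses between area code, exchange
# and line number.
sub speakable_number {
    my ($number) = @_;
    my @groups = $number =~ /^(\d{3})(\d{3})(\d{4})$/
        or return $number;    # not a ten-digit number; punt
    return join ' ', map { join( ' ', split //, $_ ) . '.' } @groups;
}

# Say "o" in place of the leading zero of single-digit minutes,
# so 4:05 comes out "four o five" instead of "four five".
sub speakable_time {
    my ( $hour, $minute ) = @_;
    my $spoken = $minute < 10 ? "o $minute" : $minute;
    return "$hour $spoken";
}

say speakable_number('1234567890');    # 1 2 3. 4 5 6. 7 8 9 0.
say speakable_time( 4, 37 );           # 4 37
say speakable_time( 4, 5 );            # 4 o 5
```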

I say lots of bad things about XML and prefer to avoid it when possible, but sometimes, like here, it's just that easy to handle. Kudos to XML::LibXML and NOAA for making that part reasonable and good.

Now that I have this working, I'm curious about doing other things. I can see myself adding to this, checking to see if the From number is mine, and if so, opening up choices like my whole Quantified Self stuff, the home automation I want to do more of, etc. I'm not sure I like the TwiML module, and might move that over to Template Toolkit, but that remains to be seen.

2013/08/12

Coding Yourself Into Shape

Today, I've hit a new low.

224 lbs. This is as low as I've been in a great long while.

I've been looking and thinking about this graph for a while and have come to some conclusions.

I ran some last year and it didn't help much. 

The first day I was under 240 (the orange line, separating overweight and obese for my height on the BMI index) was October 7, 2012. (I get that it's problematic because Fat Albert and in-his-prime Mike Tyson were about the same height and weight, but there's a huge difference, pun intended, between their body composition. When people start mistaking me for a heavyweight champion, I'll start worrying about being placed wrong on the BMI chart.) I started running at some point in the summer and mostly ended after a 5K I participated in on August 25. The trend line for weight doesn't change significantly, sticking with the ~1lb/week during that time.

Be aware that this is self-tracked, and there are some days where I forgot to weigh and some weeks when I was unable to check against my scale, and this version of the graph does not show the gaps.

So, what did help?

The majority of my weight loss during that time is probably attributable to drinking more water and less caffeine.


There were certainly times when I lived off the stuff.
In April 2009, I started to think about all the bad things they say about caffeine, as well as all the good things, and decided that I'd go cold turkey, test it out on myself and see for myself.

The first day, I took my boys to a pizza place with a soda fountain, and I had two cups of Diet Coke before I thought about it. The second day, I did much the same. This is why we call it a habit: we keep continuing behaviors because that's what we do.

The third day, I remembered and went without anything.

The fourth day, my co-worker looked over at me, with my area's lights off, hunched over, nauseous and with a head that felt like it was exploding, and told me "If you have the flu, you should just go home." I gave up. But I started cutting back. Way back.

Right now, I tend to drink two cups of coffee a day, only on work days, and almost never on days off. Sometimes maybe a cup on Sundays, but rarely.

The way I think of the problem with cola, even diet cola, is:
  • As I got from Tim Ferriss' Four Hour Body, Never Drink Calories
  • We put artificial sweetener into our drinks to trick our tongues into thinking the drink is sugary and therefore tasty. Current thought is that our pancreas is similarly tricked, producing large amounts of insulin and storing our calories as fat.
  • The body works better with sleep. When you caffeinate, you mess up your sleep. When you don't get good sleep, you get fat.
  • If you don't drink it all the time, there's withdrawal, which gives the flu-like symptoms I describe above. I heard a story about how a person's mom tried to get his dad to switch to decaf on weekends, and the illness and headaches almost led to them getting a divorce. Part of what I began to realize was that I was having the same effects, and sometimes still do.
I have never really trusted the bad things said about Aspartame, but there's enough here to make me choose water over cola every time.

You should never be hungry.

Time was, I would only eat dinner most days. Beyond what it might've done to slow my metabolism, it made me famished by the end of the day. And, as I was either a CS student or a developer in those days, and willpower and cognitive processing draw from the same pool, by the end of the day, exhausted, I made stupid decisions, ate as much as I could, then fell asleep.

Right now, I eat cheaply and probably not right — generally a packet of oatmeal for breakfast and a $1 TV dinner for lunch — but I always eat, so I'm never so famished that I can't say "no" to the promise of a snack, or to thirds. I don't always say no, as I'm a weak, weak man, but I'm better than I was.

Eventually, you get all the benefits you're going to get for one change.

The story from October to the middle of summer is pretty much 235 lbs, plus or minus 5, which is pretty much stasis. The sudden drop at about day 375 on the graph above (I really need to put dates into the graph!) occurred over Christmas/New Year's, when I was away visiting family. Days out and about with my family in Nevada are different from snowy days in Indiana, so other changes in diet and exercise made more difference at 235 lbs than they had before.

My newest changes are more exercise — I try to make it to the gym three times a week, but usually only hit it twice — and fewer carbs — as suggested by Randal Schwartz, I tend to eat the toppings off pizza but leave the bread alone, and not just the crust — and those are leading to the current drop. We'll see how far these take me.

Your body is hackable.

The graph above and the data it shows exist because I read Tim Ferriss' book, The 4-Hour Body. I'm not doing everything in it — it's probably accurate to say I'm not doing anything in it — but most everything in this post started with me trying to affect the inputs to the black box that is my body and trying to get different outputs.

The graphs and the core of my first approach have been mostly inspired by John Walker's Hacker's Diet. My graph shows a several-day rolling average, but not a weighted average like Walker suggests, because that was the limit of the SQL I knew at the time.

My goal right now is to get to and stay at about +/-5lbs from 200, and I think I can do that. My other goal is to not be able to pinch an inch on my belly. I've been able to do that ever since Special K started showing that ad.

There are more things to change than weight.

In many ways, I enjoyed my time trying to build up to a 5K last year. I didn't enjoy all of it. There was intense pain, mostly in my right foot, and running on feet that hurt is not fun. So, I went to see a podiatrist, who said that my left ankle is pronated and my right even more so, so much that my right Achilles tendon is in the wrong place. She made me orthotics. I personally don't think they did much good, and I stopped wearing them earlier this summer. Over the last few weeks, I've been using a home-made standing desk, basically monitors on top of the hutch and a box holding my keyboard up. I might get a purpose-built desk to replace this in the fullness of time, but this was mostly to prove the concept.

I have to say that, for the first two weeks, my feet were always sore. The problem wasn't the sole of the foot, but rather the arches going into the ankle. An interesting thing has happened in the last week: I've felt something happen with my heel which might be my tendon going back into position, and I'm pretty sure my ankle looks better and has better range of motion than it has in quite some time. So, for that change alone, I don't think I'll go back to a standard desk any time soon.

Making it a more public process helps with motivation.

When I make new milestones with my weight loss, I tweet my weight, linking to a JavaScript graph of my recent progress. I wear a FitBit most every day, and have code that connects to the FitBit API and also tweets my steps and floors each day. You can even follow me directly on FitBit. I even have my running tracker, Endomondo, post to Twitter and Facebook, even if I wish it wouldn't post the map.

I don't feel like running every day. I don't feel motivated to just eat the toppings off pizza every day. That people I know can know what I'm doing and give me encouragement is a major part of what I'm doing. (Learning how to make the graphs and sparklines and such is another big part.)

If you have something you want to change about yourself, the thing I would suggest is to break it into things you can keep track of, then go for it!

2013/08/07

Enemy of the State?

I've been glad for all the new life breathed into the Perl community starting with 5.10, but, honestly, there has been only one thing I consistently used in the raft of new stuff added: say. I disliked having to write print "$var\n" each time I wanted to print a line with a newline, and say allowed me an easy way to do that without changing the behavior of print.

But I've been trying out state.
state declares a lexically scoped variable, just like my. However, those variables will never be reinitialized, unlike normal lexical variables, which are reinitialized each time their enclosing block is entered.
That sounds like a winner, doesn't it? But it left me some questions.

I deal with SQL a lot, and so I have queries in my code. Sometimes, big queries. I'm wondering if assigning the SQL query each time is more costly, CPU-wise, than declaring it as a state variable and just holding it in memory.

#!/usr/bin/perl

use feature qw{ say state } ;
use strict ;
use warnings ;
use Benchmark qw{:all} ;
use Carp ;

use lib '/home/jacoby/lib' ;
use DB ;

my $sql = <<SQL;
        SHOW TABLES
SQL

sub outside {
    my $arrayref = db_arrayref( $sql ) ;
    }

sub use_my {
    my $sql = <<SQL;
        SHOW TABLES
SQL
    my $arrayref = db_arrayref( $sql ) ;
    }

sub use_state {
    state $sql = <<SQL;
        SHOW TABLES
SQL
    my $arrayref = db_arrayref( $sql ) ;
    }

timethese(
    10_000 , {
    'outside'   => sub { outside( ) ; } ,
    'use my'    => sub { use_my( ) ; } ,
    'use state' => sub { use_state( ) ; } ,
    } ) ;

DB.pm is my wrapper for the very complex DBI module, which gives me just three functions: db_do, db_hashref and db_arrayref. DB has %prepared, such that $prepared{ $sql } = $dbh->prepare( $sql ), so that, when using a query over and over again, a pre-prepared statement handle object is passed out.
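The real DB.pm isn't shown here, but the %prepared idiom it describes can be sketched like this (the database handle is mocked with a counter so the sketch runs standalone):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %prepared;       # $prepared{ $sql } holds the prepared handle
my $prepares = 0;   # counts how often we "hit the database"

# Stand-in for $dbh->prepare($sql); a real version would call DBI.
sub fake_prepare {
    my ($sql) = @_;
    $prepares++;
    return { sql => $sql };
}

# First call for a given query prepares it; later calls reuse it.
sub cached_handle {
    my ($sql) = @_;
    $prepared{$sql} //= fake_prepare($sql);
    return $prepared{$sql};
}

my $first  = cached_handle('SHOW TABLES');
my $second = cached_handle('SHOW TABLES');
print "prepared $prepares time(s)\n";    # prepared 1 time(s)
```

For what it's worth, DBI ships this pattern as $dbh->prepare_cached, which keeps the cache inside the handle itself.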
Benchmark: timing 10000 iterations of outside, use my, use state...
   outside: 476 wallclock secs (14.66 usr +  8.21 sys = 22.87 CPU) @ 437.25/s (n=10000)
    use my: 475 wallclock secs (14.87 usr +  8.01 sys = 22.88 CPU) @ 437.06/s (n=10000)
 use state: 469 wallclock secs (14.68 usr +  8.14 sys = 22.82 CPU) @ 438.21/s (n=10000)
Looks like, in this case, it doesn't make a whole lot of difference. Still, good to know, right?
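One caveat I'd add: the timings above are dominated by the database round-trip, which is why the three results look identical. Timing just the assignment, with the DB call stripped out (a sketch, with an iteration count picked to run quickly), isolates the my-versus-state difference:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use feature qw{ state };
use Benchmark qw{ timethese };

# Same heredoc-sized string as before, but no database involved.
sub use_my    { my $sql    = "SHOW TABLES\n"; return length $sql }
sub use_state { state $sql = "SHOW TABLES\n"; return length $sql }

timethese(
    1_000_000 , {
    'use my'    => \&use_my ,
    'use state' => \&use_state ,
    } ) ;
```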

2013/07/30

Telling One Liners

I don't do a lot of Perl one-liners. In general, I think if you're writing Perl, you're going to want to repeat it so you should put it in a file.

I say in general because I ran into a specific situation. For a plot-generating process, we 1) pointed to the version of the plot-maker that didn't use libcairo, so it couldn't be used without X11, and 2) accidentally removed a lot of entries from the completed file, where we list the plots we've already made so we don't redo them, which led to 3) remaking lots of existing plots with a plot-maker that couldn't make plots, meaning files that existed got replaced with zero-sized files.

Once we figured out what the problem was, I wrote this one-liner, which ran for over an hour and 40 minutes.
for i in {1..100} ; 
do (
    ls -lS */nanodrop/*png | perl -lane ' print $F[8] if 0 == $F[4] ' | wc -l 
    ); 
sleep 60 ; 
done 
Basically, it looped over and over again to keep constant watch (the for loop), each time taking an ls -l (the -S, ordering by size, not being strictly necessary but left over from my initial checking), breaking it apart with Perl, and printing the file name ($F[8]) if the file size ($F[4]) was 0. I pipe the output into wc -l, which counts the number of lines. I then sleep for a minute (as the number of zero-length files was over 7000 at first run, the ls -l took quite some time, so the reports were much longer than a minute apart), and do it again.
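For counting zero-byte files specifically, find(1) could replace the ls-plus-Perl pipeline and skip the parsing entirely; the directory layout below is invented just so the example runs on its own:

```shell
# Build a throwaway directory tree with one empty and one real "plot".
dir=$(mktemp -d)
mkdir -p "$dir/run1/nanodrop"
: > "$dir/run1/nanodrop/empty.png"              # zero bytes
printf 'data' > "$dir/run1/nanodrop/full.png"   # four bytes

# -size 0c matches files of exactly zero bytes.
count=$(find "$dir" -path '*/nanodrop/*.png' -size 0c | wc -l | tr -d ' ')
echo "zero-byte plots: $count"

rm -rf "$dir"
```

That said, the ls version was what I knew, and for a one-off watch loop, the tool you know beats the tool you'd have to look up.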

Granted, I could've done it all in Perl, but why re-implement tools that exist in the shell when you don't have to? I could've done the Perl stuff in Awk, if only I knew Awk better, but I could get to the point where I knew the damage and the rate of repair much faster with tools I knew. Really, the only thing I needed to learn was that perl -lane was what I needed to get the lines broken easily.

2013/07/29

Making a CarPuter to step into the New Car

It's been a while since I've written on car computing — over two years, it seems — but that doesn't mean I've stopped thinking about it. The top reasons I would've wanted one in years past would've been for Entertainment and Communication, and I think everyone would agree that today, it's far better to just carry a smartphone than to embed one into your car. And, seeing how smartphones are changing so much so fast, doing more than adding the ability to interface with Bluetooth and/or USB to your vehicle seems silly, at least in the short term.

Navigation doesn't fare much better: though it's a toss-up whether dedicated units like Garmin and TomTom are better than smartphone navigation like Google Maps, both are accepted as better (more current maps, better interfaces) than in-dash choices.



I think the use that is least considered is recording. Dash cams are common in Russia because they're used for legal protection, but have given YouTube a great catalog of amazing video. I think there's reason beyond "Hold my beer and watch this" for Americans to have dash cams, and I do want one.

But, ultimately, I think the best reason to get into "Carputers" is Diagnostics, getting into the data that is available from your car's OBDII port. The obvious way is to use an OBDII-to-Bluetooth adapter like the ELM327 or Garmin EcoRoute, but it strikes me that there are enough security vectors into the New Car that adding more is not a wise route. So, I'm thinking that the Raspberry Pi and an OBDII USB cable might be the better way to handle it, except I'm not sure how to export the data, and while it would be useful to keep track while driving, ultimately, off-the-road analysis is where the usefulness of the process comes in.

I'm thinking that a Raspberry Pi, a cable, a small monitor with composite video and maybe a few other things could be easily turned into a car-monitoring system, and I could pretty easily set something up to only sync with my home network when it's close. I'm not sure whether that's more cost-effective than just getting an ELM327 and an ELM327 app from the Play Store, but I think I'd end up learning more that way.

Anyway, I'm still undecided on the phone/Pi issue, but I think this is something I need to do.

2013/07/23

I've got a standing desk, and boy are my legs tired!

Borscht-belt comedy aside, I converted my desk to a standing desk configuration a week ago last Friday, making this the seventh workday where I've spent most of my day standing.

Yes, compared to other standing desks, this is pretty janky. It works for me, though. The tops of my monitors are only an inch or so above eye level, which is pretty much according to Hoyle by the ergonomic standards I've seen, and the box puts the keyboard at a good typing level. I'm happy with it.

My feet, on the other hand...

My left foot is a bit pronated, which is to say that the ankle turns in a bit and the big toe turns out a bit and the arch is a bit flat.

My right foot? Take every "a bit" from the previous sentence and replace it with "extremely", and it also lacks the range of motion of my left. I first figured out the issues with my ankles when I noticed I could make fart noises with my foot on the bathroom tile, without trying, but the time I took a hike with my son's Cub Scout pack was when I realized I had a problem, not just an issue.

Since then, I have become much more active. Last year, I did Couch-to-5K, and actually did a 5K. I did it in 45 minutes, finding that I do a mile in 15 minutes or so whether I run or walk, but hey, I got the t-shirt. This year, I've been focused on other activities, but plan to do it again. When I started running last year, I got the same foot pain, which eventually subsided the more I worked. Some internet searching makes me think it was the extensor digitorum brevis muscle that was giving me problems last year, and this last week, this is the pain I've been having again, except this time in both feet.

The previous pain tells me that I'm doing the right thing, that I'm strengthening my feet and ankles by standing. My recent experience with CrossFit tells me that I had better start exercising and strengthening my feet and ankles or this pain won't go away soon.

In terms of productivity, I don't know that I've noticed a difference. I'm in a position at the moment where there's a lot of sit-and-think (or rather, stand-and-think) and not so much great code generation. I don't currently quantify my work, but would be interested in finding a way to do so. Of course, it would've been good to have a body of sitting-desk work in the dataset to use as a comparison....


2013/05/15

Stupid Things with References

It took me a while to figure out why I was writing junk to the DB. More or less, this is my code.

my $request = get_request( $request_id ) ;
my $to_wiki = $request ;

I thought I was making a copy of the hash in the hashref.

I meant to be making a copy of the hash in the hashref.

I wasn't making a copy of the hash in the hashref. I was making a copy of the hashref address. Which means every change I made to $to_wiki, I made to $request.

Instead, I needed to do something more like this.

my $request = get_request( $request_id ) ;
my $to_wiki ;
%$to_wiki = map { $_ => $request->{ $_ } } keys %$request ;
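A shorter spelling does the same shallow copy; this sketch (with made-up hash contents) contrasts the aliasing bug with the { %$hashref } idiom:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $request = { id => 42, status => 'new' };

# Copying the reference just copies the address: both variables
# point at the same underlying hash.
my $alias = $request;
$alias->{status} = 'changed';
print "$request->{status}\n";    # changed

# Dereferencing into a new anonymous hash makes a shallow copy
# with independent top-level keys.
my $to_wiki = { %$request };
$to_wiki->{status} = 'for the wiki';
print "$request->{status}\n";    # still 'changed', not 'for the wiki'
```

Note that both this and the map version are shallow copies: any nested references inside the hash are still shared. For a deep copy, Storable's dclone is the usual tool.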

I present this as a cautionary tale. Please, learn from the mistakes of others.