In my lab, we have an AJAX-heavy web tool that loads a certain JSON API on page load. It was judged too slow, so I wrote a program that writes that JSON to a static file at regular intervals. The problem with that, of course, is that changes to the data don't show up in the static file until the next scheduled update.
So we created a third version, which checks a checksum of the data in the database: if it has changed, it regenerates the file and sends the data; otherwise, it opens the existing file and sends that.
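The checksum-gated cache can be sketched roughly like this. This is my reconstruction, not the lab's actual code: `fetch_checksum` and `generate_json` are hypothetical stand-ins for the real database calls, and the `/tmp` paths are placeholders.

```perl
#!/usr/bin/env perl
use strict ;
use warnings ;
use Digest::MD5 qw( md5_hex ) ;

# stand-ins for the real database calls; the lab's version would
# query the database for a checksum and for the full dataset
my $dataset = '{"data":[1,2,3]}' ;
sub fetch_checksum { md5_hex($dataset) }
sub generate_json  { $dataset }

my $cache_file = '/tmp/api_cache.json' ;
my $sum_file   = '/tmp/api_cache.md5' ;

sub cached_response {
    my $sum = fetch_checksum() ;
    my $old = '' ;
    if ( open my $fh, '<', $sum_file ) { $old = <$fh> // '' ; close $fh }
    if ( $sum ne $old or !-e $cache_file ) {
        # data changed (or first run): regenerate the static file
        open my $out, '>', $cache_file or die $! ;
        print $out generate_json() ;
        close $out ;
        # remember the checksum for next time
        open my $sfh, '>', $sum_file or die $! ;
        print $sfh $sum ;
        close $sfh ;
    }
    # serve the (possibly just-refreshed) static file
    open my $in, '<', $cache_file or die $! ;
    local $/ ;
    return <$in> ;
}

print cached_response(), "\n" ;    # the JSON payload
```

On a cache hit this does one cheap checksum query plus a file read instead of rebuilding the whole response, which is where the speedup below comes from.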
I tested with Chrome Dev Tools, which told a bit of the story, but at a scale closer to anecdote than data; I wanted hundreds of hits, not just one. So I pulled out Benchmark, which told a story, but not quite the one I wanted: it starts the clock, runs the request n times, then stops the clock, while I wanted timing data for each individual GET.
I also realized I needed to verify that the data I was getting back was the same in each case, so I used Test::Most to compare the structures decoded from the JSON. That was useful, but most useful was the program I wrote using Time::HiRes to grab the times more accurately, then Statistics::Basic and List::Util to turn the collected arrays of sub-second response times into a summary of how much faster it is to cache.
And it is fairly significant. Best- and worst-case performance were comparable, but in the average case the cached version is about twice as fast as the dynamic one, and the static file about seven times faster, with, of course, the same staleness problem.
If I weren't about to take time out of the office, I'd start looking into other ways to speed things up. Good to know, though, that I'll have the means to test and benchmark them once I get back next week.
#!/usr/bin/env perl

# my modern perl boilerplate
use feature qw( say ) ;
use strict ;
use warnings ;

# modules used
use LWP::UserAgent ;
use Benchmark qw{ :all } ;

my $agent = LWP::UserAgent->new() ;
my $count = 20 ;
my $base  = 'https://example.edu/AJAX/endpoints' ;
my @apis ;
push @apis, '/the_caching_one.cgi' ;
push @apis, '/the_dynamic_one.cgi' ;
push @apis, '/the_static_file.json' ;

timethese( $count, {
    'cache'  => sub { $agent->get( $base . $apis[0] ) } ,
    'api'    => sub { $agent->get( $base . $apis[1] ) } ,
    'static' => sub { $agent->get( $base . $apis[2] ) } ,
} ) ;
exit ;

__DATA__

Benchmark: timing 20 iterations of api, cache, static...
       api: 11 wallclock secs ( 1.14 usr + 0.06 sys = 1.20 CPU) @ 16.67/s (n=20)
     cache:  7 wallclock secs ( 1.05 usr + 0.03 sys = 1.08 CPU) @ 18.52/s (n=20)
    static:  2 wallclock secs ( 1.20 usr + 0.02 sys = 1.22 CPU) @ 16.39/s (n=20)
#!/usr/bin/env perl

# my modern perl boilerplate
use feature qw( say ) ;
use strict ;
use warnings ;

# modules used
use List::Util qw{ min max sum } ;
use LWP::UserAgent ;
use Statistics::Basic qw( :all nofill ) ;
use Time::HiRes qw( gettimeofday tv_interval ) ;

my $agent = LWP::UserAgent->new() ;
my $count = 20 ;
my $base  = 'https://example.edu/AJAX/endpoints' ;
my @apis ;
push @apis, '/the_caching_one.cgi' ;
push @apis, '/the_dynamic_one.cgi' ;
push @apis, '/the_static_file.json' ;

my $times ;

# for each API endpoint being tested, run $count times and
# collect the elapsed time it takes to GET said URL.
# ensuring that the data is correct is another issue
for my $c ( 1 .. $count ) {
    for my $api (@apis) {
        my $end = ( split m{/}, $api )[-1] ;
        my $url = $base . $api ;
        my $t0  = [gettimeofday] ;
        $agent->get($url) ;
        my $t1      = [gettimeofday] ;
        my $elapsed = tv_interval( $t0, $t1 ) ;
        push @{ $times->{$end} }, $elapsed * 1000 ;
    }
}

say join "\t", qw{ name iter min max mean median } ;
say '-' x 55 ;
for my $api ( sort keys %$times ) {
    my @times   = @{ $times->{$api} } ;
    my $size    = scalar @times ;
    my $max     = max @times ;
    my $min     = min @times ;
    my $omean   = mean(@times) ;
    my $mean    = 0 + $omean->query ;
    my $omedian = median(@times) ;
    my $median  = 0 + $omedian->query ;
    say join "\t", $api,
        map { sprintf '%4d', $_ } $size, $min, $max, $mean, $median ;
}
say '' ;
say 'All times in milliseconds. Smaller is better' ;
say '' ;

__DATA__

name	iter	min	max	mean	median
-------------------------------------------------------
pi	20	378	894	610	583
pi.cgi	20	217	886	356	334
pi.json	20	49	171	83	74

All times in milliseconds. Smaller is better
#!/usr/bin/env perl

# this program compares three versions of the PI api for new submissions
# to see if they have the same data. If they don't have the same data,
# their benchmarks are not comparable

# my modern perl boilerplate
use feature qw( say ) ;
use strict ;
use warnings ;

# modules used
use JSON ;
use LWP::UserAgent ;
use Test::Most ;

my $agent = LWP::UserAgent->new() ;
my $base  = 'https://example.edu/AJAX/endpoints' ;
my @apis ;
push @apis, '/the_caching_one.cgi' ;
push @apis, '/the_dynamic_one.cgi' ;
push @apis, '/the_static_file.json' ;

my $data ;

# for each API endpoint being tested, get the response
# and store the decoded JSON in $data
for my $api (@apis) {
    my $end = ( split m{/}, $api )[-1] ;
    my $url = $base . $api ;
    my $r   = $agent->get($url) ;
    if ( $r->is_success ) {
        my $content = $r->content ;
        $data->{$end} = decode_json($content) ;
    }
    else { say 'ERROR: ', $end }
}

# compare each endpoint's output with the others, using $done
# to avoid duplicate comparisons
my $done ;
for my $k1 ( sort keys %$data ) {
    for my $k2 ( sort keys %$data ) {
        next if $k1 eq $k2 ;
        my $k = join ' ', sort $k1, $k2 ;
        next if $done->{$k}++ ;
        my $j1 = $data->{$k1}{data} ;
        my $j2 = $data->{$k2}{data} ;
        cmp_deeply( $j1, $j2, 'are equal: ' . $k ) ;
    }
}
done_testing() ;
exit ;