Mar 25 2014
 

The Raspberry Pi is an amazing piece of kit. There’s little it cannot do with sufficient tweaking. I have a passing interest in planes, and love FlightRadar24 (iPad app and website). 

FlightRadar24 screenshot

I started to wonder how they got this data. A quick rummage around their website reveals they mostly rely on a worldwide network of volunteers to collect and then feed them data. You need 'line of sight' to an aircraft to be able to receive the information, and no one entity can afford to do this globally. So, a network of volunteers run 'feeder radar stations' (OK, it isn't really 'radar', but more the next-generation version of it).

Hardware I Use

I use a 'stock' Raspberry Pi Model B ordered from RS Components (fwiw, they're slow as heck… order one from somewhere else!), connected via Ethernet to my home network. My Pi is strapped up under my desk out of the way. It has a 4-port (unpowered) USB hub connected to it (the black blob underneath) but otherwise it is entirely unremarkable. I'm even still using the original RS Components 4GB SD card.

Excuse my camera charger lurking underneath it all – it’s entirely unrelated!

My Raspberry Pi, mounted under my desk

Hardware-wise, to receive the ADS-B broadcasts from planes overhead, I use a DVB-T (R820T) stick with an external aerial attached to it. I ordered mine from 1090mhz.com (based in Germany) and it arrived in the UK about 4 days later in perfect condition, and even had an adaptor and external aerial included in the package – thanks guys! This is apparently the 'best' stick to use, as it is based on the R820T chipset.

Software I Use

I use a modified version of dump1090, originally installed using instructions from David Taylor's SatSignal.eu website. dump1090 is fantastic. Using the rtl-sdr drivers (also documented on David's site) to re-tune my DVB-T stick from digital TV to the 1090MHz frequency used by ADS-B allows it to receive the 'next gen' radar data right from the planes in the sky themselves.

dump1090 then takes this feed and decodes the data into something human-readable rather than machine-readable. Using the --interactive mode, you can see the output as the planes fly by.

dump1090 --interactive

Perhaps even more exciting than that, though, is the --net option, which not only enables all the data sockets so that FlightRadar24's software can pull the information out of dump1090 (setup instructions here), but also enables a built-in web server so you can run and view your own miniature version of FR24:

Screenshot of my own 'dump1090' FlightRadar24-style output (--net option)
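If you'd rather pull the raw data out and play with it yourself, those network sockets are trivial to read from any language. Here's a minimal PHP sketch that tails the BaseStation-style CSV stream (port 30003 on my build, but check yours; the field positions in the comments assume the common SBS-1 layout):

<?php
// Minimal sketch: read the BaseStation-style CSV feed exposed by dump1090 --net
// (port 30003 on my build; adjust host/port to suit your own setup).
$socket = @fsockopen('127.0.0.1', 30003, $errno, $errstr, 10);
if (!$socket) {
    die("Could not connect to dump1090: $errstr ($errno)\n");
}

while (!feof($socket)) {
    $line = trim(fgets($socket));
    if ($line === '') {
        continue;
    }

    // Assuming the SBS-1 layout: field 0 is the message class ("MSG") and
    // field 4 is the ICAO hex address; callsign, altitude and lat/lon appear
    // in later fields depending on the message type.
    $fields = str_getcsv($line);
    $icao   = isset($fields[4]) ? $fields[4] : 'unknown';
    echo "$icao: $line\n";
}

fclose($socket);

Pipe that into a file or a database and you can do whatever you like with it.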


 

MySQL Support

You may remember I said I use a modified version of dump1090. That is because, as well as everything above, I also wanted to store a local copy of all the data I receive in a MySQL database so I can do my own manipulations and calculations. While there is a MySQL'd branch of dump1090 on GitHub, it is dozens of commits behind the main branch and missing out on a lot of the hard work Malcolm Robb has put into his master branch and fork of antirez's original work.

So, rather than forgo either MySQL support or the latest version of dump1090, I hacked the two together and re-introduced MySQL support into the latest version of dump1090.

To keep updates to future versions easy, there are very minimal changes to the other source code/header files of the official dump1090 branch. 95% of the MySQL support code is contained within mysql.h and mysql.c, with pretty much the only main-branch changes being the inclusion of the MySQL headers and a new struct in dump1090.h, the --mysql switch handler in dump1090.c, and a call to modesFeedMySQL() in mode_s.c (the struct could even be moved to mysql.h, I suppose, to separate it out further, but I put it with all the other structs for consistency).
 
This should make it relatively simple for me/you to upgrade each time a new version comes out. 
 
MySQL authentication credentials are now in mysql.h rather than buried deep in the codebase. If it’s something lots of people show an interest in, the database credentials could even be supplied on the command line for even greater simplicity and portability. We’ll see… 
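To give a flavour of the sort of 'manipulations and calculations' I mean, here's a minimal sketch that counts the distinct aircraft seen per day. The table and column names are purely illustrative, so adjust them to whatever your schema actually looks like:

<?php
// Illustrative only: the table/column names below are placeholders, not the
// actual schema the modified dump1090 writes to.
$db = new PDO('mysql:host=localhost;dbname=dump1090', 'user', 'password');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$sql = "SELECT DATE(seen_at) AS day, COUNT(DISTINCT icao_addr) AS aircraft
          FROM messages
         GROUP BY DATE(seen_at)
         ORDER BY day DESC
         LIMIT 7";

foreach ($db->query($sql) as $row) {
    printf("%s: %d distinct aircraft\n", $row['day'], $row['aircraft']);
}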
 
If you’d like the latest version (1.08.1003.14) of dump1090 with MySQL support, you can get it here.
 
Happy flying!
Jul 30 2011
 

Ever since I was 13 I’ve been programming in PHP. It’s one of those “you can do anything with it” languages that I just love working with. I have recently launched a (pre-beta) service that automatically checks you into Facebook Places (more will follow, such as Foursquare) based on where your phone reports you to be, courtesy of Google Latitude. It was awesome fun to write and is now live for folks to play with (you can find out more at beta.CheckMeIn.at).

The Problem

Now if it was just for me, it would have been trivial to write. Grab my Latitude position, compare it against a handful of places I frequent, and if any of them match, check me in on Facebook. Checking and comparing my location every 60 seconds would be really easy.

But what if I’m doing that for hundreds or even thousands of people? A script that processed each user in turn would run for hours just doing one sweep of the user database: querying Google Latitude, doing the distance calculations based on latitudes and longitudes, and then punching any matches through to Facebook. Cron that script to run every 60 seconds and the server would fall over from RAM exhaustion in about 10 minutes, and only the first 50 or 100 people in the user database would ever be processed.

The Solution

There are 3 background processes (excluding the maintenance bots) that ‘power’ CheckMeIn.at. They all work out of a central ‘work queue’ table: the parent process gets a list of work to do and inserts work units into the work queue table. It then counts up how much work there is to do and divides that by the number of work units each child process is allowed to handle at a time. If that calls for more children than are permitted to run at once, it spawns the first batch, lets them run, and then spawns more as earlier children exit with their completed workloads.

The beauty of it is that it dynamically grows itself. With 10 users it’ll happily spawn 1 process and run through them all in a second. With 100 users it’ll spawn 2 processes and do likewise. With 2,000 users it’ll spawn 10, and so on and so forth. If we have 1 million users it’ll spawn its maximum (say 50), then wait and spawn extras when there is room. All without any interaction on my part.
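For the curious, the spawning logic boils down to something like the sketch below. It's illustrative rather than the actual CheckMeIn.at code: the table name, the limits and the processWorkUnits() helper are all placeholders.

<?php
// Minimal sketch of the parent/child work-queue pattern described above.
// Table names, limits and processWorkUnits() are placeholders.
$unitsPerChild = 200;  // work units each child may take per run
$maxChildren   = 50;   // ceiling on concurrent child processes

function processWorkUnits($limit)
{
    // Placeholder: a real child would open its own DB connection here, claim
    // up to $limit pending rows from the work queue and process them.
    sleep(1);
}

$db    = new PDO('mysql:host=localhost;dbname=checkmein', 'user', 'password');
$total = (int) $db->query("SELECT COUNT(*) FROM work_queue WHERE status = 'pending'")
                  ->fetchColumn();

$needed  = (int) ceil($total / $unitsPerChild);
$running = 0;

for ($i = 0; $i < $needed; $i++) {
    // Throttle: never exceed the ceiling; wait for a child to finish first.
    while ($running >= $maxChildren) {
        pcntl_wait($status);
        $running--;
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        die("fork failed\n");
    } elseif ($pid === 0) {
        processWorkUnits($unitsPerChild);  // child does its batch, then exits
        exit(0);
    }

    $running++;  // parent: one more live child
}

// Reap any remaining children before the next 60-second sweep.
while ($running-- > 0) {
    pcntl_wait($status);
}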

The Google Latitude Collector (GLC) manages the collection of user locations every 60 seconds. It’s “self-aware” in the sense that it manages its own workload, keeps track of the queries allowed by Google, and generally throttles itself to Do No Evil, while keeping the service responsive. 

The User Location Processor (ULP) follows the same work-queue principles, and compares locations collected by the GLC against the list of Places the user has configured via the web interface. It computes matches and near misses (to help with the setup), honours the delay periods, and so on. If all criteria are met, it passes work units on to…

The Facebook Check-in Injector (FCI). The FCI handles a shedload of sanity checks, prevents double check-ins, examines Facebook for a user’s last check-in to make sure we’re not doing something they’ve already done themselves, and lots more. If it all works out, then we check them in and the whole thing goes round again.
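As one small example of the kind of sanity check the FCI runs, here's an illustrative sketch of the "have we already checked this user in here recently?" guard. The table and column names are hypothetical, not the real CheckMeIn.at schema:

<?php
// Illustrative sketch of a double check-in guard; table/column names are
// placeholders, not the real schema.
function alreadyCheckedInRecently(PDO $db, $userId, $placeId, $withinMinutes = 60)
{
    // Compute the cutoff in PHP to keep the SQL simple.
    $cutoff = date('Y-m-d H:i:s', time() - ($withinMinutes * 60));

    $stmt = $db->prepare(
        'SELECT COUNT(*) FROM checkins
          WHERE user_id = ? AND place_id = ? AND created_at > ?'
    );
    $stmt->execute(array($userId, $placeId, $cutoff));

    return (bool) $stmt->fetchColumn();
}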

Sounds complex, but from the Google Latitude Collector firing off to the check-in landing on Facebook (assuming we’ve adhered to the delay periods), a user is checked in about 4 seconds later.

The Moral

Plan for growth in your application from the very beginning. This project would have been a b*tch to modify later on. But by knowing it’d grow, and implementing self-awareness and control into the app, it can handle infinite growth. If the current server that does all the processing becomes overloaded, it’s trivial to add another to halve its workload, and all without having to modify a single line of code. 

The key however is to have a powerful database server to run it all off. In an hour it can easily generate a million database queries as users interact with the site, and the daemons go about their own business. Without a database server capable of keeping up, things start to seriously slow down.

Apr 21 2011
 

Those of you who have known me for any period of time will probably have been aware that you could find my current location on my personal website (which is now this blog). This was originally just the Google Latitude ‘badge’, which was quite a simple map representation of my current location with a guesstimated range bubble around me. This is still displayed on every page in the right-hand column. However, it only identified the town at best in text form, and offered no historical view or alternative display methods when I was somewhere I go regularly.

Since the 19th February 2011, I have been storing my Latitude location, as updated automatically by the mobile phone that goes with me everywhere. It is written every 60 seconds into a MySQL database, along with a reverse-geocode lookup from the Google Maps API of the best possible postal address for that latitude/longitude, an accuracy estimate (which can be spot on with GPS, within 50 metres with wifi and city-centre 3G coverage, and upwards of 2km in the countryside), and a timestamp. A couple of authorised PHP shell scripts do all the raw collection and storage operations. This then allows me to create my own map of my location that I can play with, as well as offer minimaps of my ‘last 5 positions’ and anything else that might take my fancy.
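The collection side boils down to surprisingly little. Here's a minimal sketch of the idea: fetchLatitudePosition() is a hypothetical stand-in for however the position is pulled from Latitude, the table layout is illustrative rather than my actual schema, and note that newer versions of the Google Geocoding API also require a key parameter.

<?php
// Minimal sketch of the collect-and-store step. fetchLatitudePosition() is a
// hypothetical stand-in for the Latitude query; table/column names are
// illustrative.
function fetchLatitudePosition()
{
    // Placeholder: the real script pulls this from the phone via Latitude.
    return array('lat' => 51.5074, 'lng' => -0.1278, 'accuracy' => 40);
}

$pos = fetchLatitudePosition();

// Reverse-geocode via the Google Maps Geocoding API to get the best postal
// address for the fix (current versions of the API require an API key).
$url = sprintf(
    'https://maps.googleapis.com/maps/api/geocode/json?latlng=%F,%F',
    $pos['lat'],
    $pos['lng']
);
$json    = json_decode(file_get_contents($url), true);
$address = isset($json['results'][0]['formatted_address'])
    ? $json['results'][0]['formatted_address']
    : null;

// Store the fix: position, address, accuracy estimate and timestamp.
$db   = new PDO('mysql:host=localhost;dbname=location', 'user', 'password');
$stmt = $db->prepare(
    'INSERT INTO positions (latitude, longitude, address, accuracy, recorded_at)
     VALUES (?, ?, ?, ?, NOW())'
);
$stmt->execute(array($pos['lat'], $pos['lng'], $address, $pos['accuracy']));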

For example, on my location page I now calculate the time I have been somewhere and also check my current location against a database of places I frequent on a regular basis and stay at for quite a while when I get to them. There are 9 entries in it; a small sample: my house, my girlfriend’s house, a couple of Starbucks that I go to regularly, and where I work. If it calculates I am within a permitted range of any place in that database table (each entry has a specific permitted range) and I’ve been there for more than a few minutes, it’ll “check me in” to that place and display precisely where I am. Once I begin moving again, it’ll check me out and begin the usual ‘roaming’ display once more.
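The "am I at one of my places?" test is just a distance calculation against that table. A minimal sketch of the idea (haversine great-circle distance; the table and column names are illustrative):

<?php
// Illustrative sketch of the place-matching check. Table/column names are
// placeholders; the distance function is the standard haversine formula.
function distanceInMetres($lat1, $lng1, $lat2, $lng2)
{
    $earthRadius = 6371000; // mean Earth radius in metres
    $dLat = deg2rad($lat2 - $lat1);
    $dLng = deg2rad($lng2 - $lng1);

    $a = pow(sin($dLat / 2), 2)
       + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * pow(sin($dLng / 2), 2);

    return $earthRadius * 2 * atan2(sqrt($a), sqrt(1 - $a));
}

// Example fix: in reality this would be the latest row stored by the collector.
$currentLat = 51.5074;
$currentLng = -0.1278;

$db     = new PDO('mysql:host=localhost;dbname=location', 'user', 'password');
$places = $db->query('SELECT name, latitude, longitude, permitted_range FROM places');

foreach ($places as $place) {
    $metres = distanceInMetres($currentLat, $currentLng,
                               $place['latitude'], $place['longitude']);

    // Each place carries its own permitted range; the "been here more than a
    // few minutes" rule is applied on top of this.
    if ($metres <= $place['permitted_range']) {
        printf("Checked in at %s (%.0fm away)\n", $place['name'], $metres);
        break;
    }
}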

If I ever get asked “To eliminate you from our murder enquiry, where were you at 5.33pm on the 2nd April 2011?” I can honestly say IKEA, Lakeside Retail Park, W Thurrock Way, Thurrock RM16 6, UK!

Some may question the logic of doing this – surely it’s an invasion of my personal life? That may be so, but given any of my friends could ring me and say “Where are you?”, what’s the difference?

 

Mar 07 2011
 

Anyone who has a cellular/mobile contract with Three / 3 / 3UK will likely have stumbled across their ‘porn’ block at some stage. A good goal in theory, except it doesn’t just block porn. I’ve had some really random websites blocked because of it. There is a workaround you can use (the specifics depend on your handset, so I’ll leave you to figure those out), but the gist is quite simple:

Create a new APN and call it ‘3 Routed’ or something. The name is irrelevant.

APN: 3internet
Proxy: <not set>
Port: <not set>
Username: <not set>
Password: <not set>
Server: <not set>
MMSC: http://mms.um.three.co.uk:10021/mmsc
MMS Proxy: mms.three.co.uk
MMS port: 8799
MMS Protocol: WAP 2.0
MCC: 234
MNC: 20
Authentication Type: <not set>
APN Type: *

Save this, and activate it as your active APN. This will temporarily disconnect your data connection and renegotiate a new one. It will give your handset a publicly routed IP address (a minor security issue) and break sending and receiving of multimedia messages (MMS). But it will give you unrestricted, unfiltered internet access. When you’ve finished, simply switch back to your original APN to restore normality for everything else.

Simple!

Feb 28 2011
 
WHS Disk Management - All Storage

 

Windows Home Server is an awesome product. Microsoft have sadly killed it off (pretty much) with version 2 (code-named Vail – or Fail amongst the version 1 users), but I firmly believe Version 1 will rock on for many years to come. Version 1 includes a fantastic and reliable entity called Drive Extender. Essentially for those who don’t know, it allows you to ‘pool’ storage drives into one big virtual pot of disk space without having to worry about drive letters or where a particular network share is. Running out of space? Just throw another drive at it; it’ll figure out the rest!

It also has a “poor man’s RAID” option, whereby certain shared folders can be set up with Replication. This ensures that data in those particular folders is stored on two distinct physical hard drives, so it survives in the event that one of them should fail. Clever stuff.

All my data, photographs, videos, DVD rips, downloads, documents, code… basically everything lives on the home server. This is fantastic from a centralised repository point of view. One place to keep absolutely everything, that I can access remotely. With my 50mbit/5mbit cable connection, I can remotely access everything I need without hassle. But it does leave one minor issue.

Even with folder replication, that still only technically leaves my data in one place. Ok, so if a hard drive fails it’s not a big deal; data replication will keep things ticking until I replace the failed disk. But what if two drives fail? What if I have a burglary and someone steals the server? Fire? Flood? Disaster. So, a backup strategy was needed.

The photo above shows my HP MediaSmart Server in the cupboard it calls home. To the left of that is a 2TB Western Digital MyBook Mirror Edition (1TB RAID-1), on top of the home server itself is a 1.5TB IOMEGA USB drive, and on the shelf above all that is a 3TB Seagate GoFlex USB drive, and an APC UPS. Eeesh. But, the point of which, cometh…

The Home Server itself contains 4 disks (albeit only 3 at the time the external picture was taken). A 1TB system disk that it shipped with, and 3x 2TB disks for expansion. This gives us a real-world capacity of 6.37TB, with currently 3.28TB free. A little room to grow yet!

The Backup Strategy Abstract: All folders on the home server are set to Replicate, which I use not as a means of backup (for the fire/flood/theft reasons!) but purely as a means of keeping my Home Theatre and digital life going should a hard drive fail. The actual backup strategy lives by the “if data doesn’t exist in 3 places, it doesn’t exist” rule. So, data exists once on the home server. It then gets backed up to the external RAID-1 array (giving me a theoretical 4 copies of the same data), but for the “RAID-Isn’t-A-Backup” crowd amongst us, we shall count that as the second. The exception is the DVD rips, which total about 1TB at the moment (before replication), so they get backed up onto the 1.5TB IOMEGA drive that sits on top of the server instead.

The observant will have spotted that while this gives me a theoretical 4 copies, it’s still technically only 2. That’s where the 3TB GoFlex comes in. Every week that drive is brought home from my office, plugged in, and a full backup of everything taken and placed on it. It’s then immediately unplugged, and put in my car for the return trip to the office the next day. The theory being if the house burns down, at least my car might be ok outside!

The Backup Strategy Detail: It doesn’t stop there, though. I use KeepVault to also run real-time backups of my Photographic Archive and User Directories. Windows Home Server backups are fine, but they are not real-time. Example: I once copied a load of data to my user directory, then accidentally deleted it after deleting the original. If it wasn’t for KeepVault keeping a real-time backup copy, I’d have been screwed. But I just clicked Restore, found what I wanted, and it brought it back immediately. KeepVault also copies it off-site into The Cloud for me as another safety measure! So far I have about 30GB of selective stuff synced into The Cloud with KeepVault.

And as a final resort, I use CloudBerry Backup to copy my Photographic Archive share (again!) not just off-site into The Cloud, but onto a different continent (South East Asia, to be precise!) using Amazon S3. It also does the same with very selective parts of my User Directory (mainly critical stuff I cannot afford in any way, shape or form to lose, such as security certificates for remote server logins, source code, that sort of thing). So my most critical data is actually protected 8 or 9 times, and all mostly automatically. Even if you discount the replication and external RAID backups that WHS does and KeepVault writes to, everything is in at least 3 places, and the most critical between 4 and 5. Most importantly, everything is kept in at least one off-site backup – even the DVD rips – and the most important stuff is backed up not only locally in real time and on a physical off-site drive I own, but into two separate Clouds on different continents as well.

Paranoid? When it comes to my data, and the memories in my photographs… you bet I’m paranoid. You can’t afford not to be!

 

Feb 28 2011
 

I used to have a main “me” blog, but then diversified into several different blogs for different topics. This worked well to a degree, but I then found I lacked a place for general rants, raves, thoughts, feelings and so forth. Talk about can’t win! So I’ve decided to put a blog on my main website again, as well as keeping up my other blogs at the same time.

Everything always goes full circle in the end!