May 18, 2026
 

Or, How I Built a Floppy Preservation Platform From Scratch

There’s a moment I keep coming back to. I’m holding a 5.25-inch floppy disk from 1985, slightly warped, its label faded to near-illegibility. The disk belonged to a now-defunct magazine, and it contains software written by programmers who may have no idea their work survived at all. In about thirty seconds, my Greaseweazle is going to read it at the flux level, capturing not just the data but the magnetic signature of the original write head, preserved with a fidelity that would have seemed like science fiction to the person who formatted it 41 years ago.

That moment is why I built this.

The Problem With “Just Copy the Files”

When most people think about digital preservation, they imagine it’s straightforward. Files are files. Copy them. Done.

It isn’t like that.

Floppy disks from the 1980s and early 90s were written by a dozen different DOS versions, on hardware with varying geometry, using formatting conventions that were often undocumented, non-standard, or deliberately obfuscated. The IBM PC disk format evolved rapidly between 1981 and 1993 – from 160KB single-sided disks with no BIOS Parameter Block at all, through the chaotic proliferation of 180KB, 320KB, 360KB, 720KB, and 1.2MB formats, to the eventual standardisation around 1.44MB. Every step of that evolution left behind disks that modern tools handle poorly, or increasingly, not at all. You can’t just plug a 5.25″ floppy disk drive into a modern computer either; the motherboards do not have the headers, there is no USB version of such drives, and most modern operating systems have no idea what to do with such devices.

The open-source Greaseweazle project (keirf/greaseweazle on GitHub: tools for accessing a floppy drive at the raw flux level) gets round this problem, given sufficient technical know-how. But even with the right hardware, software, tenacity and preparation, recoveries can be slow.

My first serious problem was a 160KB disk – an IBM DOS 1.0 single-sided format from 1985. DOS 1.0 predates the BPB entirely. The boot sector BPB area doesn’t exist; it’s either zeroed or, in many cases, contains raw bootstrap code, because the formatter never needed to write geometry data there. DOS didn’t read it. The BIOS told DOS the geometry. That was enough.

Modern Linux does not get the geometry from a BIOS. It reads the BPB, finds garbage, and refuses to mount.

I spent a week on this. The solution involved mtools with explicit geometry configuration, a content-addressed scan cache, and eventually a BPB-patching function that writes a correct 25-byte geometry block into a temporary copy of the image before mounting – leaving the original completely untouched.
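
As a rough illustration of the BPB-patching step, here is a minimal Python sketch (the tool itself is PHP, and this is not its code). It assumes the 25-byte block is the DOS 3.31-style BPB at offset 11 and uses the standard 160KB geometry; the names patch_bpb and BPB_160K are made up for the example. Because the patch overwrites whatever bootstrap code lived in that area, it only makes sense on a throwaway copy used for mounting.

```python
import struct

BPB_OFFSET = 11               # BPB follows the 3-byte jump and 8-byte OEM label
BPB_FORMAT = "<HBHBHHBHHHII"  # little-endian, unpadded: 25 bytes in total

# Standard geometry for a 160KB single-sided disk (40 tracks x 1 head x 8 sectors).
BPB_160K = (
    512,   # bytes per sector
    1,     # sectors per cluster
    1,     # reserved sectors (just the boot sector)
    2,     # number of FATs
    64,    # root directory entries
    320,   # total sectors (16-bit field)
    0xFE,  # media descriptor for 160KB
    1,     # sectors per FAT
    8,     # sectors per track
    1,     # heads
    0,     # hidden sectors
    0,     # total sectors (32-bit field, unused at this size)
)


def patch_bpb(image: bytes, fields=BPB_160K) -> bytes:
    """Return a patched copy of the image; the input bytes are never modified."""
    bpb = struct.pack(BPB_FORMAT, *fields)
    patched = bytearray(image)  # work on a copy
    patched[BPB_OFFSET:BPB_OFFSET + len(bpb)] = bpb
    return bytes(patched)
```

In practice the patched bytes would be written to a temporary file and that file handed to the mount step, with the original image left on disk as it was read.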

The Stoned Virus Problem

Several of the cover disks I’ve been archiving are infected with boot sector viruses. Stoned is the most common – a 1987 New Zealand virus that spread primarily through floppy sharing, and that infected an absolutely enormous number of disks before it was widely understood. Finding it on a magazine cover disk isn’t surprising. Finding it on a disk that was then sent to thousands of subscribers and read on machines that went on to infect office networks is a small window into how the late 80s computing ecosystem actually worked.

The interesting preservation question is: what do you do with it?

My answer: keep the original, warts and all. Archive it faithfully. Document the infection. But also provide a clean variant for people who just want to use the software.

The tool now handles this automatically. ClamAV scans happen in two passes – first against the raw .img file (the only way to catch boot sector viruses, which live in the first 512 bytes and never appear as files on the mounted filesystem), then against the mounted filesystem itself. If the original is infected and a .clean.img exists, BPB patching is applied to the clean image, not the infected one. The variants system – original, clean, patched, recovered – means every state of a disk’s history is preserved and documented.
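
The two-pass scan and the choice of which variant to patch can be sketched roughly as follows. This is an illustrative Python outline rather than the tool’s own PHP; the function names and the `name.clean.img` naming pattern are assumptions for the example. The clamscan options used (--no-summary, --infected, --recursive) and its exit codes (0 clean, 1 virus found, 2 error) are standard.

```python
from pathlib import Path


def scan_commands(image: Path, mount_point: Path) -> list[list[str]]:
    """Build the two clamscan invocations: raw image first, then mounted files."""
    return [
        # Pass 1: the raw .img, which is where a boot sector virus would be seen
        ["clamscan", "--no-summary", "--infected", str(image)],
        # Pass 2: the files on the mounted filesystem
        ["clamscan", "--no-summary", "--infected", "--recursive", str(mount_point)],
    ]


def is_infected(exit_code: int) -> bool:
    """Interpret a clamscan exit code: 0 is clean, 1 is infected, 2 is an error."""
    if exit_code not in (0, 1):
        raise RuntimeError(f"clamscan failed with exit code {exit_code}")
    return exit_code == 1


def patch_target(original: Path, original_infected: bool) -> Path:
    """Pick the image that BPB patching should be applied to."""
    clean = original.with_name(original.stem + ".clean.img")
    if original_infected and clean.exists():
        return clean
    return original
```

Each command list can be passed to subprocess.run, with the returncode fed to is_infected.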

The infected original is still there, still mountable via NetDrive if you want to study it. It just comes with a red warning badge and a button you have to click to acknowledge the risks before you get the connect command.

Archaeology at the Boot Sector Level

The most technically interesting disk I’ve encountered so far had a boot sector that was valid bootstrap code and, at the same time, appeared to describe a corrupt filesystem. The JMP SHORT 0x3E instruction at offset zero – a two-byte jump that skips over the entire BPB area – is deliberate. The publisher wanted a custom boot experience: insert the disk, power on, and instead of Non-system disk or disk error, you’d get something. A menu, a splash screen, a welcome message. There wasn’t space for that in the standard layout, where a three-byte jump and an eight-byte OEM label are followed immediately by the BPB. So they used the BPB fields – offsets 11 through 61 – as overflow for executable code, and got 51 extra bytes.
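
The arithmetic behind those 51 bytes can be checked with a few lines of Python. This sketch assumes the jump is encoded as the bytes EB 3C, which is what a short jump at offset zero landing on 0x3E implies; short_jump_target is a name invented for the example.

```python
def short_jump_target(boot_sector: bytes) -> int:
    """Decode a JMP SHORT (opcode 0xEB) at offset 0 and return its target offset."""
    if boot_sector[0] != 0xEB:
        raise ValueError("boot sector does not start with a short jump")
    rel8 = boot_sector[1]
    if rel8 >= 0x80:
        rel8 -= 0x100  # the displacement is a signed byte
    return 2 + rel8    # relative to the byte after the 2-byte instruction


target = short_jump_target(bytes([0xEB, 0x3C]))
bpb_start = 3 + 8                    # jump slot plus OEM label
overflow_bytes = target - bpb_start  # bytes between the OEM label and the jump target

print(hex(target), overflow_bytes)
```

The jump lands on offset 0x3E (62), so everything from offset 11 up to and including offset 61 is skipped by the CPU and free to hold code.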

On real hardware, this worked perfectly. DOS never reads the BPB during normal file access. The BIOS knows the geometry. The disk just works.

It also functioned as copy protection. Duplicate a disk and your duplication tool reads the BPB to determine geometry. It gets: 147 bytes per sector, 240 sectors per cluster, 55,438 sectors per FAT. Whatever it produces next is not a working copy.

The bytes that should contain geometry – 0x93, 0x00, 0xF0, 0x1E, 0x50... – are MOV instructions and jump offsets. When I decoded them I found a fragment of the IBM 3.3 bootstrap string interspersed with 8086 opcodes. The formatter had been creative.
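
To see how a duplication tool arrives at those numbers, here is a small Python sketch that decodes only the five bytes quoted above as if they were the first BPB fields (bytes per sector, sectors per cluster, reserved sectors). The plausibility check is an assumption about what a sanity test might look like, not a description of any particular tool.

```python
import struct

# Offsets 11..15 of the boot sector, as quoted in the text.
SEEN_BYTES = bytes([0x93, 0x00, 0xF0, 0x1E, 0x50])

# First three BPB fields: 16-bit, 8-bit, 16-bit, all little-endian.
bytes_per_sector, sectors_per_cluster, reserved_sectors = struct.unpack("<HBH", SEEN_BYTES)


def plausible(bps: int, spc: int) -> bool:
    """Rough FAT sanity check: sector size and cluster size must be sensible powers of two."""
    return bps in (128, 256, 512, 1024, 2048, 4096) and spc in (1, 2, 4, 8, 16, 32, 64, 128)


print(bytes_per_sector, sectors_per_cluster, reserved_sectors)
print(plausible(bytes_per_sector, sectors_per_cluster))
```

The first two fields decode to 147 and 240, matching the figures above, and neither passes the check, which is the cue to stop trusting the BPB and fall back to geometry derived some other way.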

The Tool

What started as a PHP script that mounted disk images and printed a directory listing has become something considerably more substantial. The current version:

  • Recursively discovers .img files and groups them by base slug, automatically detecting variant types (.clean, .patched, .recovered, .cracked, and several others)
  • Detects filesystem type from the raw boot sector bytes – FAT12/16/32, early DOS formats, CP/M, NTFS, ext2/3/4 – without mounting
  • Extracts disk geometry from image size, using a lookup table covering every standard floppy format from 160KB (1981) to 2.88MB (1991)
  • Mounts via Linux loopback, falling back to mtools with explicit geometry, falling back to BPB patching on a temp copy, with a content-based sanity check at each stage to catch silent failures
  • Runs ClamAV in two passes, including raw image scanning for boot sector infections
  • Extracts readable text files – documentation, README files, source code in BASIC, C, Pascal, Assembly – and publishes them as individually-crawlable HTML pages, so a search engine can index a 1987 copyright notice or an author’s name embedded in a REM statement
  • Generates per-disk HTML reports in a retro green-screen CRT aesthetic, with directory trees, full SHA-256 file manifests, archivist’s notes, photo galleries, and a live mTCP NetDrive connect command
  • Publishes a cross-archive file search, a sitemap.xml, a robots.txt, and a static all-disks.html specifically for search engine crawling
  • Caches everything by image SHA-256, so unchanged disks cost nothing on subsequent runs
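
Two of the items above, the size-based geometry lookup and the SHA-256 cache key, are simple enough to sketch. This is an illustrative Python version, not the tool’s PHP; the table holds the standard PC floppy sizes with 512-byte sectors, and geometry_for and cache_key are names chosen for the example.

```python
import hashlib

SECTOR_SIZE = 512

# Image size in bytes -> (cylinders, heads, sectors per track)
GEOMETRY_BY_SIZE = {
    163840: (40, 1, 8),    # 160KB
    184320: (40, 1, 9),    # 180KB
    327680: (40, 2, 8),    # 320KB
    368640: (40, 2, 9),    # 360KB
    737280: (80, 2, 9),    # 720KB
    1228800: (80, 2, 15),  # 1.2MB
    1474560: (80, 2, 18),  # 1.44MB
    2949120: (80, 2, 36),  # 2.88MB
}


def geometry_for(image_size: int):
    """Return (cylinders, heads, sectors) for a standard image size, else None."""
    return GEOMETRY_BY_SIZE.get(image_size)


def cache_key(image: bytes) -> str:
    """Content-addressed key: identical images always map to the same cache entry."""
    return hashlib.sha256(image).hexdigest()
```

Keying the cache on content rather than filename means a renamed or moved image still hits the cache, while a re-read disk with even one changed byte is scanned afresh.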

The Point

The software on these disks isn’t historically significant the way a Gutenberg Bible is significant. Most of it is small utilities, games, productivity tools, and programming experiments – written by hobbyists and professionals who were, in many cases, making things for the love of it.

That’s precisely why it matters.

The commercial software of the 1980s is relatively well-preserved. Companies had catalogues, revenues, lawyers, and reasons to maintain archives. The cover disk software – the stuff that came shrink-wrapped to a magazine, distributed to tens of thousands of subscribers and then largely forgotten – has no institutional custodian. Nobody owns the rights in any meaningful active sense. The authors have often lost their own copies.

But some of them are still out there. And some of them have children, and grandchildren, who might one day search the internet for their name and find a piece of code they wrote in 1987, preserved in a running state, mountable on a DOS machine via a TCP/IP protocol that didn’t exist when the disk was formatted.

That’s not a small thing.

The archive is live at dl.x86.world (https://dl.x86.world). Come and have a rummage…

Oct 12, 2014
 

PCSensor Temper1F

Article updated 28/10/2014 thanks to Pete Chapman releasing even better software. Read on! 

One of my long-term goals is to pop one of my Raspberry Pis up in the loft. I then plan to move my ADS-B aircraft monitoring from the ‘Mancave’ into the highest point of the house; as I live way up a hill, hopefully this will significantly improve my reception. 

I also wanted to pop a temperature sensor on the mains water pipe feeding the cold water storage tank. My loft is very well insulated from the house (good for us), but also quite exposed. In a very cold winter this could result in the ambient temperature up there dropping to or below freezing. Frozen pipes, cracked tanks … not funny. 

I ordered the above item from Amazon (search for PCSensor Temper1F). There are many types it seems but this is the one I’m writing about today. From what I can gather, they all function mostly the same and the instructions below should work regardless of which one you get. For the price (~£10-15) they are reportedly very accurate.

Throw away the driver CD

Firstly, throw away the driver CD that comes with it. Yes, it has Linux software on it, but it’s buggy. For my purposes the worst bug is that when the temperature drops below 0C the reading overflows and reports 248C. Not that helpful when you wish to report on too cold rather than too hot. 
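
A reading of 248C is what you would expect if a negative value is being treated as unsigned. The Python sketch below shows the effect under an assumption I have not verified against the CD’s source: that the sensor returns a 16-bit two’s-complement value in units of 1/256 of a degree, high byte first. Both function names are made up for the example.

```python
def decode_signed(high: int, low: int) -> float:
    """Treat the two raw bytes as a signed 16-bit value in 1/256 degree steps."""
    raw = (high << 8) | low
    if raw >= 0x8000:
        raw -= 0x10000  # two's complement: values above 0x7FFF are negative
    return raw / 256.0


def decode_unsigned(high: int, low: int) -> float:
    """The buggy reading: the high byte is taken as an unsigned number of degrees."""
    return high + low / 256.0


# Under this assumption, -8C is encoded as 0xF8 0x00.
print(decode_signed(0xF8, 0x00))
print(decode_unsigned(0xF8, 0x00))
```

With those bytes the signed decode gives -8.0 and the unsigned one gives 248.0, which would fit the symptom if the loft was at about -8C when the bad reading appeared.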

Plug it in

Remove the USB stick from the packaging and plug the temperature probe into the rear. Then use a USB extension lead to plug it in to your Pi (not mandatory, but when I use the Raspberry Pi I dislike touching the actual device and extension leads make it much easier to not disturb the device too much). If it’s the same model as mine, a little red LED should light up. 

$ dmesg | tail

[  623.621245] usb 1-1.2: new low-speed USB device number 7 using dwc_otg
[  623.735966] usb 1-1.2: New USB device found, idVendor=0c45, idProduct=7401
[  623.736004] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[  623.736020] usb 1-1.2: Product: TEMPer1F_V1.3
[  623.736034] usb 1-1.2: Manufacturer: RDing
[  623.757428] input: RDing TEMPer1F_V1.3 as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.0/input/input2
[  623.760094] hid-generic 0003:0C45:7401.0004: input,hidraw1: USB HID v1.10 Keyboard [RDing TEMPer1F_V1.3] on usb-bcm2708_usb-1.2/input0
[  623.777723] hid-generic 0003:0C45:7401.0005: hiddev0,hidraw2: USB HID v1.10 Device [RDing TEMPer1F_V1.3] on usb-bcm2708_usb-1.2/input1

$ lsusb

Bus 001 Device 007: ID 0c45:7401 Microdia

Driver Installation

As I said above the existing software that comes with it is broken. Peter Vojtek has released a fixed version of the original code on GitHub (local mirror). Pete Chapman has released an even-more fixed version of the original software, as a fork of Peter Vojtek’s code on GitHub (local mirror). While Vojtek fixed the negative temperature issue, the sub-integer (decimal) temperature values were not accurate. Pete Chapman’s code fixes that little issue as well and makes the probe even more accurate. Thanks Pete!

 Go grab it, and follow me. The instructions are a bit sparse with the code but it’s not at all difficult. 

  1. $ sudo apt-get install build-essential libusb-dev
  2. $ unzip master.zip (or whatever the driver zip file is)
  3. $ cd usb-thermometer-master
  4. $ make
  5. $ sudo make rules-install
  6. Unplug and Re-Plug the Thermometer
  7. $ ./pcsensor

If all went according to plan and you haven’t had any errors or you’ve resolved them on the way, you should get back:

2014/10/12 13:03:35 Temperature 69.95F 21.08C

You can then use awk, or any shell script or scripting language of your choice (python example) to extract that data and make use of it. Copy the executable to /usr/local/bin when you are happy it works. Anyone on the system can execute it; it requires no special permissions thanks to the 666 udev rules.
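
For instance, a minimal Python sketch for pulling the timestamp and Celsius value out of a line in the format shown above might look like this. The function names and the 2C warning threshold are arbitrary choices for the example, and the pattern assumes the output format does not vary from the sample.

```python
import re
from datetime import datetime

# Matches lines like: 2014/10/12 13:03:35 Temperature 69.95F 21.08C
LINE_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) Temperature "
    r"(-?\d+(?:\.\d+)?)F (-?\d+(?:\.\d+)?)C$"
)


def parse_reading(line: str):
    """Return (timestamp, degrees Celsius) from one line of pcsensor output."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised pcsensor output: {line!r}")
    when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
    return when, float(match.group(3))


def is_freezing_risk(celsius: float, threshold: float = 2.0) -> bool:
    """Flag readings at or below the threshold so an alert can go out before 0C."""
    return celsius <= threshold
```

The line itself can come from subprocess.run(["pcsensor"], capture_output=True, text=True).stdout once the binary is in /usr/local/bin.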

Have fun! 

When you’ve got a decent amount of temperature data (the sources differ, but I put them in a uniform format in a MySQL database), you can have some fun with the data as well as build more serious alerting, like the frozen-pipe warning described above. 

Click the graph to go play with the live version. 🙂

 

Lee

 

Feb 28, 2011
 

I used to have a main “me” blog, but then diversified into several different blogs for different topics. This worked well to a degree, but I then found I lacked a place for general rants, raves, thoughts, feelings and so forth. Talk about can’t win! So I’ve decided to put a blog on my main website again, as well as keeping up my other blogs at the same time.

Everything always goes full circle in the end!