Most Frequent Applesoft BASIC Tokens

Deduping the 35,117 Applesoft BASIC files using an MD5 hash, we get down to 25,464 files. What are the most frequent tokens in those files? How many files does each token appear in? And on how many disks?
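
As a rough illustration of the deduplication step, here is a minimal sketch of hashing file contents with MD5 and keeping one file per digest. The function name `dedupe_by_md5` and the `(name, data)` input shape are assumptions made for this example, not the code actually used for the corpus.

```python
import hashlib

def dedupe_by_md5(files):
    """Keep the first file seen for each distinct MD5 digest.

    `files` is assumed to be an iterable of (name, data) pairs,
    where data is the raw bytes of one Applesoft BASIC file.
    """
    unique = {}
    for name, data in files:
        digest = hashlib.md5(data).hexdigest()
        # setdefault leaves the earlier entry alone if the digest was already seen
        unique.setdefault(digest, (name, data))
    return list(unique.values())
```

Files with identical bytes collapse to a single entry regardless of their names, which is how 35,117 files can reduce to 25,464.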

Applesoft BASIC Files in the Apple II Disk Corpus

Of the 6,998 “normal” disks in the Apple II Disk Corpus, 5,870 contain at least one Applesoft BASIC file. As mentioned in a previous post, that amounts to 35,117 Applesoft BASIC files in this subcorpus.

How the CATALOG Differs if a 4am Crack or Not

In the previous post I speculated that the counts of file names in the corpus might be skewed due to the number of disks cracked by 4am. In this post I separate them out.

What is in the CATALOG of Each Disk

I decided to extract the file metadata from disks with a standard layout.

Where the Variations in the DOS 3.3 Boot Sector Happen

Following on from the previous post, I started wondering where the different DOS-3.3-style boot sector types diverge, so I constructed the following visualization.

Boot Sector Types in the Apple II Disk Corpus

We previously looked at all sectors across the Apple II Disk Corpus but what about variation in the boot sector? In other words, track 0 sector 0.

Why So Many Sectors Start with 00 11 0X 00

As we saw in the previous post, ten of the top 20 sector types start with 00 11 0X 00 and continue with all zeros. I had a suspicion about why this might be the case, and I’ve now confirmed the reason.

Sector Types in the Apple II Disk Corpus

A regular Apple II floppy disk has 560 sectors of 256 bytes. Across the 12,450 disks in the current corpus, that’s 6,972,000 sectors. How many unique sector types are there and what are the most common?
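
The arithmetic and the counting can be sketched as follows. This is a minimal example that assumes each disk image is available as a `bytes` object of 560 × 256 bytes; the names `sectors` and `count_sector_types` are hypothetical and not taken from the project.

```python
from collections import Counter

SECTOR_SIZE = 256
SECTORS_PER_DISK = 560  # 35 tracks x 16 sectors

def sectors(image):
    """Yield each 256-byte sector of a disk image in file order."""
    for offset in range(0, len(image), SECTOR_SIZE):
        yield image[offset:offset + SECTOR_SIZE]

def count_sector_types(images):
    """Count how often each distinct 256-byte sector content occurs."""
    counts = Counter()
    for image in images:
        counts.update(sectors(image))
    return counts

# 12,450 disks x 560 sectors per disk
total_sectors = 12_450 * SECTORS_PER_DISK
```

Calling `count_sector_types(images).most_common(20)` would then list the 20 most frequent sector types along with their counts.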

Byte Value Rank Differences For A Specific Disk

Having ranked all the byte values across 12,450 Apple II disk images, how might a specific disk differ?

Ranking Byte Values in the Apple II Disk Corpus

Which are the most common byte values across the 12,450 Apple II disk images?
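
One way to compute such a ranking is to tally every byte across all images and sort by frequency. The sketch below assumes the images are already loaded as `bytes` objects; `rank_byte_values` is a name made up for this example.

```python
from collections import Counter

def rank_byte_values(images):
    """Return byte values (0-255) ordered from most to least frequent."""
    counts = Counter()
    for image in images:
        # iterating over bytes yields integers in the range 0..255
        counts.update(image)
    return [value for value, _count in counts.most_common()]
```

Byte values that never occur are simply absent from the result, so a full 256-entry ranking would need them appended separately.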

Gathering Apple II Disk Images

I want to get a lot of 6502 machine code, BASIC source, etc. so I’ve gathered a few thousand Apple II disk images.

Kicking off the Project

Okay, I’ve started a repo (and even came up with a Tolkien-inspired acronym)! If you’re interested, star / watch the repo. I’ll start some discussions there too.