I decided to extract the file metadata from disks with a standard layout.

As mentioned in an earlier post, of the 12,450 disks in the initial Apple II Disk Corpus, 7,092 appear to have a regular DOS 3.3 VTOC (based on the values at offsets $03, $27, $34, $35, $36 and $37). Of those, 6,998 have a CATALOG starting on Track $11 Sector $0F, and those are the disks I decided to explore for this post.
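For anyone wanting to follow along, here is a minimal Python sketch of the VTOC check and catalog walk. It is not the code I ran, and it rests on several assumptions: the image is a 143,360-byte DOS-ordered .dsk file, the "regular" values at the six offsets are the stock DOS 3.3 ones (release 3, 122 track/sector pairs per list sector, 35 tracks, 16 sectors, 256 bytes per sector), and unused or deleted catalog slots are skipped. The function names `read_sector`, `has_regular_vtoc` and `catalog_entries` are made up for illustration.

```python
SECTOR_SIZE = 256
SECTORS_PER_TRACK = 16
VTOC_TRACK = 0x11  # the VTOC lives at Track $11 Sector $00


def read_sector(image: bytes, track: int, sector: int) -> bytes:
    """Return one sector, assuming a DOS-ordered image (an assumption)."""
    offset = (track * SECTORS_PER_TRACK + sector) * SECTOR_SIZE
    return image[offset:offset + SECTOR_SIZE]


def has_regular_vtoc(image: bytes) -> bool:
    """Check offsets $03, $27, $34, $35, $36, $37 against stock DOS 3.3 values."""
    vtoc = read_sector(image, VTOC_TRACK, 0)
    return (
        len(vtoc) == SECTOR_SIZE
        and vtoc[0x03] == 3        # DOS release number
        and vtoc[0x27] == 122      # track/sector pairs per T/S list sector
        and vtoc[0x34] == 35       # tracks per disk
        and vtoc[0x35] == 16       # sectors per track
        and vtoc[0x36] == 0x00     # bytes per sector, low byte
        and vtoc[0x37] == 0x01     # bytes per sector, high byte ($0100 = 256)
    )


def catalog_entries(image: bytes):
    """Yield one dict per in-use catalog entry, following the sector chain."""
    vtoc = read_sector(image, VTOC_TRACK, 0)
    track, sector = vtoc[0x01], vtoc[0x02]  # first catalog sector
    seen = set()                            # guard against looping chains
    while track != 0 and (track, sector) not in seen:
        seen.add((track, sector))
        cat = read_sector(image, track, sector)
        if len(cat) < SECTOR_SIZE:
            break                           # link points outside the image
        for i in range(7):                  # 7 entries of $23 bytes from $0B
            entry = cat[0x0B + i * 0x23:0x0B + (i + 1) * 0x23]
            if entry[0] in (0x00, 0xFF):    # unused or deleted slot
                continue
            yield {
                "track": entry[0x00],       # first T/S list sector
                "sector": entry[0x01],
                "type": entry[0x02],        # type code plus lock bit
                "name": bytes(entry[0x03:0x21]),  # 30 raw name bytes
                "sectors": int.from_bytes(entry[0x21:0x23], "little"),
            }
        track, sector = cat[0x01], cat[0x02]  # link to next catalog sector
```

A disk counts toward the 6,998 in this sketch when `has_regular_vtoc` is true and the VTOC bytes at $01 and $02 are $11 and $0F.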

Across those 6,998 disks, the catalogs list a total of 150,558 files.

The most common file types are:

$04  BINARY           82,357
$02  APPLESOFT BASIC  35,117
$00  TEXT             26,931
$01  INTEGER BASIC     3,138
$08  S type              989
$40  B type              866
$10  RELOCATABLE         303
$20  A type              172
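The type codes above come from the type byte of each catalog entry. A rough sketch of how that byte can be decoded and tallied follows. It assumes, as DOS 3.3 does, that bit 7 of the byte is the "locked" flag and the low seven bits are the type, so locked and unlocked files count under the same code. `decode_type` and `tally_types` are illustrative names, not anything from my actual scripts.

```python
from collections import Counter

# DOS 3.3 file type codes (low 7 bits of the catalog entry's type byte)
FILE_TYPES = {
    0x00: "TEXT",
    0x01: "INTEGER BASIC",
    0x02: "APPLESOFT BASIC",
    0x04: "BINARY",
    0x08: "S type",
    0x10: "RELOCATABLE",
    0x20: "A type",
    0x40: "B type",
}


def decode_type(type_byte: int):
    """Split a type byte into (code, label, locked)."""
    locked = bool(type_byte & 0x80)   # bit 7 marks a locked file
    code = type_byte & 0x7F           # remaining bits are the type
    label = FILE_TYPES.get(code, f"unknown ${code:02X}")
    return code, label, locked


def tally_types(type_bytes):
    """Count files per type label, ignoring the lock bit."""
    return Counter(decode_type(b)[1] for b in type_bytes)
```

For example, `decode_type(0x84)` gives `(4, "BINARY", True)`, a locked binary file.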

The most common file names are:

HELLO 4,414
MENU 785
(7 x $08) 610
BOOT0 559
(empty) 527
BOOT1 486
RWTS 485
AUTOTRACE 470
ADVANCED DEMUFFIN 1.5 442
ADVANCED DEMUFFIN 1.5 DOCS 441
PDP 399
SUPER DEMUFFIN 397
LOGO 385
PDP.README 275
APPLESOFT 271
TITLE 242
(6 x $08) 231
CHAIN 195
INTBASIC 191
RUNTIME 157
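File names are stored as 30 bytes, normally high-bit ASCII padded with spaces ($A0), which is why entries like "(7 x $08)" and "(empty)" show up in the list. The sketch below is one way to turn the raw bytes into labels of that kind. The labelling rules here are my assumption for illustration (strip the high bit, drop trailing spaces, and collapse a run of one repeated control character), and `render_name` is a hypothetical helper.

```python
def render_name(raw: bytes) -> str:
    """Turn a raw 30-byte DOS 3.3 file name into a printable label."""
    # Clear the high bit of every byte, then drop the space padding.
    chars = bytes(b & 0x7F for b in raw).rstrip(b" ")
    if not chars:
        return "(empty)"
    # A name made of one repeated control character, e.g. seven backspaces.
    if len(set(chars)) == 1 and chars[0] < 0x20:
        return f"({len(chars)} x ${chars[0]:02X})"
    # Otherwise keep printable characters and spell out anything else.
    return "".join(
        chr(c) if 0x20 <= c < 0x7F else f"<${c:02X}>" for c in chars
    )
```

Control characters such as $08 (backspace) were a common trick for hiding a name in the CATALOG listing, so it is worth showing them explicitly rather than discarding them.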

There is undoubtedly a lot of file-level deduping to do, and this list also shows how common cracked disks are in the corpus. The DEMUFFIN files are, I assume, leftovers from the cracking process (in this case by the cracker “4am”). The PDP files might be leftovers too.