20GB Internet Archive — Charles Raiders

What it is

Everything essential, nothing wasted.

The 20GB Internet Archive is an experimental project to condense the most valuable information from the internet into a single compact file: encyclopedia entries, technical documentation, historical records, scientific papers, and cultural knowledge, all curated and compressed using our custom DITF (Direct Information Text Format).

This isn't a backup of the entire web. It's a distilled archive of what matters most, designed to be accessible offline, portable across any device, and readable through our open-source app.

⚠️ Experimental Status

This is an early trial. The archive is incomplete, the format is evolving, and the reader app is in active development. Expect bugs, missing content, and breaking changes. We're building this in public, and feedback is welcome.

How it works

DITF format + open-source reader.

The archive is stored in a .ditf file, a custom format we designed to maximize information density while keeping the content easy to read through the reader app. Because the file is compressed and structured, you can't open it in a plain text editor. Our open-source reader app is what lets you browse, search, and extract information.

20GB of pure information

No images, no ads, no fluff. Just text, data, and structured knowledge compressed to fit on a single drive.

DITF file format

Custom compression and indexing. Designed for fast search, low storage, and cross-platform compatibility.
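
The DITF layout itself is not specified on this page, so the following is only a minimal sketch of the general idea of "compression plus an index", not the actual format. It assumes a hypothetical container made of a 4-byte index length, a JSON index mapping each title to an offset and length, and individually zlib-compressed text records. The names pack_archive and read_entry are made up for illustration.

```python
import json
import struct
import zlib


def pack_archive(entries):
    """Pack {title: text} into one blob: index length, JSON index, records.

    Hypothetical layout for illustration only; not the real DITF format.
    """
    index = {}
    body = bytearray()
    for title, text in entries.items():
        record = zlib.compress(text.encode("utf-8"), level=9)
        # Remember where this record starts inside the body and how long it is.
        index[title] = (len(body), len(record))
        body += record
    index_bytes = json.dumps(index).encode("utf-8")
    # 4-byte big-endian index length, then the index, then the records.
    return struct.pack(">I", len(index_bytes)) + index_bytes + bytes(body)


def read_entry(blob, title):
    """Decompress a single entry without touching the other records."""
    (index_len,) = struct.unpack_from(">I", blob, 0)
    index = json.loads(blob[4:4 + index_len].decode("utf-8"))
    offset, length = index[title]
    start = 4 + index_len + offset
    return zlib.decompress(blob[start:start + length]).decode("utf-8")


if __name__ == "__main__":
    blob = pack_archive({"Alpha": "first entry " * 50, "Beta": "second entry"})
    print(len(blob), "bytes packed")
    print(read_entry(blob, "Beta"))
```

Compressing each record separately is one way to get fast lookups: the reader only decompresses the entry it needs. The trade-off is a somewhat lower compression ratio than compressing the whole archive as one stream.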

Open-source reader

The reader app is fully open-source on GitHub and targets Windows, macOS, Linux, and mobile devices.

Fast search & indexing

Built-in search lets you query the entire archive in seconds. Find what you need without scrolling through 20GB of text.
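
How the built-in search is implemented is not described here. One common way to search a large text collection without scanning it is an inverted index, sketched below under that assumption. The function names (tokenize, build_index, search) and the sample documents are hypothetical and are not part of the reader app.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map every word to the set of document titles that contain it."""
    index = defaultdict(set)
    for title, text in documents.items():
        for word in tokenize(text):
            index[word].add(title)
    return index


def search(index, query):
    """Return the sorted titles of documents containing all query words."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


if __name__ == "__main__":
    docs = {
        "Python": "Python is a programming language",
        "Rome": "Rome was the capital of an ancient empire",
        "Rust": "Rust is a systems programming language",
    }
    idx = build_index(docs)
    print(search(idx, "programming language"))
```

The index is built once, ahead of time. A query then costs a few set lookups instead of a pass over the full text, which is what makes searching many gigabytes in seconds plausible.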

What's included

A curated selection of the web.

We're focusing on timeless, high-value content across multiple domains:

Wikipedia core articles (history, science, culture)
Technical documentation (programming languages, frameworks, APIs)
Academic papers and scientific research
Historical texts and public domain literature
Open-source project documentation
Educational resources and tutorials

This is not a web scraper dump. Every piece of content is manually reviewed or algorithmically curated for relevance and quality.

Download the Archive

The archive and reader app are currently in development. Check back soon for the first release.

Download 20GB Archive (Coming Soon)
Reader App (GitHub)