byetoken (Beta)

Token compression

The Problem

Run git log in a large repo and you get 50,000 tokens of output. The LLM reads every one. It needs maybe 200 tokens of actual information from that output. Same story with ls -la, find, npm install, cargo build — verbose output that burns through your context window.

This is not just expensive. Long context makes LLMs perform worse. They lose track of what matters in a sea of irrelevant output.

How It Works

byetoken installs as a Bash command hook. When your agent runs a command, byetoken intercepts the output and routes it through a semantic compression handler matched to that command. The handler extracts the information the LLM actually needs and discards the noise.

The LLM never knows it's there. It gets clean, dense output that looks like normal command results — just shorter. No special prompting. No agent modifications. Drop it in and everything gets faster.

Installation

# Clone and build
git clone https://github.com/byelogic/byetoken.git
cd byetoken
cargo build --release

# Install the binary
cp target/release/byetoken /usr/local/bin/

# Register with Claude Code
claude mcp add byetoken --scope user --transport stdio -- byetoken

Compression Examples

git status~2,000 tokens → ~50 tokens

Strips the helpful hints, normalizes paths, collapses unchanged sections. Returns just the files and their states.

npm install~10,000 tokens → ~30 tokens

Extracts: success/failure, packages added, vulnerability count. Drops the progress bars, funding notices, and tree output.

cargo build~5,000 tokens → ~100 tokens

Keeps errors and warnings with file/line info. Drops compilation progress, download output, and “Compiling crate v0.1.0” noise.

Self-Tuning

byetoken watches which parts of compressed output the agent acts on and which it ignores. Over time, it learns your workflow and tunes its compression accordingly. If you work on a Rust project, it gets better at compressing cargo output. If you work across a monorepo with many package managers, it adapts to each one.

Architecture

Built in Rust on ripgrep's regex and glob matching internals. Compression handlers are pattern-matched rules, not LLM calls — no additional API costs, no latency. Sub-millisecond overhead per command.

Handlers are composable: a base handler for generic output, overlaid with command-specific handlers for git, npm, cargo, make, docker, and more. Adding a new handler is a single Rust file.