Base85 (Ascii85) Encode & Decode

What is Base85 (Ascii85)?

Base85, often referred to as Ascii85, is a binary-to-text encoding scheme that represents binary data with 85 printable ASCII characters, '!' through 'u', plus an extra 'z' shorthand for a group of four zero bytes. It is more space-efficient than Base64: every 4 bytes become 5 characters (about 25% overhead), whereas Base64 turns every 3 bytes into 4 characters (about 33% overhead).
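The size figures follow from simple arithmetic: five base-85 digits are the fewest that can hold any 32-bit value. The short check below is only an illustration using Python's standard base64 module; the sample data is arbitrary.

```python
import base64

# Five base-85 digits can hold any 32-bit value; four digits cannot.
assert 85 ** 5 >= 2 ** 32   # 4,437,053,125 >= 4,294,967,296
assert 85 ** 4 < 2 ** 32    # 52,200,625 is far too small

data = bytes(range(1, 241))      # 240 arbitrary bytes, no all-zero 4-byte group
a85 = base64.a85encode(data)     # 240 / 4 * 5 = 300 characters
b64 = base64.b64encode(data)     # 240 / 3 * 4 = 320 characters

print(f"Ascii85: {len(a85)} chars, {len(a85) / len(data) - 1:.0%} overhead")
print(f"Base64:  {len(b64)} chars, {len(b64) / len(data) - 1:.0%} overhead")
```

Input that contains runs of zero bytes can come out even shorter in Ascii85 because of the 'z' shorthand.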

How It Works

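Encoding: Input binary data is processed in 4-byte (32-bit) groups. Each group is read as a big-endian 32-bit unsigned integer and converted to five base-85 digits by repeatedly dividing by 85 and taking the remainders; the digits are written most significant first, each as the character with ASCII code digit + 33. A full group of four zero bytes can instead be written as a single 'z' character (zero compression). If the final group has fewer than 4 bytes, it is padded with zero bytes, encoded as usual, and then as many trailing characters are dropped as padding bytes were added, so n leftover bytes yield n + 1 characters. The encoded data is typically wrapped in <~ and ~> delimiters.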
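The encoding steps can be sketched in a few lines of Python. This is a minimal illustration, not the code behind this tool; the function name ascii85_encode and its wrap parameter are made up for the example.

```python
import base64

def ascii85_encode(data: bytes, wrap: bool = True) -> str:
    """Encode bytes as Ascii85 (hypothetical helper, for illustration)."""
    out = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        pad = 4 - len(chunk)                   # 0 for a full group
        value = int.from_bytes(chunk + b"\0" * pad, "big")
        if value == 0 and pad == 0:
            out.append("z")                    # zero compression, full groups only
            continue
        digits = []
        for _ in range(5):
            value, rem = divmod(value, 85)     # least significant digit first
            digits.append(chr(rem + 33))
        digits.reverse()                       # emit most significant digit first
        out.append("".join(digits[:5 - pad]))  # drop one character per padding byte
    body = "".join(out)
    return f"<~{body}~>" if wrap else body

print(ascii85_encode(b"Man "))                 # <~9jqo^~>
# Cross-check against the standard library's Adobe-style encoder.
print(base64.a85encode(b"Man ", adobe=True).decode("ascii"))
```

For the input "Man " the 32-bit value is 1298230816, whose base-85 digits are 24, 73, 80, 78, 61; adding 33 to each gives the characters 9, j, q, o, ^.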

Decoding: The delimiters (<~, ~>) are removed and whitespace is skipped. Each 'z' is expanded back to four zero bytes. The remaining characters are processed in 5-character groups: each character is mapped to its digit value (0-84) by subtracting 33, the five digits are combined into a 32-bit number, and that number is split into the original 4 bytes. A final group shorter than 5 characters is padded with 'u' (the highest digit), decoded, and the same number of trailing bytes is discarded.
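The decoding steps can be sketched the same way. Again this is only an illustration under the rules described above; ascii85_decode and _group_to_bytes are hypothetical names, and the error messages are invented for the example.

```python
def _group_to_bytes(digits):
    """Combine five base-85 digits into 4 big-endian bytes."""
    value = 0
    for d in digits:
        value = value * 85 + d
    if value > 0xFFFFFFFF:
        raise ValueError("group value does not fit in 32 bits")
    return value.to_bytes(4, "big")

def ascii85_decode(text: str) -> bytes:
    """Decode Ascii85 text, with or without <~ ~> delimiters."""
    text = text.strip()
    if text.startswith("<~"):
        text = text[2:]
    if text.endswith("~>"):
        text = text[:-2]
    out = bytearray()
    group = []
    for ch in text:
        if ch.isspace():
            continue                            # whitespace is ignored
        if ch == "z":
            if group:
                raise ValueError("'z' inside a 5-character group")
            out += b"\0\0\0\0"                  # expand zero compression
            continue
        if not "!" <= ch <= "u":
            raise ValueError(f"invalid Ascii85 character: {ch!r}")
        group.append(ord(ch) - 33)              # character -> digit 0..84
        if len(group) == 5:
            out += _group_to_bytes(group)
            group = []
    if group:                                   # short final group
        if len(group) == 1:
            raise ValueError("a final group needs at least 2 characters")
        pad = 5 - len(group)
        out += _group_to_bytes(group + [84] * pad)[:4 - pad]
    return bytes(out)

print(ascii85_decode("<~9jqo^~>"))              # b'Man '
```

Padding the short final group with the highest digit (84, the character 'u') rather than the lowest ensures that truncating the result gives back the original leading bytes.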

  • Uses 85 digit characters, '!' through 'u' (ASCII 33-117), plus the shorthand 'z' (ASCII 122).
  • Case-sensitive.
  • More efficient than Base64 (approx. 25% overhead vs. 33%).
  • Commonly delimited by <~ and ~>.
  • Includes a special 'z' character to represent four consecutive zero bytes (`\0\0\0\0`).
  • Whitespace within the encoded data is generally ignored during decoding.
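Several of these properties can be observed directly with Python's standard base64 module, which is used here purely as a convenient reference implementation:

```python
import base64

# Four zero bytes collapse to a single 'z'.
zeros = base64.a85encode(b"\0\0\0\0")
print(zeros)                                    # b'z'

# adobe=True adds the <~ and ~> delimiters.
wrapped = base64.a85encode(b"Man ", adobe=True)
print(wrapped)                                  # b'<~9jqo^~>'

# Whitespace inside the encoded data is skipped when decoding.
spaced = base64.a85decode(b"<~9j qo\n^~>", adobe=True)
print(spaced)                                   # b'Man '

# The alphabet is case-sensitive: changing case changes the decoded bytes.
changed = base64.a85decode(b"9JQO^")
print(changed != b"Man ")                       # True
```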

Use Cases

Base85 encoding is found in several specific contexts:

  • Adobe PostScript & PDF: Used to embed binary data (like images or fonts) within text-based PostScript and PDF files.
  • ZeroMQ (Z85 variant): A slightly different character set (Z85) is used by the ZeroMQ messaging library for representing binary data like keys in configuration files or code.
  • Version Control Systems: Git uses a form of Base85, with its own character set, for binary patches.
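The variants share the same base-85 arithmetic and differ mainly in which characters stand for the digits 0-84. As a rough illustration, Python's base64.b85encode uses the alphabet that its documentation associates with Git-style binary diffs. The sketch below shows only the alphabet difference, not Git's full binary patch format, and the names A85_ALPHABET, B85_ALPHABET and table are made up for the example.

```python
import base64

# Digit characters for values 0..84 in each variant.
A85_ALPHABET = bytes(range(33, 118))            # '!' through 'u'
B85_ALPHABET = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")

data = b"Man is distinguished"                  # sample input, no zero groups
a85 = base64.a85encode(data)
b85 = base64.b85encode(data)

# Same digits, different characters: a byte-for-byte translation maps
# one encoding onto the other (when no 'z' shorthand is involved).
table = bytes.maketrans(A85_ALPHABET, B85_ALPHABET)
print(a85)
print(b85)
print(a85.translate(table) == b85)              # True
```

Because the alphabets differ, a string produced by one variant generally cannot be decoded by another without this kind of remapping.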

Why Use Base85?

Its main advantage is efficiency compared to Base64 when embedding binary data into text formats.

  • Efficiency: Offers lower data expansion (overhead) than Base64.
  • Printable Characters: Uses only printable ASCII characters, suitable for text streams.

Disadvantages include the use of characters that might need escaping in some contexts (', ", &, <, >) and a slightly more complex implementation than Base64.

How to Use This Tool

  1. Select either "Encode" or "Decode" mode.
  2. Enter the text (UTF-8 for encoding) or Base85/Ascii85 string (for decoding) into the top input field. For decoding, the input can optionally be wrapped in <~ and ~> delimiters.
  3. The result will appear automatically in the bottom output field. Encoded output will be wrapped in <~ and ~>.
  4. Use the swap button to switch the input and output, automatically changing the mode.
  5. Click the copy icon next to the output label to copy the result.
  6. Error messages will appear for invalid input characters, formatting issues, or if the decoded data is not valid UTF-8.
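For readers who want to reproduce the same workflow in a script, the sketch below mirrors the steps above using Python's standard base64 module. It is not this tool's actual code: the function names encode_text and decode_text and the error messages are assumptions made for the example.

```python
import base64

def encode_text(text: str) -> str:
    """UTF-8 text -> Ascii85 wrapped in <~ and ~>."""
    return base64.a85encode(text.encode("utf-8"), adobe=True).decode("ascii")

def decode_text(encoded: str) -> str:
    """Ascii85 (delimiters optional) -> UTF-8 text."""
    body = encoded.strip()
    if body.startswith("<~"):
        body = body[2:]
    if body.endswith("~>"):
        body = body[:-2]
    try:
        raw = base64.a85decode(body)
    except ValueError as exc:                   # bad characters or malformed groups
        raise ValueError(f"invalid Ascii85 input: {exc}") from exc
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("decoded data is not valid UTF-8") from exc

print(encode_text("Man "))                      # <~9jqo^~>
print(decode_text("<~9jqo^~>"))                 # Man
```

Separating the two failure cases makes it possible to report "invalid input" and "not valid UTF-8" as distinct errors, as the tool does.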