Base85 (Ascii85) Encode & Decode
What is Base85 (Ascii85)?
Base85, often referred to as Ascii85, is a binary-to-text encoding scheme that represents binary data using 85 printable ASCII characters ('!' through 'u'). An additional character, 'z', serves as shorthand for a group of four zero bytes. It is more space-efficient than Base64: every 4 bytes of input become 5 characters of output (a 4:5 ratio), whereas Base64 turns every 3 bytes into 4 characters (3:4).
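The choice of 85 characters follows from simple arithmetic, shown here as a small Python sketch (the variable names are illustrative only). It finds the smallest alphabet whose 5-character groups can represent every 32-bit value, then compares the resulting overhead with Base64's.

```python
# Smallest alphabet size whose 5-character groups can hold any 32-bit value.
base = 2
while base ** 5 < 2 ** 32:
    base += 1
print(base)  # 85, since 84**5 < 2**32 <= 85**5

# Overhead = extra output relative to the input size.
base85_overhead = 5 / 4 - 1  # 4 bytes -> 5 characters: 25%
base64_overhead = 4 / 3 - 1  # 3 bytes -> 4 characters: about 33%
print(f"{base85_overhead:.0%} vs {base64_overhead:.0%}")
```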
How It Works
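Encoding: Input binary data is processed in 4-byte (32-bit) groups. Each group is read as a big-endian 32-bit unsigned integer and converted to five base-85 digits by repeatedly dividing by 85 and taking the remainders. The digits are written most significant first, each as the ASCII character with code digit + 33. A full group of four zero bytes may instead be written as a single 'z' character (zero compression). If the final group has fewer than 4 bytes, it is padded with zero bytes, encoded, and then shortened by as many characters as padding bytes were added, so n leftover bytes produce n + 1 characters. The encoded data is typically wrapped in <~ and ~> delimiters.
Decoding: The delimiters (<~, ~>) are removed and whitespace is skipped. Each 'z' character is expanded back to four zero bytes. Other characters are processed in 5-character groups: each character is converted to its digit value (0-84) by subtracting 33, the five digits are combined into the original 32-bit number, and that number is split into 4 bytes, most significant first. A final group shorter than five characters is padded with 'u' (digit value 84), decoded, and then trimmed by as many bytes as padding characters were added.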
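The encoding steps can be sketched in Python as follows. This is a minimal illustration, not this tool's actual implementation; the function name ascii85_encode is hypothetical, and the sketch assumes big-endian groups, an offset of 33 per digit, and the Adobe-style <~ ~> delimiters.

```python
def ascii85_encode(data: bytes) -> str:
    """Encode bytes as Ascii85, wrapped in <~ and ~> (illustrative sketch)."""
    out = []
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        pad = 4 - len(chunk)  # non-zero only for a short final group
        value = int.from_bytes(chunk + b"\x00" * pad, "big")
        if value == 0 and pad == 0:
            out.append("z")  # zero compression applies to full groups only
            continue
        digits = []
        for _ in range(5):
            value, rem = divmod(value, 85)
            digits.append(chr(rem + 33))  # digits 0-84 map to '!'..'u'
        group = "".join(reversed(digits))  # most significant digit first
        out.append(group[:5 - pad])  # drop one character per padding byte
    return "<~" + "".join(out) + "~>"


print(ascii85_encode(b"Man "))  # <~9jqo^~>
```

A short final group is never written as 'z', even when it is all zeros, because the decoder always expands 'z' to exactly four bytes.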
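The decoding steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than this tool's actual code: the function name ascii85_decode and the error messages are hypothetical, and the final short group is padded with 'u' before decoding, which is the usual Ascii85 convention.

```python
def ascii85_decode(text: str) -> bytes:
    """Decode Ascii85 text, with or without <~ ~> delimiters (illustrative sketch)."""
    s = "".join(text.split())  # whitespace is ignored
    if s.startswith("<~"):
        s = s[2:]
    if s.endswith("~>"):
        s = s[:-2]

    out = bytearray()

    def flush(digits, keep):
        # Combine five base-85 digits into one 32-bit value, keep `keep` bytes.
        value = 0
        for d in digits:
            value = value * 85 + d
        if value > 0xFFFFFFFF:
            raise ValueError("group does not fit in 32 bits")
        out.extend(value.to_bytes(4, "big")[:keep])

    group = []
    for ch in s:
        if ch == "z":
            if group:
                raise ValueError("'z' is not allowed inside a group")
            out.extend(b"\x00\x00\x00\x00")
            continue
        if not "!" <= ch <= "u":
            raise ValueError(f"invalid Ascii85 character: {ch!r}")
        group.append(ord(ch) - 33)
        if len(group) == 5:
            flush(group, 4)
            group = []

    if group:
        if len(group) == 1:
            raise ValueError("final group is too short")
        pad = 5 - len(group)
        flush(group + [84] * pad, 4 - pad)  # pad with 'u', then trim bytes
    return bytes(out)


print(ascii85_decode("<~9jqo^~>"))  # b'Man '
```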
- Uses 85 digit characters, '!' through 'u' (ASCII 33-117), plus 'z' (ASCII 122) as a shorthand.
- Case-sensitive.
- More efficient than Base64 (approx. 25% overhead vs. 33%).
- Commonly delimited by <~ and ~>.
- Includes a special 'z' character to represent four consecutive zero bytes (`\0\0\0\0`).
- Whitespace within the encoded data is generally ignored during decoding.
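Several of these properties can be observed with Python's standard base64 module, which provides a85encode and a85decode. The snippet below is only a demonstration of that module; whether this tool behaves identically in every edge case is an assumption.

```python
import base64

# adobe=True adds the <~ ~> delimiters; a full zero group becomes 'z'.
encoded = base64.a85encode(b"\x00\x00\x00\x00Man ", adobe=True)
print(encoded)  # b'<~z9jqo^~>'

# Whitespace inside the encoded data is skipped while decoding.
decoded = base64.a85decode(b"<~z 9jq\no^ ~>", adobe=True)
print(decoded == b"\x00\x00\x00\x00Man ")  # True

# The alphabet is case-sensitive: changing case changes the decoded bytes.
print(base64.a85decode(b"9JQO^") == b"Man ")  # False
```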
Use Cases
Base85 encoding is found in several specific contexts:
- Adobe PostScript & PDF: Used to embed binary data (like images or fonts) within text-based PostScript and PDF files.
- ZeroMQ (Z85 variant): A slightly different character set (Z85) is used by the ZeroMQ messaging library for representing binary data like keys in configuration files or code.
- Version Control Systems: Git uses a form of Base85, with its own character set, for binary patches.
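These variants share the same base-85 arithmetic and differ mainly in which characters stand for the digits 0-84. The Python sketch below illustrates this with the standard library's b85encode, which the Python documentation describes as the encoding used in git-style binary diffs. The two alphabet constants are written out here for illustration, and the comparison assumes input with no all-zero group, since only Ascii85 has the 'z' shorthand.

```python
import base64

# Digit value d (0-84) maps to the character at index d in each alphabet.
A85_ALPHABET = bytes(range(33, 118))  # '!' through 'u'
B85_ALPHABET = (b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                b"abcdefghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~")

data = b"Man "
a85 = base64.a85encode(data)
b85 = base64.b85encode(data)

# Re-labelling the Ascii85 digits with the other alphabet gives the b85 text.
translated = a85.translate(bytes.maketrans(A85_ALPHABET, B85_ALPHABET))
print(translated == b85)  # True
```

A real Git binary patch adds its own line framing on top of this encoding, so the output above is not a complete patch line.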
Why Use Base85?
Its main advantage is efficiency compared to Base64 when embedding binary data into text formats.
- Efficiency: Adds about 25% to the data size, compared with about 33% for Base64.
- Printable Characters: Uses only printable ASCII characters, suitable for text streams.
Disadvantages include the use of characters that may need escaping in some contexts (', ", \, &, <, >) and a slightly more complex implementation than Base64.
How to Use This Tool
- Select either "Encode" or "Decode" mode.
- Enter the text (UTF-8 for encoding) or Base85/Ascii85 string (for decoding) into the top input field. For decoding, the input can optionally be wrapped in <~ and ~> delimiters.
- The result will appear automatically in the bottom output field. Encoded output will be wrapped in <~ and ~>.
- Use the swap button to switch the input and output, automatically changing the mode.
- Click the copy icon next to the output label to copy the result.
- Error messages will appear for invalid input characters, formatting issues, or if the decoded data is not valid UTF-8.
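The text round trip described in these steps can be approximated in Python with the standard base64 module. This is a hedged sketch, not the tool's own code: the function names encode_text and decode_text are hypothetical, and the error behaviour shown is that of Python's library.

```python
import base64


def encode_text(text: str) -> str:
    # Text is converted to UTF-8 bytes first, then wrapped in <~ ~>.
    return base64.a85encode(text.encode("utf-8"), adobe=True).decode("ascii")


def decode_text(encoded: str) -> str:
    body = encoded.strip()
    if body.startswith("<~"):  # delimiters are optional on input
        body = body[2:]
    if body.endswith("~>"):
        body = body[:-2]
    raw = base64.a85decode(body)  # ValueError on invalid characters
    return raw.decode("utf-8")    # UnicodeDecodeError if not valid UTF-8


print(decode_text(encode_text("héllo")))  # héllo
```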