# MeatPack Algorithm

5 minute read Published 29 May 2024

Detailed walkthrough of a compression algorithm for G-code.

This post is the second in a series on binary G-code.

## Content-aware Compressionđź”—

When compressing arbitrary data, you need a lossless compression algorithm like Deflate or LZMA. Both are general purpose and achieve good compression ratios on pretty much everything. Without any prior knowledge about the data, they need significant CPU and memory resources to figure out how to shrink it down. This is usually not an issue for modern commodity hardware.

However, it does matter in the G-code scenario where we are streaming data to a microcontroller for real time processing. Our resources are so limited that using even a few kilobytes of memory is hard to justify. Baking our own content-aware compression is a way to work around this; we know that we are sending G-code and only G-code. Take a look at the frequency distribution of the 16 most common characters in some example G-code files:

Generated with Python using Matplotlib, hist.py

They account for ~95.42% of the characters in a G-code file, so itâ€™s possible to come up with some encoding that takes advantage of this.

## MeatPackđź”—

MeatPack compresses G-code by leveraging these statistics. Common G-code characters are replaced with a shorter 4-bit representation using a static lookup table:

character4-bit code
0-90-9
.10
E11
\n12
G13
X14
otherN/A

Before going into specifics, let me drop some bullets on why this approach is a great fit for our microcontroller scenario:

• Extremely low memory usage: no heap allocation, uses under ~40 bytes of which only 2 are on the stack
• Simple logic: no loops, just conditional logic
• Simple math: no expensive multiplications, just bitwise shifts/ORs

### Compressionđź”—

Characters are encoded in pairs of 2. The encoded representation can be 1 to 3 bytes depending on whether or not they are in the lookup table:

• Both characters have a 4-bit code: packed into a single byte as first_code | (second_code << 4)
• One character doesnâ€™t have a 4-bit code: 15 is used as a signal code to indicate that a character could not be packed and is represented by the next byte
• Neither has a 4-bit code: both characters have a signal code 15, indicating that they are represented by the two bytes that follow

If the input has an odd number of characters, it is padded to an even number with a space so we always have pairs. If the input has lowercase E, G, or X, it is converted to uppercase so it will have a 4-bit code. G-code is case insensitive so this is harmless1.

1

ignoring G-code strings

### Decompressionđź”—

Decompression is quite straightforward. The lookup table is inverted and we check for signal codes as we go:

• Byte has no signal codes: unpack two characters
• Byte has an upper signal code: unpack the first character, the second will come next
• Byte has a lower signal code: unpack the second character, the first will come next
• Byte has both signal codes: discard character, the next two bytes are characters

### Controlđź”—

MeatPack has some command sequences to control the decompression process. A sequence of [255, 255] indicates that the next byte will be a command byte. The defined command bytes are:

ByteCommandExplanation
251Enable packingTurns on MeatPack
250Disable packingTurns off MeatPack
249Reset allResets internal state. Usually sent at the end of MeatPack data
248Query config (?)Unknown, can be ignored
247Enable no space modeReplaces E with   in the lookup table
246Disable no space modeReplaces   with E in the lookup table

MeatPack is disabled by default so there will always be an â€śenable packingâ€ť command sequence before compressed data begins.