
Compaction vs Compression

Face-Off
What is compaction?
Compaction
Data Compaction is a process that reduces the size of messages by sending tiny Codewords that represent patterns in the data, rather than sending the original data itself.
Compression
By comparison, data compression is a process that reduces the file size by re-encoding the file data to use fewer bits of storage than the original file.
Where it Works
Compaction Works Best on Small, Repetitive Data

Compaction Works on Small Messages

Compaction works on individual messages, including very small ones such as IoT and machine data, with messages as small as 8 bytes.

Comparison to Compression

By contrast, compression only works on large groups of messages or on files of at least 1 KB.

But Not Every Kind of Small Message

Compaction works best on repetitive, low entropy IoT or machine data.
How it Works
How Compaction Actually Works

First

AtomBeam uses Machine Learning to examine a small volume of data and determine its patterns at the bit level. We call this “training” the software to find patterns.

Then

Each pattern becomes a Codeword, and all of the Codewords then go into a Codebook. The heavy computation is done in advance, when the Codebook is created, eliminating almost all computational overhead.

Finally...

The Codebook is placed at the source (such as a sensor) and at the destination (such as the cloud). The source and destination can now communicate solely in Codewords.
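
As a rough illustration only, here is a minimal Python sketch of the Codebook workflow described above: find frequent patterns in sample data, assign each a short Codeword, share the resulting Codebook, and encode by lookup. The fixed 8-byte pattern length, the one-byte Codewords, and the helper names are illustrative assumptions, not AtomBeam's actual training method or wire format, and a real encoder would also need a fallback for patterns that are not in the Codebook.

```python
from collections import Counter

PATTERN_LEN = 8  # assumed fixed 8-byte patterns, for simplicity

def train_codebook(samples):
    """'Training': find the most common patterns in sample data and
    assign each one a short Codeword."""
    counts = Counter()
    for msg in samples:
        for i in range(0, len(msg) - PATTERN_LEN + 1, PATTERN_LEN):
            counts[msg[i:i + PATTERN_LEN]] += 1
    # Up to 256 patterns -> one-byte Codewords 0x00..0xFF
    return {pat: bytes([cw]) for cw, (pat, _) in enumerate(counts.most_common(256))}

def encode(msg, codebook):
    """Source side: replace each known pattern with its Codeword
    (messages here are assumed to be whole multiples of PATTERN_LEN)."""
    return b"".join(codebook[msg[i:i + PATTERN_LEN]]
                    for i in range(0, len(msg), PATTERN_LEN))

def decode(codewords, codebook):
    """Destination side: reverse lookup using the same Codebook."""
    reverse = {cw: pat for pat, cw in codebook.items()}
    return b"".join(reverse[bytes([b])] for b in codewords)

# Repetitive, low-entropy "sensor" messages used for training
samples = [b"T=21.5C;" * 2, b"T=21.6C;" * 2, b"T=21.5C;" * 2]
book = train_codebook(samples)

msg = b"T=21.5C;" * 2                 # a 16-byte message
packed = encode(msg, book)            # 2 bytes on the wire
assert decode(packed, book) == msg    # lossless round trip
print(len(msg), "->", len(packed))    # 16 -> 2
```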

But with compression...

Compression looks for repeated data patterns in a single file, or in a group of messages that is combined into a file.

Generally the process rewrites a long string of data as a shorter version, with reconstruction instructions “stapled” on: compression produces a shortened string plus the instructions needed to rebuild the original.
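
For contrast, a short sketch using Python's zlib module (a general-purpose DEFLATE compressor, used here only as a familiar example) shows this rewrite-plus-instructions behavior on a repetitive file:

```python
import zlib

# A repetitive "file" compresses well: the output is a much shorter
# string that carries, in effect, the instructions needed to rebuild
# the original.
original = b"temperature=21.5;humidity=40;" * 100   # 2,900 bytes
compressed = zlib.compress(original)

print(len(original), "->", len(compressed))     # roughly 2900 -> a few dozen bytes
assert zlib.decompress(compressed) == original  # fully reconstructable
```
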
Real-Time Compaction
Compacted messages can be used or transmitted in real time

With Compaction...

Messages are compacted and transmitted in real time, and Codewords are sent immediately. There is no need for messages to be accumulated and sent in batches.

But with compression...

Compression does not and cannot transmit in real time.

  • Enough data must be collected so that the algorithm finds sufficient patterns.
  • Compression typically requires multiple file-by-file scans, eliminating the possibility of real-time transmission.
  • Data is accumulated and sent in batches, as the sketch below illustrates.
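
A hedged sketch of this batching requirement, again using zlib as a stand-in for a generic compressor: compressing each small message on its own yields little or no savings, so a compressor only pays off once messages have been accumulated into a batch.

```python
import zlib

# 200 small, repetitive "sensor" readings, roughly 25 bytes each.
messages = [b"id=42;T=21.5;H=40;seq=%03d" % i for i in range(200)]

raw         = sum(len(m) for m in messages)
per_message = sum(len(zlib.compress(m)) for m in messages)    # the "real-time" case
batched     = len(zlib.compress(b"".join(messages)))          # the accumulated batch

print(raw, per_message, batched)
# Compressed one message at a time, the stream typically ends up *larger*
# than the raw data; only the accumulated batch shrinks meaningfully.
```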

Low Latency

Compaction—by its nature—adds almost no latency

With Compaction...

Compaction requires only a table lookup, which takes microseconds. It can be up to 400x faster than compression at getting the first bits out the door.

But with compression...

Compression, by its nature, injects latency. The file is scanned multiple times in search of patterns, file by file, which takes far more time and energy because the process is inefficient.
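
The following rough microbenchmark, using a hypothetical one-entry Codebook and zlib as a stand-in compressor, illustrates the difference in kind between a table lookup and a per-message compression pass; the exact ratio depends on hardware and settings and is not intended to reproduce the 400x figure above.

```python
import time, zlib

msg = b"id=42;T=21.5C;H=40"      # one small sensor reading
codebook = {msg: b"\x07"}        # hypothetical, pre-built Codebook entry

t0 = time.perf_counter()
for _ in range(100_000):
    cw = codebook[msg]           # compaction side: a single table lookup
lookup_us = (time.perf_counter() - t0) / 100_000 * 1e6

t0 = time.perf_counter()
for _ in range(100_000):
    z = zlib.compress(msg)       # compression side: scan and re-encode
compress_us = (time.perf_counter() - t0) / 100_000 * 1e6

print(f"lookup ~{lookup_us:.2f} us/msg, zlib ~{compress_us:.2f} us/msg")
```
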

Reduce Size

Compaction reduces the size of small messages by 70-90+%

With Compaction...

AtomBeam’s average IoT reduction is 75%.

But with compression...

Compression often greatly enlarges the size of small messages.

  • Compression generally will reduce machine files by less than 20%.
  • Typical compression achieves an average IoT reduction of only 4%, and it often adds to the size of IoT messages (see the sketch below).
  • It is impractical to use compression on small data.
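
A quick, hedged illustration with zlib (one common general-purpose compressor): its framing and coding overhead can exceed any savings on a tiny payload, so the "compressed" message comes out larger than the original.

```python
import zlib

msg = b"T=21.5C;H=40%"            # a 13-byte IoT-style reading
out = zlib.compress(msg)
print(len(msg), "->", len(out))   # e.g. 13 -> about 21 bytes: the message grew
```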

Small Footprint

Compaction has a small footprint and ultra-light processing requirements

With Compaction...

  • AtomBeam’s IoT executable requires approximately 40 KB, which fits on most IoT devices—even very small ones.
  • The processing used to encode data coming from the sensor is minimal, consisting only of a lookup, and will run effectively on virtually any low-cost processor.
  • Requires only ultra-light, low-cost processors.

But with compression...

Compression has a large footprint and substantial processing requirements.

  • Compression algorithms have a large footprint making them impractical for machines (such as sensors) with very limited computing and memory power.
  • Compression is computationally intensive and needs serious processing power to run quickly, placing a high demand on hardware.
  • To be effective, compression requires additional RAM to accumulate messages, which are then not sent in real time.

Fewer Errors

Compaction transmission errors are few and easily recoverable

With Compaction...
Compaction has very limited sensitivity to errors.

  • Errors are confined to a single burst of data (typically a group of small messages of 2-200 bytes); an error does not affect everything else that was transmitted.
  • Error correction is tightly constrained to retransmission of a single Codeword, rather than the entire original file; only that one burst of data needs to be resent.
  • AtomBeam automatically returns to decoding correctly, so cascading errors are not possible.
But with compression...
Compression transmission errors are serious. In compression, a single “flipped bit” can result in a cascade of errors in a compressed file. This requires retransmission of the entire file. In some cases, the file is rendered useless.
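
The contrast can be sketched as follows; the one-byte-per-message Codeword format is an illustrative assumption rather than AtomBeam's actual format, and zlib again stands in for a generic compressor. A flipped bit in the Codeword stream spoils only one message and decoding resumes immediately, while a flipped bit in a compressed file typically makes the whole file undecodable or corrupt.

```python
import zlib

def flip_bit(data: bytes, bit: int) -> bytes:
    """Simulate a single transmission error."""
    b = bytearray(data)
    b[bit // 8] ^= 1 << (bit % 8)
    return bytes(b)

# --- Codeword stream: one byte per message (an illustrative format) ---
codebook = {0: b"T=21.5C;", 1: b"T=21.6C;", 2: b"T=21.7C;"}
stream = bytes([0, 1, 2, 0, 1, 2])          # six compacted messages
decoded = [codebook.get(b, b"<error>") for b in flip_bit(stream, 20)]
print(decoded)  # only the third message is lost; decoding continues normally

# --- Compressed file: one flipped bit usually breaks the whole thing ---
original = b"T=21.5C;" * 1000
blob = flip_bit(zlib.compress(original), 20)
try:
    result = zlib.decompress(blob)
    print("decoded, but corrupted:", result != original)
except zlib.error as exc:
    print("decompression failed:", exc)     # the entire file must be resent
```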

Lossless

AtomBeam is lossless

With Compaction...

AtomBeam does not lose a single bit of data.

But with compression...

Some compression is lossy, making it impractical for many uses.