PNG Steganography

A Warning Before Reading

The Hypothetical Problem

I have a scenario for you. Let's say I own a Detroit-based robotics company, and I just built the most awesome robot ever created. It's the first robot that is actually alive!

Johnny 5

I have an electronics convention coming up where I will demo my new Johnny 5. I need to send the design files to the manufacturer located somewhere outside of Michigan near a large mountain where there is more lightning. However, I have a competitor that doesn't want this to happen: OCP.

Robocop

OCP will be demoing its new living-robot prototype, Robocop, at the same convention. The firmware engineering department at OCP was understaffed, so Robocop will not be done in time for the convention. OCP has decided to steal my Johnny 5 firmware and adapt it for Robocop.

If OCP steals my firmware, I will not have Johnny 5 in time for the convention. OCP, on the other hand, has its own production plant, so Robocop would be done in time.

OCP is a highly influential corporation with many political connections. My company is a small start-up looking to break into the market, and my manufacturer is in the same position. We both want this contract, and the manufacturer is willing to do what it can to obtain my files. By the end of the day, OCP will be monitoring all phone calls, emails, and web traffic (including FTP and SSH) going into and out of my company. Because it takes a whole day to package my files in a format the manufacturer can use, I will only have enough time to make a drop-off arrangement with the manufacturer, and the manufacturer is located too far away for me to hand over the files in person.

Whichever company demos its robot at the convention will be named the first company to have successfully created a living robot, winning a large, multi-year contract producing them for the government. How do I send the files to the manufacturer?

The Conditions

The manufacturer will ship the Johnny 5 prototypes directly to the convention for me; they don't have to come in to Detroit. I only have to get the 3MB of files to the manufacturer to triumph over OCP.

OCP has a vast amount of resources, including a huge number of data processors and servers. Any encrypted traffic I send, whether it is over SSH, FTPS, or HTTPS, will be logged and decrypted. This requires me to find a way to hide the data in plain sight.

I can't trust anybody with these files. OCP will plant people everywhere, including the post office. My friends will be bribed. Trackers will be installed on my car. Taxis will drive me in the wrong direction. I'm pretty much screwed in the corporate espionage department. My loyal manufacturer is the only one I can trust with this data. (Note: Although I'm taking this to the extreme, corporate espionage and data theft are very much real things, not just concepts used in movies.)

After talking with the manufacturer on our final untapped phone call, we agree that I will announce the success of the Johnny 5 prototype on my company blog. In the entry, I will post a hi-res picture of my team and me standing around the prototype, as most companies do. This will seem like normal traffic. However, the design files will be hidden _inside_ the picture itself. We agree on a passphrase, and we hang up the phone, not speaking to each other again until after the convention.

Steganography

Whenever the topic of information hiding comes up, people usually think of encryption. An attacker can see the data, but he can't make sense of it without a password, token, etc. However, there are ways of hiding information so that an attacker will not even know there is data there in the first place. Hiding data in this way is called steganography.

Steganography Before Computers

In the past, steganography was accomplished by posting letters. Anyone who sees the letter can harmlessly read it. The person for whom the hidden data is intended can find the real message through the use of a mask.

One of the best examples of this is from a letter sent during the Revolutionary War in 1777. Click the link for a transcript.

Steganography - Masked Letter

Of course, the concept of steganography is even older than the Revolutionary War. The earliest example I can find was written by Herodotus in 440 BCE. Histiaeus sent a message to Aristagoras by shaving a slave's head, tattooing a message on the head, and sending the slave once the hair had regrown. Once the slave had arrived, Aristagoras once again shaved the slave's head to read the message. (source)

Digital Steganography

There are a lot of different places to hide information on a computer. Picture files are simple to post on the Web and are inconspicuous. Let's use the picture of my engineering team in the lab with Johnny 5.

Engineers
Later, Dr. Bonsai left the company to join OCP. I hear that he is currently a key person on the Robocop project.

LSb Manipulation

One way of hiding a message is to store each bit of the message in the least significant bit (LSb) of a color value in the picture. For each message bit, a pseudo-randomly determined location (x, y, and color channel) has its LSb changed to match that bit. The change is so subtle that no one will notice it visually.

Red Shades
This is actually two shades of red. The LSb of the left half is set (0xFF). The LSb of the right half is clear (0xFE).
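To make the bit twiddling concrete, here is a minimal sketch of setting and reading the LSb of one 8-bit color value. The function names `embed_lsb` and `extract_lsb` are made up for this illustration and are not taken from steg.c.

```c
#include <stdint.h>

/* Overwrite the least significant bit of one 8-bit color value with `bit`.
 * The upper seven bits are left untouched, so the shade changes by at most 1. */
uint8_t embed_lsb(uint8_t channel, unsigned bit)
{
    return (uint8_t)((channel & 0xFEu) | (bit & 1u));
}

/* Read the hidden bit back out of a color value. */
unsigned extract_lsb(uint8_t channel)
{
    return channel & 1u;
}
```

Embedding a 0 into a full red value of 0xFF gives 0xFE, which is exactly the pair of shades shown above.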

The engineer picture is 5500 pixels wide and 3115 pixels high. There are three color channels: red, green, and blue. If every (x, y, color) location is used to hold one bit, there are 5500 × 3115 × 3 = 51,397,500 bits available, so the largest message that can be embedded is about 6 MB. The path through the locations is determined by an algorithm based on the password.

Equation 1
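The capacity arithmetic can be sketched in a few lines of C. This assumes exactly one message bit per (x, y, color) location and no header or length field; whether pngsteg reserves any locations for bookkeeping is not something I'm claiming here, and `lsb_capacity_bytes` is a hypothetical helper name.

```c
#include <stdint.h>

/* Number of whole message bytes that fit when every (x, y, channel)
 * location carries exactly one bit. Assumes no header overhead. */
uint64_t lsb_capacity_bytes(uint64_t width, uint64_t height, uint64_t channels)
{
    uint64_t bits = width * height * channels;
    return bits / 8; /* leftover bits that don't fill a byte are dropped */
}
```

For the engineer picture, `lsb_capacity_bytes(5500, 3115, 3)` comes out to 6,424,687 bytes.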

First, let's build the software.

steg $ ls
Engineers.png  build  steg.c
steg $ cat build
#!/bin/sh

gcc -o pngsteg steg.c -lgd -lssl

steg $ ./build
steg $ ls
Engineers.png  build  pngsteg  steg.c
steg $ ./pngsteg --help
pngsteg - A steganography program for PNG files
Written by contrapants@waronpants.net
Copyright 2012
See http://waronpants.net/png-steganography for more info

Usage: Embedding:  pngsteg -e -i INPUT_PIC -d MESSAGE_INPUT -o OUTPUT_PIC
                        -p PASSWORD [-m MAP_OUTPUT_PIC]
    Extracting: pngsteg -x -i INPUT_PIC -d MESSAGE_OUTPUT -p PASSWORD
    Help:       pngsteg -h

-e, --embed
    Embeds MESSAGE_INPUT into INPUT_PIT png file, creating OUTPUT_PIC png file
-x, --extract
    Extracts embedded message from INPUT_PIC png file, creating MESSAGE_OUTPUT
    file
-i, --in-pic INPUT_PIC
    The png file to be embedded into/extracted from
-o, --out-pic OUTPUT_PIC
    The png file to be created after embedding the MESSAGE_INPUT into INPUT_PIC
-d, --data MESSAGE
    The cleartext message to embed/extract
-m, --map MAP_OUTPUT_PIC
    The optional map output png file. When a bit is embedded in INPUT_PIC, the
    same pixel location will be modified in map file. The color channel
    affected will set to value 255.
-p, --password PASSWORD
    The password on which to base the algorithm
-v, --verbose
    This switch can be used multiple times. Each time increases the verbosity
    and number of output messages. (max used: 4)
-h, --help
    This text

To demonstrate the software with a small amount of data, we'll create a 500kB file and embed it.

steg $ dd if=/dev/urandom of=testdata_small bs=1K count=500
500+0 records in
500+0 records out
512000 bytes (512 kB) copied, 0.0507097 s, 10.1 MB/s

steg $ ./pngsteg -e -i Engineers.png -o StegPic_small.png -d testdata_small -m StegMap_small.png -p TestPassword1 -v
Building node list
Preallocation successful.
Shuffling nodes
Embedding data
Data embedded
Saving output picture
Saving map output picture

steg $ ./pngsteg -x -i StegPic_small.png -d testdata_small_extracted -p TestPassword1 -v
Preallocation successful.
Shuffling nodes
Extracting data
Data extracted.

steg $ diff testdata_small testdata_small_extracted
steg $

The resulting picture and map file are below. The map starts as a black picture. As each bit of the message is embedded, the color channel of the same pixel is turned "on" (set to 0xFF).

StegPic_small

StegMap_small
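The map idea is simple enough to sketch. The version below assumes the map is held as a flat RGB buffer, three bytes per pixel in row-major order; the real program works on PNG files through libgd, so the buffer layout and the names `map_clear` and `map_mark` are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Start with an all-black map: every channel of every pixel is 0. */
void map_clear(uint8_t *map, size_t width, size_t height)
{
    memset(map, 0, width * height * 3);
}

/* Record that a message bit was embedded at (x, y, channel) by turning
 * that channel fully "on". channel is 0 = red, 1 = green, 2 = blue. */
void map_mark(uint8_t *map, size_t width, size_t x, size_t y, unsigned channel)
{
    map[(y * width + x) * 3 + channel] = 0xFF;
}
```

A pixel that had bits embedded in all three channels shows up white in the map, while an untouched pixel stays black.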

How about a larger file? Let's try the same process with a 4MB file.

steg $ dd if=/dev/urandom of=testdata bs=1M count=4
4+0 records in
4+0 records out
4194304 bytes (4.2 MB) copied, 0.454543 s, 9.2 MB/s

steg $ ./pngsteg -e -i Engineers.png -o StegPic.png -d testdata -m StegMap.png -p TestPassword1 -v
Building node list
Preallocation successful.
Shuffling nodes
Embedding data
Data embedded
Saving output picture
Saving map output picture

steg $ ./pngsteg -x -i StegPic.png -d testdata_extracted -p TestPassword1 -v
Preallocation successful.
Shuffling nodes
Extracting data
Data extracted.

steg $ diff testdata testdata_extracted
steg $

StegPic

StegMap

The Source Code

The (admittedly unclean) source code can be found here: steg.c

Update (March 20, 2012): A few people have asked to see the revision history. Check it out through subversion. (Repository link)

Other Possibilities

Why not JPG?

"JPG files are smaller and more common. Why not use them?" Well, fictional interrogator, it's because of the reason JPGs are smaller: they use lossy compression. This means that once the data has been embedded in the pixels, saving the file as a JPG will erase part of the data. Simple JPG steganography often hides the data in the EXIF metadata of the file instead, which is usually very easy to find and read.
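Here is a deliberately crude sketch of why lossy compression and LSb embedding don't mix. Real JPEG quantizes frequency (DCT) coefficients of 8x8 blocks rather than individual pixel values, so treat this only as a stand-in: it rounds a color value to the nearest multiple of a step size, which is the kind of rounding that throws the low bits away. The function name `lossy_round_trip` and the step size are hypothetical.

```c
#include <stdint.h>

/* Crude stand-in for a lossy save/load cycle: round `value` to the nearest
 * multiple of `step` (step must be at least 1), clamped to 255.
 * This is NOT the JPEG algorithm, only an illustration of quantization. */
uint8_t lossy_round_trip(uint8_t value, unsigned step)
{
    unsigned q = (value + step / 2) / step; /* quantize */
    unsigned r = q * step;                  /* dequantize */
    return (uint8_t)(r > 255 ? 255 : r);
}
```

With a step of 4, the two red shades 0xFF and 0xFE both come back as the same value, so the bit that distinguished them is gone. With a step of 1 (no loss), every value survives unchanged.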

Other lossless picture formats are BMP and TIFF. Neither is usually compressed, so the files tend to be very large. Because of bandwidth and storage costs, posting them as images on the Web is generally not a good idea, whether there is data embedded or not.

Algorithm

The algorithm I used is a buckshot approach. It scatters the data all over the picture across all color channels. It also doesn't consider an alpha channel (transparency).
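For readers who want to see the scattering idea in code, here is a minimal sketch. It is not the code from steg.c. It assumes the picture is a flat array of color samples where index i stands for one (x, y, channel) location, it seeds a small xorshift generator from an FNV-1a hash of the password, and it uses a Fisher-Yates shuffle to pick the visiting order. The hash, the generator, the MSb-first bit order, and all function names are choices made for this illustration only, and none of them is cryptographically strong.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a hash: turns the password into a 32-bit seed (illustrative only,
 * not a real key derivation function). */
uint32_t seed_from_password(const char *password)
{
    uint32_t h = 2166136261u;
    for (; *password; ++password) {
        h ^= (uint8_t)*password;
        h *= 16777619u;
    }
    return h;
}

/* One step of the xorshift32 generator; the state must be nonzero. */
uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* Fill order[0..n-1] with 0..n-1, then Fisher-Yates shuffle it.
 * The same password always yields the same order. */
void shuffled_order(uint32_t *order, uint32_t n, const char *password)
{
    uint32_t state = seed_from_password(password);
    if (state == 0)
        state = 1;
    for (uint32_t i = 0; i < n; ++i)
        order[i] = i;
    for (uint32_t i = n; i > 1; --i) {
        uint32_t j = next_random(&state) % i; /* slight modulo bias; fine for a sketch */
        uint32_t tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }
}

/* Message bit k (MSb first) goes into the LSb of samples[order[k]].
 * The caller must make sure msg_len * 8 does not exceed the order length. */
void embed(uint8_t *samples, const uint32_t *order,
           const uint8_t *msg, size_t msg_len)
{
    for (size_t k = 0; k < msg_len * 8; ++k) {
        unsigned bit = (msg[k / 8] >> (7 - k % 8)) & 1u;
        uint8_t *s = &samples[order[k]];
        *s = (uint8_t)((*s & 0xFEu) | bit);
    }
}

/* Walk the same order and collect the LSbs back into bytes. */
void extract(const uint8_t *samples, const uint32_t *order,
             uint8_t *msg, size_t msg_len)
{
    for (size_t i = 0; i < msg_len; ++i)
        msg[i] = 0;
    for (size_t k = 0; k < msg_len * 8; ++k)
        msg[k / 8] |= (uint8_t)((samples[order[k]] & 1u) << (7 - k % 8));
}
```

Both sides call `shuffled_order` with the shared password, so the receiver visits the locations in the same order as the sender. Without the password, an attacker doesn't know which LSbs carry data or in what order to read them.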

There are more intricate methods, such as hiding the data in the anti-aliased lines between contrasting colors. However, because this method restricts the locations where data can be hidden, less data can be embedded.

Conclusion

Just because encryption works pretty well most of the time doesn't mean it should be the only thing considered whenever security is mentioned. There are other layers. Data can be encrypted and then embedded in a picture.

There are also algorithms and programs for embedding data in sound files. If the data doesn't need to be extracted without errors, then pictures can even be inserted into music as an Easter egg, such as the audio spectrum of Windowlicker by Aphex Twin (source).

Aphex Twin - Windowlicker Face

There are other creative uses for steganography. Besides secretly hiding data, steganography might be used to hide copyright information in a photograph without requiring a destructive watermark, for example.

Whether the intent is malicious, defensive, or just for fun, remember that not everything may be as it seems, especially on the Internet. Just don't get too paranoid.

White House Pentagram

Note: I wanted to include a picture of the letter John Nash would see popping out at him in A Beautiful Mind, but I couldn't find the screenshot I wanted. Instead, I included the above picture from a conspiracy theory website. To make up for this, please accept this picture of Jennifer Connelly.

Jennifer Connelly - A Beautiful Mind
OMG! A PNG file!

I'm going to end this on a high note with that picture.


Image Sources