
Steganography

Proposed by Ian Redzic

Steganography provides a covert channel of communication, and isn't inherently malicious. Downloading an image with hidden content won't by itself install malware -- something has to extract and run it.
Notes
  • Hiding a message (or worse) within another file/message
    • First recorded use in 440 BC (Herodotus), writing a message on a shaved head, and letting hair regrow
  • Used not only in malicious contexts, also watermarks, etc.
  • Reality Winner - distributing government secrets by printing classified documents
    • eff.org has a tool that can interpret the pattern of yellow dots from printers: printer serial number, plus the date and time the document was printed
    • NSA could check logs, identified person responsible
  • $20 bill - yellow printouts on bill indicate where money was printed
  • Least significant bit: in a 4-color image, changes the last 2 bits of each pixel's color values (e.g. red goes from 255 to 253)
  • Higher-resolution images make it easier to embed larger text without noticeably reducing quality
  • Used extensively in cyber-espionage, and delivering malware; lots of banking trojans & malware
  • Motivated by fame and money (destroying stuff, or stealing money)
  • Steganography isn't an everyday tool that Stanford IT security sees (did see it once recently, and it was exciting!)
    • A Japanese-language Excel spreadsheet arrived in an inbox. Its macros invoked a script that pulled a .png image from a remote server, read it, and extracted code from the first three rows of the image (blue & green channels modified using least-significant bit). That code was another script, which was loaded into the clipboard; a further process was then triggered to run the clipboard contents, which downloaded the real payload.
    • The image itself (a printer printing an Android logo) has some noise and low-grade banding; the noise in the first three rows is lower (lower entropy) because those rows contain the code
    • Saw this 3 days before any public disclosure; "Narwhal Spider" cybercrime group
  • Least-significant bit is not a very sophisticated way to encode a message
  • Very stealthy: malicious code on a router would pull down an image from a public image-sharing site (Photobucket); the lat/lon values in the image's EXIF tags, converted to an IP address, give the IP of the command-and-control server
  • Demo: using tools steghide / stegosuite
    • Steganography tools often poorly maintained, very old
    • Would typically take source code of existing tool and rewrite it
  • steghide - can embed things, with or without a password
    • Original file is slightly bigger than steg file, not filling in all the least-significant bits
  • Stegosuite has a GUI for the same thing.
  • ImageMagick can identify noise distortion, but only if you have the original to compare with
  • Q: can you automatically add noise to all images that come in to Stanford email? A: researchers would not be happy
  • Government websites: every image is signed
  • steghide won't allow image to look distorted -- you can't embed 95k file in 116k file
  • Fingerprint for file changes, diff would also work (if you can compare to original)
  • Without extraction routine, these are harmless (and some are harmless, period -- just used for marking copyright)
  • Data embedded with Stegosuite can't be extracted with steghide -- but if you're using this, then the receiving party should know what tool to use to extract it
  • Note: text is not embedded as a contiguous ASCII string; pieces of the string's bytes are spread across the image, in whatever way the tool encodes them
  • StegExpose not well maintained, but it does work (for pngs and bitmaps)
    • Java application, point it to a folder with images
    • Indicates approximate amount of hidden data
    • Doesn't support jpg
  • steghide/Stegosuite will remove metadata on pictures
  • Spreadsheet situation: was monitoring a specific mailbox because it'd had problems in the past. Proofpoint in email stream will quarantine files that look suspicious.
  • More concerned with stopping execution of malicious stuff -- didn't look at the image itself, looked at the code that did the evaluation, and changed it to print the result to the screen instead
  • A lot of Windows malware is PowerShell, so that's inherently suspicious since most users don't need to run PowerShell scripts
  • Possible risk for exfiltration of data (the TTL field in packets was once used to detect this)
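
The least-significant-bit idea from the notes can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how steghide or Stegosuite work: it treats the image as a flat sequence of channel bytes, uses one bit per byte (the notes mention two), writes the bits sequentially rather than spreading them across the image, and adds a 4-byte length prefix as its own made-up framing. The function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels: bytearray, message: bytes) -> bytearray:
    """Return a copy of pixels with message hidden in the lowest bit of each byte."""
    # Assumed framing: 4-byte big-endian length, then the message itself.
    payload = len(message).to_bytes(4, "big") + message
    # Flatten the payload into individual bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("cover data is too small for this message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return stego


def extract_lsb(pixels: bytes) -> bytes:
    """Recover a message written by embed_lsb."""
    def read_bytes(start: int, count: int) -> bytes:
        out = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start + b * 8 + k] & 1)
            out.append(value)
        return bytes(out)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


# Usage: 512 stand-in "channel bytes" carry a 6-byte message.
cover = bytearray(range(256)) * 2
stego = embed_lsb(cover, b"hidden")
assert extract_lsb(stego) == b"hidden"
```

Each byte changes by at most 1, which is why the result is hard to spot by eye, and why a larger cover leaves room for a larger message.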
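
The "fingerprint for file changes" point can be illustrated the same way. The sketch below assumes you hold both the original and the suspect data as bytes in memory (for files, something like `pathlib.Path.read_bytes()` would supply them); the helper names `fingerprint`, `differs_from_original` and `changed_offsets` are hypothetical, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hex SHA-256 digest of the data."""
    return hashlib.sha256(data).hexdigest()


def differs_from_original(original: bytes, candidate: bytes) -> bool:
    """True if the candidate's fingerprint does not match the original's."""
    return fingerprint(original) != fingerprint(candidate)


def changed_offsets(original: bytes, candidate: bytes) -> list[int]:
    """Byte offsets that differ, a crude diff over the common length only."""
    return [i for i, (a, b) in enumerate(zip(original, candidate)) if a != b]


# Usage: flipping a single low bit is enough to change the fingerprint.
original = bytes(range(16))
tampered = bytearray(original)
tampered[3] ^= 1
assert differs_from_original(original, bytes(tampered))
```

As the notes say, this only helps if the original is available for comparison; a fingerprint mismatch shows that something changed, not what was hidden.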