
Polyglot Files: When One File Speaks Two Formats at Once


Have you ever opened an image file, only to find it behaves like a ZIP archive? Or a PDF that is simultaneously valid JavaScript? This isn't a bug: it's called a polyglot file, and the technique runs far deeper and is far more dangerous than it appears on the surface. I'm dedicating this post to the topic because I love working on forensics problems and challenges on CTF platforms and in competitions.

What Is a Polyglot File?

A polyglot file is a file that is simultaneously valid and parseable by two or more different format parsers. The word polyglot comes from Greek, meaning "many languages", and that is precisely the essence: a single file that "speaks" in more than one format.

This is not simply a renamed extension. A polyglot file genuinely satisfies the magic byte signatures and internal structural requirements of two formats at the same time. As a result, depending on which application reads it, the file behaves entirely differently.

The most well-known examples:

  • GIFAR — a file simultaneously valid as a GIF and a JAR archive
  • PDF + ZIP — a file that opens in Acrobat as a readable document, yet extracts cleanly with unzip
  • HTML + PNG — an image that renders visually in an image viewer, but if loaded in a browser, executes as a full web page

How Is This Possible?

The core mechanic behind polyglot files lies in how each format defines the boundaries of its valid data.

Formats like GIF and PNG read from the beginning of the file forward: they care about the magic bytes at offset 0. Other formats, like ZIP and JAR, read from the end of the file backward, searching for the End of Central Directory (EOCD) signature PK\x05\x06 in the tail.

This creates an elegant exploitation gap: place a valid GIF header at the start, embed ZIP data at the end, and both parsers will be satisfied with their own interpretation of the same bytes.

[0x00]  GIF89a ... (valid GIF header)
        ... image data ...
        ... hidden payload ...
[EOF-22] PK\x05\x06 ... (valid ZIP EOCD)
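
The layout above can be built in a few lines. This is a sketch of my own, not from the original post: it prepends a well-known minimal 43-byte transparent 1×1 GIF to an in-memory ZIP archive (the names `polyglot` and `payload.txt` are illustrative), and both a header-first check and Python's footer-anchored ZIP reader accept the result.

```python
import io
import zipfile

# A minimal 1x1 transparent GIF (43 bytes), byte-for-byte valid GIF89a.
gif = bytes.fromhex(
    "474946383961"          # header: "GIF89a"
    "01000100800000"        # logical screen descriptor: 1x1, global color table
    "000000ffffff"          # 2-entry global color table (black, white)
    "21f9040100000000"      # graphic control extension (transparency flag)
    "2c000000000100010000"  # image descriptor
    "0202440100"            # LZW-compressed pixel data
    "3b"                    # trailer
)

# Build a small ZIP archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("payload.txt", "hello from the zip half")

# Concatenate: GIF structure first, ZIP records at the tail.
polyglot = gif + buf.getvalue()

# Header-first parsers see a GIF; footer-anchored parsers find the EOCD.
assert polyglot.startswith(b"GIF89a")
assert zipfile.is_zipfile(io.BytesIO(polyglot))
```

Python's zipfile module even reads the archive's contents correctly, because it computes member offsets relative to the EOCD record rather than trusting offset 0.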

Lenient Parsers and Trailing Data

Many parsers are purposely tolerant: they silently ignore everything after the end of the recognized structure. ZIP, in contrast, only considers the signature near the end of the file. For a polyglot, combining a footer-anchored structure with a permissive header-first format is ideal.
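
Both behaviors can be demonstrated with the standard library alone. This sketch (my own illustration, not from the post) uses zlib as a stand-in for a lenient header-first parser, and zipfile as the footer-anchored one:

```python
import io
import zipfile
import zlib

# Header-first leniency: a zlib stream parser stops at the stream's logical
# end and quietly sets the trailing bytes aside instead of rejecting them.
stream = zlib.compress(b"real content")
d = zlib.decompressobj()
assert d.decompress(stream + b"<hidden payload>") == b"real content"
assert d.unused_data == b"<hidden payload>"

# Footer-anchored tolerance: zipfile locates the EOCD from the tail of the
# file, so arbitrary bytes *before* the archive are simply skipped over.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "data")
assert zipfile.is_zipfile(io.BytesIO(b"arbitrary prefix bytes" + buf.getvalue()))
```

The polyglot trick is exactly the combination of these two blind spots in one byte string.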

Manual Construction: PDF + ZIP

Let’s build a simple polyglot manually to understand the mechanics from first principles.

import zipfile
import io

pdf_content = b"""%PDF-1.4
1 0 obj
<< /Type /Catalog /Pages 2 0 R >>
endobj
2 0 obj
<< /Type /Pages /Kids [3 0 R] /Count 1 >>
endobj
3 0 obj
<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] >>
endobj
xref
0 4
0000000000 65535 f
0000000009 00000 n
0000000058 00000 n
0000000115 00000 n
trailer
<< /Size 4 /Root 1 0 R >>
startxref
190
%%EOF"""

# Build a small ZIP archive in memory.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, 'w', zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("secret.txt", "flag{p0lygl0t_f1l3s_r0ck}")

# Concatenate: PDF structure first, ZIP records appended after %%EOF.
with open("polyglot.pdf", "wb") as f:
    f.write(pdf_content)
    f.write(zip_buffer.getvalue())

The resulting polyglot.pdf opens in Adobe Reader as a valid PDF document, and can be extracted with unzip polyglot.pdf to retrieve secret.txt.

Verification:

$ file polyglot.pdf
polyglot.pdf: PDF document, version 1.4

$ unzip -l polyglot.pdf
Archive:  polyglot.pdf
  Length      Date    Time    Name
---------  ---------- -----   ----
       25  2025-04-19 10:00   secret.txt

Both interpretations are completely valid. Neither parser complains.
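
The same dual check can be done programmatically. Here is a hypothetical helper of my own (the name `check_polyglot` is not from the walkthrough) that tests both interpretations of one byte string:

```python
import io
import zipfile

def check_polyglot(data: bytes) -> dict:
    """Report which of the two claimed formats the bytes satisfy."""
    return {
        "pdf": data.startswith(b"%PDF-"),             # header-anchored check
        "zip": zipfile.is_zipfile(io.BytesIO(data)),  # footer-anchored check
    }
```

Run it against the bytes of polyglot.pdf and both keys come back True.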

Real-World Implications

1. Upload Filter Bypass

The most dangerous scenario is a web application that validates file type based solely on magic bytes or file extension.

Consider a web app that only allows PNG uploads for user avatars. If validation only checks for the \x89PNG magic bytes, a polyglot PNG + PHP or PNG + HTML file can pass that check cleanly. If the server stores uploaded files inside the web root with PHP execution enabled, accessing the file via a direct URL causes the server to execute it as a script rather than serve it as an image.
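
The flaw is easy to see in code. This is a deliberately naive sketch of such a filter (my own illustration; a real PNG + PHP polyglot would keep the PNG chunks structurally valid as well):

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"

def naive_avatar_check(data: bytes) -> bool:
    # The flawed filter described above: trusts the first 8 bytes, nothing else.
    return data.startswith(PNG_MAGIC)

# Crude stand-in for a PNG + PHP polyglot: genuine magic bytes up front,
# attacker-controlled script text appended after the "image" data.
crafted = PNG_MAGIC + b"...chunk data..." + b"<?php system($_GET['cmd']); ?>"

assert naive_avatar_check(crafted)  # the filter waves it through
```

If the server later executes this file as PHP, the magic-byte check never mattered.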

This is a real attack vector that has appeared in bug bounty programs repeatedly, particularly against misconfigured servers where the upload directory sits within an executable web root.

2. Antivirus Evasion

Some antivirus engines apply heuristics based on the file format they detect. Malware packaged as a polyglot can deceive scanners that only inspect one interpretation of the file, typically whichever matches the extension, while the malicious payload lives inside the second, uninspected format interpretation.

3. GIFAR and Java Applets (Historical)

GIFAR (GIF + JAR) was a highly relevant attack in the Java Applet era. An attacker could upload a GIFAR file to a victim server as a valid "image," then load it as a Java Applet from the same domain. Because the same-origin policy was satisfied (the file was served from the victim's own domain), the applet gained full access to the user's session cookies and DOM. This technique was presented at Black Hat 2008 and represented a genuine, exploitable browser-level vulnerability.

Polyglot Files in CTF

In CTF competitions, polyglot files appear frequently in forensics and steganography categories. Two common patterns:

Scenario 1: The “broken” file that isn’t broken

You are given a .jpg file that looks normal at a glance. Running binwalk reveals something else entirely:

$ binwalk suspicious.jpg

DECIMAL       HEXADECIMAL     DESCRIPTION
-----------------------------------------------
0             0x0             JPEG image data, JFIF standard
4096          0x1000          Zip archive data, at least v2.0

The file is a polyglot. The flag lives in the ZIP portion.

Scenario 2: Raw magic byte analysis

A file is provided with no extension. Inspecting the raw bytes with xxd:

00000000: 504b 0304 1400 0000 0000 0000 ...  PK..........

The leading bytes are a ZIP signature (PK\x03\x04), but deeper in the file there is a valid PDF structure. Both interpretations must be extracted and examined for the flag.
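
This kind of scan is easy to automate. Here is a miniature binwalk of my own devising: it walks the file for known magic sequences at every offset (the signature table is deliberately tiny and illustrative):

```python
# Map of magic byte sequences to human-readable format names.
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP local file header",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF8": "GIF image",
}

def scan_magic(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, format) for every known magic sequence found."""
    hits = []
    for magic, name in SIGNATURES.items():
        start = 0
        while (pos := data.find(magic, start)) != -1:
            hits.append((pos, name))
            start = pos + 1
    return sorted(hits)
```

Any file that reports more than one format is worth a closer look.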

Relevant tools for analysis:

binwalk -e target_file
foremost -i target_file
file target_file
xxd target_file | head -20
python3 -c "import zipfile; zipfile.ZipFile('target_file').extractall('.')"

Scenario 3: Network Forensics

Polyglot files appear here as the payload: a file that was transferred over the network and needs to be reconstructed from stream data. A common challenge scenario: export an HTTP object from a PCAP file, get a file named report.pdf, and run file on it, only for it to report a ZIP archive. The attacker sent a polyglot to evade content inspection that only checked the PDF magic bytes at offset 0.

tshark -r capture.pcap --export-objects http,./output
file ./output/*             # never trust the filename
binwalk ./output/report.pdf # check for embedded formats

Scenario 4: Memory Forensics (Volatility)

Memory analysis captures evidence that disk forensics might miss. After an attack, malware may delete its own files or run entirely in memory without saving anything to disk at all, leaving nothing to find on the hard drive. In memory, however, you can still spot remnants such as active network connections or secrets like encryption keys. The first version of The Volatility Framework was released publicly at Black Hat DC in 2007, built on years of published academic research into advanced memory analysis and forensics.

Frequently Used Volatility Modules

Here are some modules that are often used:

  • pslist: Shows the active processes.
  • cmdline: Reveals the command-line parameters for processes.
  • netscan: Checks for network links and available ports.
  • malfind: Looks for possible harmful code added to processes.
  • handles: Examines open resources.
  • svcscan: Displays services in Windows.
  • dlllist: Lists the dynamic-link libraries loaded in a process.
  • hivelist: Identifies registry hives stored in memory.

Documentation for Volatility is available on the Volatility Foundation's official site.

Detection and Mitigation

From a blue team and forensic analysis perspective, detecting polyglot files requires a multi-layered approach.

Deep content inspection — never trust magic bytes or file extensions alone. Use strict parsing libraries that validate the full structural integrity of the claimed format, not just its header.

Entropy analysis — polyglot files frequently exhibit anomalous entropy distribution in specific sections. A high-entropy blob, approaching 8 bits per byte, sitting after the format's logical end of file is a strong red flag: it usually means compressed or encrypted data was appended.
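
Shannon entropy over a byte string is a few lines of standard-library Python; this small helper is my own illustration of the measurement:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Bits per byte: 0.0 for constant input, 8.0 at maximum randomness."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())
```

Slicing a suspect file into fixed-size windows and charting this value makes an appended high-entropy payload stand out visually.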

File size heuristics — a PNG avatar for a 100×100 pixel image that weighs 4MB is an obvious outlier worth investigating.

On the developer side, the most effective mitigation is re-encoding:

from PIL import Image
import io

def reencode_image(file_bytes: bytes) -> bytes | None:
    """Return a sanitized, re-encoded copy of the image, or None on failure."""
    try:
        # verify() checks structural integrity but leaves the object unusable,
        # so a second open() is needed for the actual re-encode.
        Image.open(io.BytesIO(file_bytes)).verify()
        img = Image.open(io.BytesIO(file_bytes))
        output = io.BytesIO()
        img.save(output, format=img.format)
        return output.getvalue()
    except Exception:
        return None

Re-encoding works by opening the image and saving it from scratch through the image parser's own output pipeline. The result contains only data the image parser understood and reproduced: any embedded ZIP, PHP, or HTML payload that lived outside the valid image structure is simply not carried over into the new file.

Closing Thoughts

Polyglot files live at the intersection of format specification, parser leniency, and attacker creativity. They exist in the gap between what a format intends and what a parser actually accepts.

Understanding this technique gives a dual advantage: as an attacker, you know how to bypass naive validation; as a defender, you understand that extension checks and magic byte inspection alone are never sufficient.

In CTF, the ability to read binary structures manually and reason about format ambiguity separates solvers from non-solvers. And in production systems, this concept has a documented history in real CVEs, all rooted in the same over-trust of surface-level file type validation.


Want to go deeper? Check out PolyFile by Trail of Bits, a static analysis tool built specifically to detect and dissect polyglot files.