//Excavator
>TL;DR
We’re given a partially corrupted Linux core dump from a crashed stager. The stager was staging a ZIP, then decrypting config.bin using a saved stream-cipher state.
The trick is:
- Carve the embedded ZIP out of the core using the ZIP EOCD signature.
- Repair the ZIP, because the malware intentionally zeroed the first 30 bytes of the ZIP local header (memset(zip_buf, 0, 30)), making the archive look “corrupt”.
- Extract config.bin from the repaired ZIP.
- Recover the stream cipher state from memory: the struct state { s[256], i, j } is effectively RC4 state.
- Decrypt config.bin using the recovered RC4 state.
Recovered plaintext config:
{!!!Y0u_D1d_I7_l1k3_4n_3xC4vat0r!_w3eeell_d0ooone!!!}
Flag:
shellmates{!!!Y0u_D1d_I7_l1k3_4n_3xC4vat0r!_w3eeell_d0ooone!!!}
>Files Provided
The challenge ZIP contained:
- core.522044: 64-bit Linux ELF core dump (x86_64)
- stager.c: pseudo-code / reconstructed C showing what the malware did
>Step 0 — Read the clue: stager.c
The most important lines are:
- It loads a persistent cipher state:
struct state {
    uint8_t s[256];
    uint8_t i;
    uint8_t j;
};
load_state("state", &ctx);
- It loads a ZIP into memory (zip_buf), then runs unzip on /tmp/config.zip (downloaded earlier) and wipes the first 30 bytes of zip_buf:
uint8_t *zip_buf = load("config.zip", &zip_len);
...
memset(zip_buf, 0, 30);
- It loads config.bin, deletes the files, then crashes before/while decrypting:
uint8_t *enc = load("config.bin", &enc_len);
...
if (decrypt(&ctx, enc, enc_len) != 0) { ... }
startInfoStealer(enc, enc_len);
Why this matters
This tells us exactly what should be inside the core dump:
- A heap buffer containing config.zip (or parts of it)
- A heap buffer containing config.bin
- A ctx structure with a 256-byte permutation and two indices
Also, the memset(zip_buf, 0, 30) is a massive hint: 30 bytes is the fixed-size portion of a ZIP Local File Header.
That clue is what made me go after ZIP header repair instead of “generic carving”.
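As a quick sanity check on that 30-byte figure, the fixed-size ZIP records can be described with Python struct format strings and measured. The format strings match the ones the solver scripts in this writeup use (with the 4-byte signature prepended for the central directory and EOCD); the constant names are just labels chosen for this sketch.

```python
import struct

# Little-endian, no padding ("<"). Names are illustrative labels only.
LFH_FIXED = "<4sHHHHHIIIHH"          # local file header, fixed part
CDFH_FIXED = "<4sHHHHHHIIIHHHHHII"   # central directory file header, fixed part
EOCD_FIXED = "<4sHHHHIIH"            # end of central directory, fixed part

sizes = {
    "LFH": struct.calcsize(LFH_FIXED),
    "CDFH": struct.calcsize(CDFH_FIXED),
    "EOCD": struct.calcsize(EOCD_FIXED),
}
print(sizes)  # {'LFH': 30, 'CDFH': 46, 'EOCD': 22}
```

So memset(zip_buf, 0, 30) wipes exactly the signature and fixed fields of the first local header, and leaves the filename, extra field, and file data that follow it untouched.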
>Step 1 — Quick recon: strings + ZIP signatures
A core dump is just a big memory snapshot. A fast first pass is to run strings -a core.522044 to see if filenames, magic bytes, or fragments exist.
In the output, we saw:
config.binPKconfig.bin
That PK is the classic ZIP marker: every ZIP record signature starts with the bytes 0x50 0x4B (0x4B50 when read as a little-endian 16-bit value), so I pivoted to ZIP structure carving.
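Beyond strings, it can help to list where each ZIP record signature occurs in the dump. The following is a minimal sketch of such a scan; find_signatures and the SIGNATURES table are names made up for this example, and the sample buffer is synthetic, not challenge data.

```python
from typing import Dict, List

# The three record signatures used in this writeup.
SIGNATURES = {
    "local_file_header": b"PK\x03\x04",
    "central_directory": b"PK\x01\x02",
    "end_of_central_directory": b"PK\x05\x06",
}


def find_signatures(buf: bytes) -> Dict[str, List[int]]:
    """Return every offset at which each ZIP signature occurs in buf."""
    hits: Dict[str, List[int]] = {name: [] for name in SIGNATURES}
    for name, sig in SIGNATURES.items():
        pos = buf.find(sig)
        while pos != -1:
            hits[name].append(pos)
            pos = buf.find(sig, pos + 1)
    return hits


# Tiny synthetic buffer standing in for the core.
sample = b"\x00\x00PK\x03\x04junkPK\x01\x02morePK\x05\x06"
print(find_signatures(sample))
```

On a real dump you would pass the bytes read from the core file. A region with central-directory and EOCD hits but no local-header hit in front of them is what a zeroed local header would look like.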
>Step 2 — Carve ZIPs reliably using EOCD
Why EOCD?
ZIP files are easiest to recover from the end, because the End Of Central Directory record contains:
- central directory size
- central directory offset
So even if the beginning is damaged, EOCD often survives.
The EOCD signature is:
PK\x05\x06
Strategy:
- Scan the core for PK\x05\x06
- Parse the EOCD fields
- Compute:
$$\text{zip\_start} = \text{eocd\_offset} - (\text{cd\_offset} + \text{cd\_size})$$
- Extract [zip_start : eocd_end] as a candidate ZIP blob
I implemented this in a small helper, carve_zip_from_core.py.
Running it produced small candidate ZIP blobs.
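The start-offset formula rests on two layout facts: cd_offset is measured from the start of the archive, and in a simple single-disk, non-ZIP64 archive the central directory is immediately followed by the EOCD, so eocd_offset = zip_start + cd_offset + cd_size. A minimal sketch that checks this on a synthetic "core" built with Python's zipfile module (the payload and filler bytes are made up for illustration):

```python
import io
import struct
import zipfile

# Build a small ZIP in memory (stand-in for config.zip; contents are made up).
bio = io.BytesIO()
with zipfile.ZipFile(bio, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("config.bin", b"not the real ciphertext")
blob = bio.getvalue()

# Bury it in filler bytes to imitate a memory dump.
core = b"\xaa" * 1000 + blob + b"\xbb" * 500

eocd_offset = core.rfind(b"PK\x05\x06")
# EOCD layout: cd_size at +12, cd_offset at +16, comment_len at +20.
cd_size, cd_offset, comment_len = struct.unpack_from("<IIH", core, eocd_offset + 12)

zip_start = eocd_offset - (cd_offset + cd_size)
zip_end = eocd_offset + 22 + comment_len

print(zip_start, core[zip_start:zip_end] == blob)  # 1000 True
```

Note that the carve never looks at the first bytes of the archive, which is why it still works when the local header has been wiped.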
>Step 3 — Why the carved ZIP “looks corrupt”
When testing the carved ZIP with unzip -t, it failed with:
bad zipfile offset (local header sig)
This is exactly what you’d expect if the ZIP central directory is present (end of file) but the local file header at the start is broken.
Now re-check the malware clue:
memset(zip_buf, 0, 30)
And from the ZIP spec:
- the fixed part of the local file header is 30 bytes
So the archive isn’t “randomly corrupted”; it’s intentionally sabotaged in-memory.
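The same symptom can be reproduced in a few lines. This sketch builds a throwaway archive with Python's zipfile module, zeroes its first 30 bytes, and shows that the listing (served from the central directory) still works while reading the member fails. The payload is a made-up stand-in, and zipfile's error is not the same message unzip prints.

```python
import io
import zipfile

bio = io.BytesIO()
with zipfile.ZipFile(bio, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("config.bin", b"not the real ciphertext")

damaged = bytearray(bio.getvalue())
damaged[0:30] = bytes(30)  # same effect as memset(zip_buf, 0, 30)

with zipfile.ZipFile(io.BytesIO(bytes(damaged))) as zf:
    names = zf.namelist()          # answered from the central directory
    try:
        zf.read("config.bin")      # has to parse the local header first
        read_ok = True
    except zipfile.BadZipFile as exc:
        read_ok = False
        print("read failed:", exc)

print(names, read_ok)
```

The metadata survives because it lives at the end of the file; only the copy at the front is gone.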
>Step 4 — Repair the zeroed local header from the central directory
ZIP has redundancy:
- The Central Directory File Header (CDFH) repeats most important metadata.
So we can reconstruct the local header fields (version needed, flags, method, timestamps, CRC, compressed/uncompressed size, filename length, extra length) from the CDFH.
In this challenge the ZIP is tiny and contains only config.bin, so a narrow repair is sufficient:
- Find EOCD and central directory offset
- Parse the first CDFH (PK\x01\x02)
- Build the 30-byte local header fixed structure:
  - signature PK\x03\x04
  - fields copied from central directory
- Replace bytes zip[0:30]
I wrote this as repair_zeroed_local_header.py.
After repair, unzip -t succeeded and we extracted config.bin.
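One layout detail makes the repair compact: in the CDFH, the 26 bytes from "version needed" through "extra field length" (offsets 6 to 32) appear in the same order as bytes 4 to 30 of the local header. The sketch below uses that as a slice copy and round-trips a synthetic archive. It assumes a single entry at offset 0 whose local and central headers agree; the ZIP format allows them to differ (for example a different local extra-field length, or sizes left at zero when a data descriptor is used), in which case the copied fields would need adjusting. repair_first_header is a name invented for this example.

```python
import io
import struct
import zipfile

PAYLOAD = b"not the real ciphertext"  # made-up stand-in for config.bin

bio = io.BytesIO()
with zipfile.ZipFile(bio, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("config.bin", PAYLOAD)

damaged = bytearray(bio.getvalue())
damaged[0:30] = bytes(30)  # simulate memset(zip_buf, 0, 30)


def repair_first_header(data: bytes) -> bytes:
    """Rebuild the local header at offset 0 from the first central directory entry."""
    eocd = data.rfind(b"PK\x05\x06")
    if eocd == -1:
        raise ValueError("EOCD not found")
    (cd_offset,) = struct.unpack_from("<I", data, eocd + 16)
    if data[cd_offset:cd_offset + 4] != b"PK\x01\x02":
        raise ValueError("central directory entry not found")
    # CDFH bytes 6..32 mirror LFH bytes 4..30 (assumes both headers agree).
    fixed = bytes(data[cd_offset + 6:cd_offset + 32])
    out = bytearray(data)
    out[0:30] = b"PK\x03\x04" + fixed
    return bytes(out)


repaired = repair_first_header(bytes(damaged))
with zipfile.ZipFile(io.BytesIO(repaired)) as zf:
    recovered = zf.read("config.bin")
print(recovered == PAYLOAD)  # True
```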
>Step 5 — Identify the cipher: struct state screams RC4
The stager’s struct state is:
- s[256]: a permutation of bytes 0..255
- i, j: two 8-bit counters
This matches the classic RC4 internal state (array S and indices i,j).
This matters because the core dump likely contains the live state used for decryption.
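The reason a memory snapshot is enough, with no key needed, is that RC4's keystream generator depends only on (S, i, j). A minimal sketch, assuming the stager uses plain textbook RC4 (which is what the struct suggests, not something stager.c states outright): it encrypts the widely published "Key" / "Plaintext" test vector in one go, then again by snapshotting the 258-byte state partway through and resuming from the snapshot. The function names are made up for this example.

```python
def rc4_ksa(key: bytes) -> bytearray:
    """Standard RC4 key schedule: returns the initial permutation S."""
    S = bytearray(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    return S


def rc4_xor(S: bytearray, i: int, j: int, data: bytes):
    """XOR data with keystream, mutating S; returns (output, i, j)."""
    out = bytearray()
    for b in data:
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(b ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out), i, j


# One-shot encryption with the key.
full, _, _ = rc4_xor(rc4_ksa(b"Key"), 0, 0, b"Plaintext")

# Same stream, but snapshot the state after 5 bytes and resume from the snapshot.
S = rc4_ksa(b"Key")
first, i, j = rc4_xor(S, 0, 0, b"Plain")
snapshot = bytes(S) + bytes([i, j])  # same layout as struct state: s[256], i, j
second, _, _ = rc4_xor(bytearray(snapshot[:256]), snapshot[256], snapshot[257], b"text")

print(full.hex())              # bbf316e8d940af0ad3
print(first + second == full)  # True
```

Because encryption and decryption are the same XOR, the resumed state decrypts whatever the original state would have encrypted next.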
Why brute-force scanning for the state works
A random 256-byte window almost never forms a full permutation of 0..255: for uniformly random bytes the chance is 256!/256^256, roughly 10^-110.
So a practical technique is:
- Slide across the core bytes
- For each offset, test if the next 256 bytes are a permutation
- If yes, read the next 2 bytes as i and j
- Use RC4 PRGA with that state to decrypt config.bin
- Look for something that looks like a config/flag
I used a “printable ratio” heuristic and/or direct substring checks.
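Building a set for every offset costs on the order of 256 operations per byte of core. As an alternative (not what the solver in this writeup does), the permutation test can be done in a single pass by keeping running byte counts for the current window. A minimal sketch, with find_permutation_windows as a hypothetical helper name and a synthetic test buffer:

```python
from typing import List


def find_permutation_windows(buf: bytes, width: int = 256) -> List[int]:
    """Return start offsets of every window of `width` bytes with no repeated value."""
    counts = [0] * 256   # occurrences of each byte value inside the window
    distinct = 0         # how many byte values currently have count > 0
    hits: List[int] = []
    for pos, b in enumerate(buf):
        if counts[b] == 0:
            distinct += 1
        counts[b] += 1
        if pos >= width:
            old = buf[pos - width]  # byte that just slid out of the window
            counts[old] -= 1
            if counts[old] == 0:
                distinct -= 1
        if pos >= width - 1 and distinct == width:
            hits.append(pos - width + 1)
    return hits


# Identity permutation surrounded by filler; only offset 100 qualifies.
sample = b"\x80" * 100 + bytes(range(256)) + b"\x80" * 100
print(find_permutation_windows(sample))  # [100]
```

With the default width of 256, "no repeated value" is the same as "a permutation of 0..255". Each hit is still only a candidate: the two bytes after it are read as i and j and the result is judged by the decrypted output.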
>Step 6 — End-to-end solver
After extracting config.bin, the combined solver (solve_excavator.py) found a valid RC4 state at a specific core offset and produced plaintext:
{!!!Y0u_D1d_I7_l1k3_4n_3xC4vat0r!_w3eeell_d0ooone!!!}
That becomes the flag by wrapping in the required format:
shellmates{!!!Y0u_D1d_I7_l1k3_4n_3xC4vat0r!_w3eeell_d0ooone!!!}
>How to reproduce (commands)
From the challenge directory:
# unzip challenge
unzip -q excavator.zip -d extracted
# run the full solver
python3 solve_excavator.py extracted/core.522044
# if you want to see intermediate artifacts:
python3 carve_zip_from_core.py extracted/core.522044 carved_zips
python3 repair_zeroed_local_header.py carved_zips/<one>.zip repaired.zip
unzip -o repaired.zip -d recovered
xxd recovered/config.bin
>Solver Code (all scripts)
Below are the exact scripts used.
1) carve_zip_from_core.py
#!/usr/bin/env python3
"""Carve ZIP blobs from an ELF core by locating EOCD records.

This is a small CTF helper: given a core file, find all occurrences of the
ZIP End-Of-Central-Directory signature (PK\\x05\\x06), parse the EOCD, and
attempt to carve the full ZIP range based on central directory offset/size.

Usage:
    python3 carve_zip_from_core.py extracted/core.522044 out_dir
"""

from __future__ import annotations

import os
import struct
import sys
from dataclasses import dataclass

EOCD_SIG = b"PK\x05\x06"


@dataclass(frozen=True)
class Eocd:
    offset: int
    disk_no: int
    cd_start_disk: int
    disk_entries: int
    total_entries: int
    cd_size: int
    cd_offset: int
    comment_len: int


def parse_eocd(buf: bytes, off: int) -> Eocd | None:
    # EOCD fixed header is 22 bytes.
    if off < 0 or off + 22 > len(buf):
        return None
    if buf[off : off + 4] != EOCD_SIG:
        return None
    (disk_no, cd_start_disk, disk_entries, total_entries, cd_size, cd_offset, comment_len) = struct.unpack_from(
        "<HHHHIIH", buf, off + 4
    )
    if off + 22 + comment_len > len(buf):
        return None
    # sanity bounds
    if cd_size > len(buf) or cd_offset > len(buf):
        return None
    return Eocd(
        offset=off,
        disk_no=disk_no,
        cd_start_disk=cd_start_disk,
        disk_entries=disk_entries,
        total_entries=total_entries,
        cd_size=cd_size,
        cd_offset=cd_offset,
        comment_len=comment_len,
    )


def carve_zip(buf: bytes, eocd: Eocd) -> tuple[int, int] | None:
    # EOCD begins at zip_start + cd_offset + cd_size
    zip_start = eocd.offset - (eocd.cd_offset + eocd.cd_size)
    zip_end = eocd.offset + 22 + eocd.comment_len
    if zip_start < 0 or zip_end > len(buf) or zip_start >= zip_end:
        return None
    # Basic sanity: central directory signature should be present.
    cd_start = zip_start + eocd.cd_offset
    if cd_start < 0 or cd_start + 4 > len(buf):
        return None
    if buf[cd_start : cd_start + 2] != b"PK":
        return None
    return zip_start, zip_end


def main() -> int:
    if len(sys.argv) != 3:
        print(__doc__.strip())
        return 2
    core_path = sys.argv[1]
    out_dir = sys.argv[2]
    with open(core_path, "rb") as f:
        buf = f.read()
    os.makedirs(out_dir, exist_ok=True)

    # Collect every EOCD hit that parses and yields a plausible ZIP range.
    hits = []
    start = 0
    while True:
        idx = buf.find(EOCD_SIG, start)
        if idx == -1:
            break
        e = parse_eocd(buf, idx)
        if e is not None:
            rng = carve_zip(buf, e)
            if rng is not None:
                hits.append((e, rng))
        start = idx + 1

    if not hits:
        print("No carveable EOCD records found.")
        return 1

    for n, (e, (zs, ze)) in enumerate(hits, 1):
        out_path = os.path.join(out_dir, f"carved_{n}_start{zs}_end{ze}.zip")
        with open(out_path, "wb") as f:
            f.write(buf[zs:ze])
        print(
            f"[{n}] wrote {out_path} (len={ze-zs}) eocd_off={e.offset} cd_off={e.cd_offset} cd_size={e.cd_size} entries={e.total_entries}"
        )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
2) repair_zeroed_local_header.py
#!/usr/bin/env python3
"""Repair a ZIP whose first 30 bytes (local header fixed fields) were zeroed.

This matches the common CTF pattern where `memset(buf, 0, 30)` clobbers the
local-file-header fixed portion but leaves filename/extra/data intact.
We reconstruct the local header fixed fields from the first central directory
record and write a repaired zip.

Usage:
    python3 repair_zeroed_local_header.py in.zip out.zip
"""

from __future__ import annotations

import struct
import sys

CD_SIG = b"PK\x01\x02"
LFH_SIG = b"PK\x03\x04"
EOCD_SIG = b"PK\x05\x06"


def main() -> int:
    if len(sys.argv) != 3:
        print(__doc__.strip())
        return 2
    in_path, out_path = sys.argv[1], sys.argv[2]
    with open(in_path, "rb") as f:
        data = bytearray(f.read())

    eocd = data.rfind(EOCD_SIG)
    if eocd == -1:
        raise SystemExit("EOCD not found")

    # EOCD: sig(4) + H H H H I I H
    disk_no, cd_start_disk, disk_entries, total_entries, cd_size, cd_offset, comment_len = struct.unpack_from(
        "<HHHHIIH", data, eocd + 4
    )
    if total_entries < 1:
        raise SystemExit("No central directory entries")

    cd = cd_offset
    if data[cd : cd + 4] != CD_SIG:
        raise SystemExit("Central directory signature not found at cd_offset")

    # Central directory fixed fields (46 bytes including the signature)
    (
        ver_made,
        ver_needed,
        flags,
        method,
        mtime,
        mdate,
        crc32,
        comp_size,
        uncomp_size,
        fname_len,
        extra_len,
        comment_len2,
        disk_start,
        int_attr,
        ext_attr,
        lfh_offset,
    ) = struct.unpack_from("<HHHHHHIIIHHHHHII", data, cd + 4)

    if lfh_offset != 0:
        # This script is intentionally narrow: the memset in the challenge hits the start.
        raise SystemExit(f"Unexpected local header offset {lfh_offset}; expected 0")

    # Build local file header fixed part (30 bytes)
    lfh_fixed = struct.pack(
        "<4sHHHHHIIIHH",
        LFH_SIG,
        ver_needed,
        flags,
        method,
        mtime,
        mdate,
        crc32,
        comp_size,
        uncomp_size,
        fname_len,
        extra_len,
    )
    if len(lfh_fixed) != 30:
        raise AssertionError("LFH fixed size mismatch")

    data[0:30] = lfh_fixed
    with open(out_path, "wb") as f:
        f.write(data)
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
3) solve_excavator.py
#!/usr/bin/env python3
"""Solve the 'Excavator' challenge by recovering config from a damaged core.

Approach:
1) Extract a ZIP blob from the core whose local header was zeroed.
2) Repair the ZIP local header using central directory metadata.
3) Extract config.bin (ciphertext).
4) Scan the core for RC4 state candidates (a 256-byte permutation + i/j).
5) Decrypt config.bin with each candidate; stop when flag/config is found.

This script is intentionally self-contained and conservative.

Usage:
    python3 solve_excavator.py extracted/core.522044

Outputs recovered plaintext to ./recovered_config.txt when found.
"""

from __future__ import annotations

import os
import struct
import sys
from dataclasses import dataclass
from typing import Iterable

EOCD_SIG = b"PK\x05\x06"
CD_SIG = b"PK\x01\x02"
LFH_SIG = b"PK\x03\x04"


@dataclass(frozen=True)
class ZipCarve:
    zip_start: int
    zip_end: int
    eocd_off: int
    cd_off: int
    cd_size: int


def find_all(buf: bytes, needle: bytes) -> Iterable[int]:
    start = 0
    while True:
        idx = buf.find(needle, start)
        if idx == -1:
            return
        yield idx
        start = idx + 1


def parse_eocd(buf: bytes, off: int):
    if off + 22 > len(buf):
        return None
    if buf[off : off + 4] != EOCD_SIG:
        return None
    disk_no, cd_start_disk, disk_entries, total_entries, cd_size, cd_off, comment_len = struct.unpack_from(
        "<HHHHIIH", buf, off + 4
    )
    if off + 22 + comment_len > len(buf):
        return None
    if cd_off + cd_size > len(buf):
        return None
    return (cd_off, cd_size, comment_len, total_entries)


def carve_zips(core: bytes) -> list[ZipCarve]:
    out: list[ZipCarve] = []
    for eocd in find_all(core, EOCD_SIG):
        parsed = parse_eocd(core, eocd)
        if not parsed:
            continue
        cd_off, cd_size, comment_len, total_entries = parsed
        if total_entries < 1:
            continue
        zip_start = eocd - (cd_off + cd_size)
        zip_end = eocd + 22 + comment_len
        if zip_start < 0 or zip_end > len(core):
            continue
        cd_start = zip_start + cd_off
        if core[cd_start : cd_start + 4] != CD_SIG:
            continue
        out.append(ZipCarve(zip_start, zip_end, eocd, cd_off, cd_size))
    # Deduplicate (same start/end)
    uniq = {(z.zip_start, z.zip_end): z for z in out}
    return list(uniq.values())


def repair_zeroed_local_header(zip_bytes: bytes) -> bytes:
    data = bytearray(zip_bytes)
    eocd = data.rfind(EOCD_SIG)
    if eocd == -1:
        raise ValueError("EOCD not found")
    disk_no, cd_start_disk, disk_entries, total_entries, cd_size, cd_offset, comment_len = struct.unpack_from(
        "<HHHHIIH", data, eocd + 4
    )
    if total_entries < 1:
        raise ValueError("No central directory entries")
    cd = cd_offset
    if data[cd : cd + 4] != CD_SIG:
        raise ValueError("Central directory signature not found")
    (
        ver_made,
        ver_needed,
        flags,
        method,
        mtime,
        mdate,
        crc32,
        comp_size,
        uncomp_size,
        fname_len,
        extra_len,
        comment_len2,
        disk_start,
        int_attr,
        ext_attr,
        lfh_offset,
    ) = struct.unpack_from("<HHHHHHIIIHHHHHII", data, cd + 4)
    if lfh_offset != 0:
        raise ValueError(f"Unexpected local header offset {lfh_offset}")
    lfh_fixed = struct.pack(
        "<4sHHHHHIIIHH",
        LFH_SIG,
        ver_needed,
        flags,
        method,
        mtime,
        mdate,
        crc32,
        comp_size,
        uncomp_size,
        fname_len,
        extra_len,
    )
    data[0:30] = lfh_fixed
    return bytes(data)


def extract_stored_file_from_single_entry_zip(zip_bytes: bytes) -> tuple[str, bytes]:
    """Parse a single-entry ZIP (stored or deflated) without relying on external tools."""
    # Local file header
    if zip_bytes[:4] != LFH_SIG:
        raise ValueError("Bad local header signature")
    (
        sig,
        ver_needed,
        flags,
        method,
        mtime,
        mdate,
        crc32,
        comp_size,
        uncomp_size,
        fname_len,
        extra_len,
    ) = struct.unpack_from("<4sHHHHHIIIHH", zip_bytes, 0)
    header_len = 30 + fname_len + extra_len
    name = zip_bytes[30 : 30 + fname_len].decode("utf-8", errors="replace")
    data_start = header_len
    data_end = data_start + comp_size
    payload = zip_bytes[data_start:data_end]
    if method == 0:  # stored
        return name, payload
    if method == 8:  # deflate
        import zlib

        # raw DEFLATE stream
        decompressed = zlib.decompress(payload, -zlib.MAX_WBITS)
        return name, decompressed
    raise ValueError(f"Unsupported compression method {method}")


def is_permutation_0_255(block: bytes) -> bool:
    if len(block) != 256:
        return False
    # Fast-ish check: all bytes distinct and cover 0..255
    return len(set(block)) == 256


def rc4_decrypt_from_state(s: bytes, i0: int, j0: int, data: bytes) -> bytes:
    # Resume the RC4 PRGA from a saved (S, i, j) instead of running the key schedule.
    S = bytearray(s)
    i = i0
    j = j0
    out = bytearray(len(data))
    for n, b in enumerate(data):
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        k = S[(S[i] + S[j]) & 0xFF]
        out[n] = b ^ k
    return bytes(out)


def score_printable(data: bytes) -> float:
    if not data:
        return 0.0
    printable = 0
    for b in data:
        if b in (9, 10, 13) or 32 <= b <= 126:
            printable += 1
    return printable / len(data)


def main() -> int:
    if len(sys.argv) != 2:
        print(__doc__.strip())
        return 2
    core_path = sys.argv[1]
    with open(core_path, "rb") as f:
        core = f.read()

    # Step 1-3: carve/repair/extract config.bin
    zips = carve_zips(core)
    if not zips:
        print("No candidate ZIP blobs found in core")
        return 1
    config_bin = None
    config_zip_bytes = None
    for z in sorted(zips, key=lambda x: x.zip_end - x.zip_start):
        raw = core[z.zip_start : z.zip_end]
        try:
            repaired = repair_zeroed_local_header(raw)
            name, payload = extract_stored_file_from_single_entry_zip(repaired)
            if name.endswith("config.bin") and payload:
                config_bin = payload
                config_zip_bytes = repaired
                break
        except Exception:
            continue
    if config_bin is None:
        print("Failed to extract config.bin from any carved ZIP")
        return 1
    print(f"Extracted config.bin ({len(config_bin)} bytes)")

    # Step 4-5: scan for RC4 state and decrypt
    targets = [b"shellmates{", b"{", b"[", b"C2", b"http", b"https"]
    best = (0.0, None, None)
    # A candidate needs 256 bytes of S plus i and j, so the last valid offset is len(core) - 258.
    for off in range(0, len(core) - 257):
        s = core[off : off + 256]
        if not is_permutation_0_255(s):
            continue
        i0 = core[off + 256]
        j0 = core[off + 257]
        pt = rc4_decrypt_from_state(s, i0, j0, config_bin)
        if b"shellmates{" in pt:
            text = pt.decode("utf-8", errors="replace")
            open("recovered_config.txt", "w", encoding="utf-8").write(text)
            print(f"FOUND FLAG using state at core offset {off} (i={i0}, j={j0})")
            print(text)
            return 0
        sc = score_printable(pt)
        if sc > best[0]:
            best = (sc, off, pt)
        # Early accept: looks like JSON-ish and pretty printable
        if sc > 0.9 and any(t in pt for t in targets):
            text = pt.decode("utf-8", errors="replace")
            open("recovered_config.txt", "w", encoding="utf-8").write(text)
            print(f"Likely config using state at core offset {off} (i={i0}, j={j0})")
            print(text)
            return 0

    if best[1] is not None:
        sc, off, pt = best
        print(f"No flag found. Best printable candidate: score={sc:.3f} at offset={off}")
        print(pt)
    return 1


if __name__ == "__main__":
    raise SystemExit(main())
>References (format + crypto)
These are the references I leaned on for the “why this works” parts:
- PKWARE ZIP AppNote (official format details, including Local File Header = 30 bytes, Central Directory, EOCD):
- ZIP structure overview (EOCD/CDFH/Local header layout; helpful for quick recall):
- RC4 state and PRGA description (why S[256] + i, j is RC4):