The 'user uploaded a thing, normalize it before storing' workflow. Probe the file (image or audio/video) with media-info, decode any embedded barcodes/QRs, resize and thumbnail the image, convert it to a web-friendly format, and normalize audio loudness to a fixed LUFS target. Six tools, one canonical normalize-on-upload pipeline that turns arbitrary user uploads into predictable storage artifacts.
Every product that accepts user uploads ships the same five bugs: the receipt photo nobody can read because it's 11 MB and rotated; the voice memo that's 6 dB louder than every other voice memo because the user had headphone gain cranked; the QR code on the uploaded business card that the app never scanned because it looked at the filename instead of the bytes; the AVIF that the iOS WebView can't display; and the 4K hero image that blew up the thumbnail grid. This pack runs the canonical normalize-then-store pipeline so every uploaded artifact lands in storage with predictable dimensions, a predictable codec, predictable loudness, and any embedded payload (QR/barcode) already extracted into metadata. Pairs with document-intel when the upload is a PDF.
claude mcp add agent402 -s user -- npx -y agent402-mcp@latest
Then paste this prompt into Claude:
Normalize this user upload using Agent402.
Input: uploaded file at temp path /tmp/upload-abc123 (1 file per invocation).
Max stored long-edge: 2000px.
Thumbnail size: 200x200.
Target image format: WebP quality 82.
Audio target: -16 LUFS (voice-memo standard).
(1) media-info: return {kind: 'image'|'audio'|'video'|'other', format, codec, width, height, durationSec, bitrate, sampleRate, colorSpace, exifOrientation, declaredMime, detectedMime}. If declaredMime ≠ detectedMime, log 'mime mismatch' and trust detectedMime. Branch on kind: 'image' → steps 2-5, 'audio' → step 6 only, 'video' → return as-is with kind=video and stop (out of scope), 'other' → reject.
(2) barcode-decode on the image bytes: return {payload: '<decoded text>'|null, symbology: 'qr'|'ean13'|'upc'|...|null}. Pure-CPU. Null = no decodable barcode, that's fine.
(3) image-resize with maxLongEdge=2000, preserveAspect=true, applyExifRotation=true, stripExifGps=true: return {bytes: <resized>, width, height}.
(4) image-thumbnail with size=200, mode='cover': return {bytes: <thumb>}.
(5) image-convert on both step-3 output and step-4 output, format='webp', quality=82, skipIfAlreadyTarget=true: return {primary: {bytes, sizeBytes}, thumb: {bytes, sizeBytes}}.
(6) audio-normalize with targetLufs=-16, format='mp3': return {bytes, lufsBefore, lufsAfter, peakDbfs}. ONLY if step 1 said kind='audio'.
Final return: {kind, normalized: {primaryBytes: <ref>, primarySize, thumbBytes: <ref>, thumbSize, width, height, format} | audio: {bytes: <ref>, durationSec, lufsAfter, format}, metadata: {barcodePayload, originalSizeBytes, sizeSavingsPct, exifOrientation, declaredVsDetectedMime}, oneLineSummary: 'image normalized: 4032x3024 HEIC → 2000x1500 WebP (847KB → 162KB, 81% smaller), 1 QR decoded (https://example.com/menu/42), thumb 200x200 → 8KB' | 'audio normalized: 6m 12s, -23.4 LUFS → -16.0 LUFS, peak -1.2 dBFS'}.
media-info + barcode-decode + image-* + audio-normalize all involve ffmpeg/ffprobe/imagemagick under the hood. Egress is 0, but CPU is meaningful, so this is a wallet/paid pack, not PoW. Budget ~$0.06 per upload.
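To make the branching and the arithmetic in the prompt concrete, here is a minimal Python sketch of the decisions the pipeline describes: trusting the detected MIME type, choosing steps by kind, fitting the long edge to 2000px, and computing the size savings. The function names (`effective_mime`, `plan_steps`, `fit_long_edge`, `savings_pct`) are hypothetical helpers for illustration, not part of Agent402. The no-upscale behavior and the round-to-nearest-pixel rule are assumptions that the prompt does not specify.

```python
from typing import List, Tuple

# Tool names as listed in steps 2-5 of the prompt.
IMAGE_STEPS = ["barcode-decode", "image-resize", "image-thumbnail", "image-convert"]


def effective_mime(declared: str, detected: str) -> Tuple[str, bool]:
    """Trust the detected MIME type; the flag says whether to log 'mime mismatch'."""
    return detected, declared != detected


def plan_steps(kind: str) -> List[str]:
    """Map the kind reported by media-info to the tools that should run next."""
    if kind == "image":
        return list(IMAGE_STEPS)
    if kind == "audio":
        return ["audio-normalize"]
    if kind == "video":
        return []  # out of scope: returned as-is
    raise ValueError(f"rejected upload of kind {kind!r}")


def fit_long_edge(width: int, height: int, max_long_edge: int = 2000) -> Tuple[int, int]:
    """Scale so the longer side equals max_long_edge, preserving aspect ratio.

    Assumption: images already within the limit are left untouched (no upscaling).
    """
    long_edge = max(width, height)
    if long_edge <= max_long_edge:
        return width, height
    new_w = max(1, round(width * max_long_edge / long_edge))
    new_h = max(1, round(height * max_long_edge / long_edge))
    return new_w, new_h


def savings_pct(original_size: int, stored_size: int) -> int:
    """Percentage by which the stored artifact is smaller than the original."""
    return round((1 - stored_size / original_size) * 100)


if __name__ == "__main__":
    print(effective_mime("image/jpeg", "image/heic"))  # ('image/heic', True)
    print(plan_steps("audio"))                         # ['audio-normalize']
    print(fit_long_edge(4032, 3024))                   # (2000, 1500)
    print(savings_pct(847, 162))                       # 81
```

The last two calls reproduce the numbers in the example summary: a 4032x3024 source lands at 2000x1500, and 847KB → 162KB rounds to 81% smaller.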