From 600 lines of bash to a macOS app

The script that wouldn't stop growing

nightloop.sh started as 50 lines. A while loop, a task file, and a call to claude -p. If you've read my earlier post about the pipeline origin, you know that part. What I haven't talked about is the middle period. The months where the script grew from 50 lines to 600, and every bug I fixed made me think "this should be a real thing."

This post is about that transition. Not the pipeline pattern. The engineering decision to go from bash to native macOS app, and what that journey looked like as a solo developer.

Bug #1: Log contamination

Bash doesn't give you structured logging. You get stdout and stderr, and unless you're very careful, both end up interleaved in the same file.

My logs were a disaster. Claude's output, my echo statements, error messages from subshells, raw ANSI escape codes from tools that didn't realize they were being piped, all interleaved in one file. When a task failed at 3am, debugging meant scrolling through hundreds of lines of mixed output trying to figure out which chunk belonged to which task.

# What I wanted:
# [task-3] [precheck] Reading codebase... OK
# [task-3] [implement] Writing changes...
# [task-3] [validate] Running tests... FAIL

# What I got:
# Reading codebase...
# ←[32mOK←[0m
# src/utils.ts - exported 3 functions
# Writing changes...
# Error: ENOENT: no such file or directory (this was from a DIFFERENT task)
# Running tests...
# ←[31mFAIL←[0m

I tried fixing this. Added task IDs as prefixes. Redirected stderr separately. Used tee to split output. Each fix introduced a new problem. The task ID prefix broke when Claude's output contained newlines (which it always does). The stderr redirect lost error context. The tee approach created race conditions when tasks ran in parallel.
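For what it's worth, the newline problem usually comes from tagging the output once per chunk instead of once per line. Here is a minimal sketch of the per-line version. The helper name prefix_lines is hypothetical (it is not from nightloop.sh), and the sketch assumes the only escape sequences in the stream are plain color codes of the ESC[...m form.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from nightloop.sh.
# Reads stdin, strips ANSI color sequences, and tags every line
# with a task ID and a phase name.
prefix_lines() {
  local task="$1" phase="$2" esc line
  esc=$(printf '\033')   # literal ESC character for the sed pattern
  sed "s/${esc}\[[0-9;]*m//g" |
    while IFS= read -r line || [ -n "$line" ]; do
      # "|| [ -n ... ]" keeps a last line that has no trailing newline
      printf '[%s] [%s] %s\n' "$task" "$phase" "$line"
    done
}

# Usage (stderr is folded into the tagged stream so errors keep their task context):
# some_command 2>&1 | prefix_lines task-3 validate
```

This only keeps each line attributable to its task. It does nothing about parallel tasks writing to one file, where lines from different tasks can still interleave; one log file per task is the simpler way around that.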

After two weeks of log plumbing, my log handling code was longer than the actual pipeline logic. That's when the first "this should be a product" thought showed up.

Bug #2: Argument passing

Try passing a multi-line string that contains quotes, backticks, dollar signs, and markdown code blocks through three levels of bash function calls. I dare you.

# The task description:
# Implement the `formatPrice` function that takes a price 
# object like {"amount": 1999, "currency": "USD"} and 
# returns "$19.99"

# What Claude received after bash was done with it:
# Implement the  function that takes a price
# object like {amount: 1999, currency: USD} and
# returns 9.99

The JSON quotes got eaten. The dollar sign got interpreted as a variable: bash read $1 as a positional parameter, found it empty, and left only 9.99. The backticks got executed as a command substitution, so the function name disappeared along with them. I'd write a perfectly clear task description and the agent would receive something that looked like it went through a blender.

I fought this for a while. Aggressive quoting. Heredocs. Base64 encoding the task and decoding it before passing to Claude. That last one actually worked, but look at what I'd become: base64-encoding strings in a bash script to avoid argument corruption. This is not how software should work.
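For comparison, this is roughly what the quoting approach looks like when every level cooperates. It's a sketch under two assumptions: the task text comes from a heredoc with a quoted delimiter, and run_agent is a made-up stand-in for the real agent call (here it only prints what it received). The function names level_one and level_two are also hypothetical.

```shell
#!/usr/bin/env bash
# A quoted delimiter ('EOF') turns off $, backtick, and backslash
# expansion inside the heredoc, so the text is stored as written.
task=$(cat <<'EOF'
Implement the `formatPrice` function that takes a price
object like {"amount": 1999, "currency": "USD"} and
returns "$19.99"
EOF
)

# Hypothetical stand-in for the real agent call; it just prints its argument.
run_agent() { printf '%s\n' "$1"; }

# Each level forwards "$1" inside double quotes: the value is passed as
# one word and is never re-parsed by the shell.
level_two() { run_agent "$1"; }
level_one() { level_two "$1"; }

level_one "$task"
```

The fragile part is that a single unquoted $1, an eval, or an ssh hop anywhere in the chain brings the corruption back, which is why it's so easy to get wrong in a script that keeps growing.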

Bug #3: The lockfile disaster

I needed to make sure only one instance of nightloop.sh ran at a time. Multiple instances modifying the same git repo simultaneously would be catastrophic. So I implemented a lockfile.

Problem: macOS doesn't ship the flock command. Linux does, macOS doesn't. Apple ships shlock in the developer tools, which has different semantics. So I wrote my own lockfile implementation using mkdir, which either creates the directory or fails (atomic on most filesystems).

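acquire_lock() {
  local lockdir="/tmp/nightloop.lock"
  # mkdir either creates the directory or fails, so only one caller wins.
  if mkdir "$lockdir" 2>/dev/null; then
    # Double quotes expand $lockdir now. It is a local variable, so with
    # single quotes it would be empty by the time the EXIT trap runs.
    trap "rm -rf '$lockdir'" EXIT
    return 0
  else
    return 1
  fi
}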

This worked. Until it didn't.

If the script crashed hard (segfault in a subprocess, OOM kill, power loss), the trap never fired and the lockfile stayed. Next time I ran the script, it thought another instance was running and refused to start. I'd have to manually delete /tmp/nightloop.lock before it would run again.

So I added stale lock detection. Store the PID inside the lock directory, check whether that process still exists, and remove the lock if it's dead. That worked too. Until macOS reused the PID for a different process and my stale lock check said "yep, that process is alive" even though it was a completely unrelated app.

I spent an afternoon making my lock detection check both the PID and the process name. Three nested conditionals and a call to ps parsed with awk. For a lockfile. In a bash script.
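The PID-plus-name check can be sketched like this. It is a reconstruction of the idea, not the original code: instead of parsing ps with awk, it records the owner's command name next to the PID when the lock is taken and compares against that later. The file names pid and comm, the LOCKDIR variable, and the helper names are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only. LOCKDIR defaults to the path used in the post.
LOCKDIR="${LOCKDIR:-/tmp/nightloop.lock}"

# Print the command name of a PID, or nothing if ps cannot find it.
proc_name() {
  ps -o comm= -p "$1" 2>/dev/null
}

# Returns 0 when the lock's owner is gone or is a different program.
lock_is_stale() {
  local pid owner
  pid=$(cat "$LOCKDIR/pid" 2>/dev/null) || return 0   # no PID recorded
  owner=$(cat "$LOCKDIR/comm" 2>/dev/null)
  kill -0 "$pid" 2>/dev/null || return 0              # process is dead
  [ "$(proc_name "$pid")" != "$owner" ]               # PID reused by something else
}

acquire_lock() {
  if ! mkdir "$LOCKDIR" 2>/dev/null; then
    lock_is_stale || return 1
    # Not race-free: two contenders could both decide the lock is stale.
    rm -rf "$LOCKDIR"
    mkdir "$LOCKDIR" 2>/dev/null || return 1
  fi
  printf '%s\n' "$$" > "$LOCKDIR/pid"
  proc_name "$$" > "$LOCKDIR/comm" || true
  return 0
}
```

Even this version has holes. A reused PID that happens to belong to another bash process still looks alive, and kill -0 also fails for processes owned by a different user, so those get treated as dead.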

Bug #4: No visual feedback

When nightloop.sh was running, I had no idea what it was doing unless I tailed the log file. Which task is it on? How far along is it? Did something fail 20 minutes ago and it's been stuck in a retry loop ever since?

I tried a few things. A status file that got updated after each task. A simple ncurses-style progress display. A Slack webhook that sent updates to a channel. Each one was fragile and added more code to maintain.

What I really wanted was a window. A simple UI where I could see the pipeline state, the current task, the logs per task, and a big obvious indicator when something needed my attention. Bash can't give you that. Not in any way that isn't a complete hack.

Bug #5: Zero error recovery

When a task failed in nightloop.sh, the script had three options: retry, skip, or stop. That's it. No way to pause the pipeline and let me intervene. No way to edit the task and rerun just that one. No way to see what went wrong in context and make a decision.

The retry logic got complicated fast. Exponential backoff, max retry counts, different strategies for different failure types. But every strategy was hardcoded. If I wanted to change the retry behavior, I was editing bash conditionals at midnight.
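To make the "hardcoded" part concrete, here is roughly what a retry wrapper with exponential backoff looks like in bash. This is a sketch of the general technique, not the code from nightloop.sh; the function name and the argument order (max attempts, initial delay in seconds, then the command) are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command up to $1 times, sleeping between
# attempts and doubling the delay (starting at $2 seconds) each time.
retry_with_backoff() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while true; do
    "$@" && return 0                      # success: stop retrying
    [ "$attempt" -ge "$max" ] && return 1 # out of attempts: give up
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage (run_task is a placeholder for whatever runs one pipeline task):
# retry_with_backoff 4 5 run_task task-3
```

The wrapper itself is short. What grows is everything around it: choosing a different strategy per failure type means more branches at every call site, and none of it can change without editing the script.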

And dependency handling was the worst. If Task 3 failed and Task 7 depended on it, the script needed to skip Task 7. But what about Task 9, which depended on Task 7? The dependency graph traversal was 80 lines of bash that I was genuinely afraid to touch because I couldn't write tests for it.
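The skip propagation itself can be sketched in a few lines if you accept some constraints. Everything here is an assumption for illustration: the input format (one task per line, followed by the IDs it depends on), the rule that tasks are listed after their dependencies, and the function name. It avoids associative arrays because the bash that ships with macOS is version 3.2, which doesn't have them.

```shell
#!/usr/bin/env bash
# Hypothetical helper. Reads "task dep dep ..." lines from stdin, with
# dependencies listed before the tasks that need them, and prints every
# task that must be skipped because something upstream failed.
# $1 is a space-separated list of failed task IDs.
skipped_tasks() {
  local blocked=" $1 " task deps dep
  while read -r task deps; do
    # $deps is left unquoted on purpose so it splits into IDs;
    # IDs are assumed to be simple words without glob characters.
    for dep in $deps; do
      case "$blocked" in
        *" $dep "*)
          # A blocked dependency blocks this task too, which is what
          # carries the failure from Task 3 to Task 7 to Task 9.
          blocked="$blocked$task "
          printf '%s\n' "$task"
          break
          ;;
      esac
    done
  done
}

# Usage:
# skipped_tasks "3" < deps.txt
```

With a graph where 7 depends on 3 and 9 depends on 7, a failure of 3 yields 7 and 9. The ordering requirement is doing a lot of work here: drop it and you need a real topological sort, which is where the 80 lines came from.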

The decision

By the time nightloop.sh hit 600 lines, I was spending more time maintaining the script than using it. Every fix created a new edge case. Every feature required plumbing through bash's limitations. I was fighting the tool instead of using it.

I listed everything that was wrong:

No structured logging
Argument corruption
Platform-specific lockfile issues
No UI, no visibility
Brittle error recovery
Untestable dependency logic
Three copies of the script on different machines, each with different bugs

And I listed what I actually wanted:

Process isolation per task
Structured, per-task logs
A visual pipeline display
Pause, resume, intervene
Proper dependency graph execution
Retry with configurable strategies
One install, works everywhere (on macOS at least)

That second list is a native app. Not a better script. Not a Python rewrite. A real application with a real UI and real process management.

Going native with Swift

I chose Swift because I'm building for macOS and I wanted it to feel right. Not an Electron app with 400MB of RAM usage for a task runner. Not a web app in a wrapper. A native macOS app that uses system frameworks, respects macOS conventions, and doesn't make your fans spin when it's idle. That bash script eventually became Zowl.

SwiftUI for the interface. Proper process management through Foundation's Process class instead of bash subshells. Structured logging with levels, timestamps, and per-task isolation. A dependency graph engine that's actually testable.

The first version took about six weeks. It could do everything nightloop.sh could do, minus the bugs, plus a UI. Seeing my pipeline state in a real window instead of tailing a log file felt like upgrading from a bicycle to a car.

What I kept, what I threw away

I kept the three-step pattern. Pre-check, implement, validate. That was the right architecture. Bash or Swift, it doesn't matter. Agents need guardrails before and after they write code.

I kept the philosophy of running overnight. The whole point was always "set it up, go to sleep, review in the morning."

I threw away every line of bash. The lockfile system, the log parser, the argument escaping, the retry spaghetti, the dependency traversal. All of it. The problems those 600 lines were solving either disappeared in a native app context or had proper solutions in Swift.

The lockfile problem? Gone. The app is a single process. By default, Launch Services activates the already-running instance instead of starting a second one, so macOS handles that for you.

The argument passing problem? Gone. Strings in Swift are strings. Process takes its arguments as an array and hands them to the child directly, so they never go through shell expansion.

The logging problem? Gone. Each task gets its own log stream. No interleaving. No ANSI codes. Structured data I can display in a table view.

The thing about bash

I'm not bashing bash (sorry). It's a great tool for what it's designed for. Gluing commands together, quick automation, system admin tasks. nightloop.sh was the right tool for figuring out whether the idea worked. Prototyping in bash is fast. You can test a concept in an afternoon.

But there's a point where a script wants to be a program. When you're base64-encoding arguments to avoid corruption, when your lockfile code is more complex than your business logic, when you can't add a feature without breaking two others, that's the point.

nightloop.sh told me the idea was worth building. Swift let me build it for real. I still have the script on my machine. 637 lines. I open it sometimes when I need to remember that every good tool starts as a bad script that refused to stay small. If you're thinking about building a better pipeline harness, you might want to read up on the pipeline origin, or on how to run 60 tasks in a single night with the right infrastructure.