# 18 Months of Learning to Build Software with LLMs
_How someone with product management experience learned to ship code, contribute to open source, and build production applications using AI tools_
> [!metadata]- Metadata
> **Published:** [[2025-08-30|August 30, 2025]]
> **Tags:** #🌐 #development #documentation #learning
![[learning-to-build-software-with-llm.jpg]]
## Act I: The Foundation (Before Claude Code)
Before discovering Claude Code in May 2025, I spent 14 months failing forward with various AI tools. I'd been using [Perplexity](https://perplexity.ai/) and ChatGPT for research since 2023, but March 2024 marked when I first attempted using AI for actual coding.
My background gave me unexpected advantages. Fifteen years in product management meant I could spec software, identify edge cases, and think systematically about problems. I'd understood APIs and methods from my PM work long before I started tinkering with [Apple Shortcuts](https://support.apple.com/guide/shortcuts/welcome/ios), where I spent five years building complex automations that taught me logical flow and practical API implementation. But writing actual code? That was foreign territory.
### The Obsidian Disaster
My first attempt should have been simple: use [Cursor IDE](https://cursor.sh/)'s AI features to organize my [Obsidian](https://obsidian.md/) vault with proper YAML frontmatter and tags.
I fired off crude prompts: "find all my restaurant-related notes and tag them properly" and "find all my daily notes, identify duplicates and reorganize my thoughts accordingly." I had no understanding of context management or how to create boundaries for LLMs.
Two weeks later, my vault was destroyed. Notes vanished, YAML properties corrupted, content mangled. Each fix attempt made things worse. I watched years of notes turn into digital confetti.
I learned that developers use version control for a reason. Thankfully, I'd made backups.
### The Xcode Nightmare
By June 2024, I tried building my dream RSS reader as an iOS app. I'd been using [Lire](https://apps.apple.com/ca/app/lire/id1482527526?mt=12) and [Reeder 5](https://apps.apple.com/ca/app/reeder-classic/id1529448980?mt=12) connected to [Inoreader](https://inoreader.com/), but nothing matched my exact requirements.
Cursor with Claude Sonnet 3.5 gave instructions for Xcode interface elements that didn't exist. The LLMs had poor GUI knowledge, and I still hadn't learned proper prompting. Worse, I wasn't using version control—not because I didn't know about Git, but because I couldn't handle another variable while everything else was failing.
The project collapsed spectacularly.
### Learning to Create Boundaries
These disasters taught me the fundamental lesson: LLMs need "a box within which they can run around." Without clear boundaries and context, they hallucinate wildly. More importantly, I realized LLMs work far better with command-line tools than GUIs.
This insight would prove transformative.
### The First Success
By mid-2024, I faced a practical problem. I'd switched from Obsidian Sync to iCloud for better iPad workflow—the new "keep files downloaded" feature finally made it viable—but lost the note versioning capability that Obsidian Sync provided. My solution: back up the vault to GitHub.
The [Obsidian Git plugin](https://github.com/Vinzent03/obsidian-git) had iPad syncing issues, so I used Cursor to create a commit/push script running on my always-on Mac Mini. I automated it with [Keyboard Maestro](https://www.keyboardmaestro.com/) running twice daily—deliberately avoiding cron jobs because I knew I couldn't manage them properly, even though LLMs could have easily created them. Sometimes knowing your limitations is wisdom.
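The script itself is nothing clever. Here's a minimal sketch of what it does, with the vault path and branch as illustrative stand-ins for my real setup:

```bash
#!/bin/bash
# Commit and push the Obsidian vault, but only if something actually changed
cd "$HOME/Obsidian/MainVault" || exit 1

git add -A
if ! git diff --cached --quiet; then
  git commit -m "Vault backup: $(date '+%Y-%m-%d %H:%M')"
  git push origin main
fi
```

Keyboard Maestro simply runs it on a schedule; the `--quiet` check keeps the history free of empty commits.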
This wasn't my first taste of the terminal—I'd been using iTerm2 for ad hoc commands for a decade. But it was my first time directly interacting with Git, and I realized that despite my lack of knowledge, LLMs could guide me through version control effectively. The `gh` CLI unlocked sophisticated GitHub interactions. For the first time, GitHub + LLM + me worked efficiently together.
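To give a sense of what "sophisticated" meant at my level, these are the kinds of `gh` invocations Claude walked me through (the repo and issue details here are made up for illustration):

```bash
# Create a private repo from the current directory and push it, no browser needed
gh repo create obsidian-vault-backup --private --source . --push

# Check what's waiting on me, or file an issue without leaving the terminal
gh pr list --state open
gh issue create --title "Vault sync fails on iPad" --body "Steps to reproduce: ..."
```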
### Building Momentum
From mid-2024 through May 2025, I built project after project, each solving a specific personal workflow problem where existing tools didn't meet my needs:
- **Calculator** (June 2024): My first Python project. It only did basic calculations, but it taught me fundamental programming concepts
- **Car Compare** (December 2024): A Canadian car comparison platform that failed spectacularly. I learned how difficult web scraping can be and how LLMs randomly suggest and install packages without asking. For someone with no understanding of MVC frameworks, this created chaos. I didn't even know what `package.json` was for or what `npm run dev` did. Running multiple `npm run dev` commands created port conflicts that agentic tools couldn't manage without proper instruction sets (see the port-check sketch after this list)
- **URL Converter** (December 2024): [Next.js](https://nextjs.org/) PWA for converting URLs to clean markdown format—built because existing tools either had poor formatting or required multiple steps
- **Git Monitor** (May 2025): Custom [BetterTouchTool](https://folivora.ai/) menu bar tool showing which repositories need pulling/pushing—nothing else gave me this at-a-glance status across multiple repos
- **Config Sync** (May 2025): Bidirectional rsync tool for keeping development environment settings consistent across devices—existing sync tools were either too complex or didn't handle my specific file patterns
- **Snippet AI** (May 2025): Claude AI-powered [Alfred](https://www.alfredapp.com/) snippet management system for intelligent text expansion—wanted snippets that could adapt based on context, not just static replacements
- **Vault Web** (June 2025): Web interface for my Obsidian vault using [Docker](https://www.docker.com/). [LinuxServer.io](https://linuxserver.io/) helped significantly. Claude suggested running it on [Colima](https://github.com/abiosoft/colima) (which I'd never heard of), and now all my Docker containers still run on Colima. Built because I needed web access to my vault when native apps weren't available
- **Reddit Analyzer** (May 2025): GPT-4.1 powered tool for extracting business insights from subreddit discussions. Still use this to identify potential product ideas for both personal projects and consulting work—no existing tool provided the specific analysis I needed
- **Docs Server** (June 2025): Documentation-focused [Model Context Protocol](https://modelcontextprotocol.io/) server for creating and accessing personal documentation on various APIs and tools. Creates nifty cheat sheets too—built because I was tired of searching through multiple documentation sites
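The port-conflict mess from Car Compare, for instance, came down to a check I can now run in seconds. A minimal sketch, assuming the Next.js default of port 3000:

```bash
# See what's already listening on the dev port (usually a stray "npm run dev")
lsof -i :3000 -sTCP:LISTEN

# Or kill it outright; -t prints just the PID
kill "$(lsof -ti :3000 -sTCP:LISTEN)"
```

Back then I didn't know `lsof` existed, so every conflict looked like the framework breaking for no reason.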
Each project pushed me deeper into developer territory. But the crown jewel—my RSS reader—remained frustratingly out of reach.
## Act II: The Struggle (Multiple Tool Attempts)
### The RSS Reader Saga: Versions 1-12
From July 2024 to January 2025, I kept returning to my RSS reader project with tenacity. I'd been a news junkie for years, using [Lire](https://apps.apple.com/ca/app/lire/id1482527526?mt=12) and [Reeder 5](https://apps.apple.com/ca/app/reeder-classic/id1529448980?mt=12) connected to my [Inoreader](https://inoreader.com/) account. Inoreader pre-processed my feeds, removed duplicates, filtered unwanted topics—but no RSS reader app matched my exact taste for how I wanted to consume news.
The first attempt in June had failed spectacularly with Xcode. So I pivoted to building a PWA—surely web development would be simpler.
It wasn't.
I created twelve different repositories. Not branches—entirely new repos named `rss-reader-v1` through `rss-reader-v12`. Each represented a strategic pivot when I couldn't fix what I'd broken. Every time I'd learn something new, some new technique, I'd get excited again to restart. Sometimes I'd pivot on the scope in the hope of building it better next time.
Every 2-3 weeks, the pattern repeated: initial excitement, rapid progress on basic features, then integration challenges. State management remained confusing. Authentication flows were unpredictable. The build system was fragile.
Starting from version 3, I began applying my PM skills more deliberately—creating comprehensive PRDs before diving into code, just like I'd always done for my day job. It helped, but not enough.
I tried everything:
- **Cursor IDE**: Great for file management and editing, but the agentic tools could never build what I wanted at my skill level. Its autocomplete abilities were useless since I wasn't writing code myself
- **Windsurf IDE**: Similar limitations with slightly different flavoring
- **Raw ChatGPT**: Helpful for concepts, useless for implementation
- **Codex CLI** (OpenAI): Promising but horrible UX; I never found it capable of handling entire projects until GPT-5 came out
- **Gemini CLI**: Technically capable but somehow never clicked—extremely verbose and an expert in over-engineering solutions
### The Evolution Through Failure
What I didn't realize at the time was that each failed version was teaching me something crucial.
By version 6, I'd stopped using the GitHub Desktop app entirely, moving to the command line. By version 10, the terminal was becoming a natural part of my workflow. I had aliases like:
```bash
# SSH into the always-on Mac Mini and land in my projects directory
alias macmini='ssh [email protected] -t "cd ~/DevProjects; clear; hostname; exec $SHELL"'
```
This wasn't about the RSS reader specifically—it showed how the terminal was slowly becoming central to my life.
I still needed Claude to decipher error messages and explain Git operations, but I was learning when to use these commands and how to interpret the patterns in what Claude showed me.
### The Persistence
By January 2025, after six months and twelve versions, the PWA was functional enough for personal use but nowhere near shippable. It was a pet project—I had nothing to prove to anyone. Despite the occasional disappointments, I was enjoying the journey and learning with each iteration.
I could see what needed to be built. Years of PM experience meant I knew exactly what features were needed, how they should work, what edge cases to handle. The gap between vision and execution was narrowing, but it was still there.
### The Missing Piece
By January 2025, I had knowledge. I had tools. I had persistence (12 versions worth!). But I couldn't ship.
Then everything changed.
## Act III: The Breakthrough (Claude Code Era)
### Discovering Claude Code CLI
When I first tried [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) (on May 24, 2025), the difference was immediate. Claude Code lived in the terminal where I'd become comfortable. By this time, after all my failures, I'd learned how to make LLMs maintain context across sessions—how to manage project memories, create instruction files, and structure prompts to preserve continuity. Claude Code fit perfectly into this learned methodology.
I started with the API directly—pay per use. Within five days, I'd burned through C$83 in credits. But in those same five days, I started building shippable software. After 12 failed repository versions over six months, I started fresh and suddenly had working code. That proved its value immediately.
I switched to the flat ~C$150/month MAX 5x plan. Within a month, I realized it was creating enough value that I wanted more access to their frontier model Opus (it was still Opus 4 at that time). The next month, I upgraded to the MAX 20x plan at ~C$325/month. According to [Claudia](https://opcode.sh/) (they recently rebranded as `opcode`), if I'd stayed on pay-per-use APIs, the RSS reader alone would have cost over C$7,000.
Worth every penny. The fact that I could build shippable code within 5 days—something I couldn't do in six months of attempts—justified the cost instantly.
By early August 2025, I discovered [Serena MCP](https://github.com/oraios/serena) through a [Reddit post](https://www.reddit.com/r/ClaudeAI/comments/1lfsdll/try_out_serena_mcp_thank_me_later/). This became the missing piece for long-term project memory. While my `CLAUDE.md` files could guide individual sessions, Serena maintains persistent project context across weeks and months—understanding not just what was built, but why architectural decisions were made and which approaches failed.
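Wiring an MCP server into Claude Code is a one-liner. The Serena invocation below is the general shape rather than gospel; check Serena's README for the exact server command:

```bash
# Register Serena for the current project (verify the command against Serena's docs)
claude mcp add serena -- uvx --from git+https://github.com/oraios/serena serena start-mcp-server
```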
### The Terminal Takes Over
Claude Code fundamentally changed how I worked. My workflow became terminal-heavy—the first app I'd open on my Mac. This shift led to crucial discoveries.
A few weeks in, I discovered [tmux](https://github.com/tmux/tmux), which "really changed my life" for iPad coding. Combined with [Tailscale](https://tailscale.com/), it eliminated my biggest workflow friction.
No more GitHub push from Mac Mini → pull on MacBook Air → work → push → pull on Mini. No more `.env` files out of sync. No more losing context when switching devices.
Now I could SSH into my Mac Mini from my iPad, attach to a persistent tmux session, and continue exactly where I left off. The code was always running on the Mini, so the environment was always consistent. I could start debugging on my MacBook Air at my desk, continue on my iPad from the couch, and even make emergency fixes from my iPhone while walking the dog—all in the same session, with the same state, without a single git commit.
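In practice the device-hopping boils down to two commands, assuming Tailscale is already connecting the machines (the session name is just an example):

```bash
# From the iPad (Blink Shell) or the MacBook Air, over Tailscale
ssh macmini

# Attach to the long-running session if it exists, otherwise create it
tmux new-session -A -s dev
```

Detaching (prefix + d) leaves Claude Code running on the Mini, which is exactly what makes picking up a debugging session from another device possible.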
I switched from iTerm2 to [Ghostty](https://ghostty.org/) on the Mac, from Termius to [Blink Shell](https://blink.sh/) on iPad. Everything revolved around the terminal now.
### Why Claude Code Succeeded Where Others Failed
- **CLI-based**: The primary reason—I could use it on my iPad via SSH. No IDE could match this
- **Terminal-native**: Fit perfectly into the workflow I'd gradually built
- **Custom slash commands**: Unlike Cursor, I could create my own commands for specific workflows
- **Sub-agent architecture**: Could build specialized agents that Cursor doesn't support
- **MCP server integration**: Better and quicker at using MCP servers than alternatives
- **Context handling**: Better at maintaining context for my specific use case (though I could have made Cursor retain memory with the same strategies)
- **Permission model**: Running with `--dangerously-skip-permissions` meant I wasn't interrupted by constant permission prompts, while I still controlled what's allowed and what's not through Claude hooks, slash commands, and my `.claude.local.json` file.
(Cursor launched a CLI version [two weeks ago](https://cursor.com/cli), but it's currently far behind Claude Code's feature set.)
### The Army of Sub-Agents
I created a workflow with specialized sub-agents (these are just examples from my larger collection; a sketch of how one is defined follows the list):
- `db-expert`: Handles all [Supabase](https://supabase.com/) database operations, schema design, and query optimization
- `linear-expert`: Manages [Linear](https://linear.app/) project management, creates issues, tracks progress
- `tech-expert`: Provides architecture analysis and implementation guidance for complex features
- `infra-expert`: Solves build problems, TypeScript compilation issues, dependency conflicts
- `devops-expert`: Manages [PM2](https://pm2.keymetrics.io/) service monitoring, deployment processes
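For anyone wondering what a sub-agent actually is on disk: just a markdown prompt with some frontmatter. A rough sketch of how `db-expert` might be defined, assuming Claude Code's `.claude/agents/` convention (the body is paraphrased, not my real prompt):

```bash
mkdir -p .claude/agents
cat > .claude/agents/db-expert.md <<'EOF'
---
name: db-expert
description: Handles Supabase schema design, migrations, and query optimization
---
You are the database specialist for this project. Read the existing
migrations before proposing schema changes, and explain the trade-offs
of any new index or constraint you suggest.
EOF
```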
### My Workflow with Custom Slash Commands
I also created a set of sequential commands that automated my entire development flow:
#### `/get-project-context`
Loads all project knowledge including README files, architecture decisions, past implementation patterns, and known issues. This ensures Claude Code starts with full awareness of project history, preventing repeated mistakes and maintaining consistency with existing code patterns. When combined with Serena MCP's memory capabilities, this command can access weeks of accumulated project wisdom—understanding not just the current state but the evolution of the codebase and why certain approaches were abandoned.
#### `/plan`
Analyzes requirements by breaking down the task, identifying dependencies, and coordinating which sub-agents will be needed. Creates a step-by-step implementation strategy that considers edge cases, potential conflicts with existing code, and defines clear success criteria before writing any code. Serena helps here by recalling similar problems solved in the past and which development patterns worked versus failed, preventing me from repeating mistakes from three weeks ago that my `CLAUDE.md` file might not capture.
#### `/stage`
Prepares the workspace by checking for uncommitted changes, ensuring all dependencies are installed, and verifying the development environment is ready. Sets up any necessary test data, environment variables, or configuration files needed for the upcoming implementation.
#### `/test-design`
Establishes testing strategies before implementation begins, defining what tests need to be written and what scenarios must be covered. This test-first approach ensures I'm building toward specific, measurable outcomes rather than coding blindly and hoping it works.
#### `/execute`
The actual implementation phase where appropriate sub-agents are called with accumulated context from previous steps. Each agent works within its domain (database, frontend, API, etc.) while maintaining awareness of the overall plan and how their piece fits into the larger solution. Serena's understanding of development patterns guides the LLMs toward proven solutions rather than reinventing wheels—it knows when to use a factory pattern versus a singleton, knowledge I certainly don't possess but can leverage through proper tooling.
#### `/code-review`
Automated quality checks that look for common issues like unused variables, inconsistent naming, missing error handling, or violations of project conventions. Acts like a senior developer reviewing code, suggesting improvements and catching problems before they make it into the codebase. Serena enhances this by remembering project-specific conventions and architectural decisions that might not be documented elsewhere.
#### `/test`
Runs the test suite and validates that new code works as expected without breaking existing functionality. Goes beyond just running tests—analyzes failures, suggests fixes, and ensures test coverage is adequate for the new features.
#### `/document`
Updates all relevant documentation including README files, API docs, inline code comments, and architecture decision records. Ensures future me (or other developers) will understand why decisions were made and how to use or modify the new code. Serena maintains a deeper layer of documentation—the unwritten knowledge about what approaches were tried and failed, which becomes invaluable context for future development sessions.
#### `/commit-push`
Handles version control with meaningful commit messages that follow conventional commit standards. Manages git operations safely, checking for conflicts, organizing changes into logical commits, and pushing to the appropriate branch.
#### `/prepare-for-release`
Manages production deployment including building production assets, updating version numbers, creating release notes, and running final smoke tests. Ensures the code is truly ready for users, not just "works on my machine" ready.
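Like the sub-agents, each command is just a markdown prompt file that Claude Code picks up by name. A hedged sketch of what `/commit-push` might contain, assuming the `.claude/commands/` convention (the wording is illustrative):

```bash
cat > .claude/commands/commit-push.md <<'EOF'
Review the staged and unstaged changes, group them into logical commits,
and write conventional-commit messages (feat:, fix:, docs:, chore:).
Check for conflicts before pushing to the current branch, and stop to ask
me if any file looks like it contains secrets.
EOF
```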
### The RSS Reader Lives
After 12 failed versions, I made a crucial decision. I deleted all the old repositories and created a fresh one for the project—no more `rss-reader-v1` through `v12` cluttering my GitHub. This clean slate marked a mental shift: I was ready for production.
The PWA version was initially built just for me, but when I decided to target the App Store, I forked it to a private repository and converted it to a React iOS app. I also needed to add a way for users to authenticate through the app's UI rather than an `.env` file. The original PWA remains [public on GitHub](https://github.com/shayonpal/rss-news-reader)—my commitment to open source. Once I figure out a monetization strategy that works with GPL licensing, I plan to open-source the iOS app code as well.
I'd learned to use multiple LLMs strategically: [Claude](https://claude.ai/) for backend logic, [OpenAI's models](https://openai.com/) for Xcode navigation and frontend work.
After 6-7 rounds with Apple (icon sizes, descriptions, screenshots), my RSS reader hit the App Store.
100+ users in two months. Most found it organically. They actually pay for subscriptions.
### Contributing Back
With Claude Code, I could finally give back. My first PR to [PopClip Extensions](https://github.com/pilotmoon/PopClip-Extensions/pull/1280)—merged within a day. Three more PRs followed for different tools and projects I use personally, where I'd found helpful use cases that didn't exist in the original versions. Two merged, one pending.
It felt good to contribute something, however small, to projects that had helped me learn.
## The Tools That Made It Possible
My current stack, refined through painful trial and error:
**Development**:
- Claude Code CLI (primary) with custom `CLAUDE.md` instructions
- [Ref.tools](https://ref.tools/) for documentation indexing—can index entire git repos and custom docs, providing contextual documentation right when I need it. At US$9 per 1,000 credits with no flat monthly subscription, the cost is quite steep, but I still use it because nothing else indexes custom documentation as effectively
- [Context7](https://github.com/upstash/context7) for additional documentation reference, as a fallback when Ref.tools comes up short
- [Perplexity MCP server](https://github.com/perplexityai/modelcontextprotocol) & [Brave Search MCP](https://github.com/brave/brave-search-mcp-server) for research
**Terminal & Remote**:
- Ghostty + tmux on Mac
- [Blink Shell](https://blink.sh/) + tmux on iPad
- Custom tmux config with ⌃+K prefix (repurposed Caps Lock as hyper key on the Macs and as ⎋ on the iPad)
- Tailscale for coding from anywhere
**Project Management**:
- [Linear](https://linear.app/) with MCP integration
- GitHub issues for public projects
**Voice & Input**:
- [SuperWhisper](https://superwhisper.com/) for dictation with AI formatting
## What Changed
I discovered an effective problem-solving technique: asking Claude to interview me with multiple-choice questions. When I'm stuck on a feature or architecture decision, instead of trying to explain everything at once, I have Claude ask me targeted questions one by one. "What's the primary goal: A) Performance optimization, B) User experience improvement, C) Code maintainability, or D) Something else?" Each answer leads to more specific questions, gradually building the complete picture.
This atomic thinking process forces me to consider options I hadn't thought of. The multiple-choice format often includes possibilities beyond my initial thinking—that option D has led to breakthrough solutions multiple times. It's like having a senior developer interview me about my requirements, except this senior developer has infinite patience and helps me think through problems systematically. Piece by piece, unclear requirements become detailed specifications.
The terminal became my primary interface—first app I open. The depth of this shift still surprises me: I manage personal tasks in [Todoist](https://todoist.com/) via terminal commands, write journal entries directly to my Obsidian vault through a [[Building a MCP Server for Obsidian Notes|custom MCP server I built]] (with LLM assistance) specifically for my vault's YAML structure. I review PRs, deploy services, debug issues—all without leaving the command line. This isn't about being a "real developer"; it's about finding a workflow that finally clicked.
Here's a glimpse of my tmux configuration:
```bash
# Set Ctrl+K as the new prefix (instead of Ctrl+B)
unbind C-b
set -g prefix C-k
bind-key C-k send-prefix
# Make it easier to split panes (inherits current directory)
bind-key '\' split-window -h -c "#{pane_current_path}"
bind-key - split-window -v -c "#{pane_current_path}"
```
This isn't developer cosplay—it's practical adaptation. Every customization solves a real problem I encountered.
## The Reality Check
I still don't call myself a developer. When real developers discuss memory management or algorithmic complexity, I nod politely and search Perplexity frantically later (yes, "Perplexity" doesn't quite work as a verb yet). Git staging remains confusing. GitHub Actions break mysteriously no matter how carefully I explain to Claude what's failing or how many error logs I paste from GitHub's workflow runs (or rather, failures).
My limitations are real:
- Can't debug without LLM assistance
- Don't understand compilation errors intuitively
- Still create 5 versions of things before one works (at least it's no longer 12)
- Spend C$325/month on AI tools to build things I otherwise couldn't—still costs less than golf as a hobby
But here's what changed: I ship working software. My RSS reader has paying users. My PRs are getting merged. When I encounter a workflow problem, I build a solution instead of searching for one.
## The Meta-Lesson
Looking back, Claude Code didn't make me a developer—it made me more capable in unexpected ways. It took what I already knew how to do (spec features, test systematically, spot edge cases) and made those skills actually useful for building software, while handling the parts I'll probably never understand or learn (proper syntax, framework conventions, design patterns).
The path wasn't planned. It just happened:
- First, I learned to fail without losing everything (backups saved me from myself)
- Then figured out LLMs need guardrails or they hallucinate wildly
- Gradually got comfortable with terminal, Git, basic project structure
- Tried tool after tool until Claude Code clicked with my workflow
- Built increasingly elaborate systems (sub-agents, sequential commands) to handle complexity
- Kept shipping, even when it meant starting over 5 times (or 12, for the RSS reader)
## Looking Forward
My goal remains simple: reach $10,000 monthly revenue from the RSS reader, then hire traditional developers who actually know what they're doing. I continue contributing to open source when I find gaps I can fill. I keep building tools that solve my specific annoyances.
The tools will evolve. Next year, Claude Code might be obsolete. The fundamental shift, though, feels permanent: the gap between having an idea and building it has collapsed. Not eliminated—collapsed. There's still a gap, filled with 5 failed attempts, frantic Perplexity searches, and error logs that make no sense. But it's crossable now.
I spent fifteen years as a PM watching engineers build things I'd specified. Now I ship my own broken code, and strangers on the internet pay money for it. That still feels surreal.
If you're reading this thinking "maybe I could try this," know that you'll probably destroy something important in your first week (make backups). You'll create twelve versions of something that should take one. You'll spend more on AI tools than seems reasonable. Your code will horrify actual developers.
But you'll ship. And someone, somewhere, will find it useful enough to pay for.
That's worth something. Even if it takes you 18 months to figure it out.