# Cat Pereira

> Full-stack software engineer. I build things across the stack, from polished product UIs to cloud services and ML tooling. Lately I'm deep in computer vision and audio.

## Projects

- [Otterwatch](https://github.com/catherinepereira/otterwatch): Live status board that watches sea otter cams, detects when an otter is on screen, and notifies you when one appears.
- [MH Nature Cam](https://github.com/catherinepereira/mh-nature-cam): Wildlife cam board that watches Morten Hilmer's 24/7 woodland livestream and records a clip around each animal sighting.
- [explorable.cv](https://explorable.cv): The home for my computer vision explorables!
- [detstream](https://github.com/catherinepereira/detstream): Modular object detection framework for live video feeds. The engine behind Otterwatch.
- [Sign Cards](https://sign-cards.vercel.app): Browser game for learning ASL fingerspelling, with themed levels and webcam flashcards.
- [airdraw](https://airdraw-cat.vercel.app): Draw on screen by pinching your fingers in the air, tracked through your webcam.
- [OpenHand](https://openhand-asl.vercel.app): Web app that converts American Sign Language fingerspelling to text in real time using a webcam.
- [Captionaut](https://github.com/catherinepereira/captionaut): Video captioning app with automated transcription, speaker diarization, and inline caption editing.
- [prompt2dataset](https://github.com/catherinepereira/prompt2dataset): CLI tool that generates labeled image datasets from plain-English descriptions using Claude AI.
- [cli-cards](https://cli-cards.vercel.app): Renders the terminal-style usage cards shown across this site. Works as a CLI, an npm library, and a browser-based card editor.
- [airship.top](https://github.com/catherinepereira/airship-top): Website I hosted to track live player-count statistics for the multiplayer game platform Airship, built on GCP Cloud Run, Cloud Scheduler, Firebase Hosting, and Neon.
- [Airship](https://airship.gg): Worked on platform microservices, designing and implementing features across the full stack, from database structure (Postgres, Prisma) to API design (NestJS, REST) to user interfaces (TypeScript, React, Svelte).
- [dinnote](https://github.com/catherinepereira/dinnote): Package that builds on dinscribe to add speaker diarization to my audio pre-processing pipeline.
- [dinscribe](https://github.com/catherinepereira/dinscribe): Package that streamlines chaining several audio pre-processing libraries to clean noisy audio and detect speech.
- [doctape](https://github.com/catherinepereira/doctape): CLI tool that converts large PDFs to Markdown by chopping them into page windows, running each through docling, and reassembling the result.
- [F1 Pit Wall](https://f1pitwall.vercel.app): Website hosting transcriptions of all driver radio communications from the 2025 F1 season.
- [F1Guessr](https://f1guessr.com): Web game inspired by GeoGuessr for F1 fans to guess the year and grand prix from a photo of the race.
- [Roblox BedWars](https://bedwars.com): Led technical and creative development of over 10 purchasable and playable in-game characters.
- [Roblox Islands](https://www.roblox.com/games/4872321990/): Contributed to a weekly content update schedule.

## Explorables

- [Embeddings Playground](https://embeddings-playground-cat.vercel.app/): Audio embedding and dimensionality reduction visualizer using audio data sourced from the FreeMusicArchive music library.
- [BPE Playground](https://bpe-playground.vercel.app/): Interactive step-through visualization of the Byte Pair Encoding algorithm as implemented in GPT-2's tokenizer.
- [CNN Playground](https://explorable.cv/cnn-playground): Browser-based interactive tool for visualizing how convolutional operations transform images in real time.
- [CNN Visualizer](https://explorable.cv/cnn-visualizer): Web app that visualizes what a trained CNN model perceives at each layer when processing CIFAR-10 images.
- [CNN Architecture Comparison](https://explorable.cv/cnn-architecture-comparison): Interactive web app that compares six major CNN architectures (LeNet-5, AlexNet, VGG-11, Inception-mini, ResNet-20, DenseNet-BC) side-by-side on the same image.
- [CV Interpretability Explorer](https://explorable.cv/cv-interpretability): Compares three image classifiers (Custom CNN, ResNet-18, ViT-S) trained on ImageNette, showing what each one looks at.
- [Transformer Playground](https://transformer-playground.vercel.app/): Runs a pretrained BERT-tiny in the browser and visualizes a transformer end to end on whatever sentence you type.
- [ViT Playground](https://explorable.cv/vit-playground): Visualizes how a Vision Transformer turns an image into a sequence of tokens, then runs ViT-tiny end to end in the browser.