Blog

Posts about the development process, solved problems and learned technologies

From Zero to Spam-Proof: Building a Bulletproof Feedback System

# Building a Feedback System: How One Developer Went from Zero to Spam-Protected

The task was straightforward but ambitious: build a complete feedback collection system for borisovai-site that could capture user reactions, comments, and bug reports while protecting against spam and duplicate submissions. Not just the backend but the whole thing, from API endpoints to React components ready to drop into pages.

I started by designing the **content-type schema**, which turned out to be the most critical decision of the day. The feedback model needed to support multiple submission types: simple helpful/unhelpful votes, star ratings, detailed comments, bug reports, and feature requests. This flexibility meant handling different payload shapes, which immediately surfaced a design question: should I normalize everything into a single schema or create type-specific handlers? I went with one unified schema with optional fields, storing the submission type as a categorical field. Cleaner, more queryable, easier to extend later.

The real complexity came with **protection mechanisms**. Spam isn't just about volume; it's also about the same user hammering the same page with feedback. So I built a three-layer defense:

- browser fingerprinting that combines User-Agent, screen resolution, timezone, language, WebGL capabilities, and Canvas rendering into a SHA-256-style hash;
- IP-based rate limiting capped at 20 feedbacks per hour;
- a duplicate check that prevents the same fingerprint from submitting twice to the same page.

Each protection layer stored different data. The fingerprint and IP address were marked as private fields in the schema, never exposed in responses.

The fingerprinting logic was unexpectedly tricky. Browsers don't make it easy to get a reliable unique identifier without invasive techniques. I settled on collecting public browser metadata and combining it with canvas fingerprinting: rendering a specific pattern and hashing the pixel data. It's not bulletproof (sophisticated users can spoof it), but it's sufficient for catching casual spam without requiring cookies or tracking pixels.

On the frontend, I created a reusable **React Hook** called `useFeedback` that handled all the API communication, error states, and local state management. Then came the UI components: `HelpfulWidget` for the simple thumbs-up/down pattern, `RatingWidget` for star ratings, and `CommentForm` for longer-form feedback. Each component was designed to be self-contained and droppable anywhere on the site.

Here's something interesting about browser fingerprinting: it's a weird space between privacy and security. The same technique that helps prevent spam can also be used for user tracking. The difference is intent and transparency. A feedback system storing a fingerprint to prevent duplicate submissions is reasonable. Selling that fingerprint to ad networks is not. It's a line developers cross more often than they should admit.

By the end, I'd created eight files across backend and frontend, generated three documentation pieces (full implementation guide, quick-start reference, and architecture diagrams), and had the entire system ready for integration. The design team had a brief with eight questions about how these components should look and behave. The next phase is visual design and then deployment, but the hard structural work is done. The system is rate-limited, protected against duplicates, and extensible enough to handle new feedback types without refactoring.

**Mission accomplished**, with no spam getting through on day one.
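To make the three layers concrete, here is a minimal Python sketch of the logic. It is an illustration only: the real backend and its storage are not shown in this post, so the `fingerprint` helper, the `FeedbackGuard` class, the in-memory store, and the order of the checks are all assumptions. Only the 20-per-hour cap and the fingerprint-plus-page duplicate rule come from the description above.

```python
import hashlib
import time
from collections import defaultdict, deque
from typing import Optional

RATE_LIMIT = 20          # feedbacks per IP per window (figure from the post)
WINDOW_SECONDS = 3600    # one hour


def fingerprint(signals: dict) -> str:
    """Hash browser signals (User-Agent, timezone, ...) into a hex digest."""
    # Sort the keys so the same signals always produce the same hash.
    canonical = "|".join(f"{key}={signals[key]}" for key in sorted(signals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FeedbackGuard:
    """Hypothetical in-memory stand-in for the rate-limit and duplicate checks."""

    def __init__(self) -> None:
        self._hits = defaultdict(deque)   # ip -> timestamps of accepted feedback
        self._seen = set()                # (fingerprint, page) pairs already used

    def check(self, ip: str, fp: str, page: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have left the one-hour window.
        while hits and now - hits[0] >= WINDOW_SECONDS:
            hits.popleft()
        if len(hits) >= RATE_LIMIT:
            return "rate_limited"
        if (fp, page) in self._seen:
            return "duplicate"
        hits.append(now)
        self._seen.add((fp, page))
        return "accepted"
```

A production version would keep both structures in a shared store rather than process memory, so that the limits survive restarts and apply across instances.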

Feb 13, 2026

New Feature · borisovai-site

Smart Feedback Without the Spam: A Three-Layer Defense Strategy

# Building a Spam-Resistant Feedback System: Lessons from the Real World

The borisovai-site project needed something every modern developer blog desperately wants: meaningful feedback without drowning in bot comments. The challenge was clear: implement a feedback system that lets readers report issues, mark helpful content, and share insights, all while keeping spam at bay. No signup required, but no open door to chaos either.

**The first decision was architectural.** Rather than reinventing the wheel with a custom registration system, I chose a multi-layered defense approach. The system would offer three feedback types: bug reports, feature requests, and "helpful" votes. For sensitive operations like bug reports, OAuth authentication through NextAuth.js would be required, creating a natural barrier without friction for legitimate users.

The real puzzle was handling spam and rate limiting. I sketched out three strategies: pure reCAPTCHA, pattern-based detection, and a hybrid approach. The hybrid won. Here's why: reCAPTCHA alone feels heavy-handed for a simple "mark as helpful" action. Pattern-based detection using regex against common spam markers catches obvious abuse cheaply. But the real protection came from rate limiting: one feedback per IP address per 24 hours, tracked either through Redis or an in-memory store depending on deployment scale.

**The implementation stack reflected modern web practices.** React 19 with TypeScript provided type safety, Tailwind v4 handled styling efficiently, and Framer Motion added subtle animations that made the interface feel responsive without bloat. The backend connected to Strapi, where I added a new feedback collection with fields tracking the page URL, feedback type, user authentication status, IP address, and a timestamp. The API endpoint itself became a gatekeeper: checking rate limits before creating records, validating input against spam patterns, and returning helpful error messages like "You already left feedback on this page" or "Too many feedbacks from your IP. Try again later."

**One unexpectedly thorny detail:** designing the UI for the feedback count. Should we show "23 people found this helpful" or just a percentage? The data model needed to support both, but the psychological impact differs significantly. I opted for showing the count when it exceeded a threshold. Small numbers feel insignificant, but once you hit thirty or more, social proof kicks in.

Error handling demanded attention too. Network failures got retry buttons, server errors pointed toward support, and validation errors explained exactly what went wrong. The mobile experience compressed the floating button interface into a minimal footprint while keeping all functionality accessible.

## The Tech Insight

Most developers overlook that **rate limiting isn't just about preventing abuse; it's about conversation design.** When someone can only leave one feedback per day, they tend to make it count. They think before commenting. The constraint paradoxically improves feedback quality by making it scarce.

**What's next?** The foundation is solid, but integrating an ML-based spam detector from Hugging Face would add a sophistication layer that adapts to evolving attack patterns. For now, the system ships with pattern detection and OAuth: practical, maintainable, and battle-tested by similar implementations across the web.

Why is Linux safe? Hackers peek through Windows only.
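The gatekeeper order described in this post (rate limit first, then spam patterns, then create the record) can be sketched in a few lines of Python. This is not the site's actual endpoint: the pattern list, the `looks_like_spam` and `submit_feedback` names, the plain-dict store, and the spam error message are all hypothetical. The 24-hour window and the rate-limit message are taken from the text.

```python
import re
from typing import Dict, Optional

DAY_SECONDS = 24 * 60 * 60

# Hypothetical spam markers; the post does not list the real patterns.
SPAM_PATTERNS = [
    re.compile(r"https?://\S+.*https?://\S+", re.I | re.S),    # two or more links
    re.compile(r"\b(free money|buy now|click here)\b", re.I),  # stock spam phrases
    re.compile(r"(.)\1{9,}"),                                  # one character 10+ times in a row
]


def looks_like_spam(text: str) -> bool:
    """Cheap first-pass filter: True if any pattern matches."""
    return any(pattern.search(text) for pattern in SPAM_PATTERNS)


def submit_feedback(last_seen: Dict[str, float], ip: str, text: str,
                    now: float) -> Optional[str]:
    """Return an error message, or None when the feedback is accepted.

    `last_seen` maps an IP address to the time of its last accepted feedback;
    a dict stands in here for Redis or another shared store.
    """
    previous = last_seen.get(ip)
    if previous is not None and now - previous < DAY_SECONDS:
        return "Too many feedbacks from your IP. Try again later."
    if looks_like_spam(text):
        return "Your message looks like spam."  # hypothetical wording
    last_seen[ip] = now
    return None
```

Rejected submissions do not update `last_seen`, so a reader whose first attempt trips a spam pattern can still fix the text and retry the same day.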

Feb 13, 2026

New Feature · llm-analisis

Random Labels, Silent Failures: When Noise Defeats Self-Modifying Models

# When Random Labels Betrayed Your Self-Modifying Model

The `llm-analisis` project hit a wall that looked like a wall but was actually a mirror. I was deep into Phase 7b, trying to teach a mixture-of-experts model to manage its own architecture: to grow and prune experts based on what it learned during training. Beautiful vision. Terrible execution.

Here's what happened: I'd successfully completed Phase 7a and Phase 7b.1. Q1 had found the best config at 70.15% accuracy, Q2 optimized the MoE architecture to 70.73%. The plan was elegant: add a control head that would learn when to expand or contract the expert pool. The model would become self-aware about its own computational needs.

Except it didn't. Phase 7b.1 produced a **NO-GO decision**: 58.30% accuracy versus the 69.80% baseline. The culprit was brutally simple. I'd labeled the control signals with synthetic random labels. Thirty percent probability of "grow," twenty percent of "prune," totally disconnected from reality. The control head had nothing to learn from noise.

So I pivoted to Phase 7b.2, attacking the problem with entropy-based signals instead. The routing entropy in the MoE layer represents real model behavior: which experts the model actually trusts. That's grounded, differentiable, honest data. I created `expert_manager.py` with state preservation for safe expert addition and removal, and documented the entire strategy in `PHASE_7B2_PLAN.md`. This was the right direction.

Except Phase 7b.2 had its own ghosts. When I tried implementing actual expert add/remove operations, the model initialization broke. The `n_routed` parameter wasn't accessible the way I expected. And even when I fixed that, checkpoint loading became a nightmare: the pretrained Phase 7a weights weren't loading correctly. The model would start at 8.95% accuracy instead of ~70%, making the training completely unreliable.

Then came the real moment of truth: I realized the fundamental issue wasn't about finding the perfect control signal. The real problem was trying to do two hard things simultaneously: train a model AND have it restructure itself. Every architecture modification during training created instability.

**Here's the non-obvious fact about mixture-of-experts models:** they're deceptively fragile when you try to modify them dynamically. The routing patterns, the expert specialization, and the gradient flows are tightly coupled. Add an expert mid-training, and you're not just adding capacity; you're breaking the learned routing distribution that took epochs to develop. It's like replacing car parts while driving at highway speed.

So I made the decision to pivot again. Phase 7b.3 would be direct and honest: focus on actual architecture modifications with a fixed expert count, moving toward multi-task learning instead of self-modification. The model would learn task-specific parameters, not reinvent its own structure. Sometimes the biological metaphor breaks down, and pure parameter learning is enough.

The session left three new artifacts: the failed but educational `train_exp7b3_direct.py`, the reusable `expert_manager.py` for future use, and most importantly, the understanding that self-modifying models need ground truth signals, not optimization fairy tales.

Next phase: implement the direct approach with proper initialization and validate that sometimes a fixed architecture with learned parameters beats the complexity of dynamic self-modification.

😄 Trying to build a self-modifying model without proper ground truth signals is like asking a chicken to redesign its own skeleton while running: it just flails around and crashes.
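For readers unfamiliar with the routing entropy mentioned in Phase 7b.2, here is a small dependency-free Python sketch of how such a signal is commonly computed. It is an assumption about the general technique, not the project's code: the `routing_entropy` name, the list-of-lists input, and the normalization to the range 0 to 1 are illustrative choices, and a real training loop would compute this on tensors.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one token's router logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_entropy(router_logits: List[List[float]]) -> float:
    """Mean per-token routing entropy, normalized to the range [0, 1].

    router_logits holds one row of expert logits per token and needs at
    least two experts. A value near 1 means tokens are spread evenly over
    the experts; a value near 0 means the router sends each token to one
    expert with high confidence.
    """
    n_experts = len(router_logits[0])
    total = 0.0
    for logits in router_logits:
        probs = softmax(logits)
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / (len(router_logits) * math.log(n_experts))
```

Unlike the random grow/prune labels from Phase 7b.1, this number is a function of what the router actually does, which is why it can serve as a training signal at all.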

Feb 13, 2026

New Feature · speech-to-text

When Stricter Isn't Better: The Threshold Paradox

# Hitting the Ceiling: When Better Thresholds Don't Mean Better Results

The speech-to-text pipeline was humming along at 34% Word Error Rate (WER), respectable for a Whisper base model, but the team wanted more. The goal was ambitious: cut that error rate down to 6–8%, a relative reduction of roughly 80%. To get there, I started tweaking the T5 text corrector that sits downstream of the audio transcription, thinking that tighter filtering could squeeze out those extra percentage points.

First thing I did was add configurable threshold methods to the `T5TextCorrector` class. The idea was simple: instead of hardcoded similarity thresholds, make them adjustable so we could experiment without rewriting code every iteration. I implemented `set_thresholds()` and `set_ultra_strict()` methods, then set ultra-strict filtering to use aggressive cutoffs (0.9 and 0.95 similarity scores), theoretically catching every questionable correction before it could degrade the output.

Then came the benchmarking. I fixed references in `benchmark_aggressive_optimization.py` to match the full audio texts we were actually working with, not just snippets, and ran the tests. The results were sobering.

**The baseline** (Whisper base + improved T5 at 0.8/0.85 thresholds): 34.0% WER, 0.52 seconds. **Ultra-strict T5** (0.9/0.95): 34.9% WER, 0.53 seconds, marginally *worse*. I also tested beam search with width=5, thinking diversity in decoding might help. That hurt badly: 42.9% WER, 0.71 seconds. Even stripping T5 entirely gave 35.8% WER.

The pattern was clear: we'd plateaued. Tightening the screws on T5 correction wasn't the lever we needed. Higher beam widths actually hurt because they introduced more candidate hypotheses that could mangle the transcription. The fundamental issue wasn't filtering quality; it was the model's capacity to *understand* what it was hearing in the first place.

Here's the uncomfortable truth: if you want to drop from 34% WER to 6–8%, you need a bigger model. Whisper medium would get you there, but it would shatter our latency budget. The time to run inference would balloon past what the system could tolerate. So we hit a hard constraint: stay fast or get accurate, but not both.

**The lesson stuck with me**: optimization has diminishing returns, and sometimes the smartest decision is recognizing when you're chasing ghosts. The team documented the current optimal configuration (Whisper base with improved T5 filtering at 0.8/0.85 thresholds) and filed a ticket for future work. Sometimes shipping what works beats perfecting what breaks.

😄 Optimizing a speech-to-text system at 34% WER is like arguing about which airline has the best peanuts: you're still missing the entire flight.
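As a reference for the numbers in this post: WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The Python sketch below shows that standard definition plus the relative-reduction arithmetic behind the 6–8% target. It is not the project's benchmark script; the function names and the lowercase-and-split normalization are assumptions.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.lower().split()
    hyp: List[str] = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction of the original error rate that has been removed."""
    return (before - after) / before
```

Going from 34% to the middle of the target band, 7%, gives `relative_reduction(34.0, 7.0)`, about 0.79; the 6% and 8% ends of the band correspond to roughly 0.82 and 0.76. By the same measure, the ultra-strict run at 34.9% is a small step backwards.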

Feb 13, 2026

New Feature · C--projects-ai-agents-voice-agent

Voice Agent: Bridging Python, JavaScript, and Real-Time Complexity

# Building a Voice Agent: Orchestrating Python and JavaScript Across the Monorepo

The task landed on my desk with a familiar weight: build a voice agent that could handle real-time chat, authentication, and voice processing across a split architecture, with a Python backend and a Next.js frontend. The real challenge wasn't the individual pieces; it was orchestrating them without letting the complexity spiral into a tangled mess.

I started by sketching the backend foundation. **FastAPI 0.115** became the core, not just because it's fast, but because its native async support meant I could lean into streaming responses with **sse-starlette 2** for real-time chat without wrestling with blocking I/O. Authentication came next. Implementing it early rather than bolting it on later proved essential, as every subsequent endpoint needed to trust the user context.

The voice processing endpoints demanded careful thought. Unlike typical stateless request/response endpoints, voice required state management: buffering audio chunks, running inference, and streaming responses back. I structured these as separate concerns: one endpoint for transcription, another for chat context, another for voice synthesis. This separation meant I could debug and scale each independently.

Then came the frontend integration. The Next.js team needed to consume these endpoints, but they also needed to integrate with **Telegram Mini App SDK** (TMA), which introduced its own authentication layer. The streaming chat UI in React 19 had to handle partial messages gracefully, displaying text as it arrived rather than waiting for the full response. This is where **Tailwind CSS v4** with its new CSS-first configuration actually simplified things; the previous @apply-heavy syntax would have made dynamic class management messier.

Here's something I discovered during this phase that most developers overlook: **the separation of concerns in monorepos only works if you establish strict validation protocols upfront.** I settled on a routine: Python imports always get validated with a quick `python -c 'from src.module import Class'` check, npm builds happen after every frontend change, and the TypeScript type check runs before anything ships. This discipline saved hours later, when subtle import errors could otherwise have cascaded through the codebase.

The real insight came from studying the project's **ERROR_JOURNAL.md pattern**. Instead of letting errors vanish into git history, documenting them upfront and checking that journal *before* attempting fixes prevented the classic mistake of solving the same problem three times. It's institutional memory in a single markdown file.

One unexpected win: batching independent tasks across codebases in single commands. Rather than switching contexts repeatedly, I'd prepare backend validations and frontend builds together, letting them run in parallel. The monorepo structure, with the Python backend in `/backend` and Next.js in `/frontend`, made this clean. No cross-contamination, clear boundaries.

By the end, the architecture was solid: defined agent roles, comprehensive validation checks, and a documentation pattern that actually prevented repeated mistakes. The frontend could stream chat responses while the backend processed voice, and authentication threaded through both without becoming a bottleneck.

**A SQL statement walks into a bar and sees two tables. It approaches and asks, "May I join you?" 😄**
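The `python -c 'from src.module import Class'` check from this post can also be batched from a small script. The sketch below is one possible way to do that with the standard library; the `can_import` and `failed_imports` helpers are hypothetical names, not part of the project, and it should run from the same working directory as the original command so that `src` resolves.

```python
import importlib
from typing import Iterable, List, Tuple


def can_import(module_name: str, attr_name: str) -> bool:
    """Approximate `from module_name import attr_name` without raising."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers a missing module as well as errors raised while it loads.
        return False
    if hasattr(module, attr_name):
        return True
    # `from package import name` also succeeds when `name` is a submodule
    # that has not been imported yet, so try that before giving up.
    try:
        importlib.import_module(f"{module_name}.{attr_name}")
    except Exception:
        return False
    return True


def failed_imports(checks: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the dotted names of every (module, attribute) pair that fails."""
    return [f"{mod}.{attr}" for mod, attr in checks if not can_import(mod, attr)]
```

For example, `failed_imports([("json", "loads"), ("json", "nope")])` returns `["json.nope"]`, so an empty list means every listed import resolved.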
