Concept

Why We Built Sales Email Detection -- 8 Out of 10 Were Pitches, and the Design Philosophy Behind the Fix

Last updated: 2026-04-17

This is the background and design philosophy behind FORMLOVA's sales email auto-detection, from the team that built it. For the release announcement see AI Now Detects Sales Emails in Your Forms. For the practical usage guide see How to Use Sales Email Detection.

8 out of 10 were sales pitches

Let me start with a number: 8 out of 10.

I was helping a client manage their inquiry form. I opened the response list to run the weekly ad-channel analysis, and what I found was a grid of templated openers, unfamiliar names, and unfamiliar company names. I triaged them by eye, stripped out the sales pitches, and computed the ad CVR on the two responses that remained. Then I wrote the report.

That work came back every week the ads were running. Done carefully, it ate hours. Done roughly, the numbers lied. Neither path led anywhere good. That was the moment I knew the problem was structural, not accidental.


The "strip sales by hand" assumption

Most form services offer bot protection -- CAPTCHA and similar. That is not the same problem. A human who sits down and types out a sales pitch is not a bot. The question was never "block the submission." It was "what do you do with it after it arrives?"

In practice, teams answered that question in two ways. Careful operators triaged every inquiry by eye. Less-careful operators reported the numbers as-is, pitches included. The first path burned time. The second path contaminated every decision downstream. Neither path had a win condition.

When the CVR is off, every decision stacked on top of it is off too. Ad channel allocation. Creative evaluation. The simple yes/no on whether to keep a campaign running. If 20% or 80% of your pipeline is noise, you cannot trust any of those decisions. This is not a problem "outside" the form. It is the form's job to hand clean data to the next step, and right now it is not doing that job.
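
To make the distortion concrete, here is a minimal arithmetic sketch (the click count is a made-up placeholder; the 8-of-10 split is from the story above):

```python
def cvr(conversions: int, clicks: int) -> float:
    """Conversion rate: form submissions counted as conversions / ad clicks."""
    return conversions / clicks

clicks = 500                              # hypothetical ad clicks for the week
raw_responses = 10                        # everything the form received
sales_pitches = 8                         # templated pitches mixed in
real_inquiries = raw_responses - sales_pitches

reported = cvr(raw_responses, clicks)     # pitches counted
actual = cvr(real_inquiries, clicks)      # pitches stripped

print(f"reported CVR: {reported:.1%}")    # reported CVR: 2.0%
print(f"actual CVR:   {actual:.1%}")      # actual CVR:   0.4%
```

The reported number overstates performance five-fold, and every budget decision stacked on it inherits that error.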


What it means to solve this inside the form service

If the work is being done by hand, the natural question is whether the tool can absorb it. As far as we can find, no form service classifies response content with AI today. Unlike CAPTCHA, this is an empty space in the category.

There is a reason it has stayed empty. Until recently, LLM cost and quality did not cross the threshold where a form service could absorb them on every plan. In the last year that threshold moved. The ground is ready now, and no one has built on it yet. We had this feature on the roadmap as a natural extension, and this was the right moment to ship it.

The cost today is roughly $0.0002 per classification. A free-plan user hitting their full monthly response limit still costs us very little in aggregate, and we absorb it. We chose not to gate this behind a paid tier. Seeing through sales emails should be standard equipment for anyone operating a form, not a premium privilege.
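
As a back-of-the-envelope sketch of why absorbing this is cheap (the per-classification figure is from above; the monthly response limit here is a hypothetical placeholder, not FORMLOVA's actual plan limit):

```python
COST_PER_CLASSIFICATION = 0.0002   # USD, approximate figure from the article

# Hypothetical free-plan monthly response limit; the real number may differ.
FREE_PLAN_MONTHLY_RESPONSES = 100

monthly_cost = COST_PER_CLASSIFICATION * FREE_PLAN_MONTHLY_RESPONSES
print(f"${monthly_cost:.2f} per maxed-out free user per month")  # $0.02
```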


"When in doubt, legitimate"

The hardest design decision was not the algorithm. It was which way to tilt the default.

If you trust AI precision fully, you can build a system that hides sales pitches automatically. We rejected that design. The reason is simple: we never wanted to risk blocking a legitimate inquiry. Losing a real inquiry is losing a real relationship, and that is not a fair trade against the cost of a few sales pitches slipping through.

So the prompt's internal bias is explicit: when in doubt, mark it legitimate. Borderline cases stay legitimate. Sales is only assigned when the signal is clear. We attach a score (0-100) so the operator can see the model's uncertainty without being forced to act on it. Machines classify. Humans decide.
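
A minimal sketch of what that output contract could look like (the field names and the threshold value are illustrative, not FORMLOVA's actual schema):

```python
from dataclasses import dataclass
from typing import Literal

Label = Literal["legitimate", "sales", "suspicious"]

@dataclass
class Classification:
    label: Label
    score: int  # 0-100 model confidence, surfaced to the operator

def resolve(label: Label, score: int, threshold: int = 80) -> Classification:
    """Apply the 'when in doubt, legitimate' tilt: 'sales' sticks only
    when confidence clears a high bar; borderline cases fall back to
    'legitimate'. The score is kept either way, so a human can see the
    model's uncertainty without being forced to act on it."""
    if label == "sales" and score < threshold:
        return Classification("legitimate", score)
    return Classification(label, score)

print(resolve("sales", 55))   # borderline signal -> stays legitimate
print(resolve("sales", 95))   # clear signal      -> sales
```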

That tilt became real through iteration. The first version was three lines of judgment criteria. It nailed the obvious cases but went feral in the gray zone. The revised prompt has a three-step judgment procedure, six positive patterns for "legitimate," five for "sales," stricter conditions for "suspicious," three few-shot examples, and finally an explicit rule: "when unsure, treat as legitimate." Precision went up. More importantly, the whole thing became a system where humans can override and trust that their override sticks.
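
The shape of that revised prompt can be sketched as an assembly of those parts (every pattern string and example below is an illustrative placeholder, not the team's actual criteria):

```python
def build_prompt(legit_patterns, sales_patterns, few_shots):
    """Assemble a classification prompt from a judgment procedure,
    positive patterns for each label, few-shot examples, and the
    explicit default rule."""
    steps = [
        "1. Check for patterns typical of a legitimate inquiry.",
        "2. Check for patterns typical of a sales pitch.",
        "3. Assign 'suspicious' only when strict conditions are met.",
    ]
    parts = [
        "Classify the form response as legitimate, sales, or suspicious.",
        "Procedure:\n" + "\n".join(steps),
        "Legitimate patterns:\n" + "\n".join(f"- {p}" for p in legit_patterns),
        "Sales patterns:\n" + "\n".join(f"- {p}" for p in sales_patterns),
        "Examples:\n" + "\n".join(few_shots),
        "When unsure, treat the response as legitimate.",
    ]
    return "\n\n".join(parts)
```

Packing the "when unsure" rule in as an explicit final instruction, rather than hoping the model infers it, is what made the tilt reproducible.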


Not trusting the LLM to be the final word

A few related design choices flow from the same stance:

  • Every classification carries a score, not just a label. Users get to see how confident the model is
  • Classification runs asynchronously after the response is saved. Failures and timeouts never break form submission
  • A manual label correction is never overwritten by later automatic runs (spam_label_source = manual)
  • Classification is not run on paid-event forms (Stripe Connect). Paying money to deliver a sales pitch is a pathologically rare combination
  • Classification is not run on forms without text input fields. A selection-only form has no place for a pitch to land anyway

Each of these is a variation of the same principle: do not over-trust the AI. As models become more capable, the separation between "machine proposes" and "human decides" should stay sharper, not softer.
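
A sketch of how those guardrails could combine into a single eligibility check before classification runs (function and field names are illustrative; `spam_label_source` is the one identifier taken from the list above):

```python
def should_classify(form: dict, response: dict) -> bool:
    """Eligibility gates applied before the async classification job:
    skip paid-event forms, skip selection-only forms, and never
    touch a response a human has already labeled by hand."""
    if form.get("accepts_payment"):            # Stripe Connect forms: skip
        return False
    if not form.get("has_text_fields"):        # nowhere for a pitch to land
        return False
    if response.get("spam_label_source") == "manual":
        return False                           # human override always sticks
    return True

form = {"accepts_payment": False, "has_text_fields": True}
print(should_classify(form, {"spam_label_source": "manual"}))  # False
print(should_classify(form, {"spam_label_source": "auto"}))    # True
```

Running this check inside an asynchronous job after the response is saved keeps the other guarantee too: a failure or timeout here can never break form submission itself.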


What comes next

This release has three buckets -- legitimate, sales, and suspicious. The same mechanism extends naturally to finer intent classification within the legitimate bucket: evaluation, information gathering, support, partnership inquiries.

Here is where it stops being a thing a standalone form service can do. FORMLOVA is an MCP server -- a form backend you can drive from outside. That means an MCP client (Claude, ChatGPT, and so on) can compose FORMLOVA's MCP server with MCP servers from other services.

  • Evaluation-labeled responses auto-route to the sales team's Slack
  • Information-gathering responses get added to the CRM and nothing else
  • Support requests open a ticket in the helpdesk

Users can compose these branches themselves, in one chat sentence. That composition is not possible in a standalone form service. Receiving a response, classifying its meaning, and routing it to the right downstream system: stitching those three steps together inside the user's environment is what makes this new territory.
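
In spirit, the routing an MCP client would compose is a plain dispatch table (the intent labels mirror the list above; the downstream targets are stand-ins for whatever the Slack, CRM, and helpdesk MCP servers actually expose):

```python
def route(response: dict) -> str:
    """Dispatch a classified response to a downstream system based on
    its intent label. Unrecognized labels fall back to manual review."""
    routes = {
        "evaluation": "slack:sales-team",        # notify the sales team
        "information_gathering": "crm:contact",  # add to CRM, nothing else
        "support": "helpdesk:ticket",            # open a support ticket
    }
    return routes.get(response.get("intent"), "inbox:manual-review")

print(route({"intent": "support"}))  # helpdesk:ticket
```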

This release is the first step in that direction.


Closing

A form sits between a person and an organization, as a receiver. When the receiver is imprecise, every decision made from its output is also imprecise. Conversely, a small shift on the receiver side propagates through to the decisions, the reports, and the daily work that rest on top of it.

Where you draw the line between AI judgment and human judgment, between the cost the platform absorbs and the cost the user pays, between automation and manual control -- these choices shape how a product stands.

We drew ours at "AI proposes, humans decide," "when in doubt, legitimate," and "free on every plan." From here we will watch how people use it, and move on to intent classification next.



Written by

@Lovanaut

Creator of Sapolova, Lovai, Molelava, and FORMLOVA. Building kind services with love.