Startup idea: Swear words API

27.02.2022

How to (maybe - prove it yourself) make money off corporate family-friendliness.

My Facebook feed consists of only two things: job postings and ads. Today's piece isn't about careers, so it's easy to guess it was an ad that made me click on that nice, personalizable, Swiss-made watch.

After giving my timepiece a unique graphic design (try it yourself - much fun!), the final touch was to add some custom text to be engraved on the watch. First word that came to mind?

kurwa

No good! This most sophisticated profanity was rejected by the online creator. A quick substitution with Greek Small Letter Alpha, however, was enough to defeat the validation. Looking almost the same, sounding exactly the same, still vulgar - kurwα was my not-even-script-kiddie-level workaround.

Apparently, the check for inappropriate language is based on a rather big regex of known bad words. You'll get the idea from the picture below:

I’m really proud of the Polish ones taking up a significant part of the whole! Diving further into the list is as educational as it is entertaining:

“Mamma knullare” - based on a quick search I have no idea what it is or why it would be considered offensive. Maybe it’s some deep web stuff? Too afraid to look it up. Edit: it’s probably just Swedish.
“shit|shiit|shiiit|shiiiit” - okay, so five or more “i”s will do!
“Va te faire enculer” - 100% valid entry, but at the same time much longer than the 12-character limit imposed on the input. So why is it even there - to show off the beauty of the French language?
It’s adorable how all German curses (from “Arsch” to “Zicke”) are orthographically capitalized.
On top of that, we can find “zalando” - nice!
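The “shit|shiit|shiiit|shiiiit” entry is a nice illustration of why enumerating variants doesn't scale. Below is a minimal Python sketch comparing the quoted alternation with a quantifier. I'm assuming the check is applied to the whole input and is case-insensitive; I don't know how the site actually applies its regex, and the names `ENUMERATED`, `QUANTIFIED` and `is_blocked` are made up for this example.

```python
import re

# The alternation quoted from the list: covers one to four "i"s only.
ENUMERATED = re.compile(r"shit|shiit|shiiit|shiiiit", re.IGNORECASE)

# The same intent with a quantifier: one or more "i"s.
QUANTIFIED = re.compile(r"shi+t", re.IGNORECASE)


def is_blocked(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole input matches the pattern (assumed behaviour)."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for word in ("shit", "shiiiit", "shiiiiit"):
        print(word, is_blocked(ENUMERATED, word), is_blocked(QUANTIFIED, word))
```

The quantifier closes this one gap, but it's still a blocklist, so every other spelling trick stays open.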

Imagine how ridiculous it must have been to propose, discuss and implement this validation. “Why is Mike not at the computer?” “He’s off to the library, in search of Bulgarian obscenities” “Well, tell him later I created a new Jira ticket - add »buttpirate« to the list”. I haven’t mentioned testing, because it clearly wasn’t checked by someone aware of Zero Width Space or the fact that there are several characters looking just like “o”.
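To make the bypass concrete, here is a small Python sketch of a blocklist check with and without input normalization. Everything in it is an assumption for illustration: `BLOCKLIST` is a two-word stand-in for the site's much larger regex, and `CONFUSABLES` is a hand-picked table of four lookalikes, nowhere near a complete list. Only the standard `re` and `unicodedata` modules are used.

```python
import re
import unicodedata

# Hypothetical two-word stand-in for the site's real (much larger) regex.
BLOCKLIST = re.compile(r"kurwa|buttpirate", re.IGNORECASE)

# Hand-picked lookalikes for illustration; a real table would be far larger.
CONFUSABLES = str.maketrans({
    "\u03b1": "a",  # Greek small letter alpha
    "\u0430": "a",  # Cyrillic small letter a
    "\u03bf": "o",  # Greek small letter omicron
    "\u043e": "o",  # Cyrillic small letter o
})


def normalize(text: str) -> str:
    """Fold compatibility forms, drop invisible format characters, map lookalikes."""
    text = unicodedata.normalize("NFKC", text)
    # Category "Cf" (format) includes U+200B ZERO WIDTH SPACE.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return text.casefold().translate(CONFUSABLES)


def naive_check(text: str) -> bool:
    """Blocklist applied to the raw input, as the site appears to do."""
    return BLOCKLIST.search(text) is not None


def normalized_check(text: str) -> bool:
    """Blocklist applied after normalization."""
    return BLOCKLIST.search(normalize(text)) is not None


if __name__ == "__main__":
    for attempt in ("kurwa", "kurw\u03b1", "kur\u200bwa", "Michael"):
        print(repr(attempt), naive_check(attempt), normalized_check(attempt))
```

With this sketch, both the Greek alpha and the Zero Width Space variants slip past `naive_check` but are caught by `normalized_check`. It's still whack-a-mole: any lookalike missing from the table gets through.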

It’s not how you’re supposed to handle censorship in 2022! The downsides of the current approach are:

Waste of resources and trust. The website saves about 15 kB of assorted “fucks” into sessionStorage of every single user, just in case some weirdo wishes to wear internationalized “penis” on their wrist instead of, I don’t know, “Michael”.
The language is evolving. “Hitler” is banned, but “Putin” isn’t (at the time of writing), while it should have been for a good few days now. By offloading such checks to an external service, the company would be able to adapt faster and focus on their core business (delivering watches).
SaaS is simply the best!

The above case got me thinking: maybe there’s a market for an online service that could be described as a “Swear words API”. Customization is hot in e-commerce, but it must be kept polite - no company wants to allow designs/wording that would undermine their brand.

I’m giving away the idea for free, along with the following API draft, simply because I don’t know how to implement it. My best guess is: render whatever the client sent, and let some low-paid human native speaker decide whether it’s offensive. That makes for a rather unacceptable response time.

paths:
  /rate:
    get:
      produces:
      - "application/json"
      parameters:
      - in: "query"
        name: "term"
        type: "string"
        required: true
      responses:
        "200":
          description: "OK"
          schema:
            type: "object"
            properties:
              score:
                description: "Severity of the term, on a scale from 0 (safe) to 1 (definitely a swear)."
                type: "number"
                minimum: 0
                maximum: 1
              isOriginalStopWord:
                description: "Whether the term comes from the lame, pre-API regex."
                type: "boolean"
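For a feel of how a shop might consume the draft above, here is a client-side Python sketch. It only builds the request URL and validates a response body; it doesn't call anything, because the service doesn't exist. The host `https://api.example.com` and the helper names are hypothetical, and treating `score` as mandatory is my assumption (the draft doesn't mark any response property as required).

```python
from typing import Any, Dict, Tuple
from urllib.parse import urlencode

# Hypothetical host; the draft only defines the /rate path.
BASE_URL = "https://api.example.com"


def build_rate_url(term: str) -> str:
    """Build the GET /rate URL with the required `term` query parameter."""
    return f"{BASE_URL}/rate?{urlencode({'term': term})}"


def parse_rate_response(payload: Dict[str, Any]) -> Tuple[float, bool]:
    """Validate a decoded JSON body against the draft's response schema.

    Assumption: `score` is treated as mandatory, `isOriginalStopWord`
    defaults to False when absent.
    """
    score = payload.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if not 0 <= score <= 1:
        raise ValueError("score must be between 0 and 1")
    return float(score), bool(payload.get("isOriginalStopWord", False))


if __name__ == "__main__":
    print(build_rate_url("kurw\u03b1"))
    print(parse_rate_response({"score": 0.97, "isOriginalStopWord": False}))
```

A shop would then pick its own threshold on `score`, which is exactly the kind of decision that shouldn't live in a 15 kB regex.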

In the end, I didn’t buy. If you stumbled upon this article and got rich by turning this post into a working solution, please consider ordering me that watch. The decoration isn’t going to be as important as the inscription, and my word of choice is “SODOMIZER”. Thanks!