Unexpected Bots and reCAPTCHA defense

A few weeks ago I soft launched direct sign ups on the public web site for my product, Process PA. We’ve been running for a few months with just our foundation customers, making sure things were running well before letting anyone register without interaction. I updated the web site without announcing it, just to test the process a bit before (hopefully) driving more traffic to the site.

Interestingly, I started getting random sign ups registered in the database. Many had even verified their email address. I hadn’t been directing people to the site yet, and at this stage we don’t get many unknown visitors. Sure enough, they appear to be bot accounts. It is quite surprising that bots find new sites so quickly and start filling out registration forms. I’m not sure what they expect to get out of it.

CAPTCHA Required

This is kind of a pain. While building a startup I have many things to do, and I didn’t think stopping bots would be required this early on. Fortunately, putting a CAPTCHA in place is pretty quick. I started with a very popular NuGet package, BotDetect CAPTCHA. Although implementing it was easy, it results in the kind of horrible user experience that everyone hates. I did not want to add any friction to legitimate sign ups.

Although BotDetect CAPTCHA claims “not one confirmed case of automated CAPTCHA breaking by spammers”, I’m sceptical. I did a thesis on vision processing over 10 years ago, and the technology has gotten much better since then. Spammers may not be breaking them, but Google states, “it can decipher the hardest distorted text puzzles from reCAPTCHA with over 99% accuracy”.

Google No CAPTCHA reCAPTCHA to the rescue

You’ve seen it across many websites by now. Launched in December 2014, it lets the user simply check a box that says, “I’m not a robot”. And they are, most of the time, done. So much better for the user. So much harder for the bot.

Implementing it is very easy if you follow the instructions on the admin site, which also holds your keys. Get started at Google reCAPTCHA. The client side is a script include and a div. The server side is a web request. It is even simpler with the NuGet package reCAPTCH.MVC, which wraps it all up nicely as an attribute and has clear instructions on its project site.
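
To make the "server side is a web request" part concrete, here is a rough sketch of that call. It is not the code I run (my site is .NET and uses the attribute from the package); it is written in Python only to keep it short. The helper names (build_verify_request, is_human, verify_token) and the injectable post argument are made up for this illustration. What comes from Google is the siteverify URL, the secret and response form fields, the optional remoteip field, and the "success" flag in the JSON reply. The token is whatever the widget posts back with the form in the g-recaptcha-response field.

```python
import json
import urllib.parse
import urllib.request

# Google's verification endpoint for reCAPTCHA tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def build_verify_request(secret, token, remote_ip=None):
    """Build the form-encoded POST body (hypothetical helper name)."""
    fields = {"secret": secret, "response": token}
    if remote_ip:
        # remoteip is optional; send it only when we have it.
        fields["remoteip"] = remote_ip
    return urllib.parse.urlencode(fields).encode("ascii")


def is_human(response_body):
    """Treat anything other than a JSON object with success == true as a failure."""
    try:
        reply = json.loads(response_body)
    except ValueError:
        return False
    return isinstance(reply, dict) and reply.get("success") is True


def verify_token(secret, token, remote_ip=None, post=None):
    """Return True only if the verification service confirms the token.

    `post` is an assumption of this sketch: a callable (url, data) -> str,
    so the HTTP call can be swapped out in tests.
    """
    if not token:
        # The user never ticked the box, so there is nothing to verify.
        return False
    body = build_verify_request(secret, token, remote_ip)
    if post is None:
        def post(url, data):
            with urllib.request.urlopen(url, data=data, timeout=5) as resp:
                return resp.read().decode("utf-8")
    try:
        return is_human(post(VERIFY_URL, body))
    except OSError:
        # Network trouble: fail closed and reject the sign up.
        return False
```

In a sign up handler you would call something like verify_token(secret_key, form["g-recaptcha-response"], client_ip) and refuse to create the account when it returns False. Failing closed on network errors is a choice I made for this sketch; you may prefer to let sign ups through if the verification service is unreachable.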

With it all in place, it looks like I’ll be needing human customers to keep up the sign up rate now that the bots aren’t allowed in. If you are tired of doing minutes and governance manually for your association, club or board, come and try out Process PA.