Tech Tips

Blocking Form Spam With the Tectite Formmail.php Script

In many ways, spam sent through website contact forms is some of the most annoying spam out there. Unlike regular spam, the actual email message typically comes from your own server, so it's nearly impossible to block with spam filters (without inadvertently blocking legitimate messages), and there's usually no point reporting it to the hosting provider, because you would effectively be reporting yourself. One of the more popular ways to process website form submissions is the Tectite formmail.php script, which includes a number of built-in options for blocking spam submissions. We've found that we're able to block most form spam with a combination of the "ATTACK_DETECTION_REVERSE_CAPTCHA" setting, the options to check for duplicate data in multiple fields, and the options to limit the number of URLs that can be entered into a single field (as well as the number of individual fields that can contain URLs).

The built-in spam protection options are not effective for blocking all types of form spam, however. Recently we've seen a rash of spam sent through forms on multiple sites, all with extremely short messages (2-5 words) and without any links or other apparent purpose, meaning that they're likely intended to "poison" spam filters by feeding them meaningless data. There are also some forms or sites where it would make sense to allow URLs only in certain fields, but not others - for example, a form for requesting support for a domain name or website, with a field for entering the relevant URL. While the script does have support for blocking submissions that contain more than a specified number of URLs, and/or URLs in more than a specified number of fields, it doesn't allow you to selectively specify which fields can contain URLs and which can't (and it doesn't appear to contain any options for allowing/blocking submissions based on the amount of data in a given field).

Fortunately, the Tectite script does have support for doing conditional tests on fields, using custom regex (regular expressions). Conditions are a fair bit more difficult to use than the built-in spam protection options, especially if you're not that familiar/comfortable with writing custom regex, but they can be used for blocking both types of spam submissions mentioned above.

 

REJECTING SUBMISSIONS THAT ARE TOO SHORT

Most people who have built web-based forms are familiar with the concept of required fields - in other words, fields that must be filled in before the form can be submitted, otherwise the submission will fail. This works for rejecting submissions where a field is completely blank, but entering so much as a single character (letter or number) will get around that. To reject submissions that are too short with Tectite's formmail script, a custom condition is needed:

<input type="hidden" name="conditions" 
   value="#@ @TEST@message1 ~ /.{32,}/@ 
   ERROR: your message is too short!@" />

The code above assumes that the form contains a textarea field (though it would presumably also work with a single-line text input field) with the name "message1" - which can be edited if the actual field you want to check has a different name. The number "32" near the end of the second line specifies the minimum number of characters that the "message1" field must contain, and the last line contains the error message that will be displayed if the test fails (in other words, if "message1" contains fewer than 32 characters). The same page also needs to include code for the actual form field, for example:

<textarea id="message1" name="message1" rows="5" placeholder="Your Message"></textarea>
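To see what the ".{32,}" pattern does and does not accept, here is a minimal sketch in Python. It is for illustration only: the formmail script runs the pattern through PHP's regex engine, not Python, and the function name "long_enough" is our own invention. The sketch assumes the pattern is applied exactly as written, with no extra modifiers, in which case "." does not match a line break and the 32 characters have to appear on a single line.

```python
import re

# Same pattern as the condition above; illustrative only, run here with
# Python's re module rather than the PHP engine the script uses.
MIN_LENGTH_PATTERN = re.compile(r".{32,}")

def long_enough(message: str) -> bool:
    """Return True if 32 or more characters appear in a row on one line."""
    return MIN_LENGTH_PATTERN.search(message) is not None

print(long_enough("Nice site!"))  # False
print(long_enough("Please call me back about my hosting invoice."))  # True
# More than 32 characters in total, but no single line reaches 32
print(long_enough("Line one of my message\nline two of my message"))  # False
```

If the script behaves the same way, a legitimate message made up of several short lines could be rejected, so it may be worth testing the form with that kind of input before relying on the condition.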

 

REJECTING SUBMISSIONS WITH URLS IN SPECIFIC FIELDS

As mentioned above, the Tectite script includes options for blocking submissions if any field contains a URL or more than a specified number of URLs, but no built-in way to allow or block URLs in specific form fields. That can be achieved using the following condition:

<input type="hidden" name="conditions" 
   value="#@ @TEST@message1 !~ /(http:|https:)/
   @ERROR: URLs are not allowed in the Message field@" />

As with the previous example, this code assumes a textarea (or text input) field with the name "message1" (and can be edited to work with a different field name). The text in parentheses at the end of the second line specifies what the condition will look for in the field data - "http:|https:" (the "|" character between "http:" and "https:" means that the condition will look for either). So, in other words, if the form is submitted and the field "message1" contains the text "http:" or "https:" anywhere in it, then it will be rejected. And as with the previous example, it requires a corresponding field, for example:

<textarea id="message1" name="message1" rows="5" placeholder="Your Message"></textarea>

One thing to note is that, when creating conditions for the Tectite formmail script, the first character in the "value" attribute is used as a separator for multiple conditions. While the examples they provide all use the colon (":") for that, a colon won't work with this condition because it needs to check for text containing a colon. So, instead, we've used the "#" character as the separator in the example above. If the colon character is used as a separator and the condition itself contains a colon, then the conditional tests won't work. You could modify the condition to search for the text "http|https" (without the colon), but that could have the unintended consequence of blocking messages where that text is included but is not part of a link, e.g. references to the HTTP or HTTPS protocols themselves.
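The difference between the two patterns can be checked with a short Python sketch. As before, this only illustrates how the regex itself behaves: the formmail script uses PHP's regex engine, the helper name "contains_match" and the sample messages are made up for the example, and we are assuming the pattern is applied as written, without a case-insensitive modifier.

```python
import re

# The pattern from the condition above, plus the colon-free variant
# discussed in the text. Both are compiled without any flags.
WITH_COLON = re.compile(r"(http:|https:)")
WITHOUT_COLON = re.compile(r"(http|https)")

def contains_match(message: str, pattern=WITH_COLON) -> bool:
    """Return True if the pattern occurs anywhere in the message."""
    return pattern.search(message) is not None

samples = [
    "Visit https://example.com for cheap watches",
    "Do you support http or https on the basic plan?",
    "Visit HTTPS://EXAMPLE.COM for cheap watches",
    "Visit www.example.com for cheap watches",
]

# Print the result for each pattern next to the sample text
for text in samples:
    print(contains_match(text), contains_match(text, WITHOUT_COLON), text)
```

Under those assumptions, the second sample (a legitimate question about the protocols) is only caught by the colon-free variant, which is the false positive described above. The last two samples slip past both patterns, because the match is case-sensitive and looks only for the "http:" or "https:" prefix, so links written in upper case or without a protocol would not be rejected by this condition alone.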

 

COMBINING BOTH CONDITIONS

In some, if not most, cases, it would be useful to run both conditional tests on form submissions, so that submissions are blocked if they're too short and/or if certain fields contain URLs. According to Tectite's documentation, there are two different ways to do that - but unfortunately, neither seems to work. Their documentation states that you can have multiple hidden "conditions" fields, as long as the field names are unique - so "conditions1" and "conditions2" should work; but when we tested, the conditions didn't work at all if the field name was anything other than "conditions". The documentation also states that you can put multiple conditions in a single field by separating them using the separator character ("#" in this case), but that doesn't appear to work either - doing so appears to break the conditions, in the sense that the submission always fails for being too short, even when there are many more than 32 characters in the message field.
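One idea we have not tested against formmail.php is to fold both checks into a single regex, so that only one condition is needed. The pattern below is our own hypothetical suggestion, not something from Tectite's documentation: it uses a negative lookahead to refuse any text containing "http:" or "https:", and then requires at least 32 characters. The Python sketch only shows that the regex itself behaves as intended; whether the script accepts it in a "~" test is an open question.

```python
import re

# Hypothetical combined pattern (not from Tectite's documentation):
#   (?![\s\S]*https?:)  fail if "http:" or "https:" appears anywhere
#   [\s\S]{32,}         then require 32+ characters, line breaks included
COMBINED = re.compile(r"^(?![\s\S]*https?:)[\s\S]{32,}")

def acceptable(message: str) -> bool:
    """Return True if the message is long enough and contains no URL prefix."""
    return COMBINED.search(message) is not None

print(acceptable("Nice site!"))  # False: too short
print(acceptable("Please call me back about my hosting invoice."))  # True
print(acceptable("Great deals at http://example.com, visit us today!"))  # False
```

Note that "[\s\S]" counts line breaks as characters, so this variant measures the total length of the message rather than the length of a single line. The trade-off is that a single condition can only show a single error message, so the visitor would not be told which of the two rules the submission broke.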

If you know of a workaround for that issue, feel free to share it in the comments. Or, if Tectite corrects the error in the script or their documentation at some point, we will update this post with the details.





