Week 8 Writeup

This week’s material covered email security.

Exercise 1 – Poking around in PostgreSQL

On the Linux VM, we are given access to a Postgres database of emails that have been classified as spam or ham (legitimate email). There are 100,000 emails in this database, as shown by “SELECT COUNT(*) FROM message_data;”.

Firing up PostgreSQL on the Linux VM
\list shows a list of databases. The postgres, template0, and template1 databases are standard.
\d shows a list of relations
“\d message_data” prints a list of columns in the message_data table. These columns detail information about each email in the table.
One entry from “SELECT * FROM message_data”.
The inconsistent alignment of the columns is hard to read.

If these emails had not already been labeled with is_spam, we would be tasked with classifying them. This can be done with a rule-based system in which emails that meet a certain set of rules are marked as spam. These rules will not necessarily cover all cases of spam, but spam filters would generally rather let more spam through than block legitimate email. The consequence of more spam is most likely the inconvenience of a fuller inbox. However, if business-critical emails are being blocked, that is an application-breaking bug.
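To make the rule-based idea concrete, here is a minimal sketch in Python. The rule names, the patterns, and the threshold are all hypothetical and are not taken from the class material. Requiring more than one rule to fire before marking a message as spam reflects the preference described above for letting some spam through rather than blocking legitimate email.

```python
import re

# Hypothetical rules for illustration only: each is a (name, pattern) pair.
RULES = [
    ("mentions_free_money", re.compile(r"free\s+money", re.IGNORECASE)),
    ("urgent_wording", re.compile(r"\burgent\b|act now", re.IGNORECASE)),
    ("many_exclamations", re.compile(r"!{3,}")),
]


def matched_rules(text):
    """Return the names of every rule whose pattern appears in the text."""
    return [name for name, pattern in RULES if pattern.search(text)]


def is_spam(text, threshold=2):
    """Mark a message as spam only when at least `threshold` rules fire.

    A higher threshold blocks less legitimate email at the cost of
    letting more spam through.
    """
    return len(matched_rules(text)) >= threshold


if __name__ == "__main__":
    for message in [
        "URGENT: free money inside!!!",
        "Urgent: quarterly report attached",
    ]:
        print(message, "->", matched_rules(message), is_spam(message))
```

With these made-up rules, a legitimate email that happens to say "urgent" trips only one rule and is still delivered.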

Exercise 2

Writing rules to classify emails is similar to writing YARA rules previously in class. We look through the contents of the emails for patterns that are suspicious, like how we examined the string contents of executables for suspicious contents. One of the tools we can use to classify emails is regular expressions. While not directly applicable to classifying emails in the db format we looked at in the previous exercise, it was fun to practice regular expressions.

We were tasked with writing a regular expression to match the three obfuscated spellings of Viagra but not match “Viagra” itself. I used the “g” and “i” flags for my expression. The “g” flag performs a global search, matching all occurrences, and the “i” flag makes the match case insensitive. The first part of the expression, “(?!Viagra)”, is a negative lookahead: it checks that the text at the current position is not “Viagra” without consuming any characters, and the rest of the pattern then matches from that same position. I used the pipe operator as an “or” to select between different possibilities for a match, for example matching V or \/. The (\s*) will match zero or more whitespace characters. The number of repetitions can also be bounded with curly braces, such as \s{0,3}. (\s*) might match 1000 spaces, but it may not be practical to test for that match given our original set of strings.
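The pieces described above (negative lookahead, alternation, and whitespace handling) can be sketched in Python as follows. This is not the expression I submitted, and the sample strings are hypothetical stand-ins for the exercise's three spellings. Python has no “g” flag; re.IGNORECASE plays the role of “i”, and finditer or findall covers the global search. The whitespace is bounded with \s{0,3} instead of \s* to show the curly-brace form.

```python
import re

# Hypothetical pattern for illustration; not the original exercise answer.
GAP = r"\s{0,3}"  # bounded alternative to \s*: at most three whitespace chars

OBFUSCATED = re.compile(
    r"(?!viagra)"            # negative lookahead: reject the plain spelling
    r"(?:v|\\/)" + GAP +     # V, or the two characters \/
    r"[i1!|]" + GAP +        # i or a common look-alike
    r"[a@4]" + GAP +
    r"g" + GAP +
    r"r" + GAP +
    r"[a@4]",
    re.IGNORECASE,           # equivalent of the "i" flag
)


def is_obfuscated(text):
    """Return True if the text contains an obfuscated spelling."""
    return OBFUSCATED.search(text) is not None


if __name__ == "__main__":
    # Hypothetical samples; the last one is the plain spelling.
    for sample in ["V1agra", r"\/iagra", "V i a g r a", "Viagra"]:
        print(repr(sample), is_obfuscated(sample))
```

Because the lookahead sits at the start of the pattern and the “i” flag is on, the plain spelling is rejected in any capitalization, while the bounded gap keeps the pattern from chasing arbitrarily long runs of spaces.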
