How Community Software Can Use Forensic Science to Identify Bad Members
When you hear the term “forensic evidence,” you think about police work and court cases. You think about DNA, blood and fingerprints. You don’t think about online communities.
But our communities are home to a great deal of digital DNA and trace information. This evidence can be used to identify people who are trying to abuse or take advantage of our communities. Yet I don’t know of any software options that are making use of the data in this way.
There are key areas where this could be very helpful, where it could take a task best performed by a machine and actually hand it to the machine.
I am sure there are many ways this could be done, but I’ll give you one good example: members who hold multiple accounts to push an agenda, agree with themselves or promote something. How could the software help, you might ask? Well, what if you received a notification in your admin area whenever any of these things happened:
- Two members had the same link (domain name) in their profile homepage or signature, and the link wasn’t to a popular website (one of the top 25,000 to 100,000 most popular in the world).
- Two members used the same IP address within a short period of time (defined by the community manager).
- Two members posted the same link (domain name) and the link wasn’t to a popular website.
Of course, there will be false positives. The software could be built in a way that shows you the comparable data and makes it easy for you to dismiss an alert and even whitelist certain websites or IP addresses so they don’t show up again. All three checks might not work well for every community, so you could pick and choose which method(s) work for you. As such a system was used, it would become clear which common occurrences needed to be whitelisted and how the software would need to adapt.
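To make that concrete, here is a minimal sketch of how those three checks might be wired together. Everything in it is an assumption for illustration: the `Member` record, the whitelists, the 30-day window and the `is_popular_domain` placeholder (sketched after the next paragraph) are hypothetical, not a description of any existing forum software.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical member record; real community software would pull these
# fields out of its own database.
@dataclass
class Member:
    member_id: int
    profile_domain: Optional[str] = None    # domain of profile homepage link
    signature_domain: Optional[str] = None  # domain of signature link
    posted_domains: set = field(default_factory=set)
    ip_history: list = field(default_factory=list)  # (ip, datetime) tuples

# Websites and IPs the community manager has already dismissed.
DOMAIN_WHITELIST = {"our-sister-site.example"}
IP_WHITELIST = {"203.0.113.10"}

# The "short period of time," defined by the community manager.
IP_WINDOW = timedelta(days=30)


def is_popular_domain(domain: str) -> bool:
    """Placeholder for a top-25,000 to top-100,000 ranking check;
    one possible version is sketched after the next paragraph."""
    return False


def check_pair(a: Member, b: Member) -> list:
    """Return human-readable alert reasons for one pair of members."""
    alerts = []

    def worth_flagging(domain) -> bool:
        return (domain is not None
                and domain not in DOMAIN_WHITELIST
                and not is_popular_domain(domain))

    # 1. Same non-popular domain in profile homepage or signature.
    shared_profile = ({a.profile_domain, a.signature_domain}
                      & {b.profile_domain, b.signature_domain})
    for domain in shared_profile:
        if worth_flagging(domain):
            alerts.append(f"shared profile/signature domain: {domain}")

    # 2. Same IP address used within the configured window.
    for ip_a, seen_a in a.ip_history:
        for ip_b, seen_b in b.ip_history:
            if (ip_a == ip_b and ip_a not in IP_WHITELIST
                    and abs(seen_a - seen_b) <= IP_WINDOW):
                alerts.append(f"shared IP {ip_a} within {IP_WINDOW.days} days")

    # 3. Same non-popular domain posted in content.
    for domain in a.posted_domains & b.posted_domains:
        if worth_flagging(domain):
            alerts.append(f"both posted links to: {domain}")

    return alerts
```

The idea would be for the software to run something like `check_pair` across recently active members and surface any non-empty result as a notification for a human to review, never as an automatic ban.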
The reason you would cut out the popular websites is that the vast majority of links posted on your community go to those websites. Google, Amazon, Facebook, etc. The people linking to those sites usually aren’t the ones doing the spamming. Cutting those sites out will drop the false positives drastically.
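The popularity cutoff itself could come from any public ranking of top websites. As a sketch only, assuming a simple `rank,domain` CSV file (the file name, format and cutoff are all placeholders):

```python
import csv

def load_top_domains(path: str, cutoff: int = 25_000) -> set:
    """Read a ranked top-sites list (assumed to be `rank,domain` rows)
    and keep everything at or above the cutoff."""
    popular = set()
    with open(path, newline="") as f:
        for rank, domain in csv.reader(f):
            if int(rank) <= cutoff:
                popular.add(domain.lower())
    return popular

# A community manager could tune the cutoff anywhere in the
# 25,000 to 100,000 range mentioned above.
POPULAR_DOMAINS = load_top_domains("top_sites.csv", cutoff=25_000)

def is_popular_domain(domain: str) -> bool:
    return domain.lower() in POPULAR_DOMAINS
```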
This is also why it helps to limit the IP-based forensics to a short time frame. Over time, a member’s IP address will change, but for many people it will stay the same for a while, weeks and even months. Most people who sign up for multiple accounts do so within a month of the last post from their original account, not months or years later. If one member uses an IP today and another member uses that same IP three years from now, it probably isn’t the same person (especially on the major internet service providers, where most of your members will be).
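Here is roughly what the time-boxed IP comparison could look like, assuming the software can hand you (member_id, ip, timestamp) events from logins or posts; the event shape and the 30-day default are assumptions, not fixed choices.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations

def shared_ip_pairs(events, window=timedelta(days=30)):
    """events: iterable of (member_id, ip, timestamp) tuples.
    Yields (member_a, member_b, ip) whenever two different members
    used the same IP within `window` of each other."""
    by_ip = defaultdict(list)
    for member_id, ip, seen_at in events:
        by_ip[ip].append((member_id, seen_at))

    for ip, uses in by_ip.items():
        for (m1, t1), (m2, t2) in combinations(uses, 2):
            if m1 != m2 and abs(t1 - t2) <= window:
                yield m1, m2, ip

# Two accounts on the same IP a week apart are flagged; the same IP
# showing up again three years later is not.
events = [
    (101, "198.51.100.7", datetime(2013, 5, 1)),
    (202, "198.51.100.7", datetime(2013, 5, 8)),
    (303, "198.51.100.7", datetime(2016, 5, 8)),
]
print(list(shared_ip_pairs(events)))  # [(101, 202, '198.51.100.7')]
```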
On the other hand, if you find three different members talking about a company and they all have the same IP, they are probably either the same person or working out of the same office. People talk about how unreliable IPs are and how they can be spoofed. That’s true, except that the people you are trying to catch with this system are not the ones who know how to spoof an IP. You are just trying to cut the percentage down a bit and knock off the easy targets that the software can hit, not eradicate a problem that will never fully be eradicated.
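That “three members, one IP, one company” case is just a different grouping of the same data. A rough sketch, with the same assumed event shape plus the domain each post linked to:

```python
from collections import defaultdict

def same_ip_same_domain(post_events, min_members=3):
    """post_events: iterable of (member_id, ip, domain) tuples.
    Returns {(ip, domain): member_ids} for every IP/domain combination
    posted about by at least `min_members` distinct members."""
    clusters = defaultdict(set)
    for member_id, ip, domain in post_events:
        clusters[(ip, domain)].add(member_id)
    return {key: members for key, members in clusters.items()
            if len(members) >= min_members}
```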
IP information gets less meaningful the bigger your community becomes, but I think it can still be useful for big communities, too. You might just need to limit the time span even more.
For the smallest 99% of online communities, this system would identify many bad actors. We all know we are missing members who fit into this category. Software can help us find them.