Mark Hall's Webdevblog
Busting Myths with Code #1: The Zodiac Killers

Sometimes I stumble upon some dubious piece of information online. Usually a quick Google search will bring up a fact check that solidly refutes it. But that's boring. I'm a programmer. So I want to bust these myths myself... with the power of code!

I remember one day I saw a post on social media that broke down the most infamous serial killers by Zodiac sign. Apparently almost 40 percent of serial killers are Cancer, Pisces, Sagittarius, or Scorpio. So I was a little shocked. That does seem like an unusually large number of people in a small number of signs. Also, I'm a little offended because I'm a Scorpio. I mean, I haven't even killed one person... yet.

So maybe I'm taking it a bit personally, but I just had to investigate the claim, and I realized I can use my coding skills to figure out if that's actually statistically significant.

First, I needed to know what their methodology was. So I found the original source, which is here. To sum it up briefly - they compiled a list of 485 serial murderers worldwide and counted up their Zodiac signs. Cancer, Pisces, Sagittarius, and Scorpio tied for first at 46 each. And at the bottom of the list were Taurus and Gemini, with 27 each.

Okay, so now we have the numbers. That seems like a big spread - the top 4 signs each had about 1.7 times as many killers as the bottom 2.

But is that significant? Let's break out Python and generate 485 serial killers with completely random zodiac signs, and see how they stack up:

Python
import random

TOTAL_KILLERS = 485

signs = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"
]

# Randomly generate killers
killers = {sign: 0 for sign in signs}
for _ in range(TOTAL_KILLERS):
    killers[random.choice(signs)] += 1

# Sort the list (largest first)
sorted_list = sorted(killers.items(), key=lambda item: item[1], reverse=True)

# Print out the results
for sign, count in sorted_list:
    print(f"{sign}: {count}")

Intuitively, I expected a relatively even distribution of killers, but when I ran it, the results weren't what I expected:

Output
Libra: 52
Leo: 47
Sagittarius: 44
Aquarius: 44
Capricorn: 43
Scorpio: 43
Taurus: 40
Pisces: 39
Gemini: 37
Virgo: 36
Cancer: 33
Aries: 27

That's interesting! It's an even wider spread than what we see in the myth. When I run it multiple times, I get similar results, with the most homicidal zodiac sign having a little under twice as many killers as the least homicidal sign.
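To put a rough number on "similar results", here's a sketch that repeats the experiment many times and counts how often pure chance produces a gap at least as wide as the one in the original list (46 at the top, 27 at the bottom, so a gap of 19). The function names `simulate_counts` and `fraction_with_gap`, the number of trials, and the fixed seed are just my own picks for illustration, and it assumes every sign is equally likely.

```python
import random

SIGNS = [
    "Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
    "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces",
]


def simulate_counts(total_killers, rng):
    """Give each killer a uniformly random sign and return the 12 counts."""
    counts = {sign: 0 for sign in SIGNS}
    for _ in range(total_killers):
        counts[rng.choice(SIGNS)] += 1
    return list(counts.values())


def fraction_with_gap(total_killers, min_gap, trials, seed=0):
    """Fraction of trials where (largest count - smallest count) >= min_gap."""
    rng = random.Random(seed)  # fixed seed so reruns give the same answer
    hits = 0
    for _ in range(trials):
        counts = simulate_counts(total_killers, rng)
        if max(counts) - min(counts) >= min_gap:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # 46 - 27 = 19 is the gap between the top and bottom signs in the original list.
    share = fraction_with_gap(total_killers=485, min_gap=19, trials=2000)
    print(f"Trials with a gap of at least 19: {share:.1%}")
```

If that share comes out large, a gap of 19 is the kind of thing random noise produces all the time.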

Myth busted!

Looking at the original myth, there's technically no misinformation, but it's very misleading. What it boils down to is that 485 killers spread across 12 signs is far too small a sample to draw any meaningful conclusion from a gap this size.
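For a back-of-the-envelope check on the sample size, here's a small sketch that treats each sign's count as a binomial variable (485 killers, each with a 1-in-12 chance of landing on a given sign) and measures how far 46 and 27 sit from the expected count. The uniform 1-in-12 assumption and the helper names `expected_and_sd` and `z_score` are mine, not something from the original source.

```python
import math

TOTAL_KILLERS = 485
NUM_SIGNS = 12


def expected_and_sd(total, num_signs):
    """Mean and standard deviation of one sign's count, assuming uniform signs."""
    p = 1 / num_signs
    expected = total * p
    sd = math.sqrt(total * p * (1 - p))  # binomial standard deviation
    return expected, sd


def z_score(count, total, num_signs):
    """How many standard deviations a count sits from the expected value."""
    expected, sd = expected_and_sd(total, num_signs)
    return (count - expected) / sd


if __name__ == "__main__":
    expected, sd = expected_and_sd(TOTAL_KILLERS, NUM_SIGNS)
    print(f"Expected per sign: {expected:.1f}, standard deviation: {sd:.1f}")
    for count in (46, 27):
        z = z_score(count, TOTAL_KILLERS, NUM_SIGNS)
        print(f"{count} killers is {z:+.2f} standard deviations from expected")
```

Under those assumptions, 46 lands within about one standard deviation of the expected 40 or so, and 27 is a bit over two below it. With 12 signs in play, one or two counts that far out isn't much of a surprise.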

When I change the TOTAL_KILLERS constant to 10000 and run the same code again, the counts end up much closer together in relative terms:

Output
Capricorn: 870
Pisces: 870
Libra: 868
Cancer: 844
Sagittarius: 835
Virgo: 834
Aquarius: 829
Scorpio: 827
Gemini: 813
Taurus: 811
Aries: 806
Leo: 793
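To see the effect of sample size side by side, here's a sketch that averages the ratio between the biggest and smallest sign over a bunch of random runs, once for 485 killers and once for 10000. The helper names `max_min_ratio` and `average_ratio`, the trial count, and the seed are my own choices, and it again assumes all 12 signs are equally likely.

```python
import random


def max_min_ratio(total_killers, rng, num_signs=12):
    """Ratio of the largest sign count to the smallest in one random run."""
    counts = [0] * num_signs
    for sign_index in rng.choices(range(num_signs), k=total_killers):
        counts[sign_index] += 1
    smallest = min(counts)
    if smallest == 0:
        return float("inf")  # only plausible for very small samples
    return max(counts) / smallest


def average_ratio(total_killers, trials, seed=0):
    """Average max/min ratio across several independent random runs."""
    rng = random.Random(seed)
    ratios = [max_min_ratio(total_killers, rng) for _ in range(trials)]
    return sum(ratios) / trials


if __name__ == "__main__":
    for total in (485, 10000):
        ratio = average_ratio(total, trials=200)
        print(f"{total} killers: top sign has about {ratio:.2f}x the bottom sign")
```

The absolute gaps actually get bigger with 10000 killers, but relative to the size of each sign they shrink, which is why the bigger sample looks so much more even.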

Well, that was fun! I'm gonna look for some more myths that I can bust... with code!