Saturday, January 10, 2009

More Thoughts on Murder - With Graphs!

After my recent post on Baltimore's low (by Baltimore standards) murder total last year, a couple people commented that we should look at per capita murders rather than the annual total.  I think that there is some truth in those comments, but after fooling around with the data a little, I can also see how you would get a misleading story.

First off, a quick word on my data sources; all data for 2000-2007 is from the FBI's annual Crime in the United States (CIUS) publication.  The CIUS summarizes the Uniform Crime Reporting (UCR) program every year.  I'd never really looked at this before, but it's an amazing amount of information that's available to everyone.  I'm jumping the gun a little on 2008, so any numbers for 2008 are unofficial and the per capita numbers are especially suspect because the Census Bureau has not released 2008 population estimates (so I assumed no population change since 2007, which is obviously false for places like Baltimore and Detroit).  Finally, we need to remember that these are population estimates, not full Census data.  Also, I'm no trained demographer, so I'm not necessarily qualified to really analyze any of this - but I do love to make a good chart.

I chose to look as far back as 2000 because 1) that's when I moved to Baltimore and 2) after you look too far back you start wondering if you're really comparing apples to apples.  Anyway, here's a quick plot of where we've been as a city:


So you can look at the graph and see that it's not immediately obvious whether a reduction in murders from 261 in 2000 to 234 in 2008 actually reduced the per capita murder rate in Baltimore, since the population fell over the same period.  Before we get to per capita murders, I'd like to think a little bit more about the total murders in Baltimore.

In 2000, Baltimore was home to about 648,000 people.  At the same time East Newark, New Jersey was home to 2,377 people.  From 2000 to 2008, 2,377 people were murdered in Baltimore.  That's right, it took only 9 years for Baltimore's criminals to exterminate a population equivalent to a small New Jersey borough.  Yes, it seems silly that New Jersey has a mayor and six city council members to govern the 0.1 square miles of land and 0.04 square miles of water that comprise the borough of East Newark, but I'm sure Mayor Joseph R. Smith would be very disappointed to have his entire city wiped out.

I'm making fun of New Jersey, sure, but the numbers really are mind-boggling.  Just trying to wrap your mind around the idea of a couple thousand murders is tough.  Here's another way to look at it: from 2000 to 2007, Baltimore's population is estimated to have declined from 647,955 to 624,237 - a decline of 23,718 residents.  Over that same time period, 2,143 people were murdered in Baltimore (i.e. the entire population of Vernon, AL).  Murders alone account for 9.0% of Baltimore's 2000-2007 population decline!
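The arithmetic behind those numbers is easy to check.  Here is a minimal Python sketch that uses only figures quoted in this post; the variable and function names are made up for illustration, and the 2008 rate leans on the (admittedly shaky) assumption that the population has not changed since 2007.

```python
# Figures quoted in this post (FBI CIUS counts and population estimates).
POP_2000 = 647_955
POP_2007 = 624_237
MURDERS_2000 = 261
MURDERS_2008 = 234            # unofficial count
MURDERS_2000_2007 = 2_143


def per_100k(murders, population):
    """Murders per 100,000 residents."""
    return murders / population * 100_000


decline = POP_2000 - POP_2007
share_of_decline = MURDERS_2000_2007 / decline

rate_2000 = per_100k(MURDERS_2000, POP_2000)
# Assumption: the 2008 population is taken to equal the 2007 estimate.
rate_2008 = per_100k(MURDERS_2008, POP_2007)

print(f"Population decline 2000-2007: {decline:,}")
print(f"Murders as a share of the decline: {share_of_decline:.1%}")
print(f"2000 murder rate: {rate_2000:.1f} per 100k")
print(f"2008 murder rate: {rate_2008:.1f} per 100k")
```

If the real 2008 population turns out to be lower than the 2007 estimate, the 2008 rate computed this way is an underestimate.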

This ridiculously high murder rate is made even more tragic because of its disproportionate effect on certain populations of Baltimore.  For example, in 2008 91% of homicide victims were African-American in a city that is only 64% African-American (source: Baltimore Sun).  It's just unbelievable; it works out to a per capita murder rate more than five times as high for African-Americans as for everyone else in 2008!
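That ratio follows directly from the two percentages, without needing the raw counts.  A small sketch, assuming the 91% and 64% figures above are exact (the function name is hypothetical):

```python
def relative_rate(victim_share, population_share):
    """Ratio of one group's per capita rate to the rate for everyone else."""
    group_rate = victim_share / population_share
    others_rate = (1 - victim_share) / (1 - population_share)
    return group_rate / others_rate


# 91% of victims, 64% of the population (figures quoted above).
ratio = relative_rate(victim_share=0.91, population_share=0.64)
print(f"Relative per capita murder rate: {ratio:.1f}x")
```

The total number of murders and the total population cancel out of the ratio, which is why only the two shares are needed.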

Speaking of per capita murder rates, without further ado, here's the chart you've all been waiting for:


With the usual warnings about comparing crime statistics, I'm going to launch into what I think I've learned by looking at that chart.  First of all, Baltimore's murder rate has been fairly constant, even when adjusted for a falling population.  Over this time period we've looked a lot like Detroit, but I wouldn't be surprised to see that relationship loosen up over the next couple years.  Detroit looks to be in for a much rougher economic ride than we are (knock on wood).  It is interesting that both Baltimore and Detroit saw a big dip in murders in 2008 (but St. Louis jumped out of trend).  St. Louis's trend line is a little scary because they saw a big one-year dip earlier this decade, but then jumped right back up to a murder rate in the high 30's.  Washington DC does give us a little hope: they managed to reset to a new, lower murder rate in the middle of this decade.

Other things you might want to know (but are not immediately obvious) - the city of St. Louis has about half Baltimore's population, so you'd expect the data to be a bit noisier, since a single murder moves the per capita rate in St. Louis twice as much as it does in Baltimore (in engineering terms, the murder rate "LSB" is twice as large).  DC is approximately the same size at this point (because Baltimore has lost a lot of population and DC is fairly steady) and Detroit is larger.  However, I'd expect a big upward revision in Detroit's per capita murder rate because they are losing population fast enough that my "re-use the 2007 population estimate" assumption makes the denominator too big.
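To put a number on that step size, here is a rough sketch.  Baltimore's population is the 2007 estimate quoted earlier in this post; the "half-size city" is just an assumed stand-in for St. Louis, not its actual population estimate.

```python
def rate_step_per_murder(population):
    """Change in the per-100,000 murder rate caused by one additional murder."""
    return 100_000 / population


BALTIMORE_POP_2007 = 624_237                 # estimate quoted in this post
HALF_SIZE_CITY_POP = BALTIMORE_POP_2007 / 2  # assumption: stand-in for St. Louis

baltimore_step = rate_step_per_murder(BALTIMORE_POP_2007)
half_size_step = rate_step_per_murder(HALF_SIZE_CITY_POP)

print(f"Baltimore: one murder moves the rate by {baltimore_step:.2f} per 100k")
print(f"Half-size city: one murder moves the rate by {half_size_step:.2f} per 100k")
```

The same function shows why an overstated population (as with the re-used 2007 estimates) understates the rate: a larger denominator shrinks every murder's contribution.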

Finally, it's kind of interesting how rock-steady the national murder rate has been this decade.  It's varied from 5.5 to 5.7 - which is actually way down from the mid-1980's (when it was over 8 for a few years).  When you look at the time-series data from the 1980's to the present you really start to appreciate the drop in violent crime nationwide.  This is probably why your parents worry so much about you living in the city.  Their concept of urban crime was set a generation ago, when it was objectively more dangerous to live anywhere!

As I said in my previous post, 2008 was a good year for Baltimore in terms of murders.  We were 3.28 sigma below the 2000-2007 average for total murders (although, due to the vagaries of population estimates and a falling population we were only 1.62 sigma lower than average for the per capita murder rate).  It'd sure be nice to say that 2008 was statistically significant and that something has really changed in Baltimore - that what the police are doing differently is definitely working - but the first nine days of 2009 sure haven't helped (12 murders plus a shooting victim on life support).
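The year-by-year counts are not listed in this post, so the sigma figure can't be recomputed here, but the totals that are quoted pin down what it implies.  A hedged sketch (the names are illustrative, and it makes no assumption about whether a sample or population standard deviation was used):

```python
# Figures quoted in this post.
MURDERS_2000_2007 = 2_143   # total over the eight years 2000-2007
YEARS = 8
MURDERS_2008 = 234          # unofficial count
SIGMAS_BELOW = 3.28         # as stated above for total murders


def z_score(value, mean, sigma):
    """Number of standard deviations that value sits from the mean."""
    return (value - mean) / sigma


mean_2000_2007 = MURDERS_2000_2007 / YEARS
shortfall = mean_2000_2007 - MURDERS_2008
# Back out the standard deviation implied by the quoted 3.28 sigma.
implied_sigma = shortfall / SIGMAS_BELOW

print(f"2000-2007 average: {mean_2000_2007:.1f} murders per year")
print(f"2008 shortfall: {shortfall:.1f} murders")
print(f"Implied standard deviation: {implied_sigma:.1f} murders per year")
print(f"Check: z = {z_score(MURDERS_2008, mean_2000_2007, implied_sigma):.2f}")
```

With only eight data points behind that standard deviation, the sigma count itself is a fairly noisy estimate.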

The problem here is that for all the good work that the police may have done in 2008, 2-3 very violent months (November to now) could spell real trouble.  Many people have criticized the Baltimore Police for jumping from strategy to strategy.  If murders stay high for another 2 months, even if it's not necessarily out of trend for the last decade, there will be pressure on the police to change tactics yet again.  Those crime rate graphs are powerful tools but as the platitude goes, with great power comes great responsibility.  It is very, very easy to make a graph say whatever you'd like.

Working in a technical field, I see plots of things ALL the time.  Even though your natural instinct is to look at the pretty lines or bars or whatever, you have to train yourself to first look at the axes, at the units, at the title, and then ask whoever made the graph "Where did you get your data?"  Or "How did you calculate x?"  Contrary to popular belief, measured data is not the end-all, be-all of engineering.  Probably about half of all charts of measured data I see have some sort of fatal problem.  That's when you start to see who the really good engineers are.  The good engineers look at a plot of data, even measured data, and if it doesn't match up to what their technical intuition says reality ought to look like they'll just say "I don't believe that, I think there's something wrong with the measurement."  So after arguing about it for half an hour, someone will agree to go back and remeasure.  A bad cal on the measurement equipment, a loose connector, operator misunderstood directions, measured from the wrong datum, measured the wrong channel, measured the wrong serial number, forgot to turn on the coolant, error in the test software, error in the spreadsheet math, test equipment not sensitive enough... the list of things that can go wrong in a measurement is endless.

So what does that have to do with anything?  My point is that we're measuring physical properties at my workplace.  I mean, these are things that are governed by physics, things that can't change.  The most important thing to remember in engineering is that those measurements aren't the truth; if you're lucky you get a picture of what the truth probably looks like.  So I figure that maybe half of all graphs of measured physical data are misleading or do not tell the whole truth (eventually you get there, but it takes a few iterations).  Now we're talking about something that is not governed by a physical, or even rational, process.  We're talking about violent crime and murder, so a short-term spike gets noticed.  Even if it were a truly random process, sometimes you get a clump, and the human brain tries to see patterns in randomness.  But it's hard to adjust your mind to the idea that "Yeah, even if this were random, we could get 12 murders in 9 days"; it's an emotional issue that skips over the logical part of your brain.  People see 12 murders in 9 days and think "Surely this is a trend!  This is a crisis!"  But it doesn't have to be.  It could just be a random coincidence.  Just looking at the graph doesn't tell you the story.
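One way to make the "it could be random" idea concrete is to pretend murders arrive as a Poisson process at 2008's average pace and ask how likely 12 or more are in any particular 9-day stretch.  This is only a sketch under that assumption: real murders are not independent events, the 2008 count is unofficial, and the function name is made up for illustration.

```python
import math


def poisson_tail(k_min, lam):
    """P(X >= k_min) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(k_min))
    return 1.0 - below


# Assumption: a constant daily rate equal to 2008's average (234 murders in 366 days).
DAILY_RATE = 234 / 366
WINDOW_DAYS = 9

expected = DAILY_RATE * WINDOW_DAYS
p_clump = poisson_tail(12, expected)

print(f"Expected murders in {WINDOW_DAYS} days: {expected:.2f}")
print(f"P(12 or more in one specific {WINDOW_DAYS}-day window): {p_clump:.4f}")
```

That probability is for one window picked in advance.  A year contains hundreds of overlapping 9-day windows, so the chance of seeing at least one such clump somewhere in the year is considerably higher than the single-window figure.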

Finally, murders are not caused by a mechanistic, or even rational, process.  So to some extent, saying "We followed X police procedure and saw a Y% reduction in murders" is a worthless statement.  There's a correlation all right, but is there a causal relationship?  Murder is not the kind of thing where you can afford to run an experiment with a control group (although professionals do have advanced statistical methods for evaluating procedures that do a better job of controlling for other variables).  My advice would be to take a look at how other forms of crime change, because some crimes are more rational than others.  I may yet do this, but I'm also afraid that any improvements made by the Baltimore PD might be swamped out by the huge effect of the recession and increased unemployment.

Overall, I sure hope that they give Bealefeld a little while longer to apply his strategies before forcing him to change to something new.  I like a lot of the much-reported ideas (closer work with the state's attorney and US district attorney, sending repeat gun offenders away to federal prison for longer sentences, working harder to prevent domestic violence before it turns into assault/murder, focusing on arrest warrants for violent offenders) and would like to see how they turn out.

Ok, time for one last chart even though I've said that graphs and statistics are often misleading (every engineer can tell you about the time they passed some product because 4 out of 10 tested perfect only to find out later that 5 of the other 6 would have failed).  I mean, despite all their drawbacks, a good plot or graph can sure be interesting.

(On the plot above I am counting murderous acts that occur in that year, so if someone dies of injuries sustained in prior-year violence they are not counted.  Since I have this all in a file now, if someone dies in 2009 from 2008 injuries, I will update my 2008 murder count instead of 2009.  This is the opposite of how Baltimore's PD, and I think the UCR, count murders.)

PS - I spent so much time on this post, I made a PDF version of it that is easier to read.
