A year of Chicago's crime, in 30 seconds

Yesterday Brett Goldstein, the Chief Data Officer for the City of Chicago, announced on Twitter the release of Chicago's crime data for the past year. The data is very detailed, and a wonderful resource for criminologists and social scientists alike.

I have been playing around with the data a bit, and have produced an animation that explores the geospatial nature of the data. Similar to what Mike Dewar did with the Afghanistan War Logs, I wanted to show variation over time rather than simple aggregates. To do so, I decided to plot moving 10-day windows of the data on a map of Chicago's police districts. Moreover, I wanted to show the regional trends of different kinds of crime throughout Chicago.
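For readers who want to try the windowing idea themselves, here is a minimal sketch in Python (the animation itself was built in R, and this is not that code). It assumes each incident is a dictionary with a "date" field and that the window advances one day at a time; the step size, the field name, and both function names are illustrative assumptions rather than details from the data release.

```python
from datetime import date, timedelta


def moving_windows(first_day, last_day, window_days=10, step_days=1):
    """Yield inclusive (start, end) date pairs for each window that fits in the range.

    step_days=1 is an assumption; the post only says the windows "move".
    """
    span = timedelta(days=window_days - 1)
    start = first_day
    while start + span <= last_day:
        yield start, start + span
        start += timedelta(days=step_days)


def incidents_in_window(incidents, start, end):
    """Return the incidents whose (hypothetical) "date" field falls inside the window."""
    return [row for row in incidents if start <= row["date"] <= end]
```

Each (start, end) pair would then correspond to one frame of the animation, drawn from only the incidents that fall inside that window.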

The map below shows these 10-day windows, with the crime types color coded. The boundary lines on the map indicate police districts. Each dot represents a crime, and the opacity of each dot corresponds to the number of crimes of that type reported in that geographic location on that day. For example, a darkly shaded pink dot would indicate an area of heavy theft. Because there are a large number of crime types in the data, I restricted this animation to only the top 18 crime types. These are the crimes for which there are over 1,000 incidents in total.
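The filtering and shading described above can be sketched roughly as follows, again in Python rather than the R and ggplot2 used for the actual animation. The field names ("type", "location", "date"), the function names, the linear mapping from count to opacity, and the 0.2 opacity floor are all assumptions made for illustration; only the 1,000-incident threshold comes from the description above.

```python
from collections import Counter


def frequent_types(incidents, min_total=1000):
    """Return the crime types with more than min_total incidents overall."""
    totals = Counter(row["type"] for row in incidents)
    return {crime for crime, n in totals.items() if n > min_total}


def dot_opacity(incidents, min_alpha=0.2):
    """Map each (type, location, date) group to an opacity in [min_alpha, 1].

    Linear scaling against the largest group is an assumed choice,
    not necessarily how the original plot was shaded.
    """
    counts = Counter(
        (row["type"], row["location"], row["date"]) for row in incidents
    )
    peak = max(counts.values(), default=1)
    return {
        key: min_alpha + (1 - min_alpha) * (n / peak)
        for key, n in counts.items()
    }
```

The floor on opacity is there so that a location with a single report still shows up faintly on the map instead of vanishing.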

This visual technique provides insight into the intensity of various crime types across regions of Chicago. In addition, the timeline below the map highlights the current chronological window being plotted, as well as the total density of crimes for the entire data set. The color codes in this timeline correspond to those on the map.

There is a lot going on here, so it is best viewed in full-screen mode...

For me, there are lots of interesting observations:

  1. There actually appears to be very little variation in both the volume and location of crime. Downtown Chicago is consistently plagued by a high degree of theft, while burglary is much more frequent in the south of Chicago. Also, the density timeline shows very little change in the volume over time.
  2. Crimes are symbiotic. That is, it seems certain types of crimes coexist quite well, such as narcotics and prostitution. This is exemplified by the prominent north-eastern ring.
  3. People do not like to commit crimes in the cold. Conventional wisdom supports this, and Chicago appears to be no different. Though there is very little variation, there is a slight dip in the overall number of crimes from December through February.
  4. If you look closely, you can actually see the formation of roads where crimes occur, particularly those that lead in and out of downtown.

Not being a Chicagoan, I would welcome alternative observations from those with a better understanding of the geography and crime trends.

P.S. The above animation was made entirely with open source tools: R, ggplot2, ImageMagick, and ffmpeg.

Code available here

The things that keep me from blogging

I was recently at a conference where my friend Pete Skomoroch confessed to the audience that he was a "bad blogger," because he had not blogged in several months. Pete has been reasonably distracted as of late, so his absence is completely understandable. His comment, however, caused a sudden wave of guilt to wash over me as I sat in the audience. I am a very bad blogger, as it has been over three weeks since my last post, which admittedly was more of an announcement than an actual blog post.

Though not to the level of Pete, I too have been preoccupied. Over the last several weeks I have worked through some large projects and milestones. Two of these may be of general interest. First, John Myles White and I submitted half of the chapters for our upcoming O'Reilly book Machine Learning for Hackers to our editor. At the risk of sounding self-congratulatory, I am really impressed by what we have managed to pull together thus far and am really looking forward to getting the text out. I think teaching machine learning concepts algorithmically, and motivating each method with a case study, will appeal to a broader audience. There may be a mini-eBook version of the first half of the book out before the full text is complete, so be on the lookout for that announcement.

The second big announcement is that as of the end of this semester I have fulfilled all requirements for my PhD other than the dissertation. In the academic jargon I am now "all but dissertation" (ABD), and am now beginning the final leg of this scholarly adventure. I mention this for two reasons. First, so that you may shower me with the appropriate level of congratulations; and second, because part of my dissertation work will require the use of some new or novel social network data.

My most recent work on modeling network evolution with graph motifs has a serious deficiency: a lack of real data. To move this research from an abstract idea to something that makes a meaningful contribution to the social sciences I need to apply it to real data. Unfortunately, there is a dearth of real dynamic social network data, especially related to terrorist or criminal organizations. As such, I am putting out the call to my readership. If you, or someone you know, is working with a dynamic social network data set, please contact me.

I know from my traffic logs that ZIA gets a fair amount of traffic from both academic and government readers. If any of you have data like this and are interested in working together I would love to chat. I am easy to get in touch with, so please let me know if you are interested.

Finally, I look forward to getting back into a regular blogging routine. There are so many fascinating things happening in the world related to conflict, terrorism and data that it is hard to imagine where to begin. It should be an interesting summer!

Articles to Reconsider in the Wake of bin Laden's Death

For me, the prevailing emotion after learning that Osama bin Laden had been killed was sadness. Certainly not for the man, but his death brought back many of the memories and emotions I felt as an undergraduate ten years ago on the morning of the attacks: how this single event entirely changed the world I lived in, and the direct impact it subsequently had on my life and career.

I also felt a sense of discomfort at the unabashed jubilation. Killing bin Laden is a significant milestone, but not a yardstick of progress. This is a great victory for the Obama White House, the U.S. intelligence community, and the special forces that executed the raid. But will it have any impact on transnational terrorism, or radicalization more generally? Cautious optimism should rule the day, and by way of promoting this perspective, below are five articles that approach this question from many different angles.

If there are others you think should be included please add them in the comments.

My article in IQT Quarterly, "Data Science in the U.S. Intelligence Community"

I was asked to write a short piece for In-Q-Tel's journal, IQT Quarterly. The article attempts to address how the U.S. intelligence community can begin incorporating "data science" into the intelligence cycle, and some of the consequences. The issue was just published, and my article is entitled (appropriately), "Data Science in the U.S. Intelligence Community."

Readers of this blog will note several themes, ideas, and even graphics in the article that I have mentioned in previous posts. But I was very pleased by the sentence the editors decided to draw out in the print version.

Understanding how modeling assumptions impact the interpretations of analytical results is critical to data science, and this is particularly true in the IC.

I welcome your thoughts, especially if you are a practicing intelligence professional.