Analysing police data with Orange 3: making simple maps and predictions

Maps are an essential tool in many domains for plotting and making sense of geo data. Sometimes it is only when we see data on a map that we make connections or recognise the relevance or significance of what our markers represent.

There are lots of ways to make maps, and it has never been easier than it is today; my favourite is a piece of software called Orange 3. Orange 3 uses Python widgets to let the user carry out a whole range of data analysis, from simple constructions like Scatter Plots up to and including Principal Component Analysis (and more), without having to write code. Using a drag-and-drop approach, it lets users build simple workflows, and what follows shows how to read in and analyse some police geo data.

You can download UK police data here; choose your police force or forces and decide which month or months you want to view. Each downloaded file comes in CSV (Comma Separated Values) format, which means you can easily open it in Excel if you want to. Orange 3 gives a quick view of each of the variables and lets the user see the data in a spreadsheet-style data frame.
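For anyone who prefers to peek at the file in code rather than in Excel or Orange 3, here is a minimal sketch using only the Python standard library. The column headings and the two sample rows are assumptions made up for illustration (they follow the layout I would expect from the police download, but check them against your own file), and `load_crimes` is a hypothetical helper name.

```python
import csv
import io

# Made-up two-row sample; the header names are an assumed layout,
# so compare them with the first line of your own downloaded file.
SAMPLE_CSV = """Crime ID,Month,Reported by,Falls within,Longitude,Latitude,Location,LSOA code,LSOA name,Crime type,Last outcome category,Context
abc123,2017-01,Durham Constabulary,Durham Constabulary,-1.5750,54.7760,On or near High Street,E01020000,County Durham 001A,Burglary,Under investigation,
,2017-01,Durham Constabulary,Durham Constabulary,-1.5530,54.5240,On or near Park Lane,E01020001,Darlington 001B,Anti-social behaviour,,
"""


def load_crimes(handle):
    """Read a CSV file-like object into a list of dicts, one dict per row."""
    return list(csv.DictReader(handle))


# With a real download you would pass open("your-file.csv", newline="") instead.
crimes = load_crimes(io.StringIO(SAMPLE_CSV))

print(list(crimes[0].keys()))  # the variables (column names)
print(len(crimes), "rows")
```

Each row comes back as a dictionary keyed by column name, which is roughly the same view of the data that the Orange 3 data table gives you.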

For this blog I chose the crimes reported by Durham Constabulary for the month of January 2017. When we examine the data frame we can see that it contains the crime ID, location, LSOA code, crime type, outcome category and a few other bits and, best of all, a pretty accurate set of latitude and longitude coordinates. The lat and long are not exactly spot on, for reasons of Data Protection, but they are near enough to allow the researcher to make a really good appreciation of what is happening and where. The simplicity of just connecting drag-and-drop widgets is quite astonishing. The filters are also easy to operate: they work on Boolean comparisons but use words rather than symbols for ‘greater than’, ‘equal to’ and so on. The map function is created using ‘leaflet’ and OpenStreetMap; in another post I will show how to do this using ‘leaflet and folium’ in Python code. For now, if you have a list of latitudes and longitudes and want to plot them, there probably is not an easier way for free!
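The filter step described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not what the Orange 3 widget does internally; `select_crimes` is a hypothetical helper, and the column names and sample rows are assumed rather than taken from the real file.

```python
def select_crimes(rows, crime_type):
    """Keep rows of one crime type that have usable coordinates."""
    selected = []
    for row in rows:
        # The word-based filter "Crime type is Burglary" becomes an equality test.
        if row.get("Crime type") != crime_type:
            continue
        try:
            lat = float(row["Latitude"])
            lon = float(row["Longitude"])
        except (KeyError, ValueError):
            continue  # skip rows with missing or malformed coordinates
        selected.append((lat, lon, row.get("Location", "")))
    return selected


# Made-up rows in the assumed column layout.
rows = [
    {"Crime type": "Burglary", "Latitude": "54.7760",
     "Longitude": "-1.5750", "Location": "On or near High Street"},
    {"Crime type": "Burglary", "Latitude": "",
     "Longitude": "", "Location": "No Location"},
    {"Crime type": "Vehicle crime", "Latitude": "54.5240",
     "Longitude": "-1.5530", "Location": "On or near Park Lane"},
]

burglaries = select_crimes(rows, "Burglary")
print(burglaries)
```

The result is a list of (latitude, longitude, label) tuples, which is the sort of input a mapping library would need in order to place one marker per crime.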

We can easily change the parameters to show all crimes, or different ones. You will notice that the map begins with clusters; when the user clicks on a cluster it breaks up to show where each of the offences happened or, to be precise, a really good approximation of where they happened.
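To get a feel for why markers merge into clusters when zoomed out and separate when zoomed in, here is a rough sketch that groups points into grid cells of a chosen size. This is a simplification I have swapped in for illustration and is not the clustering method the map itself uses; `grid_clusters` and the sample coordinates are made up.

```python
from collections import defaultdict
import math


def grid_clusters(points, cell_degrees):
    """Group (lat, lon) points into square cells of the given size in degrees."""
    cells = defaultdict(list)
    for lat, lon in points:
        # Points whose coordinates fall in the same cell share a key.
        key = (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees))
        cells[key].append((lat, lon))
    return dict(cells)


# Made-up points: two close together, one further away.
points = [(54.7761, -1.5752), (54.7768, -1.5741), (54.5243, -1.5531)]

zoomed_out = grid_clusters(points, 1.0)    # large cells, like a low zoom level
zoomed_in = grid_clusters(points, 0.001)   # small cells, like a high zoom level

print(len(zoomed_out), "cluster(s) when zoomed out")
print(len(zoomed_in), "cluster(s) when zoomed in")
```

With large cells all three points land in one cluster; with small cells each point sits in its own cell, which mirrors the way the clusters on the map break up as you click into them.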

It would be wrong to show the precise geo location of crimes such as sexual assault, for example, especially if they occurred inside a dwelling, so a cleaning process is applied before these data are released.

We can actually go a little bit further and run an algorithm to help predict where certain types of offences are likely to occur. I have only downloaded one month for this blog, but you will get the idea nonetheless. We can zoom in and get some sense of where we are likely to see crimes of the types we have selected. Generally, the more data we have, the better the prediction.
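As a rough idea of what predicting a crime type from a location can look like in code, here is a minimal k-nearest-neighbours sketch. This post does not say which algorithm sits behind the Orange 3 map, so treat this as my own stand-in rather than the widget's method; `predict_crime_type` and the training points are made up, and distances are measured directly in degrees, which is only a reasonable approximation over a small area.

```python
from collections import Counter
import math


def predict_crime_type(training, lat, lon, k=3):
    """Return the most common crime type among the k nearest recorded crimes."""
    nearest = sorted(
        training,
        key=lambda rec: math.hypot(rec[0] - lat, rec[1] - lon),
    )[:k]
    votes = Counter(crime_type for _, _, crime_type in nearest)
    return votes.most_common(1)[0][0]


# Made-up (lat, lon, crime type) records: a northern and a southern group.
training = [
    (54.776, -1.575, "Burglary"),
    (54.777, -1.574, "Burglary"),
    (54.775, -1.576, "Vehicle crime"),
    (54.524, -1.553, "Vehicle crime"),
    (54.525, -1.554, "Vehicle crime"),
    (54.523, -1.552, "Burglary"),
]

north_guess = predict_crime_type(training, 54.7765, -1.5745)
south_guess = predict_crime_type(training, 54.5245, -1.5535)
print(north_guess, "/", south_guess)
```

With only six records the guesses are crude, which is the same point as above: one month of data gives you the idea, and several months would give the neighbours' vote far more to work with.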

Hope this helps anyone wanting to plot data containing latitude and longitude coordinates. If you want to know more then have a look at the following

Bye for now

Dr Mark Butler, Senior Lecturer in Crime Scene Science and Course Leader for MSc in Crime Intelligence & Data Analytics at Teesside University