Analysing police data with Orange 3: making simple maps and predictions

In so many domains, maps are an essential tool for plotting and making sense of geo data. Sometimes it is only when we see data on a map that we make connections, or recognise the relevance or significance of what our markers represent.

There are lots of ways to make maps and it has never been easier than it is today; my favourite is a piece of software called Orange 3. Orange 3 uses Python widgets to let the user do a whole range of data analysis, from simple constructions like scatter plots up to and including Principal Component Analysis (and more), without having to write code. Using a drag and drop methodology, it allows users to create simple workflows, and the rest of this post covers how we read in and analyse some police geo data.

UK police data can be downloaded from https://data.police.uk/data/ where you choose your police force or forces and decide which month or months you want to view. Each downloaded file comes in CSV (Comma Separated Values) format, which means you can easily open it in Excel if you want to. Orange 3 gives a quick view of each of the variables and allows the user to see the data in a spreadsheet-style data frame.
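If you would like to peek at the same kind of file in code rather than in Orange 3 or Excel, here is a minimal sketch using only the Python standard library. The sample rows are made up, and the column names are my assumption of how the police.uk download is laid out, so check them against your own file.

```python
import csv
import io
from collections import Counter

# A tiny made-up sample; the column names are assumed to match the
# police.uk street-level CSV download and may differ in your file.
SAMPLE_CSV = """Crime ID,Month,Reported by,Longitude,Latitude,Location,LSOA code,Crime type,Last outcome category
a1,2017-01,Durham Constabulary,-1.5750,54.7760,On or near High Street,E01000001,Burglary,Under investigation
a2,2017-01,Durham Constabulary,-1.5800,54.7790,On or near Station Road,E01000002,Vehicle crime,Under investigation
a3,2017-01,Durham Constabulary,-1.5600,54.7700,On or near Park Lane,E01000003,Burglary,Under investigation
"""


def load_crimes(handle):
    """Read crime rows from an open CSV file handle into a list of dicts."""
    return list(csv.DictReader(handle))


def count_by_type(rows):
    """Count how many rows there are for each crime type."""
    return Counter(row["Crime type"] for row in rows)


# For a real download, pass open("your-file.csv", newline="") instead
# of the in-memory sample used here.
crimes = load_crimes(io.StringIO(SAMPLE_CSV))
counts = count_by_type(crimes)

print(list(crimes[0].keys()))  # the variables (column names)
print(counts.most_common())    # crime types, most frequent first
```

Every value comes back as text, so the latitude and longitude still need converting to numbers before they can be plotted.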

For this blog I chose the crimes reported by Durham Constabulary for January 2017. When we examine the data frame we can see that it contains the crime ID, location, LSOA code, crime type, outcome category and a few other bits and, best of all, a pretty accurate set of latitude and longitude coordinates. The lat and long are not exactly spot on, for reasons of data protection, but they are near enough to allow the researcher to make a really good appreciation of what is happening and where.

The simplicity of just connecting drag and drop widgets is quite astonishing. The filters are also easy to operate: they work on Boolean conditions, but use words rather than symbols for ‘greater than’, ‘equal to’ and so on. The map function is built on ‘leaflet’ and OpenStreetMap; in another post I will show how to do the same thing using ‘leaflet’ and ‘folium’ in Python code. For now, if you have a list of latitudes and longitudes and want to plot them, there probably is not an easier way for free!
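To give a feel for what those word-based filters are doing behind the scenes, here is a rough sketch in plain Python. The function name `filter_crimes` and its parameters are my own invention for illustration and are not part of Orange 3; the rows are made up.

```python
# Made-up rows in the shape produced by csv.DictReader (all values are text).
ROWS = [
    {"Crime type": "Burglary", "Latitude": "54.7760", "Longitude": "-1.5750"},
    {"Crime type": "Vehicle crime", "Latitude": "54.7790", "Longitude": "-1.5800"},
    {"Crime type": "Burglary", "Latitude": "54.5200", "Longitude": "-1.5500"},
    {"Crime type": "Burglary", "Latitude": "", "Longitude": ""},
]


def to_float(text):
    """Convert a coordinate string to a float; return None if blank or invalid."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def filter_crimes(rows, crime_type=None, lat_greater_than=None):
    """Keep rows that have coordinates and satisfy every condition given."""
    kept = []
    for row in rows:
        lat = to_float(row.get("Latitude"))
        lon = to_float(row.get("Longitude"))
        if lat is None or lon is None:
            continue  # no coordinates, so the row cannot be plotted
        if crime_type is not None and row["Crime type"] != crime_type:
            continue  # the 'equal to' condition failed
        if lat_greater_than is not None and not lat > lat_greater_than:
            continue  # the 'greater than' condition failed
        kept.append({**row, "Latitude": lat, "Longitude": lon})
    return kept


# 'Crime type equal to Burglary' AND 'Latitude greater than 54.7'
burglaries_north = filter_crimes(ROWS, crime_type="Burglary", lat_greater_than=54.7)
print(burglaries_north)
```

The rows that survive carry numeric coordinates, which is the form any mapping tool will want.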

We can easily change the parameters to show us all crimes, or different ones. You will notice that the map begins with clusters and when the user clicks on these clusters they break up and show where each of the offences happened, or again being precise, they show a really good approximation of where these offences happened.
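The way clusters break up as you zoom can be imitated with a very simple grid. This is only an illustration of the idea, assuming square cells measured in degrees; it is not the clustering method the map widget itself uses, and the points are made up.

```python
import math
from collections import defaultdict

# Made-up (latitude, longitude) points.
POINTS = [
    (54.7761, -1.5752),
    (54.7768, -1.5741),
    (54.7793, -1.5802),
    (54.5201, -1.5503),
]


def cluster(points, cell_size):
    """Group points into square grid cells that are cell_size degrees wide."""
    cells = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[key].append((lat, lon))
    return cells


# A large cell behaves like being zoomed out; a small cell like being zoomed in.
for cell_size in (1.0, 0.1, 0.005):
    groups = cluster(POINTS, cell_size)
    sizes = sorted((len(members) for members in groups.values()), reverse=True)
    print(cell_size, "->", len(groups), "cluster(s) of sizes", sizes)
```

With these four points the coarse grid gives a single cluster, the middle grid gives two, and the fine grid shows each offence on its own.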

It would be wrong to show the precise geo location of crimes such as sexual assault, for example, especially if these occurred inside a dwelling, so a cleaning process is applied before these data are released.
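I have not described the exact cleaning process here, but one simple way to blur a location is to move each point to the nearest of a fixed list of anchor points. The sketch below assumes that approach purely for illustration: the anchor list is hypothetical, and the distance is a crude straight-line measure in degrees rather than a proper distance on the ground.

```python
import math

# Hypothetical anchor points; a real list would be far longer.
ANCHORS = [
    (54.7750, -1.5750),
    (54.7800, -1.5800),
    (54.5200, -1.5500),
]


def snap_to_anchor(lat, lon, anchors):
    """Return the anchor point closest to (lat, lon)."""
    return min(anchors, key=lambda a: math.hypot(a[0] - lat, a[1] - lon))


true_location = (54.7761, -1.5752)
released_location = snap_to_anchor(*true_location, ANCHORS)
print(true_location, "is released as", released_location)
```

Several different true locations end up sharing one released location, which is why the markers are a good approximation rather than an exact position.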

We can actually go a little bit further and run an algorithm to help predict where certain types of offences would be. Now, I have only downloaded one month for this blog, but you will get the idea nonetheless. We can zoom in and get some sense of where we are likely to see crimes of the types we have selected. The more data we have, the better the prediction.
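I have not said which algorithm sits behind the prediction, so as a stand-in here is about the simplest one I can think of: a k-nearest-neighbours vote on latitude and longitude. Treat it as a sketch of the general idea with made-up training points, not as what Orange 3 runs.

```python
import math
from collections import Counter

# Made-up training data: ((latitude, longitude), crime type).
TRAINING = [
    ((54.776, -1.575), "Burglary"),
    ((54.777, -1.574), "Burglary"),
    ((54.775, -1.576), "Burglary"),
    ((54.520, -1.550), "Vehicle crime"),
    ((54.521, -1.551), "Vehicle crime"),
    ((54.519, -1.549), "Vehicle crime"),
]


def predict_crime_type(lat, lon, training, k=3):
    """Predict a crime type from the majority vote of the k nearest known crimes."""
    nearest = sorted(
        training,
        key=lambda item: math.hypot(item[0][0] - lat, item[0][1] - lon),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(predict_crime_type(54.7765, -1.5745, TRAINING))
print(predict_crime_type(54.5205, -1.5505, TRAINING))
```

With only six points this is a toy, which is exactly the point about data volume: the more known offences there are around a location, the more the vote is worth.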

Hope this helps anyone wanting to plot data containing latitude and longitude coordinates. If you want to know more then have a look at the following https://docs.orange.biolab.si/3/visual-programming/widgets/visualize/geomap.html

Bye for now

Dr Mark Butler Senior Lecturer in Crime Scene Science and Course Leader for MSc in Crime Intelligence & Data Analytics at Teesside University

Some thoughts on Big Data in Policing

In preparation for my MSc teaching next semester I have been reading articles on many of the topics we will be discussing in class. One of my favourite papers so far is by Chan and Moses, “Is Big Data challenging criminology?”

The opening premise is simple and thought provoking, and it connects very strongly to one of my other areas of interest, ‘expertise’, specifically within a crime analysis context, be that the Crime Scene Investigator or the Crime Intelligence Analyst. Anyway, the point Chan and Moses debate is this: at what point will we need specialists? That is to say, experts. Will they be valued in future policing? Now, there is much in the press about direct entry to being a Detective, and direct entry to senior ranks for those largely with specialist skills, but let us think about what experts do. If we need experts to help us navigate paths where we don’t have enough information, and accept that experts help us ‘fill the gaps’, what happens when we no longer need to worry about having only some of the information? Perhaps there will be no need to rely on people with intuition and experience, or at the very least to value them. Why? Because we won’t just have a sample of the data anymore, we will have all of it! Or at least the decision makers will.

My favourite quote that Chan and Moses capture is from Mayer-Schönberger and Cukier (2013): “…To be sure, subject-area experts won’t die out. But their supremacy will ebb. From now on, they must share the podium with the big-data geeks…” In a future blog I may come back to this article in more depth, but for now it raises an interesting point for policing in general: to what extent, in the near future, will we rely on or need experience? Will the future be a soup of algorithms that mine all the cases someone, scratch that, everyone has ever worked, with a result as the output? If this is the case, and I should say very quickly that not everyone believes this to be true, it nevertheless raises the point about data and the skills people need to use it effectively. Chan and Moses then go on to discuss the work of Uprichard (2013), who states that sociologists, some of the hardest hit, must fight back by improving their ‘quantitative skills’. Perhaps a point for any Crime Intelligence Analyst to think about.

In this paper Steadman (2013) is cited as suggesting that, well, frankly not everything will be ‘codified’; not everything can or perhaps will be measured, and even if it is, someone will still be required to comment on the result.

The following quote by Chan and Moses is important because it begins to show how analytics has begun to invade/help (in the interests of balance I will leave it to you to decide which) a wide variety of policing roles.

“The relevance of these debates is made more urgent by the increasing popularity of the use of data analytic software for ‘predictive policing’ (Bond-Graham and Winston, 2013; Perry et al., 2013; Uchida, 2013) and decisions about bail and parole (Berk and Bleich, 2013; Bennett Moses and Chan, 2014). It is important not only to develop a clear counter-argument to the widely cited arguments of Anderson (2008) and Mayer-Schönberger and Cukier (2013), but also to articulate the limits of ‘criminal justice forecasting’ as a rational basis for making strategic choices in law enforcement or for policy-making more broadly (cf. Berk and Bleich, 2013).”

I will write a further blog later on what they found out, but if you can’t wait till then, the link below is where you can read the paper in full. If nothing else, I hope it has stirred you to think more about Big Data and how it is or could be influencing your own work: where the benefits are as well as the weaknesses. And if you are an Analyst, just what sort of skills do you need to help you use data? In future blogs I hope to explore some simple code to do some data mining, but for now, ponder the philosophical issue above.

http://journals.sagepub.com/doi/abs/10.1177/1362480615586614 [accessed 26th July 2017]

Bye for now. Regards Mark

Dr Mark Butler course leader for Crime Intelligence and Data Analytics

https://www.tees.ac.uk/postgraduate_courses/Forensic_Science/PgDip_MSc_Crime_Intelligence_and_Data_Analytics.cfm

Here is a list of papers I mentioned when I discussed the points debated by Chan and Moses. This will give you the chance to explore in more detail anyone that got a mention 🙂

Anderson C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired Magazine, 23 June. Available at: http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory (accessed 17 July 2014).

Bennett Moses L and Chan J (2014) Using Big Data for legal and law enforcement decisions: Testing the new tools. UNSW Law Journal 37(2): 643–678.

Berk R and Bleich J (2013) Statistical procedures for forecasting criminal behavior. Criminology & Public Policy 12(3): 513–544.

Bond-Graham D and Winston A (2013) All tomorrow’s crimes: The future of policing looks a lot like good branding. SF Weekly, 30 October. Available at: http://www.sfweekly.com/sanfrancisco/all-tomorrows-crimes-the-future-of-policing-looks-a-lot-like-good-branding/Content?oid=2827968&showFullText=true (accessed 15 April 2015).

Mayer-Schönberger V and Cukier K (2013) Big Data: A Revolution That Will Transform How We Live, Work and Think. London: John Murray.

Perry WL, McInnis B, Price CC, et al. (2013) Predictive Policing: The Role of Crime Forecasting in Law Enforcement Operations. Rand Corporation. Available at: www.rand.org (accessed 17 December 2014).

Steadman I (2013) Big Data and the death of the theorist. Wired, 25 January. Available at: http://www.wired.co.uk/news/archive/2013-01/25/big-data-end-of-theory (accessed 17 July 2014).

Uchida CD (2013) Predictive policing. In: Bruinsma G and Weisburd D (eds) Encyclopedia of Criminology and Criminal Justice. New York: Springer, 3871–3880.

Uprichard E (2013) Focus: Big Data, little questions? Discover Society, 1 October. Available at: http://www.discoversociety.org/2013/10/01/focus-big-data-little-questions/ (accessed 17 July 2014).

Orange3 data analysis

In my practicals I am often asked questions about SPSS; I show examples of data mining, creating charts for a visual understanding of the data, correlations and more. I am also asked by my 3rd year students doing research projects about t-tests or more frequently answering questions like “My supervisor says I need to do some stats on this work can you help” :/

I like how SPSS has become drag and drop over the years and has an interface I can work quickly with. But this blog is about another piece of software that, I’m ashamed to say, I stumbled across whilst I downloaded Anaconda (from Continuum) and it appeared as an option along with Jupyter Notebook, Spyder, RStudio and a few more. Recently, when marking calmed down here at Teesside Uni, I clicked launch on the Orange3 icon and was gobsmacked at the functionality Orange3 has, especially for the analyst or a student studying crime science. I quickly scoped out YouTube and found online tutorials covering everything from text mining a corpus of documents and using API keys with big-name newspapers and Twitter to analysing geo data and much more. I have still to dig into the machine learning opportunities, and these look amazing.

In fact it really should be a must for any Intelligence Analyst to at least explore. I am currently writing up a few of my findings, but it will already be a feature in our new MSc as a tool for students to use. I will still use SPSS, and I’ve discovered that for a data scientist/analyst it’s not about the next new data mining toy, but instead about picking the tools you need or work best with.

I’ll update you more in the coming days and weeks. In the meantime, here is the link: https://orange.biolab.si/#Orange-Features

Bye for now

Regards Mark

Dr Mark Butler Course Leader Crime Intelligence and Data Analytics

What will this blog be about: Future Posts

I have explored writing blogs before and to be honest I ran out of steam rather quickly, and I perhaps took it a bit too seriously too :/

However I decided to take a different approach this time and will aim to upload a series of posts that will explore the skills that I hope students and analysts of crime/forensic science or criminology might find useful.

In the past few months I have been scoping out coding and a fantastic piece of software called Orange3. I will include in the coming posts how to do some text mining, access to API keys for newspapers such as the Guardian and the New York Times and much more.

Regards Mark

Dr Mark Butler

http://www.tees.ac.uk/prospectus/pg/PG_course.cfm?courseid=2879&fos=22&fossub=50#coursecontent