This is a guest blog post by Dr Mark Butler, Senior Lecturer and Course Leader for the MSc Intelligence and Data Analytics in the School of Science, Engineering & Design. He currently teaches crime scene examination techniques as well as intelligence analysis.
Current approach for learners doing Data Analysis at Teesside University
It seems everyone in learning and teaching is aware that data is becoming bigger, more complex and unquestionably more accessible. The question, perhaps, is how to make sense of it all.
Historically, academics and students have had access to rich interfaces for analysing and deriving meaning from data, such as Minitab, IBM SPSS, MATLAB, NVivo and more. These sit on top of generic options such as MS Excel and Google Sheets. These giants of data processing, statistical analysis and qualitative thematic coding are not the only option, however, and there is good reason to explore other packages.
Orange (https://orange.biolab.si/) has certainly been my go-to option for the last two years, with everything from machine learning to text mining, all for free. A further step towards being digitally empowered is the opportunity to move a class of learners out of an IT suite, a potentially liberating experience, with the flexibility of being able to use the Apple iPad for data science.
Microsoft Azure Notebooks for Data Science
Microsoft Azure Notebooks (https://notebooks.azure.com/) offers, for free, the opportunity to process and visualise data using an array of languages such as Python (https://www.python.org/) and R (https://www.r-project.org/). Essentially, with some coding knowledge it is possible to carry out quite detailed analysis.
The service is new and under active development, which brings the potential for disruption. However, I have used it for over six months and have not found this to be a concern.
Data Science Case Study: using Python, pandas, seaborn and a mapping library called folium on the iPads for Level 3 and Level 7 learners
Microsoft Azure Notebooks uses Jupyter Notebooks (https://jupyter.org/), which some of you will be familiar with from the Anaconda platform (https://www.anaconda.com/distribution/): a free, sophisticated coding editor that allows the user to code and analyse in the same environment. Because cells can also be written in Markdown, it is possible to create a narrative about the data, so that the data, its analysis and the data story are all in the same place.
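To make the idea of narrative and code living side by side concrete, here is a minimal sketch of what a notebook file looks like underneath: a JSON document holding a list of cells, each either Markdown or code. The file name and the cell contents are hypothetical and not taken from my worksheets; the structure follows the published Jupyter notebook format (version 4).

```python
import json

# Hypothetical worksheet content, invented for illustration only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            # A Markdown cell carries the narrative (the "data story").
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Worksheet 1\n", "This cell explains what the data shows."],
        },
        {
            # A code cell carries the analysis; outputs are filled in when it runs.
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["total = sum([3, 5, 8])\n", "print(total)"],
        },
    ],
}

# Write the notebook to disk; the file name is an assumption.
with open("worksheet_sketch.ipynb", "w", encoding="utf-8") as f:
    json.dump(notebook, f, indent=1)

cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

In practice nobody writes this file by hand, since the notebook interface does it for you, but it shows why the story and the code travel together as one document.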
Library packages of your choice can also be installed in the cloud, doing away with the need for administrator-level access. In my experience, the visualisations are also better than those produced by many of the software giants I discussed earlier. For now, Microsoft Azure Notebooks is free, does not penalise users on speed, and is accessible with your student/staff email account and student/staff password. With a small amount of coding knowledge, the iPads offer good prospects for data science work.
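As a rough illustration of installing a package without administrator rights, the sketch below checks whether a package can be imported and, if not, asks pip to install it into the user's own area. The helper name `ensure_package` is hypothetical, and the `--user` flag is an assumption about the environment: it is refused inside a virtual environment, and a hosted notebook service may handle installs differently.

```python
import importlib.util
import subprocess
import sys


def ensure_package(name: str) -> bool:
    """Return True if `name` was already importable; otherwise attempt a user-level install.

    Hypothetical helper. It assumes the import name and the pip package name match,
    which is true for folium and seaborn but not for every package.
    """
    if importlib.util.find_spec(name) is not None:
        return True
    # --user installs into the account's own site-packages, so no admin rights are needed.
    subprocess.check_call([sys.executable, "-m", "pip", "install", "--user", name])
    return False


# "json" is part of the standard library, so this call installs nothing.
already_there = ensure_package("json")
print(already_there)
```

Inside a notebook cell, the more common shorthand is a single line such as `!pip install folium`.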
I have tried this in lessons with Level 3 students working through worksheets. At Level 7, the benefit has been unexpected, in that it has made learning more accessible. Some learners did not have laptops with sufficient power or storage to download Anaconda, and instead managed their learning on Chromebooks or mobile devices such as the iPad.
Having a cloud-based application that ran code in a format mimicking the one on campus allowed these MSc learners to keep pace with others, since all they now required was a device (any device) and access to Wi-Fi. The processing is done remotely, so the device becomes a window through which to view results and send instructions.
Screenshot examples from a series of worksheets learners engaged with
Here are some images: each plot, graph or map can be saved separately, and any interactive charts can be saved as an HTML version keeping that interactive element alive. Jupyter Notebook also has the ability to make slides from the notebook; raw code can be hidden leaving the viewer to simply pay attention to the narrative and charts.