Data Analysis

In this exercise we will log in to the website to which we have streamed the data, retrieve the data and start analysing it.


  • Go to one of the web pages below, and click 'sign in'. Use the following details to sign in:

  • Click on Channels, then My Channels. You will see a list of the web pages on which the data is stored. Click on the Public tab for each of the groups to see the full dataset that has been collected. Note that it might not look quite as you were expecting.
  • Start thinking about the data, asking yourself questions like:

  • What are the differences between the average values at each site?

  • How has each site responded to the day-night cycle? Are there any striking differences?

  • Is each dataset complete? If not, what does this tell you about the nature of collecting real-world data?

  • Does the data vary smoothly through time, or are there any 'spikes'? What might explain these?

  • Can you relate any of the differences or features you have thought about in the above to the different locations/environments that each sensor was deployed in?
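
Once the data has been exported as CSV files (the Download step further down), some of these questions can also be checked with a short script. The following is only a rough sketch, not part of the original exercise. It assumes each exported file has a time column called created_at (text such as "2016-07-01 10:00:00 UTC") and a temperature column called field1; these column names, the function names, the gap and jump thresholds, and the example values are all assumptions made for illustration, so adjust them to match your own files.

```python
import csv
import io
from datetime import datetime, timedelta
from statistics import mean


def parse_time(text):
    """Turn '2016-07-01 10:00:00 UTC' into a datetime by dropping ' UTC'."""
    return datetime.strptime(text.replace(" UTC", ""), "%Y-%m-%d %H:%M:%S")


def read_site(csv_file):
    """Return a time-sorted list of (time, temperature) pairs from one export.

    Assumes columns named 'created_at' and 'field1' (hypothetical names).
    """
    points = []
    for row in csv.DictReader(csv_file):
        value = (row.get("field1") or "").strip()
        if value == "":  # skip rows that have no temperature reading
            continue
        points.append((parse_time(row["created_at"]), float(value)))
    return sorted(points)


def summarise(points, max_gap=timedelta(minutes=10), max_jump=2.0):
    """Report the average, any long gaps, and any sudden jumps ('spikes')."""
    temps = [temp for _, temp in points]
    pairs = list(zip(points, points[1:]))  # each reading with the next one
    gaps = [(a[0], b[0]) for a, b in pairs if b[0] - a[0] > max_gap]
    spikes = [b[0] for a, b in pairs if abs(b[1] - a[1]) > max_jump]
    return {"mean": mean(temps), "gaps": gaps, "spikes": spikes}


# Made-up readings standing in for one downloaded file.
example = io.StringIO(
    "created_at,entry_id,field1\n"
    "2016-07-01 10:00:00 UTC,1,20.0\n"
    "2016-07-01 10:05:00 UTC,2,20.5\n"
    "2016-07-01 10:40:00 UTC,3,27.0\n"
)
summary = summarise(read_site(example))
print("average:", summary["mean"])
print("gaps:", summary["gaps"])
print("spikes:", summary["spikes"])
```

To use this on a real download, replace the made-up example with something like open("group1.csv", newline="") (a hypothetical file name). Comparing the averages, gaps and spikes between sites is then a matter of running summarise once per file.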


  • Now click on the Data Import/Export tab, then below the heading Export, click Download. Repeat this for each of the web pages, so that you have the data from each group.

  • Next open this data up in Microsoft Excel, and copy the data from each group into a single spreadsheet.

  • You will see that Excel does not initially recognise the time information as time. This is because it contains the text UTC. To get rid of this unwanted text, do a find and replace: find ' UTC' (with a space in front of it) and replace it with an empty field, i.e. with nothing.

  • Now try to produce a chart that contains each of the time series on a single plot.
  • Hint: first produce an empty scatter plot (NOT a line graph), then add each dataset one by one, making sure that the time information is on the x-axis and the temperature data is on the y-axis.

  • Now look back at the questions above - can you improve on your answers now that you see all of the data together?
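
The Excel steps above (removing ' UTC', then plotting every dataset against time on one chart) could also be sketched in Python. This is an optional illustration rather than part of the exercise: it assumes the matplotlib library is installed, that timestamps look like "2016-07-01 10:00:00 UTC", and it uses made-up site names and readings. The function names to_points and plot_sites are hypothetical.

```python
from datetime import datetime


def to_points(rows):
    """Split (timestamp text, temperature text) rows into two lists.

    The ' UTC' suffix is removed first, just like the find and replace in Excel.
    """
    times = [
        datetime.strptime(stamp.replace(" UTC", ""), "%Y-%m-%d %H:%M:%S")
        for stamp, _ in rows
    ]
    temps = [float(temp) for _, temp in rows]
    return times, temps


def plot_sites(sites, out_path):
    """Draw every site's readings as points on one shared time axis."""
    # Imported here so that to_points can be used without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file, so no window is needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for name, rows in sites.items():
        times, temps = to_points(rows)
        # "." draws markers only, which is the equivalent of a scatter plot:
        # each point sits at its true time rather than at evenly spaced slots.
        ax.plot(times, temps, ".", label=name)
    ax.set_xlabel("Time (UTC)")
    ax.set_ylabel("Temperature")
    ax.legend()
    fig.autofmt_xdate()  # tilt the time labels so they do not overlap
    fig.savefig(out_path)
    plt.close(fig)


# Made-up readings for two imaginary sites.
sites = {
    "Site A": [("2016-07-01 10:00:00 UTC", "20.0"),
               ("2016-07-01 10:05:00 UTC", "20.5")],
    "Site B": [("2016-07-01 10:02:00 UTC", "18.0"),
               ("2016-07-01 10:07:00 UTC", "18.4")],
}
```

Calling plot_sites(sites, "temperatures.png") would then write the chart to an image file. Using real times on the x-axis matters for the same reason as the scatter plot hint: the sites did not all record at exactly the same moments.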


  • 2016 Results!


    Well done everyone, you've collected some really exciting data! Below you will find a copy of the summary plot showing the temperature records compiled from each of the sites:


    [Figure: summary plot of the temperature records from each site]

    Optional Homework Exercise

    The aim of the exercises so far has been to show you what computers can do when you are happy to 'hack' them about, and to make you think about the role that collecting data plays within scientific research. What we have not done is really dig into the computer code behind all of this (although you have seen it, and made a simple edit).

    An optional homework exercise allows you to start learning about basic computer coding (if this is something new to you). You can do this using a program called 'Scratch' (https://scratch.mit.edu/projects/editor/).

    The exercises we want you to work through can be found here: http://scratched.gse.harvard.edu/resources/22-scratch-worksheets

    The material in these is perhaps a little patronising in places, but it starts from the very basics, so it is a fantastic resource.

    Once you have completed all of these, take a look at 'generating_genius.py', the script that we gave you to measure temperature data and upload it to the web. Don't worry if a few steps in the worksheets don't make sense: if you are confused, just have a really careful think, then move on to the next step. You can view or download the script here: http://people.exeter.ac.uk/ph290/ftp/generating_genius.py

    A lot of what is going on in this script will not make much sense at first, but try your very best to work out what is going on in each step. Can you see any features that look like what you saw when using Scratch, perhaps loops and variables? Now try to write down, in a list of bullet points, what you think the program is doing at each stage,