7. Project

7.1 Sample Project

In this chapter a small project is developed. The project brings together many of the techniques and ideas learned up to this point and illustrates the structure of the term project for this class. The sample presented here shows the form of a good project but is more modest in scope and depth than a really good course project would be. The major components of the project are:
  • Data Stream: A description of the nature, structure, and content of the big data stream
  • Exploratory Questions: What questions are to be answered by the computational analysis of the data stream? Why are the answers to these questions important? What audience(s) would be interested in the answers to these questions?
  • Limitations: A list of factors that limit the generality of the possible conclusions due to the nature of the data or the method of analysis
  • Program Development: A presentation of the algorithms and code developed to perform the computational analysis
  • Visualizations: the informative displays of the output generated by the computational analysis of the data stream
  • Conclusions: the answers to the exploratory questions and the implications of these answers
  • Social Impacts: the identification of stakeholders and the benefits and harms that might affect these stakeholders
  • Acknowledgements: a recognition of the individuals who have contributed in some way to the development of this project
Each of these eight components will be illustrated in the following sections of this chapter.

7.1.1 Data Stream

This project uses the Earthquake data stream. This data stream, made publicly available by the U.S. Geological Survey, contains a record of the earthquakes reported throughout the world. Each earthquake is abstracted as a set of properties; the data stream has more properties than are used in this project. The Earthquake data stream is accessible from Python through the earthquakes.py module. This module contains a function get_report that returns information about recent earthquakes based on two parameters: the time period of the reported earthquakes and the threshold of severity level. The time period may be one of “hour”, “day”, “week”, or “month”. The threshold specifies the range of magnitudes included in the report and may be one of “significant”, “all”, “4.5”, “2.5”, or “1.0”. The Python code used to obtain the data stream for this project is as follows:
import earthquakes
quakes = earthquakes.get_report('month','all')
This stream consists of all earthquakes of any severity reported in the last month. A map of a complex data structure can be made using the Variable explorer window in Spyder. The map is a pictorial representation that outlines the structure of the data stream and gives guidance on how to access the parts of the data that are of interest. A portion of the Spyder window is shown in the figure below. Properties of various types are defined in the editor window. After the Run button is pushed, the display in the Variable explorer window appears as shown. Each of the properties used in the code has an entry in the Variable explorer.
../_images/Python-Spyder-Variable-Explorer1.png

The Variable Explorer in Spyder

Each property shown in the Variable explorer window has four columns. The Name column gives the name of the property. The Type column shows what kind of value the property has. The names used in the Type column are summarized in the following table. Notice that the property name is a str (character string), whole_number is an int, and number is a float.
Type field    Meaning
dict          a dictionary accessed by keywords
list          a list structure accessed by position
str           a character string
float         a number with a decimal point
int           a whole number (without a decimal point)
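
The names in the Type column are the same names that Python itself reports for a value. The following is a minimal sketch of that correspondence. The variable names follow the ones used in the text, but the values are made up for illustration (the real values appear only in the figure), and type_name is a hypothetical helper, not part of Spyder.

```python
# Illustrative values only: the text gives the variable names and their
# types, but the actual values are assumed here.
name = "earthquake"        # str: a character string
whole_number = 7           # int: a whole number
number = 4.5               # float: a number with a decimal point
number_list = [1, 2, 3]    # list: accessed by position
weather = {"humidity": 20} # dict: accessed by keyword


def type_name(value):
    """Return the short type name, as it would appear in the Type column."""
    return type(value).__name__


for label, value in [
    ("name", name),
    ("whole_number", whole_number),
    ("number", number),
    ("number_list", number_list),
    ("weather", weather),
]:
    print(label, type_name(value))
```

Running the sketch prints one line per variable, pairing each name with one of the five type names from the table.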
The Size column in the Variable explorer is 1 for simple types (numbers and strings) because these are single values. The size is not the number of characters in the character string or the number of digits in the number; each of these values is considered a single unit. Notice that name, number, and whole_number are all of size 1. The size of a list is the number of elements in the list. Notice that number_list has a size of 5 because the list is defined with 5 values. The size of a dictionary is the number of key-value pairs in the dictionary. Notice that weather has a size of 3 because it has 3 key-value pairs. A dictionary or a list displayed by the Variable explorer can be expanded to show its contents in detail by double-clicking on its entry in the Variable explorer. For example, the figure below shows the result of double-clicking on the number list. A separate window is displayed that shows the details of each element of the list. In this example, all of the elements of the list are ints and of size 1. Notice that the values of the list elements are exactly the same as those defined in the Python code. When you are done examining the list, the window displaying the list can be closed by clicking on the OK button.
../_images/Python-Spyder-Variable-Explorer-List1.png

Expanding a List using the Variable Explorer in Spyder
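
The size rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as the text states it, not Spyder's own implementation: explorer_size is a hypothetical helper, and the list and dictionary values are assumed, apart from the humidity value of 20 that the text mentions.

```python
def explorer_size(value):
    """Size as described in the text: the element count for a list or a
    dictionary, and 1 for a simple value such as a number or a string."""
    if isinstance(value, (list, dict)):
        return len(value)
    return 1


# Assumed values: the text says only that the list has 5 elements and the
# dictionary has 3 key-value pairs, one of which is humidity = 20.
number_list = [10, 20, 30, 40, 50]
weather = {"humidity": 20, "temperature": 75, "wind": 8}

print(explorer_size("earthquake"))  # 1, not the 10 characters in the string
print(explorer_size(number_list))   # 5
print(explorer_size(weather))       # 3
```

Note that the string counts as a single unit even though len("earthquake") is 10, which matches the distinction the text draws between size and number of characters.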

Similarly, a dictionary can be expanded to show its contents in more detail. The following figure shows the result of double-clicking on the weather dictionary shown in the Variable explorer. A new window appears to display the contents of the dictionary. Each entry in the window shows a key-value pair. For example, the first row has the key ‘humidity’ and a value of type int. This value is the number 20. Notice that the three key-value pairs shown in the window are the same as those defined in the Python code. This window can be closed by clicking the OK button.
../_images/Python-Spyder-Variable-Explorer-Dictionary1.png

Expanding a Dictionary using the Variable Explorer in Spyder
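
The rows of the expanded dictionary window can be approximated by looping over the key-value pairs. In the sketch below, only the humidity entry comes from the text; the other two keys and their values are assumptions made for illustration, and explorer_rows is a hypothetical helper rather than anything Spyder provides.

```python
# Only 'humidity': 20 is taken from the text; the other pairs are assumed.
weather = {"humidity": 20, "temperature": 75, "wind": 8}


def explorer_rows(mapping):
    """Build one (key, type name, size, value) row per key-value pair."""
    rows = []
    for key, value in mapping.items():
        # Lists and dictionaries report their length; simple values report 1.
        size = len(value) if isinstance(value, (list, dict)) else 1
        rows.append((key, type(value).__name__, size, value))
    return rows


for row in explorer_rows(weather):
    print(row)
```

Each printed row corresponds to one line of the expanded window: the key, the type of its value, the size, and the value itself.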
