Understanding data mining

Understanding data mining: Extracting, organizing, and analyzing large sets of data
Large sets of data, made accessible by new technology, are essential to forecasting trends in business and economics. In Algebra I, students typically study data sets with one predictor variable and one response variable. But in the real world, most response variables have numerous predictors, any of which may significantly affect the response. It is important to be able to identify the effect of each predictor and to use the predictors appropriately to make sound, valid predictions.
By the end of this unit, mathematics students in grades nine through twelve will be able to extract useful information from large sets of data drawn from multiple disciplines. Working with these real-world applications, students will analyze data and use their findings to make predictions and to propose solutions to problems.
These lessons are designed to help Algebra I students learn the basics of data mining and then determine which variables are most influential in a given situation. Students will also use R, a statistical software package available online for free, to help with variable selection.
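To make the idea of variable selection concrete, here is a minimal sketch of one simple approach: rank each candidate predictor by the strength of its linear correlation with the response. It is written in Python for illustration only; the unit itself uses R, and the lessons may select variables in a different way. The data set (ice cream sales with three candidate predictors), the variable names, and the helper functions `pearson` and `rank_predictors` are all invented for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_predictors(predictors, response):
    """Sort predictors from strongest to weakest linear association with the response."""
    scores = {name: pearson(values, response) for name, values in predictors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented data for illustration: one response variable, three candidate predictors.
sales = [200, 220, 260, 270, 310, 330, 370]
predictors = {
    "temperature_f": [60, 65, 70, 75, 80, 85, 90],
    "ad_spend": [5, 3, 6, 4, 7, 5, 8],
    "day_of_month": [3, 17, 9, 25, 1, 12, 20],
}

ranking = rank_predictors(predictors, sales)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

A predictor near the top of the list is a stronger candidate for a prediction model than one near the bottom. Correlation measures only straight-line association between one predictor and the response, so this ranking is a starting point for discussion rather than a complete selection method.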
Each lesson in this unit is aligned to the North Carolina Standard Course of Study. In addition to those objectives, the following principles and standards of the National Council of Teachers of Mathematics are supported:
- Use mathematical models to represent and understand quantitative relationships
- Formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them
- Select and use appropriate statistical methods to analyze data
- Develop and evaluate inferences and predictions that are based on data
- Build new mathematical knowledge through problem solving
- Recognize and apply mathematics in contexts outside of mathematics