Data Mining Assignment

In: Computers and Technology

Submitted By moer1990
Words 316
Pages 2
1. Briefly describe the major differences between data mining and statistics.
a. Statistics is user driven, while data mining is data driven.
b. In statistics, there is an underlying theory about certain relationships in the data, while in data mining there is often no pre-existing theory.
c. In statistics, users apply statistical methods to test hypotheses about the data, while in data mining users often apply different techniques to examine the data and uncover unknown relationships.

2. What can an organization do to deal with data problems such as missing data and outliers?
Missing data:
a. Ignore the tuple.
b. Fill in the missing value manually.
c. Use a proxy variable with no missing values.
Outliers:
a. Delete rows.
b. Recode.
c. Transform variables.
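
The options in answer 2 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original answer: rows are plain dictionaries, a missing value is represented by None, the outlier bounds `low` and `high` are chosen by the analyst, "recode" is read as capping a value at the nearest bound, and the transform is a log transform. All function and column names are hypothetical.

```python
import math

def drop_incomplete(rows):
    """Ignore the tuple: keep only rows that have no missing (None) value."""
    return [r for r in rows if all(v is not None for v in r.values())]

def use_proxy(rows, target):
    """Use a proxy variable: drop the incomplete column `target` so the
    analysis relies on the remaining (complete) proxy column instead."""
    return [{k: v for k, v in r.items() if k != target} for r in rows]

def delete_outliers(values, low, high):
    """Delete rows: discard values outside the assumed bounds [low, high]."""
    return [v for v in values if low <= v <= high]

def recode_outliers(values, low, high):
    """Recode: cap values at the nearest bound (one possible recoding)."""
    return [min(max(v, low), high) for v in values]

def log_transform(values):
    """Transform variables: log(1 + v) compresses large values."""
    return [math.log1p(v) for v in values]

# Hypothetical data: "income" has a gap, "spend" is a complete proxy.
rows = [
    {"age": 30, "income": 50, "spend": 48},
    {"age": 41, "income": None, "spend": 60},
]
complete_rows = drop_incomplete(rows)
proxy_rows = use_proxy(rows, "income")
capped = recode_outliers([1, 2, 3, 100], low=0, high=10)
```

Filling in a missing value manually has no code counterpart here, since it relies on someone looking up the correct value.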

3. In a data mining exercise, a data set is usually partitioned into training, validation, and test data. Briefly describe the roles assumed by these partitions.
a. Training data: used to build and fit models.
b. Validation data: used to monitor and fine-tune the model to improve its generalization. Tuning involves selecting among competing models and optimizing the selected model based on the validation data.
c. Test data: used to test the performance of the model for an unbiased assessment.
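
The three roles in answer 3 can be illustrated with a small Python sketch. The 60/20/20 split, the fixed random seed, the toy data, and the two candidate models (a threshold rule and a majority-class rule) are all assumptions made for the example; none of them comes from the original answer.

```python
import random

def partition(rows, train=0.6, valid=0.2, seed=0):
    """Shuffle the rows and cut them into training, validation and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def accuracy(model, rows):
    """Share of (x, y) rows for which the model predicts y."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def fit_threshold(train_rows):
    """Candidate 1: predict True above the midpoint of the two class means."""
    pos = [x for x, y in train_rows if y]
    neg = [x for x, y in train_rows if not y]
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x >= cut

def fit_majority(train_rows):
    """Candidate 2: always predict the majority class of the training data."""
    majority = sum(y for _, y in train_rows) * 2 >= len(train_rows)
    return lambda x: majority

def build_select_assess(fitters, train_rows, valid_rows, test_rows):
    """Fit on training data, pick the best model on validation data,
    then report its accuracy on the untouched test data."""
    models = [fit(train_rows) for fit in fitters]
    best = max(models, key=lambda m: accuracy(m, valid_rows))
    return best, accuracy(best, test_rows)

# Toy data set: the label is True when x is at least 50.
data = [(x, x >= 50) for x in range(100)]
train_rows, valid_rows, test_rows = partition(data)
best_model, test_score = build_select_assess(
    [fit_threshold, fit_majority], train_rows, valid_rows, test_rows)
```

The test partition is used only once, after model selection, which is why its score can be read as an unbiased estimate of performance.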

4. Which takes four possible values: freshman, sophomore, junior, and senior.
Recode the variable: replace freshman with 1, sophomore with 2, junior with 3, and senior with 4.

5. Data cleansing
a. I select the whole Comment table and insert a pivot table in a new sheet. Then I summarize the Comment table.
b. I check each thread ID row in the pivot table, using Find, to see whether that thread ID is contained in the Thread table.
c. I use red to highlight the thread IDs that are not found in the Thread table.
d. I make a copy of Comment…...
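
The recoding in answer 4 and the thread ID check in answer 5 (steps a to c) could also be done in code rather than in a spreadsheet. The following Python sketch assumes both tables are lists of dictionaries with a `thread_id` column; the column name and the sample rows are hypothetical and are not taken from the original workbook.

```python
from collections import Counter

# Answer 4: ordinal codes for the four class years.
YEAR_CODES = {"freshman": 1, "sophomore": 2, "junior": 3, "senior": 4}

def recode_year(values):
    """Replace each class-year label with its numeric code."""
    return [YEAR_CODES[v.strip().lower()] for v in values]

def comments_per_thread(comment_rows):
    """Step a: summarize the Comment table by thread ID, like a pivot table."""
    return Counter(c["thread_id"] for c in comment_rows)

def orphan_thread_ids(comment_rows, thread_rows):
    """Steps b and c: thread IDs used in Comment but absent from Thread."""
    known = {t["thread_id"] for t in thread_rows}
    return sorted(set(comments_per_thread(comment_rows)) - known)

# Hypothetical sample tables.
threads = [{"thread_id": 1}, {"thread_id": 2}]
comments = [{"thread_id": 1}, {"thread_id": 1}, {"thread_id": 3}]

codes = recode_year(["Freshman", "senior", "junior"])
counts = comments_per_thread(comments)
orphans = orphan_thread_ids(comments, threads)
```

Here `orphans` plays the role of the rows highlighted in red in step c.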

Similar Documents

Data Mining

...Data Mining 0. Abstract With the development of different fields (artificial intelligence, machine learning, statistics, databases, pattern recognition and neurocomputing), these fields have merged into a new technology: data mining. The ultimate goal of data mining is to obtain knowledge from large databases. It helps to discover previously unknown patterns; most of the time it is followed by deeper manual evaluation to explain and correlate the results and establish new knowledge. It is often used in practice by governments, banks, insurance companies and medical researchers. A general idea of data mining is introduced first. In this article, data mining techniques are divided into four types: predictive modeling, database segmentation, link analysis and deviation detection. A brief introduction explains the differences among them. Next, current privacy, ethical and technical issues regarding data mining are discussed. In addition, future development trends, especially the emerging concept of sports data mining, are described. Last but not least, different views on data mining, including its benefits, its drawbacks and our own views, are brought together. 1. Introduction This century is the age of the digital world. We are no longer able to live without computing technology. Due to the information explosion, we have difficulty obtaining knowledge from large amounts of unorganized data. One of the solutions, Knowledge Discovery in Databases (KDD), is......

Words: 1700 - Pages: 7

Data Mining

...[pic] Data Mining Assignment 4 [pic] “Data mining software is one of a number of analytical tools for analyzing data (Data Mining, para. 1).” We will be learning about the competitive advantage, reliability of such tool, and privacy concerns towards consumers. Data mining tool is used by majority of companies to increase revenue, and build on the relationship with current consumers. Let’s explore the world of data mining technology in the following selection. “Data mining is primarily used today by companies with a strong consumer focus - retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics. And, it enables them to determine the impact on sales, customer satisfaction, and corporate profits. Finally, it enables them to "drill down" into summary information to view detail transactional data (Data Mining, para. 7).” Data mining is implemented online to promote business ideas, products, and other ways to market them. Data mining is used in political websites, when you go to some sites they take your information then, they began to send you things to promote the Republicans and Democrats message. This is how your voice counts. “Companies have used powerful computers to sift through volumes of supermarket scanner data and analyze market......

Words: 1183 - Pages: 5

Data Mining

...above to submit your assignment. Students, please view the "Submit a Clickable Rubric Assignment" in the Student Center. Instructors, training on how to grade is within the Instructor Center. Assignment 4: Data Mining Due Week 9 and worth 75 points The development of complex algorithms that can mine mounds of data that have been collected from people and digital devices have led to the adoption of data mining by most businesses as a means of understanding their customers better than before. Data mining takes place in retailing and sales, banking, education, manufacturing and production, health care, insurance, broadcasting, marketing, customer services, and a number of other areas. The analytical information gathered by data-mining applications has given some businesses a competitive advantage, an ability to make informed decisions, and better ways to predict the behavior of customers. Write a four to five (4-5) page paper in which you: Determine the benefits of data mining to the businesses when employing: Predictive analytics to understand the behavior of customers Associations discovery in products sold to customers Web mining to discover business intelligence from Web customers Clustering to find related customer information Assess the reliability of the data mining algorithms. Decide if they can be trusted and predict the errors they are likely to produce. Analyze privacy concerns raised by the collection of personal data for mining purposes.......

Words: 493 - Pages: 2

Data Mining

...Data mining is an iterative process of selecting, exploring and modeling large amounts of data to identify meaningful, logical patterns and relationships among key variables.  Data mining is used to uncover trends, predict future events and assess the merits of various courses of action.             When employing, predictive analytics and data mining can make marketing more efficient. There are many techniques and methods, including business intelligence data collection. Predictive analytics is using business intelligence data for forecasting and modeling. It is a way to use predictive analysis data to predict future patterns. It is used widely in the insurance, medical and credit industries. Assessment of credit, and assignment of a credit score is probably the most widely known use of predictive analytics. Using events of the past, managers are able to estimate the likelihood of future events. Data mining aids predictive analysis by providing a record of the past that can be analyzed and used to predict which customers are most likely to renew, purchase, or purchase related products and services. Business intelligence data mining is important to your marketing campaigns. Proper data mining algorithms and predictive modeling can narrow your target audience and allow you to tailor your ads to each online customer as he or she navigates your site. Your marketing team will have the opportunity to develop multiple advertisements based on the past clicks of your visitors.......

Words: 1136 - Pages: 5

Data Mining

...Running Head: DATA MINING Assignment 4: Data Mining Submitted by: Submitted to: Course: Introduction Data Mining is also called as Knowledge Discovery in Databases (KDD). It is a powerful technology which has great potential in helping companies to focus on the most important information they have in their data base. Due to the increased use of technologies, interest in data mining has increased speedily. Data mining can be used to predict future behavior rather than focus on past events. This is done by focusing on existing information that may be stored in their data warehouse or information warehouse. Companies are now utilizing data mining techniques to assess their database for trends, relationships, and outcomes to improve their overall operations and discover new ways that may permit them to improve their customer services. Data mining provides multiple benefits to government, businesses, society as well as individual persons (Data Mining, 2011). Benefits of data mining to the businesses when employing Advantages of data mining from business point of view is that large sizes of apparently pointless information have been filtered into important and valuable business information to the company, which could be stored in data warehouses. While in the past, the responsibility was on marketing utilities and services, products, the center of attention is now on customers- their choices, preferences, dislikes and likes, and possibly data mining is one of the most important......

Words: 1302 - Pages: 6

Data Mining

...A Statistical Perspective on Data Mining Ranjan Maitra∗ Abstract Technological advances have led to new and automated data collection methods. Datasets once at a premium are often plentiful nowadays and sometimes indeed massive. A new breed of challenges are thus presented – primary among them is the need for methodology to analyze such masses of data with a view to understanding complex phenomena and relationships. Such capability is provided by data mining which combines core statistical techniques with those from machine intelligence. This article reviews the current state of the discipline from a statistician’s perspective, illustrates issues with real-life examples, discusses the connections with statistics, the differences, the failings and the challenges ahead. 1 Introduction The information age has been matched by an explosion of data. This surfeit has been a result of modern, improved and, in many cases, automated methods for both data collection and storage. For instance, many stores tag their items with a product-specific bar code, which is scanned in when the corresponding item is bought. This automatically creates a gigantic repository of information on products and product combinations sold. Similar databases are also created by automated book-keeping, digital communication tools or by remote sensing satellites, and aided by the availability of affordable and effective storage mechanisms – magnetic tapes, data warehouses and so on. This has created a......

Words: 22784 - Pages: 92

Data Mining

...Assignment 4: Data Mining CIS 500 Professor: Dr. Edwin Otto Strayer University August 30, 2013 “Data mining is a process that uses statistical, mathematical, artificial intelligence, and machine learning techniques to extract and identify useful information and subsequent knowledge from large databases, including data warehouses” (Turban, 2011). Predictive analytics serves as a benefit of data mining because it’s a process that uses machine learning to analyze data and make predictions. This can be beneficial to a business because it can be helpful in understanding the behavior of customers. A good example of this would be a business using predictive analytics to decide what level of pricing should be used in correlation with sales information. A business could look at historical data for products, sales, and customers to determine the price for a given product and customer at the right time. Amazon is a heavy user of predictive pricing (Mehra, 2013). This technique is also used in Supply Chain Management because it helps you to understand consumer demand to manage the overall process. This includes delivery, returns, forecasting, sourcing, planning, and order fulfillment. The advantage is if a retailer can predict revenue from a specific product in a reasonable amount of time it will result in better inventory management, use of space, cash flow, and the elimination of out of stock items. Association discovery in products sold to customers is used......

Words: 1499 - Pages: 6

Cis 500 Assignment 4 Data Mining

...CIS 500 Assignment 4 Data Mining homeworktimes.com/downloads/cis-500-assignment-4-data-mining/ For More Tutorial Visit: http://homeworktimes.com/ For any Information Email Us: Uopguides@gmail.com The development of complex algorithms that can mine mounds of data that have been collected from people and digital devices have led to the adoption of data mining by most businesses as a means of understanding their customers better than before. Data mining takes place in retailing and sales, banking, education, manufacturing and production, health care, insurance, broadcasting, marketing, customer services, and a number of other areas. The analytical information gathered by data-mining applications has given some businesses a competitive advantage, an ability to make informed decisions, and better ways to predict the behavior of customers. Write a four to five (4-5) page paper in which you 1. Determine the benefits of data mining to the businesses when employing 1. Predictive analytics to understand the behavior of customers 2. Associations discovery in products sold to customers 3. Web mining to discover business intelligence from Web customers 4. Clustering to find related customer information 2. Assess the reliability of the data mining algorithms. Decide if they can be trusted and predict the errors they are likely to produce. 3. Analyze privacy concerns raised by the collection of personal data for mining purposes. 1. Choose and describe three (3)......

Words: 457 - Pages: 2

Cis 500 Assignment 4 Data Mining

...CIS 500 Assignment 4 Data Mining homeworktimes.com/downloads/cis-500-assignment-4-data-mining/ For More Tutorial Visit: http://homeworktimes.com/ The development of complex algorithms that can mine mounds of data that have been collected from people and digital devices have led to the adoption of data mining by most businesses as a means of understanding their customers better than before. Data mining takes place in retailing and sales, banking, education, manufacturing and production, health care, insurance, broadcasting, marketing, customer services, and a number of other areas. The analytical information gathered by data-mining applications has given some businesses a competitive advantage, an ability to make informed decisions, and better ways to predict the behavior of customers. Write a four to five (4-5) page paper in which you 1. Determine the benefits of data mining to the businesses when employing 1. Predictive analytics to understand the behavior of customers 2. Associations discovery in products sold to customers 3. Web mining to discover business intelligence from Web customers 4. Clustering to find related customer information 2. Assess the reliability of the data mining algorithms. Decide if they can be trusted and predict the errors they are likely to produce. 3. Analyze privacy concerns raised by the collection of personal data for mining purposes. 1. Choose and describe three (3) concerns raised by consumers. 2. Decide if each of these......

Words: 449 - Pages: 2

Cis 500 Assignment 4 Data Mining

...CIS 500 Assignment 4 Data Mining Click link Below To Download: http://strtutorials.com/CIS-500-Assignment-4-Data-Mining-CIS5004.htm The development of complex algorithms that can mine mounds of data that have been collected from people and digital devices have led to the adoption of data mining by most businesses as a means of understanding their customers better than before. Data mining takes place in retailing and sales, banking, education, manufacturing and production, health care, insurance, broadcasting, marketing, customer services, and a number of other areas. The analytical information gathered by data mining applications has given some businesses a competitive advantage, an ability to make informed decisions, and better ways to predict the behavior of customers. Write a four to five (4-5) page paper in which you Determine the benefits of data mining to the businesses when employing Predictive analytics to understand the behavior of customers Associations discovery in products sold to customers Web mining to discover business intelligence from Web customers Clustering to find related customer information Assess the reliability of the data mining algorithms. Decide if they can be trusted and predict the errors they are likely to produce. Analyze privacy concerns raised by the collection of personal data for mining purposes. Choose and describe three (3) concerns raised by consumers. Decide if each of these concerns is valid and explain your......

Words: 366 - Pages: 2

Cis 500 Assignment 4 Data Mining

...CIS 500 Assignment 4 Data Mining To Buy This material Click below link http://www.uoptutors.com/cis-500-stayer/cis-500-assignment-4-data-mining The development of complex algorithms that can mine mounds of data that have been collected from people and digital devices have led to the adoption of data mining by most businesses as a means of understanding their customers better than before. Data mining takes place in retailing and sales, banking, education, manufacturing and production, health care, insurance, broadcasting, marketing, customer services, and a number of other areas. The analytical information gathered by data-mining applications has given some businesses a competitive advantage, an ability to make informed decisions, and better ways to predict the behavior of customers. Write a four to five (4-5) page paper in which you Determine the benefits of data mining to the businesses when employing Predictive analytics to understand the behavior of customers Associations discovery in products sold to customers Web mining to discover business intelligence from Web customers Clustering to find related customer information Assess the reliability of the data mining algorithms. Decide if they can be trusted and predict the errors they are likely to produce. Analyze privacy concerns raised by the collection of personal data for mining purposes. Choose and describe three (3) concerns raised by consumers. Decide if each of these concerns is valid and explain your......

Words: 447 - Pages: 2
