What is the difference between data mining and data analysis?

1. Data mining

Data mining is the process of extracting previously unknown, valuable information and knowledge from large amounts of data using methods from statistics, artificial intelligence, and machine learning. It mainly addresses four types of problems: classification, clustering, association, and prediction, which may be quantitative or qualitative. The focus of data mining is finding unknown patterns and regularities. Its output is a model or a set of rules, from which model scores or labels are derived. Model scores include churn probability, total score, similarity, predicted value, and so on. Labels include high-, medium-, and low-value users; churn and non-churn; good, medium, and poor credit; and so on. The main techniques are decision trees, neural networks, association rules, and cluster analysis, along with other methods from statistics, artificial intelligence, and machine learning. Taken together, data analysis (in the narrow sense) and data mining share the same essence: both discover business knowledge (valuable information) from data, thereby helping business operations, improving products, and supporting better decisions. Together, the two make up data analysis in the broad sense. In the specifics described above, however, data mining differs from data analysis in the narrow sense.
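To make the distinction between a model score and a label concrete, here is a minimal sketch in Python. The user names, the churn probabilities, and the 0.5 threshold are all assumptions made up for illustration; in practice the scores would come from a trained model such as a decision tree or neural network.

```python
# Hypothetical churn probabilities, standing in for the scores a trained
# model might output. These numbers are invented for illustration.
churn_scores = {"user_a": 0.91, "user_b": 0.35, "user_c": 0.62}


def label_churn(score, threshold=0.5):
    """Turn a model score (a probability) into a label.

    The 0.5 threshold is an assumption; a real project would tune it.
    """
    return "churn" if score >= threshold else "non-churn"


# Model score -> label, one per user.
labels = {user: label_churn(score) for user, score in churn_scores.items()}
print(labels)
# {'user_a': 'churn', 'user_b': 'non-churn', 'user_c': 'churn'}
```

The score keeps the full detail (how likely churn is), while the label is the simplified result that a business team can act on.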

2. Data analysis

Data analysis can be described as a means of operating on data, or a set of methods for doing so. Its goal is to organize, filter, and process data according to a priori constraints in order to obtain information. Data mining is the value analysis of the information obtained through data analysis. The two are even recursive: the result of data analysis is information, that information is in turn treated as data and mined, and data mining repeatedly draws on data analysis methods. Even so, the difference between data analysis and data mining is clear.

The specific differences between the two are:

(In fact, the scope of data analysis is broad and includes data mining. The differences listed here mainly concern statistical analysis.)

Amount of data: Data analysis does not necessarily involve a large amount of data, whereas data mining typically works on very large amounts.

Constraints: Data analysis starts from a hypothesis, and you need to build an equation or model yourself to test that hypothesis. Data mining does not require a prior hypothesis and can establish the equation or model automatically.

Object: Data analysis is usually applied to numerical data, while data mining can work with different types of data, such as sound and text.

Results: Data analysis explains its results and presents useful information. The results of data mining are harder to interpret; data mining evaluates the value of information, focuses on predicting the future, and produces decision-making suggestions.
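The "Constraints" contrast in the list above can be sketched in Python. This is only an illustration under stated assumptions: the ad-spend, sales, and monthly-spend figures are invented, the linear hypothesis is chosen by the analyst, and the hypothetical helper two_means_1d is a deliberately tiny two-cluster k-means rather than a production algorithm.

```python
from statistics import mean

# --- Data analysis: start from a hypothesis chosen by the analyst ---
# Hypothesis: sales grow linearly with ad spend (invented sample data).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


intercept, slope = fit_line(ad_spend, sales)
print(f"hypothesized model: sales = {intercept:.2f} + {slope:.2f} * ad_spend")


# --- Data mining: no hypothesis, let the algorithm find the groups ---
def two_means_1d(values, iterations=10):
    """Tiny 1-D k-means with k=2, seeded with the min and max values."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if empty).
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


monthly_spend = [12, 15, 14, 95, 102, 99]  # invented customer spending
centers, groups = two_means_1d(monthly_spend)
print(groups)  # [[12, 15, 14], [95, 102, 99]]
```

In the first half the analyst supplies the form of the equation and the data only fills in its coefficients. In the second half nobody states in advance that there are "low" and "high" spenders; the grouping emerges from the data.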

Data analysis is a tool that turns data into information, and data mining is a tool that turns information into cognition. If we want to extract certain patterns (i.e., cognition) from data, data analysis often needs to be used in conjunction with data mining.

For example: You go to the wet market to buy groceries with 50 yuan in your pocket. Faced with a dazzling array of chicken, duck, fish, pork, and various vegetables, you want a mix of meat and vegetables. You ask the prices one by one and keep a running tally: how much meat and how many vegetables you can buy, and how long they will last. A set of information takes shape in your mind. This is data analysis. When it comes to making a choice, you need to evaluate the value of this information, weighing it against your preferences, nutritional value, sensible combinations, your meal plan, the most cost-effective combination, and so on. Finalizing a purchase plan is data mining.
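The market example can be walked through in a few lines of Python. Only the 50 yuan budget comes from the example; the prices, the preference scores, and the quantities (2 jin of meat, 3 jin of vegetables) are assumptions invented for this sketch, and the second step simply mirrors what the analogy calls data mining.

```python
from itertools import product

BUDGET = 50  # yuan, from the example above

# Hypothetical prices in yuan per jin (500 g); invented for illustration.
meat_prices = {"chicken": 12, "duck": 15, "fish": 10, "pork": 14}
veg_prices = {"cabbage": 3, "tomato": 4, "spinach": 5}

# Step 1 ("data analysis"): turn raw prices into information, here how
# many jin of each item the whole budget could buy.
all_prices = {**meat_prices, **veg_prices}
affordable = {item: BUDGET / price for item, price in all_prices.items()}

# Step 2 (value analysis): hypothetical personal preference scores.
preference = {
    "chicken": 3, "duck": 2, "fish": 5, "pork": 4,
    "cabbage": 2, "tomato": 4, "spinach": 3,
}


def best_plan(meat_jin=2, veg_jin=3):
    """Pick the meat + vegetable pair with the highest total preference
    that fits the budget; ties are broken in favor of the lower cost."""
    candidates = []
    for meat, veg in product(meat_prices, veg_prices):
        cost = meat_prices[meat] * meat_jin + veg_prices[veg] * veg_jin
        if cost <= BUDGET:
            score = preference[meat] + preference[veg]
            candidates.append((score, -cost, meat, veg))
    score, neg_cost, meat, veg = max(candidates)
    return meat, veg, -neg_cost


plan = best_plan()
print(affordable["fish"])  # 5.0 jin of fish if the whole budget went to fish
print(plan)                # ('fish', 'tomato', 32)
```

Step 1 produces facts that anyone with the same prices would agree on. Step 2 depends on what you value, which is why changing the preference scores changes the final plan even though the prices stay the same.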

Only by combining data analysis and data mining can the value of data be fully realized.