what is data mining

1 year ago 58
Nature

Data mining is the process of extracting and discovering patterns in large data sets using methods at the intersection of machine learning, statistics, and database systems. It is an interdisciplinary subfield of computer science and statistics with the overall goal of extracting information from a data set and transforming it into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD, which involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

Data mining is used by companies to learn more about their customers, develop more effective marketing strategies, increase sales, and decrease costs. It is a crucial component of successful analytics initiatives in organizations, and the information it generates can be used in business intelligence (BI) and advanced analytics. Data mining can be used for everything from learning about what customers are interested in or want to buy to fraud detection and spam filtering. Social media companies use data mining techniques to commodify their users in order to generate profit.

Data mining is often perceived as a challenging process to grasp, but it is not as difficult as it sounds. It is a key part of data analytics overall and one of the core disciplines in data science, which uses advanced analytics techniques to find useful information in data sets.