Trends of Data Science and Applications | 2021

Applicational Statistics in Data Science and Machine Learning

 
 

Abstract


Statistics plays a central role in data science and machine learning. Statistical tools and techniques, together with exploratory data analysis, help practitioners gain insights from data and build quality features for training a model. Knowledge of statistics is essential for a data scientist or data analyst because it is the building block of any machine learning or deep learning model, which learns trends and patterns from features built by analysing the data end-to-end, whether the data is tabular, image or video. Statistics covers many concepts, such as variables, sampling, correlation and outlier treatment, so this chapter aims to take the reader on a tour of applicational statistics and show how it can be combined with exploratory data analysis to work on data science and machine learning problems. Data analysis and machine learning are also experiment-heavy domains that need correct statistical methods for correct inference, so the statistical methods available for these experiments are discussed here in detail. Languages such as Python, MATLAB and R provide statistical libraries that make it possible to run the required experiments on any dataset through simple API calls.

Volume 954
Pages 49 - 90
DOI 10.1007/978-981-33-6815-6_4
Language English
Journal Trends of Data Science and Applications
