Data Science

List of 10 Best Data Science Books and Descriptions for the Generalist

Data Science is the field of study that handles vast amounts of data using scientific methods, processes, algorithms, and systems to find unseen patterns, derive meaningful information, and support decision making in businesses as well as in non-business institutions. Its areas of application include Healthcare, Gaming, Image Recognition, Recommendation Systems, Logistics, Fraud Detection (banking and financial institutions), Internet Search, Speech Recognition, Targeted Advertising, Airline Route Planning, and Augmented Reality. Data Science overlaps considerably with Artificial Intelligence. The data used for analysis can come from many different sources and be presented in various formats. Some of the source data may be standardized; other data may not be.

To put it another way, different methodologies are used to gather the data (plural of datum). Knowledge (valuable conclusions) is then extracted from the assembled data: after the data is gathered, it is analyzed to obtain new data (results), from which problems are solved.
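The gather-then-extract process described above can be illustrated with a minimal sketch. The two data sources, their field names, and the helper functions `gather` and `analyze` are hypothetical, chosen only for illustration; real projects would use far larger data sets and dedicated libraries.

```python
from statistics import mean

# Hypothetical raw data from two sources in different formats.
csv_like_rows = ["2021-01-01,120.5", "2021-01-02,98.0"]   # "date,amount" strings
dict_records = [{"day": "2021-01-03", "total": "110"}]    # loosely structured records


def gather(rows, records):
    """Combine both sources into one list of (date, amount) tuples."""
    combined = []
    for row in rows:
        date, amount = row.split(",")
        combined.append((date, float(amount)))
    for record in records:
        combined.append((record["day"], float(record["total"])))
    return combined


def analyze(data):
    """Derive new data (results) from the gathered data."""
    amounts = [amount for _, amount in data]
    return {"count": len(amounts), "mean": mean(amounts), "max": max(amounts)}


data = gather(csv_like_rows, dict_records)
results = analyze(data)
print(results)
```

Here the gathering step brings differently formatted sources into one common shape, and the analysis step produces summary results on which a decision could be based.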

Data Science exists as a major discipline at the Bachelor’s and Master’s Degree levels at university. However, only a few universities in the world offer Data Science at either level. At the Bachelor’s Degree level, the student graduates with a degree in Data Science, which is like a general-purpose degree. At the Master’s Degree level, the student leaves with a postgraduate degree in Data Science, specializing in Data Analytics or Data Engineering, or as a Data Scientist.

It might surprise the reader, and it is possibly unfortunate, that Machine Learning, Modeling, Statistics, Programming, and Databases are prerequisite knowledge for studying Data Science at the Bachelor’s Degree level, even though they are respected university courses in their own right, studied in other disciplines at the Bachelor’s or Master’s Degree level. Nevertheless, a student who goes to university to study Data Science at the degree level will still study all these courses, alongside or before the Data Science courses proper.

Data Science at the Bachelor’s Degree level and its specializations (Data Analytics, Data Engineering, and Data Scientist) are still being developed, though they have reached the stage where they are applied in industry after being studied at university. Overall, Data Science is a relatively new discipline.

Remember that you should first be a generalist before becoming a specialist. The distinctions between the specialist programs, and between the generalist and specialist programs, are not yet clear.

Since Data Science is a relatively new discipline, the books recommended in this document were chosen for content coverage and not pedagogy (how well the book teaches). They are for the Bachelor’s Degree (generalist) program, which comprises several different courses.

The List

For more details and possible purchase by credit card, a hyperlink is given for each of the books. No single book covers all the generalist courses.

Essential Math for Data Science: Calculus, Statistics, Probability Theory, and Linear Algebra

Written by: Hadrien Jean

  • Publisher: Hadrien Jean
  • Published Date: After 30 September 2020
  • Language: English
  • No. of Pages: more than 400

The content of this book can be seen as the math course for Data Science. Though learning Data Science on one’s own is not recommended, a high school graduate who wants to do so should start with this book.

Content: Calculus; Statistics and Probability; Linear Algebra; Scalars and Vectors; Matrices and Tensors; Span, Linear Dependency, and Space Transformation; Systems of Linear Equations; Eigenvectors and Eigenvalues; Singular Value Decomposition.

A Common-Sense Guide to Data Structures and Algorithms: Level Up Your Core Programming Skills / 2nd Edition

Written by: Jay Wengrow

  • Publisher: Pragmatic Bookshelf
  • Published Date: September 15, 2020
  • Language: English
  • Dimensions: 7.5 x 1.25 x 9.25 inches
  • No. of Pages: 508

This book deals with the algorithms and data structures used in Data Science. For someone learning Data Science on their own after graduating from high school, this is the next book to read after the previous math book. The example programs are given in JavaScript, Python, and Ruby.

Content: Why Data Structures Matter; Why Algorithms Matter; O Yes! Big O Notation; Speeding Up Your Code with Big O; Optimizing Code with and Without Big O; Optimizing for Optimistic Scenarios; Big O in Everyday Code; Blazing Fast Lookup with Hash Tables; Crafting Elegant Code with Stacks and Queues; Recursively Recurse with Recursion; Learning to Write in Recursive; Dynamic Programming; Recursive Algorithms for Speed; Node-Based Data Structures; Speeding Up All the Things with Binary Search Trees; Keeping Your Priorities Straight with Heaps; It Doesn’t Hurt to Trie; Connecting Everything with Graphs; Dealing with Space Constraints; Techniques for Code Optimization

Smarter Data Science: Succeeding with Enterprise-Grade Data and AI Projects / 1st Edition

Written by: Neal Fishman and Cole Stryker (foreword by Grady Booch)

  • Publisher: Wiley
  • Published Date: April 14, 2020
  • Language: English
  • No. of Pages: 286

Content: Climbing the AI Ladder; Framing Part I: Considerations for Organizations Using AI; Framing Part II: Considerations for Working with Data and AI; A Look Back on Analytics: More Than One Hammer; A Look Forward on Analytics: Not Everything Can Be a Nail; Addressing Operational Disciplines on the AI Ladder; Maximizing the Use of Your Data: Being Value Driven; Valuing Data with Statistical Analysis and Enabling Meaningful Access; Constructing for the Long-Term; A Journey’s End: An IA for AI.

Machine Learning: A Probabilistic Perspective (Adaptive Computation and Machine Learning series) Illustrated Edition

Written by: Kevin P. Murphy

  • Publisher: The MIT Press
  • Published Date: August 24, 2012
  • Language: English
  • Dimensions: 8.25 x 1.79 x 9.27 inches
  • No. of Pages: 1104

This book is good for beginners. Like the rest of the books recommended in this document, it does not cover everything necessary for the generalist program, which, unfortunately, is still not finalized (nor are the specialist programs). The typical beginner here is a high school graduate with a pass in mathematics and computer science.

Content: Introduction (Machine learning: what and why?, Unsupervised learning, Some basic concepts in machine learning); Probability; Generative models for discrete data; Gaussian models; Bayesian statistics; Frequentist statistics; Linear regression; Logistic regression; Generalized linear models and the exponential family; Directed graphical models (Bayes nets); Mixture models and the EM algorithm; Latent linear models; Sparse linear models; Kernels; Gaussian processes; Adaptive basis function models; Markov and hidden Markov models; State space models; Undirected graphical models (Markov random fields); Exact inference for graphical models; Variational inference; More variational inference; Monte Carlo inference; Markov chain Monte Carlo (MCMC) inference; Clustering; Graphical model structure learning; Latent variable models for discrete data; Deep learning.

Data Science for Business: What You Need to Know About Data Mining and Data-Analytic Thinking / 1st Edition

Written by: Tom Fawcett and Foster Provost

  • Publisher: O’Reilly Media
  • Published Date: September 17, 2013
  • Language: English
  • Dimensions: 7 x 0.9 x 9.19 inches
  • No. of Pages: 413

Content: Data-Analytic Thinking; Business Problems and Data Science Solutions; Introduction to Predictive Modeling: From Correlation to Supervised Segmentation; Fitting a Model to Data; Overfitting and Its Avoidance; Similarity, Neighbors, and Clusters; Decision Analytic Thinking I: What Is a Good Model?; Visualizing Model Performance; Evidence and Probabilities; Representing and Mining Text; Decision Analytic Thinking II: Toward Analytical Engineering; Other Data Science Tasks and Techniques; Data Science and Business Strategy; Conclusion.

Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python / 2nd Edition

Written by: Peter Bruce, Andrew Bruce, and Peter Gedeck

  • Publisher: O’Reilly Media
  • Published Date: June 2, 2020
  • Language: English
  • Dimensions: 7 x 0.9 x 9.1 inches
  • No. of Pages: 368

Content: Exploratory Data Analysis; Data and Sampling Distributions; Statistical Experiments and Significance Testing; Regression and Prediction; Classification; Statistical Machine Learning; Unsupervised Learning.

The Book of Why: The New Science of Cause and Effect

Written by: Judea Pearl and Dana Mackenzie

  • Publisher: Basic Books
  • Published Date: May 15, 2018
  • Language: English
  • Dimensions: 6.3 x 1.4 x 9.4 inches
  • No. of Pages: 432

While many Data Science books draw their illustrations purely from business, this book draws on medicine and other disciplines.

Content: Introduction: Mind over Data; The Ladder of Causation; From Buccaneers to Guinea Pigs: The Genesis of Causal Inference; From Evidence to Causes: Reverend Bayes Meets Mr. Holmes; Confounding and Deconfounding: Or, Slaying the Lurking Variable; The Smoke-Filled Debate: Clearing the Air; Paradoxes Galore!; Beyond Adjustment: The Conquest of Mount Intervention; Counterfactuals: Mining Worlds That Could Have Been; Mediation: The Search for a Mechanism; Big Data, Artificial Intelligence, and the Big Questions.

Build a Career in Data Science

Written by: Emily Robinson and Jacqueline Nolis

  • Publisher: Manning
  • Published Date: March 24, 2020
  • Language: English
  • Dimensions: 7.38 x 0.8 x 9.25 inches
  • No. of Pages: 354

Content: Getting Started with Data Science; Finding your Data Science Job; Settling into Data Science; Growing in your Data Science role.

Data Science for Dummies / 2nd Edition

Written by: Lillian Pierson

  • Publisher: For Dummies
  • Published Date: March 6, 2017
  • Language: English
  • Dimensions: 7.3 x 1 x 9 inches
  • No. of Pages: 384

This book assumes that the reader already has the prerequisite math and programming knowledge.

Content:  Wrapping Your Head around Data Science; Exploring Data Engineering Pipelines and Infrastructure; Applying Data-Driven Insights to Business and Industry; Machine Learning: Learning from Data with Your Machine; Math, Probability, and Statistical Modeling; Using Clustering to Subdivide Data; Modeling with Instances; Building Models That Operate Internet-of-Things Devices; Following the Principles of Data Visualization Design; Using D3.js for Data Visualization; Web-Based Applications for Visualization Design; Exploring Best Practices in Dashboard Design; Making Maps from Spatial Data; Using Python for Data Science; Using Open Source R for Data Science; Using SQL in Data Science; Doing Data Science with Excel and Knime; Data Science in Journalism: Nailing Down the Five Ws (and an H); Delving into Environmental Data Science; Data Science for Driving Growth in E-Commerce; Using Data Science to Describe and Predict Criminal Activity; Ten Phenomenal Resources for Open Data; Ten Free Data Science Tools and Applications.

Mining of Massive Datasets / 3rd Edition

Written by: Jure Leskovec, Anand Rajaraman, and Jeffrey David Ullman

  • Publisher: Cambridge University Press
  • Published Date: February 13, 2020
  • Language: English
  • Dimensions: 7 x 1 x 9.75 inches
  • No. of Pages: 565

This book also assumes that the reader already has the prerequisite math and programming knowledge.

Content: Data Mining; MapReduce and the New Software Stack; Algorithms Using MapReduce; Finding Similar Items; Mining Data Streams; Link Analysis; Frequent Itemsets; Clustering; Advertising on the Web; Recommendation Systems; Mining Social-Network Graphs; Dimensionality Reduction; Large-Scale Machine Learning.


The distinctions between the specialist programs are not yet clear, and neither are the distinctions between the generalist and specialist programs. However, after reading the books listed here, the reader will be in a better position to appreciate the distinct roles of data analyst, data engineer, and data scientist, and then move forward.

About the author

Chrysanthus Forcha

Discoverer of mathematics Integration from First Principles and related series. Master’s Degree in Technical Education, specializing in Electronics and Computer Software. BSc Electronics. I also have knowledge and experience at the Master’s level in Computing and Telecommunications. Out of 20,000 writers, I was ranked the 37th best. I have been working in these fields for more than 10 years.