Handbook of Statistical Analysis

AI and ML Applications

  • 3rd Edition - September 16, 2024
  • Latest edition
  • Authors: Robert Nisbet, Gary D. Miner, Keith McCormick
  • Language: English

Description

Handbook of Statistical Analysis: AI and ML Applications, third edition, is a comprehensive introduction to all stages of data analysis, data preparation, model building, and model evaluation. This valuable resource is useful to students and professionals across a variety of fields and settings: business analysts, scientists, engineers, and researchers in academia and industry. General descriptions of algorithms together with case studies help readers understand technical and business problems, weigh the strengths and weaknesses of modern data analysis algorithms, and employ the right analytical methods for practical application.

This resource is an ideal guide for users who want to address massive and complex datasets with many standard analytical approaches and be able to evaluate analyses and solutions objectively. It includes clear, intuitive explanations of the principles and tools for solving problems using modern analytic techniques; offers accessible tutorials; and discusses their application to real-world problems.

Key features

  • Brings together, in a single resource, all the information a beginner needs to understand the tools and issues in data analytics to build successful predictive analytic solutions
  • Provides in-depth descriptions and directions for performing many data preparation operations necessary to generate data sets in the proper form and format for submission to modeling algorithms
  • Features clear, intuitive explanations of standard analytical tools and techniques and their practical applications
  • Provides a number of case studies to guide practitioners in the design of analytical applications to solve real-world problems in their data domain
  • Offers valuable tutorials on the book webpage with step-by-step instructions on how to use suggested tools to build models
  • Provides predictive insights into the rapidly expanding “Intelligence Age” as it takes over from the “Information Age,” enabling readers to easily transition the book’s content into the tools of the future

Readership

Advanced undergraduate and graduate students taking courses in Statistical Analysis and Data Mining; researchers and academics across mathematics, engineering, and the physical and life sciences who need advanced coverage of, or a refresher on, the subject

Table of contents

Part I – Introduction

1. Historical Background to Analytics

2. Theory

3. Data Mining and Predictive Analytic Process

4. Data Science Tool Types: Which one is Best?

Part II – Data Preparation

5. Data Access

6. Data Understanding

7. Data Visualization

8. Data Cleaning

9. Data Conditioning

10. Feature Engineering

11. Feature Selection

12. Data Preparation Cookbook

Part III – Modeling


13. Algorithms

14. Modeling

15. Model Evaluation and Enhancement

16. Ensembles & Complexity

17. Deep Learning vs. Traditional ML

18. Explainable AI (XAI)

19. Human in the Loop

Part IV – Applications

20. GENERAL OVERVIEW of an Application - Healthcare Delivery and Medical Informatics

21. Specific Application: Business: Customer Response

22. Specific Application: Education: Learning Analytics

23. Specific Application: Medical Informatics: Colon Cancer Screening

24. Specific Application: Financial: Credit Risk

25. Specific FUTURE Application: The ‘INTELLIGENCE AGE (Revolution)’: LLMs like ChatGPT - Tiny ML - H.U.M.A.N.E. - Etc.

Part V – Right Models – Luck - & Ethics of Analytics

26. Right Model for the Right Use

27. Ethics in Data Science

28. Significance of Luck

Part VI – Tutorials and Case Studies
Tutorial A Example of Data Mining Recipes Using Statistica Data Miner 13
Tutorial B Analysis of Hurricane Data (Hurrdata.sta) Using the Statistica Data Miner 13
Tutorial C Predicting Student Success at High-Stakes Nursing Examinations (NCLEX) Using SPSS Modeler and Statistica Data Miner 13
Tutorial D Constructing a Histogram Using MidWest Company Personality Data Using KNIME
Tutorial E Feature Selection Using KNIME
Tutorial F Medical/Business Tutorial Using Statistica Data Miner 13
Tutorial G A KNIME Exercise, Using Alzheimer’s Training Data of Tutorial F
Tutorial H Data Prep 1-1: Merging Data Sources Using KNIME
Tutorial I Data Prep 1-2: Data Description Using KNIME
Tutorial J Data Prep 2-1: Data Cleaning and Recoding Using KNIME
Tutorial K Data Prep 2-2: Dummy Coding Category Variables Using KNIME
Tutorial L Data Prep 2-3: Outlier Handling Using KNIME
Tutorial M Data Prep 3-1: Filling Missing Values With Constants Using KNIME
Tutorial N Data Prep 3-2: Filling Missing Values With Formulas Using KNIME
Tutorial O Data Prep 3-3: Filling Missing Values With a Model Using KNIME

Back Matter:
Appendix A – Listing of TUTORIALS and other RESOURCES on this book’s COMPANION WEB PAGE
Appendix B – Instructions on HOW TO USE this book’s COMPANION WEB PAGE

Product details

  • Edition: 3
  • Latest edition
  • Published: September 16, 2024
  • Language: English

About the authors

Robert Nisbet

Bob Nisbet, PhD, is a Data Scientist, currently modeling precancerous colon polyp presence with clinical data at the UC-Irvine Medical Center. He has experience in predictive modeling in Telecommunications, Insurance, Credit, and Banking. His academic experience includes teaching in Ecology and in Data Science. His industrial experience includes predictive modeling at AT&T, NCR, and FICO. He has also worked in the Insurance, Credit, membership organization (e.g., AAA), Education, and Health Care industries. He retired as an Assistant Vice President of Santa Barbara Bank & Trust in charge of business intelligence reporting and customer relationship management (CRM) modeling.
Affiliations and expertise
Researcher-Medical Informatics, H.H. Chao Comprehensive Digestive Disease Center, University of California Irvine Medical Center, Private Consulting, Santa Barbara, CA, USA

Gary D. Miner

Gary Miner, PhD, received a B.S. from Hamline University, St. Paul, MN, with biology, chemistry, and education majors; an M.S. in zoology and population genetics from the University of Wyoming; and a Ph.D. in biochemical genetics from the University of Kansas as the recipient of a NASA pre-doctoral fellowship. He pursued additional National Institutes of Health postdoctoral studies at the University of Minnesota and the University of Iowa, eventually becoming immersed in the study of affective disorders and Alzheimer's disease. In 1985, he and his wife, Dr. Linda Winters-Miner, founded the Familial Alzheimer's Disease Research Foundation, which became a leading force in organizing both local and international scientific meetings, bringing together all the leaders in the field of genetics of Alzheimer's from several countries and resulting in the first major book on the genetics of Alzheimer’s disease. In the mid-1990s, Dr. Miner turned his data analysis interests to the business world, joining the team at StatSoft and deciding to specialize in data mining. He started developing what eventually became the Handbook of Statistical Analysis and Data Mining Applications (co-authored with Drs. Robert A. Nisbet and John Elder), which received the 2009 American Publishers Award for Professional and Scholarly Excellence (PROSE). Their follow-up collaboration, Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications, also received a PROSE award in February of 2013. Gary was also co-author of Practical Predictive Analytics and Decisioning Systems for Medicine (Academic Press, 2015). Overall, Dr. Miner’s career has focused on medicine and health issues, and on the use of data analytics (statistics and predictive analytics) in analyzing medical data to separate fact from fiction.
Gary has also served as a Merit Reviewer for PCORI (Patient-Centered Outcomes Research Institute), which awards grants for predictive analytics research into the comparative effectiveness and heterogeneous treatment effects of medical interventions, including drugs, among different genetic groups of patients. He also teaches online classes in ‘Introduction to Predictive Analytics’, ‘Text Analytics’, ‘Risk Analytics’, and ‘Healthcare Predictive Analytics’ for the University of California-Irvine. Until his ‘official retirement’ 18 months ago, he spent most of his time in his primary role as Senior Analyst-Healthcare Applications Specialist for Dell | Information Management Group, Dell Software (following Dell’s acquisition of StatSoft, www.StatSoft.com, in April 2014). Currently, Gary is working on two new short popular books, ‘Healthcare Solutions for the USA’ and ‘Patient-Doctor Genomics Stories’.
Affiliations and expertise
CEO, M&M Predictive Analytics LLC; UCI Adjunct Professor for Continuing Education, Predictive Analytics Program; Associate Editor, The Journal of Geriatric Psychiatry and Neurology; Private Consulting, Tulsa, OK, USA

Keith McCormick

Keith McCormick is a highly accomplished professional consultant, mentor, and trainer who has served as keynote speaker and moderator at international conferences for analytics practitioners and leaders alike. Keith has used statistical software since 1990 and has deep expertise in popular advanced analytics solutions such as IBM SPSS Statistics, IBM SPSS Modeler, and KNIME, as well as open-source and other tools for text and big data analytics. Keith is currently Data Science Principal with Further.
Affiliations and expertise
Data Science Principal, Further
