Mathematical problems in data science : theoretical and practical methods / Li M. Chen, Zhixun Su, Bo Jiang.

This book describes current problems in data science and Big Data. Key topics are data classification, Graph Cut, the Laplacian Matrix, Google Page Rank, efficient algorithms, hardness of problems, different types of big data, geometric data structures, topological data processing, and various learn...

Bibliographic Details
Online Access: Full Text (via Skillsoft)
Main Authors: Chen, Li M. (Author), Su, Zhixun (Author), Jiang, Bo (Author)
Format: eBook
Language: English
Published: Cham : Springer, 2015.
Table of Contents:
  • Introduction: Data Science and BigData Computing
  • Overview of Basic Methods for Data Science
  • Relationship and Connectivity of Incomplete Data Collection
  • Machine Learning for Data Science: Mathematical or Computational
  • Images, Videos, and BigData
  • Topological Data Analysis
  • Monte Carlo Methods and their Applications in Big Data Analysis
  • Feature Extraction via Vector Bundle Learning
  • Curve Interpolation and Financial Curve Construction
  • Advanced Methods in Variational Learning: Segmentation with Intensity Inhomogeneity
  • An On-line Strategy of Groups Evacuation From a Convex Region in the Plane
  • A New Computational Model of Bigdata
  • Preface
  • Contents
  • Acronyms
  • Part I Basic Data Science
  • 1 Introduction: Data Science and BigData Computing
  • 1.1 Data Mining and Cloud Computing: The Prelude of BigData and Data Science
  • 1.2 BigData Era
  • 1.3 The Meaning of Data Sciences
  • 1.4 Problems Related to Data Science
  • 1.5 Mathematical Problems in Data Science
  • 1.6 Mathematics, Data Science, and Data Scientists in Industry
  • 1.7 Remark: Discussion on the Future Problems in Data Science
  • References
  • 2 Overview of Basic Methods for Data Science
  • 2.1 "Hardware" and "Software" of Data Science
  • 2.1.1 Searching and Optimization
  • 2.1.2 Decision Making
  • 2.1.3 Classification
  • 2.1.4 Learning
  • 2.2 Graph-Theoretic Methods
  • 2.2.1 Review of Graphs
  • 2.2.2 Breadth First Search and Depth First Search
  • 2.2.3 Dijkstra's Algorithm for the Shortest Path
  • 2.2.4 Minimum Spanning Tree
  • 2.3 Statistical Methods
  • 2.4 Classification, Clustering, and Pattern Recognition
  • 2.4.1 k-Nearest Neighbor Method
  • 2.4.2 k-Means Method
  • 2.5 Numerical Methods and Data Reconstruction in Science and Engineering
  • 2.6 Algorithm Design and Computational Complexity
  • 2.6.1 Concepts of Data Structures and Databases
  • 2.6.2 Queues, Stacks, and Linked Lists
  • 2.6.3 Quadtrees, Octrees, and R-trees
  • 2.6.4 NP-Hard Problems and Approximation of Solutions
  • 2.7 Online Searching and Matching
  • 2.7.1 Google Search
  • 2.7.2 Matching
  • 2.7.3 High Dimensional Search
  • 2.8 Remarks: Relationship Among Data Science, Database, Networking, and Artificial Intelligence
  • References
  • 3 Relationship and Connectivity of Incomplete Data Collection
  • 3.1 Current Challenges of Problems in Data Science
  • 3.1.1 Relations and Connectedness
  • 3.1.2 Multiple Scaling and Multiple Level Sampling
  • 3.1.3 Relationship and Connectivity Among Data
  • 3.2 Generalized λ-Connectedness
  • 3.2.1 λ-Connectedness on Undirected Graphs
  • 3.2.2 λ-Connectedness on Directed Graphs
  • 3.2.3 Potential Function ρ and the Measure μ
  • 3.3 λ-Connected Decomposition and Image Segmentation
  • 3.3.1 λ-Connected Region Growing Segmentation
  • 3.3.2 λ-Connected Split-and-Merge Segmentation
  • 3.4 λ-Connectedness for Data Reconstruction
  • 3.4.1 λ-Connected Fitting
  • 3.4.2 Intelligent Data Fitting
  • 3.4.3 λ-Connected Fitting with Smoothness
  • 3.5 Maximum Connectivity Spanning Tree and λ-Value
  • 3.6 λ-Connectedness and Topological Data Analysis
  • 3.7 Remark: Future Concerns
  • References
  • Part II Data Science Problems and Machine Learning
  • 4 Machine Learning for Data Science: Mathematical or Computational
  • 4.1 Decision Trees and Boosting Process
  • 4.2 Neural Networks
  • 4.3 Genetic Algorithms
  • 4.4 Functional and Variational Learning
  • 4.5 Support Vector Machine Algorithms
  • 4.6 Computational Learning Theory
  • 4.7 Remarks: Statistical Learning Algorithms and BigData
  • References
  • 5 Images, Videos, and BigData
  • 5.1 Images and Videos in BigData Times
  • 5.2 Concepts of Image Processing: Filtering, Segmentation, and Recognition
  • 5.2.1 Image Filtering
  • 5.2.2 Segmentation
  • 5.2.3 Image Recognition
  • 5.3 The Five Philosophies of Image Segmentation
  • 5.4 Image Segmentation Methods
  • 5.4.1 The Maximum Entropy Method and the Minimum Variance Method
  • 5.4.2 Learning in λ-Connectedness with Maximum Entropy
  • 5.4.3 λ-Connected Segmentation and the Minimum Variance Method
  • 5.4.4 λ-Connectedness and the Mumford-Shah Method
  • 5.5 Graph-Cut Based Image Segmentation and Laplacian Matrices
  • 5.6 Segmentation, BigData, and Subspace Clustering
  • 5.6.1 Quadtree and Octree Based Image Segmentation
  • 5.6.2 General Subspace Clustering
  • 5.7 Object Tracking in Video Images
  • 5.7.1 Videos, Compression, and Video Storing Formats
  • 5.7.2 Object Tracking
  • 5.7.3 Tracking Algorithms
  • 5.7.3.1 Machine Learning Algorithms
  • 5.7.3.2 Online Object Tracking
  • 5.8 Future Concerns: BigData Related Image Segmentation
  • References
  • 6 Topological Data Analysis
  • 6.1 Why Topology for Data Sets
  • 6.2 Concepts: Cloud Data, Decomposition, Simplex, Complex, and Topology
  • 6.3 Algorithmic Geometry and Topology
  • 6.3.1 Algorithms for Delaunay Triangulations and Voronoi Diagrams
  • 6.3.2 Manifold Learning on Cloud Data
  • 6.3.2.1 Real Problems Related to Manifold Learning
  • 6.4 Persistent Homology and Data Analysis
  • 6.4.1 Euler Characteristics and Homology Groups
  • 6.4.2 Data Analysis Using Topology
  • 6.5 Digital Topology Methods and Fast Implementation
  • 6.5.1 2D Digital Holes and Betti Numbers
  • 6.5.2 3D Genus Computation
  • 6.6 New Developments in Persistent Homology and Analysis
  • 6.6.1 Coverage of Sensor Networks Using Persistent Homology
  • 6.6.2 Statistics and Machine Learning Combined with Topological Analysis
  • 6.7 Remarks: Topological Computing and Applications in the Future
  • 6.7.1 λ-Connectedness and Topological Data Analysis
  • 6.7.2 Hierarchy of λ-Connectedness and Topological Data Analysis
  • 6.7.3 Topological Computing in Cloud Computers
  • References
  • 7 Monte Carlo Methods and Their Applications in Big Data Analysis
  • 7.1 Introduction
  • 7.2 The Basic of Monte Carlo
  • 7.3 Variance Reduction
  • 7.3.1 Stratified Sampling
  • 7.3.2 Control Variates
  • 7.3.3 Antithetic Variates
  • 7.3.4 Importance Sampling
  • 7.4 Examples of Monte Carlo Methods in Data Science
  • 7.4.1 Case Study 1: Estimation of Sum
  • 7.4.2 Case Study 2: Monte Carlo Linear Solver
  • 7.4.3 Case Study 3: Image Recovery
  • 7.4.4 Case Study 4: Matrix Multiplication
  • 7.4.5 Case Study 5: Low-Rank Approximation
  • 7.5 Summary
  • References
  • Part III Selected Topics in Data Science