
Principles of Data Science

Publisher: Southeast University Press
Publication date: 2017-10-01
Format: 24 cm
Pages: 12, 369
List price: ¥92.0

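Principles of Data Science: Book Highlights

This book aims to help you bring together mathematics, programming, and business analysis. With it, you can approach complex problems with confidence, whether you are working with abstract, raw statistics or with actionable ideas. The book takes a distinctive approach to bridging mathematics and computer science, and it guides you on the path to becoming a data scientist. Beginning with data cleaning and preparation and moving on to effective data mining strategies and techniques, you will work through the entire data science pipeline, build a big-picture view of how the components of data science fit together, and learn the fundamentals of mathematics and statistics along with some of the pseudocode that data scientists and analysts use today. You will also get to grips with machine learning, meet useful statistical models that help you take control of very dense datasets, and learn how to create visualizations that communicate what the data means.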

Principles of Data Science: Table of Contents

Preface

Chapter 1: How to Sound Like a Data Scientist
What is data science?; Basic terminology; Why data science?; Example - Sigma Technologies; The data science Venn diagram; The math; Example - spawner-recruit models; Computer programming; Why Python?; Python practices; Example of basic Python; Domain knowledge; Some more terminology; Data science case studies; Case study - automating government paper pushing; Fire all humans, right?; Case study - marketing dollars; Case study - what's in a job description?; Summary

Chapter 2: Types of Data
Flavors of data; Why look at these distinctions?; Structured versus unstructured data; Example of data preprocessing; Word/phrase counts; Presence of certain special characters; Relative length of text; Picking out topics; Quantitative versus qualitative data; Example - coffee shop data; Example - world alcohol consumption data; Digging deeper; The road thus far; The four levels of data; The nominal level; Mathematical operations allowed; Measures of center; What data is like at the nominal level; The ordinal level; Examples; Mathematical operations allowed; Measures of center; Quick recap and check; The interval level; Example; Mathematical operations allowed; Measures of center; Measures of variation; The ratio level; Examples; Measures of center; Problems with the ratio level; Data is in the eye of the beholder; Summary

Chapter 3: The Five Steps of Data Science
Introduction to Data Science; Overview of the five steps; Ask an interesting question; Obtain the data; Explore the data; Model the data; Communicate and visualize the results; Explore the data; Basic questions for data exploration; Dataset 1 - Yelp; Dataframes; Series; Exploration tips for qualitative data; Dataset 2 - titanic; Summary

Chapter 4: Basic Mathematics
Mathematics as a discipline; Basic symbols and terminology; Vectors and matrices; Quick exercises; Answers; Arithmetic symbols; Summation; Proportional; Dot product; Graphs; Logarithms/exponents; Set theory; Linear algebra; Matrix multiplication; How to multiply matrices; Summary

Chapter 5: Impossible or Improbable - A Gentle Introduction to Probability
Basic definitions; Probability; Bayesian versus Frequentist; Frequentist approach; The law of large numbers; Compound events; Conditional probability; The rules of probability; The addition rule; Mutual exclusivity; The multiplication rule; Independence; Complementary events; A bit deeper; Summary

Chapter 6: Advanced Probability
Collectively exhaustive events; Bayesian ideas revisited; Bayes theorem; More applications of Bayes theorem; Example - Titanic; Example - medical studies; Random variables; Discrete random variables; Types of discrete random variables; Summary

Chapter 7: Basic Statistics
What are statistics?; How do we obtain and sample data?; Obtaining data; Observational; Experimental; Sampling data; Probability sampling; Random sampling; Unequal probability sampling; How do we measure statistics?; Measures of center; Measures of variation; Definition; Example - employee salaries; Measures of relative standing; The insightful part - correlations in data; The Empirical rule; Summary

Chapter 8: Advanced Statistics
Point estimates; Sampling distributions; Confidence intervals; Hypothesis tests; Conducting a hypothesis test; One sample t-tests; Example of a one sample t-tests; Assumptions of the one sample t-tests; Type I and type II errors; Hypothesis test for categorical variables; Chi-square goodness of fit test; Chi-square test for association/independence; Summary

Chapter 9: Communicating Data
Why does communication matter?; Identifying effective and ineffective visualizations; Scatter plots; Line graphs; Bar charts; Histograms; Box plots; When graphs and statistics lie; Correlation versus causation; Simpson's paradox; If correlation doesn't imply causation, then what does?; Verbal communication; It's about telling a story; On the more formal side of things; The why/how/what strategy of presenting; Summary

Chapter 10: How to Tell If Your Toaster Is Learning - Machine Learning Essentials
What is machine learning?; Machine learning isn't perfect; How does machine learning work?; Types of machine learning; Supervised learning; It's not only about predictions; Types of supervised learning; Data is in the eyes of the beholder; Unsupervised learning; Reinforcement learning; Overview of the types of machine learning; How does statistical modeling fit into all of this?; Linear regression; Adding more predictors; Regression metrics; Logistic regression; Probability, odds, and log odds; The math of logistic regression; Dummy variables; Summary

Chapter 11: Predictions Don't Grow on Trees - or Do They?
Naïve Bayes classification; Decision trees; How does a computer build a regression tree?; How does a computer fit a classification tree?; Unsupervised learning; When to use unsupervised learning; K-means clustering; Illustrative example - data points; Illustrative example - beer!; Choosing an optimal number for K and cluster validation; The Silhouette Coefficient; Feature extraction and principal component analysis; Summary

Chapter 12: Beyond the Essentials
The bias variance tradeoff; Error due to bias; Error due to variance; Two extreme cases of bias/variance tradeoff; Underfitting; Overfitting; How bias/variance play into error functions; K folds cross-validation; Grid searching; Visualizing training error versus cross-validation error; Ensembling techniques; Random forests; Comparing Random forests with decision trees; Neural networks; Basic structure; Summary

Chapter 13: Case Studies
Case study 1 - predicting stock prices based on social media; Text sentiment analysis; Exploratory data analysis; Regression route; Classification route; Going beyond with this example; Case study 2 - why do some people cheat on their spouses?; Case study 3 - using tensorflow; Tensorflow and neural networks; Summary

Index