Applied Data Science

Description:
The explosion of available data, coinciding with the continued evolution of statistical and computational methods, has resulted in a new breed of specialist. These data scientists use rigorous statistical methods to find meaning in data. Minimizing a loss function is not enough: business and societal decisions hinge on the interpretation of these insights. The world of scientific computation is rapidly evolving. Quick-and-dirty scripts are not enough: a maintainable code base and a collaborative development environment allow projects to move into production and scale. A data scientist must wear many hats; we present two of them here.
Maintainable coding techniques will be taught using test-driven development, version control, and collaboration. Code will be of the type found in the scikit-learn and statsmodels packages. Students finish the class having created a library on GitHub and gained an understanding of several core statistical/machine-learning algorithms.
Case studies give students the opportunity to use their own software on real-world data sets. Here they develop intuition for extracting meaning from data. Students finish the class with a website/blog/portfolio, and experience with the translation:
Real world --> data --> scientist --> collaborators/coworkers --> policy-decision/data-product
- Title: Applied Data Science
- Chinese title: 应用数据科学
- Language: English
- Pages: 141
- Size: 3.45 MB
- Tags: data science
- Download: Applied Data Science.pdf
- Password: 65536
Last updated: 2025-04-12 23:58:09