Data Science at the Command Line, 2nd Edition: Obtain, Scrub, Explore, and Model Data with Unix Power Tools

Description:
This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data.
To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools—useful whether you work with Windows, macOS, or Linux.
You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power.
This book is ideal for data scientists, analysts, engineers, system administrators, and researchers.
- Obtain data from websites, APIs, databases, and spreadsheets
- Perform scrub operations on plain text, CSV, HTML/XML, and JSON
- Explore data, compute descriptive statistics, and create visualizations
- Manage your data science workflow using Drake
- Create reusable tools from one-liners and existing Python or R code
- Parallelize and distribute data-intensive pipelines using GNU Parallel
- Model data with dimensionality reduction, clustering, regression, and classification algorithms
- Title: Data Science at the Command Line, 2nd Edition: Obtain, Scrub, Explore, and Model Data with Unix Power Tools
- Chinese title: 命令行数据科学,第二版:使用 Unix Power Tools 获取、清理、探索和建模数据
- Language: English
- Year: 2014
- Pages: 212
- Size: 7.60 MB
- Tags: Data Science, Unix
- Download: Data Science at the Command Line, 2nd Edition: Obtain, Scrub, Explore, and Model Data with Unix Power Tools.pdf
- Password: 65536
Last updated: 2025-04-12 23:58:02