4 Unstructured Data
A Python-R Dictionary for Data Science
R
library(jsonlite)
library(xml2)
Python
import pandas as pd
4.1 HTML pages
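As a rough illustration of what reading an HTML page as unstructured data can look like, the sketch below pulls the text nodes out of an HTML string. It uses only Python's standard-library `html.parser`, not the pandas import shown above. The names `TextExtractor`, `html_to_text`, and `sample_html` are hypothetical and chosen for this example only. It is a minimal sketch, not the chapter's own method.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the non-empty text nodes of an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags; skip whitespace-only runs.
        text = data.strip()
        if text:
            self.chunks.append(text)


def html_to_text(html):
    """Return the text nodes of an HTML string as a list of strings."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Hypothetical sample page, defined inline so the example needs no network access.
sample_html = "<html><body><h1>Title</h1><p>First paragraph.</p></body></html>"
print(html_to_text(sample_html))
```

This sketch does not filter out the contents of `<script>` or `<style>` elements, so a real page would likely need extra handling before the text is usable.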
4.2 PDFs