2021-08-04-pca-visualization.Rmd
---
description: How to do PCA Visualization in ggplot2 with Plotly.
name: PCA Visualization
permalink: ggplot2/pca-visualization/
thumbnail_github: pca-visualization.png
layout: base
language: ggplot2
display_as: ai_ml
page_type: u-guide
order: 4
output:
  html_document:
    keep_md: true
---
```{r, echo = FALSE, message=FALSE}
knitr::opts_chunk$set(message = FALSE, warning=FALSE)
```
`ggfortify` lets `ggplot2` know how to interpret PCA objects. After loading `ggfortify`, you can use the `ggplot2::autoplot` function with `stats::prcomp` and `stats::princomp` objects.
## Default plot
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
p <- autoplot(pca_res)
ggplotly(p)
```
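When reading the default plot, it can help to know how much of the total variance each principal component captures. The chunk below is a minimal sketch using only base R; the name `var_explained` is introduced here for illustration and is not part of `ggfortify`.

```{r}
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)

# prcomp stores the standard deviation of each component in `sdev`;
# squaring gives the variances, and dividing by their sum gives proportions.
var_explained <- pca_res$sdev^2 / sum(pca_res$sdev^2)
round(var_explained, 3)
```

The same proportions appear in the "Proportion of Variance" row of `summary(pca_res)`.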
The PCA result contains only numeric values. To colour the points by a non-numeric column of the original data, pass the original data frame with the `data` argument and give the column name with the `colour` argument. Use `help(autoplot.prcomp)` (or `help(autoplot.*)` for other object types) to check the available options.
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
p <- autoplot(pca_res, data = iris, colour = 'Species')
ggplotly(p)
```
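To see why the original data has to be supplied, here is a rough manual equivalent built directly with `ggplot2`. It is a sketch under stated assumptions, not a description of what `autoplot` does internally: the `scores` data frame is a name introduced here, and the axes may be scaled differently from the `autoplot` output.

```{r}
library(ggplot2)
library(plotly)

df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)

# `pca_res$x` holds the numeric scores (PC1..PC4) only; it has no Species column.
scores <- as.data.frame(pca_res$x)

# prcomp keeps the row order of its input, so the grouping column
# from the original data can be attached row by row.
scores$Species <- iris$Species

p <- ggplot(scores, aes(x = PC1, y = PC2, colour = Species)) +
  geom_point()
ggplotly(p)
```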
## Adding data labels
Passing `label = TRUE` draws a label for each data point using `rownames`.
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
p <- autoplot(pca_res, data = iris, colour = 'Species', label = TRUE, label.size = 3)
ggplotly(p)
```
Passing `shape = FALSE` draws the plot without points. In this case, `label` is turned on unless otherwise specified.
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
p <- autoplot(pca_res, data = iris, colour = 'Species', shape = FALSE, label.size = 3)
ggplotly(p)
```
## Displaying eigenvectors
Passing `loadings = TRUE` draws eigenvectors.
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
p <- autoplot(pca_res, data = iris, colour = 'Species', loadings = TRUE)
ggplotly(p)
```
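The eigenvectors themselves can be inspected directly on the `prcomp` object. The chunk below is a small base R sketch; how `autoplot` scales the vectors for display is not shown here.

```{r}
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)

# `rotation` has one row per original variable and one column per component.
# The first two columns correspond to the PC1 and PC2 axes of the plot.
pca_res$rotation[, 1:2]

# Each column is a unit-length eigenvector, so the squared entries sum to 1.
colSums(pca_res$rotation^2)
```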
You can attach eigenvector labels and change some options.
```{r}
library(plotly)
library(ggfortify)
df <- iris[1:4]
pca_res <- prcomp(df, scale. = TRUE)
# Draw the eigenvectors in blue and label each with its variable name
p <- autoplot(pca_res, data = iris, colour = 'Species',
              loadings = TRUE, loadings.colour = 'blue',
              loadings.label = TRUE, loadings.label.size = 3)
ggplotly(p)
```
<!--------------------- EXAMPLE BREAK ------------------------->