17-Nov-2019

For PCA/SVD, the scale/units of the data matter
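
A minimal sketch of why scale matters, assuming base R's prcomp() and the built-in USArrests dataset (neither is named in these notes; they are used here only for illustration). Assault has a much larger variance than the other columns, so it dominates the first component unless the variables are scaled first.

```r
# USArrests is a built-in dataset (an assumption for this example);
# its columns are measured in different units and have very different variances.
pca_raw    <- prcomp(USArrests)                 # centered, but not scaled
pca_scaled <- prcomp(USArrests, scale. = TRUE)  # centered and scaled to unit variance

# Loadings of the first principal component under each choice
round(pca_raw$rotation[, "PC1"], 2)
round(pca_scaled$rotation[, "PC1"], 2)
```

Comparing the two sets of loadings shows the unscaled PC1 is driven almost entirely by Assault, while the scaled PC1 spreads its weight across the variables.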

PCs/SVs (principal components/singular vectors) may mix real patterns

SVD can be computationally intensive for very large matrices

To quote from Hadley Wickham’s book on ggplot2, we want to “shorten the distance from mind to page”.

“…the grammar tells us that a statistical graphic is a mapping from data to aesthetic attributes (colour, shape, size) of geometric objects (points, lines, bars). The plot may also contain statistical transformations of the data and is drawn on a specific coordinate system.” – from ggplot2 book

The grammar allows for a more compact summary of the base components of a language

in ggplot2 in R, qplot() plots are made up of aesthetics (size, shape, color) and geoms (points, lines).

ggplot() is very flexible for doing things qplot() cannot do.
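
As one illustration of that flexibility (a sketch, not the only way to do it), ggplot() lets each layer carry its own aesthetics and settings. The example assumes the mpg dataset that ships with ggplot2; the choice of a linear fit is arbitrary.

```r
library(ggplot2)

# Start from the data and the shared x/y mapping, then add layers one at a time.
p_layers <- ggplot(mpg, aes(x = displ, y = hwy)) +
  # Points are colored by drive type; this mapping applies to this layer only
  geom_point(aes(colour = drv), alpha = 0.6) +
  # A single overall linear fit, drawn in black with no confidence band
  geom_smooth(method = "lm", se = FALSE, colour = "black")

print(p_layers)
```

Because the colour mapping lives only in the point layer, the smoother is fit to all points at once rather than once per drive type.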

your data should have appropriate metadata so that you can quickly look at a dataset and know

• what the variables are

• what the values of each variable mean

for ggplot2 in R, non-numeric or categorical variables should be coded as factor variables and have meaningful labels for each level of the factor. For example, if a variable represents temperature categories, it might be better to use “cold”, “mild”, and “hot” rather than “1”, “2”, and “3”.
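
A small sketch of that recoding, using base R's factor(). The vector temp_code and its values are made up for illustration.

```r
# Hypothetical temperature categories stored as numeric codes
temp_code <- c(1, 3, 2, 2, 1)

# Recode as a factor with a meaningful label for each level
temp <- factor(temp_code,
               levels = c(1, 2, 3),
               labels = c("cold", "mild", "hot"))

levels(temp)  # the labels that legends and facet strips would show
table(temp)   # counts per labeled level
```

Plots built from temp would then show “cold”, “mild”, and “hot” in legends and panel labels instead of the raw codes.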

including the proper metadata can make your exploratory plots essentially self-documenting

for ggplot2 in R, in the call to qplot() you must specify the 'data' argument so that qplot() knows where to look up the variables.

in ggplot2 for R, color is an aesthetic and the color of each point can be mapped to a variable.
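
A short sketch of both points, assuming the mpg dataset bundled with ggplot2 (the same one used in the qplot() call elsewhere in these notes). As far as I am aware, qplot() is deprecated in recent ggplot2 releases, so it may emit a warning while still producing the plot.

```r
library(ggplot2)

# data = mpg tells qplot() where to find the displ and hwy columns
p_basic <- qplot(displ, hwy, data = mpg)

# Mapping the color aesthetic to drv colors each point by drive type
p_color <- qplot(displ, hwy, data = mpg, color = drv)

print(p_color)
```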

in ggplot2 for R, a smooth is a “geom” that you can add along with your data points.

qplot(displ, hwy, data = mpg, geom = c("point", "smooth"))

in ggplot2 for R, "point" is the default geom, but if you want the smoother overlaid on the points you must list both explicitly: geom = c("point", "smooth").

a way to create multiple panels of plots based on the levels of a categorical variable in ggplot2 for R is facets.
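
A minimal faceting sketch, again assuming the mpg dataset from ggplot2. The facets argument of qplot() takes a formula of the form rows ~ columns, where a dot means "no split on this side".

```r
library(ggplot2)

# One panel per level of drv, arranged as columns (no row variable)
p_facet <- qplot(displ, hwy, data = mpg, facets = . ~ drv)

print(p_facet)
```

Writing drv ~ . instead would stack the panels as rows.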

An alternative to histograms is a density smoother, which sometimes can be easier to visualize when there are multiple groups.

Here is a density smooth of an entire study population.

qplot(log(eno), data = maacs, geom = "density")
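
The maacs data above is not bundled with ggplot2, so the sketch below substitutes the mpg dataset (an assumption made only so the example runs) to show how densities can be compared across groups by mapping color to a categorical variable.

```r
library(ggplot2)

# Density of highway mileage for all cars together
p_density_all <- qplot(hwy, data = mpg, geom = "density")

# One density curve per drive type, distinguished by color
p_density_grp <- qplot(hwy, data = mpg, geom = "density", color = drv)

print(p_density_grp)
```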

in ggplot2 for R, aesthetic mappings describe how data are mapped to color, size, shape, and location

in ggplot2 for R, geoms are geometric objects like points, lines, shapes

in ggplot2 for R, facets describe how conditional/panel plots should be constructed

in ggplot2 for R, stats are statistical transformations like binning, quantiles, smoothing.

in ggplot2 for R, scales describe which scale an aesthetic mapping uses (example: male = red, female = blue).

in ggplot2 for R, the coordinate system describes the system in which the locations of the geoms will be drawn
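
The components listed above can be seen together in one plot. This is a sketch under assumptions: it uses the mpg dataset from ggplot2, and the particular colors, facet variable, and axis limits are arbitrary choices for illustration.

```r
library(ggplot2)

p_components <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +  # aesthetic mappings
  geom_point() +                                    # geom: points
  stat_smooth(method = "lm", se = FALSE) +          # stat: a linear fit per drive type
  facet_wrap(~ year) +                              # facets: one panel per model year
  scale_colour_manual(                              # scale: which color each drv value gets
    values = c("4" = "red", "f" = "blue", "r" = "darkgreen")
  ) +
  coord_cartesian(ylim = c(10, 45))                 # coordinate system: Cartesian, zoomed on y

print(p_components)
```

Each line after the first adds exactly one component, so dropping or swapping a line changes only that part of the plot.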
