Data frame

What's a data frame?

A data frame has the variables of a data set as columns and the observations as rows.

Data frame vs. matrix

mtcars

head(mtcars)

tail(mtcars)

str(mtcars)  # structure of your data set
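A few more ways to inspect the same built-in data set, as a minimal sketch. Only base R functions are used; which variable to summarize (mpg) is an arbitrary choice for illustration.

```r
dim(mtcars)            # 32 11: 32 observations (rows), 11 variables (columns)
nrow(mtcars)           # 32
names(mtcars)          # variable names, starting with "mpg" "cyl" "disp"
head(mtcars, n = 3)    # first 3 rows instead of the default 6
summary(mtcars$mpg)    # quick numeric summary of one variable
```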

Creating a data frame

# Definition of vectors

name <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune")

type <- c("Terrestrial planet", "Terrestrial planet", "Terrestrial planet",
          "Terrestrial planet", "Gas giant", "Gas giant", "Gas giant", "Gas giant")

diameter <- c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883)

rotation <- c(58.64, -243.02, 1, 1.03, 0.41, 0.43, -0.72, 0.67)

rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Create a data frame from the vectors

planets_df <- data.frame(name, type, diameter, rotation, rings)

planets_df

str(planets_df)

Selection of data frame elements

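# Indexing form: planets_df[row, column]

# Diameter of Mercury (row 1, column 3)

planets_df[1, 3]

# All columns for Mars (row 4)

planets_df[4, ]

# Diameter of the last three planets (rows 6 to 8)

planets_df[6:8, "diameter"]

# The whole "type" column, selected by name

planets_df[ , "type"]

# The same column, using the $ shortcut

planets_df$type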
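A short sketch of what each selection form returns. It assumes a cut-down copy of planets_df (four planets, two columns) so the block runs on its own; the drop = FALSE variant is an addition not shown on the slide.

```r
# Cut-down copy of planets_df, for illustration only
planets_df <- data.frame(
  name     = c("Mercury", "Venus", "Earth", "Mars"),
  diameter = c(0.382, 0.949, 1, 0.532)
)

planets_df[2, "diameter"]                   # single value: 0.949
planets_df[, "diameter"]                    # one column comes back as a plain vector
planets_df[, "diameter", drop = FALSE]      # drop = FALSE keeps it as a data frame
planets_df[["diameter"]]                    # same vector as planets_df$diameter
planets_df[c(1, 3), c("name", "diameter")]  # several rows and columns at once
```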

Conditional filtering

subset(planets_df, subset = diameter < 2)

subset(planets_df, subset = rings)

# Error: the subset argument must evaluate to logical values, and name is a character column

# subset(planets_df, subset = name)
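The condition passed to subset() is just a logical vector with one value per row. A minimal sketch, assuming a cut-down copy of planets_df and a hypothetical helper variable named small:

```r
# Cut-down copy of planets_df, for illustration only
planets_df <- data.frame(
  name     = c("Mercury", "Venus", "Earth", "Mars", "Jupiter"),
  diameter = c(0.382, 0.949, 1, 0.532, 11.209),
  rings    = c(FALSE, FALSE, FALSE, FALSE, TRUE)
)

small <- planets_df$diameter < 1   # logical vector, one value per row
small                              # TRUE TRUE FALSE TRUE FALSE
planets_df[small, ]                # same rows as subset(planets_df, diameter < 1)

# A character column is not a valid condition; compare it to get logicals
subset(planets_df, subset = name == "Earth")

# Conditions can be combined with & (and), | (or), ! (not)
subset(planets_df, subset = diameter < 1 & !rings)
```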

Sorting

a <- c(100, 10, 1000)

order(a)

a[order(a)]
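order() returns positions, not values. A small sketch of what the calls above produce, plus the decreasing argument, which is not on the slide:

```r
a <- c(100, 10, 1000)

order(a)                        # 2 1 3: positions of the smallest, middle, largest value
a[order(a)]                     # 10 100 1000, same result as sort(a)
a[order(a, decreasing = TRUE)]  # 1000 100 10
```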

Sorting data frame

# Sort by diameter: order() returns the row positions in ascending order of diameter

positions <- order(planets_df$diameter)

planets_df[positions, ]
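Two variations on the same idea, as a sketch: descending order, and sorting by more than one column. It assumes a cut-down copy of planets_df; by_size is a hypothetical name.

```r
# Cut-down copy of planets_df, for illustration only
planets_df <- data.frame(
  name     = c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn"),
  diameter = c(0.382, 0.949, 1, 0.532, 11.209, 9.449),
  rings    = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
)

# Largest diameter first
by_size <- planets_df[order(planets_df$diameter, decreasing = TRUE), ]
by_size$name   # Jupiter, Saturn, Earth, Venus, Mars, Mercury

# Sort by rings first (FALSE before TRUE), then by diameter within each group
planets_df[order(planets_df$rings, planets_df$diameter), ]
```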

Summary

Vectors (one-dimensional arrays): can hold numeric, character, or logical values. All elements in a vector have the same data type.

Matrices (two-dimensional arrays): can hold numeric, character, or logical values. All elements in a matrix have the same data type.

Data frames (two-dimensional objects): can hold numeric, character, or logical values. Within a column all elements have the same data type, but different columns can have different data types.
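The type rules above can be checked directly. A minimal sketch with made-up values; the variable names v, m, and df are only for illustration.

```r
v <- c(1, "a", TRUE)
class(v)            # "character": a vector has one type, so mixed values are coerced

m <- matrix(c(1, 2, "x", "y"), ncol = 2)
class(m[1, 1])      # "character": the same rule applies to a matrix

df <- data.frame(n = c(1, 2), s = c("x", "y"), b = c(TRUE, FALSE))
class(df$n)         # "numeric"
class(df$b)         # "logical": each column keeps its own type
```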

List

list()

# Vector with numerics from 1 up to 10

my_vector <- 1:10

# Matrix with numerics from 1 up to 9

my_matrix <- matrix(1:9, ncol = 3)

# First 10 rows of the built-in data frame mtcars

my_df <- mtcars[1:10, ]

# Construct list with these different elements:

my_list <- list(my_vector, my_matrix, my_df)

my_list

Naming the components in list()

# Vector with numerics from 1 up to 10

my_vector <- 1:10

# Matrix with numerics from 1 up to 9

my_matrix <- matrix(1:9, ncol = 3)

# First 10 rows of the built-in data frame mtcars

my_df <- mtcars[1:10, ]

# Adapt list() call to give the components names

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

my_list

Extracting elements by name

my_vector <- 1:10

my_matrix <- matrix(1:9, ncol = 3)

my_df <- mtcars[1:10, ]

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

my_list["mat"]

my_list[["mat"]]

my_list$mat
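The three forms do not return the same thing. A short sketch of the difference, using a smaller list with the same component names so it runs on its own:

```r
my_list <- list(vec = 1:10, mat = matrix(1:9, ncol = 3))

is.list(my_list["mat"])      # TRUE: single brackets return a sub-list of length 1
is.matrix(my_list[["mat"]])  # TRUE: double brackets return the component itself
my_list[[2]]                 # position works too; same object as my_list$mat
my_list$mat[2, 3]            # row 2, column 3 of the matrix: 8
```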

Add value(s) to the list

my_vector <- 1:10

my_matrix <- matrix(1:9, ncol = 3)

my_df <- mtcars[1:10, ]

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

# c() appends new named components and returns the extended list

my_list <- c(my_list, year = 1980, month = 11)

my_list
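Besides c(), components can be added by assigning to a new name. A minimal sketch on a smaller list; the day component is hypothetical and only there to show removal.

```r
my_list <- list(vec = 1:10)

my_list <- c(my_list, year = 1980)  # c() returns a new, longer list
my_list$month <- 11                 # assigning to a new name appends a component
my_list[["day"]] <- 5               # hypothetical extra component, for illustration only
names(my_list)                      # "vec" "year" "month" "day"

my_list$day <- NULL                 # assigning NULL removes a component
length(my_list)                     # 3
```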