Data frame

Academic year: 2021

(1)

Data frame

(2)

What's a data frame?

A data frame has the variables of a data set as columns and the observations as rows.

(Figure: a data frame compared with a matrix)
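The difference between a matrix and a data frame can be seen in a small sketch. The objects `m` and `df` are hypothetical names used only for illustration, and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version.

```r
# A matrix holds one data type: mixing numbers and text
# coerces every element to character.
m <- cbind(c(1, 2), c("a", "b"))
is.character(m)      # TRUE

# A data frame keeps a separate type for each column.
df <- data.frame(id = c(1, 2), label = c("a", "b"),
                 stringsAsFactors = FALSE)
sapply(df, class)    # id is numeric, label is character
```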

(3)

mtcars

head(mtcars)

tail(mtcars)

str(mtcars)  # structure of your data set
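As a small addition to the calls above, `head()` and `tail()` accept an `n` argument, and `dim()` and `names()` give a quick overview of a data set. This is only a sketch using the built-in `mtcars` data.

```r
# Show only the first and last 3 rows instead of the default 6
head(mtcars, n = 3)
tail(mtcars, n = 3)

# Number of rows (observations) and columns (variables)
dim(mtcars)      # 32 11

# Column (variable) names
names(mtcars)
```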

(4)

Creating a data frame

# Definition of vectors

name <- c("Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune")

type <- c("Terrestrial planet", "Terrestrial planet", "Terrestrial planet", "Terrestrial planet", "Gas giant", "Gas giant", "Gas giant", "Gas giant")

diameter <- c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883)

rotation <- c(58.64, -243.02, 1, 1.03, 0.41, 0.43, -0.72, 0.67)

rings <- c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)

# Create a data frame from the vectors

planets_df <- data.frame(name, type, diameter, rotation, rings)

planets_df

str(planets_df)

(5)

Selection of data frame elements

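# Elements are selected with [row, column]

# Print out diameter of Mercury (row 1, column 3)

planets_df[1, 3]

# Print out all data for Mars (row 4, all columns)

planets_df[4, ]

# Print out the diameter of the last three planets (rows 6 to 8)

planets_df[6:8, "diameter"]

# Print out the entire type column

planets_df[ , "type"]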

(6)

Selection of data frame elements (continued)

# The $ operator selects a whole column by name, like planets_df[ , "type"]

planets_df$type
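One detail worth illustrating is what type each selection returns. The sketch below uses a shortened, hypothetical version of `planets_df` (four planets, two columns) so that it runs on its own.

```r
# Shortened stand-in for planets_df (assumption: only two columns are needed here)
planets_df <- data.frame(
  name = c("Mercury", "Venus", "Earth", "Mars"),
  diameter = c(0.382, 0.949, 1, 0.532),
  stringsAsFactors = FALSE
)

d_vec <- planets_df[ , "diameter"]                # a plain numeric vector
d_df  <- planets_df[ , "diameter", drop = FALSE]  # a one-column data frame
d_one <- planets_df["diameter"]                   # also a one-column data frame

is.data.frame(d_vec)   # FALSE
is.data.frame(d_df)    # TRUE

# $ and [[ ]] both return the column itself
identical(planets_df$diameter, planets_df[["diameter"]])  # TRUE
```

Using `drop = FALSE` is helpful when later code expects a data frame regardless of how many columns were selected.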

(7)

Conditional selection

subset(planets_df, subset = diameter < 2)

subset(planets_df, subset = rings)

subset(planets_df, subset = name)  # error: the subset argument must be of logical type
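The same kind of filtering can be sketched with plain logical indexing, and conditions can be combined. The data frame is rebuilt here in shortened form so the example is self-contained; the names `small`, `small2`, and `big_ringed` are hypothetical.

```r
planets_df <- data.frame(
  name = c("Mercury", "Venus", "Earth", "Mars",
           "Jupiter", "Saturn", "Uranus", "Neptune"),
  diameter = c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883),
  rings = c(FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# subset() evaluates the condition inside the data frame
small <- subset(planets_df, subset = diameter < 2)

# Logical indexing selects the same rows
small2 <- planets_df[planets_df$diameter < 2, ]

# Conditions can be combined with & and |, and select picks columns
big_ringed <- subset(planets_df,
                     subset = diameter > 1 & rings,
                     select = c(name, diameter))
```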

(8)

Sorting

a <- c(100, 10, 1000)

order(a)

a[order(a)]
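To make the behaviour of `order()` explicit: it returns the positions of the elements in sorted order, not the sorted values themselves. A minimal sketch with the same vector `a`:

```r
a <- c(100, 10, 1000)

order(a)                      # 2 1 3: the smallest value is at position 2
a[order(a)]                   # 10 100 1000

order(a, decreasing = TRUE)   # 3 1 2
sort(a)                       # 10 100 1000, sorts the values directly
```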

(9)

Sorting data frame

# Sort by diameter

positions <- order(planets_df$diameter)

planets_df[positions, ]
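Two common variations are sorting in descending order and sorting by more than one column. The sketch below rebuilds a shortened `planets_df`; the names `largest_first` and `by_type` are hypothetical.

```r
planets_df <- data.frame(
  name = c("Mercury", "Venus", "Earth", "Mars",
           "Jupiter", "Saturn", "Uranus", "Neptune"),
  type = c(rep("Terrestrial planet", 4), rep("Gas giant", 4)),
  diameter = c(0.382, 0.949, 1, 0.532, 11.209, 9.449, 4.007, 3.883),
  stringsAsFactors = FALSE
)

# Largest diameter first
largest_first <- planets_df[order(planets_df$diameter, decreasing = TRUE), ]

# Sort by type, then by descending diameter within each type
by_type <- planets_df[order(planets_df$type, -planets_df$diameter), ]
```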

(10)

Summary

Vectors (one-dimensional arrays): can hold numeric, character or logical values. The elements in a vector all have the same data type.

Matrices (two-dimensional arrays): can hold numeric, character or logical values. The elements in a matrix all have the same data type.

Data frames (two-dimensional objects): can hold numeric, character or logical values. Within a column all elements have the same data type, but different columns can be of different data types.
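The type rules in this summary can be checked with a short sketch. The names `v`, `m`, and `df` are hypothetical.

```r
# Vector: mixed input is coerced to a single type (here character)
v <- c(1, "a", TRUE)
class(v)             # "character"

# Matrix: all elements share one type
m <- matrix(c(1, 2, 3, 4), nrow = 2)
is.numeric(m)        # TRUE

# Data frame: each column has its own type
df <- data.frame(x = c(1, 2), y = c(TRUE, FALSE))
sapply(df, class)    # x is numeric, y is logical
```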

(11)

List

(12)

list()

# Vector with numerics from 1 up to 10

my_vector <- 1:10

# Matrix with numerics from 1 up to 9

my_matrix <- matrix(1:9, ncol = 3)

# First 10 elements of the built-in data frame mtcars

my_df <- mtcars[1:10, ]

# Construct list with these different elements:

my_list <- list(my_vector, my_matrix, my_df)

my_list

(13)

list() with named components

# Vector with numerics from 1 up to 10

my_vector <- 1:10

# Matrix with numerics from 1 up to 9

my_matrix <- matrix(1:9, ncol = 3)

# First 10 elements of the built-in data frame mtcars

my_df <- mtcars[1:10, ]

# Adapt list() call to give the components names

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

my_list

(14)

Selecting elements by name

my_vector <- 1:10

my_matrix <- matrix(1:9, ncol = 3)

my_df <- mtcars[1:10, ]

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

my_list["mat"]

my_list[["mat"]]

my_list$mat
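The three forms above do not return the same thing. As a sketch, with a smaller hypothetical `my_list`:

```r
my_list <- list(vec = 1:10, mat = matrix(1:9, ncol = 3))

# Single brackets return a list that contains the component
is.list(my_list["mat"])       # TRUE
is.matrix(my_list["mat"])     # FALSE

# Double brackets and $ return the component itself
is.matrix(my_list[["mat"]])   # TRUE
is.matrix(my_list$mat)        # TRUE

# So further indexing works directly on the extracted component
my_list[["vec"]][3]           # 3
my_list$mat[2, 3]             # 8
```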

(15)

Add value(s) to the list

my_vector <- 1:10

my_matrix <- matrix(1:9, ncol = 3)

my_df <- mtcars[1:10, ]

my_list <- list(vec = my_vector, mat = my_matrix, df = my_df)

my_list <- c(my_list, year = 1980, month = 11)

my_list
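Besides `c()`, components can be added with `$` or `[[ ]]` and removed by assigning `NULL`. The sketch below uses a shortened list; the components `day`, `note`, and `b` are hypothetical additions for illustration.

```r
my_list <- list(vec = 1:10)

# Add components with c(), as above
my_list <- c(my_list, year = 1980, month = 11)

# Add components by assignment
my_list$day <- 5
my_list[["note"]] <- "example"
names(my_list)     # "vec" "year" "month" "day" "note"

# Remove a component by assigning NULL
my_list$month <- NULL
length(my_list)    # 4

# Caution: c() splits a longer vector into separate components;
# wrap it in list() to add it as a single component
spliced <- c(list(a = 1), b = 1:3)        # components a, b1, b2, b3
kept    <- c(list(a = 1), list(b = 1:3))  # components a, b
```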
