R intro part 2

Hi there again,

Here is part 2 of my R intro summary :)

Other handy functions:

lapply - returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.

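sapply – returns a vector, matrix or array built from the results of applying FUN to the elements of X; it simplifies the output where it can, which makes it more user-friendly than lapply.

vapply – like sapply (it keeps names where it can), but the type and length of each return value are specified up front with FUN.VALUE, so it can be safer (and sometimes faster) to use.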
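A quick sketch of how the three differ; the words vector is just made-up input for illustration:

```r
# Lengths of a few words, computed three ways.
words <- c("alpha", "be", "gam")

as_list   <- lapply(words, nchar)                          # always a list
as_vector <- sapply(words, nchar)                          # simplified to a named integer vector
checked   <- vapply(words, nchar, FUN.VALUE = integer(1))  # errors unless each result is one integer

print(as_list)
print(as_vector)
print(checked)
```

With vapply, a function that returned a character or two values per element would stop with an error instead of silently changing the shape of the result.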

seq – generate a sequence, rep – repeat elements, sort – sort a vector, rev – reverse the elements, append – merge vectors, is.* – check the class of an R object, as.* – convert an R object from one class to another, unlist – flatten a list into a vector

REGULAR EXPRESSIONS! – a regular expression is a sequence of characters that defines a search pattern.

grepl – returns TRUE for each string in which the pattern is found, grep – returns a vector of indices of the character strings that contain the pattern. In sub and gsub you can specify a replacement argument: sub replaces only the first match, gsub replaces all the matches.
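The four functions side by side, as a small sketch; the emails vector is invented for the example:

```r
emails <- c("john.doe@example.com", "not an email", "kate@example.org")

has_at <- grepl("@", emails)      # logical vector, one value per string
idx    <- grep("@", emails)       # indices of the strings that match
first  <- sub("e", "E", emails)   # replaces only the first "e" in each string
all_e  <- gsub("e", "E", emails)  # replaces every "e" in each string

print(has_at)
print(idx)
print(first[3])
print(all_e[3])
```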

Date and POSIXct objects –
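The note above does not say more about these, so the following is only my own minimal sketch of the basics: a Date is stored as a number of days and a POSIXct as a number of seconds, both counted from 1970-01-01. The dates used are arbitrary.

```r
day   <- as.Date("1970-01-02")                          # Date: days since 1970-01-01
stamp <- as.POSIXct("1970-01-01 00:01:00", tz = "UTC")  # POSIXct: seconds since 1970-01-01 UTC

print(as.numeric(day))    # underlying number of days
print(as.numeric(stamp))  # underlying number of seconds

later <- day + 30                  # arithmetic on a Date works in days
print(format(later, "%d/%m/%Y"))   # custom output format
```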

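dplyr: arrange – sorts rows in ascending (default) or descending (with desc) order, filter – keeps the rows that meet a condition, mutate – adds a new column computed from the existing ones

ggplot2: ggplot – initializes the plot, facet_wrap – wraps a 1d sequence of panels into 2d, expand_limits – makes sure the axes include the given values (e.g. zero)

different plots with ggplot2, added to the plot with +: geom_col – bar plot, geom_point – points, geom_line – lines, geom_histogram – histogram, geom_boxplot – box plot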
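A small sketch chaining the three verbs on the built-in mtcars data; the kpl column and the small_cars name are made up for the example, and it assumes dplyr is installed:

```r
library(dplyr)

small_cars <- mtcars %>%
  filter(cyl == 4) %>%            # keep only the 4-cylinder cars
  mutate(kpl = mpg * 0.425) %>%   # new column derived from an existing one (approx. km per litre)
  arrange(desc(kpl))              # descending order; drop desc() for ascending

print(head(small_cars, 3))
```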
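Putting the ggplot2 pieces together could look roughly like this, again on mtcars and assuming ggplot2 is installed; swapping geom_point for another geom changes the plot type:

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +                 # scatter plot layer
  facet_wrap(~ cyl) +            # one panel per number of cylinders
  expand_limits(x = 0, y = 0)    # force both axes to include zero

print(p)
```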

LOADING DATA INTO R

Utils package:

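read.csv – comma-separated files, read.delim – tab-delimited txt files, read.table – the general function behind both; in read.delim, for example, you can set column names (col.names) and column classes (colClasses). which.min, which.max (these come from base R, not utils) – return the index of the min or max value in a vector, e.g. a column.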
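For example, a tab-delimited file without a header can be read like this. The file is written to a temporary path first so the sketch runs on its own, and the items and prices are invented:

```r
# Create a small tab-delimited file to read back.
path <- tempfile(fileext = ".txt")
writeLines(c("apple\t1.5", "kiwi\t0.4", "melon\t3.2"), path)

prices <- read.delim(path,
                     header = FALSE,
                     col.names = c("item", "price"),
                     colClasses = c("character", "numeric"))

print(prices[which.max(prices$price), ])     # row with the highest price
print(prices$item[which.min(prices$price)])  # item with the lowest price
```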

readr library:

read_csv, read_tsv – tsv files, read_delim

collectors – are used to pass information about how to interpret values in a column, e.g. col_integer, col_factor
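A sketch of collectors in use, assuming readr is installed; the file contents and column names are made up, and the collectors are passed through the col_types argument:

```r
library(readr)

# Create a small csv file to read back.
path <- tempfile(fileext = ".csv")
writeLines(c("id,group", "1,a", "2,b", "3,a"), path)

tbl <- read_csv(path,
                col_types = cols(
                  id    = col_integer(),                      # parse as integer
                  group = col_factor(levels = c("a", "b"))    # parse as factor with known levels
                ))

print(tbl)
```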

data.table library:

fread – similar to read.table, but extremely fast and easy: it guesses the separator and column types for you
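A minimal fread sketch, assuming data.table is installed; the file is again a made-up temporary one:

```r
library(data.table)

# Create a small csv file to read back.
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,9.5", "2,7.0"), path)

dt     <- fread(path)                    # separator and column types are guessed
dt_sel <- fread(path, select = "score")  # read only the columns you need

print(dt)
print(dt_sel)
```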

readxl library:

excel_sheets – returns the names of the sheets in an Excel file, read_excel – imports a sheet as a tibble (classes tbl_df, tbl, data.frame)
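To try both functions without having an Excel file at hand, one option is the example workbook that ships with readxl (my choice for this sketch, not something from the course):

```r
library(readxl)

path   <- readxl_example("datasets.xlsx")  # example workbook bundled with readxl
sheets <- excel_sheets(path)               # character vector of sheet names

first_sheet <- read_excel(path, sheet = sheets[1])  # import one sheet by name

print(sheets)
print(class(first_sheet))
```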

gdata library:

read.xls – converts the Excel file to csv and then reads the csv file using read.csv

XLConnect library:

loadWorkbook – a bridge between the Excel file and the R session, getSheets – list of sheets, readWorksheet – imports a sheet into a data frame, createSheet – creates a new sheet, writeWorksheet – adds a data frame to a sheet, saveWorkbook – stores the adapted Excel file, renameSheet – renames a sheet, removeSheet – removes a sheet

DBI library:

dbConnect – creates a connection to the database (e.g. a MySQLConnection), dbListTables – list of the tables in the db, dbReadTable – imports a whole table, dbGetQuery – uses a query to get data from a database table, dbDisconnect – closes the connection

dbSendQuery, then dbFetch – sends a query and then fetches its results, which gives the ability to fetch the query’s result in chunks rather than all at once, dbClearResult – frees the memory
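The whole DBI workflow can be sketched without a MySQL server by swapping in an in-memory SQLite database. That swap is my assumption to keep the example self-contained (it needs the RSQLite package); with MySQL only the dbConnect call would differ. The cars table is just mtcars copied into the database.

```r
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")  # stand-in for a MySQL connection
dbWriteTable(con, "cars", mtcars)                # put some data in so there is something to query

print(dbListTables(con))
whole <- dbReadTable(con, "cars")                                    # the whole table
fast  <- dbGetQuery(con, "SELECT mpg, cyl FROM cars WHERE mpg > 30") # query in one step

# The same idea in chunks: send the query, then fetch 10 rows at a time.
res <- dbSendQuery(con, "SELECT mpg FROM cars")
chunk_sizes <- integer(0)
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 10)
  chunk_sizes <- c(chunk_sizes, nrow(chunk))
}
dbClearResult(res)   # free the result set
dbDisconnect(con)    # close the connection

print(chunk_sizes)
```

dbGetQuery is the shortcut for the send, fetch and clear steps when the result comfortably fits in memory.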

Importing flat files from the web: using the readr library, call read_csv or read_tsv with the URL in place of the file path.

Downloading Excel files from the web: readxl and gdata libraries; read.xls (gdata) can read straight from a URL, while for read_excel (readxl, not readr) you first need download.file and then read_excel on the local copy.

Downloading .RData files: use download.file first and then load the file into the workspace with load.
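The download-then-load pattern, sketched so that it runs offline: a local file:// address stands in for a real web URL, which is an assumption of this example (it also assumes a Unix-like file path), and the scores object is made up.

```r
# Prepare an .RData file that plays the role of the remote file.
scores <- c(a = 1, b = 2)
src <- tempfile(fileext = ".RData")
save(scores, file = src)
rm(scores)

url  <- paste0("file://", src)        # stand-in for an http(s) address
dest <- tempfile(fileext = ".RData")

download.file(url, destfile = dest, mode = "wb", quiet = TRUE)  # "wb" because the file is binary
loaded <- load(dest)                  # load() returns the names of the restored objects

print(loaded)
print(scores)
```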

httr library:

GET – sends a GET request to a URL; as a result we get a response object that provides easy access to the status code, content type and actual content. With content we can extract the content and choose the form we want to retrieve it in: a raw vector, an R object (list) or a character vector (as = "raw", "parsed" or "text").

JSON files: first GET, then content as text; we can also use the jsonlite library and fromJSON to convert character data into a list, passing either a JSON string or a URL as the argument. We can convert data to JSON using toJSON; prettify makes JSON pretty and minify makes it as concise as possible.
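The jsonlite functions can be tried on a plain string, no web request needed. A sketch assuming jsonlite is installed, with a made-up JSON string:

```r
library(jsonlite)

json_text <- '{"name": "example", "tags": ["r", "json"], "n": 2}'

parsed  <- fromJSON(json_text)                # named list; the array becomes a character vector
back    <- toJSON(parsed, auto_unbox = TRUE)  # back to JSON; auto_unbox keeps scalars as scalars
pretty  <- prettify(back)                     # indented, one item per line
compact <- minify(pretty)                     # all whitespace stripped again

print(parsed$tags)
print(pretty)
print(compact)
```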

haven library:

read_sas for SAS

read_dta for STATA; columns with value labels are imported as labelled vectors, and to turn them into a regular R format we use the as_factor function, after which we can convert to the wanted data type.

read_sav or read_por for SPSS; these also give the labelled class, which we need to change to another standard R class.
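To see the labelled class in action without a real STATA file, this sketch first writes a tiny made-up dataset with write_dta and then reads it back; it assumes haven (and tibble, which haven depends on) is installed:

```r
library(haven)

# A tiny dataset with one labelled column: 1 means "yes", 2 means "no".
survey <- tibble::tibble(
  answer = labelled(c(1, 2, 1), labels = c(yes = 1, no = 2))
)
path <- tempfile(fileext = ".dta")
write_dta(survey, path)

imported <- read_dta(path)
print(class(imported$answer))                         # labelled class from haven

as_text <- as.character(as_factor(imported$answer))   # labels -> factor -> character
print(as_text)
```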

foreign library:

Simple functions to import STATA data and SPSS - read.dta for STATA, read.spss for SPSS.


thanks, wait for more!

szarki9