R Programming Cheat Sheet
Master R programming with this comprehensive reference guide. Find functions, data manipulation techniques, statistical methods, and examples for effective data analysis.
c()
Data Structures
Create a vector
Syntax:
c(values)
Examples:
c(1, 2, 3, 4, 5)  # Create numeric vector
c('a', 'b', 'c')  # Create character vector
c(TRUE, FALSE, TRUE)  # Create logical vector
Notes:
Fundamental data structure in R
list()
Data Structures
Create a list
Syntax:
list(elements)
Examples:
list(a = 1, b = 2, c = 3)  # Named list
list(c(1,2), c('a','b'))  # List of vectors
Notes:
Can contain different data types
data.frame()
Data Structures
Create a data frame
Syntax:
data.frame(col1, col2, ...)
Examples:
data.frame(x = 1:5, y = letters[1:5])  # Create data frame with vectors
data.frame(name = c('Alice', 'Bob'), age = c(25, 30))  # Create data frame with named columns
Notes:
Most common data structure for datasets
matrix()
Data Structures
Create a matrix
Syntax:
matrix(data, nrow, ncol)
Examples:
matrix(1:12, nrow = 3, ncol = 4)  # 3x4 matrix
matrix(0, nrow = 2, ncol = 3)  # 2x3 zero matrix
Notes:
Two-dimensional array of same data type
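The fill order of matrix() is easy to get wrong, so here is a minimal sketch of it. By default R fills values column by column; byrow = TRUE switches to row by row. The variable names are illustrative only.

```r
# Default fill is column-major: values run down each column first.
by_col <- matrix(1:6, nrow = 2, ncol = 3)
# byrow = TRUE fills across each row instead.
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

by_col[1, ]  # first row: 1 3 5
by_row[1, ]  # first row: 1 2 3
dim(by_col)  # 2 3
```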
factor()
Data Structures
Create categorical variable
Syntax:
factor(x, levels, labels)
Examples:
factor(c('low', 'high', 'medium'))  # Create factor
factor(c(1,2,1,3), labels = c('A','B','C'))  # Factor with custom labels
Notes:
Used for categorical data
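When levels is omitted, factor() sorts the levels alphabetically, which is rarely the order you want for ordinal categories. The sketch below, using made-up survey values, shows one way to set the order explicitly.

```r
# Hypothetical responses; the intended order low < medium < high is an assumption.
sizes <- c('low', 'high', 'medium', 'low')

f_default <- factor(sizes)                                       # levels sorted alphabetically
f_ordered <- factor(sizes, levels = c('low', 'medium', 'high'))  # explicit level order

levels(f_default)      # "high" "low" "medium"
levels(f_ordered)      # "low" "medium" "high"
as.integer(f_ordered)  # 1 3 2 1: the underlying integer codes
```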
subset()
Data Manipulation
Filter data based on conditions
Syntax:
subset(data, condition)
Examples:
subset(mtcars, mpg > 20)  # Filter rows where mpg > 20
subset(mtcars, cyl == 4, select = c(mpg, hp))  # Filter and select columns
Notes:
Convenient for data filtering
merge()
Data Manipulation
Join data frames
Syntax:
merge(x, y, by)
Examples:
merge(df1, df2, by = 'id')  # Inner join by id column
merge(df1, df2, by = 'id', all = TRUE)  # Full outer join
Notes:
SQL-like joins for data frames
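To make the join types concrete, here is a small sketch with two made-up data frames (df1 and df2 are illustrative, not part of the original examples). It also uses all.x = TRUE, the merge() argument that gives a left join.

```r
# Hypothetical tables sharing an 'id' key; ids 2 and 3 appear in both.
df1 <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Cara'))
df2 <- data.frame(id = c(2, 3, 4), score = c(90, 85, 70))

inner <- merge(df1, df2, by = 'id')                # only ids present in both
full  <- merge(df1, df2, by = 'id', all = TRUE)    # every id from either table
left  <- merge(df1, df2, by = 'id', all.x = TRUE)  # every id from df1

nrow(inner)  # 2
nrow(full)   # 4
nrow(left)   # 3, with score NA for id 1
```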
aggregate()
Data Manipulation
Group and summarize data
Syntax:
aggregate(x, by, FUN) or aggregate(formula, data, FUN)
Examples:
aggregate(mpg ~ cyl, data = mtcars, mean)  # Mean mpg by cylinder
aggregate(. ~ Species, data = iris, mean)  # Mean of all variables by Species
Notes:
Group by operations
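The syntax line and the examples use two different calling styles, so a side-by-side sketch may help. The sales data frame is invented for illustration.

```r
# Hypothetical sales data with two regions.
sales <- data.frame(region = c('N', 'N', 'S', 'S'),
                    amount = c(10, 20, 5, 15))

# Formula interface: summarize amount within each region.
by_formula <- aggregate(amount ~ region, data = sales, FUN = mean)

# x/by interface: 'by' must be a list; the result column is named 'x'.
by_list <- aggregate(sales$amount, by = list(region = sales$region), FUN = mean)

by_formula$amount  # 15 10 (regions N, S)
by_list$x          # 15 10
```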
apply()
Data Manipulation
Apply function over margins
Syntax:
apply(X, MARGIN, FUN)
Examples:
apply(mtcars, 2, mean)  # Column means
apply(mtcars, 1, sum)  # Row sums
Notes:
MARGIN: 1=rows, 2=columns
lapply()
Data Manipulation
Apply function to list elements
Syntax:
lapply(X, FUN)
Examples:
lapply(mtcars, mean)  # Mean of each column
lapply(1:3, function(x) x^2)  # Square each element
Notes:
Returns a list
sapply()
Data Manipulation
Simplified lapply
Syntax:
sapply(X, FUN)
Examples:
sapply(mtcars, mean)  # Mean of each column as vector
sapply(iris[1:4], function(x) c(min(x), max(x)))  # Min and max for each column
Notes:
Returns vector/matrix when possible
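The three apply-style functions differ mainly in what they return. The following sketch, on a tiny matrix and vector chosen for illustration, compares the result types.

```r
m <- matrix(1:6, nrow = 2)  # rows are (1, 3, 5) and (2, 4, 6)

row_sums <- apply(m, 1, sum)  # MARGIN = 1: one result per row    -> 9 12
col_sums <- apply(m, 2, sum)  # MARGIN = 2: one result per column -> 3 7 11

squares_list <- lapply(1:3, function(x) x^2)        # always a list
squares_vec  <- sapply(1:3, function(x) x^2)        # simplified to a vector: 1 4 9
pairs        <- sapply(1:3, function(x) c(x, x^2))  # simplified to a 2 x 3 matrix

is.list(squares_list)  # TRUE
dim(pairs)             # 2 3
```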
mean()
Statistics
Calculate arithmetic mean
Syntax:
mean(x, na.rm = FALSE)
Examples:
mean(c(1, 2, 3, 4, 5))  # Mean of vector
mean(mtcars$mpg)  # Mean of mpg column
mean(c(1, 2, NA, 4), na.rm = TRUE)  # Mean excluding NA values
Notes:
Use na.rm = TRUE to handle missing values
median()
Statistics
Calculate median
Syntax:
median(x, na.rm = FALSE)
Examples:
median(c(1, 2, 3, 4, 5))  # Median of vector
median(mtcars$mpg)  # Median of mpg column
Notes:
Middle value when data is sorted
sd()
Statistics
Calculate standard deviation
Syntax:
sd(x, na.rm = FALSE)
Examples:
sd(mtcars$mpg)  # Standard deviation of mpg
sd(rnorm(100))  # SD of random normal data
Notes:
Sample standard deviation
var()
Statistics
Calculate variance
Syntax:
var(x, na.rm = FALSE)
Examples:
var(mtcars$mpg)  # Variance of mpg
var(iris$Sepal.Length)  # Variance of sepal length
Notes:
Sample variance
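"Sample" here means the n - 1 denominator. The sketch below checks that by hand on a small made-up vector and shows how a single NA propagates unless na.rm = TRUE is set.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # illustrative values with mean 5

# Sample variance by hand: squared deviations divided by n - 1.
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)

var(x)  # same value as manual_var (32 / 7)
sd(x)   # square root of var(x)

with_na <- c(1, 2, NA, 4)
mean(with_na)                # NA: missing values propagate by default
mean(with_na, na.rm = TRUE)  # mean of 1, 2 and 4
```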
cor()
Statistics
Calculate correlation
Syntax:
cor(x, y = NULL, method)
Examples:
cor(mtcars$mpg, mtcars$hp)  # Correlation between mpg and hp
cor(mtcars)  # Correlation matrix
cor(x, y, method = 'spearman')  # Spearman correlation
Notes:
Default is Pearson correlation
t.test()
Statistics
Perform t-test
Syntax:
t.test(x, y = NULL, alternative)
Examples:
t.test(rnorm(30), mu = 0)  # One-sample t-test
t.test(group1, group2)  # Two-sample t-test
t.test(x ~ group, data = df)  # t-test with formula
Notes:
Tests for differences in means
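t.test() returns an object whose parts you can pull out for reporting. This sketch uses two invented groups of measurements (group1 and group2 are placeholders); with two samples, R runs Welch's unequal-variance test by default.

```r
# Hypothetical measurements for two groups.
group1 <- c(5.1, 4.9, 5.6, 5.8, 6.0)
group2 <- c(6.2, 6.8, 7.1, 6.5, 7.0)

res <- t.test(group1, group2)  # Welch two-sample t-test (var.equal = FALSE)

class(res)    # "htest"
res$estimate  # the two group means
res$p.value   # p-value for the difference in means
res$conf.int  # confidence interval for the difference
```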
plot()
Plotting
Create scatter plot
Syntax:
plot(x, y, type, main, xlab, ylab)
Examples:
plot(mtcars$hp, mtcars$mpg)  # Scatter plot
plot(1:10, type = 'l')  # Line plot
plot(x, y, main = 'Title', xlab = 'X', ylab = 'Y')  # Plot with labels
Notes:
Base plotting function
hist()
Plotting
Create histogram
Syntax:
hist(x, breaks, main, xlab)
Examples:
hist(mtcars$mpg)  # Histogram of mpg
hist(rnorm(1000), breaks = 30)  # Histogram with 30 bins
Notes:
Shows distribution of continuous data
boxplot()
Plotting
Create box plot
Syntax:
boxplot(x, main, xlab, ylab)
Examples:
boxplot(mtcars$mpg)  # Box plot of mpg
boxplot(mpg ~ cyl, data = mtcars)  # Box plot by group
Notes:
Shows distribution and outliers
barplot()
Plotting
Create bar plot
Syntax:
barplot(height, names.arg, main)
Examples:
barplot(table(mtcars$cyl))  # Bar plot of counts
barplot(c(3, 7, 1), names.arg = c('A', 'B', 'C'))  # Bar plot with names
Notes:
Good for categorical data
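The plotting functions above can be combined in one script. The sketch below draws to a null PDF device (pdf(NULL)), an assumption made here so that it also runs where no display is available; drop those two lines to see the plots interactively.

```r
# Send graphics to a null PDF device so no window or file is needed.
pdf(NULL)

counts <- table(mtcars$cyl)  # number of cars per cylinder count
barplot(counts, main = 'Cars by cylinders', xlab = 'Cylinders', ylab = 'Count')

# hist() also returns the bin counts it computed.
h <- hist(mtcars$mpg, breaks = 10, main = 'MPG distribution', xlab = 'mpg')

boxplot(mpg ~ cyl, data = mtcars, main = 'MPG by cylinders')
invisible(dev.off())

sum(h$counts)  # 32: every car falls into some bin
```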
read.csv()
Data Import/Export
Read CSV file
Syntax:
read.csv(file, header = TRUE, sep = ',')
Examples:
read.csv('data.csv')  # Read CSV file
read.csv('data.csv', header = FALSE)  # Read CSV without header
read.csv('data.txt', sep = '\t')  # Read tab-separated file
Notes:
Most common way to read data
write.csv()
Data Import/Export
Write CSV file
Syntax:
write.csv(x, file, row.names = TRUE)
Examples:
write.csv(mtcars, 'cars.csv')  # Write data frame to CSV
write.csv(df, 'data.csv', row.names = FALSE)  # Write without row names
Notes:
Export data to CSV format
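A round trip through a temporary file shows why row.names = FALSE is often what you want. The data frame and the tempfile() path below are illustrative choices.

```r
df <- data.frame(name = c('Alice', 'Bob'), age = c(25, 30))
path <- tempfile(fileext = '.csv')  # temporary file, removed at the end

# Without row names the file reads back with the original columns.
write.csv(df, path, row.names = FALSE)
clean <- read.csv(path)
names(clean)  # "name" "age"

# With the default row.names = TRUE, the row names come back as an extra column.
write.csv(df, path)
with_rows <- read.csv(path)
names(with_rows)  # "X" "name" "age"

unlink(path)
```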
read.table()
Data Import/Export
Read delimited file
Syntax:
read.table(file, header, sep)
Examples:
read.table('data.txt', header = TRUE)  # Read table with header
read.table('data.dat', sep = ' ')  # Read space-separated file
Notes:
More flexible than read.csv(), which is a wrapper around read.table() with header = TRUE and sep = ','
if/else
Programming
Conditional statements
Syntax:
if (condition) { ... } else { ... }
Examples:
if (x > 0) print('positive')  # Simple if statement
if (x > 0) { print('pos') } else { print('neg') }  # If-else statement
ifelse(x > 0, 'pos', 'neg')  # Vectorized conditional
Notes:
Use ifelse() for vectorized operations
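The difference matters as soon as x has more than one element. A minimal sketch with a made-up vector follows; note that the two-way ifelse() from the examples labels zero as 'neg', so a nested call is one way to handle it separately.

```r
x <- c(-2, 0, 3)

# ifelse() tests element by element and returns a vector of the same length.
sign_label <- ifelse(x > 0, 'pos', 'neg')  # "neg" "neg" "pos"

# A nested ifelse() covers a third case, here zero.
sign_label3 <- ifelse(x > 0, 'pos', ifelse(x < 0, 'neg', 'zero'))  # "neg" "zero" "pos"

# if () expects a single TRUE/FALSE, so reduce a vector with any() or all() first.
if (any(x > 0)) message('at least one positive value')
```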
for
Programming
For loop
Syntax:
for (var in sequence) { ... }
Examples:
for (i in 1:10) print(i)  # Loop through numbers
for (name in names(mtcars)) print(name)  # Loop through column names
Notes:
Iterate over sequences
while
Programming
While loop
Syntax:
while (condition) { ... }
Examples:
i <- 1; while (i <= 5) { print(i); i <- i + 1 }  # While loop example
Notes:
Loop while condition is true
function()
Programming
Define function
Syntax:
function(args) { body }
Examples:
square <- function(x) x^2  # Simple function
greet <- function(name = 'World') paste('Hello', name)  # Function with default argument
Notes:
Functions are first-class objects
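"First-class" means a function can be stored in a variable and passed to another function like any other value. The sketch reuses square and greet from the examples and adds apply_twice, a hypothetical helper written for this illustration.

```r
square <- function(x) x^2
greet <- function(name = 'World') paste('Hello', name)

# apply_twice is a made-up helper: it takes a function f and applies it two times.
apply_twice <- function(f, x) f(f(x))

greet()                 # "Hello World" (default argument used)
greet('R')              # "Hello R"
apply_twice(square, 3)  # 81
sapply(1:3, square)     # 1 4 9: a function passed as an argument
```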
paste()
String Operations
Concatenate strings
Syntax:
paste(..., sep = ' ', collapse = NULL)
Examples:
paste('Hello', 'World')  # Concatenate with space
paste0('Hello', 'World')  # Concatenate without separator
paste(c('a', 'b'), 1:2, sep = '-')  # Vectorized concatenation
Notes:
paste0() is equivalent to paste(..., sep = '')
substr()
String Operations
Extract substring
Syntax:
substr(x, start, stop)
Examples:
substr('Hello World', 1, 5)  # Extract 'Hello'
substr(c('abc', 'def'), 1, 2)  # Extract first 2 characters
Notes:
Positions are 1-indexed
nchar()
String Operations
Number of characters
Syntax:
nchar(x)
Examples:
nchar('Hello')  # Returns 5
nchar(c('a', 'abc', 'hello'))  # Vector of lengths
Notes:
Counts characters in each element
grep()
String Operations
Pattern matching
Syntax:
grep(pattern, x, value = FALSE)
Examples:
grep('Merc', rownames(mtcars))  # Indices of row names containing 'Merc'
grep('Merc', rownames(mtcars), value = TRUE)  # Return matching values
grepl('Merc', rownames(mtcars))  # Return logical vector
Notes:
Use grepl() for logical output
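The three call forms return different things for the same pattern. Here is a short sketch on a hand-picked vector of car names (the vector is illustrative).

```r
car_names <- c('Mazda RX4', 'Merc 240D', 'Merc 230', 'Fiat 128')

idx     <- grep('Merc', car_names)                # positions of matches: 2 3
matches <- grep('Merc', car_names, value = TRUE)  # the matching strings
flags   <- grepl('Merc', car_names)               # FALSE TRUE TRUE FALSE

# The logical form is convenient for subsetting.
car_names[flags]  # "Merc 240D" "Merc 230"
```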
is.na()
Missing Values
Test for missing values
Syntax:
is.na(x)
Examples:
is.na(c(1, 2, NA, 4))  # Returns logical vector
sum(is.na(df))  # Count total missing values
which(is.na(x))  # Positions of missing values
Notes:
Returns TRUE for NA values
na.omit()
Missing Values
Remove missing values
Syntax:
na.omit(object)
Examples:
na.omit(c(1, 2, NA, 4))  # Remove NAs from vector
na.omit(df)  # Remove rows with any NA
Notes:
Keeps only complete cases; rows containing any NA are dropped
complete.cases()
Missing Values
Find complete cases
Syntax:
complete.cases(...)
Examples:
complete.cases(df)  # Logical vector of complete rows
df[complete.cases(df), ]  # Keep only complete cases
Notes:
Returns TRUE for rows without NA
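The three missing-value helpers fit together as in the sketch below, which uses a small invented data frame with one NA in each column.

```r
# Hypothetical data: row 2 is missing x, row 3 is missing y.
df <- data.frame(x = c(1, NA, 3), y = c('a', 'b', NA))

sum(is.na(df))             # 2 missing cells in total
which(is.na(df$x))         # 2: position of the missing x
keep <- complete.cases(df) # TRUE FALSE FALSE

cleaned <- na.omit(df)     # same rows as df[keep, ]
nrow(cleaned)              # 1
mean(df$x, na.rm = TRUE)   # 2: mean of the non-missing x values
```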
length()
Utilities
Length of object
Syntax:
length(x)
Examples:
length(c(1, 2, 3, 4, 5))  # Returns 5
length(mtcars)  # Number of columns
Notes:
For data frames, returns number of columns
dim()
Utilities
Dimensions of object
Syntax:
dim(x)
Examples:
dim(mtcars)  # Returns c(32, 11)
dim(matrix(1:12, 3, 4))  # Returns c(3, 4)
Notes:
Returns NULL for vectors
str()
Utilities
Structure of object
Syntax:
str(object)
Examples:
str(mtcars)  # Display structure of data frame
str(list(a = 1, b = 'hello'))  # Structure of list
Notes:
Compact display of object structure
summary()
Utilities
Summary statistics
Syntax:
summary(object)
Examples:
summary(mtcars)  # Summary of all columns
summary(mtcars$mpg)  # Summary of single variable
Notes:
Provides different summaries for different object types
head()
Utilities
First few elements
Syntax:
head(x, n = 6)
Examples:
head(mtcars)  # First 6 rows
head(mtcars, 10)  # First 10 rows
Notes:
Quick peek at data
tail()
Utilities
Last few elements
Syntax:
tail(x, n = 6)
Examples:
tail(mtcars)  # Last 6 rows
tail(mtcars, 3)  # Last 3 rows
Notes:
View end of data
names()
Utilities
Names of object elements
Syntax:
names(x)
Examples:
names(mtcars)  # Column names
names(list(a = 1, b = 2))  # List element names
Notes:
Get or set names
rnorm()
Random Numbers
Random normal numbers
Syntax:
rnorm(n, mean = 0, sd = 1)
Examples:
rnorm(10)  # 10 standard normal numbers
rnorm(100, mean = 5, sd = 2)  # 100 numbers with mean=5, sd=2
Notes:
Generates from normal distribution
runif()
Random Numbers
Random uniform numbers
Syntax:
runif(n, min = 0, max = 1)
Examples:
runif(10)  # 10 numbers between 0 and 1
runif(5, min = 1, max = 100)  # 5 numbers between 1 and 100
Notes:
Generates from uniform distribution
sample()
Random Numbers
Random sampling
Syntax:
sample(x, size, replace = FALSE)
Examples:
sample(1:10, 5)  # Sample 5 numbers from 1 to 10
sample(c('A', 'B', 'C'), 10, replace = TRUE)  # Sample with replacement
Notes:
Sample from existing vector
set.seed()
Random Numbers
Set random seed
Syntax:
set.seed(seed)
Examples:
set.seed(123); rnorm(5)  # Reproducible random numbers
Notes:
Ensures reproducible results
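Reproducibility is easiest to see by resetting the seed and drawing again. The seed values in this sketch are arbitrary.

```r
# The same seed gives the same draws.
set.seed(123)
first <- rnorm(5)
set.seed(123)
second <- rnorm(5)
identical(first, second)  # TRUE

set.seed(42)
draw <- sample(1:10, 5)            # without replacement: no repeated values
u <- runif(5, min = 1, max = 100)  # all values lie between 1 and 100
length(unique(draw))               # 5
```

Call set.seed() once at the top of a script rather than before every random call, unless you need a specific step to be repeatable on its own.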
R Programming Mastery Guide
🌱 Beginner
- Learn basic data structures: vectors, lists, data frames
- Master data import/export with read.csv() and write.csv()
- Practice basic statistics: mean(), median(), sd()
- Create simple plots with plot(), hist(), boxplot()
- Understand indexing with [] and $
📈 Intermediate
- Data manipulation with apply(), lapply(), sapply()
- Data filtering and subsetting with subset()
- Merge and join data frames with merge()
- Handle missing values with is.na(), na.omit()
- Write custom functions with function()
🚀 Advanced
- Advanced statistics: t.test(), cor(), regression
- Data aggregation with aggregate()
- String operations with grep(), paste(), substr()
- Control structures: if/else, for, while loops
- Random sampling and simulation with rnorm(), sample()
💡 Quick Tips
Data Analysis Workflow:
- Import data with read.csv()
- Explore with str(), summary(), head()
- Clean and manipulate data
- Analyze with statistical functions
- Visualize with plotting functions
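The five workflow steps can be sketched end to end as follows. To stay self-contained, this example first writes the built-in mtcars data to a temporary CSV so there is a file to import, and it draws to a null PDF device; both choices are assumptions made for the sketch, not part of the workflow itself.

```r
# 1. Import (mtcars is written to a temporary CSV first so there is a file to read)
path <- tempfile(fileext = '.csv')
write.csv(mtcars, path, row.names = FALSE)
cars <- read.csv(path)

# 2. Explore
str(cars)
summary(cars$mpg)
head(cars, 3)

# 3. Clean and manipulate
cars <- na.omit(cars)                # drop rows with missing values (none here)
cars$cyl <- factor(cars$cyl)         # treat cylinder count as categorical
efficient <- subset(cars, mpg > 20)  # keep the more fuel-efficient cars

# 4. Analyze
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# 5. Visualize (null device so the script runs without a display)
pdf(NULL)
boxplot(mpg ~ cyl, data = cars, main = 'MPG by cylinders')
invisible(dev.off())

unlink(path)
```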
Common Gotchas:
- R is case-sensitive: Data ≠ data
- Use na.rm = TRUE for functions with missing values
- Vectors are 1-indexed, not 0-indexed
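Each of these gotchas can be reproduced in a few lines. The variable names below (MyValue, v, scores) are made up for the illustration.

```r
# Case sensitivity: MyValue and myvalue are different names.
MyValue <- 1
exists('MyValue')  # TRUE
exists('myvalue')  # FALSE, assuming nothing else defined it

# Missing values propagate unless na.rm = TRUE is set.
scores <- c(1, NA, 3)
sum(scores)                # NA
sum(scores, na.rm = TRUE)  # 4

# Indexing starts at 1.
v <- c(10, 20, 30)
v[1]   # 10: the first element
v[0]   # numeric(0): index 0 selects nothing and raises no error
v[-1]  # 20 30: a negative index drops that position
```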