R Programming Cheat Sheet
Master R programming with this comprehensive reference guide. Find functions, data manipulation techniques, statistical methods, and examples for effective data analysis.
c()
Data Structures: Create a vector
Syntax:
c(values)
Examples:
c(1, 2, 3, 4, 5)  # Create numeric vector
c('a', 'b', 'c')  # Create character vector
c(TRUE, FALSE, TRUE)  # Create logical vector
Notes:
Fundamental data structure in R
list()
Data Structures: Create a list
Syntax:
list(elements)
Examples:
list(a = 1, b = 2, c = 3)  # Named list
list(c(1,2), c('a','b'))  # List of vectors
Notes:
Can contain different data types
data.frame()
Data Structures: Create a data frame
Syntax:
data.frame(col1, col2, ...)
Examples:
data.frame(x = 1:5, y = letters[1:5])  # Create data frame with vectors
data.frame(name = c('Alice', 'Bob'), age = c(25, 30))  # Create data frame with named columns
Notes:
Most common data structure for datasets
matrix()
Data Structures: Create a matrix
Syntax:
matrix(data, nrow, ncol)
Examples:
matrix(1:12, nrow = 3, ncol = 4)  # 3x4 matrix
matrix(0, nrow = 2, ncol = 3)  # 2x3 zero matrix
Notes:
Two-dimensional array of same data type
factor()
Data Structures: Create categorical variable
Syntax:
factor(x, levels, labels)
Examples:
factor(c('low', 'high', 'medium'))  # Create factor
factor(c(1,2,1,3), labels = c('A','B','C'))  # Factor with custom labels
Notes:
Used for categorical data
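By default, factor() sorts its levels alphabetically, which is rarely the order wanted for ratings such as low/medium/high. The following is a minimal sketch of setting the level order explicitly; the `sizes` vector is made-up illustration data, not part of the original examples.

```r
# Hypothetical example data: four size ratings
sizes <- c('low', 'high', 'medium', 'low')

# Default behaviour: levels are sorted alphabetically
f_default <- factor(sizes)
levels(f_default)    # "high" "low" "medium"

# Explicit levels fix the order; ordered = TRUE enables comparisons
f_ordered <- factor(sizes,
                    levels = c('low', 'medium', 'high'),
                    ordered = TRUE)
f_ordered < 'high'   # TRUE FALSE TRUE TRUE

# table() reports counts in level order
table(f_ordered)
```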
subset()
Data Manipulation: Filter data based on conditions
Syntax:
subset(data, condition)
Examples:
subset(mtcars, mpg > 20)  # Filter rows where mpg > 20
subset(mtcars, cyl == 4, select = c(mpg, hp))  # Filter and select columns
Notes:
Convenient for data filtering
merge()
Data Manipulation: Join data frames
Syntax:
merge(x, y, by)
Examples:
merge(df1, df2, by = 'id')  # Inner join by id column
merge(df1, df2, by = 'id', all = TRUE)  # Full outer join
Notes:
SQL-like joins for data frames
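The join type in merge() is controlled by the `all`, `all.x`, and `all.y` arguments. The sketch below shows the four common variants side by side; `df1` and `df2` are small made-up data frames used only for illustration.

```r
# Hypothetical data frames that share an 'id' column
df1 <- data.frame(id = c(1, 2, 3), name = c('Alice', 'Bob', 'Cara'))
df2 <- data.frame(id = c(2, 3, 4), score = c(90, 85, 70))

inner <- merge(df1, df2, by = 'id')                 # ids 2, 3
left  <- merge(df1, df2, by = 'id', all.x = TRUE)   # ids 1, 2, 3
right <- merge(df1, df2, by = 'id', all.y = TRUE)   # ids 2, 3, 4
full  <- merge(df1, df2, by = 'id', all = TRUE)     # ids 1, 2, 3, 4

# Rows without a match get NA in the columns from the other table
left
```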
aggregate()
Data Manipulation: Group and summarize data
Syntax:
aggregate(x, by, FUN)
Examples:
aggregate(mpg ~ cyl, data = mtcars, mean)  # Mean mpg by cylinder
aggregate(. ~ Species, data = iris, mean)  # Mean of all variables by Species
Notes:
Group by operations
apply()
Data Manipulation: Apply function over margins
Syntax:
apply(X, MARGIN, FUN)
Examples:
apply(mtcars, 2, mean)  # Column means
apply(mtcars, 1, sum)  # Row sums
Notes:
MARGIN: 1=rows, 2=columns
lapply()
Data Manipulation: Apply function to list elements
Syntax:
lapply(X, FUN)
Examples:
lapply(mtcars, mean)  # Mean of each column
lapply(1:3, function(x) x^2)  # Square each element
Notes:
Returns a list
sapply()
Data Manipulation: Simplified lapply
Syntax:
sapply(X, FUN)
Examples:
sapply(mtcars, mean)  # Mean of each column as vector
sapply(iris[1:4], function(x) c(min(x), max(x)))  # Min and max for each column
Notes:
Returns vector/matrix when possible
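The difference between lapply() and sapply() is easiest to see by checking what each returns for the same input. The sketch below uses three columns of the built-in mtcars data; vapply() is a related base R function, not covered above, that additionally checks the type and length of each result.

```r
# Three numeric columns from the built-in mtcars data set
cols <- mtcars[c('mpg', 'hp', 'wt')]

as_list   <- lapply(cols, mean)    # list with one element per column
as_vector <- sapply(cols, mean)    # named numeric vector
as_matrix <- sapply(cols, range)   # 2 x 3 matrix: min and max per column

# vapply() errors if a result is not a single numeric value
checked <- vapply(cols, mean, numeric(1))

class(as_list)    # "list"
class(as_vector)  # "numeric"
dim(as_matrix)    # 2 3
```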
mean()
Statistics: Calculate arithmetic mean
Syntax:
mean(x, na.rm = FALSE)
Examples:
mean(c(1, 2, 3, 4, 5))  # Mean of vector
mean(mtcars$mpg)  # Mean of mpg column
mean(c(1, 2, NA, 4), na.rm = TRUE)  # Mean excluding NA values
Notes:
Use na.rm = TRUE to handle missing values
median()
Statistics: Calculate median
Syntax:
median(x, na.rm = FALSE)
Examples:
median(c(1, 2, 3, 4, 5))  # Median of vector
median(mtcars$mpg)  # Median of mpg column
Notes:
Middle value when data is sorted
sd()
Statistics: Calculate standard deviation
Syntax:
sd(x, na.rm = FALSE)
Examples:
sd(mtcars$mpg)  # Standard deviation of mpg
sd(rnorm(100))  # SD of random normal data
Notes:
Sample standard deviation
var()
Statistics: Calculate variance
Syntax:
var(x, na.rm = FALSE)
Examples:
var(mtcars$mpg)  # Variance of mpg
var(iris$Sepal.Length)  # Variance of sepal length
Notes:
Sample variance
cor()
Statistics: Calculate correlation
Syntax:
cor(x, y = NULL, method)
Examples:
cor(mtcars$mpg, mtcars$hp)  # Correlation between mpg and hp
cor(mtcars)  # Correlation matrix
cor(x, y, method = 'spearman')  # Spearman correlation
Notes:
Default is Pearson correlation
t.test()
Statistics: Perform t-test
Syntax:
t.test(x, y = NULL, alternative)
Examples:
t.test(rnorm(30), mu = 0)  # One-sample t-test
t.test(group1, group2)  # Two-sample t-test
t.test(x ~ group, data = df)  # t-test with formula
Notes:
Tests for differences in means
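The two-sample examples above use placeholder names (group1, group2, df). As a runnable sketch, the formula interface can be tried on the built-in mtcars data, comparing mpg between automatic (am = 0) and manual (am = 1) cars. The choice of columns is only an illustration.

```r
# Two-sample t-test on built-in data: does mpg differ by transmission type?
# By default t.test() runs Welch's test (var.equal = FALSE)
result <- t.test(mpg ~ am, data = mtcars)

result$estimate   # mean mpg in each group
result$conf.int   # confidence interval for the difference in means
result$p.value    # p-value of the test

# The result is a list-like "htest" object; printing it shows a full report
print(result)
```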
plot()
Plotting: Create scatter plot
Syntax:
plot(x, y, type, main, xlab, ylab)
Examples:
plot(mtcars$hp, mtcars$mpg)  # Scatter plot
plot(1:10, type = 'l')  # Line plot
plot(x, y, main = 'Title', xlab = 'X', ylab = 'Y')  # Plot with labels
Notes:
Base plotting function
hist()
Plotting: Create histogram
Syntax:
hist(x, breaks, main, xlab)
Examples:
hist(mtcars$mpg)  # Histogram of mpg
hist(rnorm(1000), breaks = 30)  # Histogram with about 30 bins (a single number for breaks is only a suggestion)
Notes:
Shows distribution of continuous data
boxplot()
Plotting: Create box plot
Syntax:
boxplot(x, main, xlab, ylab)
Examples:
boxplot(mtcars$mpg)  # Box plot of mpg
boxplot(mpg ~ cyl, data = mtcars)  # Box plot by group
Notes:
Shows distribution and outliers
barplot()
Plotting: Create bar plot
Syntax:
barplot(height, names.arg, main)
Examples:
barplot(table(mtcars$cyl))  # Bar plot of counts
barplot(c(3, 7, 1), names.arg = c('A', 'B', 'C'))  # Bar plot with names
Notes:
Good for categorical data
read.csv()
Data Import/Export: Read CSV file
Syntax:
read.csv(file, header = TRUE, sep = ',')
Examples:
read.csv('data.csv')  # Read CSV file
read.csv('data.csv', header = FALSE)  # Read CSV without header
read.csv('data.txt', sep = '\t')  # Read tab-separated file
Notes:
Most common way to read data
write.csv()
Data Import/Export: Write CSV file
Syntax:
write.csv(x, file, row.names = TRUE)
Examples:
write.csv(mtcars, 'cars.csv')  # Write data frame to CSV
write.csv(df, 'data.csv', row.names = FALSE)  # Write without row names
Notes:
Export data to CSV format
read.table()
Data Import/Export: Read delimited file
Syntax:
read.table(file, header, sep)
Examples:
read.table('data.txt', header = TRUE)  # Read table with header
read.table('data.dat', sep = ' ')  # Read space-separated file
Notes:
More flexible than read.csv
if/else
Programming: Conditional statements
Syntax:
if (condition) { ... } else { ... }
Examples:
if (x > 0) print('positive')  # Simple if statement
if (x > 0) { print('pos') } else { print('neg') }  # If-else statement
ifelse(x > 0, 'pos', 'neg')  # Vectorized conditional
Notes:
Use ifelse() for vectorized operations
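The distinction matters because if() expects a single TRUE or FALSE, while ifelse() evaluates the condition for every element. A minimal sketch with a made-up vector `x`:

```r
# Hypothetical input vector
x <- c(-2, 0, 3)

# ifelse() works element by element and returns a vector of the same length
labels <- ifelse(x > 0, 'pos', 'neg')   # "neg" "neg" "pos"

# Nested ifelse() handles more than two outcomes
sign_label <- ifelse(x > 0, 'pos',
                     ifelse(x < 0, 'neg', 'zero'))   # "neg" "zero" "pos"

# if() needs one logical value; any() or all() reduce a vector to one
if (any(x > 0)) {
  message('at least one positive value')
}
```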
for
Programming: For loop
Syntax:
for (var in sequence) { ... }
Examples:
for (i in 1:10) print(i)  # Loop through numbers
for (name in names(mtcars)) print(name)  # Loop through column names
Notes:
Iterate over sequences
while
Programming: While loop
Syntax:
while (condition) { ... }
Examples:
i <- 1; while (i <= 5) { print(i); i <- i + 1 }  # While loop example
Notes:
Loop while condition is true
function()
Programming: Define function
Syntax:
function(args) { body }
Examples:
square <- function(x) x^2  # Simple function
greet <- function(name = 'World') paste('Hello', name)  # Function with default argument
Notes:
Functions are first-class objects
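"First-class" means a function can be stored in a variable, passed as an argument, and returned from another function. The sketch below illustrates each case; `apply_twice` and `make_adder` are hypothetical helper names invented for this example.

```r
square <- function(x) x^2

# Pass a function as an argument
apply_twice <- function(f, x) f(f(x))
apply_twice(square, 3)   # 81

# Return a function from a function (a closure that remembers n)
make_adder <- function(n) function(x) x + n
add5 <- make_adder(5)
add5(10)                 # 15

# Use an anonymous function without naming it
tens <- sapply(1:3, function(x) x * 10)   # 10 20 30
```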
paste()
String Operations: Concatenate strings
Syntax:
paste(..., sep = ' ', collapse = NULL)
Examples:
paste('Hello', 'World')  # Concatenate with space
paste0('Hello', 'World')  # Concatenate without separator
paste(c('a', 'b'), 1:2, sep = '-')  # Vectorized concatenation
Notes:
paste0() is equivalent to paste(..., sep = '')
substr()
String Operations: Extract substring
Syntax:
substr(x, start, stop)
Examples:
substr('Hello World', 1, 5)  # Extract 'Hello'
substr(c('abc', 'def'), 1, 2)  # Extract first 2 characters
Notes:
Positions are 1-indexed
nchar()
String Operations: Number of characters
Syntax:
nchar(x)
Examples:
nchar('Hello')  # Returns 5
nchar(c('a', 'abc', 'hello'))  # Vector of lengths
Notes:
Counts characters in each element
grep()
String Operations: Pattern matching
Syntax:
grep(pattern, x, value = FALSE)
Examples:
grep('Merc', rownames(mtcars))  # Indices of row names containing 'Merc'
grep('Merc', rownames(mtcars), value = TRUE)  # Return matching values
grepl('Merc', rownames(mtcars))  # Return logical vector
Notes:
Use grepl() for logical output
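The three return styles are easiest to compare on the same input. This sketch searches the row names of the built-in mtcars data; the patterns are regular expressions, so `^` anchors the match to the start of the string.

```r
cars <- rownames(mtcars)

# Integer positions of the matches
idx <- grep('^Merc', cars)

# The matching strings themselves
merc_names <- grep('^Merc', cars, value = TRUE)

# One TRUE/FALSE per element, handy for subsetting rows
is_toyota <- grepl('Toyota', cars)
toyotas <- mtcars[is_toyota, ]

# ignore.case = TRUE makes the match case-insensitive
grep('toyota', cars, ignore.case = TRUE, value = TRUE)
```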
is.na()
Missing Values: Test for missing values
Syntax:
is.na(x)
Examples:
is.na(c(1, 2, NA, 4))  # Returns logical vector
sum(is.na(df))  # Count total missing values
which(is.na(x))  # Positions of missing values
Notes:
Returns TRUE for NA values
na.omit()
Missing Values: Remove missing values
Syntax:
na.omit(object)
Examples:
na.omit(c(1, 2, NA, 4))  # Remove NAs from vector
na.omit(df)  # Remove rows with any NA
Notes:
Keeps only complete cases (rows with no NA)
complete.cases()
Missing Values: Find complete cases
Syntax:
complete.cases(...)
Examples:
complete.cases(df)  # Logical vector of complete rows
df[complete.cases(df), ]  # Keep only complete cases
Notes:
Returns TRUE for rows without NA
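The three missing-value helpers can be combined on one small data frame. In this sketch, `df` is made-up data with one NA in each column; na.omit() and complete.cases() should select the same rows.

```r
# Hypothetical data frame with missing values in both columns
df <- data.frame(a = c(1, NA, 3), b = c('x', 'y', NA))

# Count missing values per column
colSums(is.na(df))

# Two equivalent ways to keep only rows with no NA
clean <- na.omit(df)
same  <- df[complete.cases(df), ]

# Many summary functions accept na.rm to skip NAs instead of dropping rows
mean(df$a, na.rm = TRUE)   # 2
```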
length()
Utilities: Length of object
Syntax:
length(x)
Examples:
length(c(1, 2, 3, 4, 5))  # Returns 5
length(mtcars)  # Number of columns
Notes:
For data frames, returns number of columns
dim()
Utilities: Dimensions of object
Syntax:
dim(x)
Examples:
dim(mtcars)  # Returns c(32, 11)
dim(matrix(1:12, 3, 4))  # Returns c(3, 4)
Notes:
Returns NULL for vectors
str()
Utilities: Structure of object
Syntax:
str(object)
Examples:
str(mtcars)  # Display structure of data frame
str(list(a = 1, b = 'hello'))  # Structure of list
Notes:
Compact display of object structure
summary()
Utilities: Summary statistics
Syntax:
summary(object)
Examples:
summary(mtcars)  # Summary of all columns
summary(mtcars$mpg)  # Summary of single variable
Notes:
Provides different summaries for different object types
head()
Utilities: First few elements
Syntax:
head(x, n = 6)
Examples:
head(mtcars)  # First 6 rows
head(mtcars, 10)  # First 10 rows
Notes:
Quick peek at data
tail()
Utilities: Last few elements
Syntax:
tail(x, n = 6)
Examples:
tail(mtcars)  # Last 6 rows
tail(mtcars, 3)  # Last 3 rows
Notes:
View end of data
names()
Utilities: Names of object elements
Syntax:
names(x)
Examples:
names(mtcars)  # Column names
names(list(a = 1, b = 2))  # List element names
Notes:
Get or set names
rnorm()
Random Numbers: Random normal numbers
Syntax:
rnorm(n, mean = 0, sd = 1)
Examples:
rnorm(10)  # 10 standard normal numbers
rnorm(100, mean = 5, sd = 2)  # 100 numbers with mean=5, sd=2
Notes:
Generates from normal distribution
runif()
Random Numbers: Random uniform numbers
Syntax:
runif(n, min = 0, max = 1)
Examples:
runif(10)  # 10 numbers between 0 and 1
runif(5, min = 1, max = 100)  # 5 numbers between 1 and 100
Notes:
Generates from uniform distribution
sample()
Random Numbers: Random sampling
Syntax:
sample(x, size, replace = FALSE)
Examples:
sample(1:10, 5)  # Sample 5 numbers from 1 to 10
sample(c('A', 'B', 'C'), 10, replace = TRUE)  # Sample with replacement
Notes:
Sample from existing vector
set.seed()
Random Numbers: Set random seed
Syntax:
set.seed(seed)
Examples:
set.seed(123); rnorm(5)  # Reproducible random numbers
Notes:
Ensures reproducible results
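Reproducibility here means that resetting the same seed replays the same sequence of random draws. A short sketch that checks this directly (the seed value 123 is arbitrary):

```r
# Same seed, same draws
set.seed(123)
first <- rnorm(5)

set.seed(123)
second <- rnorm(5)

identical(first, second)   # TRUE

# Without resetting the seed, the generator continues its sequence
third <- rnorm(5)
identical(first, third)    # FALSE

# The seed also governs sample() and runif()
set.seed(123)
pick_a <- sample(1:10, 3)
set.seed(123)
pick_b <- sample(1:10, 3)
```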
R Programming Mastery Guide
🌱 Beginner
- Learn basic data structures: vectors, lists, data.frames
- Master data import/export with read.csv() and write.csv()
- Practice basic statistics: mean(), median(), sd()
- Create simple plots with plot(), hist(), boxplot()
- Understand indexing with [] and $
📈 Intermediate
- Data manipulation with apply(), lapply(), sapply()
- Data filtering and subsetting with subset()
- Merge and join data frames with merge()
- Handle missing values with is.na(), na.omit()
- Write custom functions with function()
🚀 Advanced
- Advanced statistics: t.test(), cor(), regression
- Data aggregation with aggregate()
- String operations with grep(), paste(), substr()
- Control structures: if/else, for, while loops
- Random sampling and simulation with rnorm(), sample()
💡 Quick Tips
Data Analysis Workflow:
- Import data with read.csv()
- Explore with str(), summary(), head()
- Clean and manipulate data
- Analyze with statistical functions
- Visualize with plotting functions
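The workflow steps above can be sketched end to end with functions from this cheat sheet. To stay self-contained, the example first writes the built-in mtcars data to a temporary CSV file and then reads it back; in practice the path would point at your own file.

```r
# 1. Import: write a temporary CSV so the example has a file to read
path <- tempfile(fileext = '.csv')
write.csv(mtcars, path, row.names = FALSE)
cars <- read.csv(path)

# 2. Explore
str(cars)
summary(cars$mpg)
head(cars, 3)

# 3. Clean and manipulate: treat cylinders as a category, filter rows
cars$cyl <- factor(cars$cyl)
efficient <- subset(cars, mpg > 20)

# 4. Analyze: mean mpg per cylinder group
by_cyl <- aggregate(mpg ~ cyl, data = cars, mean)

# 5. Visualize
boxplot(mpg ~ cyl, data = cars, main = 'MPG by cylinders')

unlink(path)   # remove the temporary file
```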
Common Gotchas:
- R is case-sensitive: Data ≠ data
- Use na.rm = TRUE for functions with missing values
- Vectors are 1-indexed, not 0-indexed
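Each of these gotchas can be reproduced in a few lines. The variable names below are made up for the demonstration.

```r
# Case sensitivity: value and Value are two different objects
value <- 1
Value <- 2
value == Value               # FALSE

# Missing values propagate unless na.rm = TRUE is set
scores <- c(10, NA, 30)
mean(scores)                 # NA
mean(scores, na.rm = TRUE)   # 20

# Indexing starts at 1; index 0 returns an empty vector, not an error
v <- c(10, 20, 30)
v[1]                         # 10
v[0]                         # numeric(0)
v[-1]                        # 20 30 (negative index drops an element)
```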