This section reviews the similarities and differences between SAS and R so that organizations can make more informed decisions when planning a SAS to R migration.
Sponsors who want to keep SAS for legacy studies may still be interested in using R for custom graphs and data visualization, or in Shiny apps for interacting with clinical data. Sponsors who want to be submission ready should proceed with caution until all required R packages have been installed and tested and can produce SDTMs, ADaMs and TLGs.
While almost everything in SAS can be replicated in R, there is a steep learning curve because R concepts and process flow are more object oriented. R also assigns specific meanings to special characters, such as [] for subsetting, {} for code blocks and () for function calls. In addition, most R syntax consists of function calls, which are similar to calls to SAS functions and macro programs, so knowing how to call SAS functions and macros helps in understanding, writing and executing R functions. Like SAS macro programs, R functions can have positional, keyword and default parameters. See the list of R packages installed in SAS LSAF.
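The parallel between SAS macro parameters and R function arguments can be sketched as follows. This is a minimal illustration in base R; the function name `fmt_mean`, its parameters and the sample values are hypothetical and are not taken from any package or study.

```r
# Hypothetical helper, comparable in spirit to a SAS macro declared as
#   %macro fmt_mean(x, digits=1, suffix=);
# x is required; digits and suffix have default values.
fmt_mean <- function(x, digits = 1, suffix = "") {
  m <- mean(x, na.rm = TRUE)  # ignore missing values
  paste0(format(round(m, digits), nsmall = digits), suffix)
}

ages <- c(34, 41, 29, NA, 52)  # made-up sample data

res_default  <- fmt_mean(ages)                       # positional argument, defaults for the rest
res_position <- fmt_mean(ages, 2)                    # second positional argument sets digits
res_keyword  <- fmt_mean(suffix = " yrs", x = ages)  # keyword arguments, in any order

print(res_default)
print(res_position)
print(res_keyword)
```

As with SAS keyword parameters, named arguments can be supplied in any order, and any argument with a default can be omitted.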
A few SAS terms and their approximate R equivalents are listed below.
SAS | R
- data set | data frame
- observations, observation # | rows: slice(), row_number(), rownames()
- data set options () | data frame options via [] subsetting
- label | Hmisc::label()
- variable | vector
- types: numeric, character | numeric, character
- dates (# of days since Jan 1, 1960) | dates (# of days since Jan 1, 1970)
- N/A | list variable type
- rename | rename()
- modules, procs & functions | R packages and functions (ex. tidyverse)
- data steps: retain, if then, vr=, by | any(), ifelse(), mutate(), group_by() to replicate
- output | not easy to replicate
- first., last. | slice(1), slice(n()) after group_by()
- do loops, arrays | for loops with data frame index references
- proc sql | dplyr: select(), mutate(), filter(), case_when(), arrange(), group_by(), %>%
- left join, right join, inner join, full outer join | left_join(), right_join(), inner_join(), full_join()
- subqueries | mutate(), summarize(), left_join() to replicate
- proc compare | diffdf()
- proc contents | Hmisc::contents()
- proc freq | table()
- proc means | summarize()
- proc print | print()
- proc sort, nodup | arrange(), distinct(), group_by_all()
- proc transpose | pivot_longer(), pivot_wider()
numeric functions | R numeric functions
- min, max, mean, sum, median, std | min, max, mean, sum, median, sd
character functions | R character functions
- catx() | paste0(), paste(), unite()
- compress() | str_remove_all(), str_extract()
- find() | str_detect()
- index() | grep(), regexpr()
- lag(), lead() | lag(), lead()
- lowcase() | tolower()
- scan() | word(), strsplit(), separate()
- strip() | str_trim()
- substr() | str_sub()
- tranwrd() | str_replace_all()
- upcase() | toupper()
variable type conversion functions | R functions
- input() | as.numeric()
- put() | as.character()
- length() | width option in format(); nchar() returns the # of characters in a value
- N/A | length() returns the # of variables for a data frame and the # of elements for a vector
- count() | count(), str_count()
macro programs | R functions and user-defined functions
- macro variables | vectors with one or more values
- global macro variables | vectors with one or more values, ex. x <- 'Y' at the top level, x <<- 'Y' inside a function
- local macro variables | variables defined within R functions
- defaults and keyword parameters | defaults and keyword parameters
- ODS | R Markdown
- Logs | logrx
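Two of the mappings above, the different date origins and first./last. BY-group processing, can be sketched as follows. This is only an illustration, assuming the dplyr package is installed; the data frame `visits`, its column names and its values are made up for the example.

```r
library(dplyr)  # assumption: dplyr is installed

# Hypothetical visit data; sas_date holds SAS date values
# (number of days since Jan 1, 1960).
visits <- data.frame(
  usubjid  = c("001", "001", "001", "002", "002"),
  visitnum = c(1, 2, 3, 1, 2),
  sas_date = c(21915, 21922, 21929, 21916, 21923)
)

# R counts days from Jan 1, 1970, so the SAS origin must be stated explicitly.
visits$visit_date <- as.Date(visits$sas_date, origin = "1960-01-01")

# Equivalent of "by usubjid; if first.usubjid;" on sorted data.
first_visit <- visits %>%
  arrange(usubjid, visitnum) %>%
  group_by(usubjid) %>%
  slice(1) %>%
  ungroup()

# Equivalent of "by usubjid; if last.usubjid;" on sorted data.
last_visit <- visits %>%
  arrange(usubjid, visitnum) %>%
  group_by(usubjid) %>%
  slice(n()) %>%
  ungroup()

print(first_visit)
print(last_visit)
```

The arrange() call plays the role of proc sort before the BY statement, and ungroup() removes the grouping so that later steps operate on the whole data frame again.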