Data Science

R Programming

R is a powerful and popular programming language widely used for statistical computing and data analysis. It is typically used through a command-line interpreter, meaning you type commands (statements) to interact with the system.

R allows users to load, explore, clean, analyze, and visualize datasets. Whether you're working with structured tabular data, time series, or unstructured text, R provides tools for a wide range of data analysis tasks. Data manipulation, transformation, and summarization can be done with R's built-in functions and with add-on packages such as ggplot2, dplyr, tidyr, and caret.
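
As a rough illustration of the load, explore, clean, summarize, and visualize steps described above, here is a minimal sketch that uses only base R and the built-in mtcars dataset, so no packages need to be installed. The variable names (cars, kpl, mpg_by_cyl) are made up for this example; packages such as dplyr and ggplot2 cover the same steps with their own functions.

```r
# Load: mtcars ships with R (datasets package), so nothing is downloaded.
cars <- mtcars

# Explore: show the structure and a summary of one column.
str(cars)
summary(cars$mpg)

# Clean: keep only rows without missing values.
# (mtcars has no missing values, so nothing is dropped here.)
cars <- cars[complete.cases(cars), ]

# Transform: add a derived column, kilometres per litre
# (1 mile per US gallon is about 0.425144 km per litre).
cars$kpl <- cars$mpg * 0.425144

# Summarize: mean miles per gallon for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
print(mpg_by_cyl)

# Visualize: scatter plot of weight against fuel economy.
plot(cars$wt, cars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
```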

Overall, R is a versatile and powerful programming language that has become a go-to tool for data analysts, statisticians, and researchers worldwide due to its robustness, flexibility, and extensive community support. Whether you're just starting your journey into data analysis or looking to enhance your skills, R offers a rich set of tools and resources to help you unlock the insights hidden within your data.

R Variable Names (Identifiers)

A variable can have a short name (like x and y) or a more descriptive name (age, carname, total_volume). The rules for R variable names are:

  1. The name can be a combination of letters, digits, periods (.), and underscores (_).
  2. It must start with a letter or a period.
  3. If it starts with a period, the second character cannot be a digit.
  4. It cannot start with a digit or an underscore.
  5. Variable names are case-sensitive (age, Age, and AGE are three different variables).
  6. Reserved words cannot be used as variable names (TRUE, FALSE, NULL, if, ...).
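
The rules above can be tried out with a short sketch. The helper is_valid_name is a hypothetical function written for this example, not part of R; it relies on base R's make.names(), which converts a string into a syntactically valid name, so a string that comes back unchanged was already valid.

```r
# Valid names: letters, digits, periods, and underscores.
x <- 1
total_volume <- 2.5
.hidden <- "starts with a period followed by a letter"
car.name2 <- "digits and periods are fine after the first character"

# Names are case-sensitive: these are three separate variables.
age <- 30
Age <- 31
AGE <- 32

# Hypothetical helper: a name is valid when make.names() leaves it unchanged.
is_valid_name <- function(name) {
  name == make.names(name)
}

candidates <- c("age", ".hidden", "total_volume",
                "2cool", "_tmp", ".2way", "TRUE", "if")
validity <- is_valid_name(candidates)
names(validity) <- candidates
print(validity)
# "age", ".hidden", and "total_volume" are TRUE; the rest are FALSE:
# "2cool" starts with a digit, "_tmp" with an underscore,
# ".2way" has a digit after the leading period,
# and "TRUE" and "if" are reserved words.
```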