Data Transformation in R Programming: Definition & Purpose

Instructor: Giorgos-Nektarios Panayotidis

George-Nektarios has worked as a tutor and student consultant for five years and has a 4-year university degree in Applied Informatics.

In this lesson, we'll explore the purpose of data transformation in the context of R programming. You'll see how it is used and carried out by looking at the related R functions and operators.

Managing Unwieldy Data

Imagine that you're a supplier for a company that produces furniture. You need to load the furniture products into a truck for delivery to other cities. You've carefully calculated the space in the truck and need to homogenize the way that each piece of furniture is placed into the truck to ensure everything fits. In other words, you need to eliminate skewness or asymmetry in the products' arrangement.

Preparing data for a statistical model in R isn't all that different. Many models work best when each variable is reasonably well behaved, which often means reducing strong skewness (asymmetry) in its distribution. In R programming, this can be achieved with data transformation.

Purpose of Data Transformation

Let's assume that we have a statistical model that we want to fit. In order to get meaningful results, we usually test whether the coefficients of the chosen explanatory (independent) variables are statistically significant. For these tests to be valid, we typically assume that the residuals follow the normal distribution and that their variance is constant, a property formally called homoscedasticity.

The point is that several assumptions need to be fulfilled for this model. Maybe the simplest of these is the ''additivity'' of the model: the effects of the explanatory variables add up. The linear model is the most familiar case, and it is itself a special case of the more general ''Generalized Additive Model'' (GAM). Simply put, additivity in a linear model means a relationship such as this: Y = B0 + B1*X1 + B2*X2 + e.
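
As a minimal sketch of the additive relationship above, the following R code simulates data and fits it with the base R function lm. The variable names (x1, x2, y, fit) and the coefficient values are assumptions chosen purely for illustration.

```r
# Simulated data; every name and number here is hypothetical.
set.seed(42)
n  <- 200
x1 <- runif(n, min = 0, max = 10)   # first explanatory variable (X1)
x2 <- runif(n, min = 0, max = 5)    # second explanatory variable (X2)
e  <- rnorm(n, mean = 0, sd = 1)    # error term (e)

# Additive model with B0 = 2, B1 = 1.5, B2 = -0.8
y <- 2 + 1.5 * x1 - 0.8 * x2 + e

# Fit Y = B0 + B1*X1 + B2*X2 + e by least squares
fit <- lm(y ~ x1 + x2)

coef(fit)      # estimated B0, B1, B2
summary(fit)   # includes the significance tests for each coefficient
```

The estimates returned by coef(fit) should land close to the values used in the simulation, since the simulated data satisfy the model's assumptions by construction.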

The typical purpose of data transformation is to bring the data closer to a statistical model's assumptions. In order to meet those assumptions, we need the variables to behave in a more regular way, much like the furniture in the earlier shipping example. This is managed via the so-called data transformation process, which is really nothing more than applying a specific mathematical operation, such as a logarithm or a power, to every value of a variable. Keep in mind that different forms of data transformation exist, depending on which irregularity the data shows and, consequently, which change needs to be made.
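
To illustrate how a transformation can reduce skewness, here is a small sketch that applies the base R log function to simulated right-skewed data. The skewness helper is a hypothetical function written for this example (base R does not provide one), and var_1 is simulated rather than real data.

```r
# Hypothetical helper: sample skewness as the standardized third moment.
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Simulated, strongly right-skewed variable (log-normal draws)
set.seed(1)
var_1 <- rlnorm(1000, meanlog = 0, sdlog = 1)

# Log transformation; requires strictly positive values
var_1_log <- log(var_1)

skewness(var_1)       # clearly positive: long right tail
skewness(var_1_log)   # much closer to zero after the transformation
```

The log is only one option. Which transformation is appropriate depends on the shape of the data, as noted above.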

Data Transformation R Functions

The R programming language comes with quite a few functions that can be used for data transformation. Remember, these functions are designed to perform ordinary mathematical operations, but they also cover some of the most frequently required transformations of data. Let's explore some of the most fundamental R functions and operators for data transformation.

Power/Root Transformation Functions/Operations

The first type we'll explore are transformations that raise the data to a specific power or fractional power (root). They are usually used to tackle skewness in the data, which can in turn contribute to violations of the model assumptions mentioned above. They include the following:

Function ''sqrt''

This function returns the square root. If the data variable that we want to transform is named ''var_1'', we apply the function as sqrt(var_1) and assign the result to a new variable name. Note that sqrt returns NaN, with a warning, for negative values.
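
A short sketch of this step, assuming var_1 is a numeric vector of non-negative values (the numbers and the name var_1_sqrt are made up for illustration):

```r
# Hypothetical example data: non-negative values with one large outlier
var_1 <- c(0, 1, 4, 9, 16, 100)

# Apply the square root and assign the result to a new variable
var_1_sqrt <- sqrt(var_1)

var_1_sqrt   # 0 1 2 3 4 10
```

The square root compresses large values more than small ones, which is why it pulls in a long right tail.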

Other Power Operations

To derive another power or root, we simply use the ''^'' operator. For example, to get the cube root, we would use var_1^(1/3) and then assign the result to a new variable name using the R assignment operator, <-. Keep in mind that a fractional power of a negative number yields NaN in R.
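
The sketch below shows the ''^'' operator used for a cube root and for a square. The data and the result names are hypothetical, and signed_cube_root is an illustrative helper written for this example, not a built-in R function.

```r
# Hypothetical example data
var_1 <- c(1, 8, 27, 64)

# Cube root via a fractional power; results are approximately 1, 2, 3, 4
var_1_cuberoot <- var_1^(1/3)

# Any other power works the same way, e.g. squaring
var_1_squared <- var_1^2

# A fractional power of a negative base gives NaN in R.
# The parentheses matter: -8^(1/3) is parsed as -(8^(1/3)).
neg_result <- (-8)^(1/3)   # NaN

# Illustrative helper (not built in): cube root that keeps the sign
signed_cube_root <- function(x) sign(x) * abs(x)^(1/3)
signed_cube_root(c(-8, 8))   # approximately -2 and 2
```

Because fractional powers are computed in floating point, compare results with all.equal rather than with ==.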
