
## Analysis of Two Continuous Variables

In the earlier blogs, we learned about the analysis of a single continuous variable and a single categorical variable. In this blog, we will analyze two continuous variables. The table below summarizes the most commonly used descriptive statistics for analyzing two continuous variables.

| Method | Descriptive Statistic |
|---|---|
| Graphical Methods | Scatter plot |
| Numerical Methods | Correlation |
| Tabular Methods | Crosstab: to analyze two continuous variables using a crosstab, we first have to convert the continuous variables into categorical ones by binning/bucketing |
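The crosstab approach can be sketched as follows. This is a minimal illustration on synthetic data, not the MBA Students Data; the vectors `x` and `y` and the bin edges in `breaks` are assumptions chosen only to show the binning step.

```r
# Synthetic data for illustration only (not the MBA Students Data)
set.seed(42)
x <- runif(200, min = 40, max = 100)
y <- pmin(100, pmax(0, 0.5 * x + rnorm(200, mean = 30, sd = 8)))

# Bin each continuous variable into categories (bin edges are assumed)
breaks <- c(0, 50, 60, 70, 80, 100)
x_bin <- cut(x, breaks = breaks, include.lowest = TRUE)
y_bin <- cut(y, breaks = breaks, include.lowest = TRUE)

# Crosstab of the two binned variables
xtab <- table(x_bin, y_bin)
print(xtab)
```

If the counts concentrate along the diagonal of the crosstab, high values of one variable tend to go with high values of the other.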

Note: The Latin letter r or R is used to denote correlation.

### Example

For the analysis of two continuous variables, we will consider the following pairs of variables from the MBA Students Data.

1. Graduation Percentage and MBA Grades.
2. 10th Standard Percentages and 12th Standard Percentages.

#### Importing MBA Students Data in R

```#Set directory as per your folder file path
setwd("D:/k2analytics/datafile")
getwd()
```

#### 1. Graduation Percentage Vs MBA Grades

| Variable | Graduation Percentage | MBA Grades |
|---|---|---|
| Variable Name | ‘grad_pct’ | ‘mba_grades’ |
| Description | Percentage of Marks secured by students in their graduation degree | Average Grades of MBA Students in the First three Semesters |
| Variable Type | Continuous Variable | Continuous Variable |

#### Graphical Method | Scatter plot

A scatter plot visually represents the relationship between two continuous variables. It quickly shows the direction of the correlation between the two variables and gives a rough sense of how strong it is.

The R code to draw a scatter plot between students’ Graduation Percentage and MBA Grades is given below.

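```#Scatter plot for graduation percentage Vs mba grades
#(axis labels below are illustrative)
plot(mba_df$grad_pct, mba_df$mba_grades,
     xlab = "Graduation Percentage", ylab = "MBA Grades"
)
```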

• In the above scatter plot, we observe that a good number of students with a very high percentage of marks in Graduation have also secured good grades in the MBA examination, whereas students with moderate or lower graduation percentages have mostly secured moderate grades in MBA. This suggests a weak positive correlation between students’ graduation percentage and MBA grades.
• Since the data points are widely spread out in the above graph, the linear relationship between students’ graduation percentage and MBA grades is likely to be weak. (The strength of a linear relationship is measured using correlation.)
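To see how the spread of the points relates to the strength of the linear relationship, here is a small sketch on synthetic data. The variables `x`, `y_tight` and `y_loose` are made up for illustration and are not part of the MBA Students Data.

```r
# Synthetic data for illustration only
set.seed(1)
x <- rnorm(300)
y_tight <- x + rnorm(300, sd = 0.3)  # little noise: points hug a line
y_loose <- x + rnorm(300, sd = 3)    # heavy noise: points spread out

# Draw the two scatter plots side by side
op <- par(mfrow = c(1, 2))
plot(x, y_tight, main = "Tight cloud")
plot(x, y_loose, main = "Spread-out cloud")
par(op)

# Pearson correlation for each pair
r_tight <- cor(x, y_tight)
r_loose <- cor(x, y_loose)
cat("Tight cloud r:", round(r_tight, 3), "\n")
cat("Spread-out cloud r:", round(r_loose, 3), "\n")
```

Both clouds slope upward, but the spread-out one should give a noticeably smaller r, which matches the reading of the scatter plot above.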

#### Numerical Method | Correlation

Correlation is a statistical measure of the strength and direction of the linear relationship between two variables (X, Y). The code to calculate the correlation between students’ graduation percentage and MBA grades is given below:

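```#Correlation
corr_1 = cor(mba_df$grad_pct,
             mba_df$mba_grades,
             method = "pearson")
cat("Pearson Correlation between Graduation Percentage and MBA Grades is", round(corr_1, 3))
```
```#Output
Pearson Correlation between Graduation Percentage and MBA Grades is 0.211
```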
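For readers curious about what the Pearson method computes, the sketch below works out r by hand on synthetic data and compares it with `cor()`. The vectors `x` and `y` are made up for illustration; only `cor()` with `method = "pearson"` comes from the code above.

```r
# Synthetic data for illustration only
set.seed(7)
x <- rnorm(100, mean = 65, sd = 8)
y <- 0.3 * x + rnorm(100, mean = 40, sd = 6)

# Pearson r: sum of cross-deviations divided by the
# square root of the product of the sums of squared deviations
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

# Built-in calculation for comparison
r_builtin <- cor(x, y, method = "pearson")

print(all.equal(r_manual, r_builtin))
```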

#### Interpretation

The correlation between students’ graduation percentage and MBA grades is 0.211 (weak correlation). This means that a high percentage of marks in graduation does not necessarily translate into good grades in MBA.
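One way to get a feel for how weak this is: squaring r gives the share of variation in one variable that a straight-line fit on the other would account for. The short calculation below uses only the value 0.211 reported above.

```r
# Correlation reported in the text above
r <- 0.211

# Squared correlation: share of variance associated with a straight-line fit
r_squared <- r^2
print(round(r_squared, 4))
```

The result is roughly 4.5%, so most of the variation in MBA grades is not linearly associated with graduation percentage.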

#### Note

• The weak correlation between students’ Graduation Percentage and MBA Grades may be due to the following reasons:
    • Students come from different graduation streams (like B.E, B.Com, B.Sc)
    • The data is a mix of students with different specializations (like Business Analytics, Marketing, Finance, HR)
• The above statements are just hypotheses. A Data Scientist should be able to explore and investigate the data thrown at them. We leave this to aspiring Data Scientists: play with the data and do a more detailed Exploratory Data Analysis.
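One possible starting point for investigating these hypotheses is to compute the correlation separately within each group. The sketch below uses a synthetic data frame `demo_df` with a hypothetical `specialization` column; the actual column name in the MBA Students Data is not shown in this blog, so treat it as an assumption.

```r
# Synthetic data for illustration only; 'specialization' is a hypothetical column
set.seed(123)
demo_df <- data.frame(
  specialization = rep(c("Business Analytics", "Marketing", "Finance", "HR"),
                       each = 50),
  grad_pct = runif(200, min = 50, max = 90)
)
demo_df$mba_grades <- 5 + 0.03 * demo_df$grad_pct + rnorm(200, sd = 0.8)

# Correlation on the full data
overall_r <- cor(demo_df$grad_pct, demo_df$mba_grades)

# Correlation within each specialization
group_r <- sapply(split(demo_df, demo_df$specialization),
                  function(g) cor(g$grad_pct, g$mba_grades))

print(round(overall_r, 3))
print(round(group_r, 3))
```

If the group-wise values differ a lot from the overall value, that would support looking at each group separately.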

#### 2. 10th Standard Percentages Vs 12th Standard percentages

| Variable | 10th Standard Percentages | 12th Standard Percentages |
|---|---|---|
| Variable Name | ‘tenth_std_pct’ | ‘ten_plus_2_pct’ |
| Description | Percentage of Marks secured by students in 10th Standard | Percentage of Marks secured by students in 12th Standard |
| Variable Type | Continuous Variable | Continuous Variable |

#### Graphical Methods | Scatter plots

```# PRACTICE EXERCISE

# THIS BLOCK IS INTENTIONALLY KEPT BLANK

# WRITE CODE TO MAKE A SCATTER PLOT BETWEEN
# tenth_std_pct AND ten_plus_2_pct
```

The above scatter plot clearly shows there is a positive linear relationship between students’ 10th and 12th Standard Percentages.

#### Numerical Method | Correlation

```#Correlation
corr_2 = cor(mba_df$tenth_std_pct,
             mba_df$ten_plus_2_pct,
             method = "pearson")
cat("Pearson Correlation between 10th and 12th Standard Percentage is", round(corr_2, 3))
```
```#Output
Pearson Correlation between 10th and 12th Standard Percentage is 0.456
```

#### Interpretation

• Since r = 0.456, there is a moderate positive linear relationship between students’ 10th and 12th Standard percentages.
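The labels "weak" and "moderate" used in this blog can be turned into a small helper. The function `describe_strength` and its cut-offs of 0.3 and 0.7 are assumptions based on a common rule of thumb, not thresholds stated in this blog; different fields use different cut-offs.

```r
# Hypothetical helper; cut-offs 0.3 and 0.7 are an assumed rule of thumb
describe_strength <- function(r) {
  a <- abs(r)  # strength ignores the sign of the correlation
  if (a < 0.3) {
    "weak"
  } else if (a < 0.7) {
    "moderate"
  } else {
    "strong"
  }
}

# Applied to the two correlations reported in this blog
print(describe_strength(0.211))
print(describe_strength(0.456))
```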

### Practice Exercise

Analyze the 12th Standard Percentages with Graduation Percentages.

### Upcoming Blog

In the upcoming blog, we will learn “Analysis of Two Categorical Variables”.
