Decoding Purdue

An Analysis of Purdue's GPA

Andrew Thompson (CS & DS, Junior)   •   Sam Lau (Math & Stats, Senior)
Bryce Bogan (Stats, Sophomore)   •   Weichang Wang (CS & DS, Senior)

    Our goal is to analyze the distribution of Purdue's GPA across courses, subjects, academic terms, and professors to help Purdue faculty and students better understand the difficulty and requirements of the many academic paths of the university.
    This portfolio presents our group's visualizations, insights, and methods across the entire data visualization process.

Our Data Visualization Process

This section will go through the first few steps of the data visualization process that we followed for this project.

Acquire

To acquire our dataset covering GPA distributions for a range of courses, subjects, terms, and professors, we downloaded it from boilergrades.com, a website that publishes grade data made available through a public records request.

Original acquired data

Parse

To parse this dataset, we wrote out the datatype, format, and range of each variable. In Excel, we then selected the correct data format for each column to make the data more readable.

Parsed data

Filter

We then filtered the data to remove Pass/Fail courses, which do not have a normal grade distribution, as well as any classes with a withdrawal rate greater than 80%. Later in the process, the datasets for many visualizations were filtered further to drop courses or subjects with a small number of sections, keeping only the larger observations, which carry more significance.
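The two filtering rules can be sketched in a few lines of Python. This is only an illustration, not the Excel workflow we actually used: the record layout, the field names, the course codes, and the use of raw grade counts are all assumptions made for the example.

```python
# Hypothetical section records; the layout and course codes are made up
# for illustration and are not the BoilerGrades schema.
sections = [
    {"course": "AAA 10100", "grades": {"A": 20, "B": 30, "C": 25, "D": 10, "F": 10, "W": 5}},
    {"course": "BBB 10000", "grades": {"P": 40, "N": 2}},  # Pass/Fail only
    {"course": "CCC 20000", "grades": {"A": 1, "W": 9}},   # 90% withdrawals
]

LETTER_GRADES = {"A", "B", "C", "D", "F"}
MAX_WITHDRAWAL_RATE = 0.80  # threshold stated in the Filter step


def keep_section(section):
    """Return True if a section has letter grades and an acceptable withdrawal rate."""
    grades = section["grades"]
    total = sum(grades.values())
    if total == 0:
        return False
    has_letter_grades = any(grades.get(g, 0) > 0 for g in LETTER_GRADES)
    withdrawal_rate = grades.get("W", 0) / total
    return has_letter_grades and withdrawal_rate <= MAX_WITHDRAWAL_RATE


filtered = [s for s in sections if keep_section(s)]
print([s["course"] for s in filtered])
```

Only the first hypothetical section survives: the second has no letter grades, and the third exceeds the 80% withdrawal threshold.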

Mine

Next, we mined the data by summing the population of each course and combining that total with the grade distribution to compute an average GPA for each course. We then created multiple pivot tables to sum every section across each course and subject, and to match each professor with every class they teach. These pivot tables were used in the following stages of the visualization process.
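As a rough sketch of the average GPA calculation described above, the snippet below weights each letter grade by the number of students who received it. It assumes whole letter grades on a 4.0 scale and raw student counts per grade; plus/minus grades and the exact column layout of the source data are left out, and the function names are hypothetical.

```python
# Assumed grade-point mapping (whole letters only; plus/minus grades omitted).
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def average_gpa(grade_counts):
    """Population-weighted average GPA; non-letter entries such as W are ignored."""
    graded = {g: n for g, n in grade_counts.items() if g in GRADE_POINTS}
    total = sum(graded.values())
    if total == 0:
        return None  # nothing to average
    return sum(GRADE_POINTS[g] * n for g, n in graded.items()) / total


def course_gpa(section_counts):
    """Sum the grade counts of every section of a course, then average once."""
    combined = {}
    for counts in section_counts:
        for grade, n in counts.items():
            combined[grade] = combined.get(grade, 0) + n
    return average_gpa(combined)


# Two hypothetical sections of the same course.
print(course_gpa([{"A": 10, "B": 10}, {"B": 5, "C": 5, "W": 3}]))
```

Summing counts across sections before averaging plays the same role as the pivot tables: larger sections contribute proportionally more to the course average than small ones.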

Mined data

Visualizations & Insights

This section showcases the visualizations our group created to give insight into the relationship between GPA and numerous factors at Purdue.


     This scatterplot compares the GPA of courses with the number of sections they have. Each course datapoint is also labelled by its department. Most courses have few sections, with a particularly dense cluster around a GPA of 3.25 and a log number of sections of 1.4 (about 4 sections).
     Both Science and Liberal Arts have clusters of courses with a high number of sections; however, Science's cluster has a lower GPA than the Liberal Arts cluster.
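The conversion from the log axis back to a section count can be checked with a short calculation. This assumes the axis uses the natural logarithm, which is the base consistent with 1.4 corresponding to about 4 sections; a base-10 axis would give roughly 25.

```python
import math

log_sections = 1.4  # cluster location read off the scatterplot's log axis

# Assumption: natural log. exp(1.4) is close to 4, matching the text.
natural_log_sections = math.exp(log_sections)

# For comparison only: what the same axis value would mean in base 10.
base10_sections = 10 ** log_sections

print(round(natural_log_sections, 2), round(base10_sections, 1))
```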

Lowest 10 courses by GPA
This bar chart shows that, of all courses at Purdue with more than 100 sections, Math accounts for 7 of the lowest 10. These include Calculus 1 - 3, linear algebra, differential equations, and Plane Analytic Geometry & Calculus 1 & 2, which had the two lowest average GPAs of all observations by a relatively large margin.

This line graph shows the average GPA of math courses across all terms and years in our dataset. The average math GPA is usually slightly higher in fall than in the other two terms. However, there is an extreme peak in Spring 2020, and the fall and summer averages that year are also higher than in other years. This may be related to the Covid pandemic, which first disrupted classes that semester.

Lowest 10 subjects by GPA   •   Highest 10 subjects by GPA
These bar charts compare the 10 subjects with the highest average GPA and the 10 subjects with the lowest, considering only subjects with 200 or more sections. Math has the lowest average GPA, 2.56, with a 0.14 gap between it and the second-lowest subject, Band. The two outliers with the highest GPA are Pharmacy and Nursing, with average GPAs of 3.69 and 3.72, respectively. These two subjects are also separated by a relatively large gap of 0.25 from third place, Aviation Technology.
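The selection behind these two charts can be sketched as a threshold filter followed by a sort. The subject names, section counts, and GPA values below are placeholders rather than figures from our dataset, and the function name is hypothetical.

```python
def rank_subjects(subjects, min_sections=200, n=10):
    """Return (lowest n, highest n) subjects by average GPA among those
    with at least min_sections sections."""
    eligible = [s for s in subjects if s["sections"] >= min_sections]
    ordered = sorted(eligible, key=lambda s: s["avg_gpa"])
    lowest = ordered[:n]
    highest = ordered[-n:][::-1]  # best first
    return lowest, highest


# Placeholder data for illustration only.
subjects = [
    {"subject": "SUBJ_A", "sections": 250, "avg_gpa": 2.6},
    {"subject": "SUBJ_B", "sections": 300, "avg_gpa": 3.7},
    {"subject": "SUBJ_C", "sections": 50, "avg_gpa": 1.9},  # too few sections
    {"subject": "SUBJ_D", "sections": 400, "avg_gpa": 3.1},
    {"subject": "SUBJ_E", "sections": 220, "avg_gpa": 2.9},
]

lowest, highest = rank_subjects(subjects, min_sections=200, n=2)
print([s["subject"] for s in lowest], [s["subject"] for s in highest])
```

Applying the section threshold before ranking keeps a small subject such as the placeholder SUBJ_C from appearing as the lowest entry despite its low average.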

Analysis of the Lowest and Highest GPA Courses by Professor

The six bar graphs above display the three highest and three lowest courses at Purdue University based on average GPA. Using the average GPAs provided by BoilerGrades, we found the top three and bottom three instructors for each course (with the exception of Organic Chem Lab) and assessed the range between them. The red charts represent the lowest courses, while the blue charts represent the highest courses. Notably, the bottom three courses are all Mathematics courses in the Mathematics department, while the top three courses are non-Mathematics courses in various departments. The range is also wider for Organic Chem (the course with the highest average GPA) and Plane Analytic Geometry & Calculus II (the course with the lowest average GPA) than for the courses ranked next to these two extremes. Based on the graphs, the instructor with the lowest average GPA (1.30) was Egbert, Nicholas R., teaching Plane Analytic Geometry & Calculus II, while the highest average GPA (4.0) was achieved by 10 instructors teaching Organic Chem Lab.

Continue Reading

If you would like a more in-depth look at our work, feel free to visit the Paper tab at the top of the page. You can also visit the Video tab to watch our final presentation on YouTube, or the Visualization Gallery tab to see all of our visualizations in one convenient location, along with an interactive dashboard.

Built with React & Tailwind - Hosted by Netlify.