Python Pandas Question Exercise Practice Solutions | Data Science and Machine Learning with Python

asked May 27 in Python Programming by Sharda Chaudhary Goeduhub's Expert (2.2k points)
edited May 27 by Sharda Chaudhary

Assignment/Task 5

Pandas - Data Analysis of IMDB movies data

Now that we have a basic understanding of the different data structures in Pandas, let's explore the 'IMDB-movies-dataset' and get our hands dirty by performing practical data analysis on real data.

It is an open-source dataset and you can download it from this link.

We will read the data from the .csv file and perform the following basic operations on the movies data:

Load the IMDb Dataset and read
View the dataset
Understand some basic information about the dataset and inspect the dataframe's columns, shape, variable types, etc.
Data Selection – Indexing and Slicing data
Data Selection – Based on Conditional filtering
Groupby operations
Sorting operation
Dealing with missing values
Dropping columns and null values
Apply( ) functions

3 Answers

answered May 28 by G.Vigneshwaran (342 points)
selected May 31 by Goeduhub

Best answer

1. Load the IMDb Dataset and read

import numpy as np

import pandas as pd

df = pd.read_csv('IMDB-Movie-Data.csv')
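
If the CSV file is not at hand, the loading step can still be tried out on an in-memory sample. This is only a minimal sketch: the three rows below are made up, and the column names (Rank, Title, Genre, Rating, Votes, Metascore) are assumed from the columns used elsewhere in this answer.

```python
import io

import pandas as pd

# Hypothetical three-row sample standing in for IMDB-Movie-Data.csv
csv_text = """Rank,Title,Genre,Rating,Votes,Metascore
1,Movie A,"Action,Sci-Fi",8.1,757074,76.0
2,Movie B,"Adventure,Mystery",7.0,485820,65.0
3,Movie C,Horror,7.3,157606,
"""

# read_csv accepts a file path or any file-like object;
# index_col makes the Rank column the row index
sample = pd.read_csv(io.StringIO(csv_text), index_col="Rank")

print(sample.shape)
print(sample.head())
```

The empty Metascore field in the last row is read as NaN, which is what the missing-value steps later on operate on.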

2. View the dataset

df.head(10)

3. Understand some basic information about the dataset and inspect the dataframe's columns, shape, variable types, etc.

df.columns

type(df)

df.dtypes

df.shape

df.size

df.ndim

df.values

df1 = df.values

type(df1)

df.describe()
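
Two more inspection calls that may be handy here are info() and select_dtypes(). The sketch below runs on a small made-up frame (the rows are hypothetical, not taken from the dataset) so it works without the CSV.

```python
import pandas as pd

# Hypothetical rows; only the column names mirror the ones used in this answer
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["Action", "Drama", "Horror"],
    "Rating": [8.1, 7.0, 7.3],
    "Votes": [757074, 485820, 157606],
})

# info() prints column names, non-null counts and dtypes in one summary
sample.info()

# select_dtypes() picks columns by type, here the numeric ones
numeric_cols = sample.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```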

4. Data Selection – Indexing and Slicing data

df.iloc[0]

df[1:20]

df[['Rating','Votes']].agg(['min','max','mean'])
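
One detail worth knowing when slicing is that iloc works on positions and excludes the end, while loc works on labels and includes the end. A minimal sketch on a made-up four-row frame with the default integer index:

```python
import pandas as pd

# Hypothetical rows with the default 0..3 integer index
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Rating": [8.1, 7.0, 7.3, 6.2],
    "Votes": [757074, 485820, 157606, 60545],
})

# Position-based: rows at positions 0 and 1 (end is exclusive)
first_two = sample.iloc[0:2]

# Label-based: rows labelled 0 through 2 (end is inclusive), two columns only
by_label = sample.loc[0:2, ["Title", "Rating"]]

print(first_two)
print(by_label)
```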

5. Data Selection – Based on Conditional filtering

df.filter(items=['Rank', 'Votes'])

df['Rating']>7

top_rank = df[df["Rating"] > 8.0]["Title"].count()

print(top_rank)
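
Conditions can also be combined. The sketch below, on made-up rows, uses & between parenthesised comparisons and shows the same filter written with query(); the thresholds are arbitrary examples.

```python
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Rating": [8.1, 7.0, 8.6, 8.8],
    "Votes": [757074, 485820, 120000, 1000000],
})

# Each comparison needs its own parentheses when combined with & or |
mask = (sample["Rating"] > 8.0) & (sample["Votes"] > 500_000)
popular_hits = sample.loc[mask, "Title"]

# The same filter expressed as a query string
queried = sample.query("Rating > 8.0 and Votes > 500000")

print(popular_hits.tolist())
```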

6. Groupby operations

df2 = df.groupby('Genre')

df2.mean(numeric_only=True)

df[df['Rating']>7].groupby('Genre')[['Rating']].count()

top_movie = df[df["Rating"] > 8.0]

top_movie.groupby(["Title"])["Votes"].mean()
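
Grouping on Genre as-is treats a combined string such as "Action,Sci-Fi" as one group. If per-genre numbers are wanted, one possible approach is to split and explode the column first. This sketch assumes Genre holds comma-separated strings, as in the made-up rows below.

```python
import pandas as pd

# Hypothetical rows; Genre is assumed to be a comma-separated string
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["Action,Sci-Fi", "Action,Drama", "Drama"],
    "Rating": [8.0, 7.0, 6.0],
})

# Turn each genre string into a list, then give every genre its own row
exploded = sample.assign(Genre=sample["Genre"].str.split(",")).explode("Genre")

# Several aggregations at once on a single column
per_genre = exploded.groupby("Genre")["Rating"].agg(["mean", "count"])

print(per_genre)
```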

7. Sorting operation

x = df.sort_values(by='Rating')

x.head(10)

# Sort by vote count directly; grouping by Votes first is not needed
most_votes = df.sort_values(by='Votes', ascending=False)

most_votes.head()
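
Sorting can also use several columns at once, and nlargest() is a shortcut for "top N by a column". A small sketch on made-up rows:

```python
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Rating": [8.1, 8.8, 8.1, 7.0],
    "Votes": [100, 200, 300, 400],
})

# Highest rating first; ties on Rating are broken by Votes, also descending
ordered = sample.sort_values(by=["Rating", "Votes"], ascending=[False, False])

# Top two rows by Rating without sorting the whole frame
top2 = sample.nlargest(2, "Rating")

print(ordered["Title"].tolist())
print(top2)
```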

8. Dealing with missing values

df.isnull().sum()
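
Besides counting the missing values, they can be filled in instead of dropped. The sketch below fills a missing Metascore with the column mean; the rows are made up, and whether mean imputation suits the real data is a judgment call.

```python
import pandas as pd

# Hypothetical rows with one missing Metascore
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Metascore": [76.0, None, 64.0],
})

# Number of missing values in each column
missing_per_column = sample.isnull().sum()

# Replace the missing score with the mean of the known scores
filled = sample.assign(
    Metascore=sample["Metascore"].fillna(sample["Metascore"].mean())
)

print(missing_per_column)
print(filled)
```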

9. Dropping columns and null values

df.dropna()

df = df.drop(columns=['Metascore'])
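
Note that dropna() and drop() return new dataframes by default and leave the original untouched, so the result has to be assigned to keep it. A minimal sketch on made-up rows, also showing dropna() restricted to one column with subset:

```python
import pandas as pd

# Hypothetical rows with gaps in two different columns
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Rating": [8.1, None, 7.3],
    "Metascore": [76.0, 65.0, None],
})

# Drop only the rows where Rating is missing
rated = sample.dropna(subset=["Rating"])

# Drop a column; the result is a new frame
trimmed = sample.drop(columns=["Metascore"])

print(rated)
print(trimmed.columns.tolist())
```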

10. Apply( ) functions

rank = df['Rank'].apply(lambda n: n * 5)

print(rank.head())
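
apply() also works with a named function on one column, or row by row with axis=1. In the sketch below the rating_band helper, its "Good"/"Average" labels, and the 8.0 cutoff are all made up for illustration, as are the rows.

```python
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Title": ["Movie A", "Movie B"],
    "Rating": [8.0, 7.0],
    "Votes": [800, 700],
})


def rating_band(rating: float) -> str:
    """Hypothetical helper: label a rating using an arbitrary 8.0 cutoff."""
    return "Good" if rating >= 8.0 else "Average"


# Column-wise: the function receives one value at a time
bands = sample["Rating"].apply(rating_band)

# Row-wise: axis=1 passes each row, so several columns can be combined
votes_per_point = sample.apply(lambda row: row["Votes"] / row["Rating"], axis=1)

print(bands.tolist())
print(votes_per_point.tolist())
```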
