in Python Programming by Goeduhub's Expert (2k points)
Web scraping project to analyze products from Flipkart. In this article we cover: How do you scrape data from the Flipkart website using Python? How do you check whether you can scrape a website? How can you scrape Flipkart? Does Flipkart allow web scraping?

1 Answer

by Goeduhub's Expert (2k points)
Best answer

How to Scrape Data from Flipkart

We need to follow these steps to extract the data:

  1. Import the necessary libraries: BeautifulSoup, requests, pandas, and csv.
  2. Find the URL of the page we want to extract data from.
  3. Inspect the page to identify the HTML elements (and their class names) that hold the content we want to extract.
  4. Write the scraping code.
  5. Store the result in the desired format.

Step 1  Importing necessary libraries

from bs4 import BeautifulSoup  # HTML parsing
import requests                # HTTP requests
import csv                     # CSV reading and writing
import pandas as pd            # tabular data

Requests is a Python HTTP library. With its help we make a request to a web page and receive the page's content in the response.
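As a minimal sketch of such a request, the helper below adds a timeout and an HTTP status check around `requests.get`. The function name `fetch_html`, the header value, and the 10-second timeout are illustrative assumptions, not something the article prescribes; whether a custom User-Agent is needed for a given site is also an assumption to verify.

```python
import requests

# Illustrative values (assumptions), not taken from the article.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; laptop-price-demo)"}
TIMEOUT_SECONDS = 10


def fetch_html(url):
    """Download url and return its HTML as text, raising on HTTP errors."""
    response = requests.get(url, headers=DEFAULT_HEADERS, timeout=TIMEOUT_SECONDS)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses,
    # so an error page is not parsed as if it were the product listing.
    response.raise_for_status()
    return response.text
```

The returned text can be passed to BeautifulSoup in the same way as `response.content`.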

Step 2  Find the URL that we want to extract data from

In this example we extract data from the Flipkart website and compare the prices and ratings of different laptops. The URL of the Flipkart page that lists laptops is

https://www.flipkart.com/search?q=laptop&sid=6bo%2Cb5g&as=on&as-show=on&otracker=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&otracker1=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&as-pos=1&as-type=RECENT&suggestionId=laptop%7CLaptops&requestId=7ec220e8-4f02-4150-9e0b-9e90cf692f4b&as-searchtext=laptop
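Many of the query parameters in this URL (such as `otracker`, `requestId`, and `suggestionId`) look like tracking or autosuggest metadata. The sketch below uses the standard library's `urllib.parse` to list the parameters and rebuild a shorter URL. The helper name `simplify_search_url` is hypothetical, and the idea that the page returns the same results with only `q` is an assumption that should be checked in a browser.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

url = (
    "https://www.flipkart.com/search?q=laptop&sid=6bo%2Cb5g&as=on&as-show=on"
    "&otracker=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na"
    "&otracker1=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na"
    "&as-pos=1&as-type=RECENT&suggestionId=laptop%7CLaptops"
    "&requestId=7ec220e8-4f02-4150-9e0b-9e90cf692f4b&as-searchtext=laptop"
)


def simplify_search_url(full_url, keep=("q",)):
    """Return full_url with only the query parameters named in keep."""
    parts = urlsplit(full_url)
    params = parse_qs(parts.query)  # maps each parameter name to a list of values
    kept = {name: params[name] for name in keep if name in params}
    return urlunsplit(parts._replace(query=urlencode(kept, doseq=True)))


# Show every parameter present in the original URL.
for name, values in parse_qs(urlsplit(url).query).items():
    print(name, "=", values)

print(simplify_search_url(url))
```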

To get the contents of the specified URL, submit a request using the requests library. 

url = "https://www.flipkart.com/search?q=laptop&sid=6bo%2Cb5g&as=on&as-show=on&otracker=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&otracker1=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&as-pos=1&as-type=RECENT&suggestionId=laptop%7CLaptops&requestId=7ec220e8-4f02-4150-9e0b-9e90cf692f4b&as-searchtext=laptop"

# Download the page; response.content holds the raw bytes of the HTML.
response = requests.get(url)
htmlcontent = response.content

# Parse the HTML with Python's built-in parser.
soup = BeautifulSoup(htmlcontent, "html.parser")

# prettify is a method, so it must be called to get the indented HTML string.
print(soup.prettify())

  1. Here we use BeautifulSoup to parse the HTML content with the built-in HTML parser.
  2. Instead of printing the raw soup, we print the prettified soup, which shows the HTML with readable indentation.
  3. Check that the output is the HTML code of the page.
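The question also asks how to check whether a site may be scraped. One common first check is the site's robots.txt file. The sketch below uses the standard library's `urllib.robotparser` on a made-up robots.txt so that it runs offline; the helper name `is_allowed` and the sample rules are assumptions for illustration and do not reflect Flipkart's actual file. A site's terms of use may impose further restrictions that robots.txt does not express.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only (not Flipkart's real robots.txt).
sample_rules = """\
User-agent: *
Disallow: /checkout
Allow: /
"""

print(is_allowed(sample_rules, "https://www.example.com/search?q=laptop"))  # True
print(is_allowed(sample_rules, "https://www.example.com/checkout"))         # False
```

For a live check, `RobotFileParser.set_url()` followed by `read()` downloads and parses the real file instead of a string.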

[Screenshot: Flipkart search results page for laptops]

This is a snapshot of the Flipkart page, which lists information about different laptops. We want to extract the product name, the product price, and the product rating.

Step 3  Inspect the page to identify the HTML elements we want to extract

Right-click on the Flipkart page, choose Inspect, and then select the elements whose content you want to get.

[Screenshot: inspecting the Flipkart page in the browser]

After clicking Inspect, the browser's inspector panel opens. We observe that the class name of the descriptions is ‘_4rR01T’, so we use the find method to extract the descriptions of the laptops.

[Screenshot: element class name shown in the inspector]

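# Lists that will collect one entry per laptop.
products = []
prices = []
ratings = []

# find() returns only the first element that matches.
product = soup.find('div', attrs={'class': '_4rR01T'})
print(product.text)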

Output-

HP 14s Core i3 10th Gen - (8 GB/256 GB SSD/Windows 10 Home) 14s-cf3074TU Thin and Light Laptop 

Here we get the description of a single laptop, because find returns only the first match. To get the descriptions of all laptops we need to write a loop.

Step 4 Writing code for scraping.

Find the class names for the price, the rating, and the element that encloses the complete product (here the a element with class ‘_1fQZEK’), then loop over every product.

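# Each product is wrapped in an <a> element with class '_1fQZEK'.
for a in soup.find_all('a', href=True, attrs={'class': '_1fQZEK'}):
    name = a.find('div', attrs={'class': '_4rR01T'})
    price = a.find('div', attrs={'class': '_30jeq3 _1_WHN1'})
    rating = a.find('div', attrs={'class': '_3LWZlK'})

    # find() returns None when an element is missing (for example, a product
    # without a rating), so store None instead of calling .text on it.
    products.append(name.text if name is not None else None)
    prices.append(price.text if price is not None else None)
    ratings.append(rating.text if rating is not None else None)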
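The loop can be tried offline on a small hand-written HTML fragment. In the sketch below, the class names come from the article, but the fragment itself is made up and is not real Flipkart markup; the helper names `text_or_none` and `extract_products` are hypothetical. The second product has no rating, which shows why a missing element needs to be handled.

```python
from bs4 import BeautifulSoup

# Hand-written fragment for illustration; not real Flipkart markup.
sample_html = """
<a class="_1fQZEK" href="/p1">
  <div class="_4rR01T">Laptop A</div>
  <div class="_3LWZlK">4.2</div>
  <div class="_30jeq3 _1_WHN1">₹36,990</div>
</a>
<a class="_1fQZEK" href="/p2">
  <div class="_4rR01T">Laptop B</div>
  <div class="_30jeq3 _1_WHN1">₹29,990</div>
</a>
"""


def text_or_none(tag):
    """Return the stripped text of tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def extract_products(html):
    """Return one dict per product card found in html."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("a", href=True, class_="_1fQZEK"):
        rows.append({
            "Product Name": text_or_none(card.find("div", class_="_4rR01T")),
            "Prices": text_or_none(card.find("div", class_="_30jeq3")),
            "Ratings": text_or_none(card.find("div", class_="_3LWZlK")),
        })
    return rows


for row in extract_products(sample_html):
    print(row)
```

A list of dicts like this can be passed directly to `pd.DataFrame`. Class names such as ‘_4rR01T’ are specific to the page as it was inspected, so they should be re-checked in the browser before running against the live site.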

Step 5 Store the result in desired format.

import pandas as pd

df = pd.DataFrame({'Product Name':products,'Prices':prices,'Ratings':ratings})

df.head()

Output-

   Product Name                                       Prices   Ratings
0  HP 14s Core i3 10th Gen - (8 GB/256 GB SSD/Win...  ₹36,990  4.2
1  HP 15 Ryzen 3 Dual Core 3200U - (4 GB/1 TB HDD...  ₹29,990  4.1
2  Lenovo Ideapad S145 Core i3 7th Gen - (4 GB/1 ...  ₹30,990  4.1
3  HP 15s Ryzen 5 Quad Core 3450U - (8 GB/1 TB HD...  ₹40,990  4.2
4  Asus TUF Gaming A17 Ryzen 5 Hexa Core 4600H - ...  ₹63,990  4.7
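The scraped prices and ratings are strings, so they need to be converted to numbers before laptops can be compared. The sketch below does this for the five rows shown in the output; the helper name `parse_price` is hypothetical, and it assumes prices are whole rupees with no decimal part.

```python
import re


def parse_price(price_text):
    """Convert a price string such as '₹36,990' to an integer number of rupees.

    Assumes whole-rupee prices: every non-digit character is dropped.
    """
    digits = re.sub(r"[^0-9]", "", price_text)
    if not digits:
        raise ValueError(f"no digits found in {price_text!r}")
    return int(digits)


# (name, price, rating) as they appear in the output; names are truncated there.
scraped = [
    ("HP 14s Core i3 10th Gen - (8 GB/256 GB SSD/Win...", "₹36,990", "4.2"),
    ("HP 15 Ryzen 3 Dual Core 3200U - (4 GB/1 TB HDD...", "₹29,990", "4.1"),
    ("Lenovo Ideapad S145 Core i3 7th Gen - (4 GB/1 ...", "₹30,990", "4.1"),
    ("HP 15s Ryzen 5 Quad Core 3450U - (8 GB/1 TB HD...", "₹40,990", "4.2"),
    ("Asus TUF Gaming A17 Ryzen 5 Hexa Core 4600H - ...", "₹63,990", "4.7"),
]

# Convert the text columns to numbers.
laptops = [(name, parse_price(price), float(rating)) for name, price, rating in scraped]

cheapest = min(laptops, key=lambda item: item[1])
best_rated = max(laptops, key=lambda item: item[2])
print("Cheapest:", cheapest)
print("Best rated:", best_rated)
```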

Store in a CSV file-

df.to_csv('products.csv')

A file named “products.csv” is created, and it contains the extracted data.
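The csv module imported in Step 1 is not used above. As an alternative sketch, the same three columns can be written with `csv.DictWriter`; the function name `write_products` and the sample rows are made up for illustration, and an in-memory buffer stands in for the file so the result can be printed.

```python
import csv
import io

FIELDNAMES = ["Product Name", "Prices", "Ratings"]


def write_products(rows, fileobj):
    """Write rows (a list of dicts keyed by FIELDNAMES) to fileobj as CSV."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows for illustration; None stands for a missing rating.
rows = [
    {"Product Name": "Laptop A", "Prices": "₹36,990", "Ratings": "4.2"},
    {"Product Name": "Laptop B", "Prices": "₹29,990", "Ratings": None},
]

buffer = io.StringIO()
write_products(rows, buffer)
print(buffer.getvalue())
```

To write a real file, pass a file opened with `open("products.csv", "w", newline="", encoding="utf-8")`. With pandas, note that `df.to_csv('products.csv')` also writes the DataFrame index as the first column; passing `index=False` leaves it out.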

...