Image compression using K-means clustering | Color Quantization using K-Means | K-means Clustering Application

Question

asked Jan 18 in Artificial Intelligence(AI) & Machine Learning by Sharda Chaudhary Goeduhub's Expert (2.1k points)
edited Jan 18 by Sharda Chaudhary

In this article, we use the k-means clustering unsupervised learning algorithm with scikit-learn and Python to build an image compression application.

An image is a collection of pixels with intensity values between 0 and 255. The color of each pixel is a combination of RGB (red, green, blue) components, which requires 3 bytes per pixel.

In this application we reduce the number of colors required to show the image from many unique colors to 64, while preserving the overall appearance.

1 Answer

answered Jan 18 by Sharda Chaudhary Goeduhub's Expert (2.1k points)
edited Jan 18 by Sharda Chaudhary

Best answer

Image Compression using K-Means Clustering Unsupervised Machine Learning

Nowadays we have a huge amount of data in the form of images, and we produce many more images in day-to-day life. People upload large numbers of images every day to social media platforms such as Instagram, Facebook, and Twitter, and to cloud storage.

To deal with this amount of data, we want to store images as efficiently as possible, keeping image quality high while minimizing storage space and processing resources. This is the idea behind image compression.

Formally, image compression is the type of data compression applied to digital images to reduce their cost of storage or transmission.

What is k-means clustering?

First, we need to understand k-means clustering. K-means clustering is a technique that groups similar objects into the same cluster: it partitions the data into k clusters, each represented by its centroid (the mean of its members). For example, companies such as Amazon and Netflix group their customers by interests, search history, and so on, and then recommend products to each group. As another example, given a dataset that contains the locations of people from all over the world, we can create clusters that correspond to different states, so that each cluster contains only the people of a particular state.
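The grouping idea described above can be sketched with a few lines of NumPy. This is only a toy version of the usual assign-then-update loop, not the scikit-learn implementation used later in this answer. The function name `kmeans_sketch`, its parameters, and the sample points are made up for illustration.

```python
import numpy as np

def kmeans_sketch(points, k, n_iters=10, seed=0):
    """Toy k-means loop (hypothetical helper): returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: index of the nearest centroid for every point.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

# Two well-separated groups of 2D points (made-up data).
points = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                   [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
centroids, labels = kmeans_sketch(points, k=2)
print(labels)
```

The first three points should end up with one label and the last three with the other. Which group gets label 0 depends on the random start.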

Implementation of Image Compression using K-means

In this application we reduce the number of colors required to show the image from many unique colors to 64, while preserving the overall appearance.

An image consists of many different colors, so when compressing it with k-means clustering we create clusters for the major colors only and group all similar colors into the same cluster.

K-means clustering groups similar colors together into 'k' clusters (say k=64 in this case) of different colors (RGB values). Each cluster centroid is a three-dimensional color vector in RGB color space that represents all the colors in its cluster. These 'k' cluster centroids replace all the color vectors in their respective clusters. Thus, for each pixel we only need to store a label that tells which cluster the pixel belongs to. Additionally, we keep a record of the color vector of each cluster center. The original and reduced images are shown below-

[Figure: original and compressed images using k-means image compression]
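To see why storing one label per pixel plus a small color table saves space, here is a rough back-of-the-envelope calculation. It assumes the 176 x 287 image size reported later in this answer, labels packed at the minimum number of bits, and no further file-format compression, so the numbers are an idealized estimate rather than the size of any real file.

```python
import math

rows, cols, channels = 176, 287, 3   # shape of the example image in this answer
k = 64                               # number of colors kept

n_pixels = rows * cols
original_bytes = n_pixels * channels          # 1 byte per channel value
bits_per_label = math.ceil(math.log2(k))      # 6 bits are enough for 64 labels
label_bytes = math.ceil(n_pixels * bits_per_label / 8)
palette_bytes = k * channels                  # one RGB triple per centroid
compressed_bytes = label_bytes + palette_bytes

print(original_bytes)     # 151536
print(compressed_bytes)   # 38076
print(round(original_bytes / compressed_bytes, 2))
```

Under these assumptions the raw pixel data shrinks to roughly a quarter of its original size.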

Importing necessary libraries-

#data science libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

Reading and plotting the image with the matplotlib library-

img = plt.imread("1.jpg")
plt.imshow(img)
plt.axis('off')
plt.show()

Checking the data type of the loaded image (plt.imread returns a NumPy array)-

type(img)

Output- numpy.ndarray

Analyzing the properties of the image-

print(img.shape)
print(img.size)

Output- (176, 287, 3)
151536

The image shape contains the rows (176), columns (287), and channels (3) of the image. Our image has 3 channels because it is a color image. A grayscale image has only 1 channel.

The image size is the total number of values in the array (rows * columns * channels). The number of pixels is rows * columns.
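The difference between the array size and the pixel count can be checked on a stand-in array. The array below is all zeros and only borrows the shape printed above; it is not the actual image.

```python
import numpy as np

# Stand-in for the loaded image: same shape as in the output above.
img_demo = np.zeros((176, 287, 3), dtype=np.uint8)

rows, cols, channels = img_demo.shape
print(img_demo.size)   # 151536 = rows * cols * channels
print(rows * cols)     # 50512 pixels
```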

Normalizing the pixel values and reshaping the image-

#normalize pixel values to the range [0, 1]
img=img/255
#3D to 2D image by combining rows and columns (here w holds the rows, h the columns)
w,h,d=img.shape
image_array = img.reshape(w*h, d)
image_array.shape

Output- (50512, 3)

The image is converted from 3D to 2D by combining rows and columns, so the array now has only two dimensions: the number of pixels and the number of channels. To normalize the pixel values, divide them by 255, which is the maximum intensity value of each RGB channel. The normalization is done before the reshape so that the model is trained on the normalized values and the centroids can be displayed directly as colors in the range [0, 1].
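The same two steps can be tried on a tiny made-up image to see what the flattened array looks like and to confirm that the reshape is reversible. The 2 x 2 image below is invented for illustration.

```python
import numpy as np

# A made-up 2 x 2 RGB image with uint8 values.
tiny = np.array([[[255, 0, 0], [0, 255, 0]],
                 [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

tiny_norm = tiny / 255                            # floats in [0, 1]
rows, cols, channels = tiny_norm.shape
flat = tiny_norm.reshape(rows * cols, channels)   # one row per pixel

print(flat.shape)                                 # (4, 3)
restored = flat.reshape(rows, cols, channels)
print(np.array_equal(restored, tiny_norm))        # True
```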

Extracting a small subset of the image for training the model-

from sklearn.utils import shuffle
# fitting model on a small sub sample of the complete image
image_array_sample = shuffle(image_array,random_state=1)[:1000]
image_array_sample.size

Output- 3000

KMeans model creation and training-

#k=64 clusters, matching the 64 colors described above
kmeans=KMeans(n_clusters=64,random_state=1)
kmeans.fit(image_array_sample)

Predicting labels for the complete image-

#assign every pixel of the complete image to its nearest centroid
labels = kmeans.predict(image_array)

Printing centroids-

print(kmeans.cluster_centers_)
c=kmeans.cluster_centers_

Recreating the image from the centroids and labels-

#recreate the image from the centroids and the per-pixel labels
def recreate_image(c,labels,w,h,d):
  image=np.zeros((w,h,d))
  label_idx=0
  #replace each pixel with the centroid color of its cluster
  for i in range(w):
    for j in range(h):
      image[i][j]=c[labels[label_idx]]
      label_idx+=1
  return image
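The nested loops above can also be written with NumPy indexing, which is usually faster on large images. This is an alternative sketch, not the function used in this answer. The name `recreate_image_vectorized` and the small palette and labels are made up for illustration.

```python
import numpy as np

def recreate_image_vectorized(centroids, labels, w, h, d):
    # Indexing with the label array looks up the centroid for every pixel
    # at once; the flat (w*h, d) result is then reshaped back to (w, h, d).
    return centroids[labels].reshape(w, h, d)

# Made-up palette of 2 colors and labels for a 2 x 3 image.
centroids = np.array([[0.0, 0.0, 0.0], [1.0, 0.5, 0.25]])
labels = np.array([0, 1, 1, 0, 0, 1])
image = recreate_image_vectorized(centroids, labels, 2, 3, 3)

print(image.shape)                                    # (2, 3, 3)
# The recreated image never has more unique colors than centroids.
print(len(np.unique(image.reshape(-1, 3), axis=0)))   # 2
```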

Visualizing and comparing the original and compressed images-

plt.figure(1)
plt.axis('off')
plt.title("original")
plt.imshow(img)
plt.show()

plt.figure(2)
plt.axis('off')
plt.title("reduced")
plt.imshow(recreate_image(c,labels,w,h,d))
plt.show()

[Figure: original and reduced image by k-means]
