How to remove duplicate values using pandas

In this tutorial, you will learn how to remove duplicate values with pandas using the built-in DataFrame method drop_duplicates.

Syntax: DataFrame.drop_duplicates(subset=None, keep='first', inplace=False)

Parameters :

subset : column label or sequence of labels, optional

Only consider these columns when identifying duplicates. By default, all columns are used.

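keep : {'first', 'last', False}, default 'first'

first : Drop duplicates except for the first occurrence.
last : Drop duplicates except for the last occurrence.
False : Drop every row that has a duplicate, keeping none of the occurrences.

inplace : boolean, default False

If True, modify the DataFrame in place and return None. If False, return a new DataFrame with the duplicates removed.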
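To make the subset and keep parameters concrete, here is a minimal sketch on a two-column DataFrame. The column names (name, city) and the sample rows are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical sample data: the name 'david' appears three times,
# and rows 1 and 3 are identical in every column.
df = pd.DataFrame({
    "name": ["ravi", "david", "raju", "david", "david"],
    "city": ["pune", "delhi", "goa", "delhi", "agra"],
})

# Default: compare whole rows and keep the first occurrence.
by_row = df.drop_duplicates()

# subset: compare only the 'name' column.
by_name_first = df.drop_duplicates(subset="name")
by_name_last = df.drop_duplicates(subset=["name"], keep="last")

# keep=False: drop every row whose 'name' appears more than once.
unique_names_only = df.drop_duplicates(subset="name", keep=False)

print(by_row.index.tolist())             # [0, 1, 2, 4]
print(by_name_first.index.tolist())      # [0, 1, 2]
print(by_name_last.index.tolist())       # [0, 2, 4]
print(unique_names_only.index.tolist())  # [0, 2]
```

Printing the surviving index labels rather than the full frames makes it easy to see which rows each call kept.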

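import pandas as pd

# Two single-column DataFrames, each containing one repeated value
df1 = pd.DataFrame(['ravi', 'david', 'raju', 'david', 'test', 'check'])
df2 = pd.DataFrame(['1', '2', '3', '4', '5', '2'])

# duplicated(keep=False) marks every occurrence of a repeated value
print("Printing duplicate values")
print(df1[df1.duplicated(keep=False)])
print(df2[df2.duplicated(keep=False)])

# Drop duplicates, keeping the first occurrence (the default)
df1 = df1.drop_duplicates()
df2 = df2.drop_duplicates()
print(df1)
print(df2)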

Output:

Printing duplicate values
       0
1  david
3  david
   0
1  2
5  2
       0
0   ravi
1  david
2   raju
4   test
5  check
   0
0  1
1  2
2  3
3  4
4  5
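Note that the de-duplicated output keeps the original row labels, so the first frame jumps from 2 to 4. The sketch below, which reuses the same names list, shows one way to renumber the rows with reset_index and what inplace=True returns. The variable names (deduped, renumbered, result) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame(["ravi", "david", "raju", "david", "test", "check"])

# Reassigning the result keeps the original labels, so label 3 is missing.
deduped = df.drop_duplicates()
print(deduped.index.tolist())     # [0, 1, 2, 4, 5]

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels.
renumbered = deduped.reset_index(drop=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3, 4]

# inplace=True changes df itself and returns None, so do not assign the result.
result = df.drop_duplicates(inplace=True)
print(result)                     # None
print(len(df))                    # 5
```

Recent pandas versions (1.0 and later) also accept ignore_index=True in drop_duplicates, which should give the same renumbering in a single call.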

 

admin
