Answered by : elisabeth-engering
import pandas as pd
# Drop fully duplicated rows, keeping the first occurrence of each
df = df.drop_duplicates()
# Drop rows that repeat a value already seen in one column
df = df.drop_duplicates(subset="column")
# Drop rows that repeat the same combination of values in two columns
df = df.drop_duplicates(subset=["column", "column2"])
# Display DataFrame
print(df)
Source : https://www.datacamp.com/cheat-sheet/pandas-cheat-sheet-for-data-science-in-python | Last Update : Fri, 06 May 22
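A minimal runnable sketch of the three calls above. The sample data and the column names `name` and `city` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data (an assumption, not from the original answer).
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Ann"],
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
})

# Compare whole rows: row 2 repeats row 0 and is dropped.
all_cols = df.drop_duplicates()

# Compare only "name": every later "Ann" row is dropped.
by_name = df.drop_duplicates(subset="name")

# Compare the ("name", "city") pair: only the repeated Ann/Oslo row is dropped.
by_pair = df.drop_duplicates(subset=["name", "city"])

print(all_cols)
print(by_name)
print(by_pair)
```

Note that `drop_duplicates` keeps the original index labels of the surviving rows; call `reset_index(drop=True)` afterwards if a fresh 0..n-1 index is wanted.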
Answered by : tough-tarantula-yhbyghmtl7lq
df[~df.index.duplicated()]
Source : https://stackoverflow.com/questions/22918212/fastest-way-to-drop-duplicated-index-in-a-pandas-dataframe | Last Update : Mon, 24 Aug 20
Answered by : average-ant-d6hv63y08ny3
df3 = df3[~df3.index.duplicated(keep='first')]
Source : https://stackoverflow.com/questions/13035764/remove-rows-with-duplicate-indices-pandas-dataframe-and-timeseries | Last Update : Thu, 19 Nov 20
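A small sketch of how the `keep` argument of `Index.duplicated` changes which row survives. The frame `df3` below, its `value` column, and its index labels are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical frame whose index label "a" appears twice.
df3 = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["a", "b", "a", "c"])

# duplicated() marks the repeats as True; ~ inverts the mask so they are removed.
first = df3[~df3.index.duplicated(keep="first")]  # keeps the first "a" row
last = df3[~df3.index.duplicated(keep="last")]    # keeps the last "a" row

print(first)
print(last)
```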
Answered by : jjsseecc
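import pandas as pd
# Remove rows with a duplicated index, keeping the first occurrence
# (the ~ inverts the mask; without it only the duplicates would be kept)
df = df[~df.index.duplicated(keep='first')]
# Other ways to remove duplicates: by all columns, one column, or several columns
df = df.drop_duplicates()
df = df.drop_duplicates(subset="column")
df = df.drop_duplicates(subset=["column", "column2"])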
Source : https://stackoverflow.com/questions/13035764/remove-pandas-rows-with-duplicate-indices | Last Update : Sun, 25 Sep 22
Answered by : tough-tarantula-yhbyghmtl7lq
import pandas as pd
idx = pd.Index(['lama', 'cow', 'lama', 'beetle', 'lama', 'hippo'])
idx.drop_duplicates(keep='first')
# Index(['lama', 'cow', 'beetle', 'hippo'], dtype='object')
idx.drop_duplicates(keep='last')
# Index(['cow', 'beetle', 'lama', 'hippo'], dtype='object')
idx.drop_duplicates(keep=False)  # the boolean False, not the string 'False'
# Index(['cow', 'beetle', 'hippo'], dtype='object')
Source : https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Index.drop_duplicates.html | Last Update : Mon, 24 Aug 20
Answered by : doubtful-dormouse-uimhy2ojhi0j
df3 = df3[~df3.index.duplicated(keep='first')]
Source : https://stackoverflow.com/questions/13035764/remove-pandas-rows-with-duplicate-indices | Last Update : Thu, 06 Apr 23
Answered by : attractive-alpaca-9p5g9fusgcy5
df3 = df3[~df3.index.duplicated(keep='first')]
Source : https://stackoverflow.com/questions/13035764/remove-pandas-rows-with-duplicate-indices | Last Update : Fri, 04 Feb 22
Answered by : perfect-penguin-a1nt3r09ybl6
# Drop columns whose label repeats an earlier column label
df = df.loc[:, ~df.columns.duplicated()].copy()
Source : https://stackoverflow.com/questions/14984119/python-pandas-remove-duplicate-columns | Last Update : Mon, 10 Oct 22
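A short sketch of the duplicate-column case, assuming a hypothetical frame in which the label `a` appears twice. `columns.duplicated()` compares column labels only, not column contents, so the second `a` is dropped even though its values differ.

```python
import pandas as pd

# Hypothetical frame with a repeated column label "a".
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# Keep the first column for each label; .copy() avoids working on a view of df.
deduped = df.loc[:, ~df.columns.duplicated()].copy()

print(deduped)
```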
Answered by : brave-bat-gssz4vb0pdep
data.loc[data['email'].duplicated(keep=False), :]
Source : https://openclassrooms.com/fr/courses/7410486-nettoyez-et-analysez-votre-jeu-de-donnees/7451506-nettoyez-vos-donnees-avec-python | Last Update : Sat, 18 Feb 23
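With `keep=False`, `duplicated` marks every occurrence of a repeated value, which makes it useful for inspecting duplicates before deleting anything. A minimal sketch, assuming a hypothetical `data` frame with an `email` column and an `id` column:

```python
import pandas as pd

# Hypothetical data: the first and third rows share an email address.
data = pd.DataFrame({
    "email": ["a@example.org", "b@example.org", "a@example.org"],
    "id": [1, 2, 3],
})

# Select all rows whose email occurs more than once (both copies, not just the repeat).
dupes = data.loc[data["email"].duplicated(keep=False), :]

print(dupes)
```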
Answered by : lazy-lark-jenyffc7wa94
# drop_duplicates returns a new DataFrame, so assign the result back
df = df.drop_duplicates()
Source : | Last Update : Mon, 30 May 22