Fix Pandas concat – InvalidIndexError: Reindexing only valid with uniquely valued Index objects

The error “invalidindexerror: reindexing only valid with uniquely valued index objects” occurs when you try to contact two or more pandas dataframes like data = pd.concat([dataframe1,dataframe2], axis=1) if you remove duplicate contents that conflict or additional columns, it will still throw an invalid-index-error.

Example 1: lets try to join below two data-frames and see the result we will get.

import pandas as pd

firstDF = pd.DataFrame({'col1': [1,2,3],
                    'col2': [4,5,6] 
                  })

secondDF = pd.DataFrame({'col1': [7,8,9],
                    'col2': ['10','11','12'],
                    'col3': ['13','14','15'] 
                  })

# Concat and keep only cols from firstDF

thirdDF = pd.concat([firstDF, secondDF], ignore_index=True).reindex(firstDF.columns, axis='columns')

Below is the expected result

thirdDF
col1 col2
1    4
2    5
3    6
7    10
8    11
9    12

If we run the application this is the full stack-trace we get…

    /Applications/Anaconda/anaconda3/lib/python3.9/site-packages/pandas/core/indexes/index.py in get_indexer(self, target, method, limit, tolerance)
   2435    index = index.intersection(other)
   2436    return index
-> 2437    union = _union_indexes(indexes)
   2438    return _ensure_index(union)
   2439 

   2440 
   2441    if not self._index_as_unique:
-> 2442    raise InvalidIndexError(self._requires_unique_msg)
   2443 
   2444    if not self._should_compare(target) and not is_interval_dtype(self.dtype):

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Solution: If the data that you’re working with is correct, then the reason must be related to duplicate column names. The solution to the raise error is to concat first the columns with the duplicate entry before joining it to other data frames…

firstDF = pd.DataFrame({'col1': [1,2,3],
                    'col2': [4,5,6] 
                  }).rename(columns={'col2':'col1'})
print (firstDF)
   col1  col1 <- col1 is duplicated
0     1     4
1     2     5
2     3     6

secondDF = pd.DataFrame({'col1': [7,8,9],
                    'col2': ['10','11','12'],
                    'col3': ['13','14','15'] 
                  })

# Concat and keep only columns from firstDF

thirdDF = pd.concat([firstDF, secondDF], ignore_index=True).reindex(firstDF.columns, axis='columns')
print (thirdDF)

To see why the InvalidIndexError: Reindexing only valid with uniquely valued Index objects error was thrown, print them the following…

print (df1.columns[firstDF.columns.duplicated(keep=False)])
Index(['col1', 'col1'], dtype='object')

print (secondDF.columns[secondDF.columns.duplicated(keep=False)])
Index([], dtype='object')

Alternative fix is to deduplicate it…

print (pd.io.parsers.ParserBase({'names': firstDF.columns})._maybe_dedup_names(firstDF.columns))
['col1', 'col1.1']

Example 2: Suppose we still have below 100+ dataframes with each having a unique index and some with empty columns:

import pandas as pd

df1 = pd.DataFrame({'a':[3,7,2,0], 'b':[], 'c':[7,3,3,6]}, index=['a1','a2','a3','a4'])

df2 = pd.DataFrame({'a':[9,7,3,1], 'b':[5,67,39,92], 'c':[42,27,56,66]}, index=['b1','b2','b3','b4'])
#I have ignored the rest of the dataframes with empty columns

# let concate/join the two as
df = pd.concat([df1,df2])

If we run the above python application this is the error we get…

pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Solution: If the column ‘b’ in the ‘df1’ above isn’t empty, then the above application would have compiled successfully without any error. But because there is some dataframes with empty columns, we get an error and the solution is to reset or drop the empty columns first before calling the concat() method with df.reset_index(inplace=True, drop=True).

0 Shares

Fix Pandas concat – InvalidIndexError: Reindexing only valid with uniquely valued Index objects

About Justice Ankomah

Leave a Reply Cancel reply

Related Posts

Solve NameError: name ‘nltk’ is not defined in Python

Importerror: cannot import name ‘force_text’ from django.utils.encoding

Fix – PermissionError: [Errno 13] Permission denied in Python

About Justice Ankomah

Leave a Reply Cancel reply