Pandas - No null values, but pd.to_datetime gives "Reindexing only valid with uniquely valued Index objects"?
Sample Data
+---------+------------------------+
| | date |
+---------+------------------------+
| 0 | 2020-12-31 00:00:00 |
| 1 | 2020-12-31 00:00:00 |
| 2 | 2020-12-31 00:00:00 |
| 3 | 2020-06-11 00:00:00 |
| 4 | 2020-03-10 00:00:00 |
| 172588 | 2020-03-05 00:00:00 |
| 172589 | 2020-03-05 00:00:00 |
| 172590 | 2020-01-27 00:00:00 |
| 172591 | 2020-01-20 00:00:00 |
| 172592 | 2020-01-07 00:00:00 |
+---------+------------------------+
Error
df["date"] = pd.to_datetime(df["date"], errors="coerce").dt.strftime("%Y-%m-%d")
This produces the following error:
~\miniconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
799 cache_array = _maybe_cache(arg, format, cache, convert_listlike)
800 if not cache_array.empty:
--> 801 result = arg.map(cache_array)
802 else:
803 values = convert_listlike(arg._values, format)
~\miniconda3\lib\site-packages\pandas\core\series.py in map(self, arg, na_action)
3968 dtype: object
3969 """
-> 3970 new_values = super()._map_values(arg, na_action=na_action)
3971 return self._constructor(new_values, index=self.index).__finalize__(
3972 self, method="map"
~\miniconda3\lib\site-packages\pandas\core\base.py in _map_values(self, mapper, na_action)
1129 values = self._values
1130
-> 1131 indexer = mapper.index.get_indexer(values)
1132 new_values = algorithms.take_1d(mapper._values, indexer)
1133
~\miniconda3\lib\site-packages\pandas\core\indexes\base.py in get_indexer(self, target, method, limit, tolerance)
2984
2985 if not self.is_unique:
-> 2986 raise InvalidIndexError(
2987 "Reindexing only valid with uniquely valued Index objects"
2988 )
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
What I have tried
pd.to_datetime producing "Reindexing only valid with uniquely valued Index objects"
Resolving Reindexing only valid with uniquely valued Index objects
Why this is different
- There are definitely duplicates in the date column, but it is a regular column, not the index.
- There are no null/NA values in this column, so making all NaT/null/NA values unique, as the first link suggests, does not solve the problem.
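The two claims above can be checked directly. This is a minimal sketch on a hypothetical stand-in for the question's DataFrame (the three rows are illustrative, not the real 172,593-row data); it only uses `isna`, `duplicated`, and `Index.is_unique`.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame (assumption: values are strings).
df = pd.DataFrame(
    {"date": ["2020-12-31 00:00:00", "2020-12-31 00:00:00", "2020-06-11 00:00:00"]}
)

null_count = int(df["date"].isna().sum())              # missing values in the column
has_duplicates = bool(df["date"].duplicated().any())   # repeated values in the column
index_is_unique = bool(df.index.is_unique)             # uniqueness of the DataFrame's own index

print(null_count, has_duplicates, index_is_unique)
```

If the column has no nulls and `df.index` is unique, the non-unique index named in the error is probably not the DataFrame's own index.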
Solution 1:[1]
I had the same problem with pandas 1.3.0. My DataFrame had a string column that needed to be converted with to_datetime, and after the conversion the number of values decreased. I couldn't explain the null values in the converted column, because the corresponding entries in the string column were not null, None, or NaT. The problem was solved when I added cache=False to the to_datetime call.
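A minimal sketch of that fix, applied to the line from the question. In the traceback, the uniqueness check fails on `mapper.index`, which is the index of the internal `cache_array` passed to `arg.map`, not the index of `df`. Passing `cache=False` skips that lookup. The sample rows below are hypothetical and will not reproduce the error by themselves; they only show the call.

```python
import pandas as pd

# Hypothetical sample mirroring the question: duplicate date strings in a regular column.
df = pd.DataFrame(
    {
        "date": [
            "2020-12-31 00:00:00",
            "2020-12-31 00:00:00",
            "2020-06-11 00:00:00",
            "2020-03-10 00:00:00",
        ]
    }
)

# cache=False tells to_datetime not to build a cache of unique parsed dates,
# so the values are converted directly rather than mapped through that cache.
parsed = pd.to_datetime(df["date"], errors="coerce", cache=False)
df["date"] = parsed.dt.strftime("%Y-%m-%d")

print(df["date"].tolist())
```

Disabling the cache may make parsing slower on large columns with many repeated dates, since each value is converted individually.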
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mara875 |