How to handle response "None" from GeoPy client using Pandas apply?
I am working on a pandas DataFrame with a few hundred addresses, trying to add a new column with coordinates received from geopy.
Main question: how do I handle addresses that geopy cannot resolve, which give the result "None"?
I am quite new to Python and not sure how to move forward.
My code works, but it stops as soon as an address is not in the database and the response is "None".
Original line:
new_df["coords"] = (
    new_df["address"]
    .progress_apply(geolocator)
    .apply(lambda x: (x.latitude, x.longitude))
)
I am trying to work out something like the below:
new_df["coords"] = (
    new_df["address"]
    .progress_apply(geolocator)
    .apply(lambda x: np.nan if x == "" else (x.latitude, x.longitude))
)
But I keep receiving the error AttributeError: 'NoneType' object has no attribute 'latitude', and I cannot think of a way around it.
I am testing on 2 addresses at the moment:
- "Angyalföld - Béke-Tatai utcai lakótelep" - gives result None
- "Budapest, Bercsényi utca, Hungary" - works properly
Full code for testing is below; it works fine without the first address listed above:
from random import randint
import pandas as pd
from geopy.exc import GeocoderServiceError
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim
from tqdm import tqdm
tqdm.pandas() # progress bar
data = ["Budapest, Bercsényi utca, Hungary", "Angyalföld - Béke-Tatai utcai lakótelep"]
df = pd.DataFrame(data, columns=["address"])
user_agent = "geopy_user_{}".format(randint(10000, 99999))
app = Nominatim(user_agent=user_agent)
geolocator = RateLimiter(app.geocode, min_delay_seconds=1)
try:
    df["coords"] = (
        df["address"]
        .progress_apply(geolocator)
        .apply(
            lambda x: (x.latitude, x.longitude)
            if hasattr(x, "latitude") and hasattr(x, "longitude")
            else pd.NA
        )
    )
    print(df)
except GeocoderServiceError as e:
    print("Failed")
    print(e)  # not yet sure how to handle errors - please ignore or advise
Solution 1:[1]
One workaround is to prevent apply from failing by checking whether x has the attributes "latitude" and "longitude", using Python's built-in function hasattr, like this:
df["coords"] = (
    df["address"]
    .progress_apply(geolocator)
    .apply(
        lambda x: (x.latitude, x.longitude)
        if hasattr(x, "latitude") and hasattr(x, "longitude")
        else pd.NA
    )
)
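The attempt in the question fails because an unresolved address comes back as None, not as an empty string, so the test x == "" is False and x.latitude is still evaluated. As an alternative sketch, the same guard can be written as a small named function with an explicit None check. The helper name to_coords and its missing parameter are hypothetical, and the SimpleNamespace objects only stand in for geopy results so that the example runs without network access:

```python
from types import SimpleNamespace


def to_coords(location, missing=None):
    """Return (latitude, longitude), or `missing` if nothing was found.

    `to_coords` and `missing` are illustrative names, not part of geopy.
    """
    if location is None:  # geocode() yields None for unresolved addresses
        return missing
    return (location.latitude, location.longitude)


# Stand-ins for geocoder results: one resolved address, one unresolved.
# They only mimic the .latitude / .longitude attributes used above.
results = [SimpleNamespace(latitude=47.5, longitude=19.0), None]
coords = [to_coords(r) for r in results]
print(coords)
```

With the DataFrame from the question, this could presumably be applied as df["address"].progress_apply(geolocator).apply(to_coords, missing=pd.NA), since Series.apply forwards extra keyword arguments to the function.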
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Laurent |