'How to check if extended entity is present or not in tweepy response
I am able to fetch different tweet parameters from tweet.
keyword = tweepy.Cursor(api.search, val,tweet_mode='extended',lang='en').items(2)
tweetdone = 0
all_tweet = []
for tweet in keyword:
tweet_record = {}
tweet_record['tweet.text'] = tweet.full_text
tweet_record['tweet.user.name'] = tweet.user.name
tweet_record['tweet.user.location'] = tweet.user.location
tweet_record['tweet.user.verified'] = tweet.user.verified
tweet_record['tweet.lang'] = tweet.lang
tweet_record['tweet.created_at'] = tweet.created_at
tweet_record['tweet.user'] = tweet.user
tweet_record['tweet.retweet_count'] = tweet.retweet_count
tweet_record['tweet.favorite_count'] = tweet.favorite_count
I want to parse media
objects from the tweet, but extended_entities
in which media_url
is present is not available in all tweets.
so if I try to fetch it like this:
tweet_record['media_url'] = tweet.extended_entities.media_url
It errors out because extended_entities
may not be present in some tweets.
How to deal this issue and fetch media content correctly?
Solution 1:[1]
You have a couple of options here, you can check whether the key exists, or use some try/excepts.
Check whether key exists:
You can do this because tweepy returns a status object, which acts similarly to a json file, or python dictionary, and thus you essentially have a key:value pair. You should be able to use (going by your above code)
if 'extended_entities' in tweet:
tweet_record['media_url'] = tweet.extended_entities.media_url
of course, the reverse is also possible
if 'extended_entities' not in tweet:
#whatever you want to do
This could lead to problems though, what if the extended_entities exists, but for some reason media_url doesn't? And what if you want to get even more from within that (there isn't for a status object, but hey, I'm just trying to future proof here!) You'll have to do long, or multi nested if statements, which won't look the best
if 'extended_entities' in tweet:
if 'media_url' in tweet['extended_entities']
#etc
so it might be easier to just throw it in a try except...
try:
tweet_record['media_url'] = tweet.extended_entities.media_url
except AttributeError:
#etc
this means the program won't error when particular elements aren't found. AttributeError is for accessing an invalid attribute of an object. You of course may want to re-order this for readability. Keep in mind though, that while doing this is pythonic it can be a bit hard to read if used too often in my opinion.
I referred to this question when looking up things for this answer. Gives some good ideas for this sort of thing if you need further help.
Hope that helps.
Solution 2:[2]
Also, a good option is to use hasattr(Object, name)
within an if-statement:
if hasattr(tweet, "extended_entities"):
\# do whatever
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Generic Snake |
Solution 2 | Paul Brennan |