StopIteration error while using the scholarly.pprint function

I am trying to extract the Google Scholar public profiles of certain professors.

I have a list of professors' names and I am using the scholarly package to scrape their public profile information. However, I am stuck on an error: I can only retrieve information for the first name in professor_list, not for the subsequent ones.

for name in professor_list:
    search_query = scholarly.search_author(name)
    scholarly.pprint(next(search_query))

Output:

{'affiliation': 'Deakin University',
 'citedby': 2528,
 'email_domain': '@deakin.edu.au',
 'filled': False,
 'interests': ['Lynn Batten'],
 'name': 'Lynn Batten',
 'scholar_id': 'Tmg0T9sAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=Tmg0T9sAAAAJ'}
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-242-5b96571c0972> in <module>
      1 for name in professor_list:
      2     search_query = scholarly.search_author(name)
----> 3     scholarly.pprint(next(search_query))

StopIteration:


Solution 1:[1]

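Although scholarly.pprint(next(search_query)) should normally work, next() raises StopIteration when the search returns no results. You can pass None as a default value to the built-in next() function, e.g. next(search_query, None), so that an empty search yields None instead of raising: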

from scholarly import scholarly

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

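for professor_name in professor_list:
    search_query = scholarly.search_author(name=professor_name)
    # next() returns the default (None) when the search yields no results
    author = next(search_query, None)
    if author is None:
        # Skip names with no match rather than passing None on to pprint
        print(f"No profile found for {professor_name!r}")
        continue
    scholarly.pprint(author)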
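
To see why the default matters, here is a minimal sketch that does not touch Google Scholar at all. fake_search is a hypothetical stand-in for scholarly.search_author; the only assumption is that the real function returns an iterator that may yield nothing.

```python
def fake_search(name):
    """Hypothetical stand-in for scholarly.search_author: yields matching profiles."""
    known = {"Lynn Batten": {"name": "Lynn Batten", "affiliation": "Deakin University"}}
    if name in known:
        yield known[name]


def first_result(name):
    """Return the first match, or None when the iterator is empty."""
    return next(fake_search(name), None)


found = first_result("Lynn Batten")
missing = first_result("No Such Person")

try:
    next(fake_search("No Such Person"))  # no default: an empty iterator raises
    raised = False
except StopIteration:
    raised = True

print(found)
print(missing)
print(raised)
```

The second call returns None quietly, while the bare next() call on the same empty iterator raises StopIteration, which is the behaviour seen in the question's traceback.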

More information about StopIteration by Martijn Pieters.

Full output:

{'affiliation': 'Professor of Vision Science, UC Berkeley',
 'citedby': 22559,
 'email_domain': '@berkeley.edu',
 'filled': False,
 'interests': ['vision science', 'psychology', 'human factors', 'neuroscience'],
 'name': 'Martin Banks',
 'scholar_id': 'Smr99uEAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=Smr99uEAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 3050,
 'email_domain': '@AdamLobel.com',
 'filled': False,
 'interests': ['Gaming', 'Emotion regulation'],
 'name': 'Adam Lobel',
 'scholar_id': '_xwYD2sAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=_xwYD2sAAAAJ'}
{'affiliation': '',
 'citedby': 873,
 'email_domain': '',
 'filled': False,
 'interests': ['Daniel Blizzard'],
 'name': 'Daniel Blizzard',
 'scholar_id': 'dk4LWEgAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=dk4LWEgAAAAJ'}
{'affiliation': 'Senior Data Scientist, Blizzard Entertainment',
 'citedby': 656,
 'email_domain': '@cs.cornell.edu',
 'filled': False,
 'interests': ['Machine Learning', 'Data Mining', 'Artificial Intelligence'],
 'name': 'Shuo Chen',
 'scholar_id': 'OBf4YnkAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=OBf4YnkAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 620,
 'email_domain': '@usask.ca',
 'filled': False,
 'interests': ['Human-computer interaction',
               'User Experience',
               'Player Experience',
               'User Research',
               'Games'],
 'name': 'Ian Livingston',
 'scholar_id': 'xBHVqNIAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=xBHVqNIAAAAJ'}
{'affiliation': 'Blizzard Entertainment',
 'citedby': 502,
 'email_domain': '@blizzard.com',
 'filled': False,
 'interests': ['Game', 'Machine Learning', 'Data Science', 'Bioinformatics'],
 'name': 'Minli Xu',
 'scholar_id': 'QST5iogAAAAJ',
 'source': 'SEARCH_AUTHOR_SNIPPETS',
 'url_picture': 'https://scholar.google.com/citations?view_op=medium_photo&user=QST5iogAAAAJ'}

Alternatively, you can iterate over the results of scholarly.search_author() with a for loop. A for loop simply ends when the generator is exhausted, so no StopIteration reaches your code:

from scholarly import scholarly
import json

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

professor_results = []

for professor_name in professor_list:
    for professor_result in scholarly.search_author(name=professor_name):
        professor_results.append({
            "name": professor_result.get("name"),
            "affiliations": professor_result.get("affiliation"),
            "email_domain": professor_result.get("email_domain"),
            "interests": professor_result.get("interests"),
            "citedby": professor_result.get("citedby")
        })

print(json.dumps(professor_results, indent=2, ensure_ascii=False))

Full output:

[
  {
    "name": "Martin Banks",
    "affiliations": "Professor of Vision Science, UC Berkeley",
    "email_domain": "@berkeley.edu",
    "interests": [
      "vision science",
      "psychology",
      "human factors",
      "neuroscience"
    ],
    "citedby": 22559
  },
  {
    "name": "Adam Lobel",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@AdamLobel.com",
    "interests": [
      "Gaming",
      "Emotion regulation"
    ],
    "citedby": 3050
  },
  {
    "name": "Daniel Blizzard",
    "affiliations": "",
    "email_domain": "",
    "interests": [
      "Daniel Blizzard"
    ],
    "citedby": 873
  },
  {
    "name": "Shuo Chen",
    "affiliations": "Senior Data Scientist, Blizzard Entertainment",
    "email_domain": "@cs.cornell.edu",
    "interests": [
      "Machine Learning",
      "Data Mining",
      "Artificial Intelligence"
    ],
    "citedby": 656
  },
  {
    "name": "Ian Livingston",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@usask.ca",
    "interests": [
      "Human-computer interaction",
      "User Experience",
      "Player Experience",
      "User Research",
      "Games"
    ],
    "citedby": 620
  },
  {
    "name": "Minli Xu",
    "affiliations": "Blizzard Entertainment",
    "email_domain": "@blizzard.com",
    "interests": [
      "Game",
      "Machine Learning",
      "Data Science",
      "Bioinformatics"
    ],
    "citedby": 502
  }
]
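
Note that the nested loop collects every profile a query returns, not only the first one. If only the top match per name is wanted, itertools.islice can cap the inner loop. The sketch below uses a hypothetical fake_search generator in place of scholarly.search_author so that it runs offline; the sample records are made up.

```python
from itertools import islice


def fake_search(name):
    """Hypothetical stand-in for scholarly.search_author (made-up records)."""
    catalogue = {
        "A. Author": [{"name": "A. Author", "citedby": 10},
                      {"name": "A. Author Jr.", "citedby": 3}],
        "B. Nobody": [],
    }
    yield from catalogue.get(name, [])


def top_matches(names, limit=1):
    """Collect at most `limit` results per name; names with no hits add nothing."""
    results = []
    for name in names:
        # islice stops after `limit` items and copes with an empty generator
        for profile in islice(fake_search(name), limit):
            results.append({"query": name,
                            "name": profile.get("name"),
                            "citedby": profile.get("citedby")})
    return results


rows = top_matches(["A. Author", "B. Nobody"])
print(rows)
```

A name without any match contributes no row at all, so the caller can compare the "query" values against the input list to find which names came back empty.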

Another alternative is the Google Scholar Profiles API from SerpApi. It is a paid API with a free plan; it handles scaling and bypasses blocks from search engines via dedicated proxies and CAPTCHA-solving services. Check out the playground.

Example code to integrate:

from serpapi import GoogleScholarSearch
import json

professor_list = ["Marty Banks, Berkeley",
                  "Adam Lobel, Blizzard",
                  "Daniel Blizzard, Blizzard",
                  "Shuo Chen, Blizzard",
                  "Ian Livingston, Blizzard",
                  "Minli Xu, Blizzard"]

for professor_name in professor_list:
    params = {
        "api_key": "Your SerpApi API key",
        "engine": "google_scholar_profiles",
        "hl": "en",
        "mauthors": professor_name
    }

    search = GoogleScholarSearch(params)
    results = search.get_dict()

    # .get() avoids a KeyError when a query returns no "profiles" key
    for result in results.get("profiles", []):
        print(json.dumps(result, indent=2))

Full output:

{
  "name": "Martin Banks",
  "link": "https://scholar.google.com/citations?hl=en&user=Smr99uEAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=Smr99uEAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "Smr99uEAAAAJ",
  "affiliations": "Professor of Vision Science, UC Berkeley",
  "email": "Verified email at berkeley.edu",
  "cited_by": 22559,
  "interests": [
    {
      "title": "vision science",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Avision_science",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:vision_science"
    },
    {
      "title": "psychology",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Apsychology",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:psychology"
    },
    {
      "title": "human factors",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Ahuman_factors",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:human_factors"
    },
    {
      "title": "neuroscience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aneuroscience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:neuroscience"
    }
  ],
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Adam Lobel",
  "link": "https://scholar.google.com/citations?hl=en&user=_xwYD2sAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=_xwYD2sAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "_xwYD2sAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at AdamLobel.com",
  "cited_by": 3050,
  "interests": [
    {
      "title": "Gaming",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agaming",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:gaming"
    },
    {
      "title": "Emotion regulation",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aemotion_regulation",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:emotion_regulation"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=_xwYD2sAAAAJ&citpid=3"
}
{
  "name": "Daniel Blizzard",
  "link": "https://scholar.google.com/citations?hl=en&user=dk4LWEgAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=dk4LWEgAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "dk4LWEgAAAAJ",
  "affiliations": "",
  "cited_by": 873,
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Shuo Chen",
  "link": "https://scholar.google.com/citations?hl=en&user=OBf4YnkAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=OBf4YnkAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "OBf4YnkAAAAJ",
  "affiliations": "Senior Data Scientist, Blizzard Entertainment",
  "email": "Verified email at cs.cornell.edu",
  "cited_by": 656,
  "interests": [
    {
      "title": "Machine Learning",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amachine_learning",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:machine_learning"
    },
    {
      "title": "Data Mining",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Adata_mining",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:data_mining"
    },
    {
      "title": "Artificial Intelligence",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aartificial_intelligence",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:artificial_intelligence"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=OBf4YnkAAAAJ&citpid=1"
}
{
  "name": "Ian Livingston",
  "link": "https://scholar.google.com/citations?hl=en&user=xBHVqNIAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=xBHVqNIAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "xBHVqNIAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at usask.ca",
  "cited_by": 620,
  "interests": [
    {
      "title": "Human-computer interaction",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Ahuman_computer_interaction",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:human_computer_interaction"
    },
    {
      "title": "User Experience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Auser_experience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:user_experience"
    },
    {
      "title": "Player Experience",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Aplayer_experience",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:player_experience"
    },
    {
      "title": "User Research",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Auser_research",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:user_research"
    },
    {
      "title": "Games",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agames",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:games"
    }
  ],
  "thumbnail": "https://scholar.google.com/citations/images/avatar_scholar_56.png"
}
{
  "name": "Minli Xu",
  "link": "https://scholar.google.com/citations?hl=en&user=QST5iogAAAAJ",
  "serpapi_link": "https://serpapi.com/search.json?author_id=QST5iogAAAAJ&engine=google_scholar_author&hl=en",
  "author_id": "QST5iogAAAAJ",
  "affiliations": "Blizzard Entertainment",
  "email": "Verified email at blizzard.com",
  "cited_by": 502,
  "interests": [
    {
      "title": "Game",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Agame",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:game"
    },
    {
      "title": "Machine Learning",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Amachine_learning",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:machine_learning"
    },
    {
      "title": "Data Science",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Adata_science",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:data_science"
    },
    {
      "title": "Bioinformatics",
      "serpapi_link": "https://serpapi.com/search.json?engine=google_scholar_profiles&hl=en&mauthors=label%3Abioinformatics",
      "link": "https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:bioinformatics"
    }
  ],
  "thumbnail": "https://scholar.googleusercontent.com/citations?view_op=small_photo&user=QST5iogAAAAJ&citpid=14"
}
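
As the entry for Daniel Blizzard shows, some profiles lack the "email" and "interests" keys. A minimal sketch of reading such records defensively follows; summarize_profile is a hypothetical helper, and the two sample dictionaries are trimmed copies of entries from the output above.

```python
def summarize_profile(profile):
    """Flatten one profile dict, tolerating missing 'email' and 'interests' keys."""
    interests = [item.get("title", "") for item in profile.get("interests", [])]
    return {
        "name": profile.get("name", ""),
        "affiliations": profile.get("affiliations", ""),
        "email": profile.get("email"),  # None when the key is absent
        "cited_by": profile.get("cited_by", 0),
        "interests": ", ".join(interests),
    }


sample_profiles = [
    {"name": "Adam Lobel", "affiliations": "Blizzard Entertainment",
     "email": "Verified email at AdamLobel.com", "cited_by": 3050,
     "interests": [{"title": "Gaming"}, {"title": "Emotion regulation"}]},
    {"name": "Daniel Blizzard", "affiliations": "", "cited_by": 873},
]

summaries = [summarize_profile(p) for p in sample_profiles]
for row in summaries:
    print(row)
```

Indexing with profile["email"] would raise KeyError on the second record, whereas dict.get() lets every profile produce a row with the same set of columns.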

Disclaimer: I work for SerpApi.

Solution 2:[2]

When one uses the following code:

search_query = scholarly.search_pubs('A Bayesian Analysis of the Style Goods Inventory Problem')
scholarly.pprint(next(search_query))

Is there a way of saving the output of the above code as a dataframe?
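
One possible approach, sketched under the assumption that each result is a (possibly nested) dictionary: flatten it into a single-level dictionary and collect those as rows. The flatten helper and the sample record are hypothetical and made up for illustration; they are not real scholarly output.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into one level, joining keys with dots."""
    flat = {}
    for key, value in record.items():
        full_key = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{full_key}."))
        elif isinstance(value, list):
            # Join list items into one string so each cell holds a scalar
            flat[full_key] = "; ".join(str(item) for item in value)
        else:
            flat[full_key] = value
    return flat


# Made-up record shaped like a nested search result (an assumption, not real output).
sample = {
    "bib": {"title": "An Example Title",
            "author": ["A Author", "B Author"],
            "pub_year": "1999"},
    "num_citations": 12,
    "filled": False,
}

rows = [flatten(sample)]
print(rows[0])
```

With pandas installed, pandas.DataFrame(rows) turns such a list of flat dictionaries into a table with one row per record and one column per flattened key.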

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Dmitriy Zub
Solution 2 Addy