Downloading Issues from a Public GitHub Repository
I want to download issues from JabRef, an open-source citation tool.
I looked up how to do this, and the usual approach seems to be curl, like this:
curl -i "https://api.github.com/repos/<repo-owner>/<repo-name>/issues" -u "<user-name>"
The problem is that I'm not the owner of the repository, so I'm not sure what to fill in for repo-owner. I also don't know who the owner is, since apparently GitHub hides that information, and even if I did, I probably wouldn't have the required permissions.
Is anyone allowed to download issues from public repositories, or do you have to be a collaborator? If anyone is allowed, what should I fill in for repo-owner?
Solution 1:[1]
In the case of the public repository https://github.com/JabRef/jabref, the repo-owner is simply JabRef.
curl -i "https://api.github.com/repos/JabRef/jabref/issues" -u "<user-name>"
This uses the GitHub API endpoint "List issues for a repository", which anyone can call for public repositories; you do not need to be the owner or a collaborator.
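For anyone who would rather script this than call curl by hand, here is a minimal Python sketch of the same request using only the standard library. The helper names (issues_url, drop_pull_requests, fetch_issues) are made up for illustration and are not part of any GitHub tooling. It assumes the endpoint's state, per_page and page query parameters, and relies on the fact that this endpoint also returns pull requests, which carry a "pull_request" key.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def issues_url(owner, repo, state="all", per_page=100, page=1):
    """Build the 'List issues for a repository' URL for one page of results."""
    query = urllib.parse.urlencode({"state": state, "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def drop_pull_requests(items):
    """Keep only real issues; pull requests carry a 'pull_request' key."""
    return [item for item in items if "pull_request" not in item]


def fetch_issues(owner, repo, token=None):
    """Fetch page after page until an empty one comes back (makes network requests)."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # optional for public repositories, but raises the rate limit
        headers["Authorization"] = f"Bearer {token}"
    issues, page = [], 1
    while True:
        request = urllib.request.Request(issues_url(owner, repo, page=page), headers=headers)
        with urllib.request.urlopen(request) as response:
            batch = json.load(response)
        if not batch:
            return issues
        issues.extend(drop_pull_requests(batch))
        page += 1
```

Calling fetch_issues("JabRef", "jabref") should then return a list of issue dictionaries. Without a token, the unauthenticated rate limit applies, so a large repository may need a Personal Access Token.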
Solution 2:[2]
Here is some Python code that does the trick for me.
import github  # pip install PyGithub


def download_github_issues_as_dict(repo_url, token):
    '''
    Download the issues of a repo, since GitHub doesn't make it super easy to do so.
    :param repo_url: the full URL of the repo (don't include the trailing "/").
    :param token: a GitHub Personal Access Token (create from GitHub itself)
    :return: a dictionary that can be easily json-ified with the relevant info from the issues.
    '''
    assert isinstance(repo_url, str) and not repo_url.endswith("/") and "/" in repo_url, "need nice repo_url"
    assert isinstance(token, str), "need nice token"
    g = github.Github(token)
    user_str, repo_str = repo_url.replace("https://github.com/", "").split("/")
    user = g.get_user(user_str)
    repo = user.get_repo(repo_str)
    issues = repo.get_issues(state="all")
    # iterating over the PaginatedList fetches every page, so there is no need to count pages of 30 by hand;
    # the issues endpoint also returns pull requests, which are skipped here
    real_issues = [iss for iss in issues if not iss.pull_request]
    rtn = {}
    for iss in real_issues:
        iss_dict = {"title": iss.title,
                    "body": iss.body,
                    "state": iss.state,
                    # one extra API request per issue to fetch its comments
                    "comments": [comment.body for comment in iss.get_comments()]}
        rtn[iss.number] = iss_dict
    return rtn
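Since the docstring says the result can be json-ified, here is a small sketch of that step. The issues_to_json helper and the sample data are made up for illustration. One thing to watch: the dictionary is keyed by integer issue numbers, and JSON object keys are always strings, so the keys have to be converted back when the file is read in again.

```python
import json


def issues_to_json(issues):
    """Serialize the {issue_number: {...}} dictionary to a JSON string."""
    # json.dumps turns the integer issue numbers into string keys
    return json.dumps(issues, indent=2, sort_keys=True)


# made-up sample in the same shape as the function's return value
sample = {12: {"title": "Example", "body": None, "state": "open", "comments": []}}
text = issues_to_json(sample)

# convert the string keys back to integers after loading
restored = {int(number): issue for number, issue in json.loads(text).items()}
```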
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | VonC |
Solution 2 | Pete Cacioppi |