GitHub collector

Service description

GitHub is a version control platform that lets you track changes to your codebase, flag bugs and issues for follow-up, and manage your product's build process. For engineers, it simplifies collaboration: team members can work on files and merge their changes into the project's master branch. The GitHub API provides data about a hosted code repository, ranging from commits to pull requests and comments.

The Devo GitHub collector enables customers to retrieve data from the GitHub API into Devo to query, correlate, analyze, and visualize it, enabling Enterprise IT and Cybersecurity teams to make the most impactful decisions at petabyte scale.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

Not allowed

Running environments

Collector Server, On Premise

Populated Devo events

Standard

Data source description

The GitHub API provides information about a code repository hosted on GitHub, from commits to pull requests and comments.

Listed below are the resources that can be sent to Devo:

Data source

Description

GitHub API endpoint

Collector service name

Devo table

Available from release

Collaborators

Information about the repository collaborators.

/repos/{owner}/{repo}/collaborators

More info

repository

Resource: collaborators

vcs.github.repository.collaborators

v1.0.0

Commits

Commits made in the repository

/repos/{owner}/{repo}/commits/{branchname}

More info

repository

Resource: commits

vcs.github.repository.commits

v1.0.0

Forks

Forks created in the repository

/repos/{owner}/{repo}/forks

More info

repository

Resource: forks

vcs.github.repository.forks

v1.0.0

Events

Information about the different events such as resource creations or deletions

/repos/{owner}/{repo}/events

More info

repository

Resource: events

vcs.github.repository.events

v1.0.0

Issue comments

Comments made on the repository's issues

/repos/{owner}/{repo}/comments

More info

repository

Resource: issue_comments

vcs.github.repository.issue_comments

v1.0.0

Subscribers

Information about the users subscribed to the repository

/repos/{owner}/{repo}/subscribers

More info

repository

Resource: subscribers

vcs.github.repository.subscribers

v1.0.0

Pull requests

Pull requests made in the repository

/repos/{owner}/{repo}/pulls

More info

repository

Resource: pull_requests

vcs.github.repository.pull_requests

v1.0.0

Subscriptions

Repositories you are subscribed to

/repos/{owner}/{repo}/subscription

More info

repository

Resource: subscriptions

vcs.github.repository.subscriptions

v1.0.0

Releases

Information about releases made in the repository

/repos/{owner}/{repo}/releases

More info

repository

Resource: releases

vcs.github.repository.releases

v1.0.0

Stargazers

Information about the users who "star" repositories, marking them as favorites

/repos/{owner}/{repo}/stargazers

More info

repository

Resource: stargazers

vcs.github.repository.stargazers

v1.0.0

Audit

Organization audit events

/orgs/{org}/audit-log

More info

organization

Resource: audit

vcs.github.organization.audit

v1.0.0

SSO Authorizations

Single sign-on authorization

/orgs/{org}/credential-authorizations

More info

organization

Resource: sso_authorizations

vcs.github.organization.sso_authorizations

v1.0.0

Webhooks

Webhooks created in the organization

/orgs/{org}/hooks

More info

organization

Resource: webhooks

vcs.github.organization.webhooks

v1.0.0
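As a rough illustration of how the table above fits together, the sketch below maps each collector service and resource name to its GitHub API endpoint and Devo table. The endpoint templates and table names are copied from the table; the `ENDPOINTS` dictionary and the `resolve` helper are hypothetical names used only for this example and are not part of the collector.

```python
# Endpoint templates copied from the data source table, keyed by
# collector service name and resource name.
ENDPOINTS = {
    "repository": {
        "collaborators": "/repos/{owner}/{repo}/collaborators",
        "commits": "/repos/{owner}/{repo}/commits/{branchname}",
        "forks": "/repos/{owner}/{repo}/forks",
        "events": "/repos/{owner}/{repo}/events",
        "issue_comments": "/repos/{owner}/{repo}/comments",
        "subscribers": "/repos/{owner}/{repo}/subscribers",
        "pull_requests": "/repos/{owner}/{repo}/pulls",
        "subscriptions": "/repos/{owner}/{repo}/subscription",
        "releases": "/repos/{owner}/{repo}/releases",
        "stargazers": "/repos/{owner}/{repo}/stargazers",
    },
    "organization": {
        "audit": "/orgs/{org}/audit-log",
        "sso_authorizations": "/orgs/{org}/credential-authorizations",
        "webhooks": "/orgs/{org}/hooks",
    },
}


def resolve(service, resource, **params):
    """Return (API path, Devo table) for a service/resource pair.

    Hypothetical helper: raises KeyError for a service or resource
    that is not listed in the table, or for a missing path parameter.
    """
    template = ENDPOINTS[service][resource]
    # In the table, every Devo table follows vcs.github.<service>.<resource>.
    table = f"vcs.github.{service}.{resource}"
    return template.format(**params), table


if __name__ == "__main__":
    # "acme" and "api" are placeholder owner/repository names.
    print(resolve("repository", "pull_requests", owner="acme", repo="api"))
    print(resolve("organization", "audit", org="acme"))
```

Deriving the table name from the service and resource works here only because every row in the table follows the same naming pattern.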

How to enable the collection in the vendor

To retrieve the data, you need to create an access token that the collector will use to authenticate against GitHub.

Steps

Screenshots

1

Log in to your GitHub account. This must be the main account that controls your GitHub organization and repositories.

2

Go to Settings → Developer settings → Personal access tokens.

3

Click Generate new token (yellow box in the screenshot).

 

4

Tick the following checkboxes:

  • repo

  • repo:status

  • repo_deployment

  • public_repo

  • repo:invite

  • security_events

  • read:packages

  • read:org

  • read:public_key

  • read:repo_hook

  • notifications

  • user

  • read:user

  • user:email

  • user:follow

  • delete_repo

  • read:discussion

  • read:enterprise

  • read:gpg_key

 

5

Click Generate token.

 

If your account uses SAML authentication, you need to authorize the token after generating it. See the section on authorization with SAML for details.
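Before handing the token to the collector, it can be useful to confirm that GitHub accepts it. The following is a minimal sketch, not the collector's own authentication code: it builds a request to the GitHub REST API `/user` endpoint using the standard `Authorization` header. It assumes the public `https://api.github.com` host (a GitHub Enterprise Server installation uses a different base URL), and `build_request` is a hypothetical helper name.

```python
import urllib.request

# Assumption: public GitHub. Adjust for GitHub Enterprise Server.
API_ROOT = "https://api.github.com"


def build_request(token, path="/user"):
    """Build (but do not send) an authenticated GET request.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        API_ROOT + path,
        headers={
            # A personal access token is sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


if __name__ == "__main__":
    # Placeholder value: replace with the token generated in step 5.
    request = build_request("YOUR_TOKEN_HERE")
    print(request.full_url)
```

Passing the request to `urllib.request.urlopen` should return the profile of the token's owner when the token is valid; an HTTP 401 response usually indicates a mistyped or revoked token.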

Authorization with SAML

SAML authorization has been available in the collector since v1.2.0.

Definition of SAML

SAML (Security Assertion Markup Language) is an XML-based standard for exchanging authentication and authorization assertions. It gives external applications and services a standardized way to confirm that a user is who they claim to be. SAML enables single sign-on (SSO): a user is authenticated once, and that authentication is then communicated to multiple applications.

Authorizing a personal access token for use with SAML single sign-on

To use SAML, you must first authorize your personal access token for use with SAML single sign-on. There are two options:

  • Authorize an existing token.

  • Create a new token and authorize it.

The steps are detailed in the GitHub documentation.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to download data with basic configuration are defined below.

This minimum configuration covers only the parameters specific to this integration. Other required parameters relate to the generic behavior of the collector. Check the settings sections for details.

Setting

Details

token_value

Set this to the access token you created in the GitHub console.

username_value

Set this to your GitHub username.

organization_value

Use this parameter to define the name of the organization that owns the repository.

uses_saml

If uses_saml is set to True, validation with the username is skipped and SAML authorization is used.

repository_resources

This parameter is used to define the resources that will be retrieved. The allowed resources for this parameter are:

  • commits

  • forks

  • issue_comments

  • collaborators

  • events

  • pull_requests

  • releases

  • stargazers

  • subscribers

  • subscriptions

How to download all available resources:

Just define all listed resources in an array as follows:

"resources": [ "commits", "forks", "issue_comments", "collaborators", "events", "pull_requests", "releases", "stargazers", "subscribers", "subscriptions" ]

The array must contain one or more resource names.

How to download only the commits:

Just define the desired resources in an array as follows:

"resources": [ "commits" ]

organization_resources

This parameter is used to define the resources that will be retrieved. The allowed resources for this parameter are:

  • audit

  • sso_authorizations

  • webhooks

How to download all available resources:

Just define all listed resources in an array as follows:

"resources": [ "audit", "sso_authorizations", "webhooks" ]

How to download only the audit:

Just define the desired resources in an array as follows:

"resources": [ "audit" ]

Accepted authentication methods

The following are the accepted authentication methods for this collector.

Authentication method

URL

API token

Username

Use SAML

API Token

REQUIRED

REQUIRED

REQUIRED

 

SAML

REQUIRED

REQUIRED

REQUIRED

REQUIRED
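The settings and the authentication table above can be combined into a small pre-flight check. The sketch below is only an illustration under stated assumptions: it treats the settings as a flat dictionary (the collector's real configuration layout may differ), uses a hypothetical `url` key to stand for the URL column of the table, and follows the table in treating the username as required for both methods. The `check_settings` function is not part of the collector.

```python
# Required settings per authentication method, following the table.
# "url" is a hypothetical key name standing in for the URL column.
REQUIRED_BY_METHOD = {
    "api_token": ("url", "token_value", "username_value"),
    "saml": ("url", "token_value", "username_value", "uses_saml"),
}

# Allowed resource names, as listed for each setting.
ALLOWED_RESOURCES = {
    "repository_resources": {
        "commits", "forks", "issue_comments", "collaborators", "events",
        "pull_requests", "releases", "stargazers", "subscribers",
        "subscriptions",
    },
    "organization_resources": {"audit", "sso_authorizations", "webhooks"},
}


def check_settings(settings):
    """Return a list of problems found in a flat settings dictionary.

    Hypothetical helper: an empty list means nothing was flagged.
    """
    problems = []
    # SAML is selected only when uses_saml is explicitly True.
    method = "saml" if settings.get("uses_saml") is True else "api_token"
    for key in REQUIRED_BY_METHOD[method]:
        if settings.get(key) in (None, ""):
            problems.append(f"missing required setting: {key}")
    for key, allowed in ALLOWED_RESOURCES.items():
        for name in settings.get(key, []):
            if name not in allowed:
                problems.append(f"unknown {key} entry: {name}")
    return problems


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = {
        "url": "https://api.github.com",
        "token_value": "YOUR_TOKEN_HERE",
        "username_value": "your-username",
        "repository_resources": ["commits", "pull_requests"],
    }
    print(check_settings(example))
```

Returning a list of problems rather than raising on the first one lets all configuration mistakes be reported in a single pass.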

Run the collector

Once the data source is configured, you can either send us the required information so that we host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).