GitHub collector
Service description
GitHub is a version control platform that lets you track changes to your codebase, flag bugs and issues for follow-up, and manage your product's build process. It simplifies collaboration for engineers: team members can work on files and merge their changes into the project's master branch. The GitHub API provides data about a hosted code repository, ranging from commits to pull requests and comments.
The Devo GitHub collector enables customers to retrieve data from the GitHub API into Devo to query, correlate, analyze, and visualize it, enabling Enterprise IT and Cybersecurity teams to make the most impactful decisions at petabyte scale.
Devo collector features
Feature | Details |
---|---|
Allow parallel downloading ( | |
Running environments | |
Populated Devo events | |
Data source description
The GitHub API provides information about a code repository hosted on GitHub, from commits to pull requests and comments.
Listed below are the resources that can be sent to Devo:
Data source | Description | GitHub API endpoint | Collector service name | Devo table | Available from release |
---|---|---|---|---|---|
Collaborators | Information about the repository collaborators. | | Resource: | | |
Commits | Commits made in the repository. | | Resource: | | |
Forks | Forks created from the repository. | | Resource: | | |
Events | Information about events such as resource creations or deletions. | | Resource: | | |
Issue comments | Comments made on every issue. | | Resource: | | |
Subscribers | Information about the users subscribed to a repository. | | Resource: | | |
Pull requests | Pull requests made in the repository. | | Resource: | | |
Subscriptions | Repositories you are subscribed to. | | Resource: | | |
Releases | Information about releases made in the repository. | | Resource: | | |
Stargazers | Information about the users who "star" repositories, marking them as favorites. | | Resource: | | |
Audit | Organization audit events. | | Resource: | | |
SSO Authorizations | Single sign-on authorizations. | | Resource: | | |
Webhooks | Webhooks created in the organization. | | Resource: | | |
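As a rough illustration of where this data comes from, the sketch below maps some of the resources above to the public GitHub REST API paths that serve them. The paths come from GitHub's public REST API. Whether the collector calls exactly these endpoints is an assumption, and `ENDPOINTS` and `build_url` are hypothetical names used only for this example.

```python
# Illustrative only: public GitHub REST API paths for some of the resources
# listed above. The mapping and helper names are assumptions made for this
# sketch, not the collector's internal implementation.
API_ROOT = "https://api.github.com"

ENDPOINTS = {
    "collaborators": "/repos/{owner}/{repo}/collaborators",
    "commits": "/repos/{owner}/{repo}/commits",
    "forks": "/repos/{owner}/{repo}/forks",
    "events": "/repos/{owner}/{repo}/events",
    "issue_comments": "/repos/{owner}/{repo}/issues/comments",
    "pull_requests": "/repos/{owner}/{repo}/pulls",
    "releases": "/repos/{owner}/{repo}/releases",
    "stargazers": "/repos/{owner}/{repo}/stargazers",
    "subscribers": "/repos/{owner}/{repo}/subscribers",
    "webhooks": "/orgs/{owner}/hooks",  # organization-level resource
}


def build_url(resource: str, owner: str, repo: str = "") -> str:
    """Return the full API URL for a resource; raises KeyError if unknown."""
    path = ENDPOINTS[resource]
    # Paths that do not contain {repo} simply ignore that argument.
    return API_ROOT + path.format(owner=owner, repo=repo)


if __name__ == "__main__":
    print(build_url("commits", "octocat", "hello-world"))
```

Keeping the paths in a single dictionary makes it easy to see which resources are repository-scoped and which are organization-scoped.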
How to enable the collection in the vendor
To retrieve the data, you need to create an access token that authenticates the collector against GitHub.
Steps | Screenshots | |
---|---|---|
1 | Log in to your GitHub account. This account must be the main account that controls your GitHub environment. | |
2 | Go to Settings → Developer settings → Personal access tokens. | |
3 | Click Generate new token (yellow box in the screenshot). | |
4 | Tick the following checkboxes: | |
5 | Click Generate token. | |
If you're using SAML authentication in your account, you'll need to authorize your token after generating it. Check how to do it in this article.
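Before handing the token to the collector, it can be useful to confirm that GitHub accepts it. The following is a minimal sketch that uses only the Python standard library and the public `GET /user` endpoint. The helper names `build_user_request` and `check_token` are hypothetical, and the collector itself may authenticate differently.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_user_request(token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the /user endpoint."""
    return urllib.request.Request(
        f"{API_ROOT}/user",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            # GitHub rejects requests that carry no User-Agent header.
            "User-Agent": "token-check-example",
        },
    )


def check_token(token: str) -> str:
    """Return the login that owns the token (performs a network call).

    Raises urllib.error.HTTPError (for example 401) if the token is rejected.
    """
    with urllib.request.urlopen(build_user_request(token)) as response:
        return json.load(response)["login"]
```

Calling `check_token("<your token>")` should return your GitHub login when the token is valid. The request construction is kept separate from the network call so it can be inspected without contacting GitHub.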
Authorization with SAML
SAML authorization has been available in the collector since v1.2.0.
Definition of SAML
SAML (Security Assertion Markup Language) is a markup language for security assertions. It provides a standardized way to tell external applications and services that a user is who they claim to be. SAML enables single sign-on (SSO): a user authenticates once, and that authentication is then communicated to multiple applications.
Authorizing a personal access token for use with SAML single sign-on
To use SAML, you must first authorize your personal access token for use with SAML single sign-on. There are two options:
Authorize an existing token.
Create a new token and authorize it.
The steps are detailed in the GitHub documentation.
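When a token has not yet been authorized for SAML single sign-on, GitHub's REST API is expected to reject the request and include an `X-GitHub-SSO` header that points to the authorization URL. The sketch below parses such a header value. The header format shown in the docstring is an assumption based on GitHub's public documentation and should be verified there, and `parse_sso_header` is a hypothetical helper that is not part of the collector.

```python
def parse_sso_header(value: str) -> dict:
    """Split an X-GitHub-SSO header value into its state and parameters.

    Assumed example value:
        "required; url=https://github.com/orgs/ORG/sso?authorization_request=..."
    """
    parts = [part.strip() for part in value.split(";") if part.strip()]
    result = {"state": parts[0] if parts else ""}
    for part in parts[1:]:
        # Split on the first "=" only, so URLs with query strings stay intact.
        key, _, val = part.partition("=")
        result[key.strip()] = val.strip()
    return result
```

If the parsed state is `required`, opening the returned `url` in a browser should lead to the page where the token can be authorized.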
Minimum configuration required for basic pulling
Although this collector supports advanced configuration, the fields required to download data with basic configuration are defined below.
This minimum configuration covers only the parameters specific to this integration. Other required parameters relate to the generic behavior of the collector. Check the settings sections for details.
Setting | Details |
---|---|
| Set this to the access token you created in the GitHub console. |
| Set this to your username. |
| Use this parameter to define the name of the organization that owns the repository. |
| If saml setting is |
| This parameter defines the resources that will be retrieved. To download all available resources, define all of them in an array as follows: "resources": ["commits", "forks", "issue_comments", "collaborators", "events", "pull_requests", "releases", "stargazers", "subscribers", "subscriptions"]. To download only specific resources, define just the desired ones in the array, for example: "resources": ["commits"]. |
| This parameter defines the resources that will be retrieved. To download all available resources, define all of them in an array as follows: "resources": ["audit", "sso_authorizations", "webhooks"]. To download only specific resources, define just the desired ones in the array. |
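A typo in a resource name is easy to miss, so checking a `resources` array before deploying can save a failed run. The sketch below validates a list against the two sets of names shown above. Splitting them into "repository" and "organization" groups is an assumption drawn from the data source descriptions, and `validate_resources` is a hypothetical helper, not a collector function.

```python
import json

# Resource names taken from the two "resources" arrays shown above.
ALLOWED_REPO_RESOURCES = {
    "commits", "forks", "issue_comments", "collaborators", "events",
    "pull_requests", "releases", "stargazers", "subscribers", "subscriptions",
}
ALLOWED_ORG_RESOURCES = {"audit", "sso_authorizations", "webhooks"}


def validate_resources(requested, allowed):
    """Return the requested names that are not allowed, in their original order."""
    return [name for name in requested if name not in allowed]


if __name__ == "__main__":
    fragment = json.loads('{"resources": ["commits", "fork"]}')
    unknown = validate_resources(fragment["resources"], ALLOWED_REPO_RESOURCES)
    if unknown:
        print("Unknown resources:", ", ".join(unknown))
```

An empty result means every requested resource is one of the allowed names.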
Accepted authentication methods
The following are the accepted authentication methods for this collector.
Authentication method | URL | API token | Username | Use SAML |
---|---|---|---|---|
API Token | REQUIRED | REQUIRED | REQUIRED | |
SAML | REQUIRED | REQUIRED | REQUIRED | REQUIRED |
Run the collector
Once the data source is configured, you can either send us the required information so that we host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).