Bootstrap GitHub workflows for your Python project

For effective collaboration with citizen developers on any GitHub project, you should set up reliable CI/CD pipelines that automate testing, code linting and other release processes. Setting up CI/CD for a Python project on GitHub can feel overwhelming because of the number of steps involved. In this article, we will go through some of the GitHub workflows that you can add to your Python project to set up these CI/CD pipelines.

What are GitHub workflows?

A GitHub workflow is an automated process that runs one or more jobs. Workflows are defined by a YAML file that's checked into a GitHub repository. They can be triggered by events in the repository such as the creation of pull requests, events outside the repository, a predefined schedule or manually.

Workflows are stored in a directory named .github/workflows at the root of the repository. A repository can have multiple workflows.
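
As a rough illustration of that layout, the following Python sketch lists the workflow files found in a checked-out repository. The function name find_workflows is hypothetical, and the sketch assumes that workflow files end in .yml or .yaml.

```python
from pathlib import Path


def find_workflows(repo_root):
    """Return the sorted names of workflow files under .github/workflows."""
    workflow_dir = Path(repo_root) / ".github" / "workflows"
    if not workflow_dir.is_dir():
        # The repository has no workflows directory.
        return []
    return sorted(
        p.name
        for p in workflow_dir.iterdir()
        if p.is_file() and p.suffix in {".yml", ".yaml"}
    )


if __name__ == "__main__":
    # Run from the root of a checked-out repository.
    print(find_workflows("."))
```

A repository that contains all four workflows described in this article would list four file names here, one per workflow.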

What are GitHub Actions?

GitHub Actions is the CI/CD platform that runs your workflows. Each time a workflow is triggered, a run appears under the Actions tab in your GitHub repo. For a Python project, such a run could execute unit tests, integration tests, code linting and code security checks.

Let's look at how you set up the aforementioned actions via GitHub workflow files.

Workflows for a Python project

Broadly, I have come up with the following list of 4 workflows that I think are necessary for your Python project.

  • Tests workflow

  • Code Linting workflow

  • Code Security workflow

  • Checking Python package (optional, but required if you want to publish a PyPI release of your project)

You may come up with more workflows beyond this list. Having worked with multiple open-source Python projects, I have seen that these four are the most commonly used.

Test workflow

The test workflow file will execute all the Python tests using pytest. The unit tests can be configured to run on various operating systems with different Python versions. This sort of test matrix gives developers confidence that their code runs on various platforms and in environments with different Python versions. A sample workflow file is shown below (also available here on GitHub):

name: Run Python Unit Tests

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]

    steps:
      - name: Checkout code
        uses: actions/checkout@v3

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install pip
        run: |
          python -m pip install --upgrade pip

      # Install your python dependencies
      # - name: Install python package or dependencies
      #   run: |
      #     pip install -e .

      - name: Install pytest
        run: |
          pip install pytest

      - name: Dump all installed packages
        run: |
          pip list

      - name: Run tests
        run: |
          pytest --durations=10

The above GitHub Actions workflow is configured to be triggered by two types of events: push events (when changes are pushed to the repository's master branch) and pull_request events (when pull requests are created or updated against the master branch). This ensures that the unit tests are executed both for direct changes to the branch and for contributions submitted via pull requests.

The workflow defines a single job named "test" that runs on three operating systems (Ubuntu, macOS, and Windows) and five Python versions (3.7, 3.8, 3.9, 3.10, and 3.11). This matrix strategy allows for cross-platform and cross-version compatibility testing. The version numbers are quoted so that YAML does not read "3.10" as the number 3.1. The job's steps include checking out the code, setting up the specified Python version, upgrading pip, installing necessary test dependencies such as pytest, and then running the unit tests using pytest.
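
To make the size of that matrix concrete, here is a small Python sketch that expands the same two lists into every operating system and Python version pair. The variable names are only illustrative; the values are copied from the workflow above.

```python
from itertools import product

# Values copied from the workflow's matrix section.
operating_systems = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_versions = ["3.7", "3.8", "3.9", "3.10", "3.11"]

# Each (os, python-version) pair corresponds to one run of the "test" job.
matrix_jobs = list(product(operating_systems, python_versions))

for os_name, version in matrix_jobs:
    print(f"test on {os_name} with Python {version}")

print(f"Total jobs: {len(matrix_jobs)}")
```

Three operating systems times five Python versions gives 15 jobs per trigger, which is worth keeping in mind before adding more entries to either list.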

The commented-out block with the name "Install your python dependencies" is an additional step for installing Python dependencies using pip. If you intend to use additional dependencies or have a specific setup for your Python environment, you can uncomment these lines and customize them according to your requirements. This allows you to install any necessary Python packages or dependencies for your unit tests.

Testing is a very important part of software development, and you can have several testing workflows at different levels. For instance, there could be one workflow for unit tests and another for end-to-end (integration) tests. Because the two run separately, it is easier to tell whether a failure comes from a single module or from the project as a whole.
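
For reference, a unit test that the pytest step would pick up could look like the sketch below. The file name test_example.py and the function normalize_name are hypothetical; the sketch relies on pytest's default behavior of collecting files named test_*.py and functions whose names start with test_.

```python
# test_example.py (hypothetical file name)


def normalize_name(name):
    """Lowercase a package name and replace underscores and dots with hyphens."""
    return name.lower().replace("_", "-").replace(".", "-")


def test_normalize_name_lowercases():
    assert normalize_name("MyPackage") == "mypackage"


def test_normalize_name_replaces_separators():
    assert normalize_name("my_package.utils") == "my-package-utils"
```

In a real project the function under test would live in your package and be imported by the test file, which is why the workflow's commented-out dependency installation step usually needs to be enabled.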

Code linting workflow

The linting workflow file will lint the Python code in your repository using flake8 and isort. Linting is an automated check that can help improve the quality of your code. It can help you find errors earlier, which can save time and reduce costs. Linting can also help you:

  • Reduce errors in production

  • Improve code readability

  • Unify the style within a team

  • Check potential issues and errors early on

  • Suggest best practices

A sample workflow file for linting is shown below (also available here on GitHub):

name: Python Linting

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Install flake8 and isort for linting
        run: |
          pip install flake8 isort

      - name: flake8
        run: flake8

      - name: Check sorted python imports using isort
        run: |
          isort . -c

The GitHub Actions workflow named "Python Linting" is triggered by two types of events: push events, which occur when changes are pushed to the repository's master branch, and pull_request events, which are generated when pull requests are created or updated against the master branch. This ensures that code linting is consistently applied to both direct changes and contributions made via pull requests.

Within the workflow, there is a single job named "build" that runs on the ubuntu-latest operating system environment. The steps of this job include checking out the code from the repository, setting up a Python environment with version '3.x', which resolves to the latest stable Python 3 release available, and installing the linting tools flake8 and isort using pip. The workflow then runs flake8 to check the code against Python coding style guidelines, and finally uses isort in check mode (-c) to verify that Python imports are sorted without modifying any files.
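
As a point of comparison, here is a small module written so that both tools should pass, assuming their default configurations. The module and its function names are hypothetical and exist only to show the expected layout: sorted imports, no unused names, two blank lines between top-level definitions, and lines under 79 characters.

```python
"""Example module laid out to satisfy flake8 and isort defaults."""

import json
from pathlib import Path


def load_settings(path):
    """Read a JSON settings file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def describe(settings):
    """Return one 'key=value' line per setting, sorted by key."""
    return [f"{key}={settings[key]}" for key in sorted(settings)]
```

If your project uses a custom configuration for either tool, the rules that apply may differ from the defaults assumed here.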

Code security workflow

Vulnerability checks are an essential part of software development: they help make sure that the software shipped to customers has no known security vulnerabilities that attackers could exploit. Two Python packages commonly used for this are safety and bandit. safety compares your project's installed dependencies against a database of known security vulnerabilities (Safety DB), while bandit scans your own Python source code for common security issues. One such workflow that helps you run safety is below (also available here on GitHub):

name: Security Scan

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]


jobs:
  security-scan:
    name: Run Safety Check
    runs-on: ubuntu-latest

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Set up Python
      uses: actions/setup-python@v4
      with:
        python-version: '3.x'

    # Install your python dependencies
    # - name: Install python package or dependencies
    #   run: |
    #     pip install -e .

    - name: Upgrade setuptools
      run: pip install --upgrade setuptools

    - name: Install safety
      run: pip install safety

    - name: Run Safety Check
      run: |
        safety check

The GitHub Actions workflow titled "Security Scan" is configured to be triggered by two types of events: push events (when changes are pushed to the repository's master branch) and pull_request events (when pull requests are created or updated against the master branch). This ensures that security scans are carried out both when direct changes are made and when contributions are submitted via pull requests.

The workflow consists of a single job named "security-scan", which runs on the ubuntu-latest operating system environment. The job's steps include checking out the code from the repository, setting up a Python environment with version 3.x, which resolves to the latest stable Python 3 release available, upgrading the setuptools package, installing the safety tool via pip, and finally running the safety check command, which checks the Python packages installed in the environment for known security vulnerabilities.

The commented-out block labeled "Install your python dependencies" within the workflow is a placeholder for installing project-specific Python dependencies. If your project requires specific Python packages or dependencies, you can uncomment these lines and customize them to install the necessary dependencies. This allows you to ensure that the project's dependencies are correctly installed and checked for security vulnerabilities during the workflow execution.
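
The sample workflow only runs safety, which looks at dependencies. To give a feel for the other kind of check, the sketch below shows a code-level pattern that static scanners such as bandit typically report, together with a safer alternative. The function name parse_literal is hypothetical.

```python
import ast


def parse_literal(text):
    """Parse a Python literal from untrusted text without executing code."""
    # eval(text) would evaluate arbitrary expressions, which is the kind of
    # call a static security scanner reports. ast.literal_eval only accepts
    # literals such as numbers, strings, lists, dicts, tuples and booleans.
    return ast.literal_eval(text)


if __name__ == "__main__":
    print(parse_literal("[1, 2, 3]"))
```

Anything that is not a plain literal, such as a function call, makes ast.literal_eval raise an exception instead of running it.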

Checking Python package (Optional)

In case you want to publish your cool Python package to PyPI, you might want to run it through twine first. twine is used for checking that your Python distribution has a valid name and version, that the package contains the required files, and that the package's metadata is valid. twine is also used for publishing the Python package to PyPI with proper authorization. One such workflow that helps you check your Python package via twine is below (also available here on GitHub):

name: Python Twine Check

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.x'

      - name: Install wheel, twine and setuptools
        run: |
          pip install twine wheel setuptools

      - name: Check package consistency with twine
        run: |
          python setup.py check sdist bdist_wheel
          twine check dist/*

The GitHub Actions workflow titled "Python Twine Check" is configured to be triggered on two types of events: push events (when changes are pushed to the repository's master branch) and pull_request events (when pull requests are created or updated against the master branch). This ensures that the consistency of Python packages is verified both when direct changes are introduced and when contributions are submitted via pull requests.

The workflow defines a single job named "build", which runs on the ubuntu-latest operating system environment. The job's steps include checking out the code from the repository, setting up a Python environment with version '3.x', which resolves to the latest stable Python 3 release available, and installing three tools for Python package management: twine, setuptools and wheel. The next step runs python setup.py check sdist bdist_wheel, which validates the package metadata and builds the source distribution and the wheel into the dist directory. The final step runs twine check on the files in dist to verify their metadata, including whether the long description will render correctly on PyPI. Together, these steps catch packaging issues during the build, before a release is published.

Summary

To summarize, we learned about GitHub workflows and GitHub Actions. We looked at four GitHub workflows that you can configure for any Python project, and went through the workflow files for testing, linting, vulnerability assessment and package checks. To explore these workflow files on GitHub, you can visit the repository SamplePythonWorkflows.