Scaffolding repos with cookiecutter
For code development projects, we often end up creating code repositories (repos) that have similar structures or components to what we have previously used.
Cookiecutter is a tool that we can use to scaffold the creation of our repos and this post will guide you through how to do this and more.
Standardise with automation
When creating a code repository (repo), you typically start from scratch or with a target repo structure to aim for. Either approach has its merits. Starting from scratch can be exciting, with so many possibilities, whilst using a tried and tested structure can provide a sense of comfort.
Standardisation plays an important part in either of these choices because it helps ensure consistency, encourages reuse of existing good practices and generally gets teams collaborating much better due to a shared understanding of what set of standards/expectations should be applied. Standardisation is good but what is often better is standardisation along with automation.
There are various options available when we want to automate the creation of a project structure based on a template or target structure. Although investigating other options is outside the scope of this post, examples I’ve come across involve a mix of custom shell scripts to transform boilerplate structures, “orchestrated” via tooling such as GitHub or Azure DevOps to create artifacts that are ready to be downloaded for further inspection or general use.
This is where a tool such as cookiecutter comes in. Cookiecutter helps to simplify and automate scaffolding of code repos.
Why cookiecutter
Simply put, cookiecutter is a tool that enables you to create a project structure from an existing template (i.e. cookiecutter). Created by the team behind Django best practices, it can be applied to just about any situation in software development, where a new repo needs to be created e.g. Java, Python, C++ or data products.
Importantly, this can be a huge time saver for situations where one would repeatedly have to create the same basic structure for new projects.
The next sections will cover the key underlying components and usage.
Installation
For steps on how to install cookiecutter, follow the installation instructions here.
In general, you will need Python installed and it is recommended to run the installation of cookiecutter with pip via:
python3 -m pip install --user cookiecutter
Create a cookiecutter template from scratch
Use the following steps to create a cookiecutter project from scratch:
- Create a new directory named cookiecutter-data-app using
mkdir cookiecutter-data-app
- Enter the newly created directory using
cd cookiecutter-data-app
- Create a
cookiecutter.json
at the root of the directory created in step 1 with the following content
{
"project_name": "Starter data project",
"project_slug": "{{ cookiecutter.project_name.lower().strip().replace(' ', '_').replace(':', '_').replace('!', '_')}}",
"created_by": ""
}
- Create a project slug directory (more on this later) named
{{ cookiecutter.project_slug }}
- Within the project slug directory, create the following 2 files and 1 folder:
README.md
and src\sample.md
- The folder setup should look like the following image:
- Run
cookiecutter ../cookiecutter-data-app --output-dir "<insert-output-directory>"
to execute the template and save to a specific directory. You can also change to the parent directory and run cookiecutter cookiecutter-data-app --output-dir "<insert-output-directory>"
- The results of the execution will depend on your inputs and should be similar to the below
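If you would rather script the template-creation steps than type them, here is a minimal sketch using only the Python standard library. The directory and file names come from the steps above; the contents written into README.md and sample.md are placeholders of my own, and `create_template` is a hypothetical helper name.

```python
import json
import tempfile
from pathlib import Path

# Default prompts, copied from the cookiecutter.json in the steps above.
COOKIECUTTER_JSON = {
    "project_name": "Starter data project",
    "project_slug": (
        "{{ cookiecutter.project_name.lower().strip()"
        ".replace(' ', '_').replace(':', '_').replace('!', '_')}}"
    ),
    "created_by": "",
}


def create_template(parent: Path) -> Path:
    """Create the cookiecutter-data-app template skeleton under `parent`."""
    root = parent / "cookiecutter-data-app"
    # The directory name is literal Jinja2 text; cookiecutter renders it later.
    slug_dir = root / "{{ cookiecutter.project_slug }}"
    (slug_dir / "src").mkdir(parents=True)
    (root / "cookiecutter.json").write_text(json.dumps(COOKIECUTTER_JSON, indent=2))
    # File contents below are illustrative placeholders, not from the original post.
    (slug_dir / "README.md").write_text("# {{ cookiecutter.project_name }}\n")
    (slug_dir / "src" / "sample.md").write_text("Created by {{ cookiecutter.created_by }}\n")
    return root


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    template = create_template(Path(tmp))
    for path in sorted(template.rglob("*")):
        print(path.relative_to(template))
```

To keep the template, call `create_template(Path.cwd())` instead of using the temporary directory, then run cookiecutter against the resulting folder as in the final step.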
Process flow
The process flow follows a set of 5 key steps, as shown in the following diagram.
1. Prompt: The application will present options for the user to respond to
2. Gather: Once all options are presented, the options are put together for further processing
3. Pre-hook: Apply custom code before your project is generated e.g., to add default values
4. Compile: Generate your project
5. Post-hook: Apply custom code after your project is generated e.g., to remove files/directories
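To make the five steps concrete, here is a toy walk-through in plain Python. It is not how cookiecutter is implemented: a regular expression stands in for Jinja2, a dictionary stands in for the files on disk, and every function name is made up for illustration.

```python
import re

# Stand-in for Jinja2: matches placeholders such as {{ cookiecutter.project_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*cookiecutter\.(\w+)\s*\}\}")


def prompt(defaults, answers):
    """Step 1: present each option; `answers` stands in for typed input."""
    return {key: answers.get(key, default) for key, default in defaults.items()}


def gather(responses):
    """Step 2: put the responses together under the cookiecutter namespace."""
    return {"cookiecutter": responses}


def pre_hook(context):
    """Step 3: validate before anything is generated."""
    if not context["cookiecutter"]["project_name"].strip():
        raise ValueError("project_name must not be empty")


def compile_project(files, context):
    """Step 4: fill the placeholders in file names and contents."""
    def render(text):
        return PLACEHOLDER.sub(lambda m: str(context["cookiecutter"][m.group(1)]), text)
    return {render(name): render(body) for name, body in files.items()}


def post_hook(project):
    """Step 5: tidy up, here by dropping files that rendered to nothing."""
    return {name: body for name, body in project.items() if body}


defaults = {"project_name": "Starter data project", "created_by": ""}
files = {
    "README.md": "# {{ cookiecutter.project_name }}",
    "AUTHORS.md": "{{ cookiecutter.created_by }}",
}

context = gather(prompt(defaults, {"project_name": "Sales mart"}))
pre_hook(context)
project = post_hook(compile_project(files, context))
print(project)  # {'README.md': '# Sales mart'}
```

AUTHORS.md disappears because created_by was left empty, which mirrors the kind of clean-up a post-hook is typically used for.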
Key components
Template language
- The template language is Jinja2
Default file
- A default
cookiecutter.json
is a must and contains a set of default values and prompts
{
"project_name": "Starter Project",
"project_slug": "{{ cookiecutter.project_name.lower().strip().replace(' project', '').replace(' ', '_').replace(':', '_').replace('-', '_').replace('!', '_')}}",
"created_by": "",
"created_on": "{% now 'local' %}",
"_extensions": ["jinja2_time.TimeExtension"],
"adapter": ["snowflake", "databricks"]
}
Project slug
- A default folder that contains the contents that will be templated and takes the form
{{cookiecutter.project_slug}}
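The project_slug default in cookiecutter.json is a chain of ordinary Python string methods called from inside Jinja2, so you can try it out in plain Python before putting it in a template. A small sketch, where `make_slug` is a hypothetical helper that mirrors the expression from the default file above:

```python
def make_slug(project_name: str) -> str:
    """Mirror the project_slug expression from the cookiecutter.json above."""
    return (
        project_name.lower()
        .strip()
        .replace(" project", "")
        .replace(" ", "_")
        .replace(":", "_")
        .replace("-", "_")
        .replace("!", "_")
    )


print(make_slug("Starter Project"))  # starter
print(make_slug("Sales-Mart: EU"))   # sales_mart__eu
```

Note the double underscore in the second result: each replace works on single characters, so ": " becomes two underscores. Whether that matters is a choice for your own template.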
Configuration file
- The default file can be overridden to prevent default value selection, by including a
.yaml
file and setting the default config to it via the
cookiecutter --config-file
switch. Setting the default config file can also be done via an environment variable e.g.,
export COOKIECUTTER_CONFIG
default_context:
    full_name: "Kimani Mbugua"
    email: "someemail@domain.com"
    github_username: "kimani-m"
    project_name: "starter_proj"
cookiecutters_dir: "cookiecutter-pantry/"
replay_dir: "/cookiecutter-replay/"
abbreviations:
    azdo: https://example@dev.azure.com/example/project/_git/{0}
    gh: https://github.com/{0}.git
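The {0} in each abbreviation is a placeholder for whatever follows the prefix on the command line, so a template given as gh:owner/repo is turned into a full URL. The sketch below is a simplified illustration of that expansion, not cookiecutter's own code; `expand` is a hypothetical helper and the repository name is invented.

```python
# Abbreviations copied from the configuration file above.
ABBREVIATIONS = {
    "azdo": "https://example@dev.azure.com/example/project/_git/{0}",
    "gh": "https://github.com/{0}.git",
}


def expand(template: str, abbreviations: dict) -> str:
    """Expand 'prefix:rest' using the matching abbreviation, if there is one."""
    prefix, separator, rest = template.partition(":")
    if separator and prefix in abbreviations:
        return abbreviations[prefix].format(rest)
    return template  # local paths and full URLs pass through unchanged


print(expand("gh:kimani-m/cookiecutter-data-app", ABBREVIATIONS))
print(expand("../cookiecutter-data-app", ABBREVIATIONS))
```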
Pantry
- The cookiecutter pantry is a collection of cookiecutter templates. One of the most popular pantries is GitHub, where you can find hundreds of repositories related to cookiecutter. These are great to get you started or for ideas of what is available.
Usage
There is quite a good bit of documentation on cookiecutter that you can go through here to see basic or advanced CLI commands. It covers far more than this post does, so I’d encourage you to review it and try out the commands that interest you.
Basic
The cookiecutter command accepts a number of options and arguments. Some basic options include:
- no input: Great for suppressing prompts
cookiecutter --no-input
- force: Handy if you want to overwrite
cookiecutter --overwrite-if-exists
or -f
- verbose: Gives more detail when the template is generated
cookiecutter --verbose
or -v
- replay: Useful for CICD implementations where previous values are used
cookiecutter --replay
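In a CI/CD script it can be handy to assemble these options in code. Here is a small sketch that only builds the argument list from the flags listed above; `build_command` and its parameter names are my own, and nothing is executed.

```python
import shlex


def build_command(template, output_dir=None, no_input=False,
                  overwrite=False, verbose=False, replay=False):
    """Return the cookiecutter CLI call as a list of arguments."""
    command = ["cookiecutter", template]
    if output_dir:
        command += ["--output-dir", output_dir]
    if no_input:
        command.append("--no-input")
    if overwrite:
        command.append("--overwrite-if-exists")
    if verbose:
        command.append("--verbose")
    if replay:
        command.append("--replay")
    return command


command = build_command("cookiecutter-data-app", output_dir="out",
                        no_input=True, overwrite=True)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` once cookiecutter is installed. Building a list rather than a single string avoids quoting problems with paths that contain spaces.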
Advanced usage
The more advanced options give flexibility to the template generation process, such as:
- Choices: Give the user options in the cookiecutter.json file
{
"language": ["SQL", "Java", "Python", "Cobol"]
}
- Copy without render: Avoid rendering content
{
"_copy_without_render": [
"*.txt",
"some_dir/"
]
}
- Template extensions: Extend cookiecutter capability to use Jinja2 extensions e.g. jinja2_time
{
"created_on": "{% now 'local' %}",
"_extensions": ["jinja2_time.TimeExtension"]
}
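One thing worth knowing about choice variables is that the first item in the list acts as the default, which is what you get when prompts are suppressed. The sketch below illustrates that rule on a parsed cookiecutter.json; it is a simplification (it does not render Jinja2 expressions) and `no_input_defaults` is a hypothetical helper.

```python
import json

# A small cookiecutter.json combining the options shown above.
RAW_CONFIG = """
{
    "project_name": "Starter Project",
    "language": ["SQL", "Java", "Python", "Cobol"],
    "_copy_without_render": ["*.txt", "some_dir/"]
}
"""


def no_input_defaults(config: dict) -> dict:
    """Return the value each prompt would take if the user accepted every default."""
    defaults = {}
    for key, value in config.items():
        if key.startswith("_"):
            continue  # underscore keys are settings for cookiecutter, not prompts
        # For a choice variable (a list), the first entry is the default.
        defaults[key] = value[0] if isinstance(value, list) else value
    return defaults


print(no_input_defaults(json.loads(RAW_CONFIG)))
# {'project_name': 'Starter Project', 'language': 'SQL'}
```

So if most of your projects are Python, putting "Python" first in the list saves everyone a keystroke.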
Hooks
In essence, hooks are brilliant and allow cookiecutter to really shine. They live in a hooks directory at the root of the template, are Python or shell script based and come in 2 varieties, pre and post hooks.
- Pre generation hooks are great for basic validation and for enforcing required values
pre_gen_project.py
- Post generation hooks can give a great user experience by cleaning up the project after generation or providing helpful text on project creation
post_gen_project.py
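As an illustration of a pre generation hook, here is a sketch of what hooks/pre_gen_project.py could look like for the template in this post. Cookiecutter renders hook scripts with Jinja2 before running them, so the placeholders become the user's answers. The slug pattern and the rule that created_by is required are assumptions of mine, as are the function names.

```python
import re
import sys

# Rendered by cookiecutter before the script runs, so these hold the user's answers.
PROJECT_SLUG = "{{ cookiecutter.project_slug }}"
CREATED_BY = "{{ cookiecutter.created_by }}"

# Assumed rule: lowercase letters, digits and underscores, starting with a letter.
SLUG_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def find_problems(project_slug: str, created_by: str) -> list:
    """Return a list of validation messages; an empty list means all is well."""
    problems = []
    if not SLUG_PATTERN.match(project_slug):
        problems.append(f"'{project_slug}' is not a valid project slug")
    if not created_by.strip():
        problems.append("created_by is required")
    return problems


def main() -> int:
    """Print any problems and return the exit code for the hook."""
    problems = find_problems(PROJECT_SLUG, CREATED_BY)
    for problem in problems:
        print(f"ERROR: {problem}", file=sys.stderr)
    return 1 if problems else 0


# In the real hook file, finish with:
#     sys.exit(main())
# A non-zero exit code tells cookiecutter to stop before generating the project.
```

Keeping the checks in `find_problems` means they can be unit tested without running cookiecutter at all, for example `find_problems("starter_data_project", "Kimani")` returns an empty list.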
Issues
Although cookiecutter is a handy application, it is not without its issues and quirks, some of which are listed below.
- Errors with .yaml configuration file: Be careful to always include a blank line in the default config yaml file
- Specifying force may not always work if files already exist: Unable to overwrite existing files
- Inactive open source contribution: Given the open source nature, the releases are best endeavours
Using cookiecutter to generate cookiecutter templates
To bootstrap cookiecutter template creation, you can create a cookiecutter template that has prompts to set up a cookiecutter project.
Slightly strange but quite handy when you’re new to cookiecutter or want to share cookiecutter template styles. Weird huh?!
Summary
To recap, some of the reasons to use cookiecutter for code project templating are:
- Easier to create a predefined project structure for repos
- Easier to distribute project structure to multiple teams/users
- Reduces time to kick off new projects
- Increases quality by sharing standards using predefined project structure
- Extensible options are available such as advanced Jinja2 templating or the ability to layer a GUI over the cookiecutter API.
Go on and give it a try. Happy templating!