Using cookiecutter hooks to enhance code scaffolding

Using cookiecutter to scaffold code repositories offers a useful way to kick-start projects. To improve the user experience further, this post looks at using hooks to perform actions such as input validation and clean-up activities.

Hooks

Purpose of hooks

Within software, hooks are a way of augmenting an application’s capability by allowing the injection of additional or customisable content.

Using hooks in cookiecutter allows us to do the following and more:

  • Display user-friendly messages
  • Add input validation
  • Remove unwanted files/folders
  • Implement complex logic
  • Initialise a git repository

Setting up cookiecutter hooks

The previous post, which introduced cookiecutter, also outlined the process flow in five steps, as shown in the following diagram.

flowchart LR Prompt-->Gather-->Pre-hook-->Compile-->Post-hook

Once we’ve prompted for and gathered user input, the hooks, if any exist, are then applied.

The process to set up hooks in cookiecutter is straightforward and is described below:

  1. Determine which type of hooks to run (pre-generation, post-generation or both).
  2. Determine whether to use Python or shell scripting.
  3. Create a folder named hooks at the root level of the cookiecutter project.
  4. For Python hooks:
     • Create a file named pre_gen_project.py for running pre-generation hooks.
     • Create a file named post_gen_project.py for running post-generation hooks.
  5. For shell scripts, the process is the same as step 4, except that the file extension changes, e.g., .sh.

For an example of what this would look like in VSCode, see the following image.

cookiecutter-hooks-directory
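
As a rough sketch of the layout described above, the following Python snippet creates the hooks folder and the two empty hook files inside a template directory. The helper name create_hook_files is a hypothetical name used for illustration; it is not part of cookiecutter.

```python
from pathlib import Path


def create_hook_files(template_root):
    """Create hooks/pre_gen_project.py and hooks/post_gen_project.py if missing.

    `create_hook_files` is a hypothetical helper used for illustration only.
    """
    hooks_dir = Path(template_root) / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)

    created = []
    for name in ("pre_gen_project.py", "post_gen_project.py"):
        hook_file = hooks_dir / name
        # touch() leaves the content of an existing file unchanged
        hook_file.touch(exist_ok=True)
        created.append(hook_file)
    return created
```

Calling create_hook_files("cookiecutter-data-app") from the parent directory of the template would give the same structure as the image above, assuming the template folder has that name.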

We’ll expand on the basic cookiecutter template we used in the previous post to demonstrate how we implement hooks in cookiecutter.

Pre-hooks

Pre-generation hooks are great for cases where we need to perform actions before we scaffold our repo. The following steps walk through basic validation using a pre-generation hook.

Create a basic pre-generation hook

  1. Complete the steps in “create a basic cookiecutter template”
  2. Create a folder named hooks at the root level of the cookiecutter project
  3. In the hooks folder, create a pre_gen_project.py file and add the following code block
  import sys

  print("********** PRE GENERATION HOOK *************")

  def main():
      # Return the validation result so sys.exit() receives a non-zero code on failure
      return validate()

  def validate():
      created_by = '{{ cookiecutter.created_by}}'
      
      if not created_by.strip():
          print("ERROR: You failed to specify 'created by'.")
          return 1
      
      return 0

  if __name__ == '__main__':
      sys.exit(main())
  4. In the parent directory of the cookiecutter template, run cookiecutter cookiecutter-data-app --output-dir "<insert-output-directory>"
  5. Give inputs for “project_name” and “project_slug”
  6. Do not add input for “created_by” and hit enter
  7. This should throw a validation error similar to the below

cookiecutter-data-app-validation-result

And with these steps, we have introduced validation via pre-hook execution! 😄
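
The same idea extends to more than one field. The sketch below is a minimal example, assuming we also want to check “project_slug” against a simple pattern; the pattern and the function names validate_inputs and main are illustrative assumptions, so adjust them to your own naming rules.

```python
import re

# Assumption: slugs are lowercase, start with a letter and contain only
# letters, digits and underscores. Change the pattern to suit your project.
SLUG_PATTERN = re.compile(r"^[a-z][a-z0-9_]*$")


def validate_inputs(created_by, project_slug):
    """Return a list of error messages; an empty list means the input is valid."""
    errors = []
    if not created_by.strip():
        errors.append("You failed to specify 'created_by'.")
    if not SLUG_PATTERN.match(project_slug):
        errors.append(f"'{project_slug}' is not a valid project slug.")
    return errors


def main(created_by, project_slug):
    """Print every problem found and return an exit code (0 means success)."""
    errors = validate_inputs(created_by, project_slug)
    for message in errors:
        print(f"ERROR: {message}")
    return 1 if errors else 0
```

In pre_gen_project.py, this could be wired up by ending the file with sys.exit(main('{{ cookiecutter.created_by }}', '{{ cookiecutter.project_slug }}')), so that all problems are reported in one run rather than one at a time.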

Post-hooks

As previously mentioned, post-generation hooks can give a great user experience by displaying helpful text on project creation or by implementing more complex logic in post_gen_project.py.

The following steps walk through using a post-generation hook where, based on user input, the code scaffolds a repo with only the local, aws or azure content.

Create a post-generation hook

  1. Remove the pre_gen_project.py file from the hooks directory, for now. We’ll use it later when we do a complete run.
  2. In the same hooks folder, create a post_gen_project.py file and add the following code block.
  import os
  import shutil
  import sys

  def main():
      environment = "{{ cookiecutter.environment }}".lower()
      print("********** POST GENERATION HOOK *************")
      get_environment(environment)
      return 0

  def drop_dir(dir_list):
      # Hooks run from the root of the generated project
      cur_dir = os.path.abspath(os.path.curdir)
      for dir_name in dir_list:
          target = os.path.join(cur_dir, dir_name)
          if os.path.exists(target):
              shutil.rmtree(target)
          else:
              print(f"Error: The directory '{target}' could not be removed")

  def get_environment(environment):
      # Keep the chosen environment and drop the other two
      env_list = []
      if environment == "local":
          env_list = ["aws", "azure"]
      elif environment == "aws":
          env_list = ["local", "azure"]
      elif environment == "azure":
          env_list = ["local", "aws"]

      drop_dir(env_list)

  if __name__ == '__main__':
      sys.exit(main())
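  3. Create a set of directories named “local”, “aws” and “azure” in the template's project folder (the hook removes them relative to the root of the generated project). Add files and folders similar to the following: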

cookiecutter-data-app-post-hook-folder-setup

  4. In the cookiecutter.json file, add an additional line for “environment” so that it looks as follows:
{
    "project_name": "Starter data project",
    "project_slug": "{{ cookiecutter.project_name.lower().strip().replace(' project', '').replace(' ', '_').replace(':', '_').replace('-', '_').replace('!', '_')}}",
    "created_by": "", 
    "environment": ["Local", "AWS", "Azure"]
}
  5. In the parent directory of the cookiecutter template, run cookiecutter cookiecutter-data-app --output-dir "<insert-output-directory>"
  6. Give inputs for “project_name”, “project_slug” and “created_by”
  7. Choose “Local”, “AWS” or “Azure” when prompted for the environment
  8. This should generate a new project with only one environment folder, similar to the below (e.g., using “local”)

cookiecutter-data-app-post-hook-result

cookiecutter-data-app-post-hook-folder-result

And with these steps, we now have a post-gen hook implemented!
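
As an alternative to the chain of if statements, the folders to drop can be derived from a single list of known environments. This is a minimal sketch under that assumption; the names ENVIRONMENTS and remove_other_environments are illustrative, and unlike the hook above it raises an error for an unrecognised environment instead of silently removing nothing.

```python
import shutil
from pathlib import Path

# Assumption: these match the folder names created in the template
ENVIRONMENTS = ("local", "aws", "azure")


def remove_other_environments(project_root, environment):
    """Delete every environment folder except the chosen one.

    Returns the names of the folders that were removed and raises
    ValueError for an unknown environment so the hook can fail loudly.
    """
    chosen = environment.lower()
    if chosen not in ENVIRONMENTS:
        raise ValueError(f"Unknown environment: {environment!r}")

    removed = []
    for name in ENVIRONMENTS:
        target = Path(project_root) / name
        # Only remove folders that exist and are not the chosen environment
        if name != chosen and target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Inside post_gen_project.py this could be called as remove_other_environments(".", "{{ cookiecutter.environment }}"), since the hook runs from the root of the generated project. Adding a fourth environment would then only require extending the tuple.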

Putting it all together

  1. Clear the directory where the template was output, i.e. "<insert-output-directory>"
  2. Make sure both the pre- and post-generation hooks are in the hooks folder
  3. In the parent directory of the cookiecutter template, run cookiecutter cookiecutter-data-app --output-dir "<insert-output-directory>"
  4. With both hooks in place, you should observe a complete run with validation and a chosen environment

Considerations

Shell or Python

When using hooks, the choice between shell scripts and Python is an important one, and depends on factors such as portability and experience. Running a .bat file or other shell script can provide a head start, but the implementation may fall short if it needs to run on other platforms.

Typically, where portability is important, Python is the better option, but shell scripts can still be a quick and easy way to implement hooks.

Security

With the ability to implement additional functionality via hooks comes the ability to inject malicious code. If running hooks in particular is a concern, there is the option to run without hooks by setting the parameter accept_hooks to false. This will prevent hooks from running, but be warned that malicious code can be placed elsewhere, such as in Jinja templates.

Running hooks is of course not unique to cookiecutter; as always, follow good security practices when developing any application.
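
A minimal sketch of generating a project with hooks disabled through cookiecutter's Python API follows. It assumes an installed cookiecutter version whose cookiecutter() function accepts the accept_hooks argument; the wrapper name generate_without_hooks and its runner parameter are hypothetical and exist only to keep the example easy to try out.

```python
def generate_without_hooks(template, output_dir, runner=None):
    """Generate a project from `template` with pre/post hooks disabled.

    `runner` is a hypothetical injection point for testing; by default the
    real cookiecutter() function is imported and used.
    """
    if runner is None:
        # Imported lazily so the module loads even without cookiecutter installed
        from cookiecutter.main import cookiecutter as runner

    # accept_hooks=False asks cookiecutter to skip the scripts in hooks/
    return runner(template, output_dir=output_dir, accept_hooks=False)
```

For example, generate_without_hooks("cookiecutter-data-app", "<insert-output-directory>") would scaffold the template from this post without running either hook, so the validation and environment clean-up shown earlier would not take place.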

Closing thoughts

Hooks are a great way to extend what cookiecutter can do in the code repo scaffolding process. As ever in software or data engineering, if you use hooks, follow good practices such as exiting the program with an appropriate exit code and handling errors, for a robust implementation and a better user experience.
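
One way to apply both practices is to wrap the hook's entry point so that every outcome becomes an explicit exit code. The wrapper below is a sketch, and run_hook is a hypothetical name; it relies on cookiecutter treating a non-zero exit code from a hook as a failure.

```python
import sys


def run_hook(action):
    """Run `action` and translate its outcome into a process exit code.

    `run_hook` is a hypothetical wrapper: any exception is reported on
    stderr and mapped to exit code 1 instead of surfacing a raw traceback.
    """
    try:
        result = action()
    except Exception as error:  # report any failure, then signal it
        print(f"ERROR: hook failed: {error}", file=sys.stderr)
        return 1
    # Treat None as success, mirroring a function that simply finishes
    return 0 if result is None else int(result)
```

A hook file could then end with sys.exit(run_hook(main)), which gives the user a short error message rather than a stack trace when something goes wrong.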

Thanks for reading!