Sunday 28 September 2014

Labelling a Play application with Git version information

Background


When you have an active release pipeline and multiple deployment environments, it's often useful to be able to do both of the following:
  1. easily identify the version of a release artifact by name, prior to deployment.
  2. at any time after deployment, ascertain which version of a release artifact is deployed on a given server, without having to access the server's filesystem.
The first point may mean locating a historical build within a build archive, or simply being confident that the most recent deployment artifact was built from the correct source revision, without reference to any build metadata contained within, for example, the zip file.

Obviously giving build artifacts unique file names which reflect the revision of the source code from which they were built would solve this problem.

The second point means being able to examine a deployment's build metadata remotely, without any access to the deployed file name. This might be to verify which features or bug fixes are expected to be present on a certain server in a situation where remote system access is denied to the development and/or test teams. Whatever the motivation, it's usually easier and faster to access this metadata through a simple browser request to the server.

A clean solution to this second problem is to create an externally accessible description of the build information by generating and writing a text file to a specific location as a part of the build process. For this post, this location will be somewhere accessible under the Play web context.

You can access the full code discussed in this post by cloning https://github.com/dextaa/blog.git. This repo contains multiple projects so go to the build/git-labelling directory to access the root of this project. This is referred to as <BASE> from now on.

The application used for this example is a Play application which provides JSON responses to stock quote GET requests. The application itself is irrelevant, however; the focus of this post is the application's build process, and the application exists solely to host this process.

Creating the build descriptor


For this example, the build descriptor will contain information something like this:
Git branch: master
Git head revision: 6460054f0287c57d7cad31f0e0dd0c779d016f1c
Git head tag: Not tagged
Git branch had uncommitted changes: true
Built by: user1
Built at: 28-09-2014 09:22:33
Built using Java version: 1.7.0_51
Built using Scala version: 2.10.4
Built using Play version: 2.3.4 
and it will be written to Play's standard public directory. There's a standard mapping for that directory in conf/routes, which causes the built-in Assets controller to serve content from public under the assets context path:

GET     /assets/*file               controllers.Assets.at(path="/public", file)

such that the build descriptor is available over HTTP, e.g. at http://localhost:9000/assets/build-info.txt.

It could quite reasonably be argued that build and environment information should be protected from casual inspection, in which case you may prefer to write the build descriptor to a directory which is not directly served and provide a secured custom route to access it.

If you run 'sbt test' in <BASE>, the existence and content of this file will be verified by the test BuildMetadataSpec. Because a pre-generated build-info.txt was committed along with the source code, this test will pass immediately but it will also continue to pass after you have run 'gen-build-desc' at the sbt prompt.


The build descriptor data is generated by BuildDescriptor, whose source is in the repository under <BASE>.

The JGit library is at the core of this class. It's a useful library for interacting with Git from within JVM languages and it's used here to get the Git branch name, its head revision's hash and optionally the name of the tag associated with that revision. It's also used to check if the source tree contained any uncommitted changes when the build took place.
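As a rough illustration of the JGit calls involved, here is a minimal sketch. It is not the class from the repository: GitInfoSketch and its method names are made up for this example, and it assumes JGit is on the classpath.

```scala
import java.io.File
import org.eclipse.jgit.api.Git
import org.eclipse.jgit.lib.Repository
import org.eclipse.jgit.storage.file.FileRepositoryBuilder

// Hypothetical helper for illustration only, not the post's BuildDescriptor.
object GitInfoSketch {
  // Searches dir and its parents for a Git directory, then opens it.
  def open(dir: File): Repository =
    new FileRepositoryBuilder().findGitDir(dir).build()

  // Short name of the checked-out branch, e.g. "master".
  def branch(repo: Repository): String = repo.getBranch

  // Full hash of HEAD; None if the repository has no commits yet.
  def headRevision(repo: Repository): Option[String] =
    Option(repo.resolve("HEAD")).map(_.getName)

  // True when the working tree or index differs from HEAD.
  // Note: Status.isClean also counts untracked files as differences.
  def hasUncommittedChanges(repo: Repository): Boolean =
    !new Git(repo).status().call().isClean
}
```

Calling GitInfoSketch.open(new File(".")) from the project root should be enough to obtain a Repository for the other three methods.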

The treatment of tags is worth a closer look.

The allTags val contains a collection of references to all of the tags in the subject Git repository. That collection is then searched (using find) for a tag whose associated commit id matches the id of the head revision. If a match is found, its shortened name (with the Git path removed) is returned (in an Option) as the head revision's tag name.

Matching a tag to a revision requires two code paths because Git has two tag types: lightweight and annotated.

Because a lightweight tag is just a pointer to a commit, simply calling tag.getObjectId returns the id of the tagged commit. This is the first case in the matchesHead method.

In contrast, an annotated tag (created, for example, by adding -a to the 'git tag' command) is a first-class object with its own id within the Git database, so the extra level of indirection shown in the second case, rt.getObject.getId, is required to get the id of the associated commit. Calling rt.getObjectId returns the object id of the tag itself and would therefore never equal the id of the head commit.

Where both a lightweight and an annotated tag are present on the same revision, the one which is selected depends on the ordering of allTags, so if that situation could arise and you always wanted to give priority to annotated tags, you could filter allTags by type RevTag and check the resulting collection for matches first.
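The two cases can be sketched roughly as follows. This is an approximation written for this explanation, not the repository's code: HeadTagSketch and headTagName are hypothetical names, the tag list is obtained through JGit's tagList command (the original may obtain it differently), and ids are compared as hex strings purely to keep the comparison unambiguous.

```scala
import org.eclipse.jgit.api.Git
import org.eclipse.jgit.lib.{Ref, Repository}
import org.eclipse.jgit.revwalk.{RevCommit, RevTag, RevWalk}
import scala.collection.JavaConverters._

// Hypothetical helper for illustration only.
object HeadTagSketch {
  def headTagName(repo: Repository): Option[String] =
    Option(repo.resolve("HEAD")).flatMap { head =>
      val headName = head.getName
      val walk = new RevWalk(repo)
      try {
        def matchesHead(tag: Ref): Boolean =
          walk.parseAny(tag.getObjectId) match {
            // Lightweight tag: the ref points straight at the commit.
            case _: RevCommit => tag.getObjectId.getName == headName
            // Annotated tag: the ref points at a tag object, which in turn
            // points at the tagged commit.
            case rt: RevTag => rt.getObject.getId.getName == headName
            case _ => false
          }

        val allTags = new Git(repo).tagList().call().asScala
        // Strip the "refs/tags/" prefix from the first matching tag.
        allTags.find(matchesHead).map(tag => Repository.shortenRefName(tag.getName))
      } finally walk.dispose()
    }
}
```

To prefer annotated tags, one option would be to search the tags that parse as RevTag first and fall back to the remainder only if none of them matches.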

When buildData is called, it generates a list of tuples, each containing a descriptor entry's key/value. This list is utilised in the sbt task itself.
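To make the shape of that list concrete, here is a small sketch that takes the Git values as plain parameters. BuildDataSketch, its parameter names and the render helper are assumptions made for this example; the real buildData obtains the Git values through JGit, and the Play version entry is omitted here because it needs Play on the classpath.

```scala
import java.text.SimpleDateFormat
import java.util.Date

// Hypothetical stand-in for illustration only.
object BuildDataSketch {
  // Builds the descriptor entries as key/value tuples, in display order.
  def buildData(branch: String, revision: String, tag: Option[String],
                uncommitted: Boolean, builtAt: Date): List[(String, String)] =
    List(
      "Git branch" -> branch,
      "Git head revision" -> revision,
      "Git head tag" -> tag.getOrElse("Not tagged"),
      "Git branch had uncommitted changes" -> uncommitted.toString,
      "Built by" -> System.getProperty("user.name"),
      "Built at" -> new SimpleDateFormat("dd-MM-yyyy HH:mm:ss").format(builtAt),
      "Built using Java version" -> System.getProperty("java.version"),
      "Built using Scala version" -> scala.util.Properties.versionNumberString
    )

  // Formats each tuple as "key: value", one entry per line.
  def render(entries: List[(String, String)]): String =
    entries.map { case (key, value) => s"$key: $value" }.mkString("\n")
}
```

Keeping the data as tuples until the last moment means the same list could be rendered in another format, such as JSON, without touching the code that gathers it.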

The task itself doesn't require much comment. It builds a path to public, relative to the application's base directory in Compile scope (this is Play's standard public directory, served by the default Assets controller). It then takes the output of BuildDescriptor.buildData, as described above, formats it as a string and writes it to a file in that directory.

A task key named 'gen-build-desc' is also declared and is associated with the task in the build.sbt with the statement: buildInfo <<= buildInfoTask.

Because generating this descriptor seems like a build-server-only action, having it run with every local build is unnecessary. Calling the custom task (gen-build-desc) is therefore left as a manual step which could, for example, be added to a Jenkins instance's list of build actions. Alternatively, running 'gen-build-desc' locally at the Play prompt will cause build-info.txt to be regenerated.
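For orientation, a task of this kind might look roughly like the sketch below in sbt 0.13.x syntax. The names buildInfo, buildInfoTask, gen-build-desc and BuildDescriptor.buildData come from the description above; the enclosing object name is made up, and the sketch assumes that buildData takes no arguments, returns pairs of strings and is visible to the build definition (for example, by living under the project directory).

```scala
import sbt._
import sbt.Keys._

// Hypothetical build definition fragment (sbt 0.13.x), for illustration only.
object BuildInfoSketch {
  // The key that makes the task callable as 'gen-build-desc' at the prompt.
  val buildInfo = TaskKey[Unit]("gen-build-desc", "Writes build-info.txt to the public directory")

  // Resolves <base>/public/build-info.txt and writes one "key: value" line per entry.
  val buildInfoTask = (baseDirectory in Compile) map { base =>
    val descriptor = base / "public" / "build-info.txt"
    val content = BuildDescriptor.buildData
      .map { case (key, value) => s"$key: $value" }
      .mkString("\n")
    IO.write(descriptor, content)
  }
}
```

With those definitions imported into build.sbt, the statement buildInfo <<= buildInfoTask wires the key to the task. Later sbt releases drop the <<= operator, so this form applies only to the sbt version discussed in this post.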

Naming the release artifact


For a Play application built using the 'dist' command, there are in fact two artifacts to be named. The first is the actual jar containing the Play application and the second is the zip containing the application jar and all of its dependencies.

It's possible to use sbt's artifactName and a modified version of Play's dist command to control this naming but, as of Play 2.3.x (sbt 0.13.5), simply setting the version attribute of the build in build.sbt so that it includes the Git tag name or revision hash achieves a nice result.

With such a setting, the artifacts would be versioned 1.0-<tag name> or 1.0-<revision hash>, in that order of precedence. For example, if the head revision had been tagged 'release-v1', the resulting file name would be 'git-labelling-1.0-release-v1.zip'.

Note, however, that version is a SettingKey rather than a TaskKey, so its value is fixed on project load. Even if further commits were made to the branch, the revision used in the build descriptor and artifact names would remain unchanged until either sbt was restarted or the 'reload' command was issued at the sbt prompt.
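A version setting along these lines might look like the following build.sbt sketch. The members headTag (an Option[String]) and headRevision (a String) are hypothetical names for whatever BuildDescriptor exposes, and the project name is inferred from the example file name above.

```scala
// build.sbt (sbt 0.13.x), illustrative sketch only.
// BuildDescriptor.headTag and BuildDescriptor.headRevision are hypothetical members.
name := "git-labelling"

// Evaluated once at project load: the tag name if HEAD is tagged,
// otherwise the revision hash.
version := "1.0-" + BuildDescriptor.headTag.getOrElse(BuildDescriptor.headRevision)
```

Because this is a setting, a build server that starts a fresh sbt process for each build will always pick up the current revision; the stale value is only a concern in a long-running interactive session.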

Is it worth doing?

Incremental improvements of this type greatly improve a project's build and deployment process and help it progress towards a point of maturity at which it becomes a stress-free and almost invisible part of getting reliable software in front of your customers. That sounds worth doing!