
ApiCompat has Moved into the .NET SDK

A while ago I wrote about how to detect breaking changes in .NET using Microsoft.DotNet.ApiCompat. Since then, ApiCompat has moved into the .NET SDK.

What has Changed?

Since ApiCompat is now part of the .NET SDK, the Arcade package feed no longer needs to be referenced.

<PropertyGroup>
  <RestoreAdditionalProjectSources>
    https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-eng/nuget/v3/index.json;
  </RestoreAdditionalProjectSources>
</PropertyGroup>

The package reference can also be removed.

<ItemGroup>
  <PackageReference Include="Microsoft.DotNet.ApiCompat" Version="7.0.0-beta.22115.2" PrivateAssets="All" />
</ItemGroup>

In order to continue running assembly validation from MSBuild, install the Microsoft.DotNet.ApiCompat.Task package:

<ItemGroup>
  <PackageReference Include="Microsoft.DotNet.ApiCompat.Task" Version="8.0.404" PrivateAssets="all" IsImplicitlyDefined="true" />
</ItemGroup>

Enable assembly validation:

<PropertyGroup>
  <ApiCompatValidateAssemblies>true</ApiCompatValidateAssemblies>
</PropertyGroup>

The contract assembly reference directive has also changed: the old ResolvedMatchingContract item is replaced by the ApiCompatContractAssembly property.

<ItemGroup>
  <ResolvedMatchingContract Include="LastMajorVersionBinary/lib/$(TargetFramework)/$(AssemblyName).dll" />
</ItemGroup>

<PropertyGroup>
  <ApiCompatContractAssembly>LastMajorVersionBinary/lib/$(TargetFramework)/$(AssemblyName).dll</ApiCompatContractAssembly>
</PropertyGroup>

The property that controls suppression of breaking changes has also been renamed, from BaselineAllAPICompatError to ApiCompatGenerateSuppressionFile.

<PropertyGroup>
  <!-- Before -->
  <BaselineAllAPICompatError>false</BaselineAllAPICompatError>
  <!-- After -->
  <ApiCompatGenerateSuppressionFile>false</ApiCompatGenerateSuppressionFile>
</PropertyGroup>

That’s it, you’re good to go!

Compatibility Baseline / Suppression directives

Previously the suppression file, ApiCompatBaseline.txt, contained text directives describing suppressed compatibility issues.

Compat issues with assembly Kafka.Protocol:
TypesMustExist : Type 'Kafka.Protocol.ConsumerGroupHeartbeatRequest.Assignor' does not exist in the implementation but it does exist in the contract.

This format has changed to an XML based format, written by default to a file called CompatibilitySuppressions.xml.

<?xml version="1.0" encoding="utf-8"?>
<Suppressions xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Suppression>
    <DiagnosticId>CP0001</DiagnosticId>
    <Target>T:Kafka.Protocol.CreateTopicsRequest.CreatableTopic.CreateableTopicConfig</Target>
    <Left>LastMajorVersionBinary/lib/netstandard2.1/Kafka.Protocol.dll</Left>
    <Right>obj\Debug\netstandard2.1\Kafka.Protocol.dll</Right>
  </Suppression>
</Suppressions>

This format is more verbose than the old one and, if you ask me, a bit harder to read from a human perspective. The descriptions of the various DiagnosticIds can be found in this list.

Path Separator Mismatches

Looking at the suppression example, you might notice that the suppressions contain references to both the compared assembly and the baseline contract. It’s not a coincidence that the path separators differ between the reference to the contract assembly and the reference to the assembly being compared. The Left reference is a templated copy of the ApiCompatContractAssembly directive using OS-agnostic forward slashes, but the Right reference is generated by ApiCompat and is not OS-agnostic, hence the backslash path separators when executing under Windows. If ApiCompat is executed under Linux, it generates forward slash path separators instead.

You might also notice that the reference to the assembly being compared contains the build configuration name. This might not match the build configuration used in a build pipeline, for example (Debug vs. Release).

Both of these path differences will cause ApiCompat to ignore the suppressions when the paths don’t match. There is no documentation on how to consolidate them, but fortunately there are a couple of somewhat hidden transformation directives that can help control how these paths are formatted.

<PropertyGroup>
  <_ApiCompatCaptureGroupPattern>.+%5C$([System.IO.Path]::DirectorySeparatorChar)(.+)%5C$([System.IO.Path]::DirectorySeparatorChar)(.+)</_ApiCompatCaptureGroupPattern>
</PropertyGroup>

<ItemGroup>
  <!-- Make sure the Right suppression directive is OS-agnostic and disregards configuration -->
  <ApiCompatRightAssembliesTransformationPattern Include="$(_ApiCompatCaptureGroupPattern)" ReplacementString="obj/$1/$2" />
</ItemGroup>

The _ApiCompatCaptureGroupPattern regex directive captures path segment groups which can be used in the ApiCompatRightAssembliesTransformationPattern directive to rewrite the assembly reference path into something that is compatible with both Linux and Windows and removes the build configuration segment.

Using this will cause the Right directive to change accordingly.

<?xml version="1.0" encoding="utf-8"?>
<Suppressions xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Suppression>
    <DiagnosticId>CP0001</DiagnosticId>
    <Target>T:Kafka.Protocol.CreateTopicsRequest.CreatableTopic.CreateableTopicConfig</Target>
    <Left>LastMajorVersionBinary/lib/netstandard2.1/Kafka.Protocol.dll</Left>
    <Right>obj/netstandard2.1/Kafka.Protocol.dll</Right>
  </Suppression>
</Suppressions>

There is a similar directive for the Left reference, named ApiCompatLeftAssembliesTransformationPattern.


Parsing OpenAPI Style Parameters

Parameters in the OpenAPI 3.1 specification can be defined in two ways using a JSON schema: either with a media type object, which is useful when the parameter is complex to describe, or with styles, which is more common in simple scenarios, as is often the case with HTTP parameters.

Let’s have a look at the styles defined and where they fit into an HTTP request.

Path Parameters

Path parameters are parameters defined in a path, for example id in the path /user/{id}. Path parameters can be described using label, matrix or simple styles, which are all defined by RFC6570, URI Template.

Here are some examples using the JSON primitive value 1 for a parameter named id:

  • Simple: /user/1
  • Matrix: /user/;id=1
  • Label: /user/.1

It’s also possible to describe arrays. Using the same parameter as above with two JSON primitive values, 1 and 2, it gets serialized as:

  • Simple: /user/1,2
  • Matrix: /user/;id=1,2
  • Label: /user/.1.2

Given a JSON object for a parameter named user with the value { "id": 1, "name": "foo" }, it becomes:

  • Simple: /user/id,1,name,foo
  • Matrix: /user/;user=id,1,name,foo
  • Label: /user/.id.1.name.foo

The explode modifier can be used to enforce composite values (name/value pairs). For primitive values it has no effect, nor does it have any effect on simple or label style arrays. Using the examples above, here are the equivalents for the styles where explode has an effect (a small serialization sketch follows the examples):

Arrays

  • Matrix: /user/;id=1;id=2

Objects

  • Simple: /user/id=1,name=foo
  • Matrix: /user/;id=1;name=foo
  • Label: /user/.id=1.name=foo
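
To make the styles concrete, here is a minimal C# sketch, not tied to any particular library and using made-up names, that serializes an object-valued path parameter into the three styles, with and without explode:

using System;
using System.Collections.Generic;
using System.Linq;

// Serializes an object value (modelled as a string dictionary) into a path
// segment according to the simple, label or matrix style.
static string SerializePathParameter(
    string name,
    IReadOnlyDictionary<string, string> value,
    string style,
    bool explode)
{
    // explode = false flattens the object into name,value,name,value pairs,
    // explode = true keeps name=value pairs.
    var pairs = explode
        ? value.Select(p => $"{p.Key}={p.Value}")
        : value.SelectMany(p => new[] { p.Key, p.Value });

    return style switch
    {
        "simple" => string.Join(",", pairs),
        "label" => "." + string.Join(".", pairs),
        "matrix" => explode
            ? ";" + string.Join(";", pairs)
            : $";{name}=" + string.Join(",", pairs),
        _ => throw new ArgumentOutOfRangeException(nameof(style))
    };
}

var user = new Dictionary<string, string> { ["id"] = "1", ["name"] = "foo" };
Console.WriteLine(SerializePathParameter("user", user, "simple", explode: false)); // id,1,name,foo
Console.WriteLine(SerializePathParameter("user", user, "matrix", explode: true));  // ;id=1;name=foo
Console.WriteLine(SerializePathParameter("user", user, "label", explode: true));   // .id=1.name=foo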

Query Parameters

Query parameters can be described with the form, space delimited, pipe delimited or deep object styles. The form style is defined by RFC6570; the rest are defined by OpenAPI.

Primitives

Reusing the examples from the path parameters, a serialized user, ?user={user}, or a user id, ?id={id}, defined as a query parameter would look like:

  • Form: id=1

Note that the examples don’t describe any primitive values for the pipe and space delimited styles, even though they are quite similar to the simple style.

Arrays

  • Form: id=1,2
  • SpaceDelimited: id=1%202
  • PipeDelimited: id=1|2

Objects

  • Form: user=id,1,name,foo
  • SpaceDelimited: user=id%201%20name%20foo
  • PipeDelimited: user=id|1|name|foo

Note that the examples in the specification lack the parameter name for arrays and objects; this has been corrected in 3.1.1.

The deepObject style is not applicable without explode, and as with the path styles, explode has no effect on primitive values.

Exploded pipe and space delimited parameters are not described in the examples, even though they are similar to form. Do note though that neither of them would be possible to parse, as the parameter name cannot be inferred.

With all this in mind, here are the respective explode examples (a parsing sketch for the deepObject style follows them):

Arrays

  • Form: id=1&id=2
  • SpaceDelimited: id=1%202
  • PipeDelimited: id=1|2

Objects

  • Form: id=1&name=foo
  • SpaceDelimited: id=1%20name=foo
  • PipeDelimited: id=1|name=foo
  • DeepObject: user[id]=1&user[name]=foo
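
As a rough illustration of parsing, here is a C# sketch that turns an exploded deepObject query string into a JSON object. The behavior is simplified for the example and is not how any particular library implements it:

using System;
using System.Text.Json.Nodes;

// Parses an exploded deepObject query string such as "user[id]=1&user[name]=foo"
// into a JSON object for the given parameter name. Values are kept as strings
// and no percent-decoding or schema-based typing is applied.
static JsonObject ParseDeepObject(string queryString, string parameterName)
{
    var result = new JsonObject();
    foreach (var pair in queryString.Split('&'))
    {
        var parts = pair.Split('=', 2);
        var key = parts[0];
        if (!key.StartsWith(parameterName + "[") || !key.EndsWith("]"))
            continue; // belongs to another parameter

        var property = key[(parameterName.Length + 1)..^1];
        result[property] = parts.Length > 1 ? parts[1] : null;
    }
    return result;
}

var user = ParseDeepObject("user[id]=1&user[name]=foo", "user");
Console.WriteLine(user.ToJsonString()); // {"id":"1","name":"foo"}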

Header Parameters

Header parameters can only be described using the simple style.

Given the headers user: {user} and id: {id}, the respective header parameter values using the simple style would look like:

Primitives

  • Simple: 1

Arrays

  • Simple: 1,2

Objects

  • Simple: id,1,name,foo

Similar to the other parameter types described, explode has no effect on primitive values, nor on arrays. For objects it would look like:

  • Simple: id=1,name=foo

Cookie Parameters

A cookie parameter can only be described with the form style and is represented in a similar way to query parameters. Using the examples Cookie: id={id} and Cookie: user={user}, a cookie parameter value would look like:

Primitives

  • Form: id=1

Arrays

  • Form: id=1,2

Objects

  • Form: user=id,1,name,foo

Similar to the other parameter types described, explode has no effect on primitive values. For arrays and objects it looks like:

Arrays

  • Form: id=1&id=2

Objects

  • Form: id=1&name=foo

Note that exploded objects for cookie parameters have the same problem as query parameters; the parameter name cannot be inferred.

Object Complexity

Theoretically an object can have an endlessly deep property structure, where each property is an object that also has properties that are objects, and so on. Neither RFC6570 nor OpenAPI 3.1 defines how deep a structure can be, but it would be difficult to describe array items and object properties that are themselves objects in most styles.

As OpenAPI provides media type objects as a complement for complex parameters, it’s advisable to use those instead in such scenarios.

OpenAPI.ParameterStyleParsers

To support parsing and serialization of style-defined parameters, I’ve created a .NET library, OpenAPI.ParameterStyleParsers. It parses style-serialized parameters into the corresponding JSON instance and vice versa. It supports all the examples defined in the OpenAPI 3.1 specification, corrected according to the inconsistencies described earlier. It only supports arrays with primitive item values and objects with primitive property values; for more complex scenarios, use media type objects. The JSON instance type the parameter gets parsed to is determined by the schema's type keyword. If no type information is defined, it falls back on a best-effort guess based on the parameter style.


ConfigureAwait or Not

I often get into discussions about whether you should disable continuing on the captured context when awaiting a task or not, so I’m going to write down some of my reasoning around this rather complex functionality. Let’s start with some fundamentals first.

async/await

Tasks wrap operations and schedule them using a task scheduler. The default scheduler in .NET schedules tasks on the ThreadPool. async and await are syntactic sugar telling the compiler to generate a state machine that keeps track of the state of the task. It moves forward by tracking the current awaiter’s completion state and the current location of execution. If the awaiter is completed, it continues to the next awaiter; otherwise it asks the current awaiter to schedule the continuation.

If you are interested in deep-diving what actually happens with async and await, I recommend this very detailed article by Stephen Toub.

What is the Synchronization Context?

Different components have different models for how scheduling of operations needs to be synchronized; this is where the synchronization context comes into play. The default SynchronizationContext synchronizes operations on the thread pool, while others might use other means.

The most common awaiters, like those implemented for Task and ValueTask, consider the current SynchronizationContext when scheduling an operation’s continuation. When the state machine moves forward executing an operation, it goes via the current awaiter, which, when not completed, might schedule the state machine’s current continuation via SynchronizationContext.Post.
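
To make that concrete, here is a small sketch with a custom context that logs whenever a continuation is posted to it; since it is the current SynchronizationContext at the await, the Task awaiter schedules the continuation through its Post method:

using System;
using System.Threading;
using System.Threading.Tasks;

// A toy synchronization context that just logs every continuation posted to it.
// Real contexts (a UI dispatcher, for example) would instead marshal the
// callback onto a specific thread.
sealed class LoggingSynchronizationContext : SynchronizationContext
{
    public override void Post(SendOrPostCallback d, object? state)
    {
        Console.WriteLine("Continuation posted to the captured context");
        base.Post(d, state); // the default implementation queues to the thread pool
    }
}

class Program
{
    static async Task Main()
    {
        SynchronizationContext.SetSynchronizationContext(new LoggingSynchronizationContext());

        // The task has not completed yet, so the awaiter captures the current
        // context and schedules the continuation via SynchronizationContext.Post.
        await Task.Delay(100);

        Console.WriteLine("Continued after the awaited task");
    }
}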

ConfigureAwait

The only thing this method actually does is wrap the current awaiter with its single argument, continueOnCapturedContext. This argument tells the awaiter whether to use any configured custom synchronization context or task scheduler when scheduling the continuation. Turned off, i.e. ConfigureAwait(false), it simply bypasses them and schedules on the default scheduler, i.e. the thread pool. If there is no custom synchronization context or scheduler, ConfigureAwait becomes a no-op. The same applies if the awaiter doesn't need queueing when the state machine reaches the awaitable, i.e. the task has already completed.

Continue on Captured Context or not?

If you know there is a context that any continuation must run on, for example a UI thread, then yes, the continuation must be configured to capture the current context. As this is the default behavior it’s not technically required to declare it, but explicitly configuring the continuation signals to the next developer that the continuation matters here. If configured implicitly, there is nothing hinting that it wasn’t just a mistake to leave it out, or that the continuation was ever considered or understood.

In the most common scenario, though, the current context is not relevant. We can declare that by explicitly stating that the continuation doesn't need to run on any captured context, i.e. ConfigureAwait(false).
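
As an illustration (the types and endpoint below are made up), library-style code that has no affinity to any context can opt out of capturing it, while UI-style code keeps the captured context so the continuation runs back on the UI thread:

using System;
using System.Net.Http;
using System.Threading.Tasks;

public class WeatherClient
{
    private readonly HttpClient _httpClient = new();

    // Library-style code: the continuation doesn't care where it runs,
    // so bypass any captured context and continue on the thread pool.
    public async Task<string> GetForecastAsync()
    {
        var response = await _httpClient
            .GetAsync("https://example.com/forecast")
            .ConfigureAwait(false);

        return await response.Content
            .ReadAsStringAsync()
            .ConfigureAwait(false);
    }
}

// UI-style code (e.g. a button click handler) touches UI state after the await,
// so it must continue on the captured UI context. Stating it explicitly signals
// that the choice was deliberate:
//
//   var forecast = await _weatherClient.GetForecastAsync().ConfigureAwait(true);
//   _statusLabel.Text = forecast; // must run on the UI thread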

Enforcing ConfigureAwait

Since configuring the continuation is not required, it's easy to forget. Fortunately there is a Roslyn analyzer that can be enabled to enforce that all awaiters are configured.

Summary

Always declare ConfigureAwait to show that the continuation behavior has been explicitly considered. Only continue on a captured context if there is a good reason to do so; otherwise, reap the benefits of executing on the thread pool.


OpenAPI Evaluation

JSON Schema has a validation vocabulary which can be used to set constraints on JSON structures. OpenAPI uses JSON Schemas to describe parameters and content, so wouldn't it be nice to be able to evaluate HTTP request and response messages?

OpenAPI.Evaluation is a .NET library which can evaluate HTTP request and response messages according to an OpenAPI specification. JSON schemas are evaluated using JsonSchema.NET together with the JsonSchema.Net.OpenApi vocabulary. It supports the standard HttpRequestMessage and HttpResponseMessage and comes with a DelegatingHandler, OpenApiEvaluationHandler, which can be used by HttpClient to intercept and evaluate requests going out and responses coming in according to an OpenAPI 3.1 specification. It's also possible to manually evaluate requests and responses by traversing the parsed OpenAPI specification and feeding its evaluators with the corresponding extracted content.
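
As a sketch of the HttpClient integration, the handler is wired up like any other DelegatingHandler. The namespace and the OpenApiEvaluationHandler constructor shown here are assumptions based on the package name; check the project's documentation for the exact API.

using System;
using System.IO;
using System.Net.Http;
using System.Text.Json.Nodes;
using OpenAPI.Evaluation; // namespace is an assumption

// Parse the OpenAPI 3.1 specification document.
var specification = OpenAPI.Evaluation.Specification.OpenAPI.Parse(
    JsonNode.Parse(File.OpenRead("openapi.json")));

// Wire the delegating handler into an HttpClient so outgoing requests and
// incoming responses are evaluated against the specification.
using var client = new HttpClient(
    new OpenApiEvaluationHandler(specification) // constructor signature is an assumption
    {
        InnerHandler = new HttpClientHandler()
    });

var response = await client.GetAsync("https://localhost/v1/user/1");
Console.WriteLine(response.StatusCode);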

ASP.NET

OpenAPI.Evaluation.AspNet integrates OpenAPI.Evaluation with the ASP.NET request pipeline to enable server-side evaluation. It comes with extension methods to evaluate the HttpRequest and HttpResponse abstractions. It also supports integration via the HttpContext, enabling easy access for ASP.NET request and response pipelines and controllers. A middleware is provided for simple integration and can be enabled via the application builder and service collection.

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddOpenApiEvaluation(OpenAPI.Evaluation.Specification.OpenAPI.Parse(JsonNode.Parse(File.OpenRead("openapi.json"))));
var app = builder.Build();
// Registers the middleware into the request pipeline
app.UseOpenApiEvaluation();

To evaluate directly from a request pipeline, the OpenAPI specification first needs to be loaded and registered as described above. Extension methods for HttpContext can then be used for request and response evaluation:

var requestEvaluationResult = context.EvaluateRequest();
...
var responseEvaluationResult = context.EvaluateResponse(200, responseHeaders, responseContent);

Evaluation Result

JsonSchema.NET implements the JSON Schema output format, which OpenAPI.Evaluation is influenced by. The OpenAPI specification doesn't define annotations the way JSON Schema does, so I decided to adopt something similar. The evaluation result contains information describing each specification object traversed and what path the evaluation process took through the specification. An example produced by the default evaluation result JSON converter is shown below; it uses a hierarchical output format.

{
  "valid": false,
  "evaluationPath": "",
  "specificationLocation": "http://localhost/#",
  "details": [
    {
      "valid": false,
      "evaluationPath": "/paths",
      "specificationLocation": "http://localhost/#/paths",
      "details": [
        {
          "valid": false,
          "evaluationPath": "/paths/~1user~1{user-id}",
          "specificationLocation": "http://localhost/#/paths/%7e1user%7e1%7buser-id%7d",
          "details": [
            {
              "valid": false,
              "evaluationPath": "/paths/~1user~1{user-id}/get",
              "specificationLocation": "http://localhost/#/paths/%7e1user%7e1%7buser-id%7d/get",
              "details": [
                {
                  "valid": false,
                  "evaluationPath": "/paths/~1user~1{user-id}/get/parameters",
                  "specificationLocation": "http://localhost/#/paths/%7e1user%7e1%7buser-id%7d/get/parameters",
                  "details": [
                    {
                      "valid": false,
                      "evaluationPath": "/paths/~1user~1{user-id}/get/parameters/0",
                      "specificationLocation": "http://localhost/#/paths/%7e1user%7e1%7buser-id%7d/get/parameters/0",
                      "details": [
                        {
                          "valid": false,
                          "evaluationPath": "/paths/~1user~1{user-id}/get/parameters/0/$ref/components/parameters/user/content",
                          "specificationLocation": "http://localhost/#/components/parameters/user/content",
                          "details": [
                            {
                              "valid": false,
                              "evaluationPath": "/paths/~1user~1{user-id}/get/parameters/0/$ref/components/parameters/user/content/application~1json",
                              "specificationLocation": "http://localhost/#/components/parameters/user/content/application%7e1json",
                              "details": [
                                {
                                  "valid": false,
                                  "evaluationPath": "/paths/~1user~1{user-id}/get/parameters/0/$ref/components/parameters/user/content/application~1json/schema",
                                  "specificationLocation": "http://localhost/#/components/parameters/user/content/application%7e1json/schema",
                                  "schemaEvaluationResults": [
                                    {
                                      "valid": false,
                                      "evaluationPath": "",
                                      "schemaLocation": "http://localhost#",
                                      "instanceLocation": "",
                                      "errors": {
                                        "required": "Required properties [\"first-name\"] are not present"
                                      },
                                      "details": [
                                        {
                                          "valid": true,
                                          "evaluationPath": "/properties/last-name",
                                          "schemaLocation": "http://localhost/#/properties/last-name",
                                          "instanceLocation": "/last-name"
                                        }
                                      ]
                                    }
                                  ]
                                }
                              ]
                            }
                          ]
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "valid": true,
      "evaluationPath": "/servers",
      "specificationLocation": "http://localhost/#/servers",
      "details": [
        {
          "valid": true,
          "evaluationPath": "/servers/0",
          "specificationLocation": "http://localhost/#/servers/0",
          "annotations": {
            "url": "http://localhost/v1",
            "description": "v1"
          }
        }
      ]
    }
  ]
}

Parameter Value Parsers

Headers, path values, query strings and cookies can be described in an OpenAPI specification using a combination of instructive metadata, like styles, and schemas. It’s designed to cater for simple data structures and is complemented by content media types for more complex scenarios.

OpenAPI.Evaluation supports all the styles described in the specification, but the specification doesn't explicitly define how complex the supported scenarios should be; that is left to implementors to decide. In order to cater for more complex scenarios, it's possible to define custom parsers per parameter by implementing IParameterValueParser and registering it when parsing the OpenAPI specification.

OpenAPI.Evaluation.Specification.OpenAPI.Parse(jsonDocument, parameterValueParsers: new[] { customParameterValueParser });

DynamoDB as an Event Store

I’ve been wondering how well Amazon DynamoDB would fit an event store implementation. There were two different designs I wanted to explore, both described in this article, of which I implemented one. The source code is available on GitHub, including a NuGet package published to nuget.org.

Event Store Basics

An event store is a storage concept that stores events in chronological order. These events can describe business-critical changes within a domain aggregate. Besides storing the events of the aggregate, an aggregate state can be stored as a snapshot to avoid reading all events each time the aggregate needs to be rebuilt. This can boost performance both in terms of latency and the amount of data that needs to be transferred.

DynamoDB

DynamoDB is a large-scale distributed NoSQL database that can handle millions of requests per second. It's built to handle structured documents grouped in partitions, where items can be stored in order within a partition key. This has obvious similarities to what an event stream for an aggregate looks like. Seems promising!

Another neat DynamoDB feature is DynamoDB Streams and Kinesis Data Streams for DynamoDB, both of which can stream changes in a table to various other AWS services and clients. No need to implement an outbox and integrate with a separate message broker. Add point-in-time recovery and it is possible to stream the whole event store at any time!

Separated Snapshot and Events

Let’s start with the first design, which uses composite keys to store snapshots and events grouped by aggregate.

The events are grouped into commits to create an atomic unit with consistency guarantees without using transactions. Commits use a monotonically increasing id as the sort key, while the snapshot uses zero. Since sort keys determine in what order items are stored, the commits are ordered chronologically with the snapshot leading, meaning the commits of events can be fetched with a range query while the snapshot can be fetched separately. It makes little sense to fetch them all, even though that would also be possible. The snapshot includes the sort key of the commit it represents up until, in order to know where to start querying for any events that have not yet been applied to the snapshot.
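
A sketch of the read path with the AWS SDK for .NET (table, key and attribute names here are made up for illustration): fetch the snapshot at sort key 0, then range-query the commits written after the commit it represents.

using System;
using System.Collections.Generic;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;

var client = new AmazonDynamoDBClient();

// 1. Fetch the snapshot, stored at sort key 0.
var snapshot = await client.GetItemAsync(new GetItemRequest
{
    TableName = "EventStore",
    Key = new Dictionary<string, AttributeValue>
    {
        ["PK"] = new AttributeValue { S = "order-123" },
        ["SK"] = new AttributeValue { N = "0" }
    }
});

// 2. Range-query the commits written after the commit the snapshot represents.
var snapshotVersion = snapshot.Item.TryGetValue("version", out var version) ? version.N : "0";
var commits = await client.QueryAsync(new QueryRequest
{
    TableName = "EventStore",
    KeyConditionExpression = "PK = :pk AND SK > :after",
    ExpressionAttributeValues = new Dictionary<string, AttributeValue>
    {
        [":pk"] = new AttributeValue { S = "order-123" },
        [":after"] = new AttributeValue { N = snapshotVersion }
    }
});
Console.WriteLine($"Commits to apply on top of the snapshot: {commits.Count}");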

Do note that there is an item size limit of 400 KB in DynamoDB, which should be more than enough to represent both commits and snapshots, but as both are stored as opaque binary data they can be compressed. Besides lowering the size of the items and the round-trip latency, this can also lower read and write cost.

When storing a commit, the sort key is monotonically increased. This is predictable and can therefore be used as a condition to introduce both optimistic concurrency and idempotency, preventing inconsistency when multiple competing consumers write events at the same time or a network error occurs during a write operation.

"ConditionExpression": "attribute_not_exists(PK)"

A snapshot can be stored once it becomes smaller than the accumulated non-snapshotted commits, to save cost and lower latency. As snapshots are detached from the event stream, it doesn't matter whether storing one after a commit succeeds or fails. If it fails, the write operation can be re-run at any time. Updating a snapshot keeps the guarantees for optimistic concurrency and idempotency by only writing if the version the snapshot points to is higher than that of the currently stored snapshot, or if the attribute is missing altogether, which means no snapshot exists.

"ConditionExpression": "attribute_not_exists(version) OR version < :version"

More about conditional writes can be found here.

This was the solution I chose to implement!

Interleaving Snapshots and Events

This was an alternative I wanted to try out, interleaving snapshots and events in one continuous stream.

The idea was to require only a single request to fetch both the snapshot and the trailing, non-snapshotted events, lowering the number of round trips to DynamoDB and increasing possible throughput. Reading commits and the latest snapshot would be done by reading in reverse chronological order until a snapshot is found.

This however presents a problem. If a snapshot is stored after, say, every 10th commit, 11 items have to be queried to avoid multiple round trips, even though the first item could be a snapshot, making the other 10 items redundant. Furthermore, there are no guarantees about when a snapshot gets written, hence there is no way to know upfront exactly how many items to read to reach the latest snapshot.

Another problem is that all snapshots have to be read when reading the whole event stream.

Conclusion

DynamoDB turns out to be quite a good candidate for the job as a persistence engine for an event store. It supports the design of an ordered list of events including snapshots and has the capability to stream the events to other services. Replaying a single aggregate can be done with a simple range query, and its schemaless design makes it easy to store both commits and snapshots in the same table. Its distributed nature enables almost limitless scalability, and the fact that it is a managed service makes operating it a breeze.

Very nice indeed!


Detecting Breaking Changes

When integrating with other applications and libraries, it's important to detect when APIs change in an incompatible way, as that might cause downtime for the downstream application or library. SemVer is a popular versioning strategy that can hint about breaking changes by bumping the major version part, but as an upstream application developer it can be difficult to detect that a code change is in fact a breaking change.

ApiCompat

Microsoft.DotNet.ApiCompat is a tool built for .NET that can compare two assemblies for API compatibility. It’s built and maintained by the .NET Core team for usage within the Microsoft .NET development teams, but it’s also open source and available to anyone.

Installation

The tool is provided as a NuGet package on the .NET Core team's NuGet feed, which is not nuget.org (which most package managers reference by default) but a custom NuGet server hosted in Azure. The URL to the feed needs to be explicitly specified in the project that wants to use it.

In the project file, add a reference to the NuGet feed in a PropertyGroup under the Project element:

<PropertyGroup>
  <RestoreAdditionalProjectSources>
    https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-eng/nuget/v3/index.json;
  </RestoreAdditionalProjectSources>
</PropertyGroup>

Add a reference to the Microsoft.DotNet.ApiCompat package:

<ItemGroup>
  <PackageReference Include="Microsoft.DotNet.ApiCompat" Version="7.0.0-beta.22115.2" PrivateAssets="All" />
</ItemGroup>

The most recent version of the package can be found by browsing the feed.

Usage

The tool can execute as part of the build process and fail the build if the current source code contains changes that are not compatible with a provided contract, for example an assembly from the latest major release.

While it is possible to commit a contract assembly to Git, another approach is to automatically fetch it from its source early in the build process. This example uses a NuGet feed as the release source, but it could just as well be an asset in a GitHub release, or something else.

<Target Name="DownloadLastMajorVersion" BeforeTargets="PreBuildEvent">
  <DownloadFile SourceUrl="https://www.nuget.org/api/v2/package/MyLibrary/2.0.0" DestinationFolder="LastMajorVersionBinary">
    <Output TaskParameter="DownloadedFile" PropertyName="LastMajorVersionNugetPackage" />
  </DownloadFile>
  <Unzip SourceFiles="$(LastMajorVersionNugetPackage)" DestinationFolder="LastMajorVersionBinary" />
</Target>

This will download a NuGet package to the folder LastMajorVersionBinary in the project directory using the DownloadFile task. If the directory doesn't exist, it will be created. If the file already exists and has not been changed, this becomes a no-op.

The Unzip task will then unpack the .nupkg file to the same directory. Same thing here: if the files already exist, this is a no-op.

The last step is to instruct ApiCompat to use the unpacked assembly file as the contract that the source code will be compared with. This is done by setting the ResolvedMatchingContract item, which is the only setting required to run the tool.

<ItemGroup>
  <ResolvedMatchingContract Include="LastMajorVersionBinary/lib/$(TargetFramework)/$(AssemblyName).dll" />
</ItemGroup>

The path points to where the assembly file is located in the unpacked NuGet package directory.

Building the project will now download the contract and execute ApiCompat with the default settings. Remember to use proper access modifiers when writing code: internal is by default not treated as part of the public contract1, as the referenced member or type cannot be reached by other assemblies, unlike public.

1 InternalsVisibleToAttribute can make internal members and types accessible to other assemblies and should be used with care.
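
For illustration, a hypothetical library type: only the public surface below is part of the contract that ApiCompat compares, while the internal member stays out of it unless explicitly exposed to another assembly.

using System.Runtime.CompilerServices;

// Grants a specific assembly (here a test project) access to internal members.
// Use with care - it widens what other assemblies can depend on.
[assembly: InternalsVisibleTo("MyLibrary.Tests")]

namespace MyLibrary;

public class OrderService            // part of the public contract ApiCompat compares
{
    public void Place(Order order) { /* ... */ }

    internal void Recalculate() { }  // not part of the contract by default
}

public record Order(int Id);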

Handling a Breaking Change

Breaking changes should be avoided as much as possible, as they break integrations downstream. This is true for both libraries and applications. An indication of a breaking change should most of the time be identified and mitigated in a non-breaking fashion, like adding another API and deprecating the current one. But at some point the old API needs to be removed, and thus a breaking change is introduced, which will fail the build.

The property BaselineAllAPICompatError can be used to accept breaking changes. The specification of the breaking changes will be written to a file called ApiCompatBaseline.txt in the root of the project. ApiCompat uses it to ignore these tracked incompatible changes from now on. This property should only be set when a breaking change should be accepted and should result in a new major release.

<PropertyGroup>
  <BaselineAllAPICompatError>true</BaselineAllAPICompatError>
</PropertyGroup>

Once a new build has been executed and the new contract baseline has been established, remember to set the property to false or remove it.

The baseline file should be committed to source control, where it can be referenced as documentation for the breaking change, for example by including it in a BREAKING CHANGE conventional commit message.

Once a new major release has been published, there are two options.

  1. Reference the assembly in the new major release as the contract and delete ApiCompatBaseline.txt.
  2. Do nothing 🙂

As git can visualize diffs between two references, the content of the ApiCompatBaseline.txt will show all the breaking changes between two version tags, which can be quite useful.

Summary

ApiCompat is a great tool for automating the detection of breaking changes during development. It avoids introducing incompatible changes and potential headaches for downstream consumers, both for applications and libraries.

A complete example of the mentioned changes is available here.


Trunk Based Release Versioning – The Simple Way

A while ago I wrote an article about release versioning using Trunk-Based Development, where I used GitVersion for calculating the release version. Since then I have experienced multiple issues with that setup, up to the point where I had enough and decided to write my own release versioning script, which I simply call Trunk-Based Release Versioning!

There is no complex configuration; it's just a bash script that can be executed in a git repository. It follows the Trunk-Based Development branching model and identifies previous release versions by looking for SemVer-formatted tags. It outputs at which commits the next release starts and ends, and includes the version and the commit messages. Bumping the version is done by writing commit messages that follow the Conventional Commits specification.

Trunk-Based Development For Smaller Teams

This is the simplest strategy; there is only the default branch. A release contains everything between two release tags.

Scaled Trunk-Based Development

Development is done on short-lived development branches. It is possible, and preferable, to continuously release by merging into the default branch. All commits between two release tags make up a release, the same as for the previous strategy, but it is also possible to pre-release from the development branch, as pictured by the red boxes. A pre-release contains all commits on the branch since the last release, and when merging, it all becomes the new release. Note that previous merge commits from the development branch onto the default branch are also considered release points, in order to be able to continuously work from a single branch. This also works when merging the opposite way, from the default branch into the development branch, for example to enforce gated check-ins to the default branch.

For a full example using GitHub Actions, see this workflow.

That’s it! Keep releasing simple.


In-Memory Service Integration Tests

One concept I always use when writing applications or libraries is to test them using fast, reliable and automated integration tests.

The Test Trophy

You have probably heard about the test pyramid before, but if you haven’t heard about the test trophy, you should read this article.

Coming from the .NET world, I have come to rely on real-time code analyzers such as Roslyn and ReSharper; they help me write cleaner code faster and with little effort. It's great having continuous feedback on code being written more or less instantly.

For similar reasons I use NCrunch for continuous test execution. I want instant feedback on the functionality I write. This means I need tests that are isolated and stable, run fast and continuously and test business functionality as realistically as possible.

Integration Tests

While unit tests are valuable when writing small isolated pieces of functionality, they are often too isolated to verify the bigger picture, hence the need for integration tests.

In order for integration tests to give near real-time feedback, they need to work similarly to unit tests yet test real business scenarios end-to-end using stable integration points that do not break when refactoring. This is where in-memory service integration tests come in.

  • In-memory – because it’s fast and can run anywhere.
  • Service – because it exposes stable APIs that don’t break compatibility.
  • Integrations – because they also use stable APIs and are pluggable.

These tests encapsulate an in-memory environment where the service is deployed to a host that acts similar to a real host, through which the test can invoke the service. Third-party in-memory representations of integrating services are used to mimic dependencies and observe the service's behavior. The test short-circuits the service's I/O integrations, preferably as close to the network level as possible, and redirects traffic back to itself for instrumentation and assertions.
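
A minimal sketch of the idea using ASP.NET Core's in-memory TestServer through WebApplicationFactory (the Program entry point, the /health endpoint and the xUnit usage are assumptions for the example):

using System.Net;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

// The whole HTTP pipeline runs in memory, so the test exercises the service
// end-to-end through its stable, public API without any network or external processes.
public class HealthEndpointTests : IClassFixture<WebApplicationFactory<Program>>
{
    private readonly WebApplicationFactory<Program> _factory;

    public HealthEndpointTests(WebApplicationFactory<Program> factory) => _factory = factory;

    [Fact]
    public async Task Service_reports_healthy()
    {
        using var client = _factory.CreateClient(); // backed by the in-memory TestServer

        var response = await client.GetAsync("/health");

        Assert.Equal(HttpStatusCode.OK, response.StatusCode);
    }
}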

An example of an integration test can be found here. It tests a web server that provides port forwarding functionality to Kubernetes, similar to kubectl port-forward.

Examples of in-memory versions of different components are Kafka.TestFramework, Test.It.With.AMQP, Entity Framework Effort as well as the AspNetCore TestServer.

So how fast do these integration tests run? Let’s churn.

Fast enough to run continuously!

JIT Lag

It is worth mentioning that most test runners do not run each test in a completely isolated application domain, because of the overhead caused by the JIT compiler when loading assemblies. Therefore it is important to design services to be instantiable and not have mutating static properties, as these would be shared when running tests in parallel. Reusing the same application domain when running multiple tests simultaneously is a trade-off that increases performance considerably.

Summary

Running integration tests continuously while writing code enables very fast feedback loops. These tests are, however, not a replacement for post-deployment end-to-end tests, which test application functionality on production-like infrastructure.

Their main purpose is to give fast, continuous feedback on business functionality during development by mimicking a production-deployed instance of the service with minimal mocking, while using well-known, loosely coupled integration points for refactoring stability.


Mitigating Infrastructure Drift by using Software Development Principles – Part 2

If you haven't read the first part on how to mitigate infrastructure drift using software development principles, you should! I will refer to parts of that post during this second, hands-on part.

Work Process

Let’s start by exploring what a simple work process might look like during software development.

I have included post-deployment testing because it is important, but I'm not going to talk about how to orchestrate it within deployments; that is out of scope for this article.

Let’s add source control, automation and feedback loops. Traditionally there might be multiple different automation tools responsible for the different parts.

Say hi to trunk based development!

When the team grows, the capability of building functionality in parallel grows. This is also where the process complexity grows. The simplest way of doing parallel work is to duplicate the process and let it run side by side. However, it makes little sense to develop functionality within a domain if we cannot use it all together; we need to integrate.

Remember that automation is key to moving fast, and the goal is to deploy to production as fast as possible with as high quality as possible. The build and test processes increase quality, while automation enables fast feedback within all of the processes pictured.

Integration introduces a risk of breaking functionality, therefore we'd like to integrate changes that are as small as possible as fast as possible, so as not to lose momentum moving towards production.

Let’s expand trunk based development with a gated check-in strategy using pull requests.

With automation and by repeating this process many times a day, we now have continuous integration.

A great source control platform that can enable this workflow is for example GitHub.

GitOps

In the previous article I talked about the power of GitOps, where we manage processes and configuration declaratively with source control. By using GitHub Actions we define this process declaratively using YAML and source control it together with the application code. By introducing a simple branching strategy to isolate changes during the parallel and iterative development cycle we also isolate the definition for building and testing.

But why stop there? Git commits are immutable and signed with a checksum, and the history is a directed acyclic graph of all changes. That means that anything stored in Git has strong integrity. What if we could treat each commit as a release? Three in one! Source control, quality checks and release management.

This is called continuous delivery.

Infrastructure as Code

Let’s see if a similar automation process can be used for managing declarative cloud infrastructure.

Almost the same. Deployment has been moved into the process enabling continuous deployment. Deployment becomes tightly coupled to a branch, let’s explore this further.

Environments

Traditionally, deployments are decoupled from the development process because there is a limit on the number of environments to deploy to, causing environments to become a bottleneck for delivery. Deploying to the local machine might also be a completely different process, further complicating and deviating from the development process. It would make sense to have an environment for every development cycle to further expand on the simplicity of trunk-based development: one environment for every branch, no matter what the purpose of the environment is.

Using cloud infrastructure we can do that by simply moving the environments to the cloud!

Since each branch represents an environment, working with environments becomes simple.

  • Need a new environment? Create a branch!
  • Switch environment? Switch branch!
  • Delete an environment? Delete the branch!
  • Upgrade an environment? Merge changes from another branch!
  • Downgrade an environment? git reset!

Cost

A common concern with infrastructure and environments is cost. Continuously observing how cloud infrastructure related costs change over time becomes even more important when all environments are in the cloud, since more of the cost becomes related to resource utilization. Most cloud providers have tools available for tracking and alerting on cost fluctuations, and since all environments are built the same, the tools can be used the same way for all environments. This also enables observing cost changes faster and doing something about them even earlier in the development process.

If development environment costs do become too steep, such environments usually do not need the same amount of resources as a production environment. For performance-related development it might still be relevant, but in all other cases lowering cost is quite easy to achieve by lowering the resource tiers used and using auto-scaling as a built-in strategy. The latter also lowers cost and increases efficiency for production environments by maximizing resource utilization.

In comparison, how much does building and maintaining local infrastructure for each employee cost? How much does it cost to set up a new on-prem environment, local or shared?

Example with Terraform

There are different tools that can help build cloud infrastructure. We’re going to use Terraform and the HashiCorp Configuration Language as the declarative language.

Let’s start by defining how to build, test and deploy. Here’s a simple GitHub Action workflow that automatically builds infrastructure using the previously mentioned workflow:

name: Build Environment

on: push

jobs:
  build:
    name: Build Environment
    runs-on: ubuntu-latest

    env:
      branch: ${{ github.ref }}

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Creating Environment Variable for Terraform
        run: |
          branch=${{ env.branch }}
          branch_name=${branch#refs/heads/}
          env=${branch_name////-}
          env=${env//_/-}

          cat << EOF > env.auto.tfvars
          env = "$env"
          EOF

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1

      - name: Terraform Init
        run: terraform init 
        
      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        id: plan
        run: terraform plan -out=terraform.tfplan -var-file=config.tfvars

      - name: Terraform Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1

      - name: Terraform Apply
        run: terraform apply -auto-approve terraform.tfplan

Building translates to initializing, validating and planning in Terraform.

When initializing, Terraform sets up the working directory, configuring backend state storage and loading referenced modules. Since many people might work against the same environment at the same time, it is a good idea to share the Terraform state for the environment module by setting up remote backend state storage that can be locked in order to guarantee consistency. Each environment should have its own tracked state, which means it needs to be stored and referenced explicitly per environment. This can be done using a partial backend configuration.

To validate that the configuration files are syntactically valid, terraform validate is executed.

Running terraform plan creates an execution plan containing the resources that need to be created, updated or deleted, by comparing the current configuration files with the state and the objects already created previously. The env.auto.tfvars configuration file created in the second step contains the environment name based on the branch name and can be used to create environment-specific resources by naming convention.

Last step is to apply/deploy the execution plan modifying the targeted resources.

Application Platforms

The infrastructure architecture we have explored so far is quite simple and mostly suited for one-to-one matches between infrastructure and application. This might work well for managed services or serverless but if you need more control you might choose an application platform like Kubernetes.

A service does not only comprise a binary; it needs a host, security, network, telemetry, certificates, load balancing etc., which quite quickly increases overhead and complexity. Even though such platforms would fit single-application needs, it becomes unnecessarily complex to operate and orchestrate an application platform per application.

Let’s have a look at how Azure Kubernetes Service could be configured to host our applications:

An application platform like Kubernetes works like a second infrastructure layer on top of the cloud infrastructure. It helps simplify operating complex distributed systems and systems architected as microservices, especially when running as a managed service. Kubernetes' infrastructure abstractions make it easier for applications to provision and orchestrate functionality while preserving much of their independence. The applications also still control provisioning of infrastructure specific to their needs outside Kubernetes.

Provisioning Kubernetes and all its cloud-specific integrations has become separated from application-specific infrastructure provisioning. Application infrastructure, on the other hand, has taken a dependency on the Kubernetes API alongside cloud resource APIs. The environment has become tiered and partitioned.

I do again want to emphasize the importance of having autonomous, independent and loosely coupled cross-functional teams. Each team should own their own infrastructure for the same reason they own their applications. A Kubernetes cluster should not become a central infrastructure platform that the whole company depends on.

Separated Application Workflow

Since the Kubernetes infrastructure has been decoupled from the application workflow, it could make sense to move the applications to separate repositories. As they have become autonomous, we need to figure out a simple way to reintegrate the applications downstream. Since branches define the environment of the application platform infrastructure, we could simply go for a similar branch naming standard, i.e. MyEnvironment/SomethingBeingDeveloped.

Looking at the Kubernetes platform architecture, the GitOps continuous delivery tool ArgoCD is responsible for deploying applications in the cluster. The process for delivering applications becomes similar to the GitOps process described earlier, where deployment becomes a reversed dependency. Instead of deploying to a specific environment after releasing, ArgoCD is instructed to observe an application's repository for new releases, and if a release matches the strategy it gets deployed. This means that many ArgoCD instances can monitor and independently deploy many applications across many Kubernetes clusters without any intervention.

Here is the process again:

We still provision application-specific infrastructure, which works the same way as described earlier, except that we now have an upstream dependency: the Kubernetes cluster(s) in the targeted environment. To keep the applications separated in Kubernetes, we use separate namespaces. This is also where we share platform-defined details, for example where to find the environment's container registry. We can do this by creating ConfigMaps.

Namespaces can also be used to limit resource access for users and service principals, minimizing exposed attack surfaces. Access rights for each application can be defined in the upstream infrastructure repository. Since we use a managed Kubernetes service, which has integration with Active Directory, we can handle access to both Kubernetes and cloud infrastructure through managed identities.

Tying it all together with a GitHub Action workflow, it could look something like this:

name: Build and Release Application

on: push

jobs:
  build:
    runs-on: ubuntu-latest

    env:
      branch: ${{ github.ref }}
      version: 1.2.3

    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Extracting Environment From Branch Name
        run: |
          branch=${{ env.branch }}
          branch_name=${branch#refs/heads/}
          env=${branch_name%/*}
          echo "env=$env" >> $GITHUB_ENV

      - name: Login to Azure
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}

      - name: Set AKS Context
        uses: azure/aks-set-context@v1
        with:
          creds: '${{ secrets.AZURE_CREDENTIALS }}'
          cluster-name: my-cluster
          resource-group: rg-${{ env.env }}

      - name: Fetch Environment Metadata
        run: |
          ENVIRONMENT_METADATA=$(kubectl get configmap/metadata -o go-template='{{index .data "metadata.yaml"}}' | docker run -i --rm mikefarah/yq eval -j)
          ACR_NAME=$(echo "$ENVIRONMENT_METADATA" | jq .acr.name | xargs)
          echo "ACR_NAME=$ACR_NAME" >> $GITHUB_ENV
          ACR_RESOURCE_GROUP_NAME=$(echo "$ENVIRONMENT_METADATA" | jq .acr.resourceGroupName | xargs)
          echo "ACR_RESOURCE_GROUP_NAME=$ACR_RESOURCE_GROUP_NAME" >> $GITHUB_ENV

      # Here could any application specific infrastructure be applied with Terraform
      # ...

      - name: Build and Push Application Container Images
        run : |
          az acr build --registry ${{ env.ACR_NAME }} --resource-group ${{ env.ACR_RESOURCE_GROUP_NAME }} --file path/to/Dockerfile --image my-application:latest --image my-application:${{ env.version }} .

      - name: Update Deployment Strategy (optional)
        run: |
          read -r -d '' helmvalues <<- YAML
          image_tag: ${{ env.version }}
          YAML

          cat application.yaml | \
            docker run -e HELMVALUES="$helmvalues" -i --rm mikefarah/yq eval '.spec.source.helm.values = strenv(HELMVALUES) | .spec.source.targetRevision = "${{ env.branch }}"' - | \
            tee application.yaml
          
          kubectl apply -f application.yaml -n argocd

What about running and debugging an application during development? Use Bridge to Kubernetes straight from your IDE!

Shared Components

An important strategy when building environments (and applications) is that they need to be autonomous. However, some components might need to be shared, or at least moved upstream, like the backend state storage and the permissions to create and manage components on the cloud platform.

These components should live in separate repositories with similar development strategies.

Permissions

Defining different roles is good practice in order to align with the least-privilege principle. Depending on the size of the company, people might have multiple roles, but remember that autonomous, cross-functional teams are important to move fast, so each team should have all the roles needed to deliver their applications.

Lessons Learned

One important thing about mitigating drift between environments is to continuously integrate between them. In an ideal world, each change is directly integrated into all environments. In reality, however, that will not always happen, which might cause incompatibility issues when introducing changes to environments. Upgrading from A to B to C is not the same thing as upgrading directly from A to C. That is usually what happens when a branch is merged into another branch, and with cloud infrastructure this might lead to unexpected problems. An example is that no minor versions can be skipped when upgrading Azure Kubernetes Service.

Can I skip multiple AKS versions during cluster upgrade?

When you upgrade a supported AKS cluster, Kubernetes minor versions cannot be skipped.

This means that Terraform needs to apply each commit in order, which can be a bit cumbersome to orchestrate.

Another important aspect of infrastructure as code is that it is quite easy to render yourself into an inconsistent state by manually changing configurations, which can be tempting in pressing situations with the cloud platform tools easily available. Don't fall into that trap.

Conclusions

Working cross-functionally is powerful. It enables teams to work autonomously, end-to-end, within a loosely coupled piece of the domain. It also enables teams to use development workflows that include both applications and infrastructure from the start, simplifying how to make them work efficiently together. Using the same infrastructure for all environments and continuously merging changes downstream helps mitigate drift, while simplifying how infrastructure changes are managed.


Mitigating Infrastructure Drift by using Software Development Principles

According to this survey from driftctl, 96% of the teams asked report manual changes as the main cause of infrastructure drift. Other concerns are not moving from development to production fast enough and introducing many changes at once.

Drift is a big problem in system development. Test environments get broken and halt the whole development process, or they end up in an unknown state.

As a software developer, you have probably experienced drift many times due to parallel, isolated development processes and not merging into production fast enough. We mitigate some of this drift by introducing continuous integration and continuous deployment processes that can reliably move software from development to production faster while still guaranteeing quality through test automation and gated check-ins. We use DevOps to shift knowledge left in order to remove blocking phases and speed up the process, and we use GitOps to source control operational changes in order to gain control over configuration-unknown syndromes.

Development Goals

Before we further explore which concepts we have adopted to mitigate drift in application development, and how we can use them to also mitigate infrastructure drift, let's have a look at some objectives for general product development. In particular, let's study three core objectives:

  • Fast Deliveries
  • Quality
  • Simplicity

Fast deliveries make the product reach the market and potential consumers quickly, preferably faster than the competition. Quality makes the customer stay with the product, and simplicity makes both of the previous objectives easier to achieve and maintain.

Automation

A key part of moving fast is automation of repetitive tasks. It decreases the risk of drift by continuously moving us forward faster and with higher accuracy, replicating and reproducing test scenarios, releases and deployments, and continuously reporting feedback that helps us steer in the right direction. The more automation, the faster we move, the lower the risk of drift, the higher the quality.

Least Privilege / Zero Trust / BeyondCorp

Security is something that should be embraced by software developers during the development cycle, not something that is forced upon them from the side or as an afterthought. This is maybe even more important when building infrastructure. When security becomes a real problem, it's not uncommon that it is too late to do something about it. Trust is fragile, and so is the weakest link to our customers' precious data.

Applying a least-privilege policy does not only minimize the risk of granting god mode to perpetrators, it also minimizes the possibility of introducing manually applied drift.

While least privilege can lower the attack surface, Zero Trust simplifies the way we do security. If there are no hurdles in the way of development progress, there is less risk of succumbing to the temptation of disabling security mechanisms to make life easier.

Infrastructure as Code / GitOps

Source controlling application code has been common practice for many years. By using manifests to define wanted states and declarative code for how to get there, source control follows naturally. The reasons why infrastructure as code is powerful are the same as for application source code: to track and visualize how functionality changes while being able to apply it in a reproducible way through automation.

Less risk of drift.

GitOps makes it possible to detect risky configuration changes by introducing reviews and running gated checks. It simplifies moving changes forward (and backwards) in a controllable way by enabling small increments while keeping a breadcrumb trail of how we got to where we are.

DevSecInfraOps

Cross-functional teams help us bridge knowledge gaps, removing handovers and synchronization problems. They bring the needed knowledge closer to the development process, shortening time to production and getting security and operational requirements built in. Infrastructure is very much part of this process.

Local Environments

Using a different architecture for local development increases drift as well as the cost to build and maintain it. Concepts like security, networking and scalability are built into cloud infrastructure, and often provided by products that are not available for local development. As for distributed systems, these are hard to use locally since they run elastically over multiple hosts, possibly across geographic boundaries.

What if we could minimize both application and infrastructure drift by reusing the same cloud-native infrastructure architecture and automation to produce the same type of environment, anywhere, at any time, for any purpose, while adhering to the above criteria? Test, staging, it all shifts left into the development stage, shortening the path to production and enabling every environment to be used as a production-like environment.

In the next part we will deep-dive into how we can work with cloud infrastructure similarly to how we work with application development, while adopting all of the above concepts.
