Automatically starting the Windows Server AppFabric Caching Service


My current project makes heavy use of the Windows Server AppFabric Caching Service, and whilst I think it’s a great piece of technology, it does have a pretty big hole in its functionality. There’s currently no supported, out-of-the-box way to get a cache host (or the entire cluster) to start automatically, say after a server is rebooted – deliberately or otherwise. This is a bit of a limitation: after you’ve spent time digesting all the configuration options, understood the importance of high availability and set up your cluster perfectly, you still need to be on permanent standby in case one of your production servers goes down. At the moment, unless you log on to one of the hosts in the cluster and manually execute some Powershell commands, your cache host remains permanently “out of the cluster” after a failure. Oh dear.

Some trawling around the web throws up some interesting but conflicting discussions – on the one hand, this functionality is not supported; on the other, it does work if you change the Windows service to start automatically (albeit with the caveat that the host could take up to 15 minutes to restart); and then again, it will never work with an XML-based configuration store.

Confusing to say the least. Changing the AppFabric Caching Service to a startup type of Automatic in the Services control panel resulted in some really unpredictable behaviour. We are using an XML-based configuration store, so maybe this was never going to work, but all sorts of errors and crashes started appearing in the Event Log and the cache basically became unusable until the startup type was changed back to Manual.

So, a custom approach was needed…

Solution

The custom solution is pretty simple, but there are a few moving parts, so I’ve broken it down piece by piece. It involves a custom Powershell script that starts the cache host or cluster, triggered by a Windows Scheduled Task at system startup.

Powershell settings

By default, Powershell ships in a restricted mode that denies the execution of scripts. This always catches me out, and did so when I migrated this solution from running fine in my development environment onto one of our production servers. After it failed silently a few times, I eventually realised what was going on – so make sure your Powershell is configured correctly before you start so you don’t waste time like I did!

Open the Powershell console (you may need to Run as Administrator) and execute the following cmdlet to allow script execution:

set-executionpolicy RemoteSigned

Accept the warning by entering Y and you’re good to go.

The Script

The script itself is pretty simple. It imports the caching administration module, tells it to use the cluster configuration from the local installation, and then starts the cache host for the local machine name on the default cache port.

import-module DistributedCacheAdministration 

$computer = gc env:computername 

use-cachecluster 

start-cachehost -hostname $computer -cacheport 22233

I created this as a new Powershell script called StartCacheHost.ps1

Scheduled Task

In order to call this script at system startup, I created a new scheduled task using the Task Scheduler.

After opening the Task Scheduler, I created a new task as follows:

[Screenshot: Create Task dialog – General tab]

Make sure that this task is set to run whether the user is logged in or not, and that it’s configured to run for your flavour of OS. I also ticked the “Run with highest privileges” box as I usually have to run the Caching Administration Powershell Tool as an Administrator.

Moving to the Triggers tab, I added a new trigger to execute the script on system startup as follows:

[Screenshot: New Trigger dialog – trigger the task at startup]

I opted to delay execution for 30 seconds after startup. This may not be necessary, but it felt like a minor trade-off to ensure everything’s up and running before we try to start the cache host.

Finally, moving to the Actions tab, I set the action to execute as follows:

[Screenshot: New Action dialog – start a program]

The program/script to run is the Powershell executable and the argument to pass in is the full path to the saved StartCacheHost.ps1 script.
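For example, assuming the script was saved to C:\Scripts (a path I’ve picked purely for illustration), the action fields would look something like this:

Program/script: powershell.exe

Add arguments: C:\Scripts\StartCacheHost.ps1

If the path contains spaces, passing it via Powershell’s -File parameter and wrapping it in quotes is a safer bet.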

Testing

In order to test that everything’s hanging together nicely, I wanted to start at the bottom and build upwards. So firstly, I wanted to test that the script itself would work.

On a running cache cluster, I stopped the current host by executing the following cmdlet in the Caching Administration Powershell Tool:

stop-cachehost -hostname xxxxxxx -cacheport 22233

where xxxxxxx is the local machine name (the current host).

Running:

get-cachehost

returns information about the hosts running in the cluster and indicates that my current host now has a status of “DOWN”.

I then opened the Windows Powershell console and executed my StartCacheHost.ps1 script at the command prompt. Running get-cachehost again indicates that my host is now back up and running again – i.e. it has rejoined the cluster.

I then moved on to testing the scheduled task by stopping the current cache host again. This time I selected my new task in Task Scheduler and manually executed it by clicking Run.

Again, running get-cachehost indicates that my host is back up and running again.

Finally, to piece everything together, I restarted my cache host server. Thirty seconds after the server came back up, my script executed and the host rejoined the cluster. Perfect.

Considerations

The script above assumes that you’re running in a cluster (i.e. more than one cache host) and that the cluster is in a working state so that the current host can rejoin the cluster. If the cluster was down (e.g. if too many lead hosts had gone down) then the host would not be able to re-join the cluster.

However, I would treat these cases as critical failures: the entire cluster has gone down and manual intervention is probably required anyway. Ideally, I’d like to be able to check whether the cluster is running before executing the start-cachehost command – if there’s no cluster running then this could be swapped for start-cachecluster, although whether that would work would depend on the configuration of lead hosts etc.

I also wanted this behaviour to function in my development environment to save me having to start the caching service whenever I rebooted my laptop. In this case, where I know that there’s only ever one host running in the cluster, I would change the script to execute the start-cachecluster command instead of start-cachehost, e.g.

import-module DistributedCacheAdministration 

use-cachecluster 

start-cachecluster

Spark view engine and HTML 5


It’s no secret that I love the Spark view engine. I’ve blogged about it before and nearly two years down the line since I first used it (and a number of production projects later), I still think it’s the best view engine out there for ASP.NET MVC.

In a nutshell, there are two reasons why I think this.

1. The in-built view-specific syntax is so well thought out and makes building view files really easy.

For example, I love the ?{} conditional syntax, which means that when it is placed inside an attribute, the attribute is only output if the condition evaluates to true:

<div id="errorMessage" style="display:none;?{Model.ErrorMessage == null}">
    Error message...
</div>

If you combine this inside a loop with the auto-defined xIsFirst, xIsLast variables you can do something like this:

<ul>
    <li each="var product in Model.Products" class="first?{productIsFirst} last?{productIsLast}">
        ${product.Name}
    </li>
</ul>

This is really powerful stuff. Adding classes to the first or last element in a list is something I always get asked to do by my friendly interface developers, and before Spark (and deep in WebForms territory) this always meant helper methods or messy logic inside view files. Nasty business. (By the way, in the example above, if the item is neither the first nor the last element, then the entire class="" attribute is simply ignored, i.e. no messy empty HTML attributes are rendered.)

2. It fits the way I work in a multi-functional team.

I tend to be lucky enough to have dedicated interface developers on a project who specialise in creating beautifully clean, standards compliant, accessible HTML. I let them do their job and they let me do mine. Spark allows us to work side by side nicely without treading on each other’s toes. Integrating static HTML pages delivered by an iDev is really easy as the syntax is terse and succinct. Equally, an iDev looking at a Spark view file doesn’t run a mile screaming (which they used to do when faced with a WebForms .aspx page full of server controls). As Louis states -

The idea is to allow the HTML to dominate the flow and the code to fit seamlessly.

I think it does this beautifully. If I were more of a one-man band and was responsible for creating everything myself, or did not have experienced web developers to hand, I’d probably prefer NHaml, which seems to have a much more developer-focussed approach. I can definitely see the appeal here, but like I say, I tend to work with guys who know HTML and give me HTML to integrate into my applications.

Which brings me on to the subject of this post…

So things have moved on a bit since I started my last project, and this time around I was given a lovely set of static HTML pages from a completely separate digital agency altogether. These people obviously know what they’re doing and have fully embraced HTML 5 and all its new syntax and features.

“Great” I thought, this should be easy. Just need to go through the views, binding up my data and adding in Spark logic wherever possible. And then I got this:

[Screenshot: Spark error caused by the <section> element]

It turns out that <section> is a new HTML 5 element, which was being used to great effect in my HTML. It also turns out to be a keyword in the world of Spark, and the two don’t play together too nicely. A few others have run into this problem and there are a couple of suggestions:

Use the namespace feature in Spark. This involves adding a prefix attribute to your Spark configuration like so:

<spark>
  <pages prefix="s">
  </pages>
</spark>

Which then means you need to qualify all your Spark elements:

<s:use content="view" />

I didn’t like this approach and found that it broke a lot of the terseness of the Spark syntax. It meant that I couldn’t use the shorthand method for calling partial views simply by specifying the file name.

The next suggestion was to wrap the <section> elements in the !{} syntax, effectively rendering them as non-encoded HTML literals:

!{"<section class='box error'>"}
    Error message...
!{"</section>"}

This approach worked the best – whilst it made the <section> elements themselves a bit ugly, it left everything else Spark-related intact.

So, I got past that issue, thinking I was home free, only to be faced with:

[Screenshot: Spark recursive rendering error]

Oh dear. Turns out that <header> and <footer> are also new elements in HTML 5. I tend to create partial views for both my header and footer logic and up to now have named them (quite sensibly) _header.spark and _footer.spark. Using the shorthand syntax for rendering views, I was able to call them in my layout file like so:

<body>
    <header />
    <use content="view" />
    <footer />
</body>

Well, not any more. Spark is trying to render my partial view called _header.spark, which contains my HTML 5 markup, including the <header> element. Hence the recursive rendering error.

The only solution I found to this was to break from tradition and rename my partial view files to _headerNav.spark and _footerNav.spark, which avoids the naming conflicts altogether.

Summary

HTML 5 brings some new markup syntax which conflicts with the inner workings of the Spark view engine. The most noticeable impact is the <section> element, which cannot be used as it stands with Spark without applying one of the workarounds detailed above.

Care should also be taken when naming partial views so as not to create naming conflicts with the new HTML elements available in HTML 5.

That said, I would still use Spark on projects as the benefits still massively outweigh these downsides. Hopefully the <section> issue will be resolved in a future release, but for now I’m prepared to live with my views being slightly less sparkly.

And now for something completely different


I’ve been pretty quiet recently on the blogging, twitter, community, Sharp Architecture, Who Can Help Me? front – for a month or two (or three) at least. Those that know me will know why, but for those that don’t, I made the decision to leave EMC Consulting a couple of months ago. It was a hard decision as my time at Conchango/EMC had been the best years of my career and I met many great people who have influenced and helped me tremendously, not to mention some life-long friends. I hope those people know who they are. The majority of the content of this blog has come out of projects that I’ve directly worked on at EMC, or from offshoot community projects that I’ve worked on with former colleagues.

But, enough of the past. As well as leaving EMC, my wife and I decided to leave London and, in fact, the UK. I’m really pleased to say that I have just joined another awesome consultancy – Infusion – and have relocated to their office in Dubai, UAE. This is obviously a big step – new job, new company, new country – hence why everything else has taken a back seat whilst we re-organise our lives!

I’ve been at Infusion for a week now and it’s starting to feel normal (although Dubai is going to take a bit longer to get used to!). As with any consultancy, I’m not sure exactly what I’ll be working on just yet, but there’s talk of exciting WPF and Surface projects, and conversations are being had about Azure so I’m pretty keen to get my teeth stuck into something big.

As time goes on, this should start to fuel more topics of discussion on this blog, but in the meantime hopefully normal service will resume with everything else. And I should now finally be able to get around to contributing properly to Sharp Architecture after Alec so kindly asked me to join the team.

My latest project has launched!


I’m really proud to announce that my latest project has now officially been released into the wild at www.seethedifference.org!

[Screenshot: See the difference home page]

Some of my recent posts around Windows Azure have referred to this project and it will now be the subject of some upcoming posts too.

This was a great project from a technical perspective:

  • It was the first *full* cloud solution that we have built/delivered at EMC Consulting – fully running on the Windows Azure platform, using Web Roles, Worker Roles, Azure Storage and SQL Azure
  • The total build time was 7 weeks due to properly re-using IP, patterns and practices from previous projects

From a publicity perspective:

  • Members of the team have presented the project at no fewer than 5 Microsoft conferences, events and user groups

And from the intangible, feel-good perspective:

  • We helped a new UK charity startup launch their business by making it financially and technically possible
  • We contributed to something that will hopefully make a difference to a large number of good causes

Please check out the site, promote it, and if you’re feeling generous make a donation!!

Developing outside the cloud with Azure


My colleague Simon Evans and I recently presented at the UK Azure .Net user group in London about a project that we have just delivered. The project is a 100% cloud solution, running in Windows Azure web roles and making use of Azure Table and Blob Storage and SQL Azure. Whilst development on the project is complete, the site is not yet live to the public, but will be soon, which will enable me to talk more freely on this subject.

A few of the implementation specific details that we talked about in the presentation focussed on how we kept the footprint of Azure as small as possible within our solution. I explained why this is a good idea and how it can be achieved, so for those who don’t want to watch the video, here’s the write-up…

Working with the Dev Fabric

The Windows Azure SDK comes with the Dev Fabric, a simulation of the Azure platform that allows developers to run Azure solutions locally on their own machines. It simulates the web and worker roles and also the various storage services – table, blob and queue – so that the entire solution can be run as-is on a single machine. Of course, this is just a simulation and doesn’t provide any of the benefits that the real production Azure platform gives, but it is a vital tool for being able to develop for the platform.

However, the Dev Fabric gets in the way of rapid development. Running in the Dev Fabric involves hitting F5 in Visual Studio, which builds the solution, packages it, deploys the package, starts up the virtual environment, starts the storage services, loads the browser and finally loads the site. This whole process, which Visual Studio orchestrates, can take a number of minutes, depending on the size of your solution and the spec of your machine.

A build in Visual Studio is not enough to see your changes. My development process – run all the unit tests, see a green light, refresh the browser to make sure the application still hangs together and I haven’t broken any UI logic – now has a massive extra step which adds friction. Even more so if all you changed was one line of HTML or CSS – remember, all these files form part of the deployed solution, so every time they change they need to get packaged up and copied somewhere else. Saving a file and hitting refresh in a browser is not going to work!

So, right from day one, I was looking for ways to run our solution without depending on the Dev Fabric. This may seem slightly backwards – building a cloud solution that would run outside the cloud – but removing this small overhead, one incurred numerous times a day, from the development process meant we were able to deliver our project on time.

Interestingly, whilst the primary goal was breaking the dependency on the Azure platform to remove friction during the development process, the result is a well-architected solution with a minimal Azure footprint. Not only does this allow for greater flexibility down the line, it also means that a team with minimal Azure development experience can work effectively in an Azure solution, as the implementation-specific knowledge is encapsulated in just a few places.

Data Access and the Repository Pattern

I tend to follow a similar pattern when building applications of having an Infrastructure or Data layer that deals with data access or any hard dependencies on external services. My friends Howard and Jon and I have talked a lot about this in the Who Can Help Me? project, which provides an architectural showcase for an MVC web application, building on top of the Sharp Architecture project. By adhering to these same patterns, we were able to encapsulate all our data access into repositories which dealt with our domain entities when reading and writing. This meant that our entire application has no concept of where these entities are stored until you get into the repository itself.

As Martin Fowler writes:

“A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. Repository also supports the objective of achieving a clean separation and one-way dependency between the domain and data mapping layers.”

So, in our case we were able to encapsulate all the Azure Table Storage specific logic inside the repository; we actually took this one step further and encapsulated specific queries into Commands which we could reuse across repositories if necessary.
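To make that a bit more concrete, here’s a rough sketch of the shape this takes. Note that all of the type, interface and table names below are invented for illustration – they are not the actual project code:

using System.Collections.Generic;

// A reusable query command, encapsulating the Table Storage specifics
// (context, table name, partition key filter) so that any repository can use it.
// All names here are illustrative.
public interface IPartitionQueryCommand<T>
{
    IEnumerable<T> Execute(string tableName, string partitionKey);
}

public class Product
{
    public string Name { get; set; }
}

// The repository deals purely in domain entities; callers have no idea
// that the data lives in Azure Table Storage.
public interface IProductRepository
{
    IEnumerable<Product> GetProductsInCategory(string categoryId);
}

public class ProductRepository : IProductRepository
{
    private readonly IPartitionQueryCommand<Product> queryByPartition;

    public ProductRepository(IPartitionQueryCommand<Product> queryByPartition)
    {
        this.queryByPartition = queryByPartition;
    }

    public IEnumerable<Product> GetProductsInCategory(string categoryId)
    {
        // The Table Storage details are hidden inside the command, which can be
        // reused by any other repository that queries by partition key.
        return this.queryByPartition.Execute("Products", categoryId);
    }
}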

Dependency Injection and stub Repositories

However, whilst this kept the Azure footprint small, it didn’t break the dependency on data access. In order to run the application we needed to retrieve data, which meant calls into Table Storage, which meant we still needed the Dev Fabric to be running.

So… as our repositories all implement interfaces and are injected at runtime using the Castle Windsor IoC container, we were able to create stub implementations of these repositories which we could choose to inject instead, meaning we could run the application without the Dev Fabric. This becomes incredibly useful if you’re not concerned with the actual data being displayed at that point – e.g. you’re building validation logic, or the UI interactions, etc.
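Carrying on with the same illustrative names, a stub implementation lives alongside the “real” one, implements the same interface and simply returns canned in-memory data, so the application can run with no storage at all (the namespace below is a guess – only the “.Stub” suffix matters to the registration code that follows):

using System.Collections.Generic;

namespace SeeTheDifference.Infrastructure.Stub
{
    // Illustrative stub: same contract as the Table Storage repository,
    // but it just returns hard-coded data for use outside the Dev Fabric.
    public class StubProductRepository : IProductRepository
    {
        public IEnumerable<Product> GetProductsInCategory(string categoryId)
        {
            return new List<Product>
            {
                new Product { Name = "Sample product 1" },
                new Product { Name = "Sample product 2" }
            };
        }
    }
}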

We use the Fluent Registration feature of Castle Windsor which allows us to register dependencies based on a convention over configuration approach – finding components based on their namespace and location in this case. Rather than have to manually configure the solution to decide whether to use the ‘real’ table storage repositories, or the stub repositories, I wanted this to ‘just work’.

This was achieved by checking the RoleEnvironment.IsAvailable flag that is part of the Azure ServiceRuntime API. The code below shows how we change our conventions for registration based on this knowledge – so, if we’re running in the Dev Fabric (or the live Azure service) we get our real repositories, otherwise we get our stubs.

For more information on using the fluent registration approach, take a look at the Who Can Help Me? project.

public void Register(IWindsorContainer container)
{
    var repositoryNamespace = ".Repositories";

    if (!RoleEnvironment.IsAvailable)
    {
        // If running outside role/dev fabric
        // then use stub repositories
        repositoryNamespace = ".Stub";
        Trace.WriteLine("RUNNING OUTSIDE OF AZURE ROLE - STARTING UP WITH STUB REPOSITORIES");
    }

    container.Register(
        AllTypes.Pick()
            .FromAssembly(Assembly.GetAssembly(typeof(InfrastructureRegistrarMarker)))
            .If(f => f.Namespace.Contains(repositoryNamespace))
            .WithService.FirstNonGenericCoreInterface("SeeTheDifference.Domain.Contracts.Repositories"));
}

Configuration

The Azure environment changes the game slightly when it comes to application configuration, as the web.config no longer provides the ability to change configuration at runtime. The reason is that the web.config is packaged up and deployed just like any other code asset as part of your Azure deployment, which means that in order to change a value you have to go through the whole package-and-deploy process again.

The Azure environment does provide another source of configuration in the form of the .cscfg file, which is the Cloud Service Configuration and governs the details about your web or worker roles e.g. number of instances, endpoints, connection strings for Azure data stores. It can also be used to store any custom configuration settings in the form of key-value pairs, similar to the App Settings section you would find in a normal web.config.
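For illustration, the custom settings sit in a ConfigurationSettings section of the .cscfg, something along these lines (the service, role and setting names here are made up, and each setting also has to be declared in the corresponding .csdef file):

<?xml version="1.0"?>
<ServiceConfiguration serviceName="MyCloudService" xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="DiagnosticsConnectionString" value="UseDevelopmentStorage=true" />
      <Setting name="SmtpServer" value="smtp.example.com" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>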

So, the recommended approach is to use the .cscfg for any environment specific values as these can then be changed at runtime, which would cause your Azure roles to recycle without re-deployment, just like changing the web.config would in a normal IIS setup.

However, the downside to using the .cscfg is that Azure configuration is only available when the role environment is running – i.e. you’re using the Dev Fabric (or running in Azure itself). Once again we wanted to break this dependency and be able to run our application outside of this environment.

The approach we took was twofold:

Firstly, we encapsulated all our configuration calls in one place – a configuration manager that was injected into any class that needed information from configuration (there’s a small sketch of this abstraction after the code below). By doing this we could implement a global switch to check whether we were running inside an Azure environment (just like when we decided which repositories to register) and act accordingly.

Secondly, we chose to duplicate all our custom configuration settings from the .cscfg into our web.config App Settings and read from the appropriate source, according to the RoleEnvironment.IsAvailable flag:

/// <summary>
/// Gets a named configuration setting from application configuration.
/// </summary>
/// <param name="name">The name of configuration setting.</param>
/// <returns></returns>
public string GetSetting(string name)
{
    if (RoleEnvironment.IsAvailable)
    {
        return RoleEnvironment.GetConfigurationSettingValue(name);
    }

    return ConfigurationManager.AppSettings[name];
}
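The GetSetting method above sits behind a simple abstraction that is injected wherever configuration is needed, so consuming classes never know which source a value came from. A minimal sketch, with invented type and setting names:

// Illustrative only - the interface and consumer names are not the project's actual types.
public interface IConfigurationManager
{
    string GetSetting(string name);
}

// A consumer simply takes the abstraction; whether the value comes from the
// .cscfg or from the web.config App Settings is decided inside GetSetting.
public class EmailNotifier
{
    private readonly IConfigurationManager configuration;

    public EmailNotifier(IConfigurationManager configuration)
    {
        this.configuration = configuration;
    }

    public string SmtpServer
    {
        get { return this.configuration.GetSetting("SmtpServer"); }
    }
}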

Diagnostics

The Azure diagnostics are key to monitoring how your application is performing and are really the only way to know where and when you need to scale your application out or in. They’re also the source of information for events and custom tracing.

The documentation for configuring Azure tracing recommends that you add the following configuration to your application’s web.config (and in fact this is added automatically if you use the role project templates):

<system.diagnostics>
    <trace>
        <listeners>
            <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener, Microsoft.WindowsAzure.Diagnostics, Version=1.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"
                name="AzureDiagnostics">
                <filter type="" />
            </add>
        </listeners>
    </trace>
</system.diagnostics>

The problem here is that we’re tightly coupled again to the role environment, and as this configuration gets loaded and read at application startup, you’re not going to get very far if you’re not running within an Azure environment.

We chose to remove this configuration and add the trace listener programmatically at application startup, after performing the same check on the RoleEnvironment.IsAvailable:

/// <summary>
/// Initialises the Azure diagnostics tracing
/// </summary>
public void Initialise()
{
    if (RoleEnvironment.IsAvailable)
    {
        Trace.Listeners.Add(new DiagnosticMonitorTraceListener());
    }
}

This means that our Azure diagnostics tracing is now only initialised if we need it, meaning we can run without it when we’re in a normal IIS setup.

Summary

Hopefully this has shown that developing an Azure application does not have to be a process that takes over your entire solution and forces you down a particular style of development.

By architecting your solution well you can minimise the footprint that Azure has within your solution, which means that it’s easy to break the dependencies if required.

Being able to run an application in a normal IIS setup, without the Dev Fabric speeds up the development process, especially for those working higher up the stack and dealing solely with UI and presentation logic.

Using RoleEnvironment.IsAvailable to determine whether you’re running “in the cloud” or not allows you to act accordingly if you want to provide alternative scenarios.

By reducing the footprint of Azure within the solution, inexperienced teams can work effectively and acquire the Azure specific knowledge as and when required.
