Sunday, October 22, 2017

Managing Multiple Environments in an On-Premise Service Fabric Cluster

Microsoft has a lot of useful information available on how to configure applications and microservices hosted in Service Fabric. However, as I started using Service Fabric (SF) to build real-world applications, I found myself faced with a few questions and uncertainties about how best to manage multiple environments. That is, assuming a continuous integration (CI) pipeline for building, testing and deploying applications:


  • Should each environment (e.g. Development, QA, UAT, Production) be deployed to a different Service Fabric cluster?
  • How do I pass different application parameters for the different environments during application deployment? Are these parameters versioned as part of the application code packages?
  • What about ports and service endpoints? How will my client applications know how to connect to them? Should each environment use a different port? What if SF decides to relocate the service to a different Node and the HTTP endpoint changes, or the Node becomes unavailable?

    Many of these questions are addressed in the SF documentation provided by Microsoft, and in this post I certainly don’t plan on duplicating that information. Regardless, when I actually got to set up my SF cluster and deploy my applications, I came across a few unexpected challenges – and this is what I intend to cover here.
    In my particular scenario I’m running Service Fabric on-premise. If you didn’t realise this was possible, it certainly is, and you can find some excellent information here.

    Managing multiple Application Environments

    This may seem like an obvious question…well, maybe! Traditionally, as software developers and IT professionals we’re used to provisioning separate hardware resources for each of our different application environments, like UAT and Production. So, why not just do the same in the Service Fabric world? Well, for starters, the minimum number of Nodes (i.e. machines) supported for a cluster running production workloads is five! That’s quite a few… and in my case, I had 5 environments – so that would mean 25 on-premise machines…that’s just crazy talk!
    Now, for development purposes, you can run all your nodes on one machine – and this is certainly what happens when you install the Service Fabric SDK on your developer machine – but this is not really suitable for your test environments and I would strongly advise against it. Why? Well, in my experience, I found a lot of developers new to Service Fabric struggle with the notion that services in SF can move from one machine (Node) to another within the cluster – this means that:
    a) You shouldn’t use local folder paths in your microservices configuration
    b) You shouldn’t use normal standard .Net app/web.config files – it’s quite a big topic, but you can find out more here.
    c) You shouldn’t assume or rely on particular 3rd-party software being pre-installed on particular Nodes – if you need this, you’re probably best off using SF Docker containers, hosting it elsewhere…or using setup entry points.
    d) If you require special security policy configuration for individual microservices, use the already provided SF policy configuration.
    Thus, unless your test environments are configured to run on a multi-machine cluster, these types of bugs are likely going to be left as unfortunate exercises for your end users…and let’s face it…there’s already enough confusing software out there…let’s not add to the pile!
        
    So, getting back to my original question:
    Q: How should I deploy multiple environments of the same application?
    A: Deploy them as uniquely named applications on the same cluster!
    In concept, this is easy…all you have to do is specify multiple Publish Profiles – with a unique Application name within each corresponding parameters file. Within your Service Fabric project, the PublishProfiles and ApplicationParameters folders contain XML files corresponding to each of the environments you configured. Below is an example application parameters file for the Production environment – this is where you can define a unique application name per environment:
    <?xml version="1.0" encoding="utf-8"?>
    <Application Name="fabric:/MyFabricApp-Production" xmlns="http://schemas.microsoft.com/2011/01/fabric">
      <Parameters>
        <Parameter Name="StatelessService_InstanceCount" Value="-1" />
      </Parameters>
    </Application>
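    For completeness, here is a minimal sketch of what the matching publish profile (e.g. PublishProfiles\Production.xml) might look like – the cluster endpoint below is a placeholder, not my actual one:
    <?xml version="1.0" encoding="utf-8"?>
    <PublishProfile xmlns="http://schemas.microsoft.com/2015/05/fabrictools">
      <!-- Placeholder connection endpoint for the on-premise cluster -->
      <ClusterConnectionParameters ConnectionEndpoint="mycluster.mydomain.local:19000" />
      <!-- Each publish profile points at the parameters file for its environment -->
      <ApplicationParameterFile Path="..\ApplicationParameters\Production.xml" />
    </PublishProfile>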
    Now I did say, in concept it’s easy…why? Well, I had a few other surprising things to consider and resolve.

    Issue 1: Publishing the same versioned application package multiple times to the same SF Cluster

    When you publish a Service Fabric application in Visual Studio, it uses the Deploy-FabricApplication.ps1 PowerShell script located in the aptly named Scripts folder.
    I actually ended up using this PowerShell script in my automated build process too, as it makes it easy to manage your application parameters via publish profiles – the script will automatically read and apply these profile parameters during your deployment.
    Note: There are lower-level SF PowerShell commands you could use – but the one above was fine for my purposes. You can also read more about application configuration for multiple environments here.
    There is one problem with this Deploy-FabricApplication.ps1 script though, and I’ve reported the issue on GitHub. Basically, after a build, the versioned deployment package contains the publish profiles for all environments. So, assuming your environments are:
    1. Development
    2. Staging
    3. Production
    …and your application package is versioned 1.0
    Upon deployment to your Development environment, Service Fabric will first register your 1.0 application package in the Service Fabric image store. Once registered, it will go through its normal process of either upgrading or creating this application.
    All fine…except that when it comes to deploying your application to the Staging environment, this script will fail. Why? Because it tries to re-register the same application package version with the image store – and you can’t have duplicate deployment packages with the same version.
    Fortunately, all I had to do was tweak this PowerShell script with an additional parameter, $BypassAppRegistration, and add new sections to the script (shown in red below) to support a simple Upgrade or Create action without registration:
    [Image: modified Deploy-FabricApplication.ps1 showing the $BypassAppRegistration changes]

    Issue 2: Unique Gateway API ports for each Application Environment

    One of my microservices hosted an API layer for consumption by client applications. This API was implemented using ASP.NET Core Web API and secured using Windows Authentication. Public-facing web application endpoints are typically referred to as gateway APIs as they are the ones servicing requests from outside the cluster – and as such, you typically want to expose them via well-known, static URIs. Any other “internal-facing” microservices – whether using Remoting, HTTP or some other protocol – are typically resolved via the Service Fabric Naming Service, so they don’t need static URIs.
    When using ASP.NET Core, you can host your web application using either Kestrel or WebListener. Kestrel is not recommended for handling direct internet traffic (that said, if you put a load balancer or reverse proxy in front of it…that’s fine…as long as these front-facing services use a battle-tested implementation…e.g. IIS/Http.Sys or nginx). Regardless, in my scenario I required my API to be secured using Windows Authentication – and only WebListener supports that.
    Now to configure an endpoint in Service Fabric, it needs to be declared in the ServiceManifest.xml file – Resources section:
    <Resources>
            <Endpoints>
                <Endpoint Protocol="http" Name="ServiceEndpoint" Type="Input" Port="8354" />
            </Endpoints>
    </Resources>
      
    The problem I faced was that there was no easy way to configure a different port for each of my environments’ gateway API services – and because all my environments run on the same cluster, this is a showstopper!
    Important: It looks like you can now finally override/configure different ports using parameters – this feature was added sometime after September 2017 so this wasn’t an option at the time for me.
    Aside: for internal-facing services, you should leave out the Port – Service Fabric will dynamically allocate a random port from the application port ranges specified during the creation of your Service Fabric Cluster.
    In order to allow me to use different ports for my gateway API services, this is what I ended up doing in the Resources section:
        <Resources>
            <Endpoints>
                <!-- This endpoint is used by the communication listener to obtain the port on which to 
               listen. Please note that if your service is partitioned, this port is shared with 
               replicas of different partitions that are placed in your code. -->
                <Endpoint Protocol="http" Name="ServiceEndpoint-Development" Type="Input" Port="8354" />
                <Endpoint Protocol="http" Name="ServiceEndpoint-Uat" Type="Input" Port="8355" />
                <Endpoint Protocol="http" Name="ServiceEndpoint-Testing" Type="Input" Port="8356" />
                <Endpoint Protocol="http" Name="ServiceEndpoint-Staging" Type="Input" Port="8357" />
                <Endpoint Protocol="http" Name="ServiceEndpoint-Production" Type="Input" Port="8358" />
            </Endpoints>
        </Resources>
    Then, in the Config\Settings.xml file of my service, I added a parameter that allowed me to specify which Endpoint I wished to use:
    <?xml version="1.0" encoding="utf-8" ?>
    <Settings xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.microsoft.com/2011/01/fabric">
        <Section Name="Environment">
            <Parameter Name="WebServiceEndPoint" IsEncrypted="false" Value="" />
        </Section>
    </Settings>
      
    Finally, modify the WebApi.cs file of the API gateway service to use the relevant endpoint name:
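    The original post showed this change as a screenshot. As a rough sketch of the idea (based on the stock Service Fabric ASP.NET Core WebListener listener of the time – the class name and Startup type below are illustrative, not my exact code), the service reads the endpoint name from the Environment section of Settings.xml and hands it to the communication listener, which then listens on whichever port that Endpoint resource declares:

    using System.Collections.Generic;
    using System.Fabric;
    using Microsoft.AspNetCore.Hosting;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.ServiceFabric.Services.Communication.AspNetCore;
    using Microsoft.ServiceFabric.Services.Communication.Runtime;
    using Microsoft.ServiceFabric.Services.Runtime;

    internal sealed class WebApi : StatelessService
    {
        public WebApi(StatelessServiceContext context) : base(context) { }

        protected override IEnumerable<ServiceInstanceListener> CreateServiceInstanceListeners()
        {
            // Resolve the endpoint name (e.g. "ServiceEndpoint-Production") from Config\Settings.xml.
            var configPackage = Context.CodePackageActivationContext.GetConfigurationPackageObject("Config");
            string endpointName = configPackage.Settings.Sections["Environment"]
                                               .Parameters["WebServiceEndPoint"].Value;

            return new[]
            {
                new ServiceInstanceListener(serviceContext =>
                    new WebListenerCommunicationListener(serviceContext, endpointName, (url, listener) =>
                        new WebHostBuilder()
                            .UseWebListener()   // WebListener is required here for Windows Authentication
                            .ConfigureServices(services => services.AddSingleton(serviceContext))
                            .UseStartup<Startup>()
                            .UseUrls(url)
                            .Build()))
            };
        }
    }

    (The Windows Authentication settings themselves are configured on the WebListener options and in Startup, which I’ve omitted here for brevity.)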
    Note too that, although not required, I chose to make the WebServiceEndPoint an environment variable – primarily so that I could keep my environment name synchronized with the ASPNETCORE_ENVIRONMENT variable – so I used this in my ApplicationManifest.xml file:
  ...
  </ConfigOverrides>
  <EnvironmentOverrides CodePackageRef="Code">
    <EnvironmentVariable Name="ASPNETCORE_ENVIRONMENT" Value="[AppEnvironment]" />
  </EnvironmentOverrides>
</ServiceManifestImport>
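    For reference, [AppEnvironment] is just a regular application parameter; a sketch of how it might be declared in ApplicationManifest.xml and then overridden per environment (the values are illustrative):
    <!-- ApplicationManifest.xml -->
    <Parameters>
      <Parameter Name="AppEnvironment" DefaultValue="Development" />
    </Parameters>

    <!-- ApplicationParameters\Production.xml -->
    <Parameter Name="AppEnvironment" Value="Production" />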

    Issue 3: Ok…I’ve got a unique URI for my API Gateway…am I done?? Almost!

    Although you’ve now got a unique port for your gateway microservices, there are a few additional things you need to remember:
    • Service Fabric is in charge of balancing services across the SF Nodes…and during the life of the application, services can move from one Node to another – so don’t assume a service will remain on a particular Node!
    • Microsoft best practice recommends that you deploy gateway API services on all SF Nodes. However, even if you do, a Node can still go down – so clients still should not hardcode the URI of a particular Node.
    Fortunately, there are a few things you can do. If your cluster is in Azure, you can easily configure a Load Balancer or Azure Application Gateway. However, when running on-premise, you’d have to provide your own Load Balancer – not a big deal, but there’s an even simpler option!
    Service Fabric – regardless of whether it’s deployed on-premise or in Azure – has a built-in Reverse Proxy. Basically, you can present the cluster with the following URL (I’ve omitted some of the options for simplicity):
    http(s)://<Cluster FQDN | internal IP>:Port/<ServiceInstanceName>/<Suffix path>
    So, as an example, assuming my reverse proxy is on the default port of 19081, my application name is MyApp-Production, and my API service name is WebApi with route api/values, then clients can address my API layer as:
    https://uriToMySfCluster.com:19081/MyApp-Production/WebApi/api/values

    Final Word

    Service Fabric is a great platform for developing microservices and I’ve had a lot of fun learning and coding against it. That said, as part of my discovery process, I most certainly ran into a few unexpected hurdles – but they all had solutions! Port configuration per environment was probably the most challenging and confusing aspect to overcome. The great news though is that Microsoft is continually improving this platform and making it easier to use…and port configuration is now much easier!

    Sunday, August 1, 2010

    Self-Tracking framework with Code-Only – now supports Entity Framework CTP4

    Someone recently asked if I could update my framework to support the latest CTP4 release of the Entity Framework Feature library. Luckily I had some spare time last night – you can download the latest version from here.

    The framework now uses the new EF DbContext class instead of ObjectContext. What this means is that you can now benefit from the productivity enhancements afforded by DbContext – such as model discovery. I suggest you have a look at the following blog for a good overview on what’s new in CTP4.
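    As a trivial (and purely illustrative – the context class below isn’t part of my framework) example of that model discovery, a CTP4 DbContext can simply expose DbSet properties and have the model inferred from them:

    using System.Data.Entity;           // DbContext/DbSet as exposed by the CTP4 bits
    using Sample.DomainModel.Company;

    public class SampleContext : DbContext
    {
        // The sets below are discovered and mapped by convention.
        public DbSet<Company> Companies { get; set; }
        public DbSet<Employee> Employees { get; set; }
    }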

    Now, granted, there’s a lot more I could have done. The CTP4 release now supports the data annotations attributes. By decorating your entity class properties with these attributes, CTP4 will automatically use them as part of its code-first, convention-based programming model. In essence this means that, for the most part, you can use these attributes instead of having to use the fluent interface to map your entities (see the short example after the list below). You can certainly benefit from these now – however, note the following shortcomings which I hope to address in future:

    • Provide support for these attributes in my T4 template. Specifically, it would be nice if one could also use the [RelatedTo] data annotation attribute instead of my custom [Persist] attribute.
    • The current framework still only uses the Enterprise Library 4.1 validation block instead of the latest 5.0 version. This latest EntLib validation block release now also supports the data annotations attributes, thus this would be a good fit in promoting the principles of DRY.
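    As promised above, here is a quick, purely illustrative example of the data annotations approach (the entity below is hypothetical; the attributes come from System.ComponentModel.DataAnnotations). CTP4 picks these up by convention, so no equivalent fluent-interface mapping is needed for the key, length or nullability:

    using System.ComponentModel.DataAnnotations;

    public class Customer
    {
        [Key]
        public int Id { get; set; }

        [Required, StringLength(50)]
        public string Name { get; set; }
    }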

    Finally, I’ve made one minor refactoring in order to give you more flexibility in mapping your entities:

    1. public interface IMappingStrategy
    2. {
    3.     void MapEntities(ModelBuilder modelBuilder);
    4. }

    An implementation of this IMappingStrategy interface is now taken as a parameter to the EntityFrameworkRegistrar constructor. The idea is that you can provide your own implementation to control how your domain model entities are mapped. For example, I provide a MappingByEntityConfiguration implementation so that you can continue to create mapping classes for each of your entities. Alternatively, you can provide another implementation that allows you to do all your mapping in one class by simply using the fluent interface already provided by the EF ModelBuilder class, as in the example below:

    1. public class MyMappingStrategy : IMappingStrategy
    2. {
    3.     #region IMappingStrategy Members
    4.  
    5.     public void MapEntities(System.Data.Entity.ModelConfiguration.ModelBuilder modelBuilder)
    6.     {
    7.         // Map company
    8.         modelBuilder.RegisterSet<DomainModel.Company.Company>();
    9.         var company = modelBuilder.Entity<DomainModel.Company.Company>();
    10.         company.HasKey(c => c.Id);
    11.         company.Property(u => u.ConcurrencyToken)
    12.             .IsRequired()
    13.             .IsConcurrencyToken()
    14.             .HasStoreType("timestamp")
    15.             .IsComputed();
    16.         company.HasOptional(c => c.Ceo).WithOptionalDependent(x => x.CeoOfCompany);
    17.  
    18.         // Map Employee
    19.         modelBuilder.RegisterSet<Employee>();
    20.         var employee = modelBuilder.Entity<Employee>();
    21.         employee.Property(u => u.FirstName)
    22.                 .IsRequired()
    23.                 .IsVariableLength()
    24.                 .IsNotUnicode()
    25.                 .HasMaxLength(Employee.MaxNameLength);
    26.  
    27.         employee.Property(u => u.MiddleName)
    28.             .IsOptional()
    29.             .IsVariableLength()
    30.             .IsNotUnicode()
    31.             .HasMaxLength(Employee.MaxNameLength);
    32.     }
    33.  
    34.     #endregion
    35. }

    You could take this a step further and use reflection to discover all your entities and call the ModelBuilder RegisterSet<TEntity> automatically for all entities (much like what I already do in the provided MappingByEntityConfiguration class implementation). Then, provided your entities are already decorated with the data annotation attributes, you probably could get away without having to use the fluent interface to further configure your entity mapping.
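    Here is a rough sketch of that reflection idea (the assembly scan and the assumption that all persistent entities derive from TrackableObject are mine – adjust to your own conventions):

    using System;
    using System.Linq;
    using System.Reflection;
    using System.Data.Entity.ModelConfiguration;
    using Sample.DomainModel.Company;
    using SelfTracking.DomainModel.TrackableObject;

    public class ReflectionMappingStrategy : IMappingStrategy
    {
        public void MapEntities(ModelBuilder modelBuilder)
        {
            // Find the open generic RegisterSet<TEntity>() method once.
            MethodInfo registerSet = typeof(ModelBuilder).GetMethods()
                .First(m => m.Name == "RegisterSet" && m.IsGenericMethodDefinition && m.GetParameters().Length == 0);

            // Register every concrete entity type found in the domain model assembly.
            var entityTypes = typeof(Employee).Assembly.GetTypes()
                .Where(t => t.IsClass && !t.IsAbstract && typeof(TrackableObject).IsAssignableFrom(t));

            foreach (Type entityType in entityTypes)
            {
                // Equivalent to calling modelBuilder.RegisterSet<TEntity>() for each discovered entity.
                registerSet.MakeGenericMethod(entityType).Invoke(modelBuilder, null);
            }
        }
    }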

    It’s also interesting to note that the ModelBuilder class also has a RegisterEdmx(string) method, providing you with yet another potential way of mapping your entities (although edmx is probably not the most practical option when you’re already using code-only). My point is – you’ve got options with IMappingStrategy!!

    More on using the Self-Tracking framework…

    I’ve been using a version of this framework at my current employer. Actually, the version you have here is a drastically stripped down version of what I’ve already developed – unfortunately I won’t be able to provide you with the full version of this framework...but at least let me share some ideas and thoughts on how you could progress this framework further in your own time:

    • Use WCF!!! No really, the self-tracking entities are designed to work in a 3-tier architecture. Specifically, most of the “self-tracking” capabilities don’t really kick in (i.e. become enabled) until after the entities are deserialised. As such, I recommend that you provide a WCF service endpoint to your client applications – this WCF service will allow the client applications to save an object graph of self-tracking entities. Thus,
      • for 2-tier deployment architectures, run your WCF process in-process instead of allowing the client to directly manipulate the DAL,
      • To support unit-testing, debugging and flexibility in deployment, make it easy to run your WCF service either in-process or out-of process (i.e. external as per normal)
    • Use of the lazy loading support provided by the EF is not recommended* – in fact, I’ve turned it off in my framework. Instead:
      • Provide an ability in your framework to specify, per entity object, which navigation properties should be eager loaded by default,
      • Provide support for lazy loading over WCF. Furthermore, lazy loading shouldn’t occur implicitly, but rather should be explicitly triggered by the user. For example, have a method on Company called LoadCompanyDivisions() instead of loading the company divisions automatically the first time the divisions property is accessed. Bear in mind that the UI/client side threading model (e.g. WPF vs Silverlight) could have a bearing on how this lazy loading is performed, so ensure that you cater for this via, for example, an IoC-type solution.
      • Combine the concepts of eager loading with lazy loaded navigation properties. This will greatly improve scalability and performance aspects in your application architecture by minimizing both the number of WCF service calls as well as separate calls to your database backend.

    * To use Lazy loading in EF, your entity objects are “proxied” at run-time so as to support the implicit lazy loading behavior – however these proxied objects won’t serialize and as such you have to do some extra WCF work to get around this. In addition, this EF lazy loading won’t work client side…but client-side lazy loading can be quite useful when used wisely.

    Sunday, May 30, 2010

    Renaming Columns when using Code Only in EF 4.0

    The Code-Only feature in Entity Framework 4.0 is currently in CTP 3. Although I’ve been using it a fair bit, one particular pain point for me at the moment relates to the limited support provided to fine-tune the generated database table column mappings.

    When using Code-Only, one uses a fluent interface to define how your entities (typically POCO classes) should map to a database schema. For example, you could use the following extract to specify that the Description property in your entity should be mapped in the database to a column named Description of type VARCHAR(100) NOT NULL:

    1. Property(u => u.Description)
    2.     .IsRequired()
    3.     .IsVariableLength()
    4.     .IsNotUnicode()
    5.     .HasMaxLength(100);

    This is cool – I think the above syntax is practical and easy to understand. Even better is that one could generate a complete database schema purely from mapping your domain objects (entity classes). Unfortunately, however, the current CTP 3 release is of very limited (or no) help when it comes to doing the following:

    1. Changing the conventions used by EF when generating the corresponding column names for properties. For example, if you have a Customer entity class with primary key property named Id, then EF will by default always map/generate the property to a column named Id – there’s no way to override this default behaviour to, for example, map to CustomerId instead. Foreign key columns will typically be named xxx_id – though what if you wanted xxxId instead? It would be nice if one could specify a template of column naming conventions – something that would promote the DRY principle.
    2. You can override the column name for your properties, however you must do so for ALL your properties – not merely a subset of them.
    3. There doesn’t seem to be a way to change the ordering of generated columns – worse still, when generating a schema from your domain model, one will often find that the primary key column is not the first column in the generated tables. For now, I suspect your best option would be to run a SQL script after schema creation to re-order columns to your liking.

    Now obviously I hope that a future Code-Only release will simply provide enhancements to the current fluent interface to address the abovementioned shortcomings; in the meantime, I thought I’d explore some alternatives.

    Items 1 and 2 above could be overcome by using the following code block to explicitly specify the column name of each property (here I’m obviously assuming Type Per Hierarchy mapping but that’s not really relevant to the source problem):

    1. MapSingleType(t => new
    2.     {
    3.         CompanyDivisionID = t.Id,
    4.         CompanyDivisionName = t.Name,
    5.         t.Description,
    6.         t.IsActive
    7.     });

    Note, by using the anonymous class,

    1. The property Id will be mapped to a column named CompanyDivisionID
    2. If no alternate name is specified, then EF will simply map the name as per default; e.g. the Description and IsActive properties will be mapped to the columns with names Description and IsActive, respectively.

    Obviously, having to explicitly mention all your columns (for all your entities) in the mapping so that you can override the default generated column naming convention is tedious at best. I suspect code generation is probably the best interim solution here. Now before you get all excited – I haven’t come up with anything near a complete solution (personally, I think you should rather wait for the next Code-Only release); but at least I can help you get there…or perhaps you’ll learn something in the process…I know I have.

    Using LINQ Expressions to Generate Code

    For the code generation you could probably (quite successfully) use T4; however, I thought I’d try something that I haven’t had much chance to play around with yet…generating code via LINQ Expressions.

    Basically, LINQ introduces a new approach to generating code that is a bit more high-level than the IL-based Emit technique. LINQ introduces a runtime representation for expressions, which you can think of as a language-neutral abstract syntax tree that can easily be compiled into IL.
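    If you haven’t played with expression trees before, here’s a tiny standalone illustration of building an expression at run time and compiling it into a callable delegate:

    using System;
    using System.Linq.Expressions;

    class ExpressionDemo
    {
        static void Main()
        {
            // Build the expression tree for: x => x * 2
            ParameterExpression x = Expression.Parameter(typeof(int), "x");
            Expression<Func<int, int>> doubler =
                Expression.Lambda<Func<int, int>>(Expression.Multiply(x, Expression.Constant(2)), x);

            // Compile the tree into IL and invoke the resulting delegate.
            Func<int, int> compiled = doubler.Compile();
            Console.WriteLine(compiled(21)); // prints 42
        }
    }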

    Before showing how one could address this problem using Expressions, first realise that the EF provides a different syntax from the one shown above, which is better suited to dynamic code generation:

    1. MapSingleType(t => EntityMap.Row(
    2.     EntityMap.Column(t.Id, "CompanyDivisionID"),
    3.     EntityMap.Column(t.Name, "CompanyDivisionName"),
    4.     EntityMap.Column(t.Description),
    5.     EntityMap.Column(t.IsActive)));

    To generate this code dynamically at run time using LINQ Expressions, we need to dynamically build up an expression tree that represents the Lambda function above. The following code snippet shows a simple implementation of how this could be done:

    1. PropertyInfo[] properties = typeof(TEntity).GetProperties(); // use reflection to get the entity's (persistent) properties
    2. ParameterExpression tParam = Expression.Parameter(typeof(TEntity), "t");
    3.             
    4. var allEntityMemberExpressions = new List<Expression>();
    5. foreach (var p in properties)
    6. {
    7.     ConstantExpression pName = Expression.Constant(p.Name, typeof(string));
    8.     var member = Expression.Convert(Expression.MakeMemberAccess(tParam, p), typeof (object));
    9.     var entityMemberExpression = Expression.Call(typeof (EntityMap), "Column", null, member, pName);
    10.     allEntityMemberExpressions.Add(entityMemberExpression);
    11. }
    12.  
    13. // convert allEntityMemberExpressions into an array of expressions
    14. // and call EntityMap.Row(EntityMapColumn[])
    15. var entityMapRows = Expression.Call(
    16.     typeof(EntityMap), "Row", null,
    17.     Expression.NewArrayInit(
    18.         typeof(EntityMapColumn),
    19.         allEntityMemberExpressions));
    20.  
    21. MapSingleType(
    22.     // create the lambda: t => EntityMapRow(...)
    23.     Expression.Lambda<Func<TEntity, object>>(
    24.         entityMapRows,
    25.         new ParameterExpression[] { tParam }));

    For the most part, hopefully this is self-explanatory – nonetheless, a few explanations:

    1. Line 7: We create a constant expression to represent the column name we want to give our property
    2. Line 8: Expression.MakeMemberAccess will create the expression that represents t.Property. However, some of our properties are likely primitive types (e.g. int), hence we explicitly cast to type object to avoid a runtime exception in line 9 stating that the static EntityMap.Column(object, string) method couldn’t be resolved.
    3. Line 17: Although we have a collection of expressions – each representing a property/column mapping – we need to convert this to an array of expressions due to the signature of the static EntityMap.Row(EntityMapColumn[]) method. This is achieved via the Expression.NewArrayInit method.

    Anyway, unless you’re a computer, this code is certainly more readable than IL Emit code. This might not necessarily be the best way to address my original problem – but I’ve learned something new today that I’m sure I’m going to find useful in the future…

    Thursday, May 20, 2010

    PInvokeStackImbalance in .NET 4.0…I beg your pardon?

    It’s been quite some time since I’ve played with PInvoke…mmm - could it be as far back as my .NET 1.1 days…back when I just started to appreciate this little thing called “Garbage Collector”…the days where I just started to realise there’s more to life than sacrificing cute and cuddly animals on that big bloody table called C++…selling my soul to Rational (Purify) in the process…yet still somehow convinced that my just completed sacrifice was in vain and instead I just prepared a feast for the Memory Leak devil…
    …and just when you get comfortable in the Heaven of managed code…you get smacked right in the face with some old piece of C code…dancing on your now soft delicate body…a body that’s been marinated from the many years of coding in your CLR play pen… and invoking your privates with a stack of imbalance…SAY WHAT???

    PInvokeStackImbalance

    I came across this today after upgrading a class library that uses PInvoke to .NET 4.0. Specifically, after running the code, I got a PInvokeStackImbalance detection message from the VS2010 Managed Debugging Assistant (MDA). Obviously, my first thought was, as indicated by the message, that the PInvoke signature didn’t match the types of the C function I was trying to call. However, I still couldn’t see a problem – all the parameter types seemed correct. Also, since I’m running Win 7 64-bit, I figured this could possibly be a 64-bit issue – so I changed the Platform target in my build settings from Any CPU to x86. Still no success.
    In the end, the problem seemed to disappear once I changed my build target for the PInvoke class library to .NET 3.5 (2.0 also worked fine) instead. The code worked fine…however this solution just didn’t smell right to me, so I did a bit more digging…and good thing I did!
    Turns out that this issue is mentioned in the .NET 4.0 Migration Issues document under the Interoperability section – I’ve copied the relevant section below for quick reference:
    Platform invoke: To improve performance in interoperability with unmanaged code, incorrect calling conventions in a platform invoke now cause the application to fail. In previous versions, the marshaling layer resolved these errors up the stack. Debugging your applications in Microsoft Visual Studio 2010 will alert you to these errors so you can correct them.
    If you have binaries that cannot be updated, you can include the <NetFx40_PInvokeStackResilience> element in your application's configuration file to enable calling errors to be resolved up the stack as in earlier versions. However, this may affect the performance of your application.
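    For reference, that opt-out goes in your application's configuration file under the runtime element, like so:
    <configuration>
      <runtime>
        <!-- Restores the pre-.NET 4.0 behaviour of resolving calling convention errors up the stack -->
        <NetFx40_PInvokeStackResilience enabled="1" />
      </runtime>
    </configuration>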
    Thus, in earlier versions of .NET, we were able to get away with the incorrect calling convention as the .NET Framework silently corrected the issue (albeit at the cost of introducing a performance penalty).
    Armed with this knowledge, it became apparent that the real source of my PInvoke problem was that the PInvoke calling convention for my method defaulted to StdCall – and I needed to use Cdecl instead. Once I updated the code, I was able to restore my class library to target the .NET 4.0 framework.
    1. [DllImport("libfftw3f-3.dll", EntryPoint = "fftwf_plan_r2r_1d", ExactSpelling = true, CallingConvention = CallingConvention.Cdecl)]
    2. public static extern IntPtr r2r_1d(int n, IntPtr input, IntPtr output, fftw_kind kind, fftw_flags flags);
    Over the years I might have turned soft from my lack of frequent C++ coding, but at least today I’ll sleep well knowing there’ll be no platform invoking my stack of imbalance…

    Thursday, March 11, 2010

    Entity Framework 4 - Modified Self-Tracking Entity framework that supports Code-Only and Persistence Ignorance

    Source Code (Updated):  Selftracking.zip

    Overview

    I like the Self-Tracking templates that come with the latest release of the Entity Framework in VS2010 RC. If you haven't read much about it before, have a look at the ADO .NET team blog.
    In this post I'll share with you a customised version of the VS2010 RC Self-Tracking framework that I've built in my spare time. Although it is largely based on the Entity Framework Self-Tracking feature in VS2010 RC, it also contains a fair number of enhancements. Specifically, this framework:
    1. Provides support for Code Only Self-Tracking entities and complex types that do not involve the EF designer. Don't get me wrong; I like the designer; but it doesn't scale well once you have more than a handful of classes. I guess one could possibly modify/tweak the existing templates so you can split your domain model across different edmx files - but I wanted a pure code-only approach...so instead I've created my own T4 template that uses your domain object classes as the source of its generation.
    2. Demonstrates how to implement a persistence-ignorant data access layer for EF. Specifically, I expose IUnitOfWork, ISession and IRepository interfaces and register my EF-specific implementations via an IoC container - in this case Unity.
    3. Includes a few minor tweaks and bug fixes to improve common usage scenarios with Self-Tracking entities...more about this later.
    4. Shows how to use the self-tracking framework via the included sample domain model and unit tests. 
    Note: I haven't incorporated WCF usage in the sample provided here - however I already have used it in this context with great success...perhaps I will blog about this in future…
    Before I begin, I just want to re-iterate that the code directly related to implementing self-tracking support is primarily a copy/paste job from the code generated by the Self-Tracking templates feature of the VS2010 RC release. I have however wrapped most of the generated code into methods that can be succinctly invoked from your domain objects. Most of the work I’ve done involved the creation of a new T4 code generation template as well as making the data access layer persistence ignorant.
    I'd also like to mention that I found the following resource quite helpful when I started out with EF4 - especially the bits regarding mapping POCO objects and repository implementation for loading navigation properties - it’s certainly worth a read if you’re new to EF4.
    Without further ado, let’s get started and see how to use this framework.

    Getting Started: The Sample Project

    The image below shows the project structure.
    [Image: solution structure showing the framework and sample projects]
    The following provides more detail on the individual framework projects:
    • SelfTracking.Core. This project provides our IoC implementation (via Unity). It also contains a helper method to make validation with the Enterprise Library Validation block a bit easier when the type of the object being validated is not known at compile time. 
    • SelfTracking.DataAccessLayer contains the interfaces for our persistence ignorant data access layer.
    • SelfTracking.DataAccessLayer.EF is our Entity Framework-specific implementation of the Data Access Layer interfaces. There’s also a repository implementation for non-self-tracking entities.
    • SelfTracking.DomainModel contains the Self-Tracking framework. It also contains the T4 template you should copy to your own project and some code snippets you may find useful when writing your domain objects and/or properties.
    The Sample projects demonstrate usage of this self-tracking framework by introducing a very simple domain model that includes one-to-many & one-to-one associations as well as a complex object.
    • Sample.DataAccessLayer introduces the IStandardTrackableObjectRepository interface for our sample project. I’ve chosen to only use one repository interface for this domain as it more than suffices – however larger projects will most likely require more. 
    • Sample.DataAccessLayer.EF contains the EF implementation of IStandardTrackableObjectRepository, mappings for our domain objects as well as a registrar class to assist with mapping/registration of our domain objects with our IoC (Inversion of Control) container.
    • Sample.DomainModel contains our simple domain model.
    • Sample.TestDataAccessLayer contains a few unit tests to keep me honest…mmm, since I'm talking about being honest here...these are actually integration tests, not unit tests - either way, you'll get some insight on how this all fits together by looking at these.
    The example domain is depicted below - a few important points are worth mentioning:
    • There’s a many-to-many relationship between Company and CompanyDivision (e.g. “Marketing”, “Accounting”…). Notice how this is mapped as two one-to-many relationships rather than a single many-to-many relationship (see the sketch below). This is deliberate and I would strongly encourage you to do the same in your projects. Basically, one often needs to store additional data as part of a many-to-many relationship (e.g. “created by user” or “is active” if you do logical deletes) – this can’t be achieved via a single many-to-many mapping. Even if you do not initially think that you’ll need to store additional data with this type of relationship, you could be in for a bit of rework later when you need to change it. But don’t just take my word for it…NHibernate in Action gives a much better and more lengthy discussion on this matter.
    • Each Company can have a CEO (0-1 multiplicity)
    • Each Employee must be employed by a company, and each company can employ many employees (one-to-many)
    • For demonstration purposes, Address is a deliberate (bad) example of a complex object. Typically you’d make this an entity instead so that you can have Foreign keys for things like State and PostCode (i.e. Zip code for folks in the US)
    [Image: sample domain model diagram]
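    To make the point about the many-to-many mapping concrete, here is an illustrative link entity (the class and property names are mine, not necessarily the sample’s) that turns the association into two one-to-many relationships and gives the extra relationship data somewhere to live:

    // Company 1..* CompanyDivisionAssignment *..1 CompanyDivision
    public class CompanyDivisionAssignment
    {
        public int Id { get; set; }

        public Company Company { get; set; }
        public CompanyDivision Division { get; set; }

        // The sort of extra data a single many-to-many mapping can't hold:
        public string CreatedByUser { get; set; }
        public bool IsActive { get; set; }
    }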
     

    Using the T4 Template

    When creating a new domain model project, first copy the TrackableObjectCodeGen.tt T4 template file to your new project. Also ensure that you specify TextTemplatingFileGenerator for the Custom Tool property in the properties window for this file. You’ll also have to manually trigger the execution of this file whenever you modify your domain projects – however you can also get it to automatically generate code every time you build your project.
    I’ll use the Employee class from the sample domain project as an example as it contains quite a few different types of properties. The following listing shows the code – i.e. this contains the code you have to write – not the code that’s generated.

    Code Snippet
    1. using System;
    2. using System.Collections.Generic;
    3. using System.Linq;
    4. using System.Runtime.Serialization;
    5. using System.Text;
    6. using Microsoft.Practices.EnterpriseLibrary.Validation.Validators;
    7. using Sample.DomainModel.Common;
    8. using SelfTracking.DomainModel;
    9. using SelfTracking.DomainModel.TrackableObject;
    10.  
    11. namespace Sample.DomainModel.Company
    12. {
    13.     [Serializable]
    14.     [DataContract(IsReference = true), KnownType(typeof(Address)), KnownType(typeof(Company))]
    15.     public partial class Employee : ActiveBasicObject, IConcurrencyObject
    16.     {
    17.         #region Fields
    18.         public const int MaxNameLength = 20;
    19.         private Address _HomeAddress;
    20.         private bool _HomeAddressInitialised;
    21.         private string _FirstName;
    22.         private string _MiddleName;
    23.         private string _LastName;
    24.         private Company _CeoOfCompany;
    25.         private Company _EmployedBy;
    26.         private byte[] _concurrencyToken;
    27.         #endregion
    28.  
    29.         #region Primitive Properties
    30.         [DataMember]
    31.         [StringLengthValidator(1, MaxNameLength)]
    32.         public string FirstName
    33.         {
    34.             get { return PrimitiveGet(_FirstName); }
    35.             set { PrimitiveSet("FirstName", ref _FirstName, value); }
    36.         }
    37.         
    38.         [DataMember]
    39.         [IgnoreNulls]
    40.         [StringLengthValidator(1, MaxNameLength)]
    41.         public string MiddleName
    42.         {
    43.             get { return PrimitiveGet(_MiddleName); }
    44.             set { PrimitiveSet("MiddleName", ref _MiddleName, value); }
    45.         }
    46.         
    47.         [DataMember]
    48.         [StringLengthValidator(1, MaxNameLength)]
    49.         public string LastName
    50.         {
    51.             get { return PrimitiveGet(_LastName); }
    52.             set { PrimitiveSet("LastName", ref _LastName, value); }
    53.         }
    54.  
    55.         /// <summary>
    56.         /// Gets or sets the concurrency token that is used in optimistic concurrency checks
    57.         /// (e.g. typically maps to timestamp column in sql server databases)
    58.         /// </summary>
    59.         /// <value>The concurrency token.</value>
    60.         [DataMember]
    61.         public byte[] ConcurrencyToken
    62.         {
    63.             get { return PrimitiveGet(_concurrencyToken); }
    64.             set
    65.             {
    66.                 // no point using a "proper" equality comparer here as this field is auto-calculated and will typically only be set by the persistence layer
    67.                 PrimitiveSet("ConcurrencyToken", ref _concurrencyToken, value, (a, b) => false);
    68.             }
    69.         }
    70.  
    71.         #endregion
    72.  
    73.         #region Complex Properties
    74.         [DataMember]
    75.         [ObjectValidator]
    76.         public Address HomeAddress
    77.         {
    78.             get { return ComplexGet(ref _HomeAddress, ref _HomeAddressInitialised, HandleHomeAddressChanging); }
    79.             set
    80.             {
    81.                 ComplexSet("HomeAddress", ref _HomeAddress, value,
    82.                            ref _HomeAddressInitialised, HandleHomeAddressChanging);
    83.             }
    84.         }
    85.         #endregion
    86.  
    87.         #region Navigation Properties
    88.         /// <summary>
    89.         /// If this employee is a CEO of a company, this property navigates to the associated company.
    90.         /// </summary>
    91.         /// <value>The ceo of company.</value>
    92.         [DataMember, Persist("Company-Ceo")]
    93.         public Company CeoOfCompany
    94.         {
    95.             get { return OneToOneNavigationGet(_CeoOfCompany); }
    96.             set
    97.             {
    98.                 FromChildOneToOneNavigationSet("CeoOfCompany", ref _CeoOfCompany, value,
    99.                                                root => root.Ceo, (root, newValue) => root.Ceo = newValue, true);
    100.             }
    101.         }
    102.  
    103.         /// <summary>
    104.         /// The company that currently employs this employee - mandatory (otherwise this isn't an employee!)
    105.         /// </summary>
    106.         /// <value>The employed by.</value>
    107.         [DataMember, Persist("Company-Employees")]
    108.         [NotNullValidator]
    109.         public Company EmployedBy
    110.         {
    111.             get { return ManyToOneNavigationGet(ref _EmployedBy); }
    112.             set
    113.             {
    114.                 ManyToOneNavigationSet("EmployedBy", ref _EmployedBy, value, oneside => oneside.Employees);
    115.             }
    116.         }
    117.  
    118.         #endregion
    119.     }
    120. }
       
    This class inherits from ActiveBasicObject – although typically you would only need to derive from the TrackableObject class. I’ve used ActiveBasicObject as an easy way to provide the majority of my entities with an Id and IsActive Boolean property (for logical deletes).
    Undoubtedly, all those Primitive/ManyToOne/Get/Set methods in the property implementations are going to look strange to you. These methods serve 2 important purposes:
    • It instructs the T4 template what type of property it is so that it can generate the right code (the T4 template inspects the setter property body implementations for the presence of these methods).
    • They contain all the code that is necessary for Self-Tracking to function – including raising property changed events and ensuring that navigation properties on both sides of the association are kept in sync. Basically, these methods contain the code as copied from the Self-Tracking VS2010 RC release – however they’ve been cleaned up a bit and neatly wrapped into one-liner methods. Aside: in most cases, the getter methods don’t really do anything other than returning the supplied field value. As such, they aren’t really required and the template will continue to function without them. Nonetheless, I think this adds a nice symmetry to the property implementations by having both get/set methods – besides, you could extend the getter methods to implement some of your own functionality (akin to what one would do with Aspect-Oriented Programming).
    When running the T4 template, apart from generating a new corresponding partial class with the required self-tracking supporting code, the T4 template may also make a few minor changes to your original domain object class – this is certainly not something you’d normally want to do with code generation…and before you get all excited and scream foul play, consider that it will:
    • Make your domain object class partial, if it isn’t already;
    • Add any missing KnownType attributes at the top of your class for all the distinct reference and DateTimeOffset persistent property types contained in your self-tracking class. For example, if you have a Person property in a class X, and you also have a Customer class that derives from Person, both Customer and Person KnownType attributes will be added to class X. That said, due to the limitations of EnvDTE (which I used to drive the code gen), the T4 template will ignore interfaces. If however you consider the task of manually decorating your classes with KnownType attributes as one of the true pleasures of life, then by all means don’t let me stop you from having fun - simply set the AddMissingKnownTypeAttributes global parameter in the T4 template to false.
    • Notify you via warnings in the VS studio error list pane if it made any of the abovementioned changes to your classes.
    In addition, you most likely also noticed the Persist attributes. In short, without these attributes the T4 template would have a really hard time determining the pair of properties in the two different classes that partake in the same one-to-one or one-to-many association. The text in the attribute contains the name of the association, and the T4 template will notify you via errors if it is unable to match association properties due to incorrectly used/missing Persist attributes. Finally, I’ve also decorated the properties with attributes from the Enterprise Library Validation block – use of these is completely optional – feel free to use whatever framework/mechanism you prefer.
    The following code block shows the other end of the “Company-Employees” association that lives in the Company class. Note the use of the OneToManyNavigationGet/Set methods instead of ManyToOneNavigationGet/Set. An important thing to realise here too is that the FixupEmployees method is generated by the T4 template – it MUST be named Fixup<PropertyName> (e.g. FixupEmployees), although feel free to modify/enhance the T4 template if you’re not happy with this restriction. In addition, obviously you’re going to get a compile error unless you first trigger the T4 template to generate code. I’ll stress it again…don’t forget to regenerate your code before you compile.
    Code Snippet
    1. [DataMember, Persist("Company-Employees")]
    2. public TrackableCollection<Employee> Employees
    3. {
    4.     get
    5.     {
    6.         return OneToManyNavigationGet(ref _Employees, FixupEmployees);
    7.     }
    8.     private set
    9.     {
    10.         OneToManyNavigationSet("Employees", ref _Employees, value, FixupEmployees);
    11.     }
    12. }

    The following examples show the different property types supported by this self-tracking framework, together with the Visual Studio code snippet shortcut for each.
    Primitive Property
    (code snippet: tpropp)
    1. [DataMember]
    2. public string FirstName
    3. {
    4.     get { return PrimitiveGet(_FirstName); }
    5.     set { PrimitiveSet("FirstName", ref _FirstName, value); }
    6. }
    Complex Object Property
    (code snippet: tpropc)
    1. private Address _HomeAddress;
    2. private bool _HomeAddressInitialised;
    3.  
    4. [DataMember]
    5. public Address HomeAddress
    6. {
    7.     get
    8.     {
    9.         return ComplexGet(ref _HomeAddress,
    10.             ref _HomeAddressInitialised, HandleHomeAddressChanging);
    11.     }
    12.     set
    13.     {
    14.         ComplexSet("HomeAddress", ref _HomeAddress, value,
    15.             ref _HomeAddressInitialised, HandleHomeAddressChanging);
    16.     }
    17. }
    One to Many
    (code snippet: tpropotm)
    1. private TrackableCollection<Employee> _Employees;
    2.  
    3. [DataMember, Persist("Company-Employees")]
    4. public TrackableCollection<Employee> Employees
    5. {
    6.     get
    7.     {
    8.         return OneToManyNavigationGet(ref _Employees, FixupEmployees);
    9.     }
    10.     private set
    11.     {
    12.         OneToManyNavigationSet("Employees", ref _Employees,
    13.             value, FixupEmployees);
    14.     }
    15. }
    Many to One
    (code snippet: tpropmto)
    1. private Company _EmployedBy;
    2.  
    3. [DataMember, Persist("Company-Employees")]
    4. public Company EmployedBy
    5. {
    6.     get { return ManyToOneNavigationGet(ref _EmployedBy); }
    7.     set
    8.     {
    9.         ManyToOneNavigationSet("EmployedBy", ref _EmployedBy,
    10.             value, oneside => oneside.Employees);
    11.     }
    12. }
    Root One to One (use in class considered Primary/Root in the association)
    (code snippet: tproproto)
    1. private Employee _Ceo;
    2.  
    3. [DataMember, Persist("Company-Ceo")]
    4. public Employee Ceo
    5. {
    6.     get { return OneToOneNavigationGet(_Ceo); }
    7.     set
    8.     {
    9.         FromRootOneToOneNavigationSet("Ceo", ref _Ceo, value,
    10.             child => child.CeoOfCompany,
    11.             (child, newValue) => child.CeoOfCompany = newValue);
    12.     }
    13. }
    Child One to One (use in class considered the “Child” in the association)
    (code snippet: tpropcoto)
    1. private Company _CeoOfCompany;
    2.  
    3. [DataMember, Persist("Company-Ceo")]
    4. public Company CeoOfCompany
    5. {
    6.     get { return OneToOneNavigationGet(_CeoOfCompany); }
    7.     set
    8.     {
    9.         FromChildOneToOneNavigationSet(
    10.            "CeoOfCompany", ref _CeoOfCompany, value,
    11.            root => root.Ceo,
    12.            (root, newValue) => root.Ceo = newValue, true);
    13.     }
    14. }

    Unit of Work

    In order to achieve persistence ignorance, the Unit Of Work implementation hides/wraps the details of the EF4 ObjectContext class. In addition, this encapsulation also has the added benefit of making it really easy to enhance/augment the capabilities of the object context so as to fit one’s particular project needs. In my case, I’ve incorporated the following useful aspects within my Unit of Work implementation:
    • When you commit changes, all self-tracking entities will “automagically” be reset to an unmodified state. The alternative would have been for you to manually reset all entities, including child entities referenced via navigation properties in your object graph back to unmodified.
    • The unit of work exposes a GetManagedEntities method that provides easy access to all entities that are currently managed/attached to the Unit of Work (i.e. the EF4 ObjectContext). This method also makes it easy to filter managed entities by their current modification status (i.e. added/deleted, unmodified/modified).
    It’s worth mentioning that one *could* also easily extend the Unit of Work implementation to validate all domain objects upon commit (and/or when they are being attached to the Unit of Work). This would certainly provide an easy, automatic and centralised mechanism for ensuring that business logic rules are enforced in your domain. Sounds too easy…?? Well it probably is!! Personally I won’t recommend it as I think validation logic is typically best done at a higher tier in your architecture, such as at a service layer. For example, one would typically want to validate business rules for user-supplied data BEFORE commencing any expensive operations – if validation is postponed until commit, then you’ve potentially just wasted a lot of cycles and made your service more susceptible to denial of service attacks. Also consider that sometimes you may not have enough operational contextual information available at the DAL layer to properly (and efficiently) enforce business rules, as the DAL is too low in your tiered architecture to make this easy (e.g. are there workflows/notifications that need to be considered in addition to this save operation?). Still, I mention this because for simple scenarios this may be all you need – see the example below of how one could go about this.
    Code Snippet
    1. public void Commit(bool acceptChanges = true)
    2. {
    3.     /* One *could* validate modified objects here as follow... although typically this is better done at
    4.      * a service level or higher business tier in your application. */
    5.  
    6.     List<ValidationResults> validationResults = new List<ValidationResults>();
    7.     foreach (var entity in GetManagedEntities<IObjectWithChangeTracker>(ObjectStateConstants.Changed))
    8.     {
    9.         var result = ValidationHelper.Validate(entity);
    10.         if(!result.IsValid)
    11.         {
    12.             validationResults.Add(result);
    13.         }
    14.     }
    15.     if(validationResults.Count > 0)
    16.     {
    17.         throw new ValidationException(validationResults);
    18.     }
    19.     
    20.  
    21.     EntitySession.Context.SaveChanges(acceptChanges ? SaveOptions.AcceptAllChangesAfterSave : SaveOptions.None);
    22.     if(acceptChanges)
    23.     {
    24.         AcceptChanges();
    25.     }
    26. }
    Notice the use of the ValidationHelper class in line 9. Basically, the Enterprise Library validation block only provides a generic Validate<T>(T object) method. The problem with this is that if you don’t know the type of the object you want to validate until run time, this validate method will be of little use as it will only consider validation attributes for the type known at compile time. For example, if T is, say, BasicObject, and you supply an instance of Employee – only properties in the BasicObject class will be validated. The ValidationHelper class overcomes this limitation by simply using reflection to dynamically invoke the Enterprise Library validation method with the correct type for T.
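    The following is a minimal sketch of that idea (the helper shown here is illustrative rather than the exact implementation in the download):

    using System.Linq;
    using System.Reflection;
    using Microsoft.Practices.EnterpriseLibrary.Validation;

    public static class ValidationHelper
    {
        public static ValidationResults Validate(object target)
        {
            // Find the open generic Validation.Validate<T>(T target) overload...
            MethodInfo validate = typeof(Validation).GetMethods(BindingFlags.Public | BindingFlags.Static)
                .First(m => m.Name == "Validate" && m.IsGenericMethodDefinition && m.GetParameters().Length == 1);

            // ...and close it over the *runtime* type of the target, so that validation attributes
            // declared on the actual (derived) class are taken into account.
            return (ValidationResults)validate
                .MakeGenericMethod(target.GetType())
                .Invoke(null, new object[] { target });
        }
    }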

     

    Running the Tests

    To run the included unit tests, please ensure that you modify app.config in the Sample.TestDataAccessLayer project with a valid SQL Server 2005/2008 connection (SQL Express is fine).
    When you run the tests, the schema will automatically be dropped and re-created as per the mappings defined in the Sample.DataAccessLayer.EF project. The creation of the schema is controlled via the following unit test class initializer – “Test” specifies the connection string name in app.config, and true indicates that the database should be dropped/recreated as opposed to simply ensuring that it exists.
    Code Snippet
    1. [ClassInitialize()]
    2. public static void MyClassInitialize(TestContext testContext)
    3. {
    4.     new DataAccessLayerRegistrar("Test").EnsureDatabase(true);
    5. }
       
    I hope this will help you get started with this customised version of the EF Self-Tracking framework. There’s certainly a lot more I could have blogged about…but this post is getting much longer than I originally anticipated. Please let me know if there are particular features you’d like to see discussed in more detail in future posts.
    Happy coding!
    Adriaan