Sunday, May 1, 2011

Dealing with versions in REST? Think again, you may not need to.

In the recent IWS 2010, I saw Zachman presenting an interesting note about enterprise architecture. He quoted Brooks by saying that redundancy drives the system to chaos, entropy. Well, actually some “managed” or better said “controlled” redundancy is necessary to allow scalability. In a nice context, that redundancy is a “mirrored” one, a copy, an instance.  That means each part that is redundant is the exact same copy of all other redundant parts. But not all is so nice: there is a meaner redundancy, the one created by specialization or customization. That one is partial redundancy, where one part of the instance is the same for all instance, but there is a part that is unique, different from the others. When that happens, we usually say we have a new version.

So, version is a way to differentiate between two things that are similar but not quite the same. Version is something like an ID. That is partial redundancy. If there were no redundancies, or if all redundancies were absolute, then we would not need versions.

Now, if two versions of something were to be treated the same, no difference in processing them, then there is no need to differentiate.  In other words, nobody cares about the part that is different in the instances. Thus, version can be defined as the ID plus the special processing or treatment that ID requires.

Let’s look it using another perspective. To create a version, we need to change something. That is: data, structure, a process, something that is changed. The original form is kept and the new form is assigned a version ID to differentiate it from the original. In other words, versioning is basically a technique to handle the differences that appear when something changes. The real issue is then modifiability.

Modifiability? When we modify something and the rest of the system has to adapt to that change, we say we have no modifiability: the system is tightly coupled.  Modifiability is a quality property that allows some things (even instances) to change without implying the system will break. It is the capacity of the system to support the change of one of its parts (here, each individual instance is counted as one part). When something changes and we start creating special version of things to handle the change and keep the old parts still working, we are adding partial redundancy.  And that is bad, remember?

So, as a piece of advice, try to stay away from versioning as hard as you can.

But, what if you need to use versions, how do you work with them in REST? Well, REST is a style that welcomes redundancy, but tries to keep the flexibility to change without disrupting the system. That is, it offers modifiability.

Let’s see what can change. In REST we have different servers that can come and go. Topology can change. Since we are using layers, the fact that a server changes in a second level or beyond does not affect us. If the server that goes away is the one that we are talking to, there should be another that takes its place (redundancy). If there is no other, then we are in trouble. So, here we have total redundancy, not partial. 

Now, let see the next level. The resources that are actually operated by a server may change. Say, we have no longer a text file but a database, or the resource is no longer a static image but a video, any change that a resource can suffer. Since we manage resources through representations, that change may not be a problem. It is solved by keeping the same representations as before and even adding new ones taking advantage of the changes in the resource.

Ok, we may say: what if the representation is the that changes!? Well, here we have two types of possible changes: Structural or content. If the representation structure changes, then we may be in front of a totally new structure that needs a new media type. Maybe the structure changes a little, like in WSDL 1.1 and 2.0. Well, in that case, the new 2.0 needs to get a new IANA registry (actually it has an application/wsdl+xml assigned for 2.0) . The client will know what is being sent by reading the media type. Humm, ok, it may not happen, new types for new versions of already existing media types may be difficult, in that case it could be the new type description contains backward compatibility, or simple that the old spec is not supported anymore.

Now, let’s analyze content modification in the representation. What does it mean? I guess that the usual data that was being sent by the application is not the same as usual. Humm. That means the client is statically bounded to a set of values? Well, that is not a problem of REST, right? The content should then have a way to tell the client what has changed. A version you said, well yes, but application driven. REST has no restriction of what values or fields you sent out, one of them could be a version number, but that is the application data, application semantics, totally out of REST concern.
A little note here. We may have a well-known generic data type like XML. And we may have one normal document structure the client is reading. What if we add some new elements with some new data, and remove others? Well, here we are not changing the data type (XML is still XML). What we changed is the XML schema. And of course, the schema is specified in the XML body. So, a client can perfectly parse the XML if it understands the schema. Again, nothing related to REST.

Let’s continue. Now, I want to change the actual process. Say, we need to do some posts to complete a process. Suddenly, a new Post and a verification Get is required. In this case, unless the client is a static thing that is not following Hypermedia as the Engine of Application State constrain, that is solve alone by adding a couple of links to one particular resource representation.

Ok, the resource that is changed with two more links may be the one to be versioned, right? Well no. If you still need the same resource representation, either create a new representation, create a modified special representation based on the client request (may be a query parameter) or create a new resource that will be used upon a client selection. Think of a virtual store that has a normal four step process to check out.  If you have a credit card registered, you may want to skip the credit card form, or even perform a one click check out. Note that in these cases, we are simple adding options that will drive the client to new process flow, old clients (that only understand the full checkout process, although that is not nice implemented) may be able to follow that old process link. So, we keep all old thing add add new ones without breaking the app, the magic of Hypermedia.

Did I forget something? I guess not. All the cases can be resolved without making artificial URLS with versions or creating custom media types. The only cases where a version id may be needed are the ones that are application specific and not related to REST.

Cheers!