Sunday, July 25, 2010

What if you need Transactions and also REST?

Some time ago, in a discussion on REST-Discuss, an example of transactions in REST was given. The solutions are usually complex and tedious. This is part of one of my posts, edited a little to include the sample case.

Case: We have a travel booking service. The traveler usually selects a flight and then a hotel to stay in. If the flight cannot be confirmed, the room booking should not be confirmed either. Here, we have a transaction. It is all or nothing. Good.

How do you model that in REST?

The usual method is to have two resources, the flight booking and the room booking. There is also a third resource you create, a transaction resource. It contains the transaction data: either what to do or what was done. Finally, you commit or confirm the transaction resource, and everything is committed.

Well. For this, I assume there are two different data sources, and we can commit to each one separately. But when one transaction involves an individual transaction at each source, you use the famous two-phase commit. Each source is a participant, right? The airline and the hotel are two different companies, so your transaction is not simple.
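As a rough sketch of that two-phase commit, assume each source can be asked to prepare, commit, and roll back. The Participant class, its method names, and the can_prepare flag are all made up for illustration; a real airline or hotel system would expose something different.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for one data source (the airline or the hotel)."""
    name: str
    can_prepare: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if the booking can be held.
        self.state = "prepared" if self.can_prepare else "refused"
        return self.can_prepare

    def commit(self) -> None:
        # Phase 2: make the held booking final.
        self.state = "committed"

    def rollback(self) -> None:
        # Undo whatever was held during phase 1.
        self.state = "rolled_back"


def two_phase_commit(participants) -> bool:
    """Commit everywhere only if every participant votes yes."""
    asked = []
    for p in participants:
        asked.append(p)
        if not p.prepare():
            # One refusal aborts everything that was asked so far.
            for q in asked:
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True
```

If the airline refuses, the hotel is never asked to commit, which is the all-or-nothing behavior the case needs.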

We can have one server dedicated to managing transactions. We can create the transaction object in it, add the steps, and then ask it to perform them (it can even perform each step as it arrives and verify that all worked at the end, or roll back at any step if needed). The problem is that we need to send it, manually, all the transaction steps and actions, so that idea may suffer from scalability problems.

On the other hand, the client has to do all that processing to commit the transaction. That is, the client needs to be transaction-aware.

My feeling is that exposing the data entities as resources, and leaving all the commit processing to the client, exposes too much application detail. It may not break REST, but it adds unnecessary complexity.

In the example we model two sources for the two-phase commit: the airline resource and the hotel room resource. It is then implied that both are like databases, and even more, separate database engines. And the client will have to drive the transaction management to change data in both and then commit. That is implicitly forcing the concept of a resource, but it still sounds like REST.

So far, so good. Now, my question would be: do I need to do all that to actually reserve a package using REST? Well, thinking about how I would do it, I'd simply follow an online reservation workflow and see what happens:

a. I enter and search for a flight. The system returns a list of flights and I select one. At this point a draft reservation is created with my flight in it. (Think of a PUT of the empty reservation followed by a POST of the flight.)

b. Then the system offers to add a hotel reservation, and I select one from the list provided. That is added to my draft reservation (another POST).

c. Finally, I add my credit card information and post a confirmation (another POST).
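Steps a to c can be sketched with a tiny in-memory stand-in for the server. This is only an illustration under my own assumptions: there is no real HTTP here, and the class name, method names, status codes, and the commit hook are all hypothetical.

```python
class ReservationService:
    """In-memory stand-in for the booking server (no real HTTP involved)."""

    def __init__(self):
        self.drafts = {}

    def put_reservation(self, rid):
        # Step a, first half: PUT an empty draft reservation.
        created = rid not in self.drafts
        self.drafts.setdefault(
            rid, {"status": "draft", "items": [], "payment": None}
        )
        return (201 if created else 200), self.drafts[rid]

    def post_item(self, rid, kind, ref):
        # Steps a and b: POST a flight or a hotel into the draft.
        draft = self.drafts.get(rid)
        if draft is None:
            return 404, None
        if draft["status"] != "draft":
            return 409, draft  # already confirmed, no more changes
        draft["items"].append({"kind": kind, "ref": ref})
        return 201, draft

    def post_confirmation(self, rid, card, commit=lambda draft: True):
        # Step c: POST the payment data and ask for confirmation.
        # `commit` is a hypothetical hook for whatever the server does
        # behind the scenes; the client only sees a yes or a no.
        draft = self.drafts.get(rid)
        if draft is None:
            return 404, None
        draft["payment"] = card
        if commit(draft):
            draft["status"] = "confirmed"
            return 200, draft
        return 409, draft  # still a draft, so the client can try again
```

Used in order, the calls look like the workflow: `put_reservation("r1")`, then `post_item("r1", "flight", "AB123")`, then `post_item("r1", "hotel", "H9")`, and finally `post_confirmation("r1", "card-token")`. After every call the draft is in a valid state.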

This last action is served by, say, server number 5 of the 10 currently serving. Server 5 needs to complete the POST and, if it cannot, return an error to the client. To do so, it uses the draft reservation resource information to call a transaction manager that commits all the changes. If that fails, server 5 returns the error.
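Seen from the server side, the confirmation handler could look something like the sketch below. I am assuming a draft store shared by all servers and treating the transaction manager as an opaque callable; the names DRAFTS, handle_confirm, and commit_all, as well as the status codes, are my own choices for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical shared store of draft reservations, visible to every server.
DRAFTS: Dict[str, dict] = {
    "r1": {"status": "draft", "items": ["flight/AB123", "hotel/H9"]},
}


def handle_confirm(
    reservation_id: str,
    commit_all: Callable[[List[str]], bool],
) -> Tuple[int, dict]:
    """Handle the confirmation POST on whichever server receives it."""
    draft = DRAFTS.get(reservation_id)
    if draft is None:
        return 404, {"error": "no such reservation"}
    if draft["status"] == "confirmed":
        return 200, draft  # a repeated confirmation changes nothing
    # The transaction manager is an opaque callable here (assumption).
    if commit_all(draft["items"]):
        draft["status"] = "confirmed"
        return 200, draft
    # Nothing was booked; the draft stays editable so the client can retry.
    return 409, {"error": "could not confirm all bookings"}
```

Because the handler keeps no state of its own, it does not matter whether server 5 or any other server picks up the request.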

That is totally opaque to the client, which only confirms and receives a yes or no in response. Depending on that response, the client retries, updates the selection of flight or hotel and confirms again, or even gives up and deletes the reservation. Simple, huh?

But wait, that draft sounds a lot like the transaction resource mentioned above. Well, yes, but semantically it is not. The user adding reservations to a draft is not expecting each step to be firm. The actual transaction occurs at the end, when we confirm all the bookings.

The difference in this process is that the client is freed from knowing that a transaction is happening. Resources are just that, not databases or tables that need transactions, and the client doesn't have to choose between single-phase and two-phase commits. You can scale, since you can change the number of servers or transaction managers without touching the client. AND each client interaction leaves the system in a stable state. Actually, this can be RESTful too!

So, if we can hide the complexity of the transaction, why do we need to expose that complexity to the client? I might do it if it brings some benefit. My question then is: what benefits would I find in one implementation over the other, or why would one of them be unsuitable for a particular business case?

Note that the actual transaction coordination happens under the covers, at the server level. That hiding allows not only flexibility to adjust the process, but also reliability, since failed clients can recover.

Cheers.
