CRUD REST API design part 2: bulk operations

Today we're going to talk and think a bit more about REST APIs, this time about bulk operations.

If you haven’t read PART 1: simple things – go check it out!

Bulk operations

We're going to reuse the same event model from part 1 for simplicity.

In part 1 we already tried calling GET /event and agreed that it returns the event list. Now let's develop this idea of getting lists of entities into something called BULK operations.

A bulk operation means performing one or more operations on multiple objects.

Do you think GET /event is a bulk operation? Well, you can say so because it performs a lookup on all the events, but usually that's not what people mean when they say “bulk”.

Bulk usually means a list of actions. POST /event with a list of events in the body seems more “bulk”, right? Because we're creating multiple (say 3) events at the same time. Cool, right? We just reduced the number of HTTP requests from 3 to 1. Fewer HTTP requests usually means less per-request overhead, which tends to help your performance.

Check out this “bulk create” or POST /event REST convention that creates 3 events and returns their IDs (because remember, you need the IDs):

POST /event
[
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "HACKATHON",
    "headline": "Alexey's lonely hackathon",
    "text": "Please someone come :("
  },
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "SPORT"
  },
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "COFFEE_BREAK",
    "headline": "Sunshine coffee",
    "text": "Arabica + Robusta strong"
  }
]

Response:

[ 1, 2, 3 ]

We’re returning a list of ids for the events we just POSTed.

One important thing here: order matters! You need to know which id belongs to which event, so the ids must come back in the same order as the events in the request.

Of course you can return the created events themselves, but then the response could get big and affect your performance.

Data store bulk support

It’s critically important here not only to reduce the number of HTTP calls but also to check whether your data store supports bulk operations. Creating, say, 500 events in a loop is typically much slower than creating them through the data store’s bulk API.

So before implementing a REST bulk API, first check whether your data store supports it.
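Here is a rough sketch of the difference between a loop and a bulk call. It uses Python's built-in sqlite3 module as a stand-in for whatever data store you actually have; the table layout and the function names are assumptions made for this example only.

```python
import sqlite3


def make_store():
    """Create a throwaway in-memory database with a minimal event table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event (id INTEGER PRIMARY KEY, date TEXT, type TEXT)")
    return conn


def create_in_loop(conn, events):
    # One statement and one commit per event: many separate trips to the store.
    for e in events:
        conn.execute(
            "INSERT INTO event (date, type) VALUES (?, ?)", (e["date"], e["type"])
        )
        conn.commit()


def create_in_bulk(conn, events):
    # One batched call inside a single transaction.
    with conn:
        conn.executemany(
            "INSERT INTO event (date, type) VALUES (:date, :type)", events
        )


events = [{"date": "2018-03-30T23:28:26.578Z", "type": "SPORT"} for _ in range(500)]
loop_store, bulk_store = make_store(), make_store()
create_in_loop(loop_store, events)
create_in_bulk(bulk_store, events)
```

Both functions end up with the same 500 rows. How much faster the batched version is depends on your store and on network latency, so measure it on your own setup.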

For example, MongoDB supports bulk operations, and there is a cool performance analysis on this topic on DZone. You can see on the diagram how drastically GET performance drops in the case of a remote MongoDB server.

 

One failure during a bulk operation

Let’s think of the other pitfall: what would happen if one operation failed? Say somewhere in the middle.

You can choose a strategy here:

  • One fails – all fails
  • One fails – never mind, just carry on

Let’s think through both cases:

One fails – all fails

In this case we need to roll back.

The rollback should be done either by your data store or manually.

If your data store is another service, for example in a microservice architecture, you need to somehow carry the transaction across your chained service calls, i.e. support XA transactions, or maintain consistency yourself.

Here are some good thoughts on this topic at Baeldung: transactions across micro-services

One good thing about this approach is that you get a pretty straightforward response format: either ALL OK (ids, list of events or whatever) or ALL FAILED with a 400.

POST /event
[
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "HACKATHON",
    "headline": "Alexey's lonely hackathon",
    "text": "Please someone come :("
  },
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "SPORT"
  },
  {
    "date": "2018-03-30T23:28:26.578Z",
    "type": "COFFEE_BREAK",
    "headline": "Sunshine coffee",
    "text": "Arabica + Robusta strong"
  }
]

Response when at least one event fails (400):

{
  "successful": false,
  "errorMsg": "One or more events are not good enough"
}

Response when all events are created:

{
  "successful": true
}
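A minimal sketch of "one fails, all fails" when the data store does the rollback for you. It again uses sqlite3 as a stand-in store; the NOT NULL constraint, the function name and the status codes returned as plain integers are assumptions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event ("
    "id INTEGER PRIMARY KEY, date TEXT NOT NULL, type TEXT NOT NULL)"
)


def bulk_create_all_or_nothing(conn, events):
    """Insert every event in one transaction; any failure undoes all of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for e in events:
                conn.execute(
                    "INSERT INTO event (date, type) VALUES (?, ?)",
                    (e.get("date"), e.get("type")),
                )
    except sqlite3.IntegrityError:
        return 400, {
            "successful": False,
            "errorMsg": "One or more events are not good enough",
        }
    return 200, {"successful": True}


bad_batch = [
    {"date": "2018-03-30T23:28:26.578Z", "type": "HACKATHON"},
    {"date": "2018-03-30T23:28:26.578Z"},  # missing type, violates NOT NULL
    {"date": "2018-03-30T23:28:26.578Z", "type": "COFFEE_BREAK"},
]
status, body = bulk_create_all_or_nothing(conn, bad_batch)
rows_after_failure = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
```

The first event was inserted before the second one failed, but the rollback removes it again, so the table is left empty.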

One fails – never mind

Choosing this option does not force you to roll back.

Maybe you'll need to do some cleanup in the UI, call an audit service, or report a failure and stop some global process. But in general you just need to report back somehow that some item(s) actually failed.

Let’s assume we’re POSTing 3 events and the 2nd and 3rd events fail to create. Check out this response format:

POST /event
[
  {
    "id": 1,
    "date": "2018-03-30T23:28:26.578Z",
    "type": "HACKATHON",
    "headline": "Alexey's lonely hackathon",
    "text": "Please someone come :("
  },
  {
    "id": 2,
    "date": "2018-03-30T23:28:26.578Z",
    "type": "SPORT"
  },
  {
    "id": 3,
    "date": "2018-03-30T23:28:26.578Z",
    "type": "COFFEE_BREAK",
    "headline": "Sunshine coffee",
    "text": "Arabica + Robusta strong"
  }
]

Response:

[
  {
    "id": 2,
    "errorMsg": "Hey I dont like sports"
  },
  {
    "id": 3,
    "errorMsg": "No coffee for you until our hackathon is finished!"
  }
]

Isn’t this exactly all the information we need?

It is. The idea here is to return the ids of only the failed items plus some errorMsg. If everything went perfectly, without errors, just return an empty list [].

Using the id and errorMsg from the response you can update the UI and mark the failed items somehow, and probably add a RETRY button for exactly those failed items.
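The "never mind, just carry on" strategy could look roughly like this in Python. The validation rule (an event needs a headline) and the names `validate` and `bulk_create_best_effort` are invented for this sketch, and the store is just a dict.

```python
def validate(event):
    """Hypothetical business rule: every event must have a headline."""
    if not event.get("headline"):
        raise ValueError("headline is required")


def bulk_create_best_effort(events, store):
    """Save every valid event and report only the ones that failed."""
    failures = []
    for event in events:
        try:
            validate(event)
            store[event["id"]] = event
        except ValueError as exc:
            # Keep going: one bad item must not stop the rest.
            failures.append({"id": event["id"], "errorMsg": str(exc)})
    return failures


store = {}
events = [
    {"id": 1, "type": "HACKATHON", "headline": "Alexey's lonely hackathon"},
    {"id": 2, "type": "SPORT"},
    {"id": 3, "type": "COFFEE_BREAK", "headline": "Sunshine coffee"},
]
failures = bulk_create_best_effort(events, store)

# The client can offer a RETRY for exactly these ids.
retry_ids = [f["id"] for f in failures]
```

An empty `failures` list means every item was saved, which matches the empty list [] convention described above.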

Bulk update and bulk delete

I think you’re smart and already figured out how to deal with PUT /event or DELETE /event – it’s the same as bulk POST /event

One specific thing about DELETE: it doesn’t play well with a rollback strategy. This isn’t only about REST; the broader idea is that if you expect to have an UNDO button in the future, you should not delete items but rather mark them for deletion.
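Marking items for deletion instead of removing them might be sketched like this. The `deleted` flag and both function names are assumptions for this example, not part of the event model from part 1.

```python
def bulk_mark_deleted(store, ids):
    """Flag events as deleted instead of removing them; report unknown ids."""
    failures = []
    for event_id in ids:
        event = store.get(event_id)
        if event is None:
            failures.append({"id": event_id, "errorMsg": "event not found"})
        else:
            event["deleted"] = True
    return failures


def bulk_undo_delete(store, ids):
    """UNDO: clear the flag again for the events that still exist."""
    for event_id in ids:
        if event_id in store:
            store[event_id]["deleted"] = False


store = {
    1: {"type": "HACKATHON", "deleted": False},
    2: {"type": "SPORT", "deleted": False},
}
failures = bulk_mark_deleted(store, [1, 2, 42])
visible_after_delete = [i for i, e in store.items() if not e["deleted"]]
bulk_undo_delete(store, [1])
visible_after_undo = [i for i, e in store.items() if not e["deleted"]]
```

Since nothing is physically removed, an undo (or a rollback of a half-finished bulk delete) is just a matter of flipping the flag back.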

Thanks for reading! If you would like to know more about REST API design or bulk operations, if you have a question, or if you think I’m absolutely wrong, please write to me in the comments. I do care about your opinion.

PART 1: http://alexeymatveev.com/crud-api-design

PART 3 COMING SOON
