Max Desiatov
6 Jun 2018 · 11 min read
Server interactions take a significant amount of time and effort to develop and test in most mobile and web apps. In the apps with the most complex APIs I've worked on, the networking layer took up to 40% of the development time to design and maintain, specifically due to some of the edge cases I mention below in this article. After implementing this a few times, it's easy to see different patterns, tools and frameworks that can help with it. While we're lucky (well, most of us are, I hope) not to care about SOAP anymore, REST isn't the end of history either.
Recently I had a chance to develop and run in production a few mobile and web apps with GraphQL APIs, both for my own projects and for my clients. This has been a really good experience, not least thanks to the wonderful PostGraphile and Apollo libraries. At this point, it's quite hard for me to come back to REST and enjoy working with it.
But obviously, this needs a little explanation.
To be fair, REST isn't even a standard. Wikipedia defines it as
an architectural style that defines a set of constraints and properties based on HTTP
While something like the JSON API spec does exist, in practice it's very rare to see a RESTful backend implementing it. In the best-case scenario, you might stumble upon something that uses OpenAPI/Swagger. Even then, OpenAPI doesn't specify anything about an API's shape or form: it's just a machine-readable spec that allows (but doesn't require) you to run automatic tests on your API, automatically generate documentation and so on.
The main problem is still there. You may say your API is RESTful, but in general there are no strict rules on how endpoints are arranged or whether you should, for example, use the PATCH HTTP method for object updates.
There are also things that look RESTful at first glance, but not so much if you squint: take the Dropbox HTTP API. Some of its endpoints accept file content in the request body, so their arguments are instead passed as JSON in the Dropbox-API-Arg request header or the arg URL parameter.
JSON in a request header? (╯°□°）╯︵ ┻━┻
That's right, there are Dropbox API endpoints that require you to leave the request body empty and to serialise the payload as JSON and chuck it in a custom HTTP header. It's fun to write client code for special cases like this. But we can't complain, because there is no widely used standard after all.
In fact, most of the caveats mentioned below are caused by the lack of a standard, but I'd like to highlight what I've seen most frequently in practice.
And yes, you can avoid most of these problems with a disciplined, experienced team, but wouldn't you want some of this stuff to be resolved on the software side already?
No matter how much you try to avoid this, sooner or later you stumble upon misspelt JSON properties, wrong data types sent or received, missing fields and so on. You're probably OK if your client and/or server programming language is statically typed and you just can't construct an object with a wrong field name or type. You're probably doing well if your API is versioned and you have the old version on the /api/v1 URL and a new version with a renamed field on the /api/v2 URL. Even better if you have an OpenAPI spec that generates client/server type declarations for you.
But can you really afford all this in every project? Can you afford to set up an /api/v1.99 endpoint when, during a sprint, your team decides to rename or rearrange object fields? Even if it's done, will the team not forget to update the spec and to ping the client devs about the update?
Are you sure you have all the validation logic right on either the client or the server? Ideally, you want it validated on both sides, right? Maintaining all of this custom code is a lot of fun. So is keeping your API's JSON Schema up to date.
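To give a flavour of what that duplication looks like, here's a minimal sketch of a validator that could be shared between client and server bundles so the rules can't drift apart. All the names here are hypothetical, not from any real library:

```typescript
// Hypothetical shared module, imported by both the client bundle and the server.
export interface TodoItemDraft {
  description: string;
  isCompleted: boolean;
  dueDate?: string; // ISO 8601 date, optional
}

export function validateTodoItem(value: unknown): TodoItemDraft {
  if (typeof value !== 'object' || value === null) {
    throw new Error('todo item must be an object');
  }
  const item = value as Partial<Record<keyof TodoItemDraft, unknown>>;
  if (typeof item.description !== 'string' || item.description.length === 0) {
    throw new Error('description must be a non-empty string');
  }
  if (typeof item.isCompleted !== 'boolean') {
    throw new Error('isCompleted must be a boolean');
  }
  if (item.dueDate !== undefined &&
      (typeof item.dueDate !== 'string' || Number.isNaN(Date.parse(item.dueDate)))) {
    throw new Error('dueDate must be a valid ISO 8601 date string');
  }
  return {
    description: item.description,
    isCompleted: item.isCompleted,
    dueDate: typeof item.dueDate === 'string' ? item.dueDate : undefined,
  };
}
```

Even with sharing, someone still has to remember to call this in every handler and every form, and to keep it in sync with the database schema.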
Most APIs work with collections of objects. In a todo-list app, the list itself is a collection. Most collections can contain more than 100 items, and for most servers returning all items of a collection in the same response is a heavy operation. Multiply that by the number of online users and it can add up to a hefty AWS bill. The obvious solution: return only a subset of the collection.
Pagination is comparatively straightforward. Pass something like offset and limit values in query parameters: /todos?limit=10&offset=20 to get only 10 objects starting at the 20th. Everyone names these parameters differently; some prefer count and skip, while I like offset and limit because they directly correspond to SQL modifiers.
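The correspondence is direct enough that a handler can pass them almost straight through. A quick sketch, assuming Express and node-postgres (the route and table names are made up for illustration):

```typescript
import express from 'express';
import { Pool } from 'pg';

const app = express();
const db = new Pool(); // connection settings come from the standard PG* env vars

// Clamp the parameters, then map them directly onto SQL's LIMIT/OFFSET
// in a parameterised query.
app.get('/todos', async (req, res) => {
  const limit = Math.min(Number(req.query.limit) || 10, 100); // cap the page size
  const offset = Math.max(Number(req.query.offset) || 0, 0);
  const { rows } = await db.query(
    'SELECT id, description, is_completed FROM todos ORDER BY id LIMIT $1 OFFSET $2',
    [limit, offset],
  );
  res.json(rows);
});

app.listen(3000);
```

Note how much of this tiny handler is defensive clamping rather than business logic.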
Some backend databases expose cursors or tokens to be passed with the next page's query. Check out the Elasticsearch API, which recommends using scroll calls when you need to go through a huge list of resulting documents sequentially. There are also APIs that pass the relevant information in headers: see the GitHub REST API (at least that's not JSON passed in headers).
When it comes to filtering, it's so much more interesting… Need filtering by one field? No problem, it could be /todos?filter=key%3Dvalue or the maybe more human-readable /todos?filterKey=key&filterValue=value. How about filtering by two values? Hmm, that should be easy, right? The query would look like /todos?filterKeys=key1%2Ckey2&filterValue=value with URL encoding. But often there is no way to stop the feature creep: maybe a requirement appears for advanced filtering with AND/OR operators, or for complex full-text search queries together with complex filtering. Sooner or later you see APIs that invent their own filtering DSL. URL query components are no longer sufficient, but a request body in GET requests is not great either, which means you end up sending non-mutating queries in POST requests (which is what Elasticsearch does). Is the API still RESTful at this point?
Either way, both clients and servers need to take extra care with parsing, formatting and validating all these parameters. So much fun! As an example, without proper validation and with uninitialised variables you can easily get something like /todos?offset=undefined.
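For illustration, here's how that string typically comes to exist — no library involved, just JavaScript's string interpolation:

```typescript
// How "/todos?offset=undefined" is born: JavaScript happily interpolates
// undefined into a template literal as the literal string "undefined".
let offset: number | undefined; // never assigned, due to a bug elsewhere
const url = `/todos?limit=10&offset=${offset}`;
console.log(url); // "/todos?limit=10&offset=undefined"

// A naive server-side parse won't catch it either:
console.log(Number('undefined')); // NaN — an explicit guard is needed
```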
Swagger, mentioned above, is probably the best tool for this at the moment, but it isn't used widely enough. Much more frequently I see APIs whose documentation is maintained separately. That's not a big deal for a stable, widely used API, but it's much worse during development in an agile process. Documentation stored separately is frequently not updated at all, especially after a minor but client-breaking change.
If you don't use Swagger, it probably means you have specialised test infrastructure to maintain. There's also a much higher chance that you need integration tests rather than unit tests, which means testing both client- and server-side code.
This becomes a problem with much larger APIs, where you might have a number of related collections. Let's go further with the todo-list example: suppose every todo item can also belong to a project. Would you always want to fetch all related projects at once? Probably not, but then there are more query parameters to add. Maybe you don't want to fetch all object fields at once either. What if the app needs projects to have owners, and there's a view with all this data aggregated, in addition to separate views displaying each collection on its own? It's either three separate HTTP requests or one complex request with all the data fetched at once for aggregation.
Either way, there are complexity and performance trade-offs, and maintaining them in a growing application brings more headaches than one would like.
There are also plenty of libraries that can automatically generate a REST endpoint with some help from ORMs or direct database introspection. Even when those are used, they usually aren't very flexible or extensible. That means reimplementing an endpoint from scratch as soon as you need custom parameters, advanced filtering behaviour or just some smarter handling of a request or response payload.
Yet another task is consuming those endpoints in client code. Code generation is great if you have it, but again, it often isn't flexible enough. Even with helper libraries like Moya, you stumble upon the same barrier: there is a lot of custom behaviour to handle, caused by the edge cases mentioned above.
If a dev team isn't full-stack, communication between the server and client teams is crucial, and even critical when there's no machine-readable API spec.
With all the issues discussed above, I'm inclined to say that in CRUD apps it would be great to have a standard way to produce and consume APIs. Common tooling and patterns, and integrated testing and documentation infrastructure, would help with both the technical and the organisational issues.
GraphQL has a draft RFC spec and a reference implementation. Also check out the GraphQL tutorial, which describes most of the concepts you'd need to know. There are implementations for different platforms, and plenty of developer tools are available as well, most notably GraphiQL, which bundles a nice API explorer with auto-completion and a browser for documentation automatically generated from a GraphQL schema.
In fact, I find GraphiQL indispensable. It can help solve the communication issues between client and server-side teams I've mentioned earlier. As soon as any changes land in a GraphQL schema, you'll be able to see them in the GraphiQL browser, and the same goes for the embedded API documentation. Client and server teams can now work together on API design in an even better way, with shorter iteration times and shared documentation that's automatically generated and visible to everyone on every API update. To get a feel for how these tools work, check out the Star Wars API example that is available as a GraphiQL live demo.
Being able to specify the object fields requested from a server allows clients to fetch only the data they need, when they need it. No more multiple heavy queries issued to a rigid REST API and then stitched together on the client just to display it all at once in the app UI. You are no longer restricted to a set of endpoints; instead you have a schema of queries and mutations and can cherry-pick the fields and objects that a client specifically requires. And this way, a server only needs to implement the top-level schema objects.
A GraphQL schema defines types that can be used in communication between servers and clients. There are two special types that are also core concepts in GraphQL: Query and Mutation. Most of the time, every request issued to a GraphQL API is either a Query instance that is free of side effects or a Mutation instance that modifies objects stored on the server.
Now, sticking with our todo app example, consider this GraphQL schema:
scalar Date

type Project {
  id: ID
  name: String!
}

type TodoItem {
  id: ID
  description: String!
  isCompleted: Boolean!
  dueDate: Date
  project: Project
}

type TodoList {
  totalCount: Int!
  items: [TodoItem]!
}

input TodoItemInput {
  description: String!
  isCompleted: Boolean!
  dueDate: Date
}

type Query {
  allTodos(limit: Int, offset: Int): TodoList!
  todoByID(id: ID!): TodoItem
}

type Mutation {
  createTodo(item: TodoItemInput!): TodoItem
  deleteTodo(id: ID!): TodoItem
  updateTodo(id: ID!, newItem: TodoItemInput!): TodoItem
}

schema {
  query: Query
  mutation: Mutation
}
The schema block at the bottom is special and defines the root Query and Mutation types described previously. Otherwise, it's pretty straightforward: type blocks define new types, and each block contains field definitions with their own types. Types can be non-optional: a String! field can never have a null value, while a String field can. Fields can also have named parameters, so the allTodos(limit: Int, offset: Int): TodoList! field takes two optional parameters, while its own value is non-optional, meaning it always returns a TodoList instance that can't be null. Note also the scalar Date declaration, which introduces a custom scalar type, and the TodoItemInput type: GraphQL requires field arguments to use input types rather than output object types, which is why the mutations take a TodoItemInput rather than a TodoItem.
Then, to query all todos with their IDs, descriptions and completion status, you'd write a query like this:
query {
  allTodos(limit: 5) {
    totalCount
    items {
      id
      description
      isCompleted
    }
  }
}
A GraphQL client library automatically parses and validates the query and only then sends it to the GraphQL server. Note that the offset argument to the allTodos field is absent. Being optional, its absence means it has a null value. If the server supplies this sort of schema, its documentation probably states that a null offset means the first page is returned by default. The response could look like this:
{
  "data": {
    "allTodos": {
      "totalCount": 42,
      "items": [
        {
          "id": "1",
          "description": "write a blogpost",
          "isCompleted": true
        },
        {
          "id": "2",
          "description": "edit until looks good",
          "isCompleted": true
        },
        {
          "id": "3",
          "description": "proofread",
          "isCompleted": false
        },
        {
          "id": "4",
          "description": "publish on the website",
          "isCompleted": false
        },
        {
          "id": "5",
          "description": "share",
          "isCompleted": false
        }
      ]
    }
  }
}
If you drop the isCompleted field from the query, it'll disappear from the result. Or you can add the project field with its id and name to traverse the relation. Add the offset parameter to the allTodos field to paginate, so allTodos(limit: 5, offset: 5) will return the second page. Helpfully enough, you've got the totalCount field in the result, so now you know you've got 42 / 5 = 9 pages in total (rounding up). But obviously, you can omit totalCount if you don't need it. The query is in full control of what information will actually be received, while the underlying GraphQL infrastructure ensures that all required fields and parameters are there. If your GraphQL server is smart enough, it won't run database queries for fields you don't need, and some libraries are good enough to provide that for free. The same goes for the rest of the mutations and queries in this schema: input is type-checked and validated, and based on the query a GraphQL server knows what result shape is expected.
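To make the server side of this concrete, here's a minimal sketch of what a resolver for allTodos could look like. It assumes Apollo Server 4 purely as an example, with an in-memory array standing in for a real data source and the schema trimmed to the relevant parts:

```typescript
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';

// In-memory stand-in for a real database.
const todos = [
  { id: '1', description: 'write a blogpost', isCompleted: true },
  { id: '2', description: 'edit until looks good', isCompleted: true },
  { id: '3', description: 'proofread', isCompleted: false },
];

const typeDefs = `
  type TodoItem { id: ID, description: String!, isCompleted: Boolean! }
  type TodoList { totalCount: Int!, items: [TodoItem]! }
  type Query { allTodos(limit: Int, offset: Int): TodoList! }
`;

const resolvers = {
  Query: {
    // A null or absent offset falls back to the first page, matching the
    // default behaviour described above.
    allTodos: (_parent: unknown, args: { limit?: number | null; offset?: number | null }) => {
      const offset = args.offset ?? 0;
      const limit = args.limit ?? 10;
      return { totalCount: todos.length, items: todos.slice(offset, offset + limit) };
    },
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`GraphQL endpoint ready at ${url}`);
```

The GraphQL layer takes care of type-checking the arguments and the result shape; the resolver only implements the actual data access.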
Under the hood, all communication runs through a predefined URL on the server (usually /graphql) with a simple POST request that contains the query serialised as a JSON payload. You almost never need to be exposed to an abstraction layer this low, though.
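If you're curious, that low-level exchange can be reproduced with nothing but fetch; a sketch against a hypothetical endpoint:

```typescript
// The whole transport layer: one POST with the query and its variables as JSON.
const response = await fetch('https://example.com/graphql', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: `query Todos($limit: Int) {
      allTodos(limit: $limit) { totalCount items { id description isCompleted } }
    }`,
    variables: { limit: 5 },
  }),
});
const { data, errors } = await response.json();
```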
Not too bad overall: we've got type-level validation issues taken care of, pagination is also looking good, and entity relations can easily be traversed when needed. If you use one of the available GraphQL-to-database query translation libraries, you won't even need to write most of the database queries on the server. Client-side libraries can quite easily unpack a GraphQL response automatically into an object instance of the needed type, as the response shape is naturally known upfront from the schema and the query.
While Falcor by Netflix seemed to solve a similar problem, was published on GitHub a few months earlier than GraphQL and came up on my personal radar earlier, it clearly looks like GraphQL has won. Good tooling and strong industry support make it quite compelling. Aside from a few minor glitches in some client libraries (which have since been resolved), I can't recommend highly enough having a good look at what GraphQL could offer in your tech stack. It has been out of technical preview for almost two years now, and the ecosystem is growing even stronger. While Facebook designed GraphQL, more and more big companies are using it in their products as well: GitHub, Shopify, Khan Academy, Coursera, and the list is growing.
There are plenty of popular open-source projects that use GraphQL: this blog is powered by the Gatsby static site generator, which translates the results of GraphQL queries into data that is rendered into HTML files. If you're on WordPress, a GraphQL API is available for it as well. Reaction Commerce is an open-source alternative to Shopify that's also powered by GraphQL.
A few GraphQL libraries worth mentioning again are PostGraphile and Apollo.
If you use PostgreSQL as your backend database, PostGraphile is able to scan an SQL schema and automatically generate a GraphQL schema with an implementation. You get all the common CRUD operations exposed as queries and mutations for all tables. It may look like an ORM, but it isn't: you're in full control of how your database schema is designed and what indices are used. A great thing is that PostGraphile also exposes views and functions as queries and mutations, so if there is a particularly complex SQL query that you'd like to map to a GraphQL field, just create that SQL view or function and it'll appear automatically in the GraphQL schema. With advanced Postgres features like row-level security, you can get complex access-control logic implemented with only a few SQL policies. PostGraphile even has awesome things like schema documentation [generated automatically from Postgres comments](https://www.graphile.org/postgraphile/postgresql-schema-design#table-documentation) 🤩.
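Mounting it takes only a few lines. A sketch assuming PostGraphile's Express middleware and a connection string in a DATABASE_URL environment variable:

```typescript
import express from 'express';
import { postgraphile } from 'postgraphile';

const app = express();

// Introspects the `public` schema of the connected database and serves the
// generated GraphQL API (plus GraphiQL) from this process.
app.use(
  postgraphile(process.env.DATABASE_URL, 'public', {
    graphiql: true,
    watchPg: true, // re-introspect when the database schema changes
  }),
);

app.listen(5000);
```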
In turn, Apollo provides both client libraries for multiple platforms and code generators that produce type definitions in the most popular programming languages, including TypeScript and Swift. In general, I find Apollo much simpler and more manageable than, for example, Relay. Thanks to the simple architecture of the Apollo client library, I was able to slowly transition an app that used React.js with Redux to React Apollo, component by component and only where it made sense to do so. The same goes for native iOS apps: Apollo iOS is a relatively lightweight library that's easy to use.
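For a taste of the client side, here's a minimal sketch using the modern @apollo/client package against the same hypothetical endpoint as before:

```typescript
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

const client = new ApolloClient({
  uri: 'https://example.com/graphql', // hypothetical endpoint
  cache: new InMemoryCache(), // normalised client-side cache
});

// gql parses the document up front, so malformed queries fail fast; code
// generation can additionally produce TypeScript types for the result shape.
const ALL_TODOS = gql`
  query Todos($limit: Int) {
    allTodos(limit: $limit) {
      totalCount
      items {
        id
        description
        isCompleted
      }
    }
  }
`;

const { data } = await client.query({ query: ALL_TODOS, variables: { limit: 5 } });
console.log(data.allTodos.totalCount);
```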
In a future article, I'd like to describe some of my experience with this tech stack. In the meantime, shoot me a message on Twitter about your experience with GraphQL, or if you're just interested in how it could work in your app.