Scaling AppDirect APIs with GraphQL Federation

By Sébastien Lavoie-Courchesne / August 24, 2020

With more and more companies going headless with GraphQL, we at AppDirect see it as our future too. In a world where third-party developers want to integrate their products into our platform, giving them access to a highly flexible API seems like the way to go.

For the past few months, our team has been working intensively to implement new microservices to expose a GraphQL endpoint allowing developers to create and modify new products on our platform.

The Problem

Products on our platform are handled by several teams: the Catalog team defines the basic products and integration configurations, the Marketplace team defines branding, and the Pricing team defines pricing. Working within a monolith made it easy to combine all these sections and return everything to clients whenever they requested a product through the API, but this created a very large payload containing information that might not be useful for the client’s needs. As we moved towards a distributed architecture, the product became split across multiple microservices. Our initial thought was to continue exposing REST APIs and have a microservice stitch the different pieces together.

At this point we discovered GraphQL and it made perfect sense for us to use it with federation to expose new APIs for the products. There’s no need for a microservice patching data together anymore. We don’t have to return enormous amounts of data that may not be used for a client’s use case.

Flexibility

GraphQL offers an unprecedented amount of flexibility in terms of APIs while retaining a very simple method to access and modify the data you want.

From a client’s point of view, GraphQL allows querying only the necessary fields for in-house implementations. It also provides robustness since adding a new field will not break a developer’s existing integrations.

On our side, implementing the server, the flexibility comes from the fact that we don’t need to load multiple tables for fields that aren’t queried for. For most queries, loading data becomes much faster.
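To make the server-side point concrete, here is a minimal sketch in plain JavaScript, with no GraphQL library involved. The loaders, field names, and the toy executor are all assumptions for illustration, not our actual resolvers. It shows a data source being touched only when its field is selected.

```javascript
// Hypothetical loaders standing in for separate tables or collections.
const loadCounts = { base: 0, pricing: 0, branding: 0 };

const loaders = {
  base: (id) => {
    loadCounts.base += 1;
    return { id, name: "Sample product" };
  },
  pricing: (id) => {
    loadCounts.pricing += 1;
    return { productId: id, currency: "USD", amount: 10 };
  },
  branding: (id) => {
    loadCounts.branding += 1;
    return { productId: id, logoUrl: "https://example.com/logo.png" };
  },
};

// One resolver per field: a loader runs only if its field is selected.
const productResolvers = {
  id: (product) => product.id,
  name: (product) => product.name,
  pricing: (product) => loaders.pricing(product.id),
  branding: (product) => loaders.branding(product.id),
};

// Toy executor: resolves only the requested top-level fields.
function resolveProduct(id, selectedFields) {
  const product = loaders.base(id);
  const result = {};
  for (const field of selectedFields) {
    result[field] = productResolvers[field](product);
  }
  return result;
}

// The client did not ask for branding, so its loader is never called.
const result = resolveProduct("p1", ["id", "name", "pricing"]);
```

A real GraphQL server derives the selected fields from the parsed query rather than from an array, but the effect on data loading is the same.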

Implementation

Even though a large portion of our code is JVM-based, we have found it very easy to transition to Node.js to expose the GraphQL APIs. JavaScript’s flexibility works very well with the queries and mutations we are exposing and remains very performant. It’s been a nice change of pace to have a microservice start in less than a second and all tests run within 15 seconds, all while maintaining higher coverage than ever before. We originally started by using the Apollo tools to build up the infrastructure quickly, and it’s been paying off nicely so far.

As most of these new developments occur within different microservices, we are using federation to create a single graph that will be exposed to the public. Although schema management is still a concern we have to address, federation has been working effectively in our internal testing with Apollo Gateway.

[Figure: Query to AppDirect's federated gateway]

Here we have an example of a query to our federated gateway. This allows querying a specific product by its identifier and returns information from different microservices within our architecture. Each microservice adds specific information to the global product type definition.
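As a rough illustration, such a query could look like the sketch below. The operation name, the selected fields, the helper function, and the product identifier are assumptions for illustration. The field names follow the schema excerpts in this post, and the request body uses the common GraphQL-over-HTTP shape of a `query` string plus `variables`.

```javascript
// Hypothetical query; field names follow the schema excerpts in this post.
const PRODUCT_QUERY = `
  query GetProduct($id: ID!) {
    product(id: $id) {
      id
      type
      name
      integrationConfiguration {
        createURL
        updateURL
      }
    }
  }
`;

// Builds the JSON body of a GraphQL POST request for one product.
function buildProductRequest(productId) {
  return JSON.stringify({
    query: PRODUCT_QUERY,
    variables: { id: productId },
  });
}

// The identifier below is a placeholder, not a real product.
const body = buildProductRequest("hypothetical-product-id");
```

The client sends a single request and does not need to know that `name` and `integrationConfiguration` are served by different microservices.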

In this case, the product service will expose the query and the base type:

type Query {
  product(id: ID!): Product
}

type Product @key(fields: "id") {
  id: ID!
  type: ProductType!
  name: String!
  ...
}

Each service can then extend this definition to add more information to the product type, for example:

type IntegrationConfiguration {
  createURL: URL
  updateURL: URL
}

extend type Product @key(fields: "id") {
  id: ID! @external
  integrationConfiguration: IntegrationConfiguration
}
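To show how these two definitions come together at query time, here is a simplified simulation in plain JavaScript. It is not Apollo Gateway's actual query planner. The service objects, the sample data, and the `resolveFederatedProduct` helper are assumptions for illustration. Only the `__resolveReference` name mirrors the resolver that Apollo Federation services use to look up an entity by its key.

```javascript
// Product service: owns the base type and the `product` query.
const productService = {
  Query: {
    product: (id) => ({ __typename: "Product", id, name: "Sample product" }),
  },
};

// Integration service: extends Product and receives only the key fields.
const integrationService = {
  Product: {
    __resolveReference: (reference) => ({
      id: reference.id,
      integrationConfiguration: {
        createURL: `https://example.com/products/${reference.id}/create`,
        updateURL: `https://example.com/products/${reference.id}/update`,
      },
    }),
  },
};

// Toy stand-in for the gateway: fetch the base entity, then hand its key
// to each extending service and merge the extra fields into the result.
function resolveFederatedProduct(id, extendingServices) {
  const base = productService.Query.product(id);
  const reference = { __typename: base.__typename, id: base.id };
  return extendingServices.reduce(
    (merged, service) => ({
      ...merged,
      ...service.Product.__resolveReference(reference),
    }),
    base
  );
}

const product = resolveFederatedProduct("p1", [integrationService]);
```

The key point is that the extending service never sees the full product, only the `id` declared in `@key`.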

Lessons Learned

Resolver performance

There’s a challenge in knowing what data to fetch and when. In our case, we used MongoDB to store our domain objects, which allows fast reading of a single object and everything it comprises. With SQL databases, it is harder to decide which tables to join when fetching data so that reads stay fast without loading data that isn’t queried, especially when loading collections. Since some of our fields are computed, there’s also the question of saving these in the domain object versus computing them each time they are queried. We opted to compute these each time and monitor the number of queries to determine if saving these computations on the domain objects is necessary.
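The compute-and-monitor approach can be sketched as follows. The `editionCount` field, the `countCalls` wrapper, and the in-memory counter are assumptions for illustration. A real service would report to its metrics system instead of a `Map`.

```javascript
// Hypothetical in-memory counters standing in for real metrics.
const resolverCalls = new Map();

// Wraps a resolver so that every invocation is counted under `name`.
function countCalls(name, resolver) {
  return (...args) => {
    resolverCalls.set(name, (resolverCalls.get(name) || 0) + 1);
    return resolver(...args);
  };
}

// Computed on every query instead of being stored on the domain object.
const editionCount = countCalls(
  "Product.editionCount",
  (product) => product.editions.length
);

const storedProduct = {
  id: "p1",
  editions: [{ code: "basic" }, { code: "pro" }],
};

const first = editionCount(storedProduct);
const second = editionCount(storedProduct);
```

If the counter for a computed field grows large relative to how often the underlying data changes, that is a signal to consider persisting the value.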

Mutations in a federated graph

As mentioned, we are using federation to join together multiple microservice schemas and expose a single schema. This works very well for queries, but mutations cannot be federated. Any mutation can only act on a single GraphQL server from the gateway’s point of view. This forces more granular mutations, which helps keep them simpler for developers to call. For example, editing a product will require one mutation to change its basic information and a different mutation to change its editions, although both can be queried at the same time through federation.

As with any distributed system, mutations that act on multiple different microservices can bring some challenges. This is the case for product publications. Whenever a marketplace manager approves a product for publication so that it becomes visible to buyers on the marketplace, we need to ensure that all parts of the product are published across the different microservices. We opted for a pattern similar to two-phase commit to ensure a product is correctly published in its entirety.
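The general shape of a two-phase style publication can be sketched as below. This is an in-memory illustration under stated assumptions, not our production protocol. The participant interface (`prepare`, `commit`, `rollback`), the state names, and the `publishProduct` coordinator are hypothetical, and a real implementation would also have to deal with network failures, timeouts, and retries.

```javascript
// Hypothetical participant: one per microservice owning part of a product.
function createParticipant(name, canPublish) {
  return {
    name,
    state: "draft",
    prepare() {
      if (!canPublish) return false;
      this.state = "prepared";
      return true;
    },
    commit() {
      this.state = "published";
    },
    rollback() {
      this.state = "draft";
    },
  };
}

// Phase 1: every participant must agree to publish.
// Phase 2: commit all of them, or roll back those that already prepared.
function publishProduct(participants) {
  const prepared = [];
  for (const participant of participants) {
    if (!participant.prepare()) {
      prepared.forEach((p) => p.rollback());
      return { published: false, failedAt: participant.name };
    }
    prepared.push(participant);
  }
  participants.forEach((p) => p.commit());
  return { published: true };
}

// One service refuses, so nothing ends up published.
const catalog = createParticipant("catalog", true);
const pricing = createParticipant("pricing", false);
const failed = publishProduct([catalog, pricing]);

// All services agree, so every part is published.
const readyParts = [
  createParticipant("catalog", true),
  createParticipant("pricing", true),
];
const succeeded = publishProduct(readyParts);
```

The property that matters is that a buyer never sees a product whose catalog entry is published while its pricing is not.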

Team dependencies

On a larger scale, since multiple teams own different parts of the schema, there can be some dependency issues when trying to test the complete queries. The product service needs to expose the type and queries so that the sub-domains are able to add information to it. The gateway has to be deployed before we can think of doing federation. In practice, since each sub-domain redefines the type they are extending, you can test your implementation by mocking results for the parent type even if the parent type isn’t deployed to the global schema. This helps a lot in working with multiple teams.
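Mocking the parent type can be as small as the sketch below. The resolver body, the URLs, and the mocked identifier are assumptions for illustration. The point is that the sub-domain's resolver depends only on the key fields of `Product`, so a hand-written stub is enough to exercise it.

```javascript
// Field resolver owned by the sub-domain service; it needs only the key.
function resolveIntegrationConfiguration(product) {
  return {
    createURL: `https://example.com/products/${product.id}/create`,
    updateURL: null,
  };
}

// Mocked parent: stands in for the Product the product service would return.
const mockedProduct = { __typename: "Product", id: "mock-product-1" };

const configuration = resolveIntegrationConfiguration(mockedProduct);
```

Because the stub carries nothing beyond `__typename` and `id`, the test keeps working regardless of what other fields the product service later adds to the type.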

Going Forward

Our plan going forward with GraphQL is to release a closed beta version of our schema by the end of the year. This first schema will likely include most of the product management along with some of our REST APIs fronted with a GraphQL schema.

Sébastien Lavoie-Courchesne is a Senior Developer at AppDirect.