How can you achieve more with GraphQL and serverless?

Igor Soroka
May 6, 2024
5 min read

We want to build fast and reliable APIs. The most obvious choice is a REST API. However, GraphQL has been around for some time. AWS did not invent it, but they built the AppSync service to make it natively accessible to their customers.

What brings GraphQL and serverless architecture together is AWS's decision to offer GraphQL as a managed service. This means a pay-as-you-go model and no servers to manage: no containers or virtual machines are required to host a fully functional GraphQL API. In other words, it is serverless, which frees developers from server management concerns.

One of the key advantages of using GraphQL on AWS is the direct integration with other services. For instance, you can connect Lambda functions (FaaS), DynamoDB (NoSQL database), and EventBridge (event bus) to your GraphQL API. These services are first-class citizens in the AppSync world and form the foundation of your API layer, which can be integrated into your existing architecture. This approach also simplifies access control, as IAM can be used to restrict actions for specific actors.

This is the first in a series of articles about AppSync. It introduces the topic, the key features of the service, and its capabilities. The running example is a CRUD application, a Book Library, with DynamoDB as the storage. The examples use CDK, TypeScript, and the GraphQL schema language.

GraphQL as an alternative

A REST API aims to communicate with backend systems or third-party services with reliability and security in mind. However, REST was defined in the early 2000s, when the internet was different: mobile internet and smartphones were not yet widespread. Now, everyone has a phone, tablet, desktop computer, and other connected devices.

Every API I have written has run into the same issues:

- Documentation and validation. What do I need to call? What is the payload? What should I expect there? 

- Response sizes are the same for mobile and desktop versions. The developer often gets much more information than is needed for the feature.

Main components

GraphQL addresses the problems mentioned above: the schema documents and validates every call, and clients request only the fields they need. An AppSync API has three main components: schema, resolver, and data source. Other AWS services are involved too: IAM grants permissions, and CloudWatch has native integration for monitoring and alerting. Let's dive deeper into every component.

Schema

The schema is a contract written in the GraphQL schema definition language. Clients must follow the types defined there. The code below shows the three root types. Query is used for reading data, like GET, but with parameters built into it. Mutation is used to change data in the data sources. Subscription delivers real-time data, so your application can listen for updates from the server. All three are special object types that act as entry points into the API. The exclamation mark means that a value is required (non-null).


type Query {
  getBook(id: ID!): Book
}

type Mutation {
  addBook(input: AddBookInput!): Book
}

type Subscription {
  newBookAdded: Book
}

type Book {
  id: ID!
  title: String!
  author: String!
}

Book is a custom type, called an object type in GraphQL terms. An object type must have one or more fields.

A field can be either a scalar (primitive type) or another object type. In this example, we could turn the author into a type of its own.


type Author {
  firstName: String!
  lastName: String!
  yearOfBirth: AWSDate
}

type Book {
  id: ID!
  title: String!
  author: Author!
}
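
The same contract can be mirrored on the client side. Below is a minimal TypeScript sketch, assuming hand-written interfaces for the Book and Author types above. The isBook guard is a hypothetical helper for illustration, not something AppSync provides.

```typescript
// Hand-written mirrors of the GraphQL types above.
interface Author {
  firstName: string;            // String!
  lastName: string;             // String!
  yearOfBirth?: string | null;  // AWSDate without "!", so it may be absent
}

interface Book {
  id: string;                   // ID! is serialized as a string
  title: string;                // String!
  author: Author;               // Author!
}

// Hypothetical runtime guard: checks that every non-null ("!") field is present.
function isBook(value: unknown): value is Book {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const a = v.author as Record<string, unknown> | undefined;
  return (
    typeof v.id === "string" &&
    typeof v.title === "string" &&
    typeof a === "object" && a !== null &&
    typeof a.firstName === "string" &&
    typeof a.lastName === "string"
  );
}
```

The optional yearOfBirth shows the other side of the exclamation mark: a field without it may be missing or null, so the client has to handle that case.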

GraphQL Types

Scalars are the primitive types. AppSync supports the standard GraphQL scalars and its own types prefixed with 'AWS'.

Here are the most used ones.

  • ID
  • String
  • Int
  • Float
  • Boolean

However, there are many more. For example, here are the special ones from AWS:

  • AWSDate
  • AWSIPAddress
  • AWSEmail

Our book example needs one more kind of type, called input. Input types are used as arguments to the fields of the Query, Mutation, or Subscription types. Unfortunately, you cannot reuse the Book object type as an argument; a separate input type is required. Here is an example:


type Mutation {
  updateBook(input: UpdateBookInput!): Book
}

input UpdateBookInput {
  id: ID!
  title: String
  author: String
  yearPublished: Int
  isbn: String
}

This input identifies the book by ID and can update any other field, such as ISBN or author, all at once or one by one. This is one of GraphQL's strengths: a single operation can update only the fields that need to change. As we can see, validation starts at the schema level, where we define the types of the objects. A request with a wrongly typed value or a missing required field is rejected before it reaches your resolver.
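
To make the partial update concrete, here is a small TypeScript sketch of how a backend might apply UpdateBookInput to a stored record. The StoredBook shape and the applyUpdate function are my own assumptions for illustration; they are not part of AppSync.

```typescript
// Mirrors UpdateBookInput: only id is required, everything else is optional.
interface UpdateBookInput {
  id: string;
  title?: string | null;
  author?: string | null;
  yearPublished?: number | null;
  isbn?: string | null;
}

// Hypothetical shape of the record kept in storage.
interface StoredBook {
  id: string;
  title: string;
  author: string;
  yearPublished?: number;
  isbn?: string;
}

// Copies only the fields the client actually sent; omitted fields keep their stored value.
function applyUpdate(book: StoredBook, input: UpdateBookInput): StoredBook {
  const updated: StoredBook = { ...book };
  if (input.title != null) updated.title = input.title;
  if (input.author != null) updated.author = input.author;
  if (input.yearPublished != null) updated.yearPublished = input.yearPublished;
  if (input.isbn != null) updated.isbn = input.isbn;
  return updated;
}
```

In this sketch an explicit null is treated the same as an omitted field. That is a design choice; an API could just as well interpret null as "clear this field".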

Data Source

The name is misleading because it gives the impression that a data source is always some kind of storage. That is only partly true.

I prefer to classify data sources into these categories:

  • Storage: DynamoDB, RDS, OpenSearch
  • Event-based sources: EventBridge, Lambda Function, HTTP Endpoint
  • No data source
All supported Data sources for AWS AppSync

Storage is the obvious category. We want to get a book, so we need a resource to read it from. There are both SQL and NoSQL options.

Event-based sources wait for a message to start their work. The HTTP endpoint is the way to integrate with third-party systems. Most importantly, this data source can also call the HTTP APIs of many AWS services directly. There is also the empty (None) data source. Its use case is local data transformation in a unit resolver or in a function inside a pipeline resolver, for example when the input must be reshaped before the next step.
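
The empty data source is easier to picture with code. The sketch below imitates a request and response handler pair in plain TypeScript: with no data source behind it, whatever the request handler returns as payload becomes the result seen by the response handler. The NoneSourceContext type and the function names are assumptions for this example; in a real JavaScript resolver the handlers would be exported as request and response.

```typescript
// Minimal local stand-in for the context object passed to resolver code.
interface NoneSourceContext {
  args: { title: string; author: string };
  result?: { title: string; author: string };
}

// Request handler: nothing is fetched from a backend.
// The payload is simply handed on to the response handler as the result.
function normalizeRequest(ctx: NoneSourceContext): { payload: { title: string; author: string } } {
  return {
    payload: {
      title: ctx.args.title.trim(),
      author: ctx.args.author.trim(),
    },
  };
}

// Response handler: returns the locally transformed data to the caller.
function normalizeResponse(ctx: NoneSourceContext) {
  return ctx.result;
}
```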

Resolver

The resolver is the most mysterious component because it is hard to see at first why you need one. A resolver connects a schema field to a data source. It is the meeting point for executing a query because it has an input side (the request function) and an output side (the response function).

What is inside a resolver?

The request function receives the context and arguments, which gives a powerful way of shaping the data before it goes to the data source. Suppose we receive the data to save a new book in the storage. It has the required arguments according to the schema. However, to save this object to DynamoDB, we need some transformation. I would first like to show the traditional way, using VTL (Velocity Template Language).


## Shorthand for the arguments sent to the mutation
#set($input = $ctx.args.input)

## Use the id supplied by the client, or generate a UUID
#if($input.id)
  #set($bookId = $input.id)
#else
  #set($bookId = $util.autoId())
#end

## DynamoDB PutItem request
{
    "version": "2017-02-28",
    "operation": "PutItem",
    "key": {
        "id": $util.dynamodb.toDynamoDBJson($bookId)
    },
    "attributeValues": $util.dynamodb.toMapValuesJson($input)
}

This is an example of a request mapping template. Let's walk through it. First, we set the input variable as a shorthand for the arguments sent to the mutation, which live in the context alongside various other information. VTL also supports if statements, which is handy for branching logic. In the 'else' branch, we generate a UUID to use as the DynamoDB key. This is the hidden power of mapping templates: one can manipulate JSON objects, encode data, or parse errors with them. The last one is especially useful when we want to return an error in a specific format. Nowadays, you can use JavaScript resolvers instead of VTL.
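
For comparison, here is roughly the same logic as a request function, written as a self-contained TypeScript sketch. In a real JavaScript resolver you would use util.autoId() and util.dynamodb.toMapValues() from '@aws-appsync/utils'. The toMapValues and toAttributeValue helpers below are simplified local stand-ins so the example runs on its own, and the ID generator is passed in as a parameter for the same reason.

```typescript
type Primitive = string | number | boolean | null;
type DynamoAttribute = { S: string } | { N: string } | { BOOL: boolean } | { NULL: true };

// Simplified stand-in for the DynamoDB marshalling done by the AppSync utilities.
function toAttributeValue(value: Primitive | undefined): DynamoAttribute {
  if (value === null || value === undefined) return { NULL: true };
  if (typeof value === "string") return { S: value };
  if (typeof value === "number") return { N: String(value) };
  return { BOOL: value };
}

function toMapValues(obj: Record<string, Primitive>): Record<string, DynamoAttribute> {
  const out: Record<string, DynamoAttribute> = {};
  for (const key of Object.keys(obj)) out[key] = toAttributeValue(obj[key]);
  return out;
}

interface AddBookArgs {
  input: { id?: string; title: string; author: string };
}

// Reuse the client's id or generate one, then build a PutItem request.
function putBookRequest(ctx: { args: AddBookArgs }, autoId: () => string) {
  const { id, ...values } = ctx.args.input;
  return {
    operation: "PutItem",
    key: toMapValues({ id: id ? id : autoId() }),
    attributeValues: toMapValues(values),
  };
}
```

One difference from the VTL template: this sketch keeps the id only in the key instead of repeating it in attributeValues.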

There are two types of resolvers:

  • Unit
  • Pipeline

So far we have been talking about unit resolvers. The pipeline type gives the freedom to use multiple functions with several data sources. For example, one could combine a Lambda function data source that holds the logic with an EventBridge bus that starts an asynchronous task in another system. We will do this in the second part of this series with JavaScript resolvers.
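
The idea of a pipeline can be modelled locally in a few lines. This is only a sketch under my own assumptions: runPipeline, validateBook, and publishEvent are hypothetical names, and the two steps merely stand in for a Lambda and an EventBridge data source. What it does mirror is that each function can read the previous function's result and share values through a stash.

```typescript
// Hypothetical local model of the context shared by pipeline functions.
interface PipelineCtx {
  args: Record<string, unknown>;
  stash: Record<string, unknown>;
  prev: { result: unknown };
}

type PipelineFn = (ctx: PipelineCtx) => unknown;

// Runs the functions in order; each one sees the previous result in ctx.prev.result.
function runPipeline(args: Record<string, unknown>, fns: PipelineFn[]): unknown {
  const ctx: PipelineCtx = { args, stash: {}, prev: { result: undefined } };
  for (const fn of fns) {
    ctx.prev = { result: fn(ctx) };
  }
  return ctx.prev.result;
}

// Step 1 stands in for a Lambda data source that validates the input.
const validateBook: PipelineFn = (ctx) => {
  const title = ctx.args.title;
  if (typeof title !== "string" || title === "") {
    throw new Error("title is required");
  }
  ctx.stash.validated = true;
  return { title };
};

// Step 2 stands in for an EventBridge data source: it builds an event from step 1's output.
const publishEvent: PipelineFn = (ctx) => ({
  detailType: "BookAdded",
  detail: ctx.prev.result,
});
```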

How do components relate to each other?

We have studied three main components: Schema, resolvers, and data sources. The most important part of studying the AppSync service is understanding the dependencies between each component.

Relations between main components

A schema has many fields across its Query, Mutation, and Subscription types. Let's consider an example: we have a query called getBook. We attach a unit resolver to it. The resolver needs a data source to get the data, so we connect a DynamoDB table to it.

User calls the 'getBook'

The image shows the relationships between all of them. The client requests information about a book, and the request goes to the schema. The schema routes it to the resolver attached to the Query field. The resolver uses the request mapping template to query the data source, which is backed by a DynamoDB table.
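
The same flow can be sketched in TypeScript with an in-memory object standing in for the DynamoDB table. All names here (bookTable, fieldResolvers, executeField) are hypothetical and exist only to show the wiring: field, then resolver, then data source.

```typescript
interface CatalogBook {
  id: string;
  title: string;
  author: string;
}

// Data source: an in-memory object standing in for the DynamoDB table.
const bookTable: Record<string, CatalogBook | undefined> = {
  "1": { id: "1", title: "Dune", author: "Frank Herbert" },
};

// Resolver: turns the field arguments into a data source lookup.
type FieldResolver = (args: { id: string }) => CatalogBook | null;

// Schema wiring: "Type.field" mapped to the resolver attached to that field.
const fieldResolvers: Record<string, FieldResolver | undefined> = {
  "Query.getBook": (args) => bookTable[args.id] ?? null,
};

// Entry point: route the requested field to its resolver.
function executeField(field: string, args: { id: string }): CatalogBook | null {
  const resolver = fieldResolvers[field];
  if (!resolver) throw new Error("No resolver attached to " + field);
  return resolver(args);
}
```

Calling executeField("Query.getBook", { id: "1" }) returns the stored book, while a field without an attached resolver fails before any data source is touched.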

GraphQL and AppSync Security considerations

What should one consider when moving to AWS AppSync?

  • Data source roles should allow the minimum set of actions on the target resource. Do not use asterisks, and think carefully about what you need. Developers often give AppSync or Lambda functions far more permissions than simple read operations require.
  • The default authorization mode is the API key, which is static. Access can also be controlled via IAM or a custom Lambda authorizer, so it is advisable to decide on this at the beginning of the project.
  • An important feature of GraphQL is introspection. By default, anyone who can call the API can retrieve the whole schema, with all fields, queries, mutations, and types. Try to turn it off in production.
  • AWS AppSync exposes a single endpoint, which makes it an obvious target for cyberattacks.
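
As an illustration of the first point, here is a minimal TypeScript sketch of a least-privilege policy for a read-only getBook data source, plus a small check that flags wildcards. The table ARN is a placeholder, and findWildcards is a hypothetical helper, not an AWS API.

```typescript
interface IamStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string[];
}

// Hypothetical policy for a data source role that only reads single books.
const readOnlyBooksPolicy: IamStatement[] = [
  {
    Effect: "Allow",
    Action: ["dynamodb:GetItem"],
    Resource: ["arn:aws:dynamodb:eu-west-1:123456789012:table/Books"], // placeholder ARN
  },
];

// Returns every action or resource in an Allow statement that contains a wildcard.
function findWildcards(statements: IamStatement[]): string[] {
  const findings: string[] = [];
  for (const statement of statements) {
    if (statement.Effect !== "Allow") continue;
    for (const value of statement.Action.concat(statement.Resource)) {
      if (value.indexOf("*") !== -1) findings.push(value);
    }
  }
  return findings;
}
```

A check like this could run in a unit test or a CI step, so an overly broad statement is caught before deployment.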

Conclusion

In this article, I discussed the main components of AWS AppSync, how they connect, best practices, and some security considerations. AppSync is a powerful service with built-in authorization, caching, and integration with other services like AWS Lambda, DynamoDB, and RDS. It can be used as an entry point for your legacy API, making it typed and more secure. Event-driven architectures also pair well with AppSync.

Igor Soroka
April 5, 2023
5 min read