Building a serverless fuel API: Part I
Software engineering is a lifestyle. It is not just a 9–5 job.
There are several ways to learn new technologies and gain skills. The first is to learn on your current project at work. However, business requirements and real-world constraints often mean that state-of-the-art technologies can't be adopted there.
I became really interested in the serverless stack and AWS-related news in 2018, and tons of new features and additions have been released since then. Back then, there weren't many serverless projects at the consultancy where I worked.
So, I took on a hobby project and followed the advice many are given: solve a small problem in your everyday life. Unfortunately, I haven't yet bought an electric car, so I always monitor gas prices in the surrounding area. I decided to build an API that provides current prices for Finland.
In this series of posts, I will document and explain the process of building this tool.
Getting the data
Is web scraping legal? It depends on the use case. As far as I could tell, it is acceptable as long as you are not gaining a competitive advantage from it. There is no financial gain involved in this project; I am using the data as a hobbyist. I also keep the number of requests to the web page low, so the load on it is minimal. Here is a good explanation with images.
The fuel prices are listed in tables, sorted by location. Now we need to get this data into our application. I studied several options and concluded that I needed an easy way to scrape the price information.
My choice is Cheerio. It is a lightweight library that parses the HTML page from text, without running a browser. Its syntax is very close to jQuery. jQuery may not be considered modern anymore, but it has the advantage of being very easy to pick up. From my research, Cheerio was the fastest of the options I looked at for extracting data from a web page. It also has good type definitions, which makes it easy to integrate into a TypeScript project.
In the scraper code, I first initialize the Cheerio object. Then I traverse the DOM structure via the 'select'. The nice thing here is that each 'element' can itself be traversed as a Cheerio object. I also go through each location in a loop.
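To make the traversal concrete, here is a minimal sketch of that kind of scraping, not the project's actual code. The markup, the element ids, the column order, the numbers, and the names `parseLocations`, `parsePrices`, and `StationPrice` are all assumptions made for illustration; only the Cheerio calls (`load`, selectors, `each`, `find`, `eq`, `text`) come from the library itself.

```typescript
import * as cheerio from "cheerio";

// Hypothetical markup and made-up numbers: the real page structure is not shown here.
const sampleHtml = `
<select id="locations">
  <option>Helsinki</option>
  <option>Tampere</option>
</select>
<table id="prices">
  <tr><th>Station</th><th>95</th><th>98</th><th>Diesel</th></tr>
  <tr><td>Station A</td><td>1.459</td><td>1.539</td><td>1.329</td></tr>
  <tr><td>Station B</td><td>1.479</td><td>-</td><td>1.349</td></tr>
</table>`;

// Hypothetical model type for one table row.
interface StationPrice {
  station: string;
  petrol95: number | null;
  petrol98: number | null;
  diesel: number | null;
}

// Parse a price cell; placeholders such as "-" become null.
function toPrice(text: string): number | null {
  const value = parseFloat(text.trim().replace(",", "."));
  return isNaN(value) ? null : value;
}

// Collect the location names from the options of the (assumed) select element.
function parseLocations(page: string): string[] {
  const $ = cheerio.load(page);
  const names: string[] = [];
  $("select#locations option").each((_, element) => {
    // Wrapping the raw element with $() gives a Cheerio object again.
    names.push($(element).text().trim());
  });
  return names;
}

// Loop over the table rows and turn each data row into a StationPrice.
function parsePrices(page: string): StationPrice[] {
  const $ = cheerio.load(page);
  const rows: StationPrice[] = [];
  $("table#prices tr").each((_, row) => {
    const cells = $(row).find("td");
    if (cells.length < 4) return; // the header row has no <td> cells
    rows.push({
      station: cells.eq(0).text().trim(),
      petrol95: toPrice(cells.eq(1).text()),
      petrol98: toPrice(cells.eq(2).text()),
      diesel: toPrice(cells.eq(3).text()),
    });
  });
  return rows;
}
```

In a real scraper, the `page` argument would be the HTML string downloaded from the price site rather than `sampleHtml`.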
When it comes to cloud deployments, you want them to be as repeatable and automated as possible. The first thing that comes to mind is writing a set of instructions for the cloud provider of your choice.
There are now several ways of doing infrastructure as code (IaC) in the AWS world. The most obvious one is CloudFormation, the native AWS way of writing infrastructure configuration. It is worth mentioning that most AWS-focused IaC tools, including SAM, CDK, and the Serverless Framework, are built on top of it; Terraform is the exception and talks to the AWS APIs directly. In the world of serverless, these are the main alternatives:
SAM is the closest relative of CloudFormation. It has a cleaner syntax, but a typical file still contains a lot of lines. Terraform is a powerful tool that lets you package configuration into reusable modules and build a module-based infrastructure. This is a great relief in a big project.
CDK is an AWS-created tool that lets you define infrastructure in a general-purpose programming language such as TypeScript, Java, or Python. It removes the need to switch contexts between technologies and syntaxes. I will definitely try it someday.
In this project, I am using the Serverless Framework, since it is the easiest to start with. It works like a charm when you need a fast start in the world of serverless, and its YAML syntax is quite simple. In the configuration file, I do two things: create an AWS Lambda function and attach a REST endpoint from API Gateway to it.
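The shape of such a configuration can be sketched as follows. The real file in the project is YAML; to keep the examples in this post in one language, the sketch writes the same structure as a plain TypeScript object whose keys mirror the YAML keys. The service name, runtime, handler names, and paths are assumptions, not the project's real values.

```typescript
// Sketch of a Serverless Framework configuration as a plain object.
// Every concrete value below is hypothetical.
const serverlessConfiguration = {
  service: "fuel-api",
  provider: {
    name: "aws",
    runtime: "nodejs18.x",
  },
  functions: {
    // Each entry creates one AWS Lambda function.
    getLocations: {
      handler: "handler.getLocations",
      // The "http" event attaches an API Gateway REST endpoint to the function.
      events: [{ http: { method: "get", path: "locations" } }],
    },
    getPrices: {
      handler: "handler.getPrices",
      events: [{ http: { method: "get", path: "prices/{location}" } }],
    },
  },
};
```

The `{location}` segment is a path parameter, which API Gateway passes to the function in the incoming event.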
The Serverless team has tons of good examples. For instance, they provide a component that helps migrate an existing Express API (source).
The 'handler.ts' file contains the code for the Lambda functions. It uses the class in 'fuelScraper.ts', which provides their implementation. A folder called 'model' holds the types needed for parsing. Tests are written with Jest, which worked as expected.
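As a rough illustration of how these pieces fit together, here is a self-contained sketch. It is not the code from the repository: the `FuelScraper` class is stubbed with made-up data, the `FuelPrice` type and the handler names are assumptions, and the event and result interfaces are minimal local stand-ins for the API Gateway proxy types that a real project would more likely import from a type package.

```typescript
// Minimal stand-ins for the API Gateway proxy event and result shapes.
interface ProxyEvent {
  pathParameters?: Record<string, string | undefined> | null;
}

interface ProxyResult {
  statusCode: number;
  headers?: Record<string, string>;
  body: string;
}

// model/: hypothetical type describing one parsed price entry.
interface FuelPrice {
  location: string;
  station: string;
  petrol95: number | null;
  diesel: number | null;
}

// fuelScraper.ts: hypothetical class, stubbed here instead of scraping a page.
class FuelScraper {
  async getLocations(): Promise<string[]> {
    return ["Helsinki", "Tampere"];
  }

  async getPrices(location: string): Promise<FuelPrice[]> {
    return [{ location, station: "Example station", petrol95: 1.5, diesel: 1.4 }];
  }
}

const scraper = new FuelScraper();

// Build a JSON response in the format API Gateway expects from a Lambda proxy.
const json = (statusCode: number, payload: unknown): ProxyResult => ({
  statusCode,
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(payload),
});

// handler.ts: one thin function per endpoint, delegating to the scraper.
const getLocations = async (): Promise<ProxyResult> =>
  json(200, await scraper.getLocations());

const getPrices = async (event: ProxyEvent): Promise<ProxyResult> => {
  const location = event.pathParameters?.location;
  if (!location) {
    return json(400, { message: "Missing location" });
  }
  return json(200, await scraper.getPrices(location));
};
```

Keeping the handlers this thin means the scraping logic can be unit tested with Jest without any Lambda or API Gateway setup.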
By the end of this phase, I had created two endpoints: one for getting the location names and one for getting the prices for a chosen location.
- Get all locations for Finland (approximately 100 names):
- Get all prices for the chosen location (e.g. Tampere):
- Serverless framework
My plan is to start saving the prices to a DynamoDB table. After that, I will write an endpoint that takes coordinates and returns the three closest gas stations with the best prices. Finally, I will build a React app that shows them on a map with Google links for navigation.
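The closest-stations endpoint does not exist yet, so the following is only one possible way to approach it, under my own assumptions: stations carry a latitude, a longitude, and a single price; distance is the great-circle (haversine) distance; and "closest with the best prices" means taking the nearest three and ordering them by price. The `Station` type and the function names are hypothetical.

```typescript
// Hypothetical shape of a stored station record.
interface Station {
  name: string;
  lat: number;
  lon: number;
  price: number;
}

const EARTH_RADIUS_KM = 6371;

const toRadians = (degrees: number): number => (degrees * Math.PI) / 180;

// Great-circle distance between two coordinates, using the haversine formula.
function haversineKm(lat1: number, lon1: number, lat2: number, lon2: number): number {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Take the `count` nearest stations, then order those by price, cheapest first.
function closestCheapest(stations: Station[], lat: number, lon: number, count = 3): Station[] {
  return stations
    .map((station) => ({ station, distanceKm: haversineKm(lat, lon, station.lat, station.lon) }))
    .sort((a, b) => a.distanceKm - b.distanceKm)
    .slice(0, count)
    .map((entry) => entry.station)
    .sort((a, b) => a.price - b.price);
}
```

Sorting by distance first and by price second is a design choice; a weighted score that trades distance against price would be another option.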
The full source code of Fuel-API is available on GitHub. Feel free to ask questions in the comments. If you would like to connect, add me on LinkedIn. I am an independent software consultant based in Helsinki and an advocate of the serverless stack.