How I leveraged GitLab CI to keep my application fed with data (with example)

17.07.2018

GitLab’s integrated CI/CD is advertised as the way to quickly and automatically build, test and deploy code. But what if we used it to continuously deliver the product of our code, rather than the code itself? It turned out to be a perfect - and achievable - use case for my Cinema Citizen web app.

Manual simply didn’t work

In short, CCtzn helps you plan movie marathons. The calculations run on JSON data I need to prepare beforehand for each cinema and day of the week. The script responsible for the “download & convert” work is a PHP piece proudly called timetables-maker. It works reasonably well, given how unstructured the original data is and how many separate datasets it generates. Currently, it spits out about 1000 files in roughly 4 minutes on a laptop.

Since my app relies on the real-world timetables provided by cinema chains, it’s a bit time-sensitive. On the same day they update information about screenings, I should run the script to get fresh data. My shared hosting supports cron, but it also limits max_execution_time, so the script could be killed in the middle of a run! If you read my previous post, you know that I’m too cheap to upgrade the server, especially for the needs of an app that makes no profit yet.

I started thinking about changing the approach. Instead of generating hundreds of JSONs at once every week, I could let the script run on demand, producing a single (cached) file every time a user requests the timetable for a selected venue and date. This way had its own downsides, though:

So until lately, I was the actual backend for my app, having to manually run the script and upload the generated files via FTP (another trait of simple, old-school hosting). I was too lazy for it, so CCtzn just lay there for months, starving for JSONs. But this time, being lazy also motivated me to delegate this process to the machine.

Automatic did work

When I started playing with GitLab’s solution, my primary focus was to automate the FTP upload; everything else (e.g. building the JavaScript bundle) came after that. I used to write Gulp tasks for the job, but that still required me to trigger the file transfer myself, and it sometimes failed. Being able to skip the local step and start directly from the repository feels awesome.

The following paragraphs contain some quoted terms (like “stage”) from CI/CD nomenclature, which should be interpreted as in the docs; however, they read pretty intuitively.

A typical setup consists of a few “stages” (for example, “build”, “test” and “deploy”), whose ultimate goal is to deliver an updated version of your code to some environment. This is what most tutorials say. The fact is that you can do pretty arbitrary things with the CI/CD engine, not necessarily aimed at releasing. This is what I did - triggered side effects (i.e. generated JSONs with movie timetables) from my code, and pushed the files to FTP. All for free, within the monthly allowance of 2000 minutes of “runner” utilization, and without a (visible) time limit for my script.
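To make the idea more concrete, here is a minimal sketch of what such a two-stage pipeline could look like in a .gitlab-ci.yml file. It is not the project's actual configuration: the script entry point, the output/ directory, the remote path, the choice of lftp as the upload tool, the schedule-only trigger and the FTP_* variables (which would be defined as CI/CD variables in the project settings) are all assumptions made for illustration.

```yaml
# Hypothetical sketch; file names, paths and variables are assumptions.
image: php:7.2-cli

stages:
  - generate
  - deploy

generate_timetables:
  stage: generate
  only:
    - schedules              # assumed: run from a pipeline schedule, not on every push
  script:
    - php timetables-maker.php   # assumed entry point of the generator script
  artifacts:
    paths:
      - output/              # assumed directory where the JSON files are written
    expire_in: 1 day

upload_to_ftp:
  stage: deploy
  only:
    - schedules
  script:
    # lftp is one possible FTP client; the image above is Debian-based, so apt-get is available
    - apt-get update -qq && apt-get install -y -qq lftp
    # artifacts from the previous stage are downloaded automatically, so output/ is present here
    - lftp -u "$FTP_USER,$FTP_PASSWORD" "$FTP_HOST" -e "mirror -R output/ /timetables/; quit"
```

Splitting generation and upload into separate stages means a failed FTP transfer can be retried without regenerating the files, as long as the artifacts have not expired.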

The process step by step:

And voila! Finally, I have a humanless backend :) As promised (and because I work openly on the CCtzn project), you can see the final configuration here.

What surprised me when working with GitLab CI/CD (a.k.a. friction log)