
Containerizing and Deploying a Web App in Kubernetes: A Real World Migration


A quick bit of context for you first: this site is built on Hugo, a static site generator that is stupidly well optimized in the build phase and has quite a lot of themes. Does it need to be on Kubernetes? Absolutely not. It doesn’t even need to be on a normal web server, since it can be fully hosted in S3 like my photography site. That’s never stopped me though, and labbing with over-engineered technology is my favourite way to learn. Let’s get into it!

Platform of Choice… Google? #

For any of my past, present, and future colleagues, I’m sorry. I’m Terraform & AWS through and through, but AWS EKS pricing is painful, and you can’t even use Spot for Fargate! Using EKS Fargate, it would’ve cost me around $150 a month at the absolute minimum. GKE Autopilot, on the other hand, seems much better cost-wise. It looks to be about $30 a month so far, and the load balancer seems cheaper too.

Combine that with the seemingly great security setup and exemplary default monitoring, and this looks like a great solution by comparison. Despite my complete bias against any software from Google, I’ve been pleasantly surprised by how it’s all set up. There also doesn’t seem to be much in the way of GKE-exclusive commands / annotations compared to EKS, so I’m much more confident that I’d be able to migrate my deployments elsewhere if needed.

I decided against using Terraform here. I already know Terraform well enough, and outside of this project I won’t be dealing with Google Cloud, so creating everything manually and documenting it will be fine.

The Easy Bit - Dockerfile & Deploy #

Docker is so, so easy. Until this project, I didn’t realise how easy it was to containerize your own application. Here’s the Dockerfile I use:

FROM httpd:latest

WORKDIR /usr/local/apache2/htdocs/

# Clean the default public folder
RUN rm -fr * .??*

# Finally, the "public" folder generated by Hugo in the previous pipeline step
# is copied into the Apache htdocs folder
COPY ./public/ /usr/local/apache2/htdocs/

That’s it. No fancy configs, and since we’re using a well-supported web server like Apache, we have a good amount of documentation, and theoretically we could switch to something like Nginx with just a couple of tweaks.
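
Just to sketch those tweaks, here’s roughly what an Nginx equivalent might look like (an untested sketch, assuming the stock nginx image’s default docroot):

FROM nginx:alpine

# Nginx serves from a different default docroot than Apache's htdocs
WORKDIR /usr/share/nginx/html/

# Clean the default placeholder content
RUN rm -fr * .??*

# Copy the Hugo-generated "public" folder in, same as before
COPY ./public/ /usr/share/nginx/html/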

Now, if we’re deploying this locally, here’s what we’d run (assuming you have Docker installed):

cd PATH_TO_DOCKERFILE
# The Hugo-generated public/ folder needs to sit alongside the Dockerfile,
# since the COPY instruction references ./public/ relative to the build context
docker build -t author/container_name . --no-cache
docker run -p 8096:80 author/container_name

When you’re pushing to a registry, it’s a bit different and depends on how you push and where to. In this example, I’m using Artifact Registry. Here’s the guide I used.
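
In rough shape, the push usually looks something like this, where REGION, PROJECT_ID, and REPO_NAME are placeholders for your own values:

# Let Docker authenticate to Artifact Registry through gcloud
gcloud auth configure-docker REGION-docker.pkg.dev

# Tag the local image with the full Artifact Registry path, then push it
docker tag author/container_name REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/container_name:latest
docker push REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/container_name:latest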

The Tough Bit - Kubernetes Setup #

In this section, I won’t be covering any initial GKE setup, since others can explain it much better than I can. The GKE official docs are actually really good. You can even edit the parameters on the web page itself before copying!

Here’s an explainer of what the resources I’m building are, and what I’m using them for:

ManagedCertificate #

This, along with the LB ingress, is one of the two GKE-proprietary implementations within Kubernetes used here. If you want to serve HTTPS traffic publicly, you’ll need an SSL certificate, and going the Google-managed way works great.

As long as my domain’s DNS records point directly to the ingress’s static IP, it’ll automatically verify and assign a valid cert. No complicated verification needed! I followed this GCP documentation to get it set up. The deployment config for this is hilariously short:

apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: lhc-cert
spec:
  domains:
    - liamhardman.cloud
    - www.liamhardman.cloud

In the future, I might end up using cert-manager and Let’s Encrypt, but this solution works great for now.
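
If you’re waiting on provisioning (it can take a while), the ManagedCertificate resource reports its status through kubectl. Something like this, assuming everything lands in the lhc namespace used later:

# Status moves from Provisioning to Active once Google's checks pass
kubectl get managedcertificate lhc-cert -n lhc
kubectl describe managedcertificate lhc-cert -n lhc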

NodePort Service #

To put it in more on-prem-friendly terms, a NodePort service is quite similar to port forwarding: there’s a port you expose to the wider world, inbound to your network, and it’ll usually be different from the port that’s open internally.

apiVersion: v1
kind: Service
metadata:
  name: lhc-nodeport
spec:
  selector:
    app: "lhc"
  type: NodePort
  ports:
    - protocol: TCP
      port: 8081
      targetPort: 80
      nodePort: 32081

Again, quite a short config here. You don’t technically need to set the nodePort, so feel free to miss it out. ‘port’ is what the service listens on inside the cluster, and ‘targetPort’ is the port it forwards to. So in this case, my web app listens on port 80, the service listens on 8081, and the nodes themselves expose it externally on 32081.

Google LB Ingress #

Another Google-specific implementation, but it’s quite easy to learn.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: lhc-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: eks-ingress
    networking.gke.io/managed-certificates: lhc-cert
    kubernetes.io/ingress.class: "gce"
    networking.gke.io/v1beta1.FrontendConfig: "lhc-frontend"

spec:
  rules:
  - host: "liamhardman.cloud"
    http:
      paths:
      - pathType: ImplementationSpecific
        path: "/*"
        backend:
          service:
            name: lhc-nodeport
            port:
              number: 8081

I use the FrontendConfig to perform an HTTPS redirect:

apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: lhc-frontend
spec:
  redirectToHttps:
    enabled: true
    responseCodeName: PERMANENT_REDIRECT

To get this working successfully, you’ll also need to reserve a static IP to assign to your ingress, with the same name as in the ‘kubernetes.io/ingress.global-static-ip-name’ annotation. I used this guide from Google to get that set up.
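
For reference, reserving that global static IP boils down to a couple of commands (using the eks-ingress name from the annotation above):

# Reserve a global static IP named to match the ingress annotation
gcloud compute addresses create eks-ingress --global

# Print the address so you can point your DNS records at it
gcloud compute addresses describe eks-ingress --global --format="value(address)"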

HorizontalPodAutoscaler #

Quite a quick one here: I set this up in the GKE console, so there’s no config to show. Under Workloads -> Deployment Name -> Actions (top of screen) -> Auto-scale, you can set min and max pods, along with the trigger for scaling up (e.g. CPU and/or RAM usage).
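
If you’d rather keep things declarative, the console setup roughly corresponds to a manifest like this; a sketch using the 2-5 pod and 65% CPU figures I mention later, with a hypothetical resource name:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: lhc-hpa  # hypothetical name, not from my actual setup
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: lhc-deploy
  minReplicas: 2
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 65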

Container Deployment #

This is where a huge cost saving comes in for GKE: you can use spot nodes, unlike in EKS Fargate! This brings costs down by over two-thirds and makes this genuinely viable for a solo labber.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "lhc-deploy"
spec:
  selector:
    matchLabels:
      app: "lhc"
  replicas: 3
  template:
    metadata:
      labels:
        app: "lhc"
    spec:
      containers:
        - image: [REDACTED]
          imagePullPolicy: Always
          name: "lhc-apache"
          ports:
            - containerPort: 80
              protocol: TCP
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "250m"
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      terminationGracePeriodSeconds: 25

Another short and sweet config here. In this deployment you can ignore the ‘replicas’ value, since I’m using auto-scaling. It also shows how the ingress ultimately finds its pods.

Each resource can have a label applied. Among other things, labels can be used to select where inbound requests go. Anything with the label app: lhc in the lhc namespace will have traffic to liamhardman.cloud forwarded to it. This is great for high availability and for security in multi-tenant deployments, but it can be quite frustrating to troubleshoot if you don’t realize you haven’t specified the right namespace somewhere.
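
A quick sanity check when traffic isn’t arriving is to confirm the selector actually matches pods in the namespace you think it does, for example:

# List pods carrying the label the service selects on, in the lhc namespace
kubectl get pods -n lhc -l app=lhc

# Confirm the service has endpoints; an empty list means a selector or namespace mismatch
kubectl get endpoints lhc-nodeport -n lhc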

While namespaces do help with security, they’re not close to the be-all and end-all, so check this Kubernetes.io guide if you’re interested in locking down a cluster of your own.

[Diagram: Kubernetes traffic flow from the load balancer through the NodePort service to the pods]
This diagram explains the traffic flow quite well, and should hopefully clear up any port/networking-related confusion.

If you haven’t seen my last article about using Mingrammer, I’d highly recommend it. I hate dragging around tiny connectors so this is perfect!

Automation Action: Pipeline Steps via GitHub Actions #

When a git push is performed, the following will happen (there’s a workflow sketch after the list):

  • An Ubuntu VM will be used to check out the current up-to-date repo
  • Hugo will perform a build and minify
  • Google Cloud will authenticate with a Service Account made for GitHub Actions and get GKE credentials. This is an absolutely brilliant article for getting that set up.
  • A temporary token will be used (and deleted after 300s) to sign in to Artifact Registry
  • A docker build and push to my Artifact Registry repository will occur
  • kubectl will then apply the YAML deployment manifests in ./manifests to the lhc namespace in my lab.
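
For the curious, here’s a trimmed-down sketch of what such a workflow might look like. The action versions, secret names, cluster name, location, and registry path are all illustrative rather than my exact file:

name: build-and-deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # Check out the current repo on the Ubuntu runner
      - uses: actions/checkout@v4

      # Build and minify the site with Hugo
      - uses: peaceiris/actions-hugo@v2
        with:
          hugo-version: 'latest'
      - run: hugo --minify

      # Authenticate to Google Cloud with the GitHub Actions Service Account
      - uses: google-github-actions/auth@v2
        with:
          credentials_json: ${{ secrets.GCP_SA_KEY }}

      # Fetch GKE credentials so kubectl can talk to the cluster
      # (cluster name and location here are hypothetical)
      - uses: google-github-actions/get-gke-credentials@v2
        with:
          cluster_name: lhc-cluster
          location: europe-west2

      # Sign in to Artifact Registry, then build and push the image
      - run: |
          gcloud auth configure-docker REGION-docker.pkg.dev --quiet
          docker build -t REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/lhc:latest .
          docker push REGION-docker.pkg.dev/PROJECT_ID/REPO_NAME/lhc:latest

      # Apply the manifests to the lhc namespace
      - run: kubectl apply -f ./manifests/ -n lhc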

In my last setup, using a GitHub pipeline was quite pointless; it was basically just there to learn from and show off that I’d learnt about pipelines. Here, it’s almost required. Doing all of these steps manually would take quite a while and is quite prone to human error. The pipeline also ensures that my application deploys automatically, with no downtime during the container restarts.

The Benefits #

Other than being able to shout “Kubernetes!” at every meeting now, having HA capabilities and this much versatility at such a low cost is really good. Even though I’m an infrastructure guy at heart, the ability to mostly ignore the underlying infrastructure maintenance is so refreshing. Now that I’m transitioning more into DevOps, this is what keeps me inspired to keep going and get my teeth into even more CI/CD tech.

Additionally, I’ve found the much lower resource usage to be a great benefit. I’ve got the minimum 0.25 vCPU, 0.5GB RAM, and 1GiB ephemeral storage assigned to each pod, with a HorizontalPodAutoscaler set from 2-5 pods and a scale-up event occurring at 65% aggregate CPU usage. Even then, the resource usage is absolutely tiny, and that’s with 500 concurrent connections in a load-testing scenario. Absolutely brilliant!
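
If you fancy running a similar test, a tool like hey can generate that sort of load. An example invocation (not necessarily my exact setup):

# 500 concurrent connections for 30 seconds against the live site
hey -z 30s -c 500 https://liamhardman.cloud/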

Better redundancy, near-instant scaling, and spot-instance cost savings would be good enough on their own. Running a GTmetrix report shows that I’m also getting some incredible page load speeds here.

Conclusion: Brilliant, and Not so Pricey #

Indeed, this ended up being a good bit cheaper than I thought, and significantly better in terms of out-of-the-box monitoring and security. The benefits here make my old hosting solution seem pretty naff in comparison. The pipeline setup taught me quite a lot, and I’m just as happy with how much I learnt from setting up a Kubernetes cluster. Containerization / orchestration was one of my weak points, but it definitely isn’t as much of one now.

As always, please fire over a mail to [email protected] for any feedback or questions.