<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[CodeOps Studies]]></title><description><![CDATA[DevOps, CodeOps, Practices : The good and bad, Solutions and mistakes committed along the journey of development to deployment]]></description><link>https://blog.nyzex.in</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1761603500195/640df845-995c-4545-b309-34aca8439dc9.png</url><title>CodeOps Studies</title><link>https://blog.nyzex.in</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 24 Apr 2026 21:33:33 GMT</lastBuildDate><atom:link href="https://blog.nyzex.in/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[When SSL Lies: Debugging PostgreSQL “server does not support SSL” in Kubernetes]]></title><description><![CDATA[You spin up a Kubernetes workload. You connect to PostgreSQL.
You add sslmode=require because security matters.
And then PostgreSQL replies with:
server does not support SSL, but SSL was required

Wai]]></description><link>https://blog.nyzex.in/when-ssl-lies-debugging-postgresql-server-does-not-support-ssl-in-kubernetes</link><guid isPermaLink="true">https://blog.nyzex.in/when-ssl-lies-debugging-postgresql-server-does-not-support-ssl-in-kubernetes</guid><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[software development]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Wed, 22 Apr 2026 18:45:31 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/61b66c78-00c1-4841-9afe-0ec8e9907ecc.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You spin up a Kubernetes workload. You connect to PostgreSQL.</p>
<p>You add <code>sslmode=require</code> because security matters.</p>
<p>And then PostgreSQL replies with:</p>
<pre><code class="language-plaintext">server does not support SSL, but SSL was required
</code></pre>
<p>Wait. What?</p>
<p>This is one of those errors that sounds like a configuration typo but usually reveals something deeper about your infrastructure. Let’s break it down properly and fix it the right way.</p>
<h2>The Setup</h2>
<p>This was a fairly common environment:</p>
<ul>
<li><p>Application running inside Kubernetes</p>
</li>
<li><p>PostgreSQL running on a private IP inside the VPC</p>
</li>
<li><p>Direct <code>psql</code> access used for testing connectivity</p>
</li>
<li><p>Connection string explicitly enforcing SSL</p>
</li>
</ul>
<p>The test command looked like this:</p>
<pre><code class="language-plaintext">psql "host=10.1.1.17 port=5432 dbname=mydb user=myuser sslmode=require"
</code></pre>
<p>And PostgreSQL responded:</p>
<pre><code class="language-plaintext">psql: error: connection to server at "10.1.1.17", port 5432 failed:
server does not support SSL, but SSL was required
</code></pre>
<p>At first glance, it feels contradictory. PostgreSQL supports SSL. So why is it saying it does not?</p>
<hr />
<h2>What <code>sslmode=require</code> Actually Does</h2>
<p>When you set:</p>
<pre><code class="language-plaintext">sslmode=require
</code></pre>
<p>you are telling the PostgreSQL client:</p>
<blockquote>
<p>If SSL cannot be negotiated, fail immediately.</p>
</blockquote>
<p>No fallback. No downgrade. No retry without encryption.</p>
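<p>For reference, <code>require</code> is only one of libpq's <code>sslmode</code> values, and they trade off differently:</p>
<pre><code class="language-plaintext">disable      never use SSL
allow        try plain first, upgrade to SSL if the server insists
prefer       try SSL first, fall back to plain (the default)
require      SSL only, but no certificate verification
verify-ca    SSL only, verify the server certificate chain
verify-full  SSL only, verify the chain and the host name
</code></pre>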
<p>The connection flow looks like this:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/cc741a68-29e8-4128-8221-7671bdaf334c.png" alt="" style="display:block;margin:0 auto" />

<p>If the server is not configured with SSL enabled, the handshake never happens and the client exits.</p>
<p>That is exactly what happened here.</p>
<hr />
<h2>Why This Happens in Kubernetes Environments</h2>
<p>In many Kubernetes setups, PostgreSQL is:</p>
<ul>
<li><p>A container inside the cluster</p>
</li>
<li><p>A VM inside a private subnet</p>
</li>
<li><p>An internal managed database</p>
</li>
<li><p>Or a local development database</p>
</li>
</ul>
<p>In these cases, SSL is often disabled by default.</p>
<p>This creates a mismatch:</p>
<ul>
<li><p>The client demands encryption</p>
</li>
<li><p>The server is configured for plain TCP</p>
</li>
<li><p>PostgreSQL refuses to downgrade</p>
</li>
<li><p>The connection fails</p>
</li>
</ul>
<p>From PostgreSQL’s perspective, nothing is wrong. It simply does not have SSL enabled.</p>
<h2>Step 1: Confirm Whether the Server Supports SSL</h2>
<p>Before changing anything, verify the server configuration.</p>
<p>If you can connect without SSL, try:</p>
<pre><code class="language-plaintext">psql "host=10.1.1.17 port=5432 dbname=mydb user=myuser sslmode=disable"
</code></pre>
<p>If that works, then the database is reachable and the issue is purely SSL related.</p>
<p>Now check whether SSL is enabled on the server:</p>
<pre><code class="language-plaintext">SHOW ssl;
</code></pre>
<p>If it returns:</p>
<pre><code class="language-plaintext">ssl | off
</code></pre>
<p>then the error makes complete sense.</p>
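<p>You can also probe this from any machine without <code>psql</code>. A PostgreSQL client opens the conversation by sending an 8-byte SSLRequest packet, and the server answers with a single byte: <code>S</code> if SSL is available, <code>N</code> if not. A minimal Python sketch (the helper name and timeout are my own, not part of any client library):</p>
<pre><code class="language-python">import socket
import struct

# SSLRequest: message length (8) followed by the magic code 80877103
SSL_REQUEST = struct.pack("!II", 8, 80877103)

def server_supports_ssl(host, port=5432, timeout=3.0):
    """Return True if the server answers 'S' (SSL available), False on 'N'."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(SSL_REQUEST)
        reply = sock.recv(1)
    return reply == b"S"
</code></pre>
<p>An <code>N</code> reply is exactly the situation behind the error message: the server is reachable, but not configured for SSL.</p>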
<hr />
<h2>Step 2: Understand Your Architecture</h2>
<p>Here is the real question:</p>
<p>Should SSL even be required here?</p>
<p>Look at the traffic path.</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/68b346b0-eff4-4877-ab9c-1019ab30223e.png" alt="" style="display:block;margin:0 auto" />

<p>If all communication is happening:</p>
<ul>
<li><p>Inside the same Kubernetes cluster</p>
</li>
<li><p>Or inside a private VPC</p>
</li>
<li><p>Without exposure to the public internet</p>
</li>
</ul>
<p>then the network layer is already isolated.</p>
<p>In such cases, enforcing SSL is sometimes unnecessary and only adds complexity.</p>
<p>However, if the database is:</p>
<ul>
<li><p>Publicly accessible</p>
</li>
<li><p>Cross region</p>
</li>
<li><p>Or accessed over the internet</p>
</li>
</ul>
<p>then SSL should absolutely be enabled.</p>
<hr />
<h2>Step 3: The Two Real Fixes</h2>
<p>There are only two correct solutions.</p>
<h3>Option One: Disable SSL on the Client</h3>
<p>If this is an internal trusted network, change:</p>
<pre><code class="language-plaintext">sslmode=require
</code></pre>
<p>to:</p>
<pre><code class="language-plaintext">sslmode=disable
</code></pre>
<p>or simply remove the parameter.</p>
<p>Test:</p>
<pre><code class="language-plaintext">psql "host=10.1.1.17 port=5432 dbname=mydb user=myuser"
</code></pre>
<p>If it connects successfully, you are done.</p>
<p>This is common in:</p>
<ul>
<li><p>Local development</p>
</li>
<li><p>Internal Kubernetes clusters</p>
</li>
<li><p>Private staging environments</p>
</li>
</ul>
<h3>Option Two: Enable SSL on PostgreSQL Properly</h3>
<p>If encryption is required, then the server must be configured correctly.</p>
<p>In postgresql.conf:</p>
<pre><code class="language-plaintext">ssl = on
ssl_cert_file = 'server.crt'
ssl_key_file = 'server.key'
</code></pre>
<p>Then update pg_hba.conf:</p>
<p>For example, a <code>hostssl</code> rule that only accepts SSL connections (the address range and auth method here are illustrative, adjust them to your network):</p>
<pre><code class="language-plaintext">hostssl  mydb  myuser  10.1.0.0/16  scram-sha-256
</code></pre>
<p>Restart PostgreSQL.</p>
<p>Now test:</p>
<pre><code class="language-plaintext">psql "host=10.1.1.17 port=5432 dbname=mydb user=myuser sslmode=require"
</code></pre>
<p>And confirm, using the <code>sslinfo</code> extension:</p>
<pre><code class="language-plaintext">SELECT ssl_is_used();
</code></pre>
<p>If it returns true, the handshake succeeded.</p>
<p>The flow now looks like this:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/05d539b2-6fc3-4428-ac1d-bb2ec28067b8.png" alt="" style="display:block;margin:0 auto" />

<p>That is the correct configuration if encryption is part of your compliance or security requirement.</p>
<hr />
<h2>A Common Hidden Cause in Cloud Environments</h2>
<p>There is one more subtle scenario.</p>
<p>Sometimes you are used to managed services like Amazon RDS where SSL is enabled automatically.</p>
<p>Then you switch to:</p>
<ul>
<li><p>Self managed PostgreSQL</p>
</li>
<li><p>A container image</p>
</li>
<li><p>A VM based deployment</p>
</li>
</ul>
<p>And assume SSL is still enabled.</p>
<p>It is not.</p>
<p>Managed database services often handle:</p>
<ul>
<li><p>Certificate provisioning</p>
</li>
<li><p>Key rotation</p>
</li>
<li><p>TLS negotiation</p>
</li>
<li><p>Client CA bundles</p>
</li>
</ul>
<p>Self managed PostgreSQL does none of that unless you configure it explicitly.</p>
<p>That assumption gap is where this error usually comes from.</p>
<h2>Practical Debug Checklist</h2>
<p>Whenever you see this error, follow this mental model:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/a5b3fce1-27e9-4a9d-a36c-65d7e302cfde.png" alt="" style="display:block;margin:0 auto" />

<p>Keep it simple.</p>
<p>Test the network first. Then test without SSL.</p>
<p>Then make an intentional security decision.</p>
<hr />
<h2>Lessons Learned</h2>
<p>The key takeaway is this:</p>
<p>Just because PostgreSQL supports SSL does not mean your server instance is configured for it. Blindly adding <code>sslmode=require</code> because it feels secure can actually break perfectly valid internal architectures.</p>
<p>Security is about understanding your network boundaries, not just toggling flags in connection strings. If traffic never leaves a private subnet, SSL might be optional.</p>
<p>If traffic crosses public infrastructure, SSL should be enforced and configured correctly.</p>
<p>The important part is intention.</p>
<hr />
<h2>Final Thought</h2>
<p>This error is not PostgreSQL lying. It is PostgreSQL being honest.</p>
<p>It is simply telling you:</p>
<p>“I do not have SSL enabled, and you told me not to connect without it.”</p>
<p>Once you understand that, the fix becomes straightforward.</p>
<p>And like most infrastructure issues, the real problem was not PostgreSQL.</p>
<p>It was an assumption.</p>
]]></content:encoded></item><item><title><![CDATA[A Real World Journey Building on Tencent Cloud]]></title><description><![CDATA[When you first approach Tencent Cloud, it feels familiar if you come from AWS or Azure. There are VPCs, Kubernetes clusters, load balancers, object storage, IAM, everything you would expect. But once ]]></description><link>https://blog.nyzex.in/a-real-world-journey-building-on-tencent-cloud</link><guid isPermaLink="true">https://blog.nyzex.in/a-real-world-journey-building-on-tencent-cloud</guid><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[Devops articles]]></category><category><![CDATA[software development]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 16 Mar 2026 03:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/94e36bdb-7591-4565-b9eb-c33debb36c4a.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When you first approach Tencent Cloud, it feels familiar if you come from AWS or Azure. There are VPCs, Kubernetes clusters, load balancers, object storage, IAM, everything you would expect. But once you actually start building and deploying a real system, the differences start to show up in very real ways.</p>
<p>This is a story of building a production ready setup on Tencent Cloud, running into ICP restrictions, redesigning architecture, and eventually stabilizing with a private Kubernetes cluster, bastion access, and a Hong Kong deployment.</p>
<p>It is also a breakdown of how Tencent Cloud actually works under the hood, and how it compares to AWS and Azure in practical DevOps workflows.</p>
<hr />
<h2>Understanding Tencent Cloud at a Core Level</h2>
<p>Tencent Cloud follows a similar mental model to other cloud providers, but the naming and behavior have subtle differences.</p>
<p>At the foundation you have a VPC. Inside the VPC you define subnets which can be public or private. Compute resources like CVMs or Kubernetes nodes live inside these subnets. Networking is controlled through security groups and routing tables.</p>
<p>Kubernetes is offered through TKE which is Tencent Kubernetes Engine. It abstracts control plane management but still gives you flexibility over networking, node pools, and scaling.</p>
<p>Object storage is COS which behaves similarly to S3.</p>
<p>Load balancing is handled through CLB which supports both public and private exposure.</p>
<p>At a glance everything looks standard. The complexity begins when you try to expose services publicly inside mainland China.</p>
<hr />
<h2>The ICP Reality</h2>
<p>One of the biggest turning points in this journey was understanding ICP.</p>
<p>If you deploy infrastructure in mainland China regions such as Shanghai or Beijing, you cannot simply expose a public website or API like you would on AWS. You need an ICP license issued by the Chinese government.</p>
<p>Without ICP approval, public endpoints may not work reliably, may be blocked, or may never become accessible.</p>
<p>This creates a very real constraint.</p>
<p>You can build everything correctly from a technical standpoint and still fail at the final step of making your service reachable.</p>
<p>That is exactly what happened.</p>
<p>Everything was deployed in Shanghai. Kubernetes cluster was up. Services were running. Ingress was configured. But external access became the bottleneck due to ICP restrictions.</p>
<hr />
<h2>The Shift to Hong Kong</h2>
<p>The solution was not a code change. It was a region change.</p>
<p>Tencent Cloud’s Hong Kong region operates outside mainland China regulations. That means no ICP requirement.</p>
<p>Moving to Hong Kong immediately removed the compliance barrier and allowed public exposure of services without regulatory friction.</p>
<p>This shift also impacted latency and accessibility for global users in a positive way.</p>
<p>The architecture itself remained similar, but the operational experience improved drastically.</p>
<hr />
<h2>Architecture Overview</h2>
<p>The final architecture evolved into a secure and production ready setup with strong isolation.</p>
<p>You have a private Kubernetes cluster running inside a VPC. The nodes are not directly exposed to the internet. Instead, access is controlled through a bastion host.</p>
<p>Ingress is handled through NGINX inside the cluster. External traffic flows through a load balancer into the ingress controller, and then to services.</p>
<p>TCP services such as device communication ports are also exposed through the ingress configuration rather than separate load balancers.</p>
<p>Here is a simplified flow of the architecture:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/5ef1315c-dd43-4b2c-b6e8-10761164ee48.png" alt="" style="display:block;margin:0 auto" />

<hr />
<h2>Private Cluster and Bastion Access</h2>
<p>One of the key design decisions was keeping the Kubernetes cluster private.</p>
<p>This means nodes do not have public IPs. Direct SSH or API access from the internet is not allowed.</p>
<p>Instead, a bastion host is deployed in a public subnet.</p>
<p>You connect to the bastion, and from there access internal resources.</p>
<p>This adds a strong security boundary.</p>
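<p>In practice this is a two-hop SSH, which OpenSSH collapses into one command with its <code>ProxyJump</code> flag (the user, bastion host, and internal IP below are illustrative):</p>
<pre><code class="language-plaintext">ssh -J ops@bastion.example.com ops@10.0.2.15
</code></pre>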
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/ba0d58c4-248d-4ea9-9f5d-a0c597137656.png" alt="" style="display:block;margin:0 auto" />

<p>This approach is very similar to hardened AWS architectures, but in Tencent Cloud it feels more necessary because of networking defaults and access patterns.</p>
<hr />
<h2>Kubernetes Exposure Strategy</h2>
<p>Instead of creating multiple load balancers for each service, the setup uses a single ingress controller.</p>
<p>NGINX ingress handles both HTTP and TCP traffic.</p>
<p>This is especially useful when dealing with systems like GPS tracking devices or custom protocols where TCP ports need to be exposed.</p>
<p>Configuration is done through a ConfigMap that maps external ports to internal services.</p>
<p>This reduces cost and complexity since Tencent CLB instances are not as flexible or cheap as AWS alternatives in some cases.</p>
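<p>With the NGINX ingress controller, that mapping lives in its <code>tcp-services</code> ConfigMap, where each key is an external port and each value is <code>namespace/service:port</code>. A sketch with illustrative service names:</p>
<pre><code class="language-plaintext">apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  # expose external port 5000 to the gps-gateway service
  "5000": "default/gps-gateway:5000"
</code></pre>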
<hr />
<h2>Storage and Data Flow</h2>
<p>Object storage using COS works very similarly to S3.</p>
<p>You can upload files, serve static assets, and integrate with applications easily.</p>
<p>A simple flow looks like this:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/e098d5dd-9791-4a64-85dc-925e45827a7e.png" alt="" style="display:block;margin:0 auto" />

<p>The APIs are slightly different, but the mental model remains the same.</p>
<hr />
<h2>CI/CD and Deployment</h2>
<p>Your pipeline builds Docker images, pushes them to a registry, and deploys them to Kubernetes.</p>
<p>Tencent Cloud can integrate with CI systems, but often external tools like GitHub Actions or self managed pipelines provide more flexibility.</p>
<p>The deployment flow looks like this:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/e92ae9fb-74ee-4e29-9255-f7a3b4ae9a46.png" alt="" style="display:block;margin:0 auto" />

<p>This part feels very similar to AWS and Azure workflows.</p>
<hr />
<h2>Key Differences from AWS and Azure</h2>
<p>Tencent Cloud is powerful, but the experience differs in important ways.</p>
<p>Documentation is not as consistent or detailed as AWS. You often need to experiment or translate concepts mentally.</p>
<p>Naming conventions are different which creates a small learning curve.</p>
<p>Networking behavior can feel stricter or less intuitive, especially with private clusters and routing.</p>
<p>The ICP requirement is something you never deal with in AWS or Azure global regions.</p>
<p>The console UI is functional but less polished compared to AWS.</p>
<p>On the positive side, Tencent Cloud integrates well with the Chinese ecosystem and provides strong performance in Asia.</p>
<hr />
<h2>Lessons Learned</h2>
<p>The biggest lesson is that infrastructure is not just about technology. It is also about geography and regulation.</p>
<p>Choosing the wrong region can block your entire system even if everything is technically correct.</p>
<p>Private clusters with bastion access provide strong security and should be the default for production setups.</p>
<p>Ingress based exposure is more efficient than multiple load balancers, especially when dealing with mixed traffic types.</p>
<p>Always design for flexibility. Being able to shift regions saved the entire setup.</p>
<hr />
<h2>Conclusion</h2>
<p>Tencent Cloud is a capable platform, but it requires a different mindset compared to AWS and Azure.</p>
<p>Once you understand the constraints, especially around ICP and networking, you can build robust and scalable systems.</p>
<p>The journey from Shanghai to Hong Kong was not just a migration. It was a deeper understanding of how cloud infrastructure interacts with real world regulations and architecture decisions.</p>
<p>If you approach Tencent Cloud with the right expectations, it becomes a powerful tool rather than a frustrating one.</p>
]]></content:encoded></item><item><title><![CDATA[Lessons Learned Building a CI Pipeline That Auto-Tags and Deploys Docker Images]]></title><description><![CDATA[When I first automated Docker builds and deployments, I thought the hard part would be writing the YAML. It was not.
The real challenges were versioning, preventing accidental rollbacks, handling envi]]></description><link>https://blog.nyzex.in/lessons-learned-building-a-ci-pipeline-that-auto-tags-and-deploys-docker-images</link><guid isPermaLink="true">https://blog.nyzex.in/lessons-learned-building-a-ci-pipeline-that-auto-tags-and-deploys-docker-images</guid><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[software development]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Tue, 24 Feb 2026 08:58:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/3af115d6-9bff-48e3-b520-539d90be9f95.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When I first automated Docker builds and deployments, I thought the hard part would be writing the YAML. It was not.</p>
<p>The real challenges were versioning, preventing accidental rollbacks, handling environment drift, and making deployments predictable. Over time, I built a CI pipeline that automatically tags Docker images, pushes them to a container registry, and deploys the latest version to a server without manual intervention.</p>
<p>This article walks through what worked, what broke, and what I learned while building a production-ready auto-tag and auto-deploy pipeline.</p>
<h2>The Goal</h2>
<p>The objective was simple:</p>
<ul>
<li><p>Every merge to the main branch should build a Docker image</p>
</li>
<li><p>The image should get a unique incrementing version tag</p>
</li>
<li><p>The image should be pushed to a container registry</p>
</li>
<li><p>The deployment server should pull the new version and restart the service automatically</p>
</li>
<li><p>No manual SSH, no manual tagging, no human version bumps</p>
</li>
</ul>
<p>The reality was more nuanced.</p>
<h2>The High-Level Architecture</h2>
<p>The system had four moving parts:</p>
<ul>
<li><p>Source repository</p>
</li>
<li><p>CI workflow</p>
</li>
<li><p>Container registry</p>
</li>
<li><p>Deployment server</p>
</li>
</ul>
<p>Here is the simplified flow.</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/ef65f440-4ade-4935-88a5-5399f7397ddc.png" alt="" style="display:block;margin:0 auto" />

<p>At first glance, this looks trivial. The devil was in version control and deployment consistency.</p>
<hr />
<h2>Problem: Manual Versioning Does Not Scale</h2>
<p>Initially, I hardcoded the image tag like this:</p>
<pre><code class="language-plaintext">myapp:latest
</code></pre>
<p>That worked until it didn’t.</p>
<p>Using <code>latest</code> creates ambiguity. If something breaks, you cannot easily roll back. You also do not know what code is actually running in production.</p>
<p>So I moved to semantic versioning:</p>
<pre><code class="language-plaintext">0.0.1
0.0.2
0.0.3
</code></pre>
<p>But manually updating the version before each commit quickly became annoying and error prone.</p>
<p>The fix was automatic version incrementing inside the CI pipeline.</p>
<hr />
<h2>Automatic Version Tagging Strategy</h2>
<p>The pipeline logic became:</p>
<ul>
<li><p>Fetch the latest tag from the registry</p>
</li>
<li><p>Parse the version</p>
</li>
<li><p>Increment the patch number</p>
</li>
<li><p>Tag the new image with the incremented version</p>
</li>
<li><p>Push it</p>
</li>
</ul>
<p>Conceptually:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/b448d8af-21f5-49c9-bd22-a8a0c9e3a3fb.png" alt="" style="display:block;margin:0 auto" />
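<p>The increment step itself is small. A sketch in Python (fetching the newest tag from the registry is registry-specific and omitted; <code>bump_patch</code> is a name I made up for illustration):</p>
<pre><code class="language-python">def bump_patch(tag):
    """Turn a 'MAJOR.MINOR.PATCH' tag into the next patch release."""
    major, minor, patch = (int(part) for part in tag.split("."))
    return f"{major}.{minor}.{patch + 1}"

latest = "0.0.7"              # newest tag found in the registry
next_tag = bump_patch(latest)
print(next_tag)               # 0.0.8
</code></pre>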

<p>This solved several problems:</p>
<ul>
<li><p>Every image became uniquely identifiable</p>
</li>
<li><p>Rollbacks became trivial</p>
</li>
<li><p>Production state became transparent</p>
</li>
</ul>
<p>One major lesson here was to avoid deriving version numbers from Git commit hashes for user-facing services. While hashes are unique, semantic versions are easier to reason about operationally.</p>
<h2>CI Pipeline Flow</h2>
<p>The CI pipeline was responsible for:</p>
<ul>
<li><p>Checking out the code</p>
</li>
<li><p>Logging into the registry</p>
</li>
<li><p>Building the Docker image</p>
</li>
<li><p>Tagging with the new version</p>
</li>
<li><p>Pushing both version tag and latest</p>
</li>
<li><p>Triggering deployment</p>
</li>
</ul>
<p>The full CI flow looked like this:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/c9653595-3087-4227-9b35-ed9d822638c3.png" alt="" style="display:block;margin:0 auto" />

<p>One key insight was pushing both the version tag and <code>latest</code>.</p>
<p>The version tag gives traceability. The <code>latest</code> tag simplifies pull logic on the server.</p>
<h2>Deployment Automation</h2>
<p>The deployment server had a simple responsibility:</p>
<ul>
<li><p>Pull the newest image</p>
</li>
<li><p>Restart the container</p>
</li>
</ul>
<p>At first, I used a naive approach:</p>
<pre><code class="language-plaintext">docker pull myapp:latest
docker-compose up -d
</code></pre>
<p>This works, but only if you are disciplined.</p>
<p>The issue appears when the image digest does not change or when the server has cached layers in a strange state.</p>
<p>A more robust flow became:</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/e1f334c3-f459-482b-a090-a866fe20f6cf.png" alt="" style="display:block;margin:0 auto" />

<p>This avoids unnecessary restarts and reduces downtime.</p>
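<p>The decision in that flow reduces to a digest comparison. A sketch (the digest strings would come from <code>docker inspect</code>; the helper is illustrative):</p>
<pre><code class="language-python">def should_restart(running_digest, pulled_digest):
    """Restart only when the pull actually brought a new image."""
    if running_digest is None:      # nothing running yet
        return True
    return running_digest != pulled_digest

print(should_restart("sha256:aaa", "sha256:aaa"))  # False, skip the restart
print(should_restart("sha256:aaa", "sha256:bbb"))  # True, redeploy
</code></pre>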
<h2>Avoiding Downtime</h2>
<p>Restarting a container blindly can cause momentary service disruption.</p>
<p>Two improvements helped:</p>
<ul>
<li><p>Health checks inside Docker</p>
</li>
<li><p>Graceful restart strategy</p>
</li>
</ul>
<p>Instead of stopping first and then starting, the improved approach was:</p>
<ul>
<li><p>Start new container</p>
</li>
<li><p>Verify health</p>
</li>
<li><p>Stop old container</p>
</li>
</ul>
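<p>A rough command-level sketch of that sequence, assuming the image defines a Docker <code>HEALTHCHECK</code> (container names and tag are illustrative):</p>
<pre><code class="language-plaintext"># start the new version alongside the old one
docker run -d --name myapp-new myapp:0.0.8

# poll until Docker reports it healthy
docker inspect --format '{{.State.Health.Status}}' myapp-new

# once healthy, retire the old container
docker stop myapp-old
docker rm myapp-old
</code></pre>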
<p>This pattern mimics blue green deployment at a smaller scale.</p>
<img src="https://cdn.hashnode.com/uploads/covers/68fb4c09020910660a830e8f/e840fa51-6975-4bcf-8acf-d1b1b7530793.png" alt="" style="display:block;margin:0 auto" />

<p>This reduced deployment risk significantly.</p>
<hr />
<h2>Security Lessons</h2>
<p>Several important security practices emerged:</p>
<ul>
<li><p>Never store registry credentials in plain text</p>
</li>
<li><p>Use CI secrets properly</p>
</li>
<li><p>Use short-lived tokens when possible</p>
</li>
<li><p>Restrict server SSH access</p>
</li>
</ul>
<p>Another key lesson was separating build and deploy permissions. The CI pipeline should not have unrestricted server access. Ideally, it triggers deployment via a webhook or controlled SSH user with limited privileges.</p>
<hr />
<h2>Observability Matters</h2>
<p>The first time a deployment silently failed, I realized logs were not optional.</p>
<p>You need:</p>
<ul>
<li><p>CI logs that clearly show version generated</p>
</li>
<li><p>Registry confirmation logs</p>
</li>
<li><p>Server deployment logs</p>
</li>
<li><p>Application startup logs</p>
</li>
</ul>
<p>Without observability, automation becomes guesswork.</p>
<hr />
<h2>Rollback Strategy</h2>
<p>One of the biggest advantages of version tagging is clean rollback.</p>
<p>If production breaks:</p>
<pre><code class="language-plaintext">docker pull myapp:0.0.7
docker run myapp:0.0.7
</code></pre>
<p>No rebuild required.</p>
<p>Rollback becomes a configuration change rather than a panic-driven patch.</p>
<hr />
<h2>What I Would Do Differently</h2>
<p>If building from scratch again:</p>
<ul>
<li><p>Use immutable image references by digest in production</p>
</li>
<li><p>Introduce deployment locking to prevent concurrent runs</p>
</li>
<li><p>Add structured logging for CI</p>
</li>
<li><p>Add automated smoke tests after deployment</p>
</li>
</ul>
<p>Automation is not about speed alone. It is about predictability.</p>
<hr />
<h2>Conclusion</h2>
<p>Building an auto-tagging and auto-deploy CI pipeline sounds simple. It is not.</p>
<p>The complexity lies in:</p>
<ul>
<li><p>Version consistency</p>
</li>
<li><p>Deployment safety</p>
</li>
<li><p>Rollback reliability</p>
</li>
<li><p>Security boundaries</p>
</li>
<li><p>Observability</p>
</li>
</ul>
<p>Once implemented correctly, the workflow changes how you ship software. Deployments stop being events and start becoming routine.</p>
<p>If you are still manually tagging Docker images or SSHing into servers to deploy, start automating today. That shift in mindset is the real upgrade.</p>
]]></content:encoded></item><item><title><![CDATA[What I Learned Migrating a Real App from Docker Compose to Kubernetes]]></title><description><![CDATA[For a long time, Docker Compose felt like the perfect solution. Simple YAML, fast local setup, predictable behavior. For a single service or even a small stack, it works beautifully.
But at some point, reality catches up.
As the application grew, tra...]]></description><link>https://blog.nyzex.in/what-i-learned-migrating-a-real-app-from-docker-compose-to-kubernetes</link><guid isPermaLink="true">https://blog.nyzex.in/what-i-learned-migrating-a-real-app-from-docker-compose-to-kubernetes</guid><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[Developer]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Thu, 12 Feb 2026 10:25:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891894557/3014893b-4e5d-4122-a02c-e1ab204fc812.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>For a long time, Docker Compose felt like the perfect solution. Simple YAML, fast local setup, predictable behavior. For a single service or even a small stack, it works beautifully.</p>
<p>But at some point, reality catches up.</p>
<p>As the application grew, traffic became less predictable, deployments needed to be safer, and uptime started to matter more than convenience. That was the point where migrating from Docker Compose to Kubernetes stopped being optional and became inevitable.</p>
<p>This post is not a Kubernetes tutorial. It’s a reflection on what actually changed, what broke, and what I wish I had understood earlier before making the move.</p>
<hr />
<h2 id="heading-docker-compose-worked-until-it-didnt">Docker Compose Worked Until It Didn’t</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891455318/80ad2f1d-95d7-4808-a1b2-b6f5e711388c.png" alt class="image--center mx-auto" /></p>
<p>Docker Compose is excellent at answering one question:<br />“How do I run multiple containers together on one machine?”</p>
<p>The problems started when my needs shifted to different questions:</p>
<ul>
<li><p>How do I scale only one service without touching others?</p>
</li>
<li><p>How do I deploy without downtime?</p>
</li>
<li><p>How do I expose HTTP and raw TCP services reliably?</p>
</li>
<li><p>How do I survive a node restart without manual intervention?</p>
</li>
</ul>
<p>Compose can technically handle some of this, but only with scripts, conventions, and a lot of discipline. Over time, the setup became fragile. A single bad deploy could take everything down.</p>
<p>Kubernetes didn’t magically solve these problems, but it gave me primitives that were designed for them.</p>
<hr />
<h2 id="heading-the-biggest-mental-shift-stop-thinking-in-containers">The Biggest Mental Shift: Stop Thinking in Containers</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891569122/8340a6a8-86d0-4c6b-8dd1-e8f8e8312af6.png" alt class="image--center mx-auto" /></p>
<p>In Docker Compose, you think in containers.<br />In Kubernetes, you think in systems.</p>
<p>At first, this was uncomfortable. I kept asking questions like:</p>
<ul>
<li><p>Where is my container?</p>
</li>
<li><p>Why did Kubernetes restart it?</p>
</li>
<li><p>Why are there three replicas when I only started one?</p>
</li>
</ul>
<p>Eventually, I realized Kubernetes doesn’t care about my containers. It cares about desired state.</p>
<p>Once I stopped fighting that idea and started defining what I wanted instead of how to do it, things clicked.</p>
<p>I no longer deployed containers. I declared intentions.</p>
<hr />
<h2 id="heading-configuration-management-became-a-first-class-concern">Configuration Management Became a First-Class Concern</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891580762/e41582e2-9687-45dc-883a-92866fa52382.png" alt class="image--center mx-auto" /></p>
<p>In Compose, environment variables often live in <code>.env</code> files or directly in YAML. That works until you have multiple environments, secrets, and rotating credentials.</p>
<p>Kubernetes forced me to clean this up.</p>
<p>ConfigMaps and Secrets felt verbose at first, but they created a clean separation:</p>
<ul>
<li><p>Application code stopped knowing where configuration came from</p>
</li>
<li><p>Sensitive values were no longer mixed with runtime logic</p>
</li>
<li><p>Environment differences became explicit instead of accidental</p>
</li>
</ul>
<p>This alone reduced production mistakes more than any CI rule I had before.</p>
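<p>As a hedged sketch of that separation (all names and values here are illustrative, not from my real cluster): non-sensitive settings go in a ConfigMap while credentials live in a Secret.</p>

```sh
# Non-sensitive config and a base64-encoded credential, kept apart.
# "app-config", "app-secrets", and the values are hypothetical.
DB_PASSWORD_B64=$(printf 'example-password' | base64)

cat > app-config.yaml <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  DB_HOST: "postgres.internal"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
data:
  DB_PASSWORD: "${DB_PASSWORD_B64}"
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f app-config.yaml || true
fi
```

<p>The application consumes both as environment variables or mounted files and never needs to know which store a value came from.</p>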
<hr />
<h2 id="heading-networking-was-simpler-and-more-complicated-at-the-same-time">Networking Was Simpler and More Complicated at the Same Time</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891708608/b7acf759-fe9e-4c74-9f17-aaca7af8c85d.png" alt class="image--center mx-auto" /></p>
<p>This was one of the most surprising parts.</p>
<p>Inside the cluster, service discovery is easier than Docker Compose. Every service gets a stable DNS name. Containers can restart, reschedule, or scale without breaking internal communication.</p>
<p>But ingress is where things got interesting.</p>
<p>Exposing HTTP traffic was straightforward once I adopted an ingress controller. TLS termination, routing, and host-based rules became declarative and repeatable.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891730142/ad984e9c-96dc-48f6-b4ff-f721c670d22b.png" alt class="image--center mx-auto" /></p>
<p>Exposing raw TCP ports was harder. This is something Compose hides from you. In Kubernetes, you must understand:</p>
<ul>
<li><p>Services</p>
</li>
<li><p>NodePorts</p>
</li>
<li><p>LoadBalancers</p>
</li>
<li><p>Ingress TCP mappings</p>
</li>
</ul>
<p>Once configured correctly, it was more reliable than my old setup. Getting there required patience and a lot of reading logs.</p>
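<p>For example, with the ingress-nginx controller, raw TCP ports are exposed through a dedicated ConfigMap that maps an external port to <code>namespace/service:port</code>. This is a hedged sketch; the <code>db/postgres</code> service and port are hypothetical:</p>

```sh
# Map external port 5432 to the "postgres" service in the "db"
# namespace (hypothetical names). ingress-nginx must be started with
# --tcp-services-configmap pointing at this ConfigMap.
cat > tcp-services.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services
  namespace: ingress-nginx
data:
  "5432": "db/postgres:5432"
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f tcp-services.yaml || true
fi
```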
<hr />
<h2 id="heading-scaling-is-not-just-replicas">Scaling Is Not Just Replicas</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891667148/8ec93e3b-255c-41f5-9fef-41db763be789.png" alt class="image--center mx-auto" /></p>
<p>In Compose, scaling usually means <code>--scale</code> and hoping nothing breaks.</p>
<p>In Kubernetes, scaling forced me to confront assumptions I didn’t know I had:</p>
<ul>
<li><p>Is my app stateless?</p>
</li>
<li><p>What happens if two replicas process the same request?</p>
</li>
<li><p>Where does session data live?</p>
</li>
<li><p>What happens during rolling updates?</p>
</li>
</ul>
<p>Horizontal Pod Autoscaling was powerful, but it exposed poor application design immediately. Anything relying on local state or filesystem assumptions broke fast.</p>
<p>This pain was useful. It forced architectural improvements that made the system more resilient overall.</p>
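<p>A minimal autoscaler sketch for a hypothetical stateless <code>web</code> Deployment (the thresholds are illustrative):</p>

```sh
# Scale between 2 and 10 replicas, targeting 70% average CPU.
cat > web-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f web-hpa.yaml || true
fi
```

<p>None of this helps if replicas share local state: the autoscaler assumes any replica can serve any request.</p>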
<hr />
<h2 id="heading-health-checks-are-not-optional-anymore">Health Checks Are Not Optional Anymore</h2>
<p>Docker Compose lets unhealthy services limp along. Kubernetes does not.</p>
<p>Once liveness and readiness probes were in place, I learned two important lessons:</p>
<ul>
<li><p>A service can be running and still be unusable</p>
</li>
<li><p>Restarting a container is often better than keeping it alive</p>
</li>
</ul>
<p>Bad health checks caused cascading failures. Good ones made deployments boring. Boring deployments are the goal.</p>
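<p>As a sketch, probes for a hypothetical <code>web</code> Deployment can be added with a strategic-merge patch (paths, ports, and timings are illustrative):</p>

```sh
# Liveness restarts a stuck container; readiness gates traffic to it.
cat > probes-patch.yaml <<'EOF'
spec:
  template:
    spec:
      containers:
        - name: web
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
EOF

if command -v kubectl >/dev/null 2>&1; then
  kubectl patch deployment web --patch-file probes-patch.yaml || true
fi
```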
<hr />
<h2 id="heading-cicd-became-cleaner-and-more-predictable">CI/CD Became Cleaner and More Predictable</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1770891627712/e556b4fe-174a-4d5f-9c96-d23a64758306.png" alt class="image--center mx-auto" /></p>
<p>Before Kubernetes, deployments were procedural.<br />Run this script. Pull this image. Restart that service.</p>
<p>After Kubernetes, deployments became declarative:</p>
<ul>
<li><p>Build image</p>
</li>
<li><p>Push image</p>
</li>
<li><p>Update manifest</p>
</li>
<li><p>Let the cluster reconcile</p>
</li>
</ul>
<p>This reduced the surface area for human error. If something went wrong, Kubernetes told me what failed and why. Logs, events, and pod states became a reliable source of truth instead of guesswork.</p>
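<p>Those four steps can be sketched as a dry-run deploy script. It only prints each command; the registry, image, and deployment names are hypothetical placeholders:</p>

```sh
# Print the pipeline instead of executing it; swap `echo "+ $*"` for
# `"$@"` to run the commands for real.
IMAGE="registry.example.com/web:${GIT_SHA:-dev}"

run() { echo "+ $*"; }

run docker build -t "$IMAGE" .                     # 1. build image
run docker push "$IMAGE"                           # 2. push image
run kubectl set image deployment/web web="$IMAGE"  # 3. update desired state
run kubectl rollout status deployment/web          # 4. wait for reconciliation
```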
<hr />
<h2 id="heading-kubernetes-did-not-reduce-complexity-it-reorganized-it">Kubernetes Did Not Reduce Complexity, It Reorganized It</h2>
<p>This is important to say clearly.</p>
<p>Kubernetes did not make the system simpler. It made the complexity explicit.</p>
<p>Things that were previously hidden inside scripts, assumptions, and tribal knowledge were now written down in YAML. That felt heavy at first, but it also meant:</p>
<ul>
<li><p>New environments were reproducible</p>
</li>
<li><p>Failures were diagnosable</p>
</li>
<li><p>Scaling decisions were intentional</p>
</li>
</ul>
<p>The complexity existed before. Kubernetes just stopped pretending it didn’t.</p>
<hr />
<h2 id="heading-what-i-would-do-differently-next-time">What I Would Do Differently Next Time</h2>
<p>I would start smaller.</p>
<p>Instead of migrating everything at once, I would:</p>
<ul>
<li><p>Move one stateless service first</p>
</li>
<li><p>Get ingress and TLS right early</p>
</li>
<li><p>Invest in logging and metrics from day one</p>
</li>
<li><p>Treat manifests as code, not configuration</p>
</li>
</ul>
<p>Most importantly, I would spend more time understanding Kubernetes concepts before trying to bend them into old patterns.</p>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Docker Compose is not bad. It’s just honest about what it is.</p>
<p>Kubernetes is not overkill when your system starts needing guarantees instead of convenience.</p>
<p>The migration was not smooth, but it was worth it. Not because Kubernetes is trendy, but because it forced better engineering decisions that I had been postponing.</p>
<p>If you are feeling friction with Docker Compose, that friction is a signal. Listen to it.</p>
]]></content:encoded></item><item><title><![CDATA[Running Apache Flink on Kubernetes: From Zero to a Fully Utilized Cluster]]></title><description><![CDATA[This blog walks through Apache Flink end to end, starting from what Flink is, how its architecture works, and how to deploy and properly utilize a Kubernetes cluster using Flink’s standalone Kubernetes mode. The goal is not just to get Flink running,...]]></description><link>https://blog.nyzex.in/running-apache-flink-on-kubernetes-from-zero-to-a-fully-utilized-cluster</link><guid isPermaLink="true">https://blog.nyzex.in/running-apache-flink-on-kubernetes-from-zero-to-a-fully-utilized-cluster</guid><category><![CDATA[Devops]]></category><category><![CDATA[flink]]></category><category><![CDATA[apache]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k0s]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Sun, 18 Jan 2026 19:05:39 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768763003680/5bd6791e-0717-4b9b-a26d-6eaab0204f5a.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This blog walks through Apache Flink end to end, starting from what Flink is, how its architecture works, and how to deploy and properly utilize a Kubernetes cluster using Flink’s standalone Kubernetes mode. The goal is not just to get Flink running, but to make sure it runs <em>correctly</em>, efficiently, and in a way that matches how Flink is designed to work.</p>
<p>This guide is based on a real Kubernetes cluster with one control plane and two worker nodes, each with roughly 8 GB RAM.</p>
<hr />
<h2 id="heading-what-is-apache-flink">What is Apache Flink</h2>
<p>Apache Flink is a distributed stream and batch processing engine designed for stateful, low-latency, high-throughput data processing. Unlike traditional batch systems, Flink treats streaming as the primary model, with batch being a special case of bounded streams.</p>
<p>Key properties of Flink:</p>
<ul>
<li><p>True streaming engine, not micro-batching</p>
</li>
<li><p>Stateful processing with exactly-once guarantees</p>
</li>
<li><p>Event-time processing and watermarks</p>
</li>
<li><p>Horizontal scalability</p>
</li>
<li><p>Fault tolerance via checkpoints and state backends</p>
</li>
</ul>
<p>Flink is commonly used for real-time analytics, event-driven applications, fraud detection, metrics aggregation, and complex event processing.</p>
<hr />
<h2 id="heading-core-flink-architecture">Core Flink Architecture</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768767510476/c49bd565-ef67-4ddb-b9ec-b7da5e5a9bfd.png" alt class="image--center mx-auto" /></p>
<p>A Flink cluster is composed of a small number of well-defined components.</p>
<h3 id="heading-jobmanager">JobManager</h3>
<p>The JobManager is the brain of the cluster.</p>
<p>It is responsible for:</p>
<ul>
<li><p>Accepting jobs</p>
</li>
<li><p>Creating execution graphs</p>
</li>
<li><p>Scheduling tasks</p>
</li>
<li><p>Coordinating checkpoints</p>
</li>
<li><p>Handling failures and restarts</p>
</li>
</ul>
<p>Only one active JobManager exists at a time in standalone mode.</p>
<hr />
<h3 id="heading-taskmanager">TaskManager</h3>
<p>TaskManagers are the workers of the Flink cluster.</p>
<p>Each TaskManager:</p>
<ul>
<li><p>Runs tasks (operators)</p>
</li>
<li><p>Manages task slots</p>
</li>
<li><p>Executes user code</p>
</li>
<li><p>Maintains local state</p>
</li>
</ul>
<p>A TaskManager exposes a fixed number of <em>task slots</em>. Slots are the unit of parallelism in Flink.</p>
<hr />
<h3 id="heading-slots-and-parallelism">Slots and Parallelism</h3>
<p>A slot represents a share of a TaskManager’s resources.</p>
<p>Total available parallelism is:</p>
<p>TaskManagers × Slots per TaskManager</p>
<p>For example:</p>
<ul>
<li><p>4 TaskManagers</p>
</li>
<li><p>2 slots each</p>
</li>
<li><p>Total parallelism = 8</p>
</li>
</ul>
<p>Jobs can only run with parallelism up to the available slots.</p>
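<p>The same arithmetic as a quick shell check, using the numbers from the example above:</p>

```sh
# Total parallelism = TaskManagers x slots per TaskManager.
TASKMANAGERS=4
SLOTS_PER_TM=2
TOTAL=$((TASKMANAGERS * SLOTS_PER_TM))
echo "max parallelism: $TOTAL"   # prints: max parallelism: 8
```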
<hr />
<h2 id="heading-why-kubernetes-for-flink">Why Kubernetes for Flink</h2>
<p>Kubernetes provides a natural runtime for Flink because:</p>
<ul>
<li><p>Pods map cleanly to JobManager and TaskManager</p>
</li>
<li><p>Kubernetes scheduler handles placement</p>
</li>
<li><p>Native scaling via replicas</p>
</li>
<li><p>Built-in service discovery</p>
</li>
<li><p>Persistent volumes for state</p>
</li>
</ul>
<p>Flink supports Kubernetes in multiple ways. In this blog we use <strong>Standalone Kubernetes mode</strong>, where Flink runs continuously as a cluster inside Kubernetes.</p>
<hr />
<h2 id="heading-cluster-prerequisites">Cluster Prerequisites</h2>
<p>The Kubernetes cluster used here:</p>
<ul>
<li><p>1 control plane node</p>
</li>
<li><p>2 worker nodes</p>
</li>
<li><p>~8 GB RAM per worker</p>
</li>
<li><p>containerd runtime</p>
</li>
<li><p>local-path storage provisioner</p>
</li>
</ul>
<p>Both worker nodes are labeled to allow Flink scheduling:</p>
<pre><code class="lang-bash">kubectl label node nyzex-worker-node1 flink-role=worker
kubectl label node nyzex-worker-node2 flink-role=worker
</code></pre>
<hr />
<h2 id="heading-namespace-setup">Namespace Setup</h2>
<p>Create a dedicated namespace for Flink.</p>
<pre><code class="lang-bash">kubectl create namespace flink
</code></pre>
<p>This keeps Flink resources isolated and easier to manage.</p>
<hr />
<h2 id="heading-persistent-storage-for-flink">Persistent Storage for Flink</h2>
<p>Flink requires persistent storage for:</p>
<ul>
<li><p>Checkpoints</p>
</li>
<li><p>Savepoints</p>
</li>
<li><p>High availability metadata (optional)</p>
</li>
</ul>
<p>Using a PersistentVolumeClaim allows Kubernetes to dynamically provision storage.</p>
<h3 id="heading-pvc-definition">PVC Definition</h3>
<pre><code class="lang-yaml">apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flink-storage
  namespace: flink
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: local-path
</code></pre>
<p>Apply it:</p>
<pre><code class="lang-bash">kubectl apply -f flink-pvc.yaml
</code></pre>
<p>With <code>WaitForFirstConsumer</code>, the volume binds only after a pod requests it. This is expected behavior.</p>
<hr />
<h2 id="heading-flink-configuration">Flink Configuration</h2>
<p>Flink is configured using <code>flink-conf.yaml</code> mounted into the pods via ConfigMap.</p>
<p>Key configuration:</p>
<pre><code class="lang-yaml">jobmanager.rpc.address: flink-jobmanager

state.backend: filesystem
state.checkpoints.dir: file:///opt/flink/state/checkpoints
state.savepoints.dir: file:///opt/flink/state/savepoints

execution.checkpointing.interval: 10s

parallelism.default: 2

kubernetes.taskmanager.node-selector.flink-role: worker
kubernetes.jobmanager.node-selector.flink-role: worker
</code></pre>
<p>This ensures:</p>
<ul>
<li><p>State is persisted</p>
</li>
<li><p>Checkpointing is enabled</p>
</li>
<li><p>Pods run only on worker nodes</p>
</li>
</ul>
<p>What I used:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">flink-config</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">flink</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">flink-conf.yaml:</span> <span class="hljs-string">|
    jobmanager.rpc.address: flink-jobmanager
    taskmanager.numberOfTaskSlots: 2
    parallelism.default: 8
</span>
    <span class="hljs-comment"># Memory</span>
    <span class="hljs-attr">jobmanager.memory.process.size:</span> <span class="hljs-string">1024m</span>
    <span class="hljs-attr">taskmanager.memory.process.size:</span> <span class="hljs-string">3g</span>

    <span class="hljs-comment"># State backend</span>
    <span class="hljs-attr">state.backend:</span> <span class="hljs-string">rocksdb</span>
    <span class="hljs-attr">state.backend.incremental:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">state.checkpoints.dir:</span> <span class="hljs-string">file:///flink-data/checkpoints</span>
    <span class="hljs-attr">state.savepoints.dir:</span> <span class="hljs-string">file:///flink-data/savepoints</span>

    <span class="hljs-attr">execution.checkpointing.interval:</span> <span class="hljs-string">60s</span>
    <span class="hljs-attr">execution.checkpointing.min-pause:</span> <span class="hljs-string">30s</span>
    <span class="hljs-attr">execution.checkpointing.timeout:</span> <span class="hljs-string">10m</span>
</code></pre>
<hr />
<h2 id="heading-jobmanager-deployment">JobManager Deployment</h2>
<p>The JobManager runs as a Deployment with a single replica.</p>
<p>Key points:</p>
<ul>
<li><p>Uses the Flink image</p>
</li>
<li><p>Exposes RPC and Web UI ports</p>
</li>
<li><p>Mounts persistent storage</p>
</li>
<li><p>Uses a node selector</p>
</li>
</ul>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
  namespace: flink
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
        - name: jobmanager
          image: flink:1.18
          args: ["jobmanager"]
          ports:
            - containerPort: 6123
            - containerPort: 8081
          volumeMounts:
            - name: flink-config-volume
              mountPath: /opt/flink/conf
            - name: flink-storage
              mountPath: /flink-data
          resources:
            requests:
              memory: "1Gi"
              cpu: "1"
            limits:
              memory: "1Gi"
              cpu: "1"
      volumes:
        - name: flink-config-volume
          configMap:
            name: flink-config
        - name: flink-storage
          persistentVolumeClaim:
            claimName: flink-storage
</code></pre>
<hr />
<h2 id="heading-jobmanager-service">JobManager Service</h2>
<p>A Kubernetes Service exposes the JobManager internally.</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager
  namespace: flink
spec:
  ports:
    - name: rpc
      port: 6123
    - name: webui
      port: 8081
  selector:
    app: flink
    component: jobmanager
</code></pre>
<hr />
<h2 id="heading-taskmanager-deployment">TaskManager Deployment</h2>
<p>TaskManagers scale horizontally using replicas.</p>
<p>Important aspects:</p>
<ul>
<li><p>Resource requests and limits</p>
</li>
<li><p>Slot configuration</p>
</li>
<li><p>Pod anti-affinity for spreading</p>
</li>
</ul>
<pre><code class="lang-yaml">apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-taskmanager
  namespace: flink
spec:
  replicas: 4
  selector:
    matchLabels:
      app: flink
      component: taskmanager
  template:
    metadata:
      labels:
        app: flink
        component: taskmanager
    spec:
      nodeSelector:
        flink-role: worker
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    component: taskmanager
                topologyKey: kubernetes.io/hostname
      containers:
        - name: taskmanager
          image: flink:1.18
          args: ["taskmanager"]
          env:
            - name: TASK_MANAGER_NUMBER_OF_TASK_SLOTS
              value: "2"
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
            limits:
              memory: "3Gi"
              cpu: "2"
</code></pre>
<p>What I used:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">flink-taskmanager</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">flink</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">4</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">flink</span>
      <span class="hljs-attr">component:</span> <span class="hljs-string">taskmanager</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">flink</span>
        <span class="hljs-attr">component:</span> <span class="hljs-string">taskmanager</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">affinity:</span>
        <span class="hljs-attr">podAntiAffinity:</span>
          <span class="hljs-attr">preferredDuringSchedulingIgnoredDuringExecution:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">weight:</span> <span class="hljs-number">100</span>
              <span class="hljs-attr">podAffinityTerm:</span>
                <span class="hljs-attr">labelSelector:</span>
                  <span class="hljs-attr">matchLabels:</span>
                    <span class="hljs-attr">component:</span> <span class="hljs-string">taskmanager</span>
                <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">kubernetes.io/hostname</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">taskmanager</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">flink:1.18</span>
          <span class="hljs-attr">args:</span> [<span class="hljs-string">"taskmanager"</span>]
          <span class="hljs-attr">resources:</span>
            <span class="hljs-attr">requests:</span>
              <span class="hljs-attr">memory:</span> <span class="hljs-string">"3Gi"</span>
              <span class="hljs-attr">cpu:</span> <span class="hljs-string">"2"</span>
            <span class="hljs-attr">limits:</span>
              <span class="hljs-attr">memory:</span> <span class="hljs-string">"3Gi"</span>
              <span class="hljs-attr">cpu:</span> <span class="hljs-string">"2"</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">flink-config-volume</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/opt/flink/conf</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">flink-tmp</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/tmp</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">flink-config-volume</span>
          <span class="hljs-attr">configMap:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">flink-config</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">flink-tmp</span>
          <span class="hljs-attr">emptyDir:</span> {}
</code></pre>
<p>This configuration allows Kubernetes to spread TaskManagers across both worker nodes.</p>
<h2 id="heading-some-important-queries-and-information">Some important queries and information:</h2>
<h3 id="heading-why-taskmanagers-are-getting-distributed-across-nodes">Why TaskManagers get distributed across nodes:</h3>
<p>TaskManagers are distributed because:</p>
<ul>
<li><p>Kubernetes schedules pods</p>
</li>
<li><p>You allowed scheduling on both worker nodes</p>
</li>
<li><p>Flink TaskManagers are <strong>stateless compute workers</strong></p>
</li>
<li><p>You added <strong>anti-affinity</strong>, so Kubernetes spreads them</p>
</li>
</ul>
<p>Flink itself does <strong>not</strong> decide node placement. Kubernetes does.</p>
<h3 id="heading-who-decides-where-a-taskmanager-runs">Who decides where a TaskManager runs?</h3>
<h3 id="heading-kubernetes-scheduler-not-flink">Kubernetes scheduler, not Flink</h3>
<p>When you create this:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">replicas:</span> <span class="hljs-number">4</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
</code></pre>
<p>You are telling Kubernetes:</p>
<blockquote>
<p>“I want 4 identical TaskManager pods.”</p>
</blockquote>
<p>Kubernetes then:</p>
<ul>
<li><p>Looks at available nodes</p>
</li>
<li><p>Checks nodeSelector</p>
</li>
<li><p>Checks resource requests</p>
</li>
<li><p>Applies affinity rules</p>
</li>
<li><p>Chooses nodes</p>
</li>
</ul>
<p>Flink only sees:</p>
<blockquote>
<p>“I now have 4 TaskManagers connected to me.”</p>
</blockquote>
<hr />
<h3 id="heading-why-they-dont-all-land-on-one-node-anymore">Why they don’t all land on one node anymore:</h3>
<p>Once this is added:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">podAntiAffinity:</span>
  <span class="hljs-attr">preferredDuringSchedulingIgnoredDuringExecution:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">weight:</span> <span class="hljs-number">100</span>
      <span class="hljs-attr">podAffinityTerm:</span>
        <span class="hljs-attr">labelSelector:</span>
          <span class="hljs-attr">matchLabels:</span>
            <span class="hljs-attr">component:</span> <span class="hljs-string">taskmanager</span>
        <span class="hljs-attr">topologyKey:</span> <span class="hljs-string">kubernetes.io/hostname</span>
</code></pre>
<p>Result:</p>
<ul>
<li><p>Pods spread evenly</p>
</li>
<li><p>2 TaskManagers on worker-node1</p>
</li>
<li><p>2 TaskManagers on worker-node2</p>
</li>
</ul>
<hr />
<h3 id="heading-why-taskmanagers-do-not-use-the-pvc">Why TaskManagers do NOT use the PVC</h3>
<h3 id="heading-key-principle">Key principle</h3>
<p><strong>TaskManagers are ephemeral compute.</strong><br />They should be disposable.</p>
<p>Flink is designed so that:</p>
<ul>
<li><p>TaskManagers can die at any time</p>
</li>
<li><p>State is NOT tied to a specific TaskManager pod</p>
</li>
</ul>
<p>So by default:</p>
<ul>
<li><p>TaskManagers do NOT mount persistent volumes</p>
</li>
<li><p>TaskManagers use <strong>local ephemeral storage</strong></p>
</li>
<li><p>Persistent state lives elsewhere</p>
</li>
</ul>
<p>This is intentional.</p>
<hr />
<h3 id="heading-so-what-is-the-pvc-actually-used-for">So what is the PVC actually used for?</h3>
<p>In this setup, the PVC is mounted <strong>only on the JobManager</strong>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">volumeMounts:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">flink-storage</span>
    <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/opt/flink/state</span>
</code></pre>
<p>And configured in <code>flink-conf.yaml</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">state.backend:</span> <span class="hljs-string">filesystem</span>
<span class="hljs-attr">state.checkpoints.dir:</span> <span class="hljs-string">file:///opt/flink/state/checkpoints</span>
<span class="hljs-attr">state.savepoints.dir:</span> <span class="hljs-string">file:///opt/flink/state/savepoints</span>
</code></pre>
<p>This means:</p>
<ul>
<li><p>Checkpoints are written to the PVC</p>
</li>
<li><p>Savepoints are written to the PVC</p>
</li>
<li><p>Job metadata survives pod restarts</p>
</li>
</ul>
<p>During a checkpoint:</p>
<ol>
<li><p>TaskManagers snapshot their local state</p>
</li>
<li><p>Snapshots are acknowledged to the JobManager (small state can travel with the acknowledgement; larger state files are written to the checkpoint directory)</p>
</li>
<li><p>The JobManager persists the checkpoint metadata to the shared storage</p>
</li>
</ol>
<p>If a TaskManager dies:</p>
<ul>
<li><p>Kubernetes restarts it</p>
</li>
<li><p>Flink restores state from the checkpoint directory</p>
</li>
<li><p>Processing resumes</p>
</li>
</ul>
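<p>A quick way to confirm this wiring, run against the live cluster (adjust the path if your checkpoint directory differs from the one configured above):</p>

```sh
# List persisted checkpoints on the JobManager's mounted volume.
# Guarded so the sketch is harmless without cluster access.
CKPT_DIR=/opt/flink/state/checkpoints
if command -v kubectl >/dev/null 2>&1; then
  kubectl exec -n flink deploy/flink-jobmanager -- ls "$CKPT_DIR" || true
fi
```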
<hr />
<h2 id="heading-verifying-cluster-distribution">Verifying Cluster Distribution</h2>
<p>After deployment:</p>
<pre><code class="lang-bash">kubectl get pods -n flink -o wide
</code></pre>
<p>Expected result:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768761856924/ddf36eae-b1fe-4e03-9005-9124400bc17c.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>JobManager on one worker</p>
</li>
<li><p>TaskManagers evenly split across nodes</p>
</li>
<li><p>No Pending pods</p>
</li>
</ul>
<p>This confirms proper scheduling and full cluster utilization.</p>
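<p>To count TaskManagers per node, the wide output can be fed through <code>awk</code>. Here a captured sample stands in for live output (the pod hashes are made up); against the cluster, pipe <code>kubectl get pods -n flink -o wide</code> in instead:</p>

```sh
# Sample `kubectl get pods -o wide` output (column 7 is NODE).
cat > pods.txt <<'EOF'
NAME                           READY  STATUS   RESTARTS  AGE  IP          NODE
flink-jobmanager-6f7c9-x2v4q   1/1    Running  0         5m   10.244.1.5  nyzex-worker-node1
flink-taskmanager-5b8d7-aaaaa  1/1    Running  0         5m   10.244.1.6  nyzex-worker-node1
flink-taskmanager-5b8d7-bbbbb  1/1    Running  0         5m   10.244.2.4  nyzex-worker-node2
flink-taskmanager-5b8d7-ccccc  1/1    Running  0         5m   10.244.1.7  nyzex-worker-node1
flink-taskmanager-5b8d7-ddddd  1/1    Running  0         5m   10.244.2.5  nyzex-worker-node2
EOF

# Count TaskManager pods per node; an even split confirms the
# anti-affinity rule is doing its job.
awk 'NR > 1 && $1 ~ /taskmanager/ { count[$7]++ }
     END { for (n in count) print n, count[n] }' pods.txt | sort
```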
<hr />
<h2 id="heading-accessing-flink-web-ui">Accessing Flink Web UI</h2>
<p>Port-forward the JobManager service:</p>
<pre><code class="lang-bash">kubectl port-forward svc/flink-jobmanager 8081:8081 -n flink
</code></pre>
<p>Open in browser:</p>
<pre><code class="lang-bash">http://localhost:8081
</code></pre>
<p>You should see:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768761934517/90866881-bfec-4edc-8f01-f3b29917530b.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>All TaskManagers registered</p>
</li>
<li><p>Total available slots</p>
</li>
<li><p>Healthy cluster status</p>
</li>
</ul>
<hr />
<h2 id="heading-running-a-sample-job">Running a Sample Job</h2>
<p>Run a built-in Flink example:</p>
<pre><code class="lang-bash">kubectl exec -n flink deploy/flink-jobmanager -- \
  flink run /opt/flink/examples/streaming/WordCount.jar
</code></pre>
<p>Observe task distribution in the UI.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768761996811/772c46d1-e1a4-4aff-8710-c8cf1cfd87c4.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-how-kubernetes-and-flink-work-together">How Kubernetes and Flink Work Together</h2>
<p>Kubernetes handles:</p>
<ul>
<li><p>Pod scheduling</p>
</li>
<li><p>Resource isolation</p>
</li>
<li><p>Restarting failed pods</p>
</li>
<li><p>Networking</p>
</li>
</ul>
<p>Flink handles:</p>
<ul>
<li><p>Task scheduling</p>
</li>
<li><p>State management</p>
</li>
<li><p>Checkpoints</p>
</li>
<li><p>Fault recovery</p>
</li>
</ul>
<p>This separation keeps responsibilities clean and scalable.</p>
<hr />
<h2 id="heading-what-makes-this-production-ready">What Makes This Production-Ready</h2>
<p>This setup already includes:</p>
<ul>
<li><p>Persistent state</p>
</li>
<li><p>Checkpointing</p>
</li>
<li><p>Horizontal scalability</p>
</li>
<li><p>Proper pod distribution</p>
</li>
</ul>
<p>Next improvements can include:</p>
<ul>
<li><p>High availability JobManager</p>
</li>
<li><p>External state backend (S3, MinIO)</p>
</li>
<li><p>Kafka integration</p>
</li>
<li><p>Prometheus metrics</p>
</li>
<li><p>Autoscaling</p>
</li>
</ul>
<hr />
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Running Apache Flink on Kubernetes is not just about starting pods. Correct scheduling, slot planning, storage configuration, and understanding Flink’s execution model are critical. With this setup, your cluster resources are fully utilized, workloads scale correctly, and you are ready to run real streaming jobs with confidence.</p>
]]></content:encoded></item><item><title><![CDATA[Apache Flink, Kubernetes, and How It Works]]></title><description><![CDATA[Imagine you have a factory that processes things. Flink is like that factory, and Kubernetes is like the factory floor manager that decides where machines go and how they run.

1. What is Flink?
Apache Flink is a distributed data processing engine.

...]]></description><link>https://blog.nyzex.in/apache-flink-kubernetes-and-how-it-works</link><guid isPermaLink="true">https://blog.nyzex.in/apache-flink-kubernetes-and-how-it-works</guid><category><![CDATA[apache]]></category><category><![CDATA[flink]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Wed, 14 Jan 2026 18:30:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768764797099/f2dae534-2876-4963-a8f5-447439d9c467.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine you have a <strong>factory</strong> that processes things. Flink is like that <strong>factory</strong>, and Kubernetes is like the <strong>factory floor manager</strong> that decides where machines go and how they run.</p>
<hr />
<h2 id="heading-1-what-is-flink">1. What is Flink?</h2>
<p>Apache Flink is a <strong>distributed data processing engine</strong>.</p>
<ul>
<li><p>“Distributed” means it can run across <strong>multiple computers (nodes)</strong> at the same time.</p>
</li>
<li><p>“Data processing” means it takes <strong>data streams or batches</strong> and transforms them into results.</p>
</li>
<li><p>It can <strong>process data as it arrives</strong> (streaming) or <strong>process a fixed dataset</strong> (batch).</p>
</li>
<li><p>Flink is <strong>stateful and fault-tolerant</strong>: it remembers important information and can recover if a worker dies.</p>
</li>
</ul>
<p><strong>Example:</strong></p>
<p>Like counting how many people enter a mall every minute:</p>
<ul>
<li><p>Flink will keep a <strong>running total</strong>.</p>
</li>
<li><p>If a computer crashes, it will <strong>recover the total from where it left off</strong>.</p>
</li>
</ul>
<hr />
<h2 id="heading-2-flink-cluster-components">2. Flink Cluster Components</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768767663876/bfe7225f-f057-42f0-8b6e-ffe860a879c7.png" alt class="image--center mx-auto" /></p>
<p>A Flink cluster has <strong>two main roles</strong>:</p>
<h3 id="heading-jobmanager-jm-the-brain">JobManager (JM): The Brain</h3>
<ul>
<li><p>The <strong>JobManager is like the manager of the factory</strong>.</p>
</li>
<li><p>Responsibilities:</p>
<ul>
<li><p>Accept jobs (the “instructions” for the factory)</p>
</li>
<li><p>Split the job into smaller tasks</p>
</li>
<li><p>Decide which worker (TaskManager) will execute each task</p>
</li>
<li><p>Keep track of the state and checkpoints</p>
</li>
<li><p>Handle failure recovery</p>
</li>
</ul>
</li>
<li><p>Usually, <strong>1 JobManager pod</strong> in Kubernetes</p>
</li>
<li><p>Needs <strong>persistent storage</strong> (PVC) because it stores checkpoints and savepoints</p>
</li>
</ul>
<hr />
<h3 id="heading-taskmanager-tm-the-worker">TaskManager (TM): The Worker</h3>
<ul>
<li><p>The <strong>TaskManager is like a worker machine</strong> on the factory floor.</p>
</li>
<li><p>Responsibilities:</p>
<ul>
<li><p>Execute tasks assigned by the JobManager</p>
</li>
<li><p>Keep temporary state in memory or local disk (ephemeral)</p>
</li>
<li><p>Report back progress to JobManager</p>
</li>
</ul>
</li>
<li><p>TaskManagers are <strong>stateless</strong> and <strong>ephemeral</strong>:</p>
<ul>
<li><p>If a TaskManager dies, Kubernetes will restart it somewhere else</p>
</li>
<li><p>JobManager uses checkpoints to restore the state</p>
</li>
</ul>
</li>
<li><p>Each TaskManager pod can have <strong>one or more slots</strong> (think “hands” to do work)</p>
</li>
</ul>
<hr />
<h2 id="heading-3-slots-hands-of-the-taskmanager">3. Slots: Hands of the TaskManager</h2>
<ul>
<li><p>Each TaskManager has <strong>slots</strong>, which are units of parallel work.</p>
</li>
<li><p>Each slot can run <strong>one subtask</strong> of a Flink job.</p>
</li>
</ul>
<p><strong>Analogy:</strong></p>
<ul>
<li><p>Imagine a worker has 2 hands → they can work on 2 small tasks at the same time</p>
</li>
<li><p>If you have 4 workers, each with 2 hands → 8 tasks can be worked on simultaneously</p>
</li>
<li><p>Slots allow Flink to <strong>divide work and control resources</strong> (memory, CPU) per task</p>
</li>
</ul>
<hr />
<h2 id="heading-4-parallelism-how-many-hands-work">4. Parallelism: How Many Hands Work</h2>
<p>Parallelism is <strong>how many subtasks a job is divided into</strong>.</p>
<ul>
<li><p>Each subtask needs one slot.</p>
</li>
<li><p>Maximum parallelism = total slots in cluster</p>
</li>
</ul>
<p><strong>Example:</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Pods</td><td>Slots per Pod</td><td>Total Slots</td></tr>
</thead>
<tbody>
<tr>
<td>JobManager</td><td>1</td><td>0</td><td>0</td></tr>
<tr>
<td>TaskManager</td><td>4</td><td>2</td><td>8</td></tr>
</tbody>
</table>
</div><ul>
<li><p><code>parallelism.default = 2</code> → job will run 2 subtasks if no <code>-p</code> is specified</p>
</li>
<li><p>Job parallelism = 6 → 6 subtasks run, distributed across the 4 TaskManagers</p>
</li>
<li><p>Job parallelism = 10 → only 8 subtasks get slots (the cluster total); the remaining 2 wait for a free slot, and a streaming job may fail with a slot-allocation timeout if none becomes available</p>
</li>
</ul>
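<p>The numbers in the table and the bullets above map directly onto Flink configuration. A minimal sketch, with values matching the example:</p>
<pre><code class="lang-yaml"># flink-conf.yaml (fragment)
taskmanager.numberOfTaskSlots: 2   # "hands" per TaskManager pod
parallelism.default: 2             # used when a job does not pass -p
</code></pre>
<p>Submitting a job with <code>flink run -p 6 job.jar</code> overrides the default and spreads 6 subtasks across the 8 available slots.</p>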
<hr />
<h2 id="heading-5-how-taskmanagers-store-data">5. How TaskManagers store data</h2>
<ul>
<li><p><strong>TaskManagers are ephemeral</strong>; they use local memory or local disk for temporary state.</p>
</li>
<li><p>They <strong>do NOT use PVC</strong>. If a TaskManager dies, its data is lost.</p>
</li>
<li><p><strong>JobManager + PVC</strong> is where <strong>durable state</strong> lives:</p>
<ul>
<li><p>Checkpoints</p>
</li>
<li><p>Savepoints</p>
</li>
</ul>
</li>
</ul>
<p><strong>Checkpoint flow:</strong></p>
<ol>
<li><p>JobManager asks TaskManagers to snapshot their local state</p>
</li>
<li><p>TaskManagers send snapshots to JobManager</p>
</li>
<li><p>JobManager writes snapshots to PVC (persistent storage)</p>
</li>
<li><p>If a TM dies, it is restarted → state is restored from PVC</p>
</li>
</ol>
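<p>The checkpoint flow above is driven by configuration such as the following sketch, assuming the PVC is mounted at <code>/flink-data</code> (the path and interval are illustrative):</p>
<pre><code class="lang-yaml"># flink-conf.yaml (fragment)
execution.checkpointing.interval: 60000                 # snapshot every 60 s
state.checkpoints.dir: file:///flink-data/checkpoints   # directory on the mounted PVC
state.savepoints.dir: file:///flink-data/savepoints
</code></pre>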
<hr />
<h2 id="heading-6-how-kubernetes-schedules-taskmanagers">6. How Kubernetes schedules TaskManagers</h2>
<p>As an example, let us see this scenario:</p>
<ul>
<li><p>You requested <strong>4 TaskManager pods</strong></p>
</li>
<li><p>Kubernetes decides <strong>which node each pod runs on</strong></p>
</li>
<li><p>You added <strong>soft anti-affinity</strong> → tries to spread them across both nodes</p>
</li>
</ul>
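<p>The soft anti-affinity mentioned above can be expressed on the TaskManager pod template roughly like this (the label values are assumptions based on a typical Flink deployment):</p>
<pre><code class="lang-yaml">affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:   # "soft": a preference, not a hard rule
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            component: taskmanager
        topologyKey: kubernetes.io/hostname            # spread across nodes
</code></pre>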
<p>Example of your cluster after scheduling:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>TaskManager</td><td>Node</td></tr>
</thead>
<tbody>
<tr>
<td>TM1</td><td>node1</td></tr>
<tr>
<td>TM2</td><td>node1</td></tr>
<tr>
<td>TM3</td><td>node2</td></tr>
<tr>
<td>TM4</td><td>node2</td></tr>
</tbody>
</table>
</div><ul>
<li><p>JobManager pod runs on <strong>node1</strong>, mounts PVC for persistent storage</p>
</li>
<li><p>TaskManagers use <strong>local ephemeral storage</strong></p>
</li>
</ul>
<hr />
<h2 id="heading-7-step-by-step-of-job-execution">7. Step-by-step of job execution</h2>
<ol>
<li><p>Submit a job (example: WordCount)</p>
</li>
<li><p>JobManager splits job into subtasks according to parallelism</p>
</li>
<li><p>Assigns subtasks to TaskManager slots</p>
</li>
<li><p>TaskManagers execute tasks, keep temporary state</p>
</li>
<li><p>Periodically, TaskManagers checkpoint state to JobManager → written to PVC</p>
</li>
<li><p>If a TaskManager dies → Kubernetes restarts pod → JobManager restores state</p>
</li>
<li><p>Job continues processing seamlessly</p>
</li>
</ol>
<hr />
<h2 id="heading-8-example-analogy">8. Example analogy</h2>
<ul>
<li><p><strong>JobManager</strong> → Manager in a factory</p>
</li>
<li><p><strong>TaskManagers</strong> → Workers on the floor</p>
</li>
<li><p><strong>Slots</strong> → Worker’s hands</p>
</li>
<li><p><strong>Parallelism</strong> → Number of hands used on a job</p>
</li>
<li><p><strong>PVC</strong> → Manager’s filing cabinet with important records</p>
</li>
<li><p><strong>TaskManager ephemeral storage</strong> → Paper on worker’s desk (temporary, lost if worker leaves)</p>
</li>
</ul>
<hr />
<h2 id="heading-key-takeaways">Key takeaways</h2>
<ul>
<li><p>Flink separates <strong>computation</strong> (TaskManagers) from <strong>state</strong> (JobManager + PVC)</p>
</li>
<li><p><strong>Slots control parallelism at runtime</strong></p>
</li>
<li><p><strong>TaskManagers are disposable</strong>, Kubernetes can reschedule them anywhere</p>
</li>
<li><p><strong>JobManager is critical</strong>: PVC ensures job can recover if TMs die</p>
</li>
<li><p><strong>Parallelism.default</strong> is just a default; maximum parallelism is determined by <strong>total slots in cluster</strong></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Running Traccar on Kubernetes: Lessons Learned from Ingress, TCP Services, and Scaling]]></title><description><![CDATA[Traccar looks simple on the surface. It is just a GPS tracking server with a web interface. Once you attempt to run it in Kubernetes, especially for real device traffic, you quickly realize that it is not a typical HTTP application. Traccar is a mix ...]]></description><link>https://blog.nyzex.in/running-traccar-on-kubernetes-lessons-learned-from-ingress-tcp-services-and-scaling</link><guid isPermaLink="true">https://blog.nyzex.in/running-traccar-on-kubernetes-lessons-learned-from-ingress-tcp-services-and-scaling</guid><category><![CDATA[traccar]]></category><category><![CDATA[Devops]]></category><category><![CDATA[ci-cd]]></category><category><![CDATA[software development]]></category><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 29 Dec 2025 12:04:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009921542/786d82fa-b148-4398-928f-5843b8543fea.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Traccar looks simple on the surface. It is just a GPS tracking server with a web interface. Once you attempt to run it in Kubernetes, especially for real device traffic, you quickly realize that it is not a typical HTTP application. Traccar is a mix of HTTP, long-lived TCP connections, multiple device protocols, and stateful behavior that does not always align cleanly with cloud-native assumptions.</p>
<p>This post documents what worked, what did not, and why certain architectural decisions were made while running Traccar on Kubernetes. The goal is not to present a perfect reference architecture, but to share practical lessons learned from real deployments and iterations.</p>
<hr />
<h2 id="heading-understanding-traccars-traffic-model">Understanding Traccar’s Traffic Model</h2>
<p>Before touching Kubernetes, it is important to understand how Traccar actually receives traffic.</p>
<p>The Traccar web interface is a standard HTTP application. It runs on port 8082 by default and can be exposed using any normal HTTP reverse proxy.</p>
<p>Device traffic is very different. Each GPS device speaks its own protocol. These protocols are almost always raw TCP. Examples include:</p>
<ul>
<li><p>Port 5027 for Teltonika devices like FMB125</p>
</li>
<li><p>Port 5004 for OsmAnd</p>
</li>
<li><p>Many other ports depending on protocol configuration</p>
</li>
</ul>
<p>Devices open long-lived TCP connections and continuously send data. They do not behave like short HTTP requests. This distinction heavily influences how Kubernetes networking must be designed.</p>
<hr />
<h2 id="heading-initial-attempt-treating-traccar-like-a-normal-web-app">Initial Attempt: Treating Traccar Like a Normal Web App</h2>
<p>The first deployment followed a standard Kubernetes pattern.</p>
<ul>
<li><p>Traccar ran in a Deployment</p>
</li>
<li><p>A ClusterIP Service exposed ports 8082, 5027, and 5004</p>
</li>
<li><p>An NGINX Ingress exposed the web interface on a domain</p>
</li>
</ul>
<p>The web interface worked immediately. Logging in, viewing devices, and maps all functioned correctly.</p>
<p>Device connections did not.</p>
<p>At first, it looked like a firewall or security group issue. Ports were open. Services existed. Pods were running. Logs showed no incoming device traffic.</p>
<p>The core mistake was assuming that Kubernetes Ingress could route arbitrary TCP traffic in the same way it routes HTTP.</p>
<hr />
<h2 id="heading-why-standard-ingress-does-not-work-for-device-ports">Why Standard Ingress Does Not Work for Device Ports</h2>
<p>Kubernetes Ingress is an HTTP abstraction. It understands hosts, paths, headers, and HTTP semantics.</p>
<p>Traccar device protocols are raw TCP. There is no HTTP handshake, no headers, and no routing metadata.</p>
<p>Most Ingress controllers, including NGINX Ingress, completely ignore non-HTTP traffic unless explicitly configured to handle TCP streams.</p>
<p>This is why simply adding device ports to a Service and expecting Ingress to route them does not work.</p>
<h2 id="heading-initial-naive-architecture-what-did-not-work">Initial Naive Architecture (What Did Not Work)</h2>
<p>This represents the first attempt where Traccar was treated like a normal HTTP application.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009425849/f7eaad29-2082-4aa4-adfd-baf59ec943f3.png" alt class="image--center mx-auto" /></p>
<p><strong>Why this failed</strong></p>
<ul>
<li><p>Ingress only understood HTTP</p>
</li>
<li><p>TCP packets from devices were silently dropped</p>
</li>
<li><p>No errors were obvious unless Ingress logs were inspected carefully</p>
</li>
</ul>
<p>This matters because <strong>nothing looked wrong from a Kubernetes resource perspective</strong>, yet device traffic never arrived.</p>
<hr />
<h2 id="heading-option-1-separate-loadbalancer-for-device-traffic">Option 1: Separate LoadBalancer for Device Traffic</h2>
<p>The simplest working solution was to create a separate Service of type LoadBalancer for device ports.</p>
<ul>
<li><p>One LoadBalancer for ports 5027, 5004, and others</p>
</li>
<li><p>Another Ingress for the web UI</p>
</li>
</ul>
<p>This worked immediately. Devices connected successfully and data started flowing.</p>
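<p>A sketch of that device-traffic Service (the namespace and selector labels are assumptions):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: traccar-devices
  namespace: traccar
spec:
  type: LoadBalancer      # provisions a cloud load balancer that forwards raw TCP
  selector:
    app: traccar
  ports:
  - name: teltonika
    port: 5027
    protocol: TCP
  - name: osmand
    port: 5004
    protocol: TCP
</code></pre>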
<p>However, this approach had clear downsides.</p>
<ul>
<li><p>Every LoadBalancer costs money</p>
</li>
<li><p>Managing DNS and certificates becomes fragmented</p>
</li>
<li><p>Operational complexity increases with each additional protocol</p>
</li>
</ul>
<p>For a small setup this might be acceptable. For a production system with many protocols, it quickly becomes messy.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009465530/ff76cdaa-ca0f-4b29-8f35-03c7a4166556.png" alt class="image--center mx-auto" /></p>
<p><strong>Why this worked</strong></p>
<ul>
<li><p>Kubernetes LoadBalancer services handle raw TCP natively</p>
</li>
<li><p>Devices connected immediately</p>
</li>
</ul>
<p><strong>Why this was not ideal</strong></p>
<ul>
<li><p>Multiple public IPs</p>
</li>
<li><p>Higher cost</p>
</li>
<li><p>DNS and certificate management became fragmented</p>
</li>
<li><p>Scaling to many protocols would multiply LoadBalancers</p>
</li>
</ul>
<p>This is important because it shows a <strong>valid stepping stone</strong>, not a mistake.</p>
<hr />
<h2 id="heading-option-2-nginx-ingress-with-tcp-services">Option 2: NGINX Ingress with TCP Services</h2>
<p>The more scalable approach was to use <strong>NGINX Ingress TCP services</strong>.</p>
<p>NGINX Ingress supports raw TCP forwarding through a ConfigMap. This feature is not enabled by default and requires explicit configuration.</p>
<h3 id="heading-how-tcp-services-work">How TCP Services Work</h3>
<p>Instead of using an Ingress resource, TCP ports are mapped directly in a ConfigMap.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ConfigMap</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">tcp-services</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">ingress-nginx</span>
<span class="hljs-attr">data:</span>
  <span class="hljs-attr">"5027":</span> <span class="hljs-string">"traccar/traccar:5027"</span>
  <span class="hljs-attr">"5004":</span> <span class="hljs-string">"traccar/traccar:5004"</span>
</code></pre>
<p>This tells the NGINX Ingress controller:</p>
<ul>
<li><p>Listen on port 5027</p>
</li>
<li><p>Forward raw TCP traffic to the Traccar Service on port 5027</p>
</li>
</ul>
<p>The NGINX Ingress controller must also be started with flags enabling TCP services:</p>
<pre><code class="lang-yaml"><span class="hljs-string">--tcp-services-configmap=ingress-nginx/tcp-services</span>
</code></pre>
<p>Once this was configured correctly, device traffic started flowing without any separate LoadBalancer.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009495353/0c1f2e8d-d2ba-4d78-a384-e442ee409e75.png" alt class="image--center mx-auto" /></p>
<p><strong>Key points this diagram communicates</strong></p>
<ul>
<li><p>One public entry point</p>
</li>
<li><p>HTTP and TCP are handled differently but coexist cleanly</p>
</li>
<li><p>No extra LoadBalancers</p>
</li>
<li><p>Devices and users share the same domain, different ports</p>
</li>
</ul>
<p>This is the centerpiece of the blog.</p>
<hr />
<h2 id="heading-single-loadbalancer-multiple-protocols">Single LoadBalancer, Multiple Protocols</h2>
<p>With TCP services enabled, the architecture became much cleaner.</p>
<ul>
<li><p>One NGINX Ingress LoadBalancer</p>
</li>
<li><p>HTTP traffic routed via Ingress rules</p>
</li>
<li><p>TCP traffic routed via ConfigMap</p>
</li>
<li><p>One public IP</p>
</li>
<li><p>One DNS domain</p>
</li>
</ul>
<p>Devices connected to the same IP or domain, simply using different ports.</p>
<p>This was the first setup that felt production-ready.</p>
<hr />
<h2 id="heading-dns-and-domain-based-device-connections">DNS and Domain-Based Device Connections</h2>
<p>Some devices, including Teltonika FMB125, support connecting to a domain name instead of an IP address.</p>
<p>This was important for flexibility.</p>
<p>A CNAME record was created pointing to the Ingress LoadBalancer DNS name.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-string">traccar.example.com</span> <span class="hljs-string">-&gt;</span> <span class="hljs-string">ingress-lb.amazonaws.com</span>
</code></pre>
<p>Devices were configured to connect to <a target="_blank" href="http://traccar.example.com:5027"><code>traccar.example.com:5027</code></a>.</p>
<p>This allowed infrastructure changes without touching device configurations, which is critical once devices are deployed in the field.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009545563/604c8cf0-e6cd-470c-802e-2a4ac3aa1e34.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>Devices never depend on a fixed IP</p>
</li>
<li><p>Infrastructure can change underneath</p>
</li>
<li><p>Field devices remain untouched</p>
</li>
</ul>
<p>This is often overlooked but is critical in real deployments.</p>
<hr />
<h2 id="heading-scaling-traccar-pods-and-the-tcp-reality">Scaling Traccar Pods and the TCP Reality</h2>
<p>At this point, the next natural step was scaling.</p>
<p>Horizontal Pod Autoscaler was enabled based on CPU usage. Traccar pods scaled up as load increased.</p>
<p>This is where another subtle issue appeared.</p>
<h3 id="heading-tcp-connections-are-sticky">TCP Connections Are Sticky</h3>
<p>When a device connects over TCP, the connection is established to a specific pod through NGINX. That connection stays open for a long time.</p>
<p>If the pod restarts, the connection drops. Devices reconnect, but not always immediately.</p>
<p>If traffic is distributed across multiple pods, each pod holds its own set of device connections. This is not inherently bad, but it has consequences:</p>
<ul>
<li><p>Pod restarts cause device disconnects</p>
</li>
<li><p>Rolling updates must be carefully controlled</p>
</li>
<li><p>Aggressive autoscaling can harm connection stability</p>
</li>
</ul>
<p>For this reason, scaling Traccar is not as simple as scaling stateless HTTP services.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009601660/d3de90b0-e859-4dbe-8620-4e46618ed9b8.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p>TCP connections are sticky</p>
</li>
<li><p>Restarting a pod drops active devices</p>
</li>
<li><p>Autoscaling must be conservative</p>
</li>
<li><p>Rolling updates must avoid simultaneous pod restarts</p>
</li>
</ul>
<p>This visually explains why Traccar is not truly stateless.</p>
<h3 id="heading-lessons-on-scaling-strategy">Lessons on Scaling Strategy</h3>
<p>A few practical rules emerged.</p>
<ul>
<li><p>Keep a minimum number of replicas to avoid cold starts</p>
</li>
<li><p>Avoid frequent pod restarts</p>
</li>
<li><p>Use rolling updates with maxUnavailable set to zero</p>
</li>
<li><p>Scale based on memory and connection count, not only CPU</p>
</li>
</ul>
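<p>The rolling-update rule above translates into a Deployment strategy roughly like this:</p>
<pre><code class="lang-yaml">spec:
  replicas: 3               # keep a minimum replica count to avoid cold starts
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0     # never take an old pod down before its replacement is ready
      maxSurge: 1           # roll one pod at a time, limiting device disconnects
</code></pre>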
<p>In some cases, vertical scaling provided more stability than horizontal scaling.</p>
<hr />
<h2 id="heading-database-considerations">Database Considerations</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1767009629255/2d11926b-4b2c-4ae3-8923-9204880cd4b6.png" alt class="image--center mx-auto" /></p>
<p>Running the database inside the cluster was tested initially.</p>
<p>This was quickly abandoned.</p>
<p>Traccar is stateful. Device positions, events, and history must never be lost. Kubernetes pods are ephemeral by design.</p>
<p>Moving PostgreSQL to a managed service like RDS simplified operations significantly.</p>
<ul>
<li><p>Backups became reliable</p>
</li>
<li><p>Pod restarts no longer risked data integrity</p>
</li>
<li><p>Performance was more predictable</p>
</li>
</ul>
<p>This separation of concerns was one of the most important architectural decisions.</p>
<hr />
<h2 id="heading-observability-and-debugging-device-traffic">Observability and Debugging Device Traffic</h2>
<p>Debugging TCP traffic is harder than debugging HTTP.</p>
<p>A few practices helped significantly:</p>
<ul>
<li><p>Enable detailed Traccar protocol logs temporarily</p>
</li>
<li><p>Use <code>kubectl logs</code> with timestamps</p>
</li>
<li><p>Test device connections using netcat or protocol simulators</p>
</li>
<li><p>Monitor NGINX Ingress logs for connection errors</p>
</li>
</ul>
<p>Blindly assuming that devices are sending data is a common mistake. Always validate traffic at each layer.</p>
<hr />
<h2 id="heading-cicd-and-image-strategy">CI/CD and Image Strategy</h2>
<p>Building a custom Traccar image simplified deployment.</p>
<ul>
<li><p>Web UI built once</p>
</li>
<li><p>Backend and frontend shipped together</p>
</li>
<li><p>Image pushed to a registry</p>
</li>
<li><p>Kubernetes deployment updated with new tag</p>
</li>
</ul>
<p>This avoided runtime builds and reduced startup time.</p>
<p>Automated image tagging and controlled rollouts were critical to avoid accidental mass disconnects during updates.</p>
<hr />
<h2 id="heading-final-architecture-summary">Final Architecture Summary</h2>
<p>The final stable setup looked like this:</p>
<ul>
<li><p>Traccar runs as a Deployment in Kubernetes</p>
</li>
<li><p>PostgreSQL runs outside the cluster</p>
</li>
<li><p>NGINX Ingress exposes:</p>
<ul>
<li><p>HTTP via Ingress rules</p>
</li>
<li><p>TCP device ports via TCP services ConfigMap</p>
</li>
</ul>
</li>
<li><p>One LoadBalancer</p>
</li>
<li><p>Devices connect using a domain name</p>
</li>
<li><p>Scaling is conservative and connection-aware</p>
</li>
</ul>
<p>This architecture balanced Kubernetes flexibility with the realities of long-lived TCP connections.</p>
<hr />
<h2 id="heading-closing-thoughts">Closing Thoughts</h2>
<p>Traccar can run very well on Kubernetes, but only if it is treated as a mixed-protocol, semi-stateful system rather than a simple web application.</p>
<p>The biggest lesson was that Kubernetes abstractions are powerful, but they do not remove the need to understand how applications actually communicate.</p>
<p>If you respect Traccar’s networking model and design around it, Kubernetes becomes an advantage rather than a source of constant friction.</p>
<p>Thanks to <a target="_blank" href="http://mermaid.live">mermaid.live</a>, which was used to create these flow diagrams!</p>
]]></content:encoded></item><item><title><![CDATA[Why Your Kubernetes Cluster Works Fine Until Traffic Spikes]]></title><description><![CDATA[A Deep Dive into Resource Requests, Limits, and Real World Failures
A Kubernetes cluster often appears healthy during normal operation. Pods are running, dashboards are green, and alerts stay quiet. Then traffic spikes. Suddenly requests slow down, p...]]></description><link>https://blog.nyzex.in/why-your-kubernetes-cluster-works-fine-until-traffic-spikes</link><guid isPermaLink="true">https://blog.nyzex.in/why-your-kubernetes-cluster-works-fine-until-traffic-spikes</guid><category><![CDATA[Devops]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k8s]]></category><category><![CDATA[cluster]]></category><category><![CDATA[resources]]></category><category><![CDATA[YAML]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 15 Dec 2025 18:56:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765824912114/eb3a9446-0072-4b90-a2ba-cd6c5226ad1c.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-a-deep-dive-into-resource-requests-limits-and-real-world-failures">A Deep Dive into Resource Requests, Limits, and Real World Failures</h2>
<p>A Kubernetes cluster often appears healthy during normal operation. Pods are running, dashboards are green, and alerts stay quiet. Then traffic spikes. Suddenly requests slow down, pods restart, and some workloads disappear entirely. This situation surprises many teams because nothing changed in the cluster configuration. The problem usually lies in how resource requests and limits were defined, or not defined at all.</p>
<p>This article explains why these failures happen, how Kubernetes actually uses CPU and memory settings, and how small configuration choices can prevent large production outages.</p>
<hr />
<h2 id="heading-the-false-comfort-of-a-quiet-cluster">The False Comfort of a Quiet Cluster</h2>
<p>A cluster that handles low traffic smoothly is not necessarily well configured. During calm periods, most applications consume only a fraction of their potential resources. Kubernetes does not enforce limits aggressively when there is no contention. This creates the illusion that default settings are sufficient.</p>
<p>When traffic increases, applications begin to compete for CPU and memory. At that point, Kubernetes must make decisions quickly. If resource definitions are unrealistic, the scheduler and the kubelet respond in ways that feel unpredictable.</p>
<hr />
<h2 id="heading-what-resource-requests-really-mean">What Resource Requests Really Mean</h2>
<p>A resource request is a promise. When a pod declares a CPU or memory request, it is telling Kubernetes how much it needs to function reliably. The scheduler uses this information to decide where the pod can run.</p>
<p>Consider a simple deployment:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"250m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"256Mi"</span>
</code></pre>
<p>This configuration means Kubernetes will place the pod only on a node that has at least 250 millicores of CPU and 256 MiB of memory available. Even if the pod uses much less most of the time, the scheduler reserves this capacity.</p>
<p>If requests are set too low, Kubernetes may pack too many pods onto the same node. Everything works until load increases. At that point, the node becomes overwhelmed.</p>
<hr />
<h2 id="heading-what-limits-actually-do">What Limits Actually Do</h2>
<p>Limits define the maximum resources a container can use.</p>
<p>For CPU, exceeding the limit causes throttling. The application does not crash but becomes slower. Latency increases and timeouts appear.</p>
<p>For memory, exceeding the limit results in an immediate termination. The container is killed with an Out Of Memory error and restarted if a restart policy exists.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"500m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
</code></pre>
<p>If the application suddenly needs more memory during a traffic spike, Kubernetes will not negotiate. The container is terminated.</p>
<hr />
<h2 id="heading-a-common-real-world-failure-scenario">A Common Real World Failure Scenario</h2>
<p>Imagine an API service running three replicas on a small cluster. Each pod has low requests and tight memory limits.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"100m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"128Mi"</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"256Mi"</span>
</code></pre>
<p>During normal usage, memory consumption stays below 150 MiB. Everything looks fine.</p>
<p>Now a traffic spike occurs. More requests mean more objects in memory, larger request payloads, and more concurrent goroutines or threads. Memory usage climbs past 256 MiB. Kubernetes kills the container. The pod restarts and immediately receives traffic again. The cycle repeats.</p>
<p>From the outside, this looks like instability or a Kubernetes bug. In reality, the limits were never aligned with real application behavior.</p>
<hr />
<h2 id="heading-why-pod-evictions-appear-during-traffic-spikes">Why Pod Evictions Appear During Traffic Spikes</h2>
<p>Even if limits are not reached, nodes can still come under pressure. When total memory usage on a node approaches capacity, Kubernetes starts evicting pods.</p>
<p>Pods with lower priority and lower memory requests are evicted first. If requests are unrealistically small, critical workloads may be removed before less important ones.</p>
<p>This is why teams sometimes see monitoring agents or tracing systems disappear under load, even though the main application survives.</p>
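<p>One mitigation is to give critical workloads an explicit priority so they are evicted last. A sketch (the class name and value are illustrative):</p>
<pre><code class="lang-yaml">apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-api
value: 100000            # higher value = evicted later
globalDefault: false
description: "Critical API workloads, evicted after best-effort pods"
</code></pre>
<p>Pods opt in by setting <code>priorityClassName: critical-api</code> in their spec.</p>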
<hr />
<h2 id="heading-the-hidden-cost-of-copy-pasted-values">The Hidden Cost of Copy Pasted Values</h2>
<p>Many Helm charts ship with conservative defaults. These values are designed to work everywhere, not to reflect your workload.</p>
<p>Copying these defaults into production without measurement leads to two common problems:</p>
<ul>
<li><p>Requests are too low, causing over scheduling and node pressure</p>
</li>
<li><p>Limits are too tight, causing restarts under load</p>
</li>
</ul>
<p>The cluster appears cost efficient but becomes fragile.</p>
<hr />
<h2 id="heading-observing-the-problem-properly">Observing the Problem Properly</h2>
<p>Kubernetes provides enough signals to understand what is happening if they are used correctly.</p>
<p><code>kubectl top pod</code> shows real CPU and memory usage over time.<br /><code>kubectl describe pod</code> reveals throttling, OOMKills, and eviction reasons.<br />Node metrics show whether failures are isolated or systemic.</p>
<p>The key is to observe during peak traffic, not during quiet periods.</p>
<hr />
<h2 id="heading-setting-requests-based-on-reality">Setting Requests Based on Reality</h2>
<p>A practical approach is to observe average usage under moderate load and add a safety margin.</p>
<p>If a service consistently uses 300 MiB of memory during busy periods, setting a request around 350 MiB and a limit around 600 MiB gives Kubernetes room to operate.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"300m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"350Mi"</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"600Mi"</span>
</code></pre>
<p>This configuration allows bursts without immediate termination and helps the scheduler place pods intelligently.</p>
<hr />
<h2 id="heading-why-autoscaling-alone-does-not-save-you">Why Autoscaling Alone Does Not Save You</h2>
<p>Horizontal Pod Autoscalers react to metrics. They do not prevent individual pods from being killed. If pods are crashing due to memory limits, scaling replicas only increases the number of failing pods.</p>
<p>Autoscaling works best when each pod is stable under load and requests reflect real needs.</p>
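<p>For reference, a minimal HPA manifest looks like the sketch below; the deployment name and target utilization are illustrative, and the autoscaler only behaves well once the pods it scales have honest requests:</p>

```yaml
# Illustrative HPA: scales a hypothetical "web-api" deployment on CPU.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # percentage of the pod's CPU *request*
```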
<hr />
<h2 id="heading-a-stable-cluster-is-an-honest-cluster">A Stable Cluster Is an Honest Cluster</h2>
<p>Kubernetes does exactly what it is told. If resource definitions are optimistic, the cluster becomes optimistic too. Traffic spikes expose this optimism brutally.</p>
<p>Clusters that survive real world traffic are not over provisioned. They are well measured, honestly configured, and continuously observed.</p>
<p>The difference between a calm cluster and a resilient one is rarely the number of nodes. It is almost always the quality of resource requests and limits.</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Pod Evictions: Why Kubernetes Removes Your Pods And How To Prevent It]]></title><description><![CDATA[Running applications on Kubernetes usually feels smooth and stable until one day a pod suddenly disappears. Kubernetes calls this process eviction, and it happens when the cluster decides that removing a pod is the safest way to protect the node or t...]]></description><link>https://blog.nyzex.in/understanding-pod-evictions-why-kubernetes-removes-your-pods-and-how-to-prevent-it</link><guid isPermaLink="true">https://blog.nyzex.in/understanding-pod-evictions-why-kubernetes-removes-your-pods-and-how-to-prevent-it</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[monitoring]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[k8s]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Wed, 10 Dec 2025 19:31:15 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765395054165/c02244a1-2bb2-4248-9f77-884e744d6aa2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Running applications on Kubernetes usually feels smooth and stable until one day a pod suddenly disappears. Kubernetes calls this process <strong>eviction</strong>, and it happens when the cluster decides that removing a pod is the safest way to protect the node or the rest of the workload. Pod evictions can feel mysterious at first, especially when they occur during heavy traffic or during important workloads. Fortunately, Kubernetes eviction patterns are predictable once you understand what triggers them and how to prevent them.</p>
<p>This guide explains why pod evictions happen, how to identify the root cause, and the practical steps you can follow to keep your pods running reliably. Everything is supported with real Kubernetes commands, YAML examples, and explainers that make the entire topic easy to apply in your own cluster.</p>
<hr />
<h2 id="heading-what-exactly-is-a-pod-eviction"><strong>What Exactly Is A Pod Eviction</strong></h2>
<p>A pod eviction happens when Kubernetes removes a pod from a node because the node is under pressure. For node-pressure evictions this decision is made by the kubelet on the affected node. This is not the same as a pod crashing. Eviction is a deliberate decision taken by Kubernetes to protect node stability.</p>
<p>When an eviction happens, you will usually see events such as:</p>
<pre><code class="lang-text">The node had disk pressure
The node had memory pressure
The node had PID pressure
The node was unreachable
Evicting Pod due to node condition
</code></pre>
<p>Pods that are part of a deployment will be recreated on another node, but single node clusters or clusters with insufficient resources often suffer downtime as a result.</p>
<hr />
<h2 id="heading-the-main-reasons-why-pods-get-evicted"><strong>The Main Reasons Why Pods Get Evicted</strong></h2>
<p>Although Kubernetes can report several types of pressure, these three are the most common causes.</p>
<h3 id="heading-memory-pressure"><strong>Memory Pressure</strong></h3>
<p>This is the most frequent reason for evictions. When a node runs out of memory, Kubernetes starts removing pods that exceed their memory requests or pods with lower priority.</p>
<p>Common signs:</p>
<ul>
<li><p>Eviction messages mentioning memory pressure</p>
</li>
<li><p>OOMKilled events</p>
</li>
<li><p>Node metrics showing high memory usage</p>
</li>
</ul>
<h3 id="heading-disk-pressure"><strong>Disk Pressure</strong></h3>
<p>This happens when the node runs low on either free disk space or free inodes. Logging, large ephemeral storage usage, container images, and runaway volumes often cause this.</p>
<p>Signs include:</p>
<ul>
<li><p>Events mentioning disk pressure</p>
</li>
<li><p>Eviction messages citing low disk availability</p>
</li>
</ul>
<h3 id="heading-pid-pressure"><strong>PID Pressure</strong></h3>
<p>This occurs when a node runs out of process identifiers. Too many running processes or certain badly designed sidecars can trigger this.</p>
<hr />
<h2 id="heading-how-to-inspect-pod-evictions-in-your-cluster"><strong>How To Inspect Pod Evictions In Your Cluster</strong></h2>
<h3 id="heading-checking-events"><strong>Checking Events</strong></h3>
<p>Events provide the first clue. Run:</p>
<pre><code class="lang-bash">kubectl get events --sort-by=.lastTimestamp
</code></pre>
<p>Evicted pods usually report messages like:</p>
<pre><code class="lang-bash">Evicted Pod         The node had memory pressure
Evicted Pod         The node had disk pressure
</code></pre>
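<p>To surface only the evicted pods, the listing can be filtered on the status column; the sketch below runs the filter against invented <code>kubectl get pods -A</code> output:</p>

```shell
# Stand-in for `kubectl get pods -A --no-headers` output; in a real cluster,
# pipe the command itself. Column 4 holds the pod status.
pods='default      api-7f9c4      0/1   Evicted   0   5m
kube-system  coredns-6d4b7  1/1   Running   0   2d'

# Print namespace/name for every evicted pod.
evicted=$(echo "$pods" | awk '$4 == "Evicted" {print $1 "/" $2}')
echo "$evicted"
```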
<h3 id="heading-describing-the-pod"><strong>Describing The Pod</strong></h3>
<p>Even after eviction, the history remains:</p>
<pre><code class="lang-bash">kubectl describe pod &lt;pod-name&gt; -n &lt;namespace&gt;
</code></pre>
<p>Look for the “Status” and “Last State” sections. They usually contain a reason such as:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">Reason:</span> <span class="hljs-string">Evicted</span>
<span class="hljs-attr">Message:</span> <span class="hljs-string">The</span> <span class="hljs-string">node</span> <span class="hljs-string">had</span> <span class="hljs-string">memory</span> <span class="hljs-string">pressure</span>
</code></pre>
<h3 id="heading-checking-node-conditions"><strong>Checking Node Conditions</strong></h3>
<p>Nodes show their pressure state clearly:</p>
<pre><code class="lang-bash">kubectl describe node &lt;node-name&gt;
</code></pre>
<p>You might see conditions like:</p>
<pre><code class="lang-yaml"><span class="hljs-string">MemoryPressure</span>   <span class="hljs-literal">True</span>
<span class="hljs-string">DiskPressure</span>     <span class="hljs-literal">True</span>
<span class="hljs-string">PIDPressure</span>      <span class="hljs-literal">False</span>
</code></pre>
<hr />
<h2 id="heading-how-kubernetes-decides-which-pods-to-evict"><strong>How Kubernetes Decides Which Pods To Evict</strong></h2>
<p>Kubernetes follows a predictable order when choosing which pods to evict.</p>
<h3 id="heading-pod-priority"><strong>Pod Priority</strong></h3>
<p>Pods with higher priority are protected. Lower priority pods face eviction first.</p>
<p>Example of a priority class:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">scheduling.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PriorityClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">critical-service</span>
<span class="hljs-attr">value:</span> <span class="hljs-number">100000</span>
<span class="hljs-attr">globalDefault:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">description:</span> <span class="hljs-string">"Critical system components"</span>
</code></pre>
<p>Use it in a pod or deployment:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">spec:</span>
  <span class="hljs-attr">priorityClassName:</span> <span class="hljs-string">critical-service</span>
</code></pre>
<h3 id="heading-resource-requests"><strong>Resource Requests</strong></h3>
<p>Pods that request low memory but actually use much more are frequent eviction targets. Kubernetes tries to preserve pods that are within their declared requests.</p>
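<p>One related lever: how requests relate to limits determines the pod's QoS class, which feeds into eviction order. A hedged sketch:</p>

```yaml
# Requests equal to limits on every container give the pod the Guaranteed
# QoS class, which the kubelet evicts last under node pressure.
# Values are illustrative.
resources:
  requests:
    cpu: "500m"
    memory: "1Gi"
  limits:
    cpu: "500m"
    memory: "1Gi"
```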
<hr />
<h2 id="heading-how-to-prevent-pod-evictions"><strong>How To Prevent Pod Evictions</strong></h2>
<h3 id="heading-set-proper-resource-requests-and-limits"><strong>Set Proper Resource Requests And Limits</strong></h3>
<p>One of the strongest defenses against eviction is correct sizing. A pod that consumes far more memory than it requests will be evicted during memory pressure.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"200m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">cpu:</span> <span class="hljs-string">"500m"</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"1Gi"</span>
</code></pre>
<p>Requests tell Kubernetes how much the pod needs to run reliably. If these numbers are too low, Kubernetes will treat the pod as a low priority candidate for eviction.</p>
<h3 id="heading-enable-limit-ranges-and-resource-quotas"><strong>Enable Limit Ranges And Resource Quotas</strong></h3>
<p>In shared namespaces, this prevents runaway pods from disrupting others:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">LimitRange</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">default-limits</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">limits:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">type:</span> <span class="hljs-string">Container</span>
    <span class="hljs-attr">defaultRequest:</span>
      <span class="hljs-attr">memory:</span> <span class="hljs-string">256Mi</span>
      <span class="hljs-attr">cpu:</span> <span class="hljs-string">100m</span>
</code></pre>
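<p>The heading also mentions resource quotas; a companion sketch (name and ceilings are illustrative) caps what an entire namespace can request in total:</p>

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
spec:
  hard:
    requests.cpu: "4"        # total CPU requests across the namespace
    requests.memory: 8Gi     # total memory requests across the namespace
    limits.memory: 16Gi      # total memory limits across the namespace
```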
<h3 id="heading-use-pod-priority-classes"><strong>Use Pod Priority Classes</strong></h3>
<p>Critical workloads deserve higher protection, especially in clusters with limited nodes.</p>
<h3 id="heading-avoid-overcommitting-the-node"><strong>Avoid Overcommitting The Node</strong></h3>
<p>Overcommitting CPU is usually safe because containers are merely throttled, but overcommitting memory is dangerous. Pods will be removed during memory pressure, even if your deployment expects them to stay alive.</p>
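<p>A back-of-envelope check makes the point concrete; the numbers below are invented for illustration:</p>

```shell
# Suppose a node exposes roughly 7000 MiB allocatable after system
# reservations, and six pods each request 1 GiB of memory.
node_allocatable_mi=7000
sum_requests_mi=$((6 * 1024))   # 6144 MiB

# Requests fit, so the scheduler places all six pods. Trouble starts only
# if actual usage climbs past the requests toward the node's capacity.
if [ "$sum_requests_mi" -le "$node_allocatable_mi" ]; then
  verdict="fits"
else
  verdict="overcommitted"
fi
echo "$verdict"
```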
<h3 id="heading-manage-disk-usage-properly"><strong>Manage Disk Usage Properly</strong></h3>
<p>Disk pressure is often caused by:</p>
<ul>
<li><p>Large logging output</p>
</li>
<li><p>Growing emptyDir volumes</p>
</li>
<li><p>Too many container images on the node</p>
</li>
</ul>
<p>Use log rotation or a logging agent with proper configuration.</p>
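<p>One pod-level mitigation is to cap ephemeral usage so a runaway container hits its own limit before the node hits disk pressure; all names and sizes here are illustrative:</p>

```yaml
spec:
  containers:
  - name: app
    image: my-app:latest            # illustrative image name
    resources:
      limits:
        ephemeral-storage: "2Gi"    # this pod alone is evicted if exceeded
    volumeMounts:
    - name: scratch
      mountPath: /tmp/scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 1Gi                # caps the emptyDir before the disk fills
```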
<h3 id="heading-use-node-autoscaling-when-possible"><strong>Use Node Autoscaling When Possible</strong></h3>
<p>In managed Kubernetes like AKS or EKS, enabling cluster autoscaling significantly reduces unwanted evictions because new nodes appear when pressure increases.</p>
<hr />
<h2 id="heading-a-real-example-diagnosing-and-fixing-a-memory-eviction"><strong>A Real Example: Diagnosing And Fixing A Memory Eviction</strong></h2>
<p>Imagine a cluster using a single Standard D3 class VM. Under heavy traffic, one of your application pods disappears. The event log shows:</p>
<pre><code class="lang-text">Evicted: The node had memory pressure
</code></pre>
<p>Next steps:</p>
<h3 id="heading-step-1-check-node-memory"><strong>Step 1: Check Node Memory</strong></h3>
<pre><code class="lang-bash">kubectl describe node &lt;node-name&gt;
</code></pre>
<p>Output:</p>
<pre><code class="lang-yaml"><span class="hljs-string">MemoryPressure</span>   <span class="hljs-literal">True</span>
</code></pre>
<h3 id="heading-step-2-compare-consumption-with-requests"><strong>Step 2: Compare Consumption With Requests</strong></h3>
<p>You notice the pod requests only 128Mi but uses over 800Mi during traffic spikes. The node cannot allocate enough memory, so Kubernetes removes the pod.</p>
<h3 id="heading-step-3-fix-the-deployment"><strong>Step 3: Fix The Deployment</strong></h3>
<p>Update the deployment with realistic memory requests.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">resources:</span>
  <span class="hljs-attr">requests:</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"512Mi"</span>
  <span class="hljs-attr">limits:</span>
    <span class="hljs-attr">memory:</span> <span class="hljs-string">"1Gi"</span>
</code></pre>
<h3 id="heading-step-4-add-a-priorityclass-if-necessary"><strong>Step 4: Add A PriorityClass If Necessary</strong></h3>
<p>Critical services can be protected with a higher priority.</p>
<h3 id="heading-step-5-monitor"><strong>Step 5: Monitor</strong></h3>
<p>Use Prometheus, Grafana, or Kubelet metrics to observe memory growth and make adjustments.</p>
<p>If you want to see why monitoring (and a minimal Kubernetes Loki setup) is so important, read my previous blog:<br /><a target="_blank" href="https://blog.nyzex.in/why-observability-is-the-unsung-hero-in-modern-cloud-applications">https://blog.nyzex.in/why-observability-is-the-unsung-hero-in-modern-cloud-applications</a></p>
<p>After this fix, the pod no longer gets evicted during traffic spikes.</p>
<hr />
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Pod evictions are not random events. They are Kubernetes actively protecting your cluster from resource exhaustion. Once you understand the signals that trigger these evictions and how Kubernetes chooses which pods to remove, you can build workloads that are far more resilient.</p>
<p>By setting correct resource requests, monitoring node pressure, controlling logging and ephemeral storage, tuning pod priorities, and sizing nodes properly, you ensure that your critical applications continue running without interruption.</p>
]]></content:encoded></item><item><title><![CDATA[Why Observability is the Unsung Hero in Modern Cloud Applications]]></title><description><![CDATA[Modern cloud applications are incredibly powerful, but with great power comes great complexity. Applications today often consist of multiple microservices, databases, and third-party integrations running across distributed environments. This makes it...]]></description><link>https://blog.nyzex.in/why-observability-is-the-unsung-hero-in-modern-cloud-applications</link><guid isPermaLink="true">https://blog.nyzex.in/why-observability-is-the-unsung-hero-in-modern-cloud-applications</guid><category><![CDATA[monitoring]]></category><category><![CDATA[Devops]]></category><category><![CDATA[observability]]></category><category><![CDATA[Grafana]]></category><category><![CDATA[#prometheus]]></category><category><![CDATA[Kubernetes]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 08 Dec 2025 19:27:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765222043109/ea59b623-1d3f-4e60-adaf-8edc71d3b4e4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Modern cloud applications are incredibly powerful, but with great power comes great complexity. Applications today often consist of multiple microservices, databases, and third-party integrations running across distributed environments. This makes it challenging for developers and operations teams to understand what is happening inside the system, especially when something goes wrong. Traditional monitoring can only take you so far. This is where <strong>observability</strong> becomes essential.</p>
<p>Observability is the practice of designing systems in such a way that their internal state can be inferred from external outputs. It provides actionable insights that help engineers understand, diagnose, and optimize system performance in real time.</p>
<hr />
<h2 id="heading-understanding-observability">Understanding Observability</h2>
<p>Before diving into tools, it is important to distinguish <strong>monitoring</strong>, <strong>logging</strong>, and <strong>observability</strong>:</p>
<ul>
<li><p><strong>Monitoring</strong> collects predefined metrics such as CPU usage, memory usage, or request latency. Alerts notify teams when thresholds are crossed.</p>
</li>
<li><p><strong>Logging</strong> captures events and errors, providing detailed records of system behavior for debugging.</p>
</li>
<li><p><strong>Observability</strong> goes further. It combines metrics, logs, and traces to give engineers a complete understanding of system behavior. It allows you to ask “why” something happened, not just “what” happened.</p>
</li>
</ul>
<p>Observability enables answers to questions like:</p>
<ul>
<li><p>Why did a specific request take unusually long to process?</p>
</li>
<li><p>Which microservice caused a failure in a transaction?</p>
</li>
<li><p>How does a change in one component affect others across the system?</p>
</li>
</ul>
<hr />
<h2 id="heading-real-world-example-an-e-commerce-platform">Real-World Example: An E-Commerce Platform</h2>
<p>Consider an e-commerce platform built using microservices for inventory, payment, and shipping. During a holiday sale, the checkout process slows down dramatically. Without observability, the engineering team might only see high CPU usage in one service but not understand the root cause.</p>
<p>By implementing observability, the team can:</p>
<ul>
<li><p><strong>Trace requests</strong> from the frontend through the payment, inventory, and shipping services.</p>
</li>
<li><p><strong>Analyze logs</strong> to identify error spikes in payment validation.</p>
</li>
<li><p><strong>Inspect metrics</strong> to find latency bottlenecks in database queries.</p>
</li>
</ul>
<p>In this example, observability allows the team to pinpoint the root cause: a slow payment gateway integration rather than blindly optimizing unrelated services.</p>
<hr />
<h2 id="heading-observability-tools-and-how-to-use-them">Observability Tools and How to Use Them</h2>
<p>Here are some widely used tools for observability, along with installation and basic usage examples.</p>
<h3 id="heading-1-jaeger-distributed-tracing">1. Jaeger (Distributed Tracing)</h3>
<p>Jaeger helps track requests as they move through microservices.</p>
<p><strong>Installation (local setup with Docker):</strong></p>
<pre><code class="lang-bash">docker run -d --name jaeger \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 14250:14250 \
  -p 9411:9411 \
  jaegertracing/all-in-one:1.41
</code></pre>
<p><strong>Usage:</strong></p>
<ul>
<li><p>Access the Jaeger UI at <a target="_blank" href="http://localhost:16686"><code>http://localhost:16686</code></a>.</p>
</li>
<li><p>Send traces from your application using OpenTelemetry or Jaeger client libraries.</p>
</li>
<li><p>Explore request paths to identify latency and bottlenecks.</p>
</li>
</ul>
<hr />
<h3 id="heading-2-prometheus-metrics-collection">2. Prometheus (Metrics Collection)</h3>
<p>Prometheus collects and stores metrics from your application.</p>
<p><strong>Installation (local setup with Docker):</strong></p>
<pre><code class="lang-bash">docker run -d --name prometheus \
  -p 9090:9090 \
  -v $PWD/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus
</code></pre>
<p><strong>Example prometheus.yml:</strong></p>
<pre><code class="lang-yaml">global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'my-app'
    static_configs:
      - targets: ['host.docker.internal:8000']
</code></pre>
<p><strong>Usage:</strong></p>
<ul>
<li><p>Access Prometheus at <a target="_blank" href="http://localhost:9090"><code>http://localhost:9090</code></a>.</p>
</li>
<li><p>Query metrics such as request duration or error rate.</p>
</li>
<li><p>Combine with Grafana to create dashboards.</p>
</li>
</ul>
<hr />
<h3 id="heading-3-grafana-visualization">3. Grafana (Visualization)</h3>
<p>Grafana provides rich dashboards for visualizing metrics and logs.</p>
<p><strong>Installation (local setup with Docker):</strong></p>
<pre><code class="lang-bash">docker run -d -p 3000:3000 --name=grafana grafana/grafana
</code></pre>
<p><strong>Usage:</strong></p>
<ul>
<li><p>Access Grafana at <a target="_blank" href="http://localhost:3000"><code>http://localhost:3000</code></a>.</p>
</li>
<li><p>Connect Prometheus as a data source.</p>
</li>
<li><p>Build dashboards to monitor service performance and visualize latency, error rates, and request throughput.</p>
</li>
</ul>
<p>When working with Kubernetes, I tend to use the loki-stack Helm chart, as it comes with everything built in!</p>
<pre><code class="lang-yaml">
<span class="hljs-attr">loki:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">isDefault:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">url:</span> <span class="hljs-string">http://{{(include</span> <span class="hljs-string">"loki.serviceName"</span> <span class="hljs-string">.)}}:{{</span> <span class="hljs-string">.Values.loki.service.port</span> <span class="hljs-string">}}</span>
  <span class="hljs-attr">readinessProbe:</span>
    <span class="hljs-attr">httpGet:</span>
      <span class="hljs-attr">path:</span> <span class="hljs-string">/ready</span>
      <span class="hljs-attr">port:</span> <span class="hljs-string">http-metrics</span>
    <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">45</span>
  <span class="hljs-attr">livenessProbe:</span>
    <span class="hljs-attr">httpGet:</span>
      <span class="hljs-attr">path:</span> <span class="hljs-string">/ready</span>
      <span class="hljs-attr">port:</span> <span class="hljs-string">http-metrics</span>
    <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">45</span>
  <span class="hljs-attr">datasource:</span>
    <span class="hljs-attr">jsonData:</span> <span class="hljs-string">"{}"</span>
    <span class="hljs-attr">uid:</span> <span class="hljs-string">""</span>


<span class="hljs-attr">promtail:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">config:</span>
    <span class="hljs-attr">logLevel:</span> <span class="hljs-string">info</span>
    <span class="hljs-attr">serverPort:</span> <span class="hljs-number">3101</span>
    <span class="hljs-attr">clients:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">url:</span> <span class="hljs-string">http://{{</span> <span class="hljs-string">.Release.Name</span> <span class="hljs-string">}}:3100/loki/api/v1/push</span>


<span class="hljs-attr">grafana:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">adminUser:</span> <span class="hljs-string">"admin"</span>
  <span class="hljs-attr">adminPassword:</span> <span class="hljs-string">"devops123"</span>
  <span class="hljs-attr">image:</span>
    <span class="hljs-comment">#tag: 10.3.3</span>
    <span class="hljs-attr">tag:</span> <span class="hljs-number">11.4</span><span class="hljs-number">.0</span>
  <span class="hljs-attr">datasources:</span>
    <span class="hljs-attr">datasources.yaml:</span>
      <span class="hljs-attr">apiVersion:</span> <span class="hljs-number">1</span>
      <span class="hljs-attr">datasources:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Prometheus</span>
          <span class="hljs-attr">type:</span> <span class="hljs-string">prometheus</span>
          <span class="hljs-attr">access:</span> <span class="hljs-string">proxy</span>
          <span class="hljs-attr">url:</span> <span class="hljs-string">http://{{</span> <span class="hljs-string">include</span> <span class="hljs-string">"prometheus.fullname"</span> <span class="hljs-string">.</span> <span class="hljs-string">}}:9090</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">Loki</span>
          <span class="hljs-attr">type:</span> <span class="hljs-string">loki</span>
          <span class="hljs-attr">access:</span> <span class="hljs-string">proxy</span>
          <span class="hljs-attr">url:</span> <span class="hljs-string">http://{{</span> <span class="hljs-string">include</span> <span class="hljs-string">"loki.fullname"</span> <span class="hljs-string">.</span> <span class="hljs-string">}}:3100</span>



<span class="hljs-attr">prometheus:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">isDefault:</span> <span class="hljs-literal">false</span>
  <span class="hljs-attr">server:</span>
    <span class="hljs-attr">service:</span>
      <span class="hljs-attr">servicePort:</span> <span class="hljs-number">9090</span>
  <span class="hljs-attr">url:</span> <span class="hljs-string">http://{{</span> <span class="hljs-string">include</span> <span class="hljs-string">"prometheus.fullname"</span> <span class="hljs-string">.}}:{{</span> <span class="hljs-string">.Values.prometheus.server.service.servicePort</span> <span class="hljs-string">}}{{</span> <span class="hljs-string">.Values.prometheus.server.prefixURL</span> <span class="hljs-string">}}</span>
  <span class="hljs-attr">datasource:</span>
    <span class="hljs-attr">jsonData:</span> <span class="hljs-string">"{}"</span>


<span class="hljs-attr">filebeat:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">false</span>


<span class="hljs-attr">logstash:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">false</span>


<span class="hljs-attr">fluent-bit:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">false</span>
</code></pre>
<p>I simply apply it with:</p>
<pre><code class="lang-bash">helm repo add grafana https://grafana.github.io/helm-charts  
helm repo update

helm upgrade --install loki-stack grafana/loki-stack -n loki -f loki_stack_values.yaml
</code></pre>
<p>To check:</p>
<pre><code class="lang-bash">nyzex@nyzex-systems % kubectl get pods -n loki                                                      
NAME                                                 READY   STATUS    RESTARTS   AGE
loki-stack-0                                         1/1     Running   0          20m
loki-stack-alertmanager-0                            1/1     Running   0          20m
loki-stack-grafana-7d4fdcd58c-cs8fk                  2/2     Running   0          20m
loki-stack-kube-state-metrics-fb7f548d6-jg2cq        1/1     Running   0          20m
loki-stack-prometheus-node-exporter-cg57k            1/1     Running   0          20m
loki-stack-prometheus-pushgateway-5649b6944b-9k9fj   1/1     Running   0          20m
loki-stack-prometheus-server-5c8c8f584d-6chxx        2/2     Running   0          20m
loki-stack-promtail-blwfh                            1/1     Running   0          20m

nyzex@nyzex-systems % kubectl get svc -n loki 
NAME                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
loki-stack                            ClusterIP   10.43.174.77    &lt;none&gt;        3100/TCP   21m
loki-stack-alertmanager               ClusterIP   10.43.32.102    &lt;none&gt;        9093/TCP   21m
loki-stack-alertmanager-headless      ClusterIP   None            &lt;none&gt;        9093/TCP   21m
loki-stack-grafana                    ClusterIP   10.43.16.157    &lt;none&gt;        80/TCP     21m
loki-stack-headless                   ClusterIP   None            &lt;none&gt;        3100/TCP   21m
loki-stack-kube-state-metrics         ClusterIP   10.43.119.248   &lt;none&gt;        8080/TCP   21m
loki-stack-memberlist                 ClusterIP   None            &lt;none&gt;        7946/TCP   21m
loki-stack-prometheus-node-exporter   ClusterIP   10.43.20.145    &lt;none&gt;        9100/TCP   21m
loki-stack-prometheus-pushgateway     ClusterIP   10.43.113.177   &lt;none&gt;        9091/TCP   21m
loki-stack-prometheus-server          ClusterIP   10.43.201.93    &lt;none&gt;        9090/TCP   21m
</code></pre>
<p>Then we can just add an Ingress in front of Grafana and we are good to go!</p>
<p><strong>Note:</strong> the loki-stack chart has been deprecated as of December 2025:</p>
<p><a target="_blank" href="https://artifacthub.io/packages/helm/grafana/loki-stack">https://artifacthub.io/packages/helm/grafana/loki-stack</a></p>
<hr />
<h3 id="heading-putting-it-all-together">Putting It All Together</h3>
<p>By combining these tools:</p>
<ul>
<li><p>Prometheus provides metrics and system health data.</p>
</li>
<li><p>Jaeger provides request traces across microservices.</p>
</li>
<li><p>Grafana visualizes both metrics and traces.</p>
</li>
</ul>
<p>For example, a slow API call can be traced using Jaeger, the request load can be seen in Prometheus, and a Grafana dashboard can provide a real-time view of system performance. This makes identifying and resolving issues faster and more reliable.</p>
<hr />
<h2 id="heading-practical-tips-for-observability">Practical Tips for Observability</h2>
<ol>
<li><p><strong>Start Small</strong>: Focus on critical services and gradually expand coverage.</p>
</li>
<li><p><strong>Instrument Key Components</strong>: Ensure metrics, logs, and traces are collected for all important paths.</p>
</li>
<li><p><strong>Automate Alerts</strong>: Configure alerts in Prometheus or Grafana to proactively detect anomalies.</p>
</li>
<li><p><strong>Review and Iterate</strong>: Observability is an ongoing process. Learn from incidents and refine instrumentation.</p>
</li>
</ol>
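<p>As a small illustration of the alerting tip above, a Prometheus alerting rule can look like the following sketch (the metric names, threshold, and labels here are hypothetical, not taken from any specific setup):</p>
<pre><code class="lang-yaml">groups:
  - name: service-health
    rules:
      - alert: HighErrorRate
        # Fire when more than 5% of requests fail over 5 minutes (hypothetical metrics)
        expr: rate(http_requests_errors_total[5m]) / rate(http_requests_total[5m]) &gt; 0.05
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Error rate above 5% for {{ $labels.job }}"
</code></pre>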
<hr />
<h2 id="heading-conclusion">Conclusion</h2>
<p>Observability is no longer optional for modern cloud applications. While monitoring and logging provide snapshots of system behavior, observability delivers a deep understanding of how systems operate and interact. By implementing tools like Jaeger, Prometheus, and Grafana, teams can diagnose problems faster, optimize performance, and ensure a reliable user experience. Observability is the unsung hero that allows engineers to manage complexity effectively, transforming chaos into clarity.</p>
]]></content:encoded></item><item><title><![CDATA[How I Automated GitHub Repository Deployments Into K3s Using SSH and Helm]]></title><description><![CDATA[Building a smooth deployment pipeline for Kubernetes is usually a challenge for small teams and individual developers. Many solutions require complex CI pipelines, container registries, secret management and multiple integration steps. My goal was to...]]></description><link>https://blog.nyzex.in/how-i-automated-github-repository-deployments-into-k3s-using-ssh-and-helm</link><guid isPermaLink="true">https://blog.nyzex.in/how-i-automated-github-repository-deployments-into-k3s-using-ssh-and-helm</guid><category><![CDATA[Devops]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[k3s]]></category><category><![CDATA[automation]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[Docker]]></category><category><![CDATA[ci-cd]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Fri, 05 Dec 2025 19:02:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764961329799/f40cf393-210b-4b57-b9d1-d488c8ffd995.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Building a smooth deployment pipeline for Kubernetes is usually a challenge for small teams and individual developers. Many solutions require complex CI pipelines, container registries, secret management and multiple integration steps. My goal was to create something simpler. I wanted a system that could take any GitHub repository containing a Dockerfile, build the image on a remote server, and deploy it directly into my K3s cluster.</p>
<p>This article explains the complete architecture and the final workflow that made this possible. It uses SSH for secure communication and Helm for templated Kubernetes deployments. Everything happens automatically once a repository is selected.</p>
<p>I have kept this blog short since it was a large Proof of Concept (PoC); we will revisit it in more detail some other day :)</p>
<hr />
<h2 id="heading-the-overall-objective"><strong>The Overall Objective</strong></h2>
<p>The intention was to create a simple but powerful “Select a GitHub repository and deploy it to K3s” experience. Achieving this required solving the following problems:</p>
<ul>
<li><p>Authenticating users through GitHub OAuth</p>
</li>
<li><p>Cloning the selected repository on the remote server</p>
</li>
<li><p>Building a Docker image using the repository’s Dockerfile</p>
</li>
<li><p>Extracting the application port from the Dockerfile</p>
</li>
<li><p>Pushing the image to a container registry when required</p>
</li>
<li><p>Generating Kubernetes manifests dynamically</p>
</li>
<li><p>Applying them to a K3s cluster using remote kubectl</p>
</li>
<li><p>Keeping the workflow secure without exposing SSH passwords</p>
</li>
</ul>
<p>With these requirements in mind, I built the system around a Streamlit based interface, a backend with SSH utilities, and a Helm driven deployment mechanism.</p>
<hr />
<h2 id="heading-authentication-through-github-oauth"><strong>Authentication Through GitHub OAuth</strong></h2>
<p>The first step was allowing users to log in with GitHub and access their repositories. During OAuth callback, I stored the GitHub username and token in a database so that future interactions could occur without requesting credentials again.</p>
<p>Once authenticated, the application fetched all repositories that the user had access to. The user could then pick the repository that needed deployment. This made the process extremely convenient because there was no need for manual cloning or token entry.</p>
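<p>As a rough sketch, the repository listing step can be done against the GitHub REST API like this (the function names and pagination value are my own illustration, not the application's exact code):</p>
<pre><code class="lang-python">import requests

API_URL = "https://api.github.com/user/repos"  # standard GitHub REST endpoint

def auth_headers(token):
    # GitHub accepts OAuth tokens in the Authorization header
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }

def list_repositories(token):
    # Fetch repositories the authenticated user can access (network call)
    resp = requests.get(API_URL, headers=auth_headers(token), params={"per_page": 100})
    resp.raise_for_status()
    return [repo["full_name"] for repo in resp.json()]
</code></pre>
<p>The returned full names feed directly into the repository picker in the UI.</p>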
<hr />
<h2 id="heading-cloning-the-repository-over-ssh"><strong>Cloning the Repository Over SSH</strong></h2>
<p>After a repository was selected, the system connected to the remote host where K3s was installed. The connection used a PEM key, not a password. This removed the need to expose secrets and made the communication secure.</p>
<p>The repository was cloned into a temporary directory on the remote server:</p>
<pre><code class="lang-bash">git clone https://github.com/&lt;user&gt;/&lt;repo&gt;.git /tmp/deploy/&lt;repo-name&gt;
</code></pre>
<p>The entire build and deployment workflow was executed inside this directory.</p>
<hr />
<h2 id="heading-building-the-docker-image-remotely"><strong>Building the Docker Image Remotely</strong></h2>
<p>Instead of building the image locally and pushing it to a registry, I chose to build the image directly on the remote server. This avoided large uploads and made the pipeline significantly faster.</p>
<pre><code class="lang-bash">sudo docker build -t &lt;repo-name&gt;:latest .
</code></pre>
<p>Since K3s uses containerd by default, I installed Docker separately and configured the system so that it could load the built images into K3s, importing them from Docker into containerd whenever necessary.</p>
<p>If AWS ECR integration was enabled, the image was tagged and pushed to the registry. That part was optional and only used in certain deployments.</p>
<hr />
<h2 id="heading-reading-the-application-port-from-the-dockerfile"><strong>Reading the Application Port From the Dockerfile</strong></h2>
<p>Many applications expose ports through the Dockerfile. To make deployments dynamic, I extracted the port by scanning for the EXPOSE instruction:</p>
<pre><code class="lang-dockerfile">EXPOSE 8080
</code></pre>
<p>If this instruction was present, it became the container port in the Kubernetes Deployment manifest. If it was missing, the system used a default value that could be configured.</p>
<p>This simple extraction made deployments far more flexible. It removed the need for manual adjustments each time a repository changed its application port.</p>
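<p>The extraction described above can be sketched in a few lines of Python (this is my approximation of the logic; the default port value is a stand-in for the configurable one):</p>
<pre><code class="lang-python">import re

DEFAULT_PORT = 8080  # fallback when no EXPOSE instruction is present (assumed value)

def extract_port(dockerfile_text):
    # Return the first EXPOSE port found, or the configured default
    for line in dockerfile_text.splitlines():
        match = re.match(r"\s*EXPOSE\s+(\d+)", line, re.IGNORECASE)
        if match:
            return int(match.group(1))
    return DEFAULT_PORT
</code></pre>
<p>For example, <code>extract_port("FROM node:20\nEXPOSE 3000")</code> returns <code>3000</code>.</p>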
<hr />
<h2 id="heading-generating-kubernetes-manifests-automatically"><strong>Generating Kubernetes Manifests Automatically</strong></h2>
<p>Each repository received its own namespace. Namespaces were created if they did not already exist:</p>
<pre><code class="lang-bash">kubectl create namespace &lt;repo-name&gt;
</code></pre>
<p>I used Helm to generate the Deployment, Service and optional Ingress files. A lightweight chart template was created and values were injected programmatically. The values included:</p>
<ul>
<li><p>image name</p>
</li>
<li><p>container port</p>
</li>
<li><p>replica count</p>
</li>
<li><p>environment variables</p>
</li>
<li><p>namespace</p>
</li>
</ul>
<p>The resulting Helm command looked like this:</p>
<pre><code class="lang-bash">helm upgrade --install &lt;repo-name&gt; ./chart \
  --namespace &lt;repo-name&gt; \
  --set image.repository=&lt;image&gt; \
  --set image.tag=latest \
  --set containerPort=&lt;port&gt;
</code></pre>
<p>Helm provided a clean way to handle templating and versioning without writing YAML repeatedly.</p>
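<p>In my setup this command was assembled programmatically before being executed over SSH; here is a sketch of building the argument list (the function name and parameters are illustrative, not the exact application code):</p>
<pre><code class="lang-python">def build_helm_command(repo, image, port, chart_path="./chart"):
    # Build the argv list for helm upgrade --install; a list avoids shell quoting issues
    return [
        "helm", "upgrade", "--install", repo, chart_path,
        "--namespace", repo,
        "--set", f"image.repository={image}",
        "--set", "image.tag=latest",
        "--set", f"containerPort={port}",
    ]
</code></pre>
<p>Passing this list to the SSH execution layer keeps repository names and ports from being mangled by shell interpolation.</p>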
<hr />
<h2 id="heading-applying-the-manifests-to-k3s"><strong>Applying the Manifests to K3s</strong></h2>
<p>Once Helm processed the templates, the K3s cluster applied everything instantly. The deployment became active with one replica. The Service exposed the container port. If Ingress was configured, the application became public with a stable URL.</p>
<p>All of this ran within the remote environment using SSH commands executed from my Python application.</p>
<hr />
<h2 id="heading-the-final-automated-workflow"><strong>The Final Automated Workflow</strong></h2>
<p>After completing the entire pipeline, this became the final experience for the user:</p>
<ol>
<li><p>Log in with GitHub.</p>
</li>
<li><p>Select a repository from the list.</p>
</li>
<li><p>Click “Deploy”.</p>
</li>
</ol>
<p>Behind the scenes, the system handled:</p>
<ul>
<li><p>cloning</p>
</li>
<li><p>image building</p>
</li>
<li><p>port extraction</p>
</li>
<li><p>manifest creation</p>
</li>
<li><p>Helm deployment</p>
</li>
</ul>
<p>The user did not need to interact with Docker commands, kubectl or YAML files. Everything was automatic.</p>
<hr />
<h2 id="heading-challenges-and-solutions"><strong>Challenges and Solutions</strong></h2>
<p><strong>1. Large repositories and slow builds</strong><br />To solve this, I ensured that the remote server had cached layers whenever possible by reusing previous build directories.</p>
<p><strong>2. Managing SSH timeouts</strong><br />Increasing the SSH keep alive configuration and using a resilient execution wrapper helped prevent failures during long deployments.</p>
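<p>The resilient execution wrapper mentioned above can be approximated as follows (here <code>subprocess</code> stands in for the actual SSH call, and the retry counts are assumptions):</p>
<pre><code class="lang-python">import subprocess
import time

def run_with_retries(command, retries=3, delay=5):
    # Re-run a command a few times before giving up; returns the last completed process
    last = None
    for attempt in range(retries):
        last = subprocess.run(command, capture_output=True, text=True)
        if last.returncode == 0:
            return last
        time.sleep(delay)
    return last
</code></pre>
<p>The same pattern wraps the remote Docker and kubectl invocations so that transient network failures do not abort a deployment.</p>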
<p><strong>3. Ensuring reproducible deployments</strong><br />Helm values files were logged to the database so that each deployment had a traceable configuration history.</p>
<p><strong>4. Handling errors gracefully</strong><br />Whenever Docker or kubectl produced errors, they were captured and displayed in the Streamlit UI so that the user could fix the repository or Dockerfile.</p>
<hr />
<h2 id="heading-benefits-of-this-approach"><strong>Benefits of This Approach</strong></h2>
<p>This approach is not designed to replace full CI systems. However, it solves a specific problem extremely well. It provides a fast method to deploy small to medium applications into K3s without the overhead of pipelines, registries or private runners.</p>
<p>Some benefits include:</p>
<ul>
<li><p>Very quick setup for new applications</p>
</li>
<li><p>No complex infrastructure required</p>
</li>
<li><p>Secure SSH communication</p>
</li>
<li><p>Automatic image handling</p>
</li>
<li><p>Helm based templating</p>
</li>
<li><p>Namespaced isolation for each deployment</p>
</li>
</ul>
<p>It is ideal for personal projects, prototypes, microservices or self-hosted internal tools.</p>
<hr />
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Automating GitHub repository deployments into a K3s cluster became an enjoyable project. By combining SSH, Docker, Kubernetes and Helm into a single workflow, I created a flexible and dynamic deployment system. It saves time, reduces manual work and makes it possible to deploy new applications with a simple click.</p>
]]></content:encoded></item><item><title><![CDATA[Mastering Port Forwarding as a Service: Running Kubernetes Port Forwards with systemd]]></title><description><![CDATA[Kubernetes port forwarding is extremely useful for quick access to internal services. It allows local tools to reach cluster applications without exposing them through Ingress or LoadBalancer services. The problem is that port forwarding breaks easil...]]></description><link>https://blog.nyzex.in/mastering-port-forwarding-as-a-service-running-kubernetes-port-forwards-with-systemd</link><guid isPermaLink="true">https://blog.nyzex.in/mastering-port-forwarding-as-a-service-running-kubernetes-port-forwards-with-systemd</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[PostgreSQL]]></category><category><![CDATA[#kubernetes #container ]]></category><category><![CDATA[portfowarding]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Tue, 02 Dec 2025 19:56:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764705314146/4a74fbbc-59e2-4ea8-b868-4610254d886f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Kubernetes port forwarding is extremely useful for quick access to internal services. It allows local tools to reach cluster applications without exposing them through Ingress or LoadBalancer services. The problem is that port forwarding breaks easily. It stops when the terminal closes, when the network changes, or when the forwarding process restarts. Many engineers use it only during development because of these limitations.</p>
<p>The good news is that port forwarding can be turned into a persistent background service that runs automatically on boot and restarts on failure. This can be achieved by running <code>kubectl port-forward</code> under systemd. Once configured, the forwarding behaves like any other system-level service.</p>
<h2 id="heading-understanding-the-port-forward-setup-for-postgres">Understanding the Port-Forward Setup for Postgres</h2>
<p>The goal was to expose a PostgreSQL instance running <strong>inside the Kubernetes cluster</strong> so that it could be accessed securely from <strong>another machine</strong>. Instead of exposing the database publicly or creating a LoadBalancer, I used a <strong>port-forward</strong> running as a persistent systemd service. This behaves very much like an <strong>SSH tunnel</strong>, but carried over the Kubernetes API and kept alive automatically by systemd.</p>
<p>Accessing a PostgreSQL database that lives inside a Kubernetes cluster often starts with a quick command:</p>
<pre><code class="lang-bash">kubectl port-forward pod/postgres 15432:5432 -n postgresql
</code></pre>
<p>It works.<br />It is simple.<br />And it also <strong>breaks</strong> the moment:</p>
<ul>
<li><p>your SSH session closes</p>
</li>
<li><p>the terminal dies</p>
</li>
<li><p>the pod restarts</p>
</li>
<li><p>network hiccups occur</p>
</li>
</ul>
<p>For development or for connecting external applications, this becomes frustrating. You want PostgreSQL running inside the cluster to feel like it is running locally. Always reachable. No interruptions.</p>
<p>This is exactly where a <strong>persistent port-forwarding setup</strong> becomes extremely useful.</p>
<p>In this blog, I explain how I automated PostgreSQL port-forwarding using:</p>
<ul>
<li><p>a simple shell script</p>
</li>
<li><p>a systemd service</p>
</li>
<li><p>dynamic pod detection</p>
</li>
<li><p>persistent reconnection</p>
</li>
</ul>
<p>This ensures that even if the pod restarts or the port-forward crashes, the system automatically brings it back up.</p>
<hr />
<h2 id="heading-why-manual-port-forwarding-fails-over-time"><strong>Why Manual Port Forwarding Fails Over Time</strong></h2>
<p>Port-forwarding is not designed for long-term, production-grade networking. It is a debugging convenience.</p>
<p>When you run:</p>
<pre><code class="lang-bash">kubectl port-forward pod/postgres 15432:5432
</code></pre>
<p>You are telling Kubernetes:</p>
<ul>
<li><p>Open <strong>15432</strong> on your <strong>local machine</strong></p>
</li>
<li><p>Forward all traffic from this local port</p>
</li>
<li><p>To <strong>port 5432</strong> inside the <strong>postgres pod</strong></p>
</li>
</ul>
<p>The moment the terminal stops, or the pod is replaced by a new one during a deployment or restart, the connection is lost.</p>
<p>This leads to:</p>
<ul>
<li><p>connection refused</p>
</li>
<li><p>ECONNRESET</p>
</li>
<li><p>your app cannot connect to the database</p>
</li>
<li><p>your scripts or migrations failing mid-run</p>
</li>
</ul>
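<p>A quick way to tell from the client side whether the forward is still alive is a plain TCP connect, sketched below (the host and port are the values used in this article):</p>
<pre><code class="lang-python">import socket

def is_port_open(host, port, timeout=2.0):
    # Try a TCP connection; True means something is listening on that port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
</code></pre>
<p>For example, <code>is_port_open("localhost", 15432)</code> should return <code>True</code> while the forward is running.</p>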
<p>The ideal fix is using a LoadBalancer, NodePort, an internal service mesh, or VPN into the cluster.<br />But when that is not possible (for example in a locked-down internal environment), <strong>persistent port-forwarding is a surprisingly useful workaround</strong>.</p>
<h2 id="heading-solution-overview"><strong>Solution Overview</strong></h2>
<p>We will set up:</p>
<ol>
<li><p><strong>A shell script</strong> that:</p>
<ul>
<li><p>Continuously finds the current PostgreSQL pod</p>
</li>
<li><p>Establishes the port-forward</p>
</li>
<li><p>Automatically retries if the pod changes or the forward crashes</p>
</li>
</ul>
</li>
<li><p><strong>A systemd service</strong> that:</p>
<ul>
<li><p>Runs this script in the background</p>
</li>
<li><p>Starts automatically on boot</p>
</li>
<li><p>Restarts on failure</p>
</li>
</ul>
</li>
</ol>
<p>This results in a self-healing port-forward that always stays alive.</p>
<hr />
<h2 id="heading-step-1-the-final-working-script"><strong>Step 1: The Final Working Script</strong></h2>
<p>Save it as:</p>
<pre><code class="lang-bash">/home/ubuntu/pg-portforward.sh
</code></pre>
<pre><code class="lang-bash">#!/bin/bash
# "set -e" is intentionally omitted: the loop must survive kubectl failures

export KUBECONFIG="${HOME}/.k0s/kubeconfig"

NAMESPACE="postgresql"
LOCAL_PORT=15432
REMOTE_PORT=5432

while true; do
  POD=$(kubectl get pod -n $NAMESPACE -l app=postgres \
        -o jsonpath='{.items[0].metadata.name}' 2&gt;/dev/null)

  if [ -z "$POD" ]; then
    echo "[$(date)] Postgres pod not found. Retrying in 5s..."
    sleep 5
    continue
  fi

  echo "[$(date)] Forwarding to pod: $POD"
  kubectl -n $NAMESPACE port-forward pod/$POD ${LOCAL_PORT}:${REMOTE_PORT}

  echo "[$(date)] Port-forward crashed. Restarting in 5s..."
  sleep 5
done
</code></pre>
<p>Make it executable:</p>
<pre><code class="lang-bash">chmod +x pg-portforward.sh
</code></pre>
<hr />
<h2 id="heading-step-2-the-systemd-unit-file"><strong>Step 2: The systemd Unit File</strong></h2>
<p>Create:</p>
<pre><code class="lang-bash">/etc/systemd/system/pg-portforward.service
</code></pre>
<pre><code class="lang-ini">[Unit]
Description=Persistent port-forward for PostgreSQL pod
After=network.target

[Service]
User=ubuntu
Environment="KUBECONFIG=/home/ubuntu/.k0s/kubeconfig"
Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
ExecStart=/home/ubuntu/pg-portforward.sh
Restart=always
RestartSec=5
KillMode=process

[Install]
WantedBy=multi-user.target
</code></pre>
<p>Enable and start it:</p>
<pre><code class="lang-bash">sudo systemctl daemon-reload
sudo systemctl enable pg-portforward
sudo systemctl start pg-portforward
sudo systemctl status pg-portforward
</code></pre>
<hr />
<h2 id="heading-how-this-works"><strong>How This Works</strong></h2>
<h3 id="heading-1-dynamic-pod-discovery"><strong>1. Dynamic Pod Discovery</strong></h3>
<p>PostgreSQL pods often restart during:</p>
<ul>
<li><p>upgrades</p>
</li>
<li><p>node draining</p>
</li>
<li><p>scaling events</p>
</li>
</ul>
<p>A static pod name does not work.<br />The script uses a label selector:</p>
<pre><code class="lang-bash">-l app=postgres
</code></pre>
<p>You can substitute whatever label your Helm chart applies.</p>
<h3 id="heading-2-automatic-reconnection"><strong>2. Automatic Reconnection</strong></h3>
<p>If the port-forward dies (common), the script simply loops and starts again.</p>
<h3 id="heading-3-systemd-keeps-it-alive"><strong>3. systemd keeps it alive</strong></h3>
<p>If the script itself fails, systemd restarts it.</p>
<p>If the machine reboots, the service auto-starts.</p>
<p>This ensures <strong>PostgreSQL inside Kubernetes remains reachable on localhost:15432 at all times</strong>.</p>
<h3 id="heading-why-a-systemd-service">Why a Systemd Service</h3>
<p>Port-forwarding dies when:</p>
<ul>
<li><p>the pod restarts</p>
</li>
<li><p>the connection breaks</p>
</li>
<li><p>kubectl crashes</p>
</li>
<li><p>the terminal closes</p>
</li>
</ul>
<p>To keep the port-forward <strong>alive forever</strong>, I wrapped it in:</p>
<ul>
<li><p>a small bash script that loops forever, automatically reconnecting</p>
</li>
<li><p>a systemd service that starts on boot and restarts on failure</p>
</li>
</ul>
<p>This means your Postgres database is always reachable through that forward without babysitting the terminal.</p>
<hr />
<h2 id="heading-testing-the-setup"><strong>Testing the Setup</strong></h2>
<p>From your local machine (note that <code>kubectl port-forward</code> binds only to 127.0.0.1 on the server by default, so reaching it through the node IP requires adding <code>--address 0.0.0.0</code> to the forward command):</p>
<pre><code class="lang-bash">psql -h &lt;your-node-ip&gt; -p 15432 -U postgres -d yourdb
</code></pre>
<p>If SSH-tunneling:</p>
<pre><code class="lang-bash">ssh -L 5432:localhost:15432 ubuntu@&lt;server&gt;
psql -h localhost -p 5432
</code></pre>
<p>If port-forward is running correctly, the connection will be instant.</p>
<p>In DBeaver:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764705001160/cfe7b46c-878b-4c53-9d27-dfe7c2d78590.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764705003805/0f055eb1-ac85-4ddf-86c7-0260e4fe8256.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-common-issues-and-fixes"><strong>Common Issues and Fixes</strong></h2>
<h3 id="heading-kubectl-not-found-inside-systemd"><strong>kubectl not found inside systemd</strong></h3>
<p>systemd does not automatically inherit your PATH.<br />This is why we added:</p>
<pre><code class="lang-ini">Environment=PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
</code></pre>
<h3 id="heading-kubeconfig-not-respected"><strong>KUBECONFIG not respected</strong></h3>
<p>My kubeconfig was located here:</p>
<pre><code class="lang-bash">/home/ubuntu/.k0s/kubeconfig
</code></pre>
<p>So we exported it in both the script <em>and</em> the service.</p>
<h3 id="heading-pod-name-changing"><strong>Pod name changing</strong></h3>
<p>The script automatically handles this.</p>
<h3 id="heading-security-benefits">Security Benefits</h3>
<p>This setup avoids exposing Postgres publicly. Only users who can SSH into the server can reach the database. No load balancers. No Ingress. No NodePort. Just a controlled, encrypted tunnel.</p>
<hr />
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Port forwarding is usually thought of as a temporary debugging feature, but with a small wrapper script and a systemd unit, it becomes a powerful mechanism for stable access to internal services. For PostgreSQL in particular, this method delivers reliable availability without relaxing cluster security.</p>
<p>This setup is easy to maintain, self-healing, and ideal for environments where PostgreSQL must be reachable consistently through a trusted machine.</p>
]]></content:encoded></item><item><title><![CDATA[Brainwave Visualization Using ESP32, BioAmpEXG, FastAPI, and Interactive Charts]]></title><description><![CDATA[Monitoring brainwave activity in real time has always fascinated me. I wanted to build something that could collect EEG signals, process them on a lightweight device, and display the results on a clean web dashboard. With an ESP32, a MAX30100 sensor,...]]></description><link>https://blog.nyzex.in/brainwave-visualization-using-esp32-bioampexg-fastapi-and-interactive-charts</link><guid isPermaLink="true">https://blog.nyzex.in/brainwave-visualization-using-esp32-bioampexg-fastapi-and-interactive-charts</guid><category><![CDATA[bioamp]]></category><category><![CDATA[brainwave]]></category><category><![CDATA[eeg]]></category><category><![CDATA[visualization]]></category><category><![CDATA[iot]]></category><category><![CDATA[Internet of Things]]></category><category><![CDATA[ESP32]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 01 Dec 2025 03:49:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764560916745/0054b306-9459-416d-a074-3f8db63e43c8.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Monitoring brainwave activity in real time has always fascinated me. I wanted to build something that could collect EEG signals, process them on a lightweight device, and display the results on a clean web dashboard. With an ESP32, a MAX30100 sensor, a small analog EEG input, and a FastAPI backend, I was able to create a portable system that measures Alpha, Beta, and Gamma brainwave bands and visualises them as simple bar graphs.</p>
<p>This article explains how I built the entire pipeline, from data collection to real-time visualisation.</p>
<p>I also wrote a short study based on this and used M5Stack for it:<br /><a target="_blank" href="https://www.researchgate.net/publication/391839761_Short-Term_Neurophysiological_Changes_During_Transcendental_Meditation_A_Pilot_EEG_and_ECG-Based_Study">https://www.researchgate.net/publication/391839761_Short-Term_Neurophysiological_Changes_During_Transcendental_Meditation_A_Pilot_EEG_and_ECG-Based_Study</a></p>
<hr />
<h2 id="heading-why-i-built-this"><strong>Why I Built This</strong></h2>
<p>I wanted a portable and affordable setup that could:</p>
<ul>
<li><p>Collect EEG data through a simple analog pin</p>
</li>
<li><p>Compute frequency bands using Fast Fourier Transform</p>
</li>
<li><p>Add additional biometric data from a MAX30100 sensor</p>
</li>
<li><p>Send all readings to a backend server over Wi-Fi</p>
</li>
<li><p>Show clean and simple charts on a dashboard</p>
</li>
<li><p>Make the system completely wireless</p>
</li>
</ul>
<p>The ESP32 Zero 2 WH (or any small ESP32 board) was perfect because it is inexpensive, efficient, and supports both Wi-Fi and continuous sensor sampling.</p>
<hr />
<h1 id="heading-hardware-setup"><strong>Hardware Setup</strong></h1>
<h3 id="heading-components-used"><strong>Components Used</strong></h3>
<ul>
<li><p><strong>ESP32 (M5Core2 or Zero 2 WH)</strong></p>
</li>
<li><p><strong>MAX30100 sensor</strong> for infrared and red readings (later replaced with an ECG module)</p>
</li>
<li><p><strong>EEG analog signal</strong> (BioAmpEXG Pill) connected to an ADC pin</p>
</li>
<li><p><strong>Wi-Fi</strong> to push data to backend</p>
</li>
<li><p><strong>Power source</strong> (USB or portable battery)</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764560486606/3f3113c1-b753-4137-bbc7-2dd29994d015.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-wiring-overview"><strong>Wiring Overview</strong></h3>
<ul>
<li><p>MAX30100 SDA → ESP32 SDA pin</p>
</li>
<li><p>MAX30100 SCL → ESP32 SCL pin</p>
</li>
<li><p>EEG analog output → ESP32 ADC pin</p>
</li>
<li><p>Common ground for all components</p>
</li>
</ul>
<p>The MAX30100 is optional for brainwave detection, but I wanted additional IR/RED data to calculate orderliness and signal health. It was the initial choice, but I later switched to the AD8232 ECG module.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764560575071/f817df19-8025-4e42-9e76-bc2a3e158fdb.png" alt class="image--center mx-auto" /></p>
<hr />
<h1 id="heading-esp32-firmware-logic"><strong>ESP32 Firmware Logic</strong></h1>
<p>The ESP32 collects data continuously. Each cycle does the following:</p>
<ol>
<li><p>Read <strong>raw EEG analog values</strong></p>
</li>
<li><p>Read <strong>IR and RED values</strong> from MAX30100</p>
</li>
<li><p>Apply <strong>Fast Fourier Transform</strong> to EEG samples</p>
</li>
<li><p>Extract <strong>band powers</strong>:</p>
<ul>
<li><p>Alpha (8 to 12 Hz)</p>
</li>
<li><p>Beta (12 to 30 Hz)</p>
</li>
<li><p>Gamma (30 to 100 Hz)</p>
</li>
</ul>
</li>
<li><p>Package everything into a JSON payload</p>
</li>
<li><p>Send the JSON data to the FastAPI backend via Wi-Fi</p>
</li>
</ol>
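<p>Steps 3 and 4 can be reproduced on the backend side with NumPy for sanity-checking the firmware output (the firmware itself ran an on-device FFT; this Python version and the sampling rate are only my approximation of the same math):</p>
<pre><code class="lang-python">import numpy as np

FS = 256  # assumed sampling rate in Hz

def band_powers(samples):
    # Power per frequency bin via the real FFT
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), 1.0 / FS)
    hz = freqs[1] - freqs[0]  # bin width in Hz

    def band(lo, hi):
        # Sum power over bins whose frequency falls in [lo, hi)
        return float(spectrum[int(np.ceil(lo / hz)):int(np.ceil(hi / hz))].sum())

    return {"alpha": band(8, 12), "beta": band(12, 30), "gamma": band(30, 100)}
</code></pre>
<p>Feeding it a pure 10 Hz sine produces a result dominated by the alpha band, which is a handy smoke test for the pipeline.</p>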
<h3 id="heading-example-json-payload"><strong>Example JSON Payload</strong></h3>
<pre><code class="lang-json">{
    <span class="hljs-attr">"alpha"</span>: <span class="hljs-number">42.3</span>,
    <span class="hljs-attr">"beta"</span>: <span class="hljs-number">28.1</span>,
    <span class="hljs-attr">"gamma"</span>: <span class="hljs-number">10.5</span>,
    <span class="hljs-attr">"orderliness"</span>: <span class="hljs-number">0.82</span>,
    <span class="hljs-attr">"ir"</span>: <span class="hljs-number">51200</span>,
    <span class="hljs-attr">"red"</span>: <span class="hljs-number">50390</span>
}
</code></pre>
<hr />
<h1 id="heading-backend-fastapi-application"><strong>Backend: FastAPI Application</strong></h1>
<p>The backend receives data, stores it, and serves it to the dashboard.</p>
<h3 id="heading-key-features"><strong>Key Features</strong></h3>
<ul>
<li><p>Endpoint for receiving ESP32 JSON data</p>
</li>
<li><p>In-memory store or Redis for fast access</p>
</li>
<li><p>REST endpoint for the dashboard</p>
</li>
<li><p>CORS enabled</p>
</li>
<li><p>Very low latency</p>
</li>
</ul>
<h3 id="heading-example-fastapi-endpoint"><strong>Example FastAPI Endpoint</strong></h3>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastapi <span class="hljs-keyword">import</span> FastAPI
<span class="hljs-keyword">from</span> pydantic <span class="hljs-keyword">import</span> BaseModel

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">BrainwaveData</span>(<span class="hljs-params">BaseModel</span>):</span>
    alpha: float
    beta: float
    gamma: float
    orderliness: float
    ir: int
    red: int

app = FastAPI()

latest_data = BrainwaveData(
    alpha=<span class="hljs-number">0</span>, beta=<span class="hljs-number">0</span>, gamma=<span class="hljs-number">0</span>, orderliness=<span class="hljs-number">0</span>, ir=<span class="hljs-number">0</span>, red=<span class="hljs-number">0</span>
)

<span class="hljs-meta">@app.post("/update")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">update</span>(<span class="hljs-params">data: BrainwaveData</span>):</span>
    <span class="hljs-keyword">global</span> latest_data
    latest_data = data
    <span class="hljs-keyword">return</span> {<span class="hljs-string">"status"</span>: <span class="hljs-string">"ok"</span>}

<span class="hljs-meta">@app.get("/data")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_data</span>():</span>
    <span class="hljs-keyword">return</span> latest_data
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764560497211/eacc795f-3981-41bf-868a-b44b4f8b659b.png" alt class="image--center mx-auto" /></p>
<hr />
<h1 id="heading-building-the-dashboard"><strong>Building the Dashboard</strong></h1>
<p>I wanted a clean visualisation with no curves, only bar graphs.<br />The dashboard uses:</p>
<ul>
<li><p><strong>HTML + Bootstrap</strong></p>
</li>
<li><p><strong>Chart.js</strong> for bar charts</p>
</li>
<li><p>Auto-refresh using JavaScript</p>
</li>
<li><p>Smooth transitions</p>
</li>
</ul>
<h3 id="heading-why-bar-graphs"><strong>Why Bar Graphs?</strong></h3>
<p>Bar graphs work well because brainwave bands are relative.<br />The magnitude of Alpha versus Beta is the most important insight, and bars make comparison easy.</p>
<hr />
<h1 id="heading-dashboard-layout"><strong>Dashboard Layout</strong></h1>
<p>The dashboard has:</p>
<ul>
<li><p>A bar graph for Alpha, Beta, and Gamma</p>
</li>
<li><p>A card showing orderliness</p>
</li>
<li><p>A small panel showing IR and RED values</p>
</li>
<li><p>A refresh interval of 1 second</p>
</li>
</ul>
<h3 id="heading-example-chartjs-code-snippet"><strong>Example Chart.js Code Snippet</strong></h3>
<pre><code class="lang-javascript"><span class="hljs-keyword">const</span> ctx = <span class="hljs-built_in">document</span>.getElementById(<span class="hljs-string">"brainChart"</span>);

<span class="hljs-keyword">const</span> chart = <span class="hljs-keyword">new</span> Chart(ctx, {
    <span class="hljs-attr">type</span>: <span class="hljs-string">"bar"</span>,
    <span class="hljs-attr">data</span>: {
        <span class="hljs-attr">labels</span>: [<span class="hljs-string">"Alpha"</span>, <span class="hljs-string">"Beta"</span>, <span class="hljs-string">"Gamma"</span>],
        <span class="hljs-attr">datasets</span>: [{
            <span class="hljs-attr">data</span>: [<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>]
        }]
    },
    <span class="hljs-attr">options</span>: {
        <span class="hljs-attr">animation</span>: <span class="hljs-literal">false</span>,
        <span class="hljs-attr">scales</span>: {
            <span class="hljs-attr">y</span>: { <span class="hljs-attr">beginAtZero</span>: <span class="hljs-literal">true</span> }
        }
    }
});

<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">refreshData</span>(<span class="hljs-params"></span>) </span>{
    <span class="hljs-keyword">const</span> r = <span class="hljs-keyword">await</span> fetch(<span class="hljs-string">"/data"</span>);
    <span class="hljs-keyword">const</span> d = <span class="hljs-keyword">await</span> r.json();
    chart.data.datasets[<span class="hljs-number">0</span>].data = [d.alpha, d.beta, d.gamma];
    chart.update();
}

<span class="hljs-built_in">setInterval</span>(refreshData, <span class="hljs-number">1000</span>);
</code></pre>
<p>Here is a graph that I obtained:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764560645072/95106eca-d1c8-44f5-90a1-6078d2ffc458.png" alt class="image--center mx-auto" /></p>
<hr />
<h1 id="heading-how-it-works-together"><strong>How It Works Together</strong></h1>
<h3 id="heading-end-to-end-pipeline"><strong>End-to-End Pipeline</strong></h3>
<ol>
<li><p>ESP32 reads EEG values and MAX30100 values</p>
</li>
<li><p>ESP32 performs FFT and computes band powers</p>
</li>
<li><p>ESP32 sends JSON to FastAPI backend</p>
</li>
<li><p>Dashboard fetches latest data through <code>/data</code></p>
</li>
<li><p>Chart.js updates the bars in real time</p>
</li>
</ol>
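As a rough illustration of step 3, the device-to-backend handoff can be sketched in Python (on the real device this runs as C++ on the ESP32; the field names follow the BrainwaveData model shown earlier, while the URL and values are placeholders):

```python
# Hypothetical client-side sketch of step 3 (illustrative only; the actual
# firmware is C++ on the ESP32). Field names follow the BrainwaveData model
# shown earlier; the URL and the numbers are placeholders.
import json
import urllib.request

payload = {"alpha": 12.4, "beta": 8.1, "gamma": 3.7,
           "orderliness": 0.62, "ir": 10234, "red": 9871}

req = urllib.request.Request(
    "http://localhost:8000/update",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it; the dashboard then reads the
# same values back from GET /data once per refresh interval.
```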
<hr />
<h1 id="heading-challenges-i-faced"><strong>Challenges I Faced</strong></h1>
<h3 id="heading-1-noise-in-the-eeg-signal"><strong>1. Noise in the EEG Signal</strong></h3>
<p>Low-cost EEG is noisy.<br />I had to apply:</p>
<ul>
<li><p>Moving average filters</p>
</li>
<li><p>Calibration</p>
</li>
<li><p>Proper grounding</p>
</li>
<li><p>FFT windowing</p>
</li>
</ul>
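A moving-average filter of the kind listed above can be sketched as follows (the window size here is an illustrative assumption, not the value used on the device):

```python
# Simple causal moving-average filter over raw EEG samples.
# window=8 is an illustrative choice, not the device's actual setting.
def moving_average(samples, window=8):
    out = []
    for i in range(len(samples)):
        lo = max(0, i - window + 1)   # clamp at the start of the stream
        chunk = samples[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

A constant signal passes through unchanged, while short spikes are smeared out, which is exactly the behaviour you want before running the FFT.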
<h3 id="heading-2-sampling-rate-stability"><strong>2. Sampling Rate Stability</strong></h3>
<p>To extract accurate brainwave bands, the sampling rate must be stable.<br />I locked the ESP32 ADC sampling to a consistent interval.</p>
<h3 id="heading-3-fast-refresh-rendering"><strong>3. Fast Refresh Rendering</strong></h3>
<p>Continuous updates caused stuttering until I disabled animation in Chart.js.</p>
<hr />
<h1 id="heading-final-result"><strong>Final Result</strong></h1>
<p>The dashboard provides a clean and real-time visualisation of:</p>
<ul>
<li><p>Alpha, Beta, Gamma brain activity</p>
</li>
<li><p>Signal orderliness</p>
</li>
<li><p>Infrared and red biometric data</p>
</li>
</ul>
<p>It works smoothly on both desktop and mobile browsers and updates once every second.</p>
<hr />
<h1 id="heading-future-improvements"><strong>Future Improvements</strong></h1>
<p>I plan to enhance the system with:</p>
<ul>
<li><p>WebSocket streaming instead of polling</p>
</li>
<li><p>A rolling timeline view for long sessions</p>
</li>
<li><p>Support for multiple users</p>
</li>
<li><p>A database for storing and analysing sessions</p>
</li>
<li><p>A machine learning model that detects focus, stress, or calmness</p>
</li>
</ul>
<hr />
<h1 id="heading-conclusion"><strong>Conclusion</strong></h1>
<p>This project showed me how much can be done with simple hardware and a clean backend architecture. By combining an ESP32, BioAmpEXG, FFT analysis, FastAPI, and a lightweight dashboard, it is possible to create a fully portable and real-time brainwave monitoring system.</p>
]]></content:encoded></item><item><title><![CDATA[Visualizing Latency Comparisons Between LLM APIs: OpenRouter vs Bedrock]]></title><description><![CDATA[Large Language Models (LLMs) are now integral to modern software applications, powering tasks such as summarization, code generation, and technical explanations. When evaluating multiple LLM APIs, latency, response quality, and consistency are critic...]]></description><link>https://blog.nyzex.in/visualizing-latency-comparisons-between-llm-apis-openrouter-vs-bedrock</link><guid isPermaLink="true">https://blog.nyzex.in/visualizing-latency-comparisons-between-llm-apis-openrouter-vs-bedrock</guid><category><![CDATA[llm]]></category><category><![CDATA[AI]]></category><category><![CDATA[AWS]]></category><category><![CDATA[openrouter]]></category><category><![CDATA[bedrock]]></category><category><![CDATA[Cloud Computing]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Wed, 26 Nov 2025 13:50:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764165025416/7ec4269a-b580-460f-9411-805d59fb7780.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Large Language Models (LLMs) are now integral to modern software applications, powering tasks such as summarization, code generation, and technical explanations. When evaluating multiple LLM APIs, <strong>latency</strong>, <strong>response quality</strong>, and <strong>consistency</strong> are critical. Today, I share a detailed analysis of latency comparison between <strong>OpenRouter</strong> and <strong>Bedrock</strong>, along with methodology, visualization, and insights.</p>
<h2 id="heading-experiment-overview"><strong>Experiment Overview</strong></h2>
<p>The primary objective of this experiment was to measure and compare <strong>response latency</strong> for OpenRouter and Bedrock across multiple prompts. The experiment was designed to capture not only the speed of each API but also its consistency across repeated queries.</p>
<h3 id="heading-prompts-used"><strong>Prompts Used</strong></h3>
<p>Three representative prompts were chosen for the comparison:</p>
<ol>
<li><p>Explain Kubernetes in simple terms for a beginner.</p>
</li>
<li><p>Write a Python function to reverse a linked list.</p>
</li>
<li><p>Summarize the book <em>Atomic Habits</em> in three sentences.</p>
</li>
</ol>
<p>Each prompt was sent <strong>five times</strong> to each API to generate multiple latency measurements for statistical analysis.</p>
<h2 id="heading-data-collection-methodology"><strong>Data Collection Methodology</strong></h2>
<p>The latency comparison was conducted using Python, with the following approach:</p>
<ol>
<li><p><strong>OpenRouter API Calls:</strong></p>
<ul>
<li><p>Sent HTTP POST requests to the OpenRouter API with the prompt, specifying the <code>gpt-oss-20b</code> model.</p>
</li>
<li><p>Measured <strong>start and end timestamps</strong> to calculate latency.</p>
</li>
<li><p>Extracted the text response from the API JSON payload.</p>
</li>
</ul>
</li>
<li><p><strong>AWS Bedrock API Calls:</strong></p>
<ul>
<li><p>Used the <code>boto3</code> client to invoke the Bedrock model <code>openai.gpt-oss-20b-1</code>.</p>
</li>
<li><p>Sent the prompt in the OpenAI-style chat format.</p>
</li>
<li><p>Measured latency from request initiation to response.</p>
</li>
<li><p>Extracted the returned text from the API payload.</p>
</li>
</ul>
</li>
<li><p><strong>Data Storage:</strong></p>
<ul>
<li><p>Each query stored the following fields: prompt, repeat number, OpenRouter response, OpenRouter latency, Bedrock response, and Bedrock latency.</p>
</li>
<li><p>All results were saved into a CSV file (<code>llm_comparison.csv</code>) for analysis and visualization.</p>
</li>
</ul>
</li>
</ol>
<p>This setup ensured a <strong>repeatable and reliable</strong> dataset for performance analysis and comparison.</p>
<p>Here is a <strong>condensed snippet</strong> showing the main idea of the comparison script:</p>
<pre><code class="lang-python"><span class="hljs-keyword">for</span> prompt <span class="hljs-keyword">in</span> prompts:
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(REPEATS):
        or_text, or_time = call_openrouter(prompt)
        print(<span class="hljs-string">f"OpenRouter [<span class="hljs-subst">{i+<span class="hljs-number">1</span>}</span>/<span class="hljs-subst">{REPEATS}</span>] Latency: <span class="hljs-subst">{or_time:<span class="hljs-number">.2</span>f}</span>s"</span>)
        br_text, br_time = call_bedrock(prompt)
        print(<span class="hljs-string">f"Bedrock   [<span class="hljs-subst">{i+<span class="hljs-number">1</span>}</span>/<span class="hljs-subst">{REPEATS}</span>] Latency: <span class="hljs-subst">{br_time:<span class="hljs-number">.2</span>f}</span>s"</span>)
        data_rows.append({
            <span class="hljs-string">"prompt"</span>: prompt,
            <span class="hljs-string">"repeat"</span>: i+<span class="hljs-number">1</span>,
            <span class="hljs-string">"openrouter_response"</span>: or_text,
            <span class="hljs-string">"openrouter_latency"</span>: or_time,
            <span class="hljs-string">"bedrock_response"</span>: br_text,
            <span class="hljs-string">"bedrock_latency"</span>: br_time
        })
</code></pre>
<p>This allowed me to build a <strong>structured dataset</strong> with both responses and latencies for each prompt and repeat.</p>
<h2 id="heading-latency-analysis"><strong>Latency Analysis</strong></h2>
<p>Using the CSV data, we conducted both statistical and visual analysis to compare the APIs.</p>
<h3 id="heading-openrouter-latency"><strong>OpenRouter Latency</strong></h3>
<ul>
<li><p>Minimum Latency: 2.32 seconds</p>
</li>
<li><p>Maximum Latency: 7.28 seconds</p>
</li>
<li><p>Average Latency: Approximately 4.60 seconds</p>
</li>
<li><p>Observation: OpenRouter exhibited <strong>higher variability</strong>, particularly for repeated technical explanation prompts.</p>
</li>
</ul>
<h3 id="heading-bedrock-latency"><strong>Bedrock Latency</strong></h3>
<ul>
<li><p>Minimum Latency: 2.00 seconds</p>
</li>
<li><p>Maximum Latency: 3.24 seconds</p>
</li>
<li><p>Average Latency: Approximately 3.05 seconds</p>
</li>
<li><p>Observation: Bedrock was consistently faster and <strong>more stable</strong> across repeats and prompt types.</p>
</li>
</ul>
<h3 id="heading-prompt-specific-patterns"><strong>Prompt-Specific Patterns</strong></h3>
<ul>
<li><p><strong>Kubernetes Explanation:</strong> Bedrock consistently responded under 3 seconds, while OpenRouter spiked to over 7 seconds in one repeat.</p>
</li>
<li><p><strong>Python Code Reversal:</strong> Both APIs performed similarly in early repeats, but Bedrock remained slightly faster.</p>
</li>
<li><p><strong>Book Summarization:</strong> Bedrock maintained both speed and stability, whereas OpenRouter showed variability in later repeats.</p>
</li>
</ul>
<h2 id="heading-visualization-approach"><strong>Visualization Approach</strong></h2>
<p>To better understand latency differences, the following visualizations were created:</p>
<ol>
<li><p><strong>Boxplot:</strong> Shows overall latency distribution for each API, highlighting median, quartiles, and outliers.</p>
</li>
<li><p><strong>Lineplot Per Prompt:</strong> Displays latency across repeats for each prompt, revealing <strong>consistency and spikes</strong>.</p>
</li>
</ol>
<p>These visualizations make trends immediately clear, allowing developers to make informed choices between APIs.</p>
<h2 id="heading-python-script-for-plotting"><strong>Python Script for Plotting</strong></h2>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt
<span class="hljs-keyword">import</span> seaborn <span class="hljs-keyword">as</span> sns

df = pd.read_csv(<span class="hljs-string">"llm_comparison.csv"</span>)

print(df[[<span class="hljs-string">'openrouter_latency'</span>, <span class="hljs-string">'bedrock_latency'</span>]].describe())

plt.figure(figsize=(<span class="hljs-number">12</span>,<span class="hljs-number">6</span>))
sns.boxplot(data=df[[<span class="hljs-string">'openrouter_latency'</span>, <span class="hljs-string">'bedrock_latency'</span>]])
plt.title(<span class="hljs-string">"Latency Comparison: OpenRouter vs Bedrock"</span>)
plt.ylabel(<span class="hljs-string">"Latency (seconds)"</span>)
plt.show()

plt.figure(figsize=(<span class="hljs-number">14</span>,<span class="hljs-number">6</span>))
<span class="hljs-keyword">for</span> prompt <span class="hljs-keyword">in</span> df[<span class="hljs-string">'prompt'</span>].unique():
    prompt_data = df[df[<span class="hljs-string">'prompt'</span>] == prompt]
    sns.lineplot(x=<span class="hljs-string">'repeat'</span>, y=<span class="hljs-string">'openrouter_latency'</span>, data=prompt_data, label=<span class="hljs-string">f'OpenRouter: <span class="hljs-subst">{prompt}</span>'</span>, marker=<span class="hljs-string">'o'</span>)
    sns.lineplot(x=<span class="hljs-string">'repeat'</span>, y=<span class="hljs-string">'bedrock_latency'</span>, data=prompt_data, label=<span class="hljs-string">f'Bedrock: <span class="hljs-subst">{prompt}</span>'</span>, marker=<span class="hljs-string">'o'</span>)
plt.title(<span class="hljs-string">"Latency Trends Per Prompt Repeat"</span>)
plt.xlabel(<span class="hljs-string">"Repeat Number"</span>)
plt.ylabel(<span class="hljs-string">"Latency (seconds)"</span>)
plt.legend(bbox_to_anchor=(<span class="hljs-number">1.05</span>, <span class="hljs-number">1</span>), loc=<span class="hljs-string">'upper left'</span>)
plt.show()
</code></pre>
<p>We obtained the following from the visualization:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764164561012/6904465f-f831-400b-9041-a0f87fea18c3.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-insights-from-the-data"><strong>Insights From the Data</strong></h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764164571884/f1159935-a5a8-4318-b10f-981e0c0bddda.png" alt class="image--center mx-auto" /></p>
<ol>
<li><p><strong>OpenRouter Latency Observations:</strong></p>
<ul>
<li><p>Minimum Latency: 2.32 seconds</p>
</li>
<li><p>Maximum Latency: 7.28 seconds</p>
</li>
<li><p>Average Latency: Approximately 4.60 seconds</p>
</li>
<li><p>Variability: Significant across different prompts and repeats, indicating inconsistent performance under certain queries.</p>
</li>
</ul>
</li>
<li><p><strong>Bedrock Latency Observations:</strong></p>
<ul>
<li><p>Minimum Latency: 2.00 seconds</p>
</li>
<li><p>Maximum Latency: 3.24 seconds</p>
</li>
<li><p>Average Latency: Approximately 3.05 seconds</p>
</li>
<li><p>Variability: Much lower than OpenRouter, indicating more consistent performance.</p>
</li>
</ul>
</li>
<li><p><strong>Prompt-Specific Trends:</strong></p>
<ul>
<li><p>For <strong>Kubernetes explanation prompts</strong>, OpenRouter latency increased up to 7.28 seconds in the fourth repeat, while Bedrock remained under 3 seconds.</p>
</li>
<li><p>For <strong>code generation prompts</strong>, both APIs performed similarly in early repeats, but Bedrock consistently had faster responses.</p>
</li>
<li><p>For <strong>book summarization</strong>, Bedrock was faster and more stable, with lower standard deviation.</p>
</li>
</ul>
</li>
</ol>
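The repeat-level spread behind these observations can be condensed into a per-prompt summary with a small aggregation. In the real workflow the DataFrame comes from the CSV produced earlier; the inline rows below are made-up examples, not the measured data:

```python
# Illustrative per-prompt latency summary. In practice df would come from
# pd.read_csv("llm_comparison.csv"); these rows are invented examples.
import pandas as pd

df = pd.DataFrame({
    "prompt": ["k8s", "k8s", "code", "code"],
    "openrouter_latency": [2.32, 7.28, 3.10, 4.05],
    "bedrock_latency": [2.00, 2.90, 3.24, 2.50],
})

# mean captures speed, std captures stability, max captures worst case
summary = (df.groupby("prompt")[["openrouter_latency", "bedrock_latency"]]
             .agg(["mean", "std", "max"]))
```

Comparing the std columns side by side is the quickest way to see which API is merely fast and which is fast and predictable.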
<h2 id="heading-takeaways"><strong>Takeaways</strong></h2>
<ol>
<li><p><strong>Consistency Matters:</strong> Bedrock is more predictable, making it preferable for real-time applications.</p>
</li>
<li><p><strong>Measure Repeats:</strong> Single API calls can be misleading; repeated measurements reveal stability.</p>
</li>
<li><p><strong>Latency vs. Prompt Complexity:</strong> Certain prompts can trigger spikes in OpenRouter latency, which developers should consider for production workloads.</p>
</li>
<li><p><strong>Data-Driven Decision Making:</strong> Structured data collection enables informed API selection.</p>
</li>
</ol>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>This experiment shows that <strong>Bedrock provides lower and more consistent latency</strong> across prompts and repeated queries compared to OpenRouter. Collecting and visualizing latency not only reveals performance differences but also helps developers make informed choices about which API to integrate for production systems.</p>
<p>By sharing both the <strong>data collection and visualization workflow</strong>, I hope to provide a <strong>practical template</strong> for evaluating LLM APIs for real-world projects.</p>
]]></content:encoded></item><item><title><![CDATA[How I Migrated an AKS Cluster Across Regions Using Velero]]></title><description><![CDATA[Migrating an entire Kubernetes cluster is one of those tasks that sounds straightforward until you actually begin. When I recently needed to migrate an Azure Kubernetes Service (AKS) cluster from the Central India region to the East US region, the pr...]]></description><link>https://blog.nyzex.in/how-i-migrated-an-aks-cluster-across-regions-using-velero</link><guid isPermaLink="true">https://blog.nyzex.in/how-i-migrated-an-aks-cluster-across-regions-using-velero</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Azure]]></category><category><![CDATA[velero]]></category><category><![CDATA[Backup]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Tue, 25 Nov 2025 18:21:53 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1764094855291/6fddacd1-ac5a-417a-9f03-19174bcec506.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Migrating an entire Kubernetes cluster is one of those tasks that sounds straightforward until you actually begin. When I recently needed to migrate an Azure Kubernetes Service (AKS) cluster from the Central India region to the East US region, the process involved more considerations than simply exporting YAML files and applying them elsewhere.</p>
<p>The requirement was clear. Migrate every workload, every Persistent Volume Claim, every Secret, every ConfigMap, every Custom Resource Definition, and all namespace-level objects. Everything had to move with accuracy.</p>
<p>After evaluating different approaches, Velero turned out to be the most reliable and practical tool for a complete AKS region migration. Velero supports backup and restoration of cluster state as well as persistent storage, and its Azure plugin works smoothly with both Azure Blob Storage and Azure Disk snapshots.</p>
<p>This guide describes the exact steps I followed, the challenges I encountered, and the final workflow that resulted in a successful migration.</p>
<hr />
<h2 id="heading-1-why-velero-is-ideal-for-aks-region-migration"><strong>1. Why Velero Is Ideal for AKS Region Migration</strong></h2>
<p>Velero has several advantages for full-cluster migration:</p>
<h3 id="heading-backup-and-restore-of-namespaced-and-cluster-level-resources"><strong>Backup and restore of namespaced and cluster-level resources</strong></h3>
<p>This includes Deployments, StatefulSets, Services, Secrets, ConfigMaps, RBAC objects, and Custom Resources.</p>
<h3 id="heading-support-for-disk-snapshots"><strong>Support for disk snapshots</strong></h3>
<p>This is essential when migrating workloads that depend on Persistent Volume Claims.</p>
<h3 id="heading-version-compatibility-with-aks"><strong>Version compatibility with AKS</strong></h3>
<p>Velero supports most Kubernetes API versions used in AKS.</p>
<h3 id="heading-easy-restoration-into-a-completely-new-cluster"><strong>Easy restoration into a completely new cluster</strong></h3>
<p>This is useful when the source and destination are in different regions.</p>
<hr />
<h2 id="heading-2-preparing-azure-for-velero"><strong>2. Preparing Azure for Velero</strong></h2>
<p>Velero requires an Azure Blob Storage account and a resource group for snapshot management.</p>
<h3 id="heading-step-1-create-a-storage-account"><strong>Step 1: Create a Storage Account</strong></h3>
<p>Choose a region for the storage account. You can use either the source or destination region because Velero backups are region independent.</p>
<pre><code class="lang-plaintext">az storage account create \
    --name veleroaccount123 \
    --resource-group velero-rg \
    --location eastus \
    --sku Standard_GRS
</code></pre>
<h3 id="heading-step-2-create-a-blob-container"><strong>Step 2: Create a Blob Container</strong></h3>
<pre><code class="lang-plaintext">az storage container create \
    --name velero-backups \
    --account-name veleroaccount123
</code></pre>
<h3 id="heading-step-3-create-a-service-principal-or-use-a-managed-identity"><strong>Step 3: Create a Service Principal or Use a Managed Identity</strong></h3>
<p>In my case, I used a managed identity created for Velero with the required permissions:</p>
<ul>
<li><p>Contributor role on the Resource Group</p>
</li>
<li><p>Storage Blob Data Contributor on the Storage Account</p>
</li>
</ul>
<p>This avoids the need for storing client secrets.</p>
<hr />
<h2 id="heading-3-installing-velero-on-the-source-aks-cluster"><strong>3. Installing Velero on the Source AKS Cluster</strong></h2>
<p>Once the storage and identity were ready, I installed Velero on the source cluster.</p>
<h3 id="heading-step-1-install-velero-cli"><strong>Step 1: Install Velero CLI</strong></h3>
<pre><code class="lang-plaintext">brew install velero
</code></pre>
<p>(Or download from the official GitHub releases page if not using macOS.)</p>
<h3 id="heading-step-2-install-velero-on-the-cluster"><strong>Step 2: Install Velero on the Cluster</strong></h3>
<pre><code class="lang-plaintext">velero install \
    --provider azure \
    --plugins velero/velero-plugin-for-microsoft-azure:v1.10.0 \
    --bucket velero-backups \
    --secret-file ./credentials-velero \
    --backup-location-config resourceGroup=velero-rg,storageAccount=veleroaccount123 \
    --use-volume-snapshots=true \
    --snapshot-location-config resourceGroup=velero-rg
</code></pre>
<p>Now Velero runs inside the cluster with all necessary permissions.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764095088526/7284a72c-a334-479d-9b00-a3b0a27e9587.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-4-creating-backups"><strong>4. Creating Backups</strong></h2>
<p>I wanted a complete backup of every namespace, every CRD, and all persistent volumes.</p>
<h3 id="heading-step-1-confirm-velero-is-healthy"><strong>Step 1: Confirm Velero Is Healthy</strong></h3>
<pre><code class="lang-plaintext">velero version
kubectl get pods -n velero
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764095124616/bbf5426f-dc11-47ad-abe9-faab600ae076.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-step-2-run-a-full-cluster-backup"><strong>Step 2: Run a Full Cluster Backup</strong></h3>
<pre><code class="lang-plaintext">velero backup create aks-full-backup \
    --include-namespaces '*' \
    --wait
</code></pre>
<p>The backup took several minutes because the cluster had StatefulSets and large PVCs.</p>
<h3 id="heading-step-3-verify-the-backup"><strong>Step 3: Verify the Backup</strong></h3>
<pre><code class="lang-plaintext">velero backup describe aks-full-backup
velero backup logs aks-full-backup
</code></pre>
<p>At this stage I had a complete backup stored in Azure Blob Storage and snapshots created for all PVCs.</p>
<hr />
<h2 id="heading-5-creating-the-destination-aks-cluster"><strong>5. Creating the Destination AKS Cluster</strong></h2>
<p>I created a new cluster in East US with the same Kubernetes version as the source cluster. Matching the Kubernetes version is important because restoring cluster-level objects may otherwise cause compatibility issues.</p>
<pre><code class="lang-plaintext">az aks create \
  --resource-group uspreprod-rg \
  --name uspreprod-aks \
  --location eastus \
  --node-count 3 \
  --kubernetes-version 1.29
</code></pre>
<p>After the cluster was ready, I connected to it:</p>
<pre><code class="lang-plaintext">az aks get-credentials \
    --resource-group uspreprod-rg \
    --name uspreprod-aks \
    --overwrite-existing
</code></pre>
<hr />
<h2 id="heading-6-installing-velero-on-the-destination-cluster"><strong>6. Installing Velero on the Destination Cluster</strong></h2>
<p>The installation process is similar to the source cluster:</p>
<pre><code class="lang-plaintext">velero install \
    --provider azure \
    --plugins velero/velero-plugin-for-microsoft-azure:v1.10.0 \
    --bucket velero-backups \
    --secret-file ./credentials-velero \
    --backup-location-config resourceGroup=velero-rg,storageAccount=veleroaccount123 \
    --use-volume-snapshots=true \
    --snapshot-location-config resourceGroup=velero-rg
</code></pre>
<p>Velero now has access to the same storage account and snapshots that were created from the source cluster.</p>
<hr />
<h2 id="heading-7-restoring-the-full-backup"><strong>7. Restoring the Full Backup</strong></h2>
<h3 id="heading-step-1-trigger-the-restore"><strong>Step 1: Trigger the Restore</strong></h3>
<pre><code class="lang-plaintext">velero restore create aks-full-restore \
    --from-backup aks-full-backup \
    --wait
</code></pre>
<p>Velero recreated:</p>
<ul>
<li><p>Namespaces</p>
</li>
<li><p>Deployments</p>
</li>
<li><p>StatefulSets</p>
</li>
<li><p>Services</p>
</li>
<li><p>Ingress objects</p>
</li>
<li><p>Secrets and ConfigMaps</p>
</li>
<li><p>CRDs and CRs</p>
</li>
<li><p>Everything linked to snapshots</p>
</li>
</ul>
<h3 id="heading-step-2-verify-restore-objects"><strong>Step 2: Verify Restore Objects</strong></h3>
<pre><code class="lang-plaintext">velero restore describe aks-full-restore
velero restore logs aks-full-restore
</code></pre>
<h3 id="heading-step-3-validate-cluster-state"><strong>Step 3: Validate Cluster State</strong></h3>
<p>I validated that workloads came up correctly:</p>
<pre><code class="lang-plaintext">kubectl get pods --all-namespaces
kubectl get pvc --all-namespaces
kubectl get ingress --all-namespaces
</code></pre>
<p>Any workload that depended on Persistent Volumes was able to recover because the Azure snapshots were restored successfully.</p>
<hr />
<h2 id="heading-8-issues-encountered-and-fixes"><strong>8. Issues Encountered and Fixes</strong></h2>
<h3 id="heading-missing-crds-before-restore"><strong>Missing CRDs Before Restore</strong></h3>
<p>Some CRDs must exist before restoring their corresponding objects.<br />Solution: Install CRDs manually or let Velero restore cluster-level CRDs first.</p>
<h3 id="heading-snapshot-restore-delay"><strong>Snapshot Restore Delay</strong></h3>
<p>Azure snapshots sometimes take time to rehydrate into new disks.<br />Solution: Wait a few minutes and reapply StatefulSets if needed.</p>
<h3 id="heading-identity-permission-issues"><strong>Identity Permission Issues</strong></h3>
<p>The managed identity must have Contributor access on both resource groups.<br />Without this, PVC restore will fail.</p>
<h3 id="heading-ingress-controller-differences"><strong>Ingress Controller Differences</strong></h3>
<p>The new cluster may create a different external IP for the ingress controller.<br />Update DNS records accordingly.</p>
<hr />
<h2 id="heading-9-final-thoughts"><strong>9. Final Thoughts</strong></h2>
<p>Migrating an AKS cluster across regions can feel overwhelming due to the number of moving parts involved. Velero simplifies the process significantly by offering a predictable and reliable way to back up and restore clusters at scale.</p>
<p>In my case, Velero successfully migrated every namespace, every workload, and every Persistent Volume from the Central India AKS cluster to a completely new cluster in East US. The process was clean, repeatable, and did not require manual recreation of YAML files.</p>
<p>If you are planning a similar migration, I strongly recommend preparing the destination cluster with the same Kubernetes version, ensuring proper identity permissions, and validating your snapshot restores.</p>
<p>Velero is a powerful tool, and with the right configuration, it can handle migrations across regions with very little manual effort.</p>
]]></content:encoded></item><item><title><![CDATA[Deploying Harbor on Kubernetes: A Step-by-Step Guide]]></title><description><![CDATA[Harbor is a cloud-native registry that allows storing, signing, and scanning container images for vulnerabilities. Deploying Harbor on Kubernetes provides a scalable, highly available registry with integrated security features. In this guide, I will ...]]></description><link>https://blog.nyzex.in/deploying-harbor-on-kubernetes-a-step-by-step-guide</link><guid isPermaLink="true">https://blog.nyzex.in/deploying-harbor-on-kubernetes-a-step-by-step-guide</guid><category><![CDATA[Docker]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Security]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Thu, 20 Nov 2025 19:23:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763666565226/ff14e923-2ba1-458f-8acf-f1535989feee.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Harbor is a cloud-native registry that allows storing, signing, and scanning container images for vulnerabilities. Deploying Harbor on Kubernetes provides a scalable, highly available registry with integrated security features. In this guide, I will walk you through a full setup on a Kubernetes cluster using Helm, from installation to pushing your first Docker image.</p>
<hr />
<h2 id="heading-preparing-for-installation">Preparing for Installation</h2>
<p>Before starting, ensure that your Kubernetes cluster is ready and Helm is installed. You should also have an ingress controller configured if you plan to expose Harbor externally.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665014974/e52930ba-87be-4f7e-92af-8bde5b2c22ef.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-adding-the-harbor-helm-repository">Adding the Harbor Helm Repository</h2>
<p>Helm simplifies deploying Harbor on Kubernetes. Begin by adding the Harbor Helm repository and fetching the chart:</p>
<pre><code class="lang-plaintext">helm repo add harbor https://helm.goharbor.io
helm repo update
helm fetch harbor/harbor --untar
</code></pre>
<p>Harbor provides detailed documentation for a high-availability deployment:</p>
<p><a target="_blank" href="https://goharbor.io/docs/2.14.0/install-config/harbor-ha-helm/">Harbor HA Helm Installation Guide</a>.</p>
<hr />
<h2 id="heading-configuring-values-for-your-deployment">Configuring Values for Your Deployment</h2>
<p>Harbor is highly configurable through a values file. I created a <code>values.yaml</code> to suit my environment. Key configurations included:</p>
<ul>
<li><p>Exposing Harbor via ingress with TLS enabled using the <code>pangolin</code> certificate.</p>
</li>
<li><p>Enabling persistence for Registry, Jobservice, and Trivy to ensure images are not lost during pod restarts.</p>
</li>
<li><p>Configuring internal components like Redis and PostgreSQL.</p>
</li>
</ul>
<p><strong>Values file excerpts:</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">expose:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ingress</span>
  <span class="hljs-attr">tls:</span>
    <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
    <span class="hljs-attr">certSource:</span> <span class="hljs-string">pangolin</span>
  <span class="hljs-attr">ingress:</span>
    <span class="hljs-attr">hosts:</span>
      <span class="hljs-attr">core:</span> <span class="hljs-string">harbor.nyzex.in</span>
    <span class="hljs-attr">className:</span> <span class="hljs-string">"nginx"</span>

<span class="hljs-attr">externalURL:</span> <span class="hljs-string">https://harbor.nyzex.in</span>

<span class="hljs-attr">persistence:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">resourcePolicy:</span> <span class="hljs-string">keep</span>
  <span class="hljs-attr">persistentVolumeClaim:</span>
    <span class="hljs-attr">registry:</span>
      <span class="hljs-attr">size:</span> <span class="hljs-string">10Gi</span>
    <span class="hljs-attr">jobservice:</span>
      <span class="hljs-attr">jobLog:</span>
        <span class="hljs-attr">size:</span> <span class="hljs-string">1Gi</span>
    <span class="hljs-attr">trivy:</span>
      <span class="hljs-attr">size:</span> <span class="hljs-string">5Gi</span>

<span class="hljs-attr">redis:</span>
  <span class="hljs-attr">enabled:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">password:</span> <span class="hljs-string">"redisStrongPassword"</span>

<span class="hljs-attr">harborAdminPassword:</span> <span class="hljs-string">"Harbor12345"</span>
</code></pre>
<p>Note that Harbor itself is not terminating TLS with its own certificates here; TLS is handled for us by Pangolin (hence <code>certSource: pangolin</code>). To understand that setup, please check my other blog:</p>
<p><a target="_blank" href="https://blog.nyzex.in/exposing-kubernetes-services-over-the-internet-using-metallb-nginx-ingress-and-pangolin">Exposing Kubernetes Services Over the Internet Using MetalLB, NGINX Ingress, and Pangolin</a></p>
<hr />
<h2 id="heading-installing-harbor-via-helm">Installing Harbor via Helm</h2>
<p>Once your values file is ready, install Harbor in the <code>harbor</code> namespace:</p>
<pre><code class="lang-bash">kubectl create namespace harbor
helm install my-release ./harbor -n harbor -f values.yaml
</code></pre>
<p>Check the status of the pods and services:</p>
<pre><code class="lang-bash">kubectl get pods -n harbor
kubectl get svc -n harbor
kubectl get ingress -n harbor
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665573997/ce0db74d-d4c3-4a48-9926-9cd98b9eb8a0.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-retrieving-the-admin-password">Retrieving the Admin Password</h2>
<p>The admin password is stored in the Harbor core secret, whether you set <code>harborAdminPassword</code> in your values file or left the chart default. To retrieve it:</p>
<pre><code class="lang-bash">kubectl get secret my-release-harbor-core -n harbor -o jsonpath="{.data.HARBOR_ADMIN_PASSWORD}" | base64 --decode
</code></pre>
<p>This password allows you to log in to the Harbor UI at <code>https://harbor.nyzex.in</code>.</p>
<hr />
<h2 id="heading-fixing-the-unauthorized-issue">Fixing the Unauthorized Issue</h2>
<p>After logging in, I encountered an unauthorized error while pushing images. The crucial fix was setting:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">registry:</span>
  <span class="hljs-attr">relativeurls:</span> <span class="hljs-literal">true</span>
</code></pre>
<p>This makes the registry return relative URLs in its <code>Location</code> headers, so clients reaching Harbor through the ingress are not redirected to internal cluster addresses.</p>
<hr />
<h2 id="heading-docker-workflow">Docker Workflow</h2>
<p>After logging in, I tested Harbor by building and pushing a sample Docker image.</p>
<p><strong>Sample Dockerfile:</strong></p>
<pre><code class="lang-dockerfile">FROM busybox:latest
</code></pre>
<p><strong>Build and tag the image:</strong></p>
<pre><code class="lang-bash">docker build -t harbor.nyzex.in/myproj/test-image:latest .
</code></pre>
<p><strong>Login to Harbor:</strong></p>
<pre><code class="lang-bash">docker login harbor.nyzex.in
</code></pre>
<p><strong>Push the image:</strong></p>
<pre><code class="lang-bash">docker push harbor.nyzex.in/myproj/test-image:latest
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665337086/709bae7d-48b4-4a05-bd20-06a60a40287d.png" alt class="image--center mx-auto" /></p>
<p>The image uploaded successfully, and because Trivy is enabled, Harbor automatically started scanning it.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665725766/4d468cb1-f127-4482-a4ca-bc9064ca45c8.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-understanding-harbor-components-and-persistence">Understanding Harbor Components and Persistence</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763666358092/77ea5e16-359f-4038-bb17-c5886f3c39db.png" alt class="image--center mx-auto" /></p>
<p>Harbor consists of several core components, each with its role:</p>
<ul>
<li><p><strong>Registry:</strong> Stores Docker images. Requires persistent storage to avoid losing images.</p>
</li>
<li><p><strong>Core:</strong> Provides the UI, API, and handles authentication.</p>
</li>
<li><p><strong>Jobservice:</strong> Executes background jobs, such as image replication or garbage collection.</p>
</li>
<li><p><strong>Redis:</strong> Caching and session storage.</p>
</li>
<li><p><strong>PostgreSQL:</strong> Stores metadata, configurations, and user information.</p>
</li>
<li><p><strong>Trivy (optional):</strong> Performs vulnerability scans on images.</p>
</li>
</ul>
<p>Persistent volumes ensure that each component retains data even if pods restart or are rescheduled.</p>
<pre><code class="lang-bash">kubectl get pvc -n harbor
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665373771/a46b088e-1a12-4613-b148-4e1715ba7ee8.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-harbor-registry-my-release-harbor-registry"><strong>Harbor Registry (</strong><code>my-release-harbor-registry</code>)</h3>
<ul>
<li><p><strong>Role:</strong> This is the <strong>core Docker registry</strong>. All images you push or pull are stored here.</p>
</li>
<li><p><strong>Storage:</strong> <code>/storage</code> inside the pod (mapped to PVC <code>my-release-harbor-registry</code>).</p>
</li>
<li><p><strong>Growth:</strong> Every <code>docker push</code> adds data to this PVC.</p>
</li>
<li><p><strong>Key takeaway:</strong> This is the main storage to monitor and expand if needed.</p>
</li>
</ul>
<h3 id="heading-redis-my-release-harbor-redis"><strong>Redis (</strong><code>my-release-harbor-redis</code>)</h3>
<ul>
<li><p><strong>Role:</strong> Redis is used as <strong>cache and message broker</strong> for Harbor. It speeds up operations like:</p>
<ul>
<li><p>Session management</p>
</li>
<li><p>Job queue for replication</p>
</li>
<li><p>Temporary caching of metadata</p>
</li>
</ul>
</li>
<li><p><strong>Storage:</strong> Minimal, mostly in-memory. PVC (<code>data-my-release-harbor-redis-0</code>) is 1Gi because Redis mostly uses RAM.</p>
</li>
<li><p><strong>Growth:</strong> Usually does <strong>not grow with image pushes</strong>; only used for transient data.</p>
</li>
</ul>
<h3 id="heading-trivy-my-release-harbor-trivy"><strong>Trivy (</strong><code>my-release-harbor-trivy</code>)</h3>
<ul>
<li><p><strong>Role:</strong> Trivy is Harbor’s <strong>vulnerability scanner</strong>. It scans container images for CVEs.</p>
</li>
<li><p><strong>Storage:</strong> Stores vulnerability DB and scan cache in its PVC (<code>data-my-release-harbor-trivy-0</code>).</p>
</li>
<li><p><strong>Growth:</strong> Increases if you scan many images, because it caches scan results and the CVE database (~5Gi here).</p>
</li>
<li><p><strong>Key takeaway:</strong> You don’t need to increase this PVC unless you do tons of scans.</p>
</li>
</ul>
<h3 id="heading-database-my-release-harbor-database"><strong>Database (</strong><code>my-release-harbor-database</code>)</h3>
<ul>
<li><p><strong>Role:</strong> Harbor’s <strong>PostgreSQL database</strong>. Stores:</p>
<ul>
<li><p>Users, projects, and roles</p>
</li>
<li><p>Repository metadata</p>
</li>
<li><p>Scan results</p>
</li>
<li><p>Jobs, quotas, and configurations</p>
</li>
</ul>
</li>
<li><p><strong>Storage:</strong> Your PVC is small (1Gi). Actual usage is tiny at first.</p>
</li>
<li><p><strong>Growth:</strong> Will grow slowly with metadata; pushing images does <strong>not increase it significantly</strong>.</p>
</li>
</ul>
<h3 id="heading-jobservice-my-release-harbor-jobservice"><strong>Jobservice (</strong><code>my-release-harbor-jobservice</code>)</h3>
<ul>
<li><p><strong>Role:</strong> Manages <strong>asynchronous jobs</strong> for Harbor:</p>
<ul>
<li><p>Image replication</p>
</li>
<li><p>Garbage collection</p>
</li>
<li><p>Scan jobs (Trivy)</p>
</li>
<li><p>Retention policies</p>
</li>
</ul>
</li>
<li><p><strong>Storage:</strong> Uses its PVC (<code>my-release-harbor-jobservice</code>) minimally for job queues and logs (~1Gi).</p>
</li>
<li><p><strong>Growth:</strong> Typically small; no impact on image storage.</p>
</li>
</ul>
<p><strong>Summary:</strong></p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>PVC used</td><td>Grows with image push?</td><td>Notes</td></tr>
</thead>
<tbody>
<tr>
<td>Registry</td><td><code>my-release-harbor-registry</code></td><td>Yes</td><td>This is your main image storage</td></tr>
<tr>
<td>Redis</td><td><code>data-my-release-harbor-redis-0</code></td><td>No</td><td>Only cache, ephemeral data</td></tr>
<tr>
<td>Trivy</td><td><code>data-my-release-harbor-trivy-0</code></td><td>Slightly</td><td>Stores scan DB &amp; results</td></tr>
<tr>
<td>Database</td><td><code>database-data-my-release-harbor-database-0</code></td><td>Minor</td><td>Stores metadata</td></tr>
<tr>
<td>Jobservice</td><td><code>my-release-harbor-jobservice</code></td><td>Minor</td><td>Handles background jobs</td></tr>
</tbody>
</table>
</div><h3 id="heading-quick-check-of-storage">Quick check of <code>/storage</code></h3>
<p>Now that we know <code>/storage</code> is the directory to monitor, we can keep a check on its usage with:</p>
<pre><code class="lang-bash">kubectl exec -n harbor -it deployment/my-release-harbor-registry -- sh -c "du -sh /storage"
</code></pre>
<p>This gives total usage.</p>
<h3 id="heading-check-usage-per-subdirectory-blobs-and-repositories">Check usage per subdirectory (<code>blobs</code> and <code>repositories</code>)</h3>
<pre><code class="lang-bash">kubectl exec -n harbor -it deployment/my-release-harbor-registry -- sh -c "du -sh /storage/*"
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763665485335/0463af93-e97a-4879-885a-d65fce95622f.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-lessons-learned">Lessons Learned</h2>
<ul>
<li><p>The <code>registry.relativeurls</code> fix is crucial to avoid push failures.</p>
</li>
<li><p>Persistence ensures images, logs, and scan data are retained safely.</p>
</li>
<li><p>Properly configuring ingress and TLS certificates is essential for secure access.</p>
</li>
<li><p>Redis and PostgreSQL are critical for Harbor functionality and must be monitored.</p>
</li>
<li><p>Docker login, build, and push are straightforward once Harbor is correctly configured.</p>
</li>
</ul>
<p>Deploying Harbor with Helm on Kubernetes is straightforward if you follow these steps carefully. With persistence, security, and a proper workflow, Harbor becomes a reliable, enterprise-ready registry for your container images.</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Kubernetes Networking, Load Balancers, Subnets, and MetalLB]]></title><description><![CDATA[Kubernetes networking often feels abstract when you first begin working with clusters, Services, and Ingress controllers. Many engineers are able to “make things work” without fully understanding how packets travel across machines, how load balancers...]]></description><link>https://blog.nyzex.in/understanding-kubernetes-networking-load-balancers-subnets-and-metallb</link><guid isPermaLink="true">https://blog.nyzex.in/understanding-kubernetes-networking-load-balancers-subnets-and-metallb</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Load Balancing]]></category><category><![CDATA[metallb]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[architecture]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Tue, 18 Nov 2025 19:51:38 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763495477632/85e919e9-0092-4a5c-8faf-529482d714e7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Kubernetes networking often feels abstract when you first begin working with clusters, Services, and Ingress controllers. Many engineers are able to “make things work” without fully understanding how packets travel across machines, how load balancers actually assign IP addresses, or what role MetalLB plays in bare-metal deployments.</p>
<p>This guide removes the mystery. It begins with foundational networking concepts and gradually builds toward a complete, unified understanding of Kubernetes networking, external access, load balancers, ingress controllers, and MetalLB.</p>
<p>If you want a single resource that connects all of these concepts clearly, this article is designed to be your go-to reference.</p>
<hr />
<h2 id="heading-the-fundamentals-what-is-a-network"><strong>The Fundamentals: What Is a Network?</strong></h2>
<p>A network is a group of connected machines that communicate by sending packets to each other. Every machine on the network must have a unique address so that packets know where to go.</p>
<p>That unique address is an <strong>IP address</strong>.</p>
<hr />
<h2 id="heading-what-is-an-ip-address"><strong>What Is an IP Address?</strong></h2>
<p>An IP address is a numerical identifier assigned to every device on a network. The most common format is IPv4, which looks like this:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.50</span>
</code></pre>
<p>Each IP address has two parts:</p>
<ol>
<li><p><strong>Network portion</strong></p>
</li>
<li><p><strong>Host portion</strong></p>
</li>
</ol>
<p>The network portion identifies which network the device belongs to.<br />The host portion identifies which specific device it is.</p>
<p>How do we know where one portion ends and the other begins?</p>
<p>That is decided by the <strong>subnet mask</strong>.</p>
<hr />
<h2 id="heading-what-is-a-subnet"><strong>What Is a Subnet?</strong></h2>
<p>A <strong>subnet</strong> (sub-network) divides a large network into smaller logical networks.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.0</span><span class="hljs-string">/24</span>
</code></pre>
<p>Here:</p>
<ul>
<li><p><code>/24</code> means the first 24 bits are the network portion.</p>
</li>
<li><p>The network contains:</p>
<ul>
<li><p>192.168.1.0 (network address)</p>
</li>
<li><p>192.168.1.1 to 192.168.1.254 (usable host IPs)</p>
</li>
<li><p>192.168.1.255 (broadcast address)</p>
</li>
</ul>
</li>
</ul>
<p>This gives 254 usable IPs. So:</p>
<ul>
<li><p>256 total addresses (0–255)</p>
</li>
<li><p>254 usable host addresses (1–254)</p>
</li>
<li><p>1 network address (.0)</p>
</li>
<li><p>1 broadcast address (.255)</p>
</li>
</ul>
<p>A subnet tells devices where to send packets.<br />If an IP is inside the same subnet, communication is local.<br />If the IP is outside the subnet, traffic goes through a gateway.</p>
<p>Subnets define the boundaries within which Kubernetes nodes, pods, and services receive IP addresses.</p>
<p>Your home router typically gives IPs from this subnet to devices using <strong>DHCP</strong>.</p>
<hr />
<h2 id="heading-how-are-ips-allocated-in-a-network"><strong>How Are IPs Allocated in a Network?</strong></h2>
<p>There are two ways:</p>
<h3 id="heading-static-allocation"><strong>Static allocation</strong></h3>
<p>The IP is manually configured.<br />Example:<br />Your server is set to always use <code>192.168.1.200</code>.</p>
<h3 id="heading-dynamic-allocation-dhcp"><strong>Dynamic allocation (DHCP)</strong></h3>
<p>Your router assigns IPs automatically.</p>
<p>Most home networks use DHCP for laptops, phones, TV, etc.<br />Servers and Kubernetes nodes often use static IPs.</p>
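<p>As a sketch of static allocation on a Linux server, a netplan configuration might look like this (the interface name, addresses, and DNS server below are illustrative placeholders; adjust them to your network):</p>
<pre><code class="lang-yaml"># /etc/netplan/01-static.yaml (illustrative sketch)
network:
  version: 2
  ethernets:
    eth0:                          # your interface name may differ (e.g. enp3s0)
      dhcp4: false                 # opt out of DHCP
      addresses: [192.168.1.200/24]
      routes:
        - to: default
          via: 192.168.1.1         # your router / gateway
      nameservers:
        addresses: [1.1.1.1]
</code></pre>
<p>Apply it with <code>netplan apply</code>; pinning node IPs this way keeps cluster membership stable across reboots.</p>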
<hr />
<h2 id="heading-how-kubernetes-uses-ip-addresses"><strong>How Kubernetes Uses IP Addresses</strong></h2>
<p>Kubernetes uses <strong>three layers</strong> of IP assignment:</p>
<h3 id="heading-1-node-ips"><strong>1. Node IPs</strong></h3>
<p>These are normal IPs assigned by your network (your router or your cloud).<br />Examples:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.10</span>
<span class="hljs-number">192.168</span><span class="hljs-number">.1</span><span class="hljs-number">.11</span>
</code></pre>
<h3 id="heading-2-pod-ips"><strong>2. Pod IPs</strong></h3>
<p>Assigned by Kubernetes CNI (Container Network Interface).<br />Pods must be reachable from any node.<br />They belong to the cluster's internal network, such as:</p>
<pre><code class="lang-yaml"><span class="hljs-number">10.244</span><span class="hljs-number">.0</span><span class="hljs-number">.15</span>
<span class="hljs-number">10.244</span><span class="hljs-number">.1</span><span class="hljs-number">.9</span>
</code></pre>
<h3 id="heading-3-service-ips"><strong>3. Service IPs</strong></h3>
<p>These are stable virtual IPs created by Kubernetes for Services.<br />Examples:</p>
<pre><code class="lang-yaml"><span class="hljs-number">10.98</span><span class="hljs-number">.50</span><span class="hljs-number">.1</span>
<span class="hljs-number">10.109</span><span class="hljs-number">.22</span><span class="hljs-number">.18</span>
</code></pre>
<p>Pod IPs change.<br />Service IPs never change.</p>
<p>Services act as <strong>stable front doors</strong> that point to Pods.</p>
<hr />
<h2 id="heading-what-is-a-kubernetes-service"><strong>What Is a Kubernetes Service?</strong></h2>
<p>A Service groups Pods and exposes them in predictable ways.</p>
<h3 id="heading-types-of-services"><strong>Types of Services</strong></h3>
<ol>
<li><p><strong>ClusterIP</strong> (internal only)</p>
</li>
<li><p><strong>NodePort</strong> (opens ports on each node)</p>
</li>
<li><p><strong>LoadBalancer</strong> (gets a real external IP)</p>
</li>
<li><p><strong>Headless Service</strong> (no virtual IP, direct Pod DNS)</p>
</li>
</ol>
<p>For exposing applications outside the cluster, NodePort and LoadBalancer are relevant.</p>
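<p>As a minimal sketch, a LoadBalancer Service manifest looks like this (the names, labels, and ports are illustrative):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: app1               # illustrative name
spec:
  type: LoadBalancer       # ask the cloud provider (or MetalLB) for an external IP
  selector:
    app: app1              # forwards to Pods carrying this label
  ports:
    - port: 80             # port exposed on the external IP
      targetPort: 8080     # port the Pods listen on
</code></pre>
<p>On a cloud cluster this provisions a cloud load balancer; on bare metal the external IP stays <code>Pending</code> until something like MetalLB assigns it.</p>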
<hr />
<h2 id="heading-what-is-a-load-balancer"><strong>What Is a Load Balancer?</strong></h2>
<p>A load balancer accepts traffic on one IP address and distributes it to backend servers (Pods).</p>
<p>There are two major kinds:</p>
<h3 id="heading-a-cloud-load-balancers-aws-azure-gcp"><strong>A. Cloud Load Balancers (AWS, Azure, GCP)</strong></h3>
<p>When you create:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
</code></pre>
<p>The cloud provider:</p>
<ul>
<li><p>Allocates a public IP</p>
</li>
<li><p>Creates a load balancer appliance</p>
</li>
<li><p>Forwards traffic to your Service</p>
</li>
</ul>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-number">52.14</span><span class="hljs-number">.222</span><span class="hljs-number">.8</span> <span class="hljs-string">→</span> <span class="hljs-string">Kubernetes</span> <span class="hljs-string">Service</span> <span class="hljs-string">→</span> <span class="hljs-string">Pods</span>
</code></pre>
<h3 id="heading-b-bare-metal-load-balancers-metallb"><strong>B. Bare-Metal Load Balancers (MetalLB)</strong></h3>
<p>On bare-metal or home labs you do not have a cloud provider.<br />Kubernetes cannot create an external LoadBalancer by itself.</p>
<p>This is where <strong>MetalLB</strong> comes in.</p>
<hr />
<h2 id="heading-what-is-metallb"><strong>What Is MetalLB?</strong></h2>
<p>MetalLB implements LoadBalancer behavior for bare-metal clusters.</p>
<p>When you create:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
</code></pre>
<p>MetalLB:</p>
<ul>
<li><p>Picks an IP from a configured pool</p>
</li>
<li><p>Assigns it to the Service</p>
</li>
<li><p>Announces the IP on your local network using ARP or BGP</p>
</li>
</ul>
<p>Your router and devices now believe that your cluster owns that IP.</p>
<hr />
<h2 id="heading-how-metallb-assigns-ips"><strong>How Does MetalLB Assign IPs?</strong></h2>
<p>You create an IPAddressPool, for example:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span> <span class="hljs-bullet">-</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.220</span>
</code></pre>
<p>This is a range of <strong>21 IP addresses</strong>.</p>
<p>MetalLB can assign at most <strong>21 LoadBalancer Services</strong> at the same time.</p>
<p>This has nothing to do with the number of nodes.</p>
<p>Nodes = compute<br />Pool = number of external Services</p>
<p>You could have:</p>
<ul>
<li><p>1 node</p>
</li>
<li><p>100 nodes</p>
</li>
<li><p>500 nodes</p>
</li>
</ul>
<p>The number of nodes does not change your available LoadBalancer IPs.</p>
<h2 id="heading-what-metallb-actually-does"><strong>What MetalLB Actually Does</strong></h2>
<p>MetalLB is a load balancer implementation for bare-metal Kubernetes clusters.</p>
<p>MetalLB does <strong>not</strong> route traffic like a Layer 7 reverse proxy.<br />It does not do TLS termination or HTTP routing.<br />It simply:</p>
<ul>
<li><p>assigns external IPs for LoadBalancer services</p>
</li>
<li><p>makes nodes answer ARP/NDP for those IPs</p>
</li>
</ul>
<p>This allows devices on your LAN to send traffic to Kubernetes.</p>
<h3 id="heading-the-layer-2-mode"><strong>The Layer 2 Mode</strong></h3>
<p>Your MetalLB Layer 2 configuration (shown here in the legacy ConfigMap format):</p>
<pre><code class="lang-yaml"><span class="hljs-attr">address-pools:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">default</span>
  <span class="hljs-attr">protocol:</span> <span class="hljs-string">layer2</span>
  <span class="hljs-attr">addresses:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span><span class="hljs-number">-192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.220</span>
</code></pre>
<p>MetalLB will:</p>
<ul>
<li><p>take one IP from that range</p>
</li>
<li><p>respond on the network as if the node owns it</p>
</li>
<li><p>direct traffic toward the appropriate service</p>
</li>
</ul>
<p>If you have 21 IPs in that range, you can have <strong>21 LoadBalancer services</strong>, regardless of how many Kubernetes nodes exist.</p>
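<p>On newer MetalLB releases (v0.13+), the same Layer 2 pool is declared with CRDs instead of a ConfigMap; a sketch of the equivalent resources (resource names are illustrative):</p>
<pre><code class="lang-yaml">apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default
  namespace: metallb-system
spec:
  addresses:
    - 192.168.29.200-192.168.29.220   # 21 assignable external IPs
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
spec:
  ipAddressPools:
    - default                          # announce IPs from the pool above via ARP/NDP
</code></pre>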
<hr />
<h2 id="heading-what-happens-when-a-loadbalancer-ip-is-assigned"><strong>What Happens When a LoadBalancer IP Is Assigned?</strong></h2>
<p>Example:</p>
<p>Your ingress-nginx Service gets:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">EXTERNAL-IP:</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span>
</code></pre>
<p>MetalLB announces that &#8220;192.168.29.200 is located here&#8221;, either:</p>
<ul>
<li><p>via ARP from the node where the ingress-nginx Pod is running (Layer 2 mode), or</p>
</li>
<li><p>via BGP to your router (BGP mode)</p>
</li>
</ul>
<p>When a client sends traffic to:</p>
<pre><code class="lang-yaml"><span class="hljs-string">http://192.168.29.200</span>
</code></pre>
<p>It reaches the correct node, then:</p>
<ul>
<li><p>kube-proxy forwards it to the nginx Pod</p>
</li>
<li><p>nginx reads the host header</p>
</li>
<li><p>nginx forwards to the correct internal Service</p>
</li>
<li><p>Service load balances traffic to Pods</p>
</li>
<li><p>Pod responds back through the chain</p>
</li>
<li><p>The client receives the response</p>
</li>
</ul>
<h2 id="heading-why-the-number-of-ips-has-no-relation-to-the-number-of-nodes"><strong>Why the Number of IPs Has No Relation to the Number of Nodes</strong></h2>
<p>Nodes have their own IPs from your LAN or DHCP server.<br />MetalLB uses <strong>completely separate IPs</strong> from the pool.</p>
<p>For example:</p>
<ul>
<li><p>Your LAN might be <code>192.168.29.0/24</code></p>
</li>
<li><p>Your nodes may be <code>192.168.29.101</code>, <code>.102</code>, <code>.103</code></p>
</li>
<li><p>MetalLB pool might be <code>192.168.29.200-220</code></p>
</li>
</ul>
<p>These ranges do not overlap and have different purposes.</p>
<p><strong>Node IPs do not limit the number of LoadBalancer IPs.</strong><br /><strong>LoadBalancer IPs do not limit the number of nodes.</strong></p>
<p>They serve two different layers:</p>
<ul>
<li><p>Nodes = physical cluster machines</p>
</li>
<li><p>External IPs = addresses for exposing services</p>
</li>
</ul>
<p>In simpler terms,</p>
<p>Node count = how many machines run Pods<br />Pool size = how many external IPs are available for Services</p>
<p>You could have:</p>
<ul>
<li><p>100 nodes</p>
</li>
<li><p>But only 5 LoadBalancer IPs in MetalLB</p>
</li>
</ul>
<p>Then:</p>
<ul>
<li><p>At most 5 Services can have external IPs</p>
</li>
<li><p>But 100 nodes can run thousands of Pods inside the cluster</p>
</li>
</ul>
<p>They are <strong>independent resources</strong>.</p>
<hr />
<h2 id="heading-what-is-an-ingress-controller"><strong>What Is an Ingress Controller?</strong></h2>
<p>Ingress defines <em>routing rules</em>:</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-string">app1.example.com</span> <span class="hljs-string">→</span> <span class="hljs-attr">service:</span> <span class="hljs-string">app1</span>
<span class="hljs-string">app2.example.com</span> <span class="hljs-string">→</span> <span class="hljs-attr">service:</span> <span class="hljs-string">app2</span>
<span class="hljs-string">grafana.example.com</span> <span class="hljs-string">→</span> <span class="hljs-attr">service:</span> <span class="hljs-string">grafana</span>
</code></pre>
<p>However, Ingress <strong>does not</strong> provide an IP.<br />It needs a Service to expose it.</p>
<p>This is why ingress-nginx uses:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">type:</span> <span class="hljs-string">LoadBalancer</span>
</code></pre>
<p>MetalLB gives this Service an IP.<br />That single IP can route <em>any number of</em> HTTP/HTTPS applications using hostnames.</p>
<p>This saves your IP pool.</p>
<p>Nginx Ingress Controller is essentially a Layer 7 HTTP/S reverse proxy inside your cluster.</p>
<p>You expose the Ingress Controller using a LoadBalancer service:</p>
<pre><code class="lang-yaml"><span class="hljs-string">ingress-nginx-controller</span>  <span class="hljs-string">→</span> <span class="hljs-string">MetalLB</span> <span class="hljs-string">assigns</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span>
</code></pre>
<p>Your users hit <strong>192.168.29.200</strong>, and Nginx:</p>
<ul>
<li><p>receives HTTP/S traffic</p>
</li>
<li><p>routes it to services inside the cluster</p>
</li>
<li><p>handles hostnames, paths, TLS, rate limits, etc.</p>
</li>
</ul>
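<p>The hostname routing described above is declared with an Ingress resource; a sketch (the hostname and service names are illustrative):</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app1-ingress           # illustrative name
spec:
  ingressClassName: nginx      # handled by the ingress-nginx controller
  rules:
    - host: app1.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1-service   # internal ClusterIP Service
                port:
                  number: 80
</code></pre>
<p>Many such Ingress resources can share the controller&#8217;s single MetalLB-assigned IP.</p>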
<h2 id="heading-full-end-to-end-traffic-flow"><strong>Full End-to-End Traffic Flow</strong></h2>
<h3 id="heading-step-1-a-client-requests"><strong>Step 1: A client requests:</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-string">https://app1.example.com</span>
</code></pre>
<h3 id="heading-step-2-dns-resolves-it-to"><strong>Step 2: DNS resolves it to:</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span>
</code></pre>
<p>(This IP is provided by MetalLB)</p>
<h3 id="heading-step-3-the-packet-reaches-the-node"><strong>Step 3: The packet reaches the node</strong></h3>
<p>MetalLB Speaker announced this IP via ARP or BGP.</p>
<h3 id="heading-step-4-ingress-nginx-pod-receives-traffic"><strong>Step 4: ingress-nginx Pod receives traffic</strong></h3>
<p>Its LoadBalancer Service forwards port 80 or 443 to nginx.</p>
<h3 id="heading-step-5-nginx-reads-routing-rules"><strong>Step 5: nginx reads routing rules</strong></h3>
<p>Defined in Kubernetes Ingress resources.</p>
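<p>A minimal Ingress resource for the hypothetical <code>app1.example.com</code> host might look like this (the service name and port are assumptions for illustration):</p>
<pre><code class="lang-yaml">apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app1
spec:
  ingressClassName: nginx
  rules:
    - host: app1.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1-service
                port:
                  number: 80
</code></pre>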
<h3 id="heading-step-6-nginx-forwards-to-the-correct-service"><strong>Step 6: nginx forwards to the correct Service</strong></h3>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-string">app1-service</span> <span class="hljs-string">→</span> <span class="hljs-string">Pod(s)</span>
</code></pre>
<h3 id="heading-step-7-pod-responds"><strong>Step 7: Pod responds</strong></h3>
<p>Traffic flows back through the same path to the client.</p>
<hr />
<h2 id="heading-why-ingress-saves-ip-addresses"><strong>Why Ingress Saves IP Addresses</strong></h2>
<p>Without ingress:</p>
<ul>
<li>10 Services exposed externally = 10 LoadBalancer IPs consumed</li>
</ul>
<p>With ingress:</p>
<ul>
<li><p>10 Services exposed externally = 1 LoadBalancer IP used</p>
</li>
<li><p>nginx handles routing internally</p>
</li>
</ul>
<p>This is the reason most production setups use ingress controllers.</p>
<hr />
<h2 id="heading-practical-example-ip-planning-home-lab"><strong>Practical Example: IP Planning (Home Lab)</strong></h2>
<p>Suppose your subnet is:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.0</span><span class="hljs-string">/24</span>
</code></pre>
<p>Your router uses:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.1</span>
</code></pre>
<p>Your DHCP range:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.50</span> <span class="hljs-bullet">-</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.150</span>
</code></pre>
<p>You can choose:</p>
<pre><code class="lang-yaml"><span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span> <span class="hljs-bullet">-</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.220</span>
</code></pre>
<p>This is:</p>
<ul>
<li><p>outside the DHCP range</p>
</li>
<li><p>inside the same subnet</p>
</li>
<li><p>safe to use for MetalLB</p>
</li>
</ul>
<p>This gives 21 external IPs.</p>
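<p>With MetalLB installed, this pool is declared through its CRDs. A sketch (the resource names are arbitrary; the namespace follows MetalLB's default install):</p>
<pre><code class="lang-yaml">apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: homelab-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.29.200-192.168.29.220
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: homelab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - homelab-pool
</code></pre>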
<hr />
<h2 id="heading-flow-and-architecture"><strong>Flow and Architecture</strong></h2>
<h3 id="heading-a-network-overview"><strong>A. Network overview</strong></h3>
<pre><code class="lang-yaml">                 <span class="hljs-string">+----------------------+</span>
                 <span class="hljs-string">|</span>    <span class="hljs-string">Your</span> <span class="hljs-string">Router</span>       <span class="hljs-string">|
                 |   192.168.29.1       |
                 +----------+-----------+
                            |
                            |
                 Local Network (L2)
                            |
</span>           <span class="hljs-string">-----------------------------------</span>
           <span class="hljs-string">|</span>                 <span class="hljs-string">|</span>               <span class="hljs-string">|
    +-------------+   +-------------+   +-------------+
    | Kubernetes  |   | Kubernetes  |   | Kubernetes  |
    |   Node 1    |   |   Node 2    |   |   Node 3    |
    |192.168.29.10|   |192.168.29.11|   |192.168.29.12|
    +------+------+   +------+------+   +------+------+
           |                 |               |
        MetalLB Speaker on all nodes
           |                 |               |
       Announces IPs such as 192.168.29.200</span>
</code></pre>
<h3 id="heading-b-ingress-routing"><strong>B. Ingress routing</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-string">Client</span> <span class="hljs-string">→</span> <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.200</span> <span class="hljs-string">→</span> <span class="hljs-string">ingress-nginx</span> <span class="hljs-string">→</span> <span class="hljs-string">app1-service</span> <span class="hljs-string">→</span> <span class="hljs-string">app1</span> <span class="hljs-string">Pods</span>
                                              <span class="hljs-string">↳</span> <span class="hljs-string">app2-service</span> <span class="hljs-string">→</span> <span class="hljs-string">app2</span> <span class="hljs-string">Pods</span>
                                              <span class="hljs-string">↳</span> <span class="hljs-string">grafana-service</span> <span class="hljs-string">→</span> <span class="hljs-string">grafana</span> <span class="hljs-string">Pods</span>
</code></pre>
<hr />
<h2 id="heading-understanding-load-balancers-in-general"><strong>Understanding Load Balancers in General</strong></h2>
<p>A load balancer distributes incoming network traffic across multiple targets.<br />There are two broad categories:</p>
<h3 id="heading-layer-4-load-balancers"><strong>Layer 4 Load Balancers</strong></h3>
<p>These operate at the connection level (TCP/UDP).<br />They see ports and IP addresses only.</p>
<p>Examples:</p>
<ul>
<li><p>MetalLB</p>
</li>
<li><p>AWS NLB</p>
</li>
<li><p>HAProxy in TCP mode</p>
</li>
</ul>
<h3 id="heading-layer-7-load-balancers"><strong>Layer 7 Load Balancers</strong></h3>
<p>These operate at the application layer (HTTP/S).<br />They understand paths, headers, cookies, hostnames.</p>
<p>Examples:</p>
<ul>
<li><p>Nginx Ingress Controller</p>
</li>
<li><p>Traefik</p>
</li>
<li><p>AWS ALB</p>
</li>
</ul>
<p>In Kubernetes, it is common to use both:</p>
<ul>
<li><p><strong>MetalLB for Layer 4 external IP allocation</strong></p>
</li>
<li><p><strong>Nginx for Layer 7 routing</strong></p>
</li>
</ul>
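<p>The two layers meet at the ingress controller's Service. A sketch of pinning that Service to a specific pool address (the names follow the upstream ingress-nginx install but may differ in yours; <code>spec.loadBalancerIP</code> is deprecated in recent Kubernetes releases in favour of a MetalLB annotation, yet still widely used):</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  loadBalancerIP: 192.168.29.200
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443
</code></pre>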
<hr />
<h2 id="heading-how-everything-connects-together"><strong>How Everything Connects Together</strong></h2>
<p>Here is the conceptual hierarchy:</p>
<h3 id="heading-level-0-the-network"><strong>Level 0 – The Network</strong></h3>
<ul>
<li><p>You have a subnet such as <code>192.168.29.0/24</code></p>
</li>
<li><p>The subnet defines the address space for your LAN</p>
</li>
<li><p>Nodes receive IPs from this range</p>
</li>
</ul>
<h3 id="heading-level-1-kubernetes"><strong>Level 1 – Kubernetes</strong></h3>
<ul>
<li><p>Pods get IPs from the Pod CIDR</p>
</li>
<li><p>Services get Cluster IPs</p>
</li>
<li><p>Nodes route traffic internally via CNI</p>
</li>
</ul>
<h3 id="heading-level-2-metallb"><strong>Level 2 – MetalLB</strong></h3>
<ul>
<li><p>Provides external IPs from a dedicated pool</p>
</li>
<li><p>These IPs map to LoadBalancer services</p>
</li>
<li><p>MetalLB advertises these IPs at Layer 2</p>
</li>
</ul>
<h3 id="heading-level-3-ingress"><strong>Level 3 – Ingress</strong></h3>
<ul>
<li><p>Receives HTTP/S traffic at the MetalLB IP</p>
</li>
<li><p>Routes requests to internal services and pods</p>
</li>
<li><p>Handles hostnames, TLS, etc.</p>
</li>
</ul>
<h3 id="heading-level-4-your-applications"><strong>Level 4 – Your Applications</strong></h3>
<ul>
<li>Finally receive traffic that originated outside the cluster</li>
</ul>
<p>This layered architecture is what makes Kubernetes networking powerful, scalable, and modular.</p>
<hr />
<h2 id="heading-summary-table"><strong>Summary Table</strong></h2>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Component</td><td>Purpose</td><td>IP Source</td><td>Layer</td></tr>
</thead>
<tbody>
<tr>
<td>Node IP</td><td>Identify physical machines</td><td>LAN/DHCP</td><td>Layer 3</td></tr>
<tr>
<td>Pod IP</td><td>Identify individual containers</td><td>Pod CIDR</td><td>Layer 3</td></tr>
<tr>
<td>Service IP</td><td>Internal virtual service endpoints</td><td>Cluster CIDR</td><td>Layer 3</td></tr>
<tr>
<td>MetalLB IP</td><td>External access IPs for services</td><td>MetalLB pool</td><td>Layer 2</td></tr>
<tr>
<td>Ingress Controller</td><td>Routes HTTP/S traffic</td><td>Behind MetalLB</td><td>Layer 7</td></tr>
</tbody>
</table>
</div><h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Kubernetes networking is far easier to understand when each layer is viewed separately and then combined into a complete model. Nodes receive IPs from your network. Pods receive internal IPs from Kubernetes. Services act as stable access points. Load balancers provide external connectivity. MetalLB brings cloud-style load balancers to bare-metal clusters. It does not limit how many nodes you can have. Ingress controllers consolidate routing so that many applications can share one external IP. Your IP pool only limits the number of <strong>LoadBalancer services</strong> you can expose, not the number of worker nodes, pods, or applications.</p>
<p>With these concepts understood together, you gain complete control over how your workloads are exposed and how your cluster interacts with the outside world.</p>
]]></content:encoded></item><item><title><![CDATA[Building Your Own Home Kubernetes Cluster with k0s and Remote Access]]></title><description><![CDATA[Kubernetes is the powerhouse of modern container orchestration, but setting it up at home or on minimal infrastructure can feel daunting. In this blog, I will walk you through creating a lightweight, fully functional Kubernetes cluster using k0s, com...]]></description><link>https://blog.nyzex.in/building-your-own-home-kubernetes-cluster-with-k0s-and-remote-access</link><guid isPermaLink="true">https://blog.nyzex.in/building-your-own-home-kubernetes-cluster-with-k0s-and-remote-access</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[Homelab]]></category><category><![CDATA[tunneling]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 17 Nov 2025 21:45:48 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763415932228/ac277a71-3ef8-473e-9329-e84f0f82e670.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Kubernetes is the powerhouse of modern container orchestration, but setting it up at home or on minimal infrastructure can feel daunting. In this blog, I will walk you through creating a <strong>lightweight, fully functional Kubernetes cluster</strong> using <strong>k0s</strong>, complete with a <strong>control plane and a worker node</strong>, and make it accessible <strong>remotely via a Pangolin tunnel</strong>.</p>
<p>By the end, you will have a cluster you can experiment on from anywhere.</p>
<hr />
<h2 id="heading-why-k0s"><strong>Why k0s?</strong></h2>
<p>k0s is a <strong>lightweight, all-in-one Kubernetes distribution</strong> that simplifies the setup process:</p>
<ul>
<li><p>Single binary for control plane and worker.</p>
</li>
<li><p>Minimal resource usage: ideal for home servers or VMs.</p>
</li>
<li><p>Easy to manage, yet fully compliant with Kubernetes APIs.</p>
</li>
<li><p>Perfect for learning, experimentation, or small production projects.</p>
</li>
</ul>
<p>This makes it ideal for our goal: a <strong>home lab cluster</strong> with remote access.</p>
<hr />
<h2 id="heading-step-1-setting-up-the-control-plane"><strong>Step 1: Setting Up the Control Plane</strong></h2>
<p>The control plane is the “brain” of the Kubernetes cluster: it manages nodes, schedules workloads, and exposes the API server.</p>
<h3 id="heading-install-k0s"><strong>Install k0s</strong></h3>
<pre><code class="lang-yaml"><span class="hljs-string">sudo</span> <span class="hljs-string">apt</span> <span class="hljs-string">update</span> <span class="hljs-string">&amp;&amp;</span> <span class="hljs-string">sudo</span> <span class="hljs-string">apt</span> <span class="hljs-string">install</span> <span class="hljs-string">curl</span> <span class="hljs-string">-y</span>
<span class="hljs-string">curl</span> <span class="hljs-string">-sSLf</span> <span class="hljs-string">https://get.k0s.sh</span> <span class="hljs-string">|</span> <span class="hljs-string">sudo</span> <span class="hljs-string">bash</span>
<span class="hljs-string">k0s</span> <span class="hljs-string">version</span>
</code></pre>
<p>Next, install the <strong>controller</strong> and start it:</p>
<pre><code class="lang-yaml"><span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">install</span> <span class="hljs-string">controller</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">start</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">status</span>
</code></pre>
<p><strong>Output example:</strong></p>
<pre><code class="lang-yaml"><span class="hljs-attr">Version:</span> <span class="hljs-string">v1.34.1+k0s.1</span>
<span class="hljs-attr">Role:</span> <span class="hljs-string">controller</span>
<span class="hljs-attr">Workloads:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">SingleNode:</span> <span class="hljs-literal">false</span>
</code></pre>
<p>This confirms the control plane is running.</p>
<hr />
<h2 id="heading-step-2-configure-kubectl-on-the-control-plane"><strong>Step 2: Configure kubectl on the Control Plane</strong></h2>
<p>To interact with Kubernetes, we need <strong>kubectl</strong>, the CLI tool.</p>
<ol>
<li>Generate your kubeconfig:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">mkdir</span> <span class="hljs-string">-p</span> <span class="hljs-string">~/.kube</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">kubeconfig</span> <span class="hljs-string">admin</span> <span class="hljs-string">&gt;</span> <span class="hljs-string">~/.kube/config</span>
<span class="hljs-string">chmod</span> <span class="hljs-number">600</span> <span class="hljs-string">~/.kube/config</span>
</code></pre>
<ol start="2">
<li>Install kubectl:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">curl</span> <span class="hljs-string">-LO</span> <span class="hljs-string">"https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">install</span> <span class="hljs-string">-o</span> <span class="hljs-string">root</span> <span class="hljs-string">-g</span> <span class="hljs-string">root</span> <span class="hljs-string">-m</span> <span class="hljs-number">0755 </span><span class="hljs-string">kubectl</span> <span class="hljs-string">/usr/local/bin/kubectl</span>
</code></pre>
<ol start="3">
<li>Verify the cluster is reachable:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">kubectl</span> <span class="hljs-string">get</span> <span class="hljs-string">nodes</span>
</code></pre>
<p>At this point, you have a <strong>single-node control plane</strong> ready.</p>
<hr />
<h2 id="heading-step-3-add-a-worker-node"><strong>Step 3: Add a Worker Node</strong></h2>
<p>The worker node is where your workloads (pods, deployments, services) will actually run.</p>
<ol>
<li>On the control plane, generate a token for the worker:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">token</span> <span class="hljs-string">create</span> <span class="hljs-string">--role=worker</span>
</code></pre>
<ol start="2">
<li>On the worker machine:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">sudo</span> <span class="hljs-string">apt</span> <span class="hljs-string">update</span> <span class="hljs-string">&amp;&amp;</span> <span class="hljs-string">sudo</span> <span class="hljs-string">apt</span> <span class="hljs-string">install</span> <span class="hljs-string">curl</span> <span class="hljs-string">-y</span>
<span class="hljs-string">curl</span> <span class="hljs-string">-sSLf</span> <span class="hljs-string">https://get.k0s.sh</span> <span class="hljs-string">|</span> <span class="hljs-string">sudo</span> <span class="hljs-string">bash</span>
<span class="hljs-string">nano</span> <span class="hljs-string">tokenfile</span>  <span class="hljs-comment"># paste the token from control plane</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">install</span> <span class="hljs-string">worker</span> <span class="hljs-string">--token-file</span> <span class="hljs-string">tokenfile</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">start</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">systemctl</span> <span class="hljs-string">enable</span> <span class="hljs-string">--now</span> <span class="hljs-string">k0sworker</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">journalctl</span> <span class="hljs-string">-fu</span> <span class="hljs-string">k0sworker</span>
</code></pre>
<ol start="3">
<li>Back on the control plane, verify the worker joined:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">kubectl</span> <span class="hljs-string">get</span> <span class="hljs-string">nodes</span> <span class="hljs-string">-o</span> <span class="hljs-string">wide</span>
</code></pre>
<p><strong>Output:</strong></p>
<pre><code class="lang-yaml"><span class="hljs-string">NAME</span>                <span class="hljs-string">STATUS</span>   <span class="hljs-string">ROLES</span>    <span class="hljs-string">AGE</span>   <span class="hljs-string">VERSION</span>       <span class="hljs-string">INTERNAL-IP</span>     <span class="hljs-string">EXTERNAL-IP</span>   <span class="hljs-string">OS-IMAGE</span>             <span class="hljs-string">KERNEL-VERSION</span>     <span class="hljs-string">CONTAINER-RUNTIME</span>
<span class="hljs-string">ubuntu-workernode</span>   <span class="hljs-string">Ready</span>    <span class="hljs-string">&lt;none&gt;</span>   <span class="hljs-string">20m</span>   <span class="hljs-string">v1.34.1+k0s</span>   <span class="hljs-number">192.168</span><span class="hljs-number">.29</span><span class="hljs-number">.39</span>   <span class="hljs-string">&lt;none&gt;</span>        <span class="hljs-string">Ubuntu</span> <span class="hljs-number">24.04</span><span class="hljs-number">.3</span> <span class="hljs-string">LTS</span>   <span class="hljs-number">6.8</span><span class="hljs-number">.0</span><span class="hljs-number">-87</span><span class="hljs-string">-generic</span>   <span class="hljs-string">containerd://1.7.28</span>
</code></pre>
<hr />
<h2 id="heading-step-4-accessing-your-cluster-remotely"><strong>Step 4: Accessing Your Cluster Remotely</strong></h2>
<p>One of the most exciting parts is accessing your cluster from <strong>outside your network</strong>. For this, we will use a <strong>Pangolin tunnel</strong> to expose the control plane.</p>
<p>You can follow my previous blog regarding Pangolin setup:</p>
<p><a target="_blank" href="https://blog.nyzex.in/self-hosting-pangolin-newt-on-your-own-server">https://blog.nyzex.in/self-hosting-pangolin-newt-on-your-own-server</a></p>
<ol>
<li>Copy your kubeconfig to your remote machine.</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-comment">#first create the kubeconfig file in the controlplane vm</span>
<span class="hljs-string">sudo</span> <span class="hljs-string">k0s</span> <span class="hljs-string">kubeconfig</span> <span class="hljs-string">admin</span> <span class="hljs-string">&gt;</span> <span class="hljs-string">~/.kube/config</span>
<span class="hljs-string">chmod</span> <span class="hljs-number">600</span> <span class="hljs-string">~/.kube/config</span>
<span class="hljs-string">cat</span> <span class="hljs-string">~/.kube/config</span>
</code></pre>
<ol start="2">
<li>Update the <code>server:</code> field to your Pangolin hostname (after copying this config to our remote machine):</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-attr">clusters:</span>
<span class="hljs-bullet">-</span> <span class="hljs-attr">cluster:</span>
    <span class="hljs-attr">server:</span> <span class="hljs-string">https://tunnel.nyzex.in:6443</span>
    <span class="hljs-attr">insecure-skip-tls-verify:</span> <span class="hljs-literal">true</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">local</span>
</code></pre>
<blockquote>
<p>Note: <code>insecure-skip-tls-verify: true</code> bypasses the TLS hostname check since our certificate is for internal names. This is fine for personal labs, but not recommended for production.</p>
</blockquote>
<ol start="3">
<li>Set your kubeconfig and verify:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">export</span> <span class="hljs-string">KUBECONFIG=$(pwd)/kubeconfig</span>
<span class="hljs-string">kubectl</span> <span class="hljs-string">get</span> <span class="hljs-string">nodes</span> <span class="hljs-string">-o</span> <span class="hljs-string">wide</span>
</code></pre>
<p>You should see both the <strong>control plane and worker node</strong>, now accessible remotely.</p>
<hr />
<h2 id="heading-step-5-how-it-works"><strong>Step 5: How It Works</strong></h2>
<p>Here’s a simple view of the setup:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763415103427/452915c0-479b-4c63-8d2c-28952ef56ac9.png" alt class="image--center mx-auto" /></p>
<ul>
<li><p><strong>Control Plane</strong>: API server and cluster management.</p>
</li>
<li><p><strong>Worker Node</strong>: Runs workloads.</p>
</li>
<li><p><strong>Remote Machine</strong>: Access via Pangolin tunnel.</p>
</li>
</ul>
<hr />
<h2 id="heading-what-to-do-next"><strong>What to do next?</strong></h2>
<ul>
<li><p>For production-grade security, generate a <strong>certificate that includes your external hostname</strong> instead of skipping TLS verification.</p>
</li>
<li><p>Add more workers to scale your cluster.</p>
</li>
<li><p>Deploy your first workloads and explore Kubernetes features.</p>
</li>
</ul>
<hr />
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>With a few steps, you now have a <strong>home Kubernetes lab</strong>:</p>
<ul>
<li><p>Control plane + worker node cluster.</p>
</li>
<li><p>Remote kubectl access via Pangolin tunnel.</p>
</li>
<li><p>Fully functional, ready to deploy workloads.</p>
</li>
</ul>
<p>This setup is perfect for experimenting with Kubernetes, testing CI/CD pipelines, or just learning cluster management hands-on.</p>
]]></content:encoded></item><item><title><![CDATA[Talos OS: A Hard Earned Understanding Of Storage, Certificates, And Access]]></title><description><![CDATA[Talos OS promises a fully immutable, API driven Kubernetes experience. It removes the idea of logging into nodes, changing files manually, or performing maintenance through traditional means. This design brings a high level of security and predictabi...]]></description><link>https://blog.nyzex.in/talos-os-a-hard-earned-understanding-of-storage-certificates-and-access</link><guid isPermaLink="true">https://blog.nyzex.in/talos-os-a-hard-earned-understanding-of-storage-certificates-and-access</guid><category><![CDATA[Homelab]]></category><category><![CDATA[Security]]></category><category><![CDATA[Devops]]></category><category><![CDATA[talos-linux]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[operating system]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Mon, 17 Nov 2025 13:54:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763387659592/7b24e21f-7a20-4508-afea-88f6e8f54988.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Talos OS promises a fully immutable, API driven Kubernetes experience. It removes the idea of logging into nodes, changing files manually, or performing maintenance through traditional means. This design brings a high level of security and predictability. It also brings a set of challenges that many users, including myself, only discover once Talos becomes part of a real cluster.</p>
<p>During my recent effort to run PostgreSQL on a Talos cluster, I experienced several failures, unexpected behaviours, and some difficult recovery situations. This post documents that entire experience. I want this to help anyone who is trying to use Talos in a small cluster or a homelab environment, because the learning curve is very steep.</p>
<h2 id="heading-understanding-why-storage-becomes-difficult"><strong>Understanding Why Storage Becomes Difficult</strong></h2>
<p>Talos is an immutable operating system. This sounds ideal until you try to mount storage. Many Kubernetes setups allow you to create directories directly on the node using simple commands. Talos does not allow this approach.</p>
<p>These are the limitations that immediately matter:</p>
<ul>
<li><p>You cannot create directories on the host manually</p>
</li>
<li><p>You cannot change permissions manually</p>
</li>
<li><p>You cannot rely on paths that do not already exist</p>
</li>
<li><p>You cannot SSH into the node to fix things</p>
</li>
<li><p>You cannot depend on anything that is not part of the machine configuration</p>
</li>
</ul>
<p>My PostgreSQL deployment required a PersistentVolume backed by local storage. I created a PersistentVolume that pointed to a path like <code>/var/lib/postgres</code> or <code>/mnt/postgres</code>. Each attempt failed with the same error.</p>
<pre><code class="lang-yaml"><span class="hljs-string">MountVolume.NewMounter</span> <span class="hljs-string">initialization</span> <span class="hljs-string">failed</span> <span class="hljs-string">for</span> <span class="hljs-string">volume</span> <span class="hljs-string">"pv-postgres"</span> <span class="hljs-string">:</span> <span class="hljs-string">path</span> <span class="hljs-string">"/var/lib/postgres"</span> <span class="hljs-string">does</span> <span class="hljs-string">not</span> <span class="hljs-string">exist</span>
</code></pre>
<p>Talos refuses to mount a path that does not exist. Since I was unable to create that directory manually, I needed a path that already existed.</p>
<p>The solution was surprisingly simple!</p>
<p>I used a directory that Talos already creates by default.</p>
<pre><code class="lang-yaml"><span class="hljs-string">/var/local</span>
</code></pre>
<p>The moment I pointed my PersistentVolume to <code>/var/local</code>, PostgreSQL started successfully. The directory existed, the kubelet accepted it, and the pod finally mounted the data volume.</p>
<p>This taught me the most important lesson about Talos and storage. Any stateful workload that needs a hostPath or a local PersistentVolume must rely on a path that exists at boot through the machine configuration. If the path is not created by Talos, then the pod will fail.</p>
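<p>If a dedicated directory is preferred over reusing <code>/var/local</code>, it can be declared in the Talos machine configuration as an extra kubelet mount so it exists at boot. A sketch based on the <code>machine.kubelet.extraMounts</code> field (the path name here is illustrative; check the Talos documentation for your version before applying):</p>
<pre><code class="lang-yaml">machine:
  kubelet:
    extraMounts:
      - destination: /var/mnt/postgres
        type: bind
        source: /var/mnt/postgres
        options:
          - bind
          - rshared
          - rw
</code></pre>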
<p>Here is the full PersistentVolume manifest pointing at <code>/var/local</code>:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolume</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">pv-postgres</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">localstorage</span>
  <span class="hljs-attr">capacity:</span>
    <span class="hljs-attr">storage:</span> <span class="hljs-string">10Gi</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">local:</span>
    <span class="hljs-attr">path:</span> <span class="hljs-string">/var/local</span>
  <span class="hljs-attr">nodeAffinity:</span>
    <span class="hljs-attr">required:</span>
      <span class="hljs-attr">nodeSelectorTerms:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">matchExpressions:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">key:</span> <span class="hljs-string">kubernetes.io/hostname</span>
          <span class="hljs-attr">operator:</span> <span class="hljs-string">In</span>
          <span class="hljs-attr">values:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">talos-1to-jsz</span>
</code></pre>
<p>The corresponding PVC was:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">data-postgresql</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">postgresql</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">localstorage</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">10Gi</span>
</code></pre>
<p>Then I applied my Deployment manifest, alongside its Service:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">postgresql</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">postgres:16</span>
          <span class="hljs-attr">imagePullPolicy:</span> <span class="hljs-string">IfNotPresent</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">5432</span>
          <span class="hljs-attr">env:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_DB</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">mydb</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_USER</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">myuser</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">POSTGRES_PASSWORD</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">mypassword</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-data</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/var/lib/postgresql/data</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres-data</span>
          <span class="hljs-attr">persistentVolumeClaim:</span>
            <span class="hljs-attr">claimName:</span> <span class="hljs-string">data-postgresql</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">postgresql</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">postgres</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">postgres</span>
      <span class="hljs-attr">protocol:</span> <span class="hljs-string">TCP</span>
      <span class="hljs-attr">port:</span> <span class="hljs-number">5432</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">5432</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
</code></pre>
<p>After this, the PostgreSQL pod successfully started and mounted the volume. I was able to connect to it using a test pod:</p>
<pre><code class="lang-yaml"><span class="hljs-string">kubectl</span> <span class="hljs-string">run</span> <span class="hljs-string">psql-test</span> <span class="hljs-string">\</span>
  <span class="hljs-string">--rm</span> <span class="hljs-string">-it</span> <span class="hljs-string">\</span>
  <span class="hljs-string">--image=postgres:16</span> <span class="hljs-string">\</span>
  <span class="hljs-string">--namespace</span> <span class="hljs-string">postgresql</span> <span class="hljs-string">\</span>
  <span class="hljs-string">--env="PGPASSWORD=mypassword"</span> <span class="hljs-string">\</span>
  <span class="hljs-string">--</span> <span class="hljs-string">psql</span> <span class="hljs-string">-h</span> <span class="hljs-string">postgres</span> <span class="hljs-string">-U</span> <span class="hljs-string">myuser</span> <span class="hljs-string">-d</span> <span class="hljs-string">mydb</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763387499925/0d219043-d1a7-4090-b3bb-70e79cfbdb79.png" alt class="image--center mx-auto" /></p>
<p>This experience taught me a critical lesson about Talos and storage: any stateful workload that needs a hostPath or local PV must use directories that either exist at boot or are created through the machine configuration. Pointing at arbitrary directories will fail. While <code>/var/local</code> worked, it is not a best practice and should be avoided in production.</p>
<h2 id="heading-why-using-existing-directories-like-varlocal-works-but-is-not-a-good-practice"><strong>Why Using Existing Directories Like</strong> <code>/var/local</code> Works But Is Not A Good Practice</h2>
<p>When I struggled to mount storage for PostgreSQL on Talos, I eventually discovered that <code>/var/local</code> already existed on every node. Talos creates this directory during early boot. As soon as I pointed my PersistentVolume to <code>/var/local</code>, the database pod started without any issues. The directory already existed, the kubelet was satisfied, and the pod finally mounted the data volume.</p>
<p>This seems like a convenient solution, but it is not a recommended approach. It introduces several long term problems and operational risks.</p>
<p>Here is why it is not a good practice.</p>
<h3 id="heading-1-this-directory-is-not-meant-for-application-data"><strong>1. This directory is not meant for application data</strong></h3>
<p><code>/var/local</code> is an internal Talos directory and is not designed for stateful workloads. Talos can modify or use this directory for its own purposes in future releases. Talos does not document or guarantee that this directory will always exist, or that it will behave the same way across upgrades.</p>
<p>You are depending on behaviour that is a side effect of the operating system rather than a stable feature.</p>
<h3 id="heading-2-all-applications-will-share-the-same-storage-location"><strong>2. All applications will share the same storage location</strong></h3>
<p>If you reuse <code>/var/local</code> for every PersistentVolume, then every stateful application on that node will write data into the same directory. This will cause:</p>
<ul>
<li><p>A lack of isolation between applications</p>
</li>
<li><p>Possible permission conflicts</p>
</li>
<li><p>A risk of one application filling the entire directory and breaking the others</p>
</li>
<li><p>Difficulty with debugging and storage visibility</p>
</li>
</ul>
<p>It also becomes impossible to safely delete or migrate individual application data.</p>
<h3 id="heading-3-talos-does-not-enforce-size-limits"><strong>3. Talos does not enforce size limits</strong></h3>
<p>Kubernetes does not enforce the size declared in a PersistentVolume. Since <code>/var/local</code> is just a directory, any application can exceed the declared ten gibibytes. There is no guarantee that the node will not run out of space.</p>
<p>In a worst case scenario, the node can crash due to disk pressure.</p>
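<p>Since neither Kubernetes nor Talos enforces the declared capacity, it is worth checking the real usage from time to time. A minimal sketch, assuming the PostgreSQL Deployment from the manifests above is named <code>postgres</code>:</p>
<pre><code class="lang-bash"># Actual space used by the data directory (not limited by the PV's declared size)
kubectl exec -n postgresql deploy/postgres -- du -sh /var/lib/postgresql/data

# Free space remaining on the node filesystem backing the volume
kubectl exec -n postgresql deploy/postgres -- df -h /var/lib/postgresql/data
</code></pre>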
<h3 id="heading-4-this-breaks-the-philosophy-of-talos"><strong>4. This breaks the philosophy of Talos</strong></h3>
<p>Talos is meant to be fully declarative. Anything that exists should be defined through the MachineConfig, not through accidental filesystem structure that happens to be present.</p>
<p>If the directory is not created explicitly by configuration, then it is not an intentional part of your infrastructure. It is risky to build application level storage on top of something that was not designed for this purpose.</p>
<h3 id="heading-5-upgrades-and-reinstallations-can-remove-the-directory"><strong>5. Upgrades and reinstallations can remove the directory</strong></h3>
<p>During major upgrades or node reinstalls, Talos can reset or restructure internal directories. If <code>/var/local</code> is removed, renamed, or reformatted, then every application that relies on it will lose its data.</p>
<p>Your data becomes fragile and tied to undocumented filesystem details.</p>
<hr />
<h2 id="heading-the-correct-talos-approved-way"><strong>The Correct Talos Approved Way</strong></h2>
<p>The recommended way to create storage paths in Talos is:</p>
<p><strong>Use a MachineConfig patch to create directories with explicit permissions and ownership.</strong></p>
<p>For example, you can declare this:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">machine:</span>
  <span class="hljs-attr">files:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/var/data/postgres</span>
      <span class="hljs-attr">permissions:</span> <span class="hljs-string">0o755</span>
      <span class="hljs-attr">owner:</span> <span class="hljs-number">0</span>
      <span class="hljs-attr">group:</span> <span class="hljs-number">0</span>
      <span class="hljs-attr">directory:</span> <span class="hljs-literal">true</span>
</code></pre>
<p>You can create as many directories as you need. This is the clean, controlled, and safe method.</p>
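<p>On a node that is already running, a patch like this can be applied live instead of regenerating the full <code>worker.yaml</code>. A sketch, assuming the patch above is saved as <code>dir-patch.yaml</code> (a filename of my choosing):</p>
<pre><code class="lang-bash"># Merge the patch into the node's active machine configuration
talosctl patch machineconfig -n &lt;node-ip&gt; --patch @dir-patch.yaml
</code></pre>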
<p>You then point your PersistentVolumes to these paths with confidence that:</p>
<ul>
<li><p>They will exist on every node</p>
</li>
<li><p>They will survive upgrades</p>
</li>
<li><p>They are dedicated to the correct application</p>
</li>
<li><p>They will not conflict with Talos internals</p>
</li>
</ul>
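<p>A local PersistentVolume built on such a declared path could then look like the sketch below. The names, StorageClass, and the 10Gi size are illustrative; replace <code>&lt;node-hostname&gt;</code> with your worker's hostname:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-postgres
spec:
  storageClassName: local-storage
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  local:
    path: /var/data/postgres   # the directory declared in the MachineConfig
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - &lt;node-hostname&gt;
</code></pre>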
<p>This is the maintainable long-term approach, but it was exactly here that I ran into the certificate error, which made life difficult!</p>
<h2 id="heading-expanding-persistent-volumes">Expanding Persistent Volumes</h2>
<p>In the example above, I claimed the full 10Gi of available storage on the node. If I want to expand storage later, I must first ensure more disk space is available at the path and then patch the PVC to request the larger size. Kubernetes will handle resizing if the StorageClass allows it.</p>
<p>For multiple services requiring persistent storage, I can technically reuse <code>/var/local</code>, but this is also not recommended. Each workload should ideally have its own dedicated storage path or volume managed through a proper storage provider.</p>
<hr />
<h2 id="heading-why-longhorn-does-not-work-easily"><strong>Why Longhorn Does Not Work Easily</strong></h2>
<p>Longhorn requires certain kernel modules, directory mounts, and filesystem behaviours that Talos does not support by default. Talos aims for very minimal host configuration. Longhorn expects the opposite. The two conflict in many ways.</p>
<p>As a result, most users who attempt Longhorn on Talos experience failures. This includes random crashes, volume mount issues, replica failures, and inability to start the Longhorn UI.</p>
<p>The safer alternative is to use Talos MachineConfig patches to create custom paths and then rely on local PersistentVolumes. This reduces flexibility but increases stability.</p>
<hr />
<h2 id="heading-the-strange-behaviour-of-persistentvolumes-in-talos"><strong>The Strange Behaviour Of PersistentVolumes In Talos</strong></h2>
<p>When I created a 10 GiB PersistentVolume and a matching PersistentVolumeClaim, it worked immediately on <code>/var/local</code>. This made me curious about what was actually happening.</p>
<p>Here is the explanation.</p>
<ul>
<li><p>The size declared in a PersistentVolume does not actually allocate disk space</p>
</li>
<li><p>The directory simply points to the host filesystem</p>
</li>
<li><p>Talos does not perform any reservations</p>
</li>
<li><p>Kubelet does not enforce storage consumption limits</p>
</li>
<li><p>The declared capacity only informs Kubernetes scheduling</p>
</li>
</ul>
<p>This means that even if you declare ten gibibytes, the actual host directory is unbounded. PostgreSQL can consume much more than the advertised size if it needs to. The responsibility of storage growth sits entirely on you as the administrator.</p>
<p>If you want to expand a PersistentVolume in Talos:</p>
<ol>
<li><p>You need a larger physical directory available</p>
</li>
<li><p>You resize the PersistentVolumeClaim</p>
</li>
<li><p>The underlying filesystem must support online expansion</p>
</li>
</ol>
<p>This works with filesystems like ext4 and xfs if configured correctly by the node.</p>
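<p>As a sketch of step 2: with a statically provisioned local PV there is no provisioner to react to the claim, so in practice the PV's declared capacity is usually edited alongside the claim. Kubernetes may also reject the PVC patch unless the StorageClass sets <code>allowVolumeExpansion: true</code>. The PV name and the 20Gi target below are illustrative:</p>
<pre><code class="lang-bash"># Raise the PV's advertised capacity, then the claim's request
kubectl patch pv pv-postgres -p '{"spec":{"capacity":{"storage":"20Gi"}}}'
kubectl patch pvc data-postgresql -n postgresql -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'
</code></pre>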
<hr />
<h2 id="heading-certificates-and-node-access">Certificates and Node Access</h2>
<p>Another issue I faced was related to Talos certificates. After initial node creation, when I tried to apply a new worker configuration (adding the <code>machine.files</code> entry that creates the directory), I received errors like:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">error applying new configuration: rpc error:</span> <span class="hljs-string">code</span> <span class="hljs-string">=</span> <span class="hljs-string">Unavailable</span> <span class="hljs-string">desc</span> <span class="hljs-string">=</span> <span class="hljs-attr">connection error:</span> <span class="hljs-string">desc</span> <span class="hljs-string">=</span> <span class="hljs-string">"transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority"</span>
</code></pre>
<p>Even though I used the same <code>worker.yaml</code> file, Talos refused the connection. The lesson here is that Talos tightly couples node certificates with the cluster PKI. If a certificate becomes invalid or untrusted, you can lose direct access.</p>
<p>There are several scenarios where this failure can occur:</p>
<ul>
<li><p>You regenerate a machine configuration file</p>
</li>
<li><p>You recreate a node with a slightly different configuration</p>
</li>
<li><p>You lose your talosconfig file</p>
</li>
<li><p>The cluster CA is overridden when bootstrap runs again</p>
</li>
<li><p>Node IP addresses change</p>
</li>
<li><p>You accidentally mix configuration files from different clusters</p>
</li>
</ul>
<p>When this happens, you cannot log into the node. You cannot fix files manually. You cannot mount a debug shell. This is both a security feature and a very serious operational risk.</p>
<p>This is the moment when Talos begins to feel unforgiving. A lost certificate means that you lose control of the node unless you have backups of your original configuration, certificates, and bootstrap secrets.</p>
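<p>Backing up those few files takes seconds and is the cheapest insurance Talos offers. A minimal sketch, assuming the default filenames produced by <code>talosctl gen config</code>:</p>
<pre><code class="lang-bash"># Keep this archive offline and out of version control - it grants full cluster access
tar czf talos-backup.tar.gz secrets.yaml talosconfig controlplane.yaml worker.yaml
</code></pre>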
<h2 id="heading-the-difficulty-of-troubleshooting-and-recovery"><strong>The Difficulty Of Troubleshooting And Recovery</strong></h2>
<p>Troubleshooting Talos is not like troubleshooting a normal Linux server. There is no SSH access, no direct shell, and no persistent filesystem to examine. Everything flows through the Talos API. If the Talos API is broken because of mismatched certificates, then you are locked out completely.</p>
<p>The only reliable recovery options are:</p>
<ul>
<li><p>Reboot into maintenance mode</p>
</li>
<li><p>Reapply a machine configuration</p>
</li>
<li><p>Restore backed up secrets</p>
</li>
<li><p>Reinstall the node if nothing else works</p>
</li>
</ul>
<p>This can feel very restrictive. It requires a mindset shift. Talos does not want administrators to fix issues manually. Talos wants everything to be declared in configuration files from the start.</p>
<p>This is powerful, but very easy to break if any detail is forgotten. And if you need to reset the node, consider your previous work gone :(</p>
<h3 id="heading-recovering-node-access">Recovering Node Access</h3>
<p>To regain access to a Talos node when the certificate fails:</p>
<ol>
<li>Apply the configuration insecurely (this works while the node is in maintenance mode):</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">talosctl</span> <span class="hljs-string">apply-config</span> <span class="hljs-string">--insecure</span> <span class="hljs-string">-n</span> <span class="hljs-string">&lt;node-ip&gt;</span> <span class="hljs-string">--file</span> <span class="hljs-string">worker.yaml</span>
</code></pre>
<ol start="2">
<li>Alternatively, retrieve a fresh Kubernetes kubeconfig from the node using:</li>
</ol>
<pre><code class="lang-yaml"><span class="hljs-string">talosctl</span> <span class="hljs-string">-n</span> <span class="hljs-string">&lt;node-ip&gt;</span> <span class="hljs-string">kubeconfig</span> <span class="hljs-string">-f</span> <span class="hljs-string">kubeconfig.yaml</span>
</code></pre>
<ol start="3">
<li><p>Use <code>talosctl --insecure</code> carefully, as it bypasses certificate validation.</p>
</li>
<li><p>Always back up your <code>talosconfig</code> files and certificates. Losing them makes access recovery difficult.</p>
</li>
</ol>
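<p>Once access is restored, it should be possible to talk to the node over mutual TLS again. A quick sanity check, assuming your restored <code>talosconfig</code> is in the working directory:</p>
<pre><code class="lang-bash">talosctl --talosconfig talosconfig -n &lt;node-ip&gt; version
talosctl --talosconfig talosconfig -n &lt;node-ip&gt; services
</code></pre>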
<h2 id="heading-key-lessons">Key Lessons</h2>
<ul>
<li><p>Talos does not allow arbitrary hostPath directories. Only paths present at boot or created via machine configuration are valid for PVs.</p>
</li>
<li><p>Use <code>/var/local</code> as a temporary solution, but do not rely on it for production workloads.</p>
</li>
<li><p>Always backup Talos certificates and configuration to avoid losing access.</p>
</li>
<li><p>Stateful workloads require careful planning of persistent storage in Talos.</p>
</li>
</ul>
<p>Talos is secure and minimal by design, but these features make working with storage and configuration more challenging than standard Linux nodes.</p>
<hr />
<h2 id="heading-my-conclusion-after-working-through-these-issues"><strong>My Conclusion After Working Through These Issues</strong></h2>
<p>Talos OS is impressive. It is secure and consistent. At the same time, it brings operational challenges that are not obvious until you experience them directly.</p>
<p>The three biggest issues I faced were:</p>
<ol>
<li><p>Storage paths that cannot be created manually</p>
</li>
<li><p>Certificates that break node access</p>
</li>
<li><p>Recovery procedures that depend entirely on correct configuration files</p>
</li>
</ol>
<p>Talos is ideal for large production environments where every configuration is version controlled, stable, and tested. It can be difficult for homelab environments where experimentation is common and nodes often change.</p>
<p>In the end I created working PersistentVolumes, understood the certificate failures, recovered node access, and built functional PostgreSQL storage. This journey helped me understand Talos in a much deeper way. I now appreciate how strict and predictable it is, even though that strictness caused many of the problems.</p>
<p>If you are planning to learn Talos, try to keep configuration backups from the very beginning. It will save you much more time later!</p>
<p>I am still learning, and perhaps there are simpler ways to run Talos in homelab experiments, but for now I will stick to more experiment-friendly Kubernetes distributions.</p>
]]></content:encoded></item><item><title><![CDATA[Running Jenkins on Kubernetes – Complete Setup Experience]]></title><description><![CDATA[Running Jenkins on Kubernetes is one of those tasks that seems simple in theory but teaches a lot once you go through it. I wanted to host Jenkins in my cluster for CI workloads and learn how it behaves with persistent volumes, service accounts, and ...]]></description><link>https://blog.nyzex.in/running-jenkins-on-kubernetes-complete-setup-experience</link><guid isPermaLink="true">https://blog.nyzex.in/running-jenkins-on-kubernetes-complete-setup-experience</guid><category><![CDATA[ci-cd]]></category><category><![CDATA[Jenkins]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[talos-linux]]></category><category><![CDATA[pangolin]]></category><dc:creator><![CDATA[Sanjeev Kumar Bharadwaj]]></dc:creator><pubDate>Thu, 13 Nov 2025 18:22:13 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763057983747/6115e023-4ffa-4f89-b953-1644c6405112.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Running Jenkins on Kubernetes is one of those tasks that seems simple in theory but teaches a lot once you go through it. I wanted to host Jenkins in my cluster for CI workloads and learn how it behaves with persistent volumes, service accounts, and ingress exposure. This is how I set it up step by step.</p>
<hr />
<h3 id="heading-understanding-the-goal">Understanding the Goal</h3>
<p>My objective was clear:</p>
<ul>
<li><p>Deploy Jenkins in a dedicated namespace.</p>
</li>
<li><p>Persist Jenkins data using a local PersistentVolume.</p>
</li>
<li><p>Run Jenkins on a worker node, not on the control plane.</p>
</li>
<li><p>Expose Jenkins externally through an NGINX ingress using my domain.</p>
</li>
</ul>
<p>Since I already had an ingress setup with MetalLB and a domain configured through Pangolin tunnel, Jenkins exposure had to follow the same model as my existing services.</p>
<p>You can check out my previous blog for this setup!<br /><a target="_blank" href="https://blog.nyzex.in/exposing-kubernetes-services-over-the-internet-using-metallb-nginx-ingress-and-pangolin">https://blog.nyzex.in/exposing-kubernetes-services-over-the-internet-using-metallb-nginx-ingress-and-pangolin</a></p>
<hr />
<h3 id="heading-setting-up-the-namespace-and-storage">Setting up the Namespace and Storage</h3>
<p>The first step was to prepare the storage layer. Jenkins requires persistent data for plugins, jobs, and configurations, so I decided to use a <strong>local PersistentVolume</strong> that maps to a directory on one of my worker nodes.</p>
<p>Below is the YAML I used for the storage setup:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">kind:</span> <span class="hljs-string">StorageClass</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">storage.k8s.io/v1</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">local-storage</span>
<span class="hljs-attr">provisioner:</span> <span class="hljs-string">kubernetes.io/no-provisioner</span>
<span class="hljs-attr">volumeBindingMode:</span> <span class="hljs-string">WaitForFirstConsumer</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolume</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">pv-jenkins</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">type:</span> <span class="hljs-string">local</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">local-storage</span>
  <span class="hljs-attr">capacity:</span>
    <span class="hljs-attr">storage:</span> <span class="hljs-string">20Gi</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">local:</span>
    <span class="hljs-attr">path:</span> <span class="hljs-string">/mnt</span>
  <span class="hljs-attr">nodeAffinity:</span>
    <span class="hljs-attr">required:</span>
      <span class="hljs-attr">nodeSelectorTerms:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">matchExpressions:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">key:</span> <span class="hljs-string">kubernetes.io/hostname</span>
              <span class="hljs-attr">operator:</span> <span class="hljs-string">In</span>
              <span class="hljs-attr">values:</span>
                <span class="hljs-bullet">-</span> <span class="hljs-string">talos-1to-jsz</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">pvc-jenkins</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">storageClassName:</span> <span class="hljs-string">local-storage</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">10Gi</span>
</code></pre>
<p>Here, <code>talos-1to-jsz</code> is my worker node where the Jenkins pod must run. The PV uses the local path <code>/mnt</code>, and the PVC binds to it successfully. Using <code>WaitForFirstConsumer</code> ensures that the PVC only binds when the pod is scheduled.</p>
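<p>With <code>WaitForFirstConsumer</code>, a freshly created claim sitting in <code>Pending</code> is expected rather than an error; it binds only once the Jenkins pod is scheduled:</p>
<pre><code class="lang-bash"># STATUS stays Pending until the pod lands on talos-1to-jsz, then turns Bound
kubectl get pvc pvc-jenkins -n jenkins
kubectl get pv pv-jenkins
</code></pre>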
<hr />
<h3 id="heading-configuring-the-service-account-and-rbac">Configuring the Service Account and RBAC</h3>
<p>Jenkins often needs to interact with the Kubernetes API for jobs and dynamic agent provisioning. To make sure it had sufficient access, I created a service account with a ClusterRole and a ClusterRoleBinding.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">rbac.authorization.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ClusterRole</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">admin-jenkins</span>
<span class="hljs-attr">rules:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">apiGroups:</span> [<span class="hljs-string">""</span>]
    <span class="hljs-attr">resources:</span> [<span class="hljs-string">"*"</span>]
    <span class="hljs-attr">verbs:</span> [<span class="hljs-string">"*"</span>]
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ServiceAccount</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">admin-jenkins</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">rbac.authorization.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">ClusterRoleBinding</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">admin-jenkins</span>
<span class="hljs-attr">roleRef:</span>
  <span class="hljs-attr">apiGroup:</span> <span class="hljs-string">rbac.authorization.k8s.io</span>
  <span class="hljs-attr">kind:</span> <span class="hljs-string">ClusterRole</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">admin-jenkins</span>
<span class="hljs-attr">subjects:</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">kind:</span> <span class="hljs-string">ServiceAccount</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">admin-jenkins</span>
    <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
</code></pre>
<p>This gave Jenkins full administrative access to the cluster, which is acceptable for a controlled environment. In production, it is recommended to restrict permissions according to actual needs.</p>
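<p>As a sketch of what such restriction could look like: the Jenkins Kubernetes plugin mostly needs to manage agent pods in its own namespace, so a namespaced Role (bound with a RoleBinding instead of the ClusterRoleBinding above) with roughly these verbs is usually enough. The rule set here is an assumption to adapt, not a drop-in replacement:</p>
<pre><code class="lang-yaml">apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: jenkins-agents
  namespace: jenkins
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec", "pods/log"]
    verbs: ["get", "create", "watch"]
</code></pre>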
<hr />
<h3 id="heading-deploying-jenkins">Deploying Jenkins</h3>
<p>With storage and RBAC in place, I created the deployment for Jenkins. The image used was <code>jenkins/jenkins:lts</code>. I ensured that the pod always runs on the worker node and uses the persistent volume claim for <code>/var/jenkins_home</code>.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">deployment-jenkins</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">server-jenkins</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">server-jenkins</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">securityContext:</span>
        <span class="hljs-attr">fsGroup:</span> <span class="hljs-number">1000</span>
        <span class="hljs-attr">runAsUser:</span> <span class="hljs-number">1000</span>
      <span class="hljs-attr">serviceAccountName:</span> <span class="hljs-string">admin-jenkins</span>
      <span class="hljs-attr">nodeSelector:</span>
        <span class="hljs-attr">kubernetes.io/hostname:</span> <span class="hljs-string">talos-1to-jsz</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">deployment-jenkins</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">jenkins/jenkins:lts</span>
          <span class="hljs-attr">resources:</span>
            <span class="hljs-attr">limits:</span>
              <span class="hljs-attr">memory:</span> <span class="hljs-string">"2Gi"</span>
              <span class="hljs-attr">cpu:</span> <span class="hljs-string">"1000m"</span>
            <span class="hljs-attr">requests:</span>
              <span class="hljs-attr">memory:</span> <span class="hljs-string">"500Mi"</span>
              <span class="hljs-attr">cpu:</span> <span class="hljs-string">"500m"</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">httpport</span>
              <span class="hljs-attr">containerPort:</span> <span class="hljs-number">8080</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">jnlpport</span>
              <span class="hljs-attr">containerPort:</span> <span class="hljs-number">50000</span>
          <span class="hljs-attr">livenessProbe:</span>
            <span class="hljs-attr">httpGet:</span>
              <span class="hljs-attr">path:</span> <span class="hljs-string">"/login"</span>
              <span class="hljs-attr">port:</span> <span class="hljs-number">8080</span>
            <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">90</span>
            <span class="hljs-attr">periodSeconds:</span> <span class="hljs-number">10</span>
          <span class="hljs-attr">readinessProbe:</span>
            <span class="hljs-attr">httpGet:</span>
              <span class="hljs-attr">path:</span> <span class="hljs-string">"/login"</span>
              <span class="hljs-attr">port:</span> <span class="hljs-number">8080</span>
            <span class="hljs-attr">initialDelaySeconds:</span> <span class="hljs-number">60</span>
            <span class="hljs-attr">periodSeconds:</span> <span class="hljs-number">10</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">data-jenkins</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/var/jenkins_home</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">data-jenkins</span>
          <span class="hljs-attr">persistentVolumeClaim:</span>
            <span class="hljs-attr">claimName:</span> <span class="hljs-string">pvc-jenkins</span>
</code></pre>
<p>Once deployed, the pod initially went into a <strong>Pending</strong> state because the PersistentVolume node affinity did not match. After correcting the hostname to <code>talos-1to-jsz</code>, it started running successfully.</p>
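<p>The mismatch is easy to spot from the scheduler's events. When a pod sticks in <code>Pending</code>, these two commands usually name the exact constraint that failed (node affinity, in my case):</p>
<pre><code class="lang-bash">kubectl describe pod -n jenkins -l app=server-jenkins
kubectl get events -n jenkins --sort-by=.lastTimestamp
</code></pre>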
<hr />
<h3 id="heading-creating-the-service">Creating the Service</h3>
<p>To expose Jenkins internally, I created a simple ClusterIP service. This would later be used by the ingress controller.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">service-jenkins</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">server-jenkins</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">8080</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">8080</span>
</code></pre>
<p>At this point, I verified that the service correctly routed traffic to the pod. For a quick test, I used port forwarding:</p>
<pre><code class="lang-yaml"><span class="hljs-string">kubectl</span> <span class="hljs-string">port-forward</span> <span class="hljs-string">-n</span> <span class="hljs-string">jenkins</span> <span class="hljs-string">deployment/deployment-jenkins</span> <span class="hljs-number">8080</span><span class="hljs-string">:8080</span>
</code></pre>
<p>Opening <a target="_blank" href="http://localhost:8080"><code>http://localhost:8080</code></a> brought up the Jenkins setup page.</p>
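<p>If the setup page does not load, a quick way to confirm the Service selector actually matches the pod labels is to inspect the endpoints; an empty list usually means a selector/label mismatch:</p>
<pre><code class="lang-bash"># Should list the pod IP on port 8080; empty output means the selector matched no pods
kubectl get endpoints -n jenkins service-jenkins
</code></pre>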
<hr />
<h3 id="heading-unlocking-jenkins">Unlocking Jenkins</h3>
<p>During the first startup, Jenkins requires an administrator password stored inside the container. The message on the web interface pointed to the file <code>/var/jenkins_home/secrets/initialAdminPassword</code>. I retrieved it using:</p>
<pre><code class="lang-bash">kubectl exec -it -n jenkins deployment/deployment-jenkins -- cat /var/jenkins_home/secrets/initialAdminPassword
</code></pre>
<p>After entering this password in the web interface, Jenkins allowed me to continue the setup and install the recommended plugins.</p>
<hr />
<h3 id="heading-exposing-jenkins-through-ingress">Exposing Jenkins through Ingress</h3>
<p>Once Jenkins was fully functional, I exposed it externally using my NGINX ingress controller. Since I already had MetalLB and a working ingress for another service (kubenav), I followed the same pattern.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Ingress</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">jenkins</span>
  <span class="hljs-attr">namespace:</span> <span class="hljs-string">jenkins</span>
  <span class="hljs-attr">annotations:</span>
    <span class="hljs-attr">nginx.ingress.kubernetes.io/rewrite-target:</span> <span class="hljs-string">/</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">ingressClassName:</span> <span class="hljs-string">nginx</span>
  <span class="hljs-attr">rules:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">host:</span> <span class="hljs-string">jenkins.nyzex.in</span>
      <span class="hljs-attr">http:</span>
        <span class="hljs-attr">paths:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/</span>
            <span class="hljs-attr">pathType:</span> <span class="hljs-string">Prefix</span>
            <span class="hljs-attr">backend:</span>
              <span class="hljs-attr">service:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">service-jenkins</span>
                <span class="hljs-attr">port:</span>
                  <span class="hljs-attr">number:</span> <span class="hljs-number">8080</span>
</code></pre>
<p>After applying this and pointing my DNS entry <a target="_blank" href="http://jenkins.nyzex.in"><code>jenkins.nyzex.in</code></a> to the MetalLB IP of my ingress controller, I was able to access Jenkins directly at:</p>
<pre><code class="lang-plaintext">http://jenkins.nyzex.in
</code></pre>
<p>It loaded perfectly through the ingress, confirming that the setup worked as intended.</p>
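<p>Before updating DNS, it is worth confirming that the Ingress actually picked up the MetalLB address; the ADDRESS column should show the IP assigned to the ingress controller:</p>
<pre><code class="lang-bash"># ADDRESS should show the MetalLB IP of the ingress controller
kubectl get ingress -n jenkins jenkins
</code></pre>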
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763057090356/ffe983e5-075b-48fb-bb15-235000644da6.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763057104997/c8888f4c-6194-4033-a2cd-b9927e91ad76.png" alt class="image--center mx-auto" /></p>
<p>Jenkins is now ready to use!</p>
<hr />
<h3 id="heading-conclusion">Conclusion</h3>
<p>This exercise helped me understand how Jenkins interacts with Kubernetes components such as PersistentVolumes, ServiceAccounts, and Ingress controllers. It also emphasized the importance of node affinity in local storage setups, especially when working with Talos nodes.</p>
<p>With this configuration, Jenkins runs reliably on my worker node, stores its data persistently, and is accessible through my domain managed via MetalLB and Pangolin. The next step will be to integrate Jenkins with GitHub and container registries to build a complete CI workflow.</p>
]]></content:encoded></item></channel></rss>