
Apache Flink, Kubernetes, and How It Works


DevOps & Cloud Engineer — building scalable, automated, and intelligent systems. Developer of sorts | Automator | Innovator

Imagine a factory that processes raw materials into finished goods. Flink is that factory, and Kubernetes is the floor manager who decides where the machines go and how they run.


1. What is Apache Flink?

Apache Flink is a distributed data processing engine.

  • “Distributed” means it can run across multiple computers (nodes) at the same time.

  • “Data processing” means it takes data streams or batches and transforms them into results.

  • It can process data as it arrives (streaming) or process a fixed dataset (batch).

  • Flink is stateful and fault-tolerant: it remembers important information and can recover if a worker dies.

Example:

Like counting how many people enter a mall every minute:

  • Flink will keep a running total.

  • If a computer crashes, it will recover the total from where it left off.
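The idea can be sketched in a few lines of plain Python (a toy model, not Flink's actual API): keep a running total, periodically snapshot it, and roll back to the last snapshot after a crash.

```python
# Toy sketch of stateful, fault-tolerant counting (plain Python, not Flink).
class RunningCounter:
    def __init__(self):
        self.total = 0           # the "state" Flink would manage for you
        self.last_checkpoint = 0

    def process(self, people_entering):
        self.total += people_entering

    def checkpoint(self):
        # In Flink, this snapshot would be written to durable storage.
        self.last_checkpoint = self.total

    def recover(self):
        # After a crash, resume from the last completed checkpoint.
        self.total = self.last_checkpoint


counter = RunningCounter()
counter.process(5)
counter.process(3)
counter.checkpoint()   # total = 8 is now durable
counter.process(4)     # total = 12, but not yet checkpointed
counter.recover()      # simulate a crash: roll back to 8
print(counter.total)   # 8
```

The work done after the last checkpoint is lost on a crash; in a real Flink job, replayable sources (like Kafka) re-deliver those events so nothing is dropped.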


2. The two main roles in a Flink cluster

A Flink cluster has two main roles:

JobManager (JM): The Brain

  • The JobManager is like the manager of the factory.

  • Responsibilities:

    • Accept jobs (the “instructions” for the factory)

    • Split the job into smaller tasks

    • Decide which worker (TaskManager) will execute each task

    • Keep track of the state and checkpoints

    • Handle failure recovery

  • Usually runs as a single pod in Kubernetes

  • Needs persistent storage (a PVC) because it stores checkpoints and savepoints
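A minimal sketch of what this looks like in a standalone-Kubernetes setup (names, image version, and paths are illustrative, not from this post):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-jobmanager
spec:
  replicas: 1                        # a single JobManager
  selector:
    matchLabels:
      app: flink
      component: jobmanager
  template:
    metadata:
      labels:
        app: flink
        component: jobmanager
    spec:
      containers:
        - name: jobmanager
          image: flink:1.18          # illustrative version
          args: ["jobmanager"]
          volumeMounts:
            - name: flink-data
              mountPath: /flink-data # checkpoints/savepoints land here
      volumes:
        - name: flink-data
          persistentVolumeClaim:
            claimName: flink-pvc     # illustrative PVC name
```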


TaskManager (TM): The Worker

  • The TaskManager is like a worker machine on the factory floor.

  • Responsibilities:

    • Execute tasks assigned by the JobManager

    • Keep temporary state in memory or local disk (ephemeral)

    • Report back progress to JobManager

  • TaskManagers are stateless and ephemeral:

    • If a TaskManager dies, Kubernetes will restart it somewhere else

    • JobManager uses checkpoints to restore the state

  • Each TaskManager pod can have one or more slots (think “hands” to do work)


3. Slots: Hands of the TaskManager

  • Each TaskManager has slots, which are units of parallel work.

  • Each slot can run one subtask of a Flink job.

Analogy:

  • Imagine a worker has 2 hands → they can work on 2 small tasks at the same time

  • If you have 4 workers, each with 2 hands → 8 tasks can be worked on simultaneously

  • Slots allow Flink to divide work and control resources (memory, CPU) per task
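The number of "hands" per worker is a Flink configuration value; a fragment of flink-conf.yaml (the values are illustrative):

```yaml
# flink-conf.yaml (fragment)
taskmanager.numberOfTaskSlots: 2   # "hands" per TaskManager
parallelism.default: 2             # used when no -p is given at submission
```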


4. Parallelism: How Many Hands Work

Parallelism is how many subtasks a job is divided into.

  • Each subtask needs one slot.

  • Maximum parallelism = total slots in cluster

Example:

  Component     Pods   Slots per pod   Total slots
  JobManager    1      0               0
  TaskManager   4      2               8
  • parallelism.default = 2 → the job runs 2 subtasks if no -p flag is given at submission

  • Job parallelism = 6 → 6 subtasks run, distributed across the 4 TaskManagers

  • Job parallelism = 10 → only 8 subtasks can run at once (total slots), 2 wait
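The arithmetic above can be sketched directly (plain Python, using the numbers from the table and this post's simplified "extra subtasks wait" model):

```python
# Slots vs. parallelism arithmetic from the table above.
taskmanagers = 4
slots_per_tm = 2
total_slots = taskmanagers * slots_per_tm   # 8

def schedule(parallelism, total_slots):
    """Return (running, waiting) subtask counts for a requested parallelism."""
    running = min(parallelism, total_slots)
    waiting = max(0, parallelism - total_slots)
    return running, waiting

print(schedule(2, total_slots))    # (2, 0)  — parallelism.default
print(schedule(6, total_slots))    # (6, 0)  — spread across the 4 TMs
print(schedule(10, total_slots))   # (8, 2)  — 8 run, 2 wait for a free slot
```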


5. How TaskManagers store data

  • TaskManagers are ephemeral; they use local memory or local disk for temporary state.

  • They do NOT use a PVC. If a TaskManager dies, its local data is lost.

  • JobManager + PVC is where durable state lives:

    • Checkpoints

    • Savepoints

Checkpoint flow:

  1. JobManager triggers a checkpoint and asks TaskManagers to snapshot their local state

  2. TaskManagers write their snapshots to the checkpoint storage on the PVC and acknowledge to the JobManager

  3. JobManager records the completed checkpoint's metadata in the same persistent storage

  4. If a TM dies, it is restarted → its state is restored from the last completed checkpoint
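In configuration terms, the checkpoint directory is pointed at the PVC-backed path. A flink-conf.yaml fragment (path and interval are illustrative):

```yaml
# flink-conf.yaml (fragment)
state.checkpoints.dir: file:///flink-data/checkpoints   # PVC-backed path
state.savepoints.dir: file:///flink-data/savepoints
execution.checkpointing.interval: 60s                   # snapshot every minute
```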


6. How Kubernetes schedules TaskManagers

As an example, consider this scenario:

  • You requested 4 TaskManager pods

  • Kubernetes decides which node each pod runs on

  • You added soft anti-affinity → Kubernetes tries to spread them across the available nodes (here, two)

Example of your cluster after scheduling:

  TaskManager   Node
  TM1           node1
  TM2           node1
  TM3           node2
  TM4           node2
  • JobManager pod runs on node1, mounts PVC for persistent storage

  • TaskManagers use local ephemeral storage
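The soft anti-affinity mentioned above might look like this in the TaskManager pod template (a sketch; the labels are illustrative):

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:   # "soft": a preference, not a rule
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname          # spread across nodes
          labelSelector:
            matchLabels:
              app: flink
              component: taskmanager
```

Because it is a preference rather than a hard requirement, Kubernetes will still co-locate TaskManagers if spreading them is impossible.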


7. Step-by-step of job execution

  1. Submit a job (example: WordCount)

  2. JobManager splits the job into subtasks according to parallelism

  3. Assigns subtasks to TaskManager slots

  4. TaskManagers execute tasks, keep temporary state

  5. Periodically, TaskManagers snapshot their state to the PVC-backed checkpoint directory, coordinated by the JobManager

  6. If a TaskManager dies → Kubernetes restarts pod → JobManager restores state

  7. Job continues processing seamlessly


8. Example analogy

  • JobManager → Manager in a factory

  • TaskManagers → Workers on the floor

  • Slots → Worker’s hands

  • Parallelism → Number of hands used on a job

  • PVC → Manager’s filing cabinet with important records

  • TaskManager ephemeral storage → Paper on worker’s desk (temporary, lost if worker leaves)


Key takeaways

  • Flink separates computation (TaskManagers) from durable state (checkpoints on the JobManager's PVC)

  • Slots control parallelism at runtime

  • TaskManagers are disposable; Kubernetes can reschedule them anywhere

  • JobManager is critical: the PVC ensures the job can recover if TMs die

  • parallelism.default is just a default; maximum parallelism is capped by the total slots in the cluster
