A CPU Story: Why must a function return?
Why is there a return at the end of a void function? Because return is not about values — it restores the CPU’s saved address. It’s the shadow of computation theory on silicon. From FSMs to transistor rows, from the physical ceiling of stack overflow to the compiler’s quiet inlining tricks, we unpack what really hides under that single keyword.
Intro: What does return actually return? #
Have you ever stopped and asked yourself: “Why am I even writing return?”
On the surface it’s trivial. “Return returns a value, who cares.” You’ve been doing it for years; it’s muscle memory. But think back to the last time you wrote return at the end of a function that doesn’t really return anything. That tiny moment of hesitation (“What value am I returning here, exactly?”) probably flashed by and got suppressed under “the language wants it, whatever”.
If we phrase the question a bit more precisely:
Why does every function have to return in some way?
In some languages you don’t even write the keyword; the compiler just injects a ghost return at the end of the block. With void functions things get even weirder: the type literally says “nothing”, yet the function must still return. So what is actually leaving the function?
What leaves is not a value; it’s where the CPU was.
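The “ghost return” mentioned above is easy to observe in a bytecode-compiled language. The sketch below is only an illustration, and it inspects Python bytecode rather than CPU instructions. The names log_message and return_opcodes are hypothetical. The function body has no return statement, yet the compiled code still contains a return instruction.

```python
import dis


def log_message(text):
    # No return statement anywhere in this body.
    _ = text.upper()


def return_opcodes(func):
    """List the names of all return-style instructions in func's bytecode."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("RETURN")
    ]


# The exact opcode name depends on the Python version,
# so no expected output is shown here.
print(return_opcodes(log_message))
print(log_message("hello"))  # the implicit return hands back None
```

The interpreter plays the role of the CPU here: even a function that “returns nothing” needs an instruction that hands control back to the caller.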
Every program running on your machine has to live in RAM: the OS, your IDE, the browser, your scripts, your test runner. They all float in the same physical memory; virtual memory (an OS mechanism that makes physical RAM appear as an isolated address space to each program) and isolation are “how we organize the ocean”, not separate universes. In this sea of instructions the CPU only has one compass: the program counter (PC), a special register that stores the address of the next instruction to execute. The CPU reads the instruction at that address, then updates the PC to point at whatever comes next. Its entire job is to answer a simple question: “Which address in memory do I execute next?”
When you call a function, the CPU must remember whichever address that compass was pointing to, because once the function finishes it has to jump back and continue from there. This is the real meaning of return:
A moment ago you were at this address in memory — go back there.
Whether or not you return a value, at the hardware level every call eventually “returns” exactly one thing: the address the CPU needs in order to take its next step.
That pushes us from “why does a function exist?” into “how do we model computation?”.
From computation’s point of view: FSM and “where you left off” #
This story doesn’t start in hardware. Long before computers, we were using the same idea: computation is just moving from one state to another.
Even on paper, doing basic arithmetic, you’re essentially saying: “I was at 7, I added 3, now I’m at 10.” This view is called a finite state machine (FSM): a theoretical model that treats computation as transitions between states. There’s an initial state, a sequence of intermediate states, and eventually a halting state where you’re done.
You can look at a program the same way: each line, each branch, each function call moves you into a new state. A function is a smaller machine embedded in that chain: it has its own initial state, its own internal steps, and a point where it stops and hands you back to the outer machine — to where you left off.
- In theory, “where you left off” is the caller FSM’s next state.
- In the CPU, “where you left off” is the caller’s next instruction address.
That’s why return is more than “produce a value”: in the FSM world it means “go back to the caller’s next state”, in the hardware world it means “restore the caller’s PC”. It’s not a convention the language designers made up on a whim; it’s what you get when you push computation theory all the way down into CPU registers.
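To make “where you left off” concrete, here is a minimal sketch of one state machine nested inside another. The state names and the CALLER / CALLEE tables are invented for illustration; the only point is that the caller records its next state before the inner machine starts, and resumes there once the inner machine halts.

```python
# Hypothetical transition tables: state -> next state.
CALLER = {
    "start": "call_add",
    "call_add": "use_result",  # the state to resume in after the callee halts
    "use_result": "halt",
}
CALLEE = {"entry": "compute", "compute": "halt"}


def run_callee(trace):
    """Run the inner machine from its initial state to its halting state."""
    state = "entry"
    while state != "halt":
        trace.append("callee:" + state)
        state = CALLEE[state]


def run_caller():
    """Run the outer machine, diving into the inner one at call_add."""
    trace = []
    state = "start"
    while state != "halt":
        trace.append("caller:" + state)
        if state == "call_add":
            resume_at = CALLER[state]  # remember where we left off
            run_callee(trace)          # the inner machine runs to its halt
            state = resume_at          # "return": continue in the saved state
        else:
            state = CALLER[state]
    return trace


print(run_caller())
# ['caller:start', 'caller:call_add', 'callee:entry', 'callee:compute', 'caller:use_result']
```

The variable resume_at is the theoretical counterpart of the return address: without it, the outer machine would have no way to continue after the inner one stops.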
From the CPU’s point of view: PC and call #
The CPU’s world is much simpler than ours. It really only cares about one thing: the program counter (PC). The PC stores a single memory address. For every instruction it runs the same ritual:
- Fetch the instruction at the address in PC.
- Decode and execute it.
- Compute the “next” PC.
Those three steps are the common heartbeat of every CPU that’s ever shipped — your modern x86-64, and the old Z80 alike.
Look at a classic block diagram and the names become concrete: program counter, stack pointer, register file, the buses between them. They’re not just boxes in a slide; they occupy literal area on the die.

For ordinary instructions, “next” basically means “current address + instruction size”. Branches and function calls change the game: “next” is no longer the neighbor — it’s somewhere else entirely. A function is really just a pattern that makes this jump reusable and predictable.
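The three-step ritual and the “next PC” rule can be sketched as a toy interpreter. This is not a real ISA: the opcodes ADD, JMP and HALT, the single accumulator, and the one-slot-per-instruction addressing are all assumptions made to keep the example short.

```python
def run_toy_cpu(program, max_steps=100):
    """Run a toy program: a list of (opcode, operand) tuples, one per address."""
    pc = 0        # program counter: address of the next instruction
    acc = 0       # a single accumulator register
    visited = []  # addresses in the order they were fetched
    for _ in range(max_steps):
        opcode, operand = program[pc]  # 1. fetch
        visited.append(pc)
        next_pc = pc + 1               # default "next": the neighbouring instruction
        if opcode == "ADD":            # 2. decode and execute
            acc += operand
        elif opcode == "JMP":
            next_pc = operand          # a branch overrides the default
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
        pc = next_pc                   # 3. commit the next PC
    return acc, visited


program = [
    ("ADD", 7),      # address 0
    ("JMP", 3),      # address 1: skip address 2
    ("ADD", 100),    # address 2: never executed
    ("ADD", 3),      # address 3
    ("HALT", None),  # address 4
]
print(run_toy_cpu(program))  # (10, [0, 1, 3, 4])
```

Every instruction except JMP leaves next_pc at its default; the jump is the only place where “next” stops meaning “the neighbor”.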
The compiler takes your func add(a, b) and turns it into machine code at some address in memory. The CPU neither knows nor cares about the name add; the only thing it knows is something like: “this function starts at 0x400580.”
When a call add executes, the CPU does three things:
- It computes the address of the next instruction (the one right after the call) and treats that as the return address: the place where execution continues once the function finishes.
- It writes that address somewhere: on x86 and x86-64, to the top of the stack; on ARM, RISC-V and most other RISC ISAs, into a dedicated link register.
- It sets PC to the entry address of add.
You just wrote add(2, 3) in source. Underneath, what actually happened is:
There’s a call at this address. I saved the address of the next instruction as the return address. I set PC to 0x400580. We’re going in.
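As a worked example of the arithmetic, consider the common x86-64 near call, which is 5 bytes long: the opcode byte E8 followed by a 4-byte signed displacement relative to the next instruction. The call site address 0x400530 below is an assumption chosen for illustration; only the entry address 0x400580 comes from the text.

```python
CALL_REL32_SIZE = 5  # x86-64 near call: opcode E8 + 4-byte signed displacement


def decode_call(call_addr, rel32):
    """Return (return_address, target) for a near call located at call_addr."""
    return_addr = call_addr + CALL_REL32_SIZE  # the instruction right after the call
    target = return_addr + rel32               # displacement is relative to that address
    return return_addr, target


# Hypothetical call site at 0x400530 whose displacement reaches add at 0x400580.
return_addr, target = decode_call(0x400530, 0x4B)
print(hex(return_addr), hex(target))  # 0x400535 0x400580
```

The same number, 0x400535, is both the base for the jump target and the value that gets saved: the return address is simply “where the PC would have gone anyway”.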

Stack pointer: the parking slot for return addresses #
Time to bring the stack pointer (SP) on stage. SP is a special CPU register that points at the top of the stack: a LIFO (last in, first out) region in RAM where return addresses and local variables are stored during function calls. When a call runs on an architecture like x86-64, the sequence looks like this:
- The CPU reads the address of the next instruction.
- It moves SP “down” one step (on most architectures the stack grows downward).
- It writes that return address at the new top of the stack.
- It sets PC to the function’s entry address.
Each call pushes another return address (and usually some frame data) onto the stack; each ret pops the topmost return address back into the PC and moves SP back up. Stack accesses tend to be fast precisely because of this: everything happens at the top, so the same few cache lines stay hot.
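The push and pop discipline described above can be sketched with a toy machine. Everything here is an assumption for illustration: the opcodes CALL, RET, NOP and HALT, the 16-slot stack, and the one-slot-per-address layout. Real hardware moves SP by the size of an address in bytes, not by one slot.

```python
def run_with_stack(program, stack_slots=16, max_steps=100):
    """Run a toy program and log how CALL and RET move PC and SP."""
    memory = [0] * stack_slots  # toy RAM used only for the stack
    pc = 0
    sp = stack_slots            # SP starts just past the highest stack slot
    log = []
    for _ in range(max_steps):
        op, arg = program[pc]
        if op == "CALL":
            return_addr = pc + 1      # address of the instruction after the call
            sp -= 1                   # the stack grows downward
            memory[sp] = return_addr  # write the return address at the new top
            log.append(("call", return_addr, sp))
            pc = arg                  # jump to the function's entry address
        elif op == "RET":
            return_addr = memory[sp]  # read the topmost return address
            sp += 1                   # move SP back up
            log.append(("ret", return_addr, sp))
            pc = return_addr          # restore the caller's PC
        elif op == "NOP":
            pc += 1
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return log, sp


nested_calls = [
    ("CALL", 3),     # 0: main calls outer
    ("NOP", None),   # 1: main resumes here
    ("HALT", None),  # 2
    ("CALL", 5),     # 3: outer calls inner
    ("RET", None),   # 4: outer returns to address 1
    ("NOP", None),   # 5: inner does some work
    ("RET", None),   # 6: inner returns to address 4
]
log, final_sp = run_with_stack(nested_calls)
print(log)
# [('call', 1, 15), ('call', 4, 14), ('ret', 4, 15), ('ret', 1, 16)]
print(final_sp)  # 16
```

The return addresses come back in the reverse order they were saved, and SP ends exactly where it started: that is the LIFO discipline in four log entries.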

The heap lives in a different universe. It is a dynamic memory region, allocated at runtime: the place for variable-sized, long-lived, messy objects that need to persist beyond a single function call. Pointers can connect the two worlds, but the heap doesn’t have anything like a strict “return address” discipline. The stack does: every call pushes a record on top, every ret pops from the top. There is no “remove from the middle”.

Stack overflow is what happens when that discipline hits its physical ceiling.
In practice it’s mostly one of two things: you forgot a base case in a recursive function, or your call chain goes so deep that the base case is effectively unreachable.
Each call adds another frame and another return address, SP marches further down every time, and at some point the stack pushes against the limit the OS reserved for this thread’s stack. That’s when the CPU/OS effectively says “that’s enough”, and your program crashes.
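A missing base case is easy to reproduce. The sketch below uses Python, where the interpreter enforces its own recursion limit and raises RecursionError before the OS-level stack limit is reached; in a language like C the same mistake typically ends in a segmentation fault instead. The function names are hypothetical.

```python
def descend(depth, seen):
    # Deliberately missing base case: nothing ever stops the recursion.
    seen.append(depth)
    return descend(depth + 1, seen)


def frames_before_overflow():
    """Recurse until the interpreter refuses, then report how deep we got."""
    seen = []
    try:
        descend(0, seen)
    except RecursionError:
        return len(seen)
    return None  # unreachable in practice: the recursion never ends on its own


# The exact number depends on the interpreter's recursion limit
# and on how deep the call stack already was, so none is shown here.
print(frames_before_overflow())
```

Every entry in seen corresponds to one frame that was pushed and never popped, which is exactly the pattern that exhausts a real stack.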
Keep that picture in the back of your mind — we’ll come back to it when we talk about the cost of functions.
If you never return: how “nothingness” actually works #
Once you’ve seen the call/ret mechanics, “What happens if I don’t return?” is no longer an abstract question.
When I say “if there’s no return, there’s only nothingness”, I mean something very specific: the CPU keeps ticking, keeps executing instructions, keeps updating PC — but the ret never runs.
The return address at the top of the stack never makes its way back into the PC.
The program is technically still “alive”, but the line after the call is gone for good.
Through the FSM lens, what you’ve done is this:
you abandoned the outer state machine (the caller) and decided to spin forever inside the inner one (the callee).
You’ve cut the link that should take you back to the caller’s next state. return is exactly the thing that re-opens that link.
The two classic shapes:
- An infinite loop inside the function: PC keeps bouncing between a handful of instructions, SP doesn’t move, and control never gets back to the line after the call. In a single-threaded program, everything “after” that call is effectively frozen.
- Recursion with no base case: each call writes a new return address onto the stack, SP moves down, and no ret ever consumes those addresses. Eventually you slam into the physical end of the stack region and get a stack overflow.
The choice of language doesn’t matter here. Go, C, Java, Python… if you call a function whose body is an infinite loop, the code after the call simply never runs. The CPU might be very busy executing instructions, but the PC never visits the “line after the call” again.
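The difference between the two shapes shows up in SP alone. The toy simulation below is a sketch under invented assumptions (a one-instruction program, an 8-slot stack, a fixed step budget so the example terminates); it tracks only SP and never executes a ret.

```python
def simulate(program, steps, stack_slots=8):
    """Run a toy program for a fixed number of steps, recording SP after each."""
    pc = 0
    sp = stack_slots
    sp_trace = []
    for _ in range(steps):
        op, target = program[pc]
        if op == "CALL":
            if sp == 0:
                return sp_trace, "stack overflow"  # no room for another return address
            sp -= 1      # a return address would be written here
            pc = target
        elif op == "JMP":
            pc = target  # SP is untouched by a plain jump
        else:
            raise ValueError(f"unknown opcode {op!r}")
        sp_trace.append(sp)
    return sp_trace, "still running"


infinite_loop = [("JMP", 0)]   # jumps to itself forever
no_base_case = [("CALL", 0)]   # calls itself forever

print(simulate(infinite_loop, steps=5))
# ([8, 8, 8, 8, 8], 'still running')
print(simulate(no_base_case, steps=20))
# ([7, 6, 5, 4, 3, 2, 1, 0], 'stack overflow')
```

The loop never ends but never grows; the recursion ends, violently, because every step consumes a slot that nothing gives back.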
Inside the compass: where PC and SP actually live #
So far we’ve talked about PC and SP as “special registers”.
On the theory side we’ve used FSMs; on the CPU side we’ve used call/ret + stack and drawn the whole picture at the ISA level.
Now let’s take one more step down and ask: where does this return necessity physically land on the die? Where are these state transitions actually stored?
This is why we peek at the die:
to see that return isn’t a purely syntactic rule the language enforces, but a concrete hardware behavior — specific rows in the register file and specific bytes in RAM changing in a disciplined pattern.
Keeping that in mind changes how you look at code: instead of “what’s the harm?”, you start asking “exactly which physical thing am I stressing here?” and you get better at predicting which abstractions will bite you later.
On diagrams we draw a box and label it “register file”. In a debugger we see RSP = 0x7ffeefbff5c0. Easy to think of those as just names. In reality these are physical structures living inside each core, constantly flipping bit patterns on every clock: rows of flip-flops etched into silicon, the register rows.
Zoom in a bit: under the package there’s the silicon die, the small slab of silicon on which the transistors and circuits are physically etched; on the die there are cores. Inside each core: cache, ALUs, branch units, and a block called the register file, the bank of fast internal registers. That’s the neighborhood where RAX, RBX, RSP and the PC all live. (Strictly speaking, many designs keep the PC in the fetch unit next door rather than in the general-purpose register file, but it is the same kind of storage.)

The register file is not one opaque box; it’s more like a grid of rows and columns. Each cell is a tiny flip-flop holding a single bit; each row is a 64‑bit register. The labels you see in ISA docs (RSP, RIP, RAX) are really row names on this grid.

When your debugger says RSP = 0x7ffeefbff5c0, all 64 flip-flops in that row are holding the bits of that address. When a call runs, the bit pattern in that row changes; when a ret runs, the pattern shifts back. Climbing one frame up the call stack literally means “this row of flip-flops changed back to its previous value”.
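To see what “the bit pattern in that row changes” means, the sketch below takes the RSP value from the debugger example and applies the one change a 64-bit x86-64 call is guaranteed to make: pushing an 8-byte return address, which lowers RSP by 8. Any extra space a function reserves for its locals is ignored here, and the helper names are hypothetical.

```python
RSP_BEFORE = 0x7FFEEFBFF5C0  # the value from the debugger example
RETURN_ADDR_SIZE = 8         # bytes pushed by a call in 64-bit mode


def bits(value):
    """Render a register value as the 64 bits held by its row of flip-flops."""
    return format(value, "064b")


def flipped_bits(before, after):
    """Count how many of the 64 flip-flops changed state."""
    return bin(before ^ after).count("1")


rsp_after_call = RSP_BEFORE - RETURN_ADDR_SIZE     # call pushes the return address
rsp_after_ret = rsp_after_call + RETURN_ADDR_SIZE  # ret pops it again

print(hex(rsp_after_call))                       # 0x7ffeefbff5b8
print(bits(RSP_BEFORE)[-12:])                    # 010111000000
print(bits(rsp_after_call)[-12:])                # 010110111000
print(flipped_bits(RSP_BEFORE, rsp_after_call))  # 4
print(rsp_after_ret == RSP_BEFORE)               # True
```

Four flip-flops toggle on the call, and the same four toggle back on the ret.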
This is why I don’t want to leave return at “because the language wants it”: the honest sentence is closer to “when this function finishes, the CPU must restore this particular register row with this address pattern”.
The real cost of a function, and the compiler’s quiet answer #
Let’s go back up one level: functions are not free abstractions.
Every call / ret comes with work attached:
- PC and SP have to change.
- A new stack frame has to be built: the block of data added to the stack for each call, holding the return address, local variables, and usually the caller’s saved frame pointer.
- A few cache lines are going to get hot or cold in the process.
If you slice your code into lots of tiny functions, what you’re really saying at the physical level is: “We’re going to jump PC and SP around more often, and we’re going to grow and shrink the stack more frequently.” In exchange you get readability, testability, reuse.
Looking at that, you might be tempted to say: “Fine, then I’ll just write everything in one giant function and never call anything.”
That has a cost too: a huge body that’s painful to reason about and debug. If you had to manually choose the sweet spot between “everything inline” and “everything tiny”, you’d be right to worry.
Modern compilers step in exactly here. They aggressively inline small, hot functions: instead of emitting an actual call, they paste the function body straight into the call site. call / ret vanish, SP and PC move less, the binary grows a bit, the hot path gets faster.
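Inlining is easiest to picture as a source-level transformation. The sketch below does by hand what an optimizing compiler for a language like C or Go might do automatically; it is an illustration of the idea, not a claim about what any particular compiler emits. The function names are hypothetical.

```python
def square(x):
    return x * x


def sum_of_squares_with_calls(values):
    total = 0
    for v in values:
        total += square(v)  # one call/return pair per element
    return total


def sum_of_squares_inlined(values):
    total = 0
    for v in values:
        total += v * v      # the body of square() pasted at the call site
    return total


data = [1, 2, 3, 4]
print(sum_of_squares_with_calls(data))  # 30
print(sum_of_squares_inlined(data))     # 30
```

Both versions compute the same result; the second simply has no call left to pay for, at the price of duplicating the body wherever it is used.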
The mental model I’d use:
- Don’t be afraid to factor code into meaningful functions.
- But don’t write a new function every other line either; pointless abstractions still bloat the stack in debug and in cold paths.
- The compiler will try to inline “small and hot” functions for you; when you overdo it, you end up paying the real call cost in the places where heuristics decide not to inline.
In debug builds inlining is usually disabled. That’s why you get a nice, clean call stack there, and a much messier picture in release builds: the optimizer is eating calls and spitting out straight-line instructions.
Where we ended up, and what comes next #
We went all the way down to the bottom: PC and SP as rows inside a register file, the relative positions of stack and heap in memory, and what stack overflow actually hits. Along the way we walked the same function call across four layers:
- Theory layer (FSM): computation as state→state transitions, and the function’s obligation to bring you back to “where you left off”.
- CPU layer (call and ret): call saves the address of the next instruction and jumps to the function’s entry point; ret pops that return address back into the program counter; PC and SP move accordingly.
- Hardware layer: actual register rows and flip-flops holding PC and SP.
- Compiler/developer layer: inlining, function cost, and how much abstraction to introduce.
After this, it’s hard for me to say “a function just returns” with a straight face. Every return you type nudges a specific row in the register file, touches specific bytes in RAM, and updates the little graph living inside the branch predictor.
I’d like you to leave this piece with one more question in mind:
If every core only has one PC and every thread has its own stack, how does parallelism actually work?
That’s the next story. We’ll talk about per‑thread stacks, per‑core register sets, how PC and SP are saved and restored during context switches, and where branch prediction and the Return Address Stack fit into all of this. In other words:
How can we be “in” so many functions at once?
We’ll answer that from the CPU’s side of the table.
Thanks #
UC Berkeley’s open course material is the backbone of this series. Computer architecture, CPU internals, stack/register/memory hierarchy — being able to learn this stuff for free, as someone who came to these details later in their career, is a huge privilege.
🎥 The lecture series that shaped this post:
UC Berkeley — Computer Architecture (YouTube playlist)