The Lost Art of Building a Computer from Scratch: Understanding Memory Management

Building a computer system from the ground up involves understanding fundamental concepts that bridge hardware and software. One of the most critical resources a computer system manages is its memory. Without effective memory management, programs would clash, data would be lost, and the system would quickly become unstable or unable to run more than one simple task. This resource explores the core principles and techniques behind memory management, providing the foundational knowledge needed to appreciate its role in system design.

1. What is Memory Management?

At its heart, memory management is the process of coordinating and optimizing the use of a computer's memory. When multiple programs (or even different parts of a single complex program) need to store data or instructions, they need specific locations in memory. Memory management is the system component responsible for:

  • Allocating Memory: Granting requests from programs for blocks of memory of a specific size.
  • Tracking Memory Usage: Keeping track of which parts of memory are currently in use and which are available (free).
  • Deallocating Memory: Reclaiming memory that is no longer needed by a program, making it available for future allocations.
  • Protecting Memory: Ensuring that one program or part of a program cannot accidentally (or intentionally) interfere with the memory used by another.

Memory Management: The system component responsible for dynamically allocating portions of computer memory to programs upon request and freeing it for reuse when no longer needed. This is also referred to as Dynamic Memory Management, Dynamic Storage Allocation, or Dynamic Memory Allocation.

In single-tasking, simple systems, memory management might be as basic as dividing memory into fixed regions. However, in any system designed to run multiple processes concurrently or handle data structures of varying and unpredictable sizes, dynamic memory management becomes essential. It allows the system to use its memory resources flexibly and efficiently.

Memory management can occur at different levels within a system:

  • Operating System Level: The OS manages memory for all running processes, ensuring isolation and fair sharing. This is necessary in multi-process environments.
  • Application Level: A program can manage memory within its own allocated space. This is often handled by runtime libraries or language-specific mechanisms.

Fundamentally, memory management can be categorized into two main approaches based on how memory is returned to the free pool:

  • Manual Memory Management: The programmer explicitly requests and releases memory.
  • Automatic Memory Management: The system or runtime environment automatically detects and reclaims memory that is no longer needed.

Let's delve into these categories.

2. Manual Memory Management

In manual memory management, the program explicitly asks the system for memory when it needs it and explicitly tells the system when it's done with that memory. This gives the programmer fine-grained control but also places a significant burden on them to manage memory correctly.

The primary area of memory used for dynamic allocation requests in manual systems is often called the heap or free store.

Heap / Free Store: A large pool of memory from which programs can request blocks of varying sizes during runtime.

When a program requests memory (e.g., "I need 100 bytes"), the memory manager searches the heap for an available block of at least that size. If found, a portion of that block is marked as "in use" and a reference (usually a pointer, which is a memory address) to that block is returned to the program. When the program is finished with the memory, it must inform the memory manager that the block is now "free" and can be reused.

In the C programming language, the standard library functions for manual heap allocation and deallocation are widely known:

#include <stdlib.h>

// Request 100 bytes from the heap
void* ptr = malloc(100);
if (ptr == NULL) {
    // Allocation failed: the heap could not satisfy the request
}

// ... use the memory pointed to by ptr ...

// Release the memory back to the heap
free(ptr);

  • malloc(size): Attempts to find and allocate a block of size bytes from the heap. Returns a pointer to the allocated memory or NULL if the request cannot be satisfied.
  • free(ptr): Takes a pointer previously returned by malloc (or similar functions like calloc, realloc) and marks the corresponding memory block as free and available for future allocations.

2.1 Challenges in Manual Memory Management

Implementing an effective manual memory manager is complex due to several issues:

  1. Fragmentation: As memory is allocated and freed over time, the free space in the heap can become broken up into many small, non-contiguous blocks.

    Fragmentation: A phenomenon where the available free memory in a system is divided into many small, separate blocks, making it difficult to satisfy large memory requests even if the total amount of free memory is sufficient.

    • External Fragmentation: This is the main type encountered with varying-size allocations. It occurs when there is enough total free memory, but it's scattered in chunks that are too small to fulfill a new allocation request. Imagine having 1000 bytes free, but it's in ten 100-byte chunks separated by allocated blocks. If a program requests 200 bytes, the request cannot be satisfied, even though 1000 bytes are free in total.
  2. Metadata Overhead: To manage the heap, the memory manager needs to store information about each allocated or free block (e.g., its size, status, pointers to the next block in a free list). This "metadata" consumes memory itself. For very small allocations, the metadata can be a significant percentage of the total memory requested.
  3. Memory Leaks: This is a major problem for programmers using manual management. A memory leak occurs when a program allocates memory but fails to free it when it is no longer needed. The memory manager thinks the memory is still in use, so it cannot be reallocated, effectively reducing the available free memory over time and potentially causing the program or system to run out of memory.
  4. Dangling Pointers: This occurs when memory is freed, but the program still holds a pointer to that now-invalidated memory location. Accessing a dangling pointer can lead to crashes, data corruption, or security vulnerabilities.
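
To make the last two pitfalls concrete, here is a minimal C sketch (the function names and the 64-byte size are purely illustrative):

#include <stdlib.h>
#include <string.h>

void leak_example(void) {
    char *buf = malloc(64);
    if (buf == NULL) return;
    strcpy(buf, "temporary data");
    // ... use buf ...
    // Returning without free(buf): the allocator still considers the block
    // "in use", so it can never be reused -- a memory leak.
}

void dangling_example(void) {
    char *buf = malloc(64);
    free(buf);                // block returned to the heap
    // buf still holds the old address (a "dangling pointer"). Uncommenting
    // the next line is undefined behavior: a crash or silent corruption.
    // buf[0] = 'x';
}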

2.2 Efficiency Considerations

The specific algorithm used by the memory manager to find, allocate, and free blocks significantly impacts performance. Simple algorithms might be fast but lead to high fragmentation. Complex algorithms might reduce fragmentation but introduce overhead in searching and managing free lists. Studies have shown that the overhead for a single memory allocation request can involve a significant number of CPU instructions. Designing a fast and efficient memory allocator is a non-trivial task.

2.3 Common Manual Allocation Algorithms

Various algorithms are used to organize and manage the heap:

2.3.1 Fixed-Size Blocks Allocation (Memory Pool Allocation)
  • Concept: Memory is divided into blocks of a single, fixed size. When a request for memory of that size arrives, a free block is taken from a list of available blocks; when it is freed, the block is returned to that list. If a request does not match the fixed size, the allocator either cannot serve it or must hand out a larger block than requested (internal fragmentation). In practice, several pools with different fixed block sizes are often maintained.
  • Mechanism: A simple linked list of free blocks is usually maintained. Allocation takes the head of the list; deallocation pushes the freed block back onto the list (at the head, for speed), as sketched in the example after this list.
  • Pros: Very fast allocation and deallocation (often constant time, O(1)). Minimal metadata overhead per block. Avoids external fragmentation for blocks of that specific size.
  • Cons: Can suffer from significant internal fragmentation if the requested sizes vary and the block size is large. Limited flexibility. Can still have external fragmentation between blocks if different pool sizes are used or if the allocation strategy needs to handle requests larger than any fixed block size.
  • Use Cases: Embedded systems with predictable memory needs, game development (for frequent allocation/deallocation of objects like particles, game entities), object pools in applications.
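
The following is a minimal sketch of the mechanism described above, assuming a single 64-byte block size and a statically sized pool; a production allocator would add alignment guarantees, thread safety, and typically several pools for different sizes:

#include <stddef.h>

#define BLOCK_SIZE  64        // every block served by this pool is 64 bytes
#define BLOCK_COUNT 1024

// A free block doubles as a node in the free list.
typedef union Block {
    union Block *next_free;
    char payload[BLOCK_SIZE];
} Block;

static Block pool[BLOCK_COUNT];
static Block *free_list = NULL;

void pool_init(void) {
    // Thread every block onto the free list.
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
}

// Allocation: pop the head of the free list -- O(1).
void *pool_alloc(void) {
    if (free_list == NULL) return NULL;   // pool exhausted
    Block *b = free_list;
    free_list = b->next_free;
    return b->payload;
}

// Deallocation: push the block back onto the free list -- O(1).
void pool_free(void *p) {
    Block *b = (Block *)p;
    b->next_free = free_list;
    free_list = b;
}
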
2.3.2 Buddy Blocks
  • Concept: Memory is allocated in blocks whose sizes are powers of two (or some other base). Available memory is conceptually split recursively.
  • Mechanism: The total memory area starts as one large block. When a request comes for size S, find the smallest power-of-two block size B that is >= S. If the smallest available block is larger than B, it is split into two "buddy" blocks of half the size. One buddy is used, the other is added to a free list for blocks of that smaller size. This splitting continues until a block of size B is obtained. When a block is freed, the system checks whether its "buddy" (the block it was split from) is also free; if so, the two are merged into a larger block, which is then checked against its own buddy, and so on, potentially coalescing all the way back up to the original large block. The core arithmetic is sketched after this list.
  • Pros: Relatively simple coalescing (merging) of freed blocks helps reduce external fragmentation compared to some other methods.
  • Cons: Can suffer from internal fragmentation because requests are rounded up to the nearest power-of-two block size.
  • Use Cases: Often used in operating system kernels for managing physical memory pages or kernel objects.
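
The core arithmetic behind buddy allocation is compact enough to sketch. The order bounds below are illustrative assumptions; a full allocator would also keep one free list per order and perform the split and merge steps described above:

#include <stddef.h>

#define MIN_ORDER 5           // smallest block: 2^5  = 32 bytes
#define MAX_ORDER 20          // largest block:  2^20 = 1 MiB

// Round a request up to the smallest power-of-two order that can hold it.
int order_for(size_t size) {
    int order = MIN_ORDER;
    while (((size_t)1 << order) < size && order < MAX_ORDER)
        order++;
    return order;
}

// Two buddies of order k are the two halves of the order k+1 block they were
// split from, so a block's buddy is found by flipping the size bit of its
// offset within the managed region.
size_t buddy_of(size_t offset, int order) {
    return offset ^ ((size_t)1 << order);
}
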
2.3.3 Slab Allocation
  • Concept: Pre-allocates chunks of memory (called "slabs" or "caches") specifically designed to hold objects of a particular type or size.
  • Mechanism: For a given type of object (e.g., a process control block, a file descriptor structure), a "cache" is created. The cache consists of one or more "slabs," which are contiguous blocks of memory. Each slab is divided into fixed-size slots, each capable of holding one object of the cache's type. The allocator maintains a list of free slots within these caches. Allocating an object involves taking any free slot from the appropriate cache. Deallocating returns the slot to the free list. A compact sketch of such a cache follows this list.
  • Pros: Very fast allocation/deallocation (finding a free slot is quick). Eliminates fragmentation within a slot. Reduces metadata overhead (metadata is per-slab or per-cache, not per-object slot). Constructor/destructor logic can be integrated when a slot is assigned/returned.
  • Cons: Requires setting up specific caches for different object types/sizes. Can still have fragmentation between slabs if caches aren't fully utilized.
  • Use Cases: Primarily used in operating system kernels for managing frequently allocated kernel data structures of known sizes (e.g., task_struct, inode).
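
A compact sketch of such a cache is shown below. The Task structure, the 64 slots per slab, and the cache_* names are illustrative assumptions; kernel slab allocators also track partially full slabs and per-CPU caches:

#include <stdlib.h>

// Hypothetical object type this cache serves.
typedef struct {
    int  id;
    char name[56];
} Task;

#define SLOTS_PER_SLAB 64

// One slab: a contiguous array of fixed-size slots plus a stack of free slots.
typedef struct Slab {
    struct Slab *next;
    Task  slots[SLOTS_PER_SLAB];
    Task *free_stack[SLOTS_PER_SLAB];
    int   free_top;
} Slab;

static Slab *make_slab(void) {
    Slab *s = malloc(sizeof(Slab));
    if (s == NULL) return NULL;
    s->next = NULL;
    s->free_top = 0;
    for (int i = 0; i < SLOTS_PER_SLAB; i++)   // all slots start out free
        s->free_stack[s->free_top++] = &s->slots[i];
    return s;
}

// Allocation: take any free slot from a slab that still has room,
// growing the cache with a new slab if necessary.
Task *cache_alloc(Slab **cache) {
    for (Slab *s = *cache; s != NULL; s = s->next)
        if (s->free_top > 0)
            return s->free_stack[--s->free_top];
    Slab *s = make_slab();
    if (s == NULL) return NULL;
    s->next = *cache;
    *cache = s;
    return s->free_stack[--s->free_top];
}

// Deallocation: return the slot to its slab's free stack. (A real slab
// allocator derives the owning slab from the object's address.)
void cache_free(Slab *s, Task *obj) {
    s->free_stack[s->free_top++] = obj;
}
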

2.4 Stack Allocation (alloca)

While the heap is the main area for dynamic manual allocation with arbitrary lifetimes, there's another form of dynamic allocation that behaves differently: stack allocation within a function.

Stack Allocation: Allocating memory for local variables within a function call frame on the call stack. This memory is automatically allocated when the function is entered and automatically released when the function exits.

Normally, the size of stack-allocated variables is fixed at compile time. However, some systems provide functions like alloca.

#include <alloca.h>  // glibc declares alloca here; other systems differ (e.g., _alloca in <malloc.h> on Windows)

// Allocate memory on the stack (not standard C, system-specific)
char* buffer = alloca(size);
  • Mechanism: alloca(size) typically translates into instructions that directly manipulate the stack pointer register, reserving size bytes on the current function's stack frame.
  • Pros: Extremely fast allocation (often just pointer arithmetic). No need for explicit deallocation (free) as it's automatically released when the function returns.
  • Cons:
    • Risk of Stack Overflow: If size is too large or alloca is used extensively (for example, in deep recursion), the stack can exceed its allotted space, crashing the program.
    • Non-standard, Undefined Behavior: alloca is not part of standard C/C++ (it is a common extension), and its behavior, especially on overflow, is system-dependent and undefined.
    • Cannot return pointers: Memory allocated with alloca becomes invalid once the function returns, so pointers to it cannot be returned or stored long-term.
  • Use Cases: Allocating temporary buffers within a function where the required size is only known at runtime and the data is not needed after the function returns. Safer, error-checking variants exist on some systems (such as _malloca on Windows, which must be paired with _freea for cleanup).

3. Automated Memory Management

Manual memory management, while powerful, is notoriously difficult to get right. Memory leaks and dangling pointers are common sources of bugs. Automated memory management strategies aim to reduce or eliminate this burden on the programmer by handling the deallocation process automatically.

3.1 Automatic Management of Call Stack Variables

As mentioned in the previous section, local variables within a function (often called "automatic variables") are a fundamental example of automated memory management. Their storage is automatically allocated on the call stack upon function entry and automatically deallocated upon function exit. This simplicity is a key reason why the stack is preferred for function-scoped data whenever possible.

3.2 Garbage Collection (GC)

Garbage Collection: An automated memory management technique where the system or runtime environment periodically identifies blocks of memory that are no longer "reachable" or "usable" by the program and automatically reclaims them.

  • Concept: Instead of the programmer explicitly saying "I'm done with this memory," the garbage collector figures it out by determining which objects are still accessible (directly or indirectly) from a set of "root" references (like global variables, variables on the stack). Any memory not reachable from these roots is considered "garbage" and can be freed.
  • Mechanism: GC algorithms vary widely (e.g., mark-and-sweep, copying collectors, generational collectors), but they typically involve tracing relationships between objects to determine reachability. A toy mark-and-sweep collector is sketched after this list.
  • Pros: Prevents memory leaks caused by forgetting to free memory. Reduces programmer burden and bug count related to deallocation. Eliminates dangling pointers caused by freeing memory too early.
  • Cons: Requires system resources (CPU time and memory) to perform the collection process. Can introduce unpredictable pauses (stop-the-world events) as the collector runs. May retain memory longer than strictly necessary if objects remain reachable even if the program is logically finished with them.
  • Use Cases: Widely used in languages like Java, C#, Python, JavaScript, Go, Lisp.
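
To make "reachability" concrete, here is a toy mark-and-sweep collector in C. The Object layout, the fixed two references per object, and the gc_* names are illustrative assumptions rather than any real runtime's API; production collectors also avoid unbounded recursion in the mark phase:

#include <stdlib.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Object {
    bool marked;
    struct Object *ref[2];    // this toy object can point at up to two others
    struct Object *next;      // links every allocated object, for the sweep phase
} Object;

static Object *all_objects = NULL;

Object *gc_alloc(void) {
    Object *o = calloc(1, sizeof(Object));
    if (o == NULL) return NULL;
    o->next = all_objects;    // remember the object so sweep can find it
    all_objects = o;
    return o;
}

// Mark phase: flag everything reachable from a root.
static void mark(Object *o) {
    if (o == NULL || o->marked) return;
    o->marked = true;
    mark(o->ref[0]);
    mark(o->ref[1]);
}

// Sweep phase: free every object that was not marked; unmark the survivors.
static void sweep(void) {
    Object **link = &all_objects;
    while (*link != NULL) {
        Object *o = *link;
        if (!o->marked) {
            *link = o->next;
            free(o);          // unreachable -> garbage
        } else {
            o->marked = false;
            link = &o->next;
        }
    }
}

// A collection cycle starts from the program's root references.
void gc_collect(Object **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    sweep();
}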

3.3 Reference Counting

Reference Counting: An automated memory management technique where each block of allocated memory maintains a counter of how many pointers currently point to it. When the counter drops to zero, the memory is automatically deallocated.

  • Concept: Every time a new pointer is made to point to a memory block, its reference counter is incremented. Every time a pointer to the block is destroyed or changed to point elsewhere, the counter is decremented.
  • Mechanism: This often requires compiler support or careful library/runtime implementation to ensure counters are correctly updated on every pointer assignment or destruction; a hand-rolled version is sketched after this list.
  • Pros: Memory is typically reclaimed immediately after the last reference is removed (unlike some GC methods). Can be simpler to implement than full tracing garbage collection.
  • Cons: Cannot handle Circular References: If two or more objects reference each other but are not referenced by anything else, their reference counts will never drop to zero, leading to a memory leak. Overhead: Incrementing/decrementing counters adds overhead to pointer operations. Can involve recursive freeing when a large data structure's count drops to zero.
  • Mitigation: Often combined with tracing GC to handle cycles, or systems introduce the concept of "weak references" which don't increment the count but allow access to the object if it still exists.
  • Use Cases: Used in languages like Swift, Objective-C (with ARC), Python (partially, combined with GC), C++ (with smart pointers like std::shared_ptr).
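
A hand-rolled version of this idea in C might look like the following; the RcBuffer type and the rc_* names are illustrative, and languages with built-in reference counting insert the equivalent of rc_retain/rc_release automatically:

#include <stdlib.h>

// A reference-counted buffer: the count lives next to the data it guards.
typedef struct {
    size_t refcount;
    size_t len;
    char   data[];
} RcBuffer;

RcBuffer *rc_new(size_t len) {
    RcBuffer *b = malloc(sizeof(RcBuffer) + len);
    if (b != NULL) {
        b->refcount = 1;      // the creator holds the first reference
        b->len = len;
    }
    return b;
}

// Taking another reference increments the counter.
RcBuffer *rc_retain(RcBuffer *b) {
    if (b != NULL) b->refcount++;
    return b;
}

// Dropping a reference decrements it; the last release frees the block.
void rc_release(RcBuffer *b) {
    if (b != NULL && --b->refcount == 0)
        free(b);
}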

3.4 Memory Pools (Lifecycle-based Deallocation)

Memory Pool (Lifecycle-based): A memory management strategy where memory is allocated from specific pools associated with a particular application state or lifecycle stage (e.g., a request, a transaction). All memory allocated from a pool is automatically deallocated together when that lifecycle stage ends.

  • Concept: Instead of freeing individual allocations, you allocate items within the context of a "pool." When the event or process associated with the pool finishes, the entire pool is discarded, freeing all memory within it simultaneously.
  • Mechanism: Typically, a pool manager allocates large blocks from the underlying system (like the heap) and then fulfills smaller allocation requests from within those blocks. Deallocation is trivial: just discard the large blocks comprising the pool. A minimal example follows this list.
  • Pros: Very fast allocation within the pool (often just a pointer bump). Extremely fast deallocation: discarding the pool is O(1), independent of the number of items it contains. Eliminates fragmentation within the pool, since it is allocated linearly. Prevents leaks within the pool for that lifecycle.
  • Cons: Can hold onto memory longer than necessary if an object's lifetime is shorter than the pool's lifecycle. Not suitable for objects that need to persist across multiple lifecycle stages.
  • Use Cases: Web servers (allocating memory for a single request within a pool), compilers (allocating memory for an Abstract Syntax Tree during parsing), long-running applications with distinct transaction boundaries. While allocation within the pool might still be requested explicitly, the deallocation event is automated based on the lifecycle.
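
A minimal "bump" arena illustrates the mechanism; the 16-byte alignment and the arena_* names are illustrative assumptions, and real pool libraries usually chain several large blocks instead of failing when the first one fills up:

#include <stdlib.h>

typedef struct {
    char  *base;              // one large block obtained from the heap
    size_t size;
    size_t used;
} Arena;

int arena_init(Arena *a, size_t size) {
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

// Allocation is just a pointer bump (rounded up for alignment);
// there is no per-object free.
void *arena_alloc(Arena *a, size_t n) {
    size_t offset = (a->used + 15) & ~(size_t)15;   // 16-byte alignment
    if (offset + n > a->size) return NULL;          // pool exhausted
    a->used = offset + n;
    return a->base + offset;
}

// End of the lifecycle stage: discard everything at once.
void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}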

4. Systems with Virtual Memory

Modern operating systems extensively use virtual memory. This is a technique that creates an abstraction layer between the addresses programs use (virtual addresses) and the actual physical locations in RAM (physical addresses).

Virtual Memory: A memory management technique that provides an idealized abstraction of memory to programs, decoupling the addresses used by a program (virtual addresses) from the physical addresses in RAM.

Virtual Address: An address used by a program within its own address space. These addresses are theoretical and must be translated by hardware (the MMU) and the OS into physical addresses to access memory.

Physical Address: The actual address of a memory location in the computer's RAM chips.

4.1 How Virtual Memory Works (Conceptual)

The system divides both virtual address space (what the program sees) and physical memory into fixed-size blocks, typically called pages (for virtual space) and frames (for physical space). A hardware component, the Memory Management Unit (MMU), works with the operating system to translate virtual addresses to physical addresses using lookup tables (page tables).

Memory Management Unit (MMU): A hardware component responsible for translating virtual memory addresses generated by the CPU into physical memory addresses.
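
As a conceptual illustration of the translation step just described, the sketch below assumes 4 KiB pages and a tiny single-level page table; real MMUs use multi-level tables, TLBs, and permission bits:

#include <stdint.h>
#include <stdbool.h>

#define PAGE_SIZE  4096u
#define PAGE_SHIFT 12                    // log2(PAGE_SIZE)
#define NUM_PAGES  256                   // size of this toy address space, in pages

typedef struct {
    bool     present;                    // is the page currently in a RAM frame?
    uint32_t frame;                      // physical frame number if present
} PageTableEntry;

static PageTableEntry page_table[NUM_PAGES];

// Split the virtual address into (page number, offset), look the page up,
// and rebuild the physical address from the frame number and the same offset.
bool translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t page   = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & (PAGE_SIZE - 1);

    if (page >= NUM_PAGES || !page_table[page].present)
        return false;                    // page fault: the OS must intervene

    *paddr = (page_table[page].frame << PAGE_SHIFT) | offset;
    return true;
}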

4.2 Benefits of Virtual Memory

Implementing virtual memory provides several crucial benefits for building robust systems:

  1. Memory Protection/Isolation: Each process can have its own separate virtual address space. The MMU, guided by the OS, can be configured so that a process's virtual addresses only map to physical memory frames that have been allocated to that specific process. Attempts to access virtual addresses that don't map to its allowed physical memory or are marked as read-only (if the process tries to write) trigger hardware exceptions, preventing one process from corrupting another's memory or the operating system's memory.

    Memory Protection: A mechanism, typically implemented using virtual memory and hardware support (MMU), that prevents a process from accessing memory that has not been allocated to it or is outside its permitted access rights (e.g., attempting to write to a read-only memory area).

  2. Abstraction and Simplified Linking: Programs can be compiled and linked as if they have a contiguous block of memory starting at a specific address (often address 0), regardless of where they are actually loaded in physical RAM or if their physical memory is contiguous. The virtual memory system handles the mapping.

  3. Efficient Use of Physical RAM (Paging/Swapping): Virtual memory allows the system to run programs whose total memory requirements exceed the amount of physical RAM available. This is achieved by:

    Paging/Swapping: Techniques where the operating system moves less-recently used pages of memory from physical RAM to a designated area on slower secondary storage (like a hard drive or SSD) to free up physical frames for pages that are actively being used. When a program tries to access a page that has been moved out ("paged out" or "swapped out"), the system detects a "page fault," pauses the program, finds the page on secondary storage, loads it back into a physical frame (potentially moving another page out), updates the page table, and then allows the program to resume. This makes the system appear to have more memory than is physically installed, allowing more programs to run simultaneously or single programs to work with larger datasets.

  4. Shared Memory: Although virtual memory isolates processes, it can also facilitate controlled sharing. The OS can map pages from the virtual address spaces of multiple processes to the same physical memory frames. This is a highly efficient way for processes to communicate or share libraries.

    Shared Memory: A technique using virtual memory where multiple processes can access the same region of physical memory simultaneously, providing a fast method for inter-process communication.

Virtual memory systems also interact with the storage hierarchy, deciding which data resides in faster primary storage (RAM) and which can be temporarily moved to slower secondary storage (disk).

Primary Storage (RAM): Fast, volatile memory directly accessible by the CPU.

Secondary Storage (Disk/SSD): Slower, non-volatile storage used for long-term data persistence and as an extension of primary memory in virtual memory systems.

5. Historical Context: Memory Management Approaches

Understanding how memory management evolved in early systems highlights the challenges and motivations behind modern techniques.

5.1 Burroughs/Unisys MCP Systems

The Burroughs B5000 (1961) was groundbreaking in its integrated approach to memory management, supporting virtual memory from the outset without needing an external MMU chip (as it was integrated into the CPU design).

  • Virtual Memory: Implemented early virtual memory concepts, automatically moving code and data between main memory and secondary storage by "overlaying" variable-length segments (segment-based virtual memory rather than fixed-size paging).
  • Segment Descriptors: Instead of raw pointers, the system used protected control words called Segment Descriptors to refer to blocks of memory (segments).

    Segment Descriptor: A special control word used in systems like Burroughs MCP that contains metadata about a block of memory (a segment), such as its physical address, length, type, and a "presence bit" indicating if it's currently in main memory or on disk. These descriptors provided built-in bounds checking, meaning operations using descriptors could not accidentally access memory outside the defined segment, offering a high degree of memory safety (preventing buffer overflows) enforced by hardware. Descriptors themselves were protected from unauthorized modification.

5.2 OS/360 and Successors (IBM)

The original IBM System/360 (1964) family did not support hardware virtual memory. Memory management was a function of the OS supervisor.

  • No Virtual Memory (Initially): Programs operated directly on physical memory addresses.
  • Memory Isolation via Protection Keys: Hardware-based protection was achieved using protection keys. Physical memory was divided into blocks (e.g., 2KB). Each block and each running process was assigned a key (0-15). A process could only write to memory blocks that matched its key. Key 0 was reserved for the supervisor (OS kernel). This provided isolation but wasn't full address space separation like virtual memory.

    Protection Keys: A hardware mechanism used in systems like IBM System/360 where memory is divided into blocks, and access (especially write access) is controlled by matching a key associated with the CPU's current state (representing the running task) to a key associated with the memory block.

  • Manual Allocation (GETMAIN/FREEMAIN): Programs requested and freed memory with the GETMAIN and FREEMAIN macros, which expand into supervisor calls (SVCs).
  • Fixed Partitions vs. Dynamic Regions: Earlier versions (MFT - Multiprogramming with a Fixed number of Tasks) divided memory into operator-defined fixed partitions. Later versions (MVT - Multiprogramming with a Variable number of Tasks) allowed jobs to have dynamic "regions" of memory.

    Partition/Region: A contiguous block of memory allocated to a specific job or process in early multi-programming systems like OS/360. Partitions were fixed size in MFT, while Regions were dynamically sized in MVT.

  • Subpools: Within a job's region or shared system areas, memory was further managed using subpools (numbered 0-255). Subpools were aligned to the protection key boundary (e.g., 2KB). Different subpools could be assigned different protection keys or characteristics. The OS managed free lists within these subpools.

    Subpool: A subdivision of a memory Region or Partition in OS/360 and successors, used for granular management of allocated and free blocks within that area.

  • Evolution: Successors like OS/VS1 and OS/VS2 introduced virtual memory. MVS (Multiple Virtual Storage) expanded the address space structure with additional shared (CSA) and private (LSQA, SWA) areas, and reserved more protection keys for privileged code, demonstrating the increasing complexity of managing memory in advanced multi-user, multi-tasking systems.

6. Conclusion

Memory management is a foundational concept in computer systems, balancing the needs of multiple programs and data structures with the finite physical memory available. Whether implemented manually by the programmer or automatically by the system, the core tasks of allocation, tracking, and deallocation are essential.

Understanding the challenges like fragmentation and the overheads of different algorithms provides insight into why various techniques have been developed. Virtual memory, with its ability to decouple logical and physical addresses, is a cornerstone of modern operating systems, enabling protection, efficient resource utilization through paging, and flexible memory sharing.

Studying the historical evolution, from early systems relying on protection keys and manual allocation within partitions to pioneers integrating virtual memory and sophisticated descriptor-based management, reveals the progression of ideas aimed at building more robust, secure, and efficient computing environments. For anyone building a computer system from scratch, mastering these memory management concepts is not just beneficial – it's indispensable.
