dynamic arrays in CDynamic Arrays in C: How to Build Resizable Arrays with malloc and realloc
Learn how to implement dynamic, resizable arrays in C. Master malloc, realloc, and free to create vectors that grow with your data. Avoid common memory pitfalls.
Dynamic arrays are a fundamental data structure in C that overcome the biggest limitation of standard arrays: their fixed size. In a language where memory management is manual, knowing how to create an array that can grow and shrink at runtime is a vital skill for building real-world applications.
Whether you're developing a dynamic list for a CLI tool, a high-performance database buffer, or a custom game engine, mastering the interaction between pointers, malloc, and realloc is the key to flexible data management. Standard C arrays (e.g., int arr[10]) are fast but brittle; they break as soon as your data exceeds their pre-defined bounds.
This comprehensive guide walks you through the entire lifecycle of a dynamic array—from initial allocation to capacity expansion and final deallocation. We'll explore the "vector" pattern, performance trade-offs, and how to avoid the most common memory-related security vulnerabilities.
What Is a Dynamic Array?
In C, a dynamic array is not a built-in type but a pattern of memory management. It typically consists of a pointer to a block of heap-allocated memory, along with metadata tracking the array's current size (how many elements it holds) and its total capacity (how many it can hold before needing more memory).
Unlike static arrays, whose size is determined at compile-time and cannot be changed, dynamic arrays can be resized using the realloc function.
Key Components of a Dynamic Array:
- The Pointer: Holds the address of the first element in the heap block.
- Size: The number of elements currently in use.
- Capacity: The total number of elements that the current memory block can accommodate.
Core Functions: malloc, realloc, and free
Building a dynamic array relies on the C standard library's memory management quartet (including calloc).
malloc (Memory Allocation)
Used for the initial creation of the array. You request a block of capacity * sizeof(element_type) bytes.
realloc (Re-allocation)
The "magic" function that grows the array. It tries to expand the current block in place or, if necessary, finds a new, larger block and copies the data over.
free (Memory Release)
Crucial for preventing memory leaks. It returns the array's heap block to the system when it's no longer needed.
The Resizable Vector Pattern
A professional implementation of a dynamic array in C usually involves a struct. This encapsulates all the necessary data into a single, easy-to-pass object.
The Dynamic Array Struct
#include <stdlib.h>
typedef struct {
int *data;
size_t size;
size_t capacity;
} IntVector;
8-12 Use Cases for Dynamic Arrays
- User Input Streams – Storing an unknown number of lines or characters from a file or user input.
- Buffer Management – Managing network packets or file chunks where the total size isn't known upfront.
- Graph Algorithms – Storing adjacency lists for nodes where the number of connections varies wildly.
- Game Engines – Managing a dynamic list of active entities, projectiles, or particles on screen.
- Search Results – Collecting an unknown number of results from a database query or search function.
- Command Line Utilities – Parsing a variable number of flags and arguments.
- Compilers – Storing the tokens or syntax tree nodes generated during parsing.
- Text Editors – Managing a "gap buffer" or dynamic list of lines in a document.
- History Buffers – Implementing undo/redo functionality where the history can grow indefinitely.
- Data Serialization – Building a byte buffer dynamically as objects are serialized into a binary format.
- Caching Systems – Managing a set of cached objects where the cache size might be tuned at runtime.
- Statistical Analysis – Storing large sets of data points collected during an experiment or simulation.
6-10 Practical Examples
Example 1: Basic Initialization and Cleanup
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int *array;
size_t size;
size_t capacity;
} Vector;
void init_vector(Vector *v, size_t initial_capacity) {
v->array = malloc(initial_capacity * sizeof(int));
v->size = 0;
v->capacity = initial_capacity;
}
void free_vector(Vector *v) {
free(v->array);
v->size = v->capacity = 0;
}
Example 2: Adding an Element (The Push Phenomenon)
void push_back(Vector *v, int value) {
if (v->size == v->capacity) {
// Double the capacity
v->capacity *= 2;
v->array = realloc(v->array, v->capacity * sizeof(int));
}
v->array[v->size++] = value;
}
Example 3: Initializing with calloc (Zero-Initialization)
void init_zero_vector(Vector *v, size_t size) {
v->array = calloc(size, sizeof(int));
v->size = v->capacity = size;
}
Example 4: Creating a Generic Array (void *)
typedef struct {
void *data;
size_t element_size;
size_t size;
size_t capacity;
} GenericVector;
Example 5: Removing Elements (Shrinking)
void pop_back(Vector *v) {
if (v->size > 0) {
v->size--;
// Optional: Shrink if size < 1/4 of capacity
}
}
Example 6: Copying a Dynamic Array
Vector copy_vector(Vector *v) {
Vector new_v;
new_v.array = malloc(v->capacity * sizeof(int));
memcpy(new_v.array, v->array, v->size * sizeof(int));
new_v.size = v->size;
new_v.capacity = v->capacity;
return new_v;
}
Example 7: Multidimensional Dynamic Arrays (2D Matrix)
Creating a 2D array dynamically requires a pointer to a pointer. You first allocate an array of rows (pointers), then allocate each row (data). This allows for "jagged" arrays where rows have different lengths, or a standard rectangular matrix.
int** create_matrix(int rows, int cols) {
int **matrix = malloc(rows * sizeof(int*));
for (int i = 0; i < rows; i++) {
matrix[i] = malloc(cols * sizeof(int));
}
return matrix;
}
void free_matrix(int **matrix, int rows) {
for (int i = 0; i < rows; i++) {
free(matrix[i]);
}
free(matrix);
}
Example 8: Flattened 2D Array for Performance
A better way to implement 2D arrays is by "flattening" them into a single 1D block. This improves cache locality and simplifies memory management (only 1 malloc and 1 free call). You access elements using the formula index = row * cols + col.
int* create_flat_matrix(int rows, int cols) {
return malloc(rows * cols * sizeof(int));
}
void access_flat(int *matrix, int row, int col, int cols) {
int value = matrix[row * cols + col];
}
8-12 Tips for Performance and Safety
- Exponential Growth – Always grow capacity by a factor (usually 1.5x or 2x). Incremental growth (e.g., +1) leads to O(n²) performance.
- Handle
reallocFailure –reallocreturnsNULLon failure. If you assign this directly to your pointer, you'll lose the address to the old block (and get a leak). Always use a temporary pointer. - Use
size_t– This is the standard unsigned type for sizes and indexing, ensuring your array can use all available memory on any architecture. - Initialize to NULL – If you start with a
NULLpointer and0size, the first call toreallocwill act likemalloc. - Shrinking Strategy – Be careful with shrinking; doing it too often can cause performance thrashing. Only shrink when the size is significantly smaller than the capacity (e.g., 25%).
- Check for NULL – Always check the result of
mallocandreallocbefore using the pointer. - Document Ownership – Be clear about who is responsible for
free()-ing the array's memory (e.g., the function that created it or the caller). - Alignment – Most standard allocators provide 8-byte or 16-byte aligned memory, which is sufficient for all standard types.
- Avoid Constant Re-allocation – If you know the final size, pre-allocate it once to save CPU time.
- Use
constfor Readers – Mark pointers asconst void *in functions that only read from your array. - Bounds Checking – Since C doesn't provide it, implement your own bounds checking in access functions during development.
- Audit with Valgrind – Use memory debuggers to ensure no "use-after-free" or "leaks" are hiding in your resize logic.
Troubleshooting Dynamic Array Common Issues
Issue 1: The "Double-Free" or "Use-After-Free"
Problem: A crash occurs when freeing the array or accessing it later.
Cause: Calling free() twice, or using a pointer after it was passed to realloc (the old address might be invalid!).
Solution: After freeing or reallocating, immediately update your pointers. After free(), set the pointer to NULL.
Issue 2: Memory Leak During Resize
Problem: The program consumes more and more RAM.
Cause: realloc failed, and you overwrote the pointer to the original memory with NULL.
Solution: void *temp = realloc(...); if (temp) v.data = temp; else handle_error();
Issue 3: O(n²) Performance (Slowness)
Problem: Adding elements to a large array is extremely slow. Cause: Growing the array by a fixed amount (like +1 or +10 elements) every time it fills up. Solution: Double the capacity when it's full. This makes the average cost of an insert O(1).
Issue 4: Index Out of range
Problem: Data corruption or random crashes.
Cause: Accessing v->array[v->size] instead of v->array[v->size-1].
Solution: Use an accessor function with assert(index < v->size).
Related Concepts
Linked Lists
An alternative to dynamic arrays. They are better for frequent insertions at the beginning or middle, but slower for random access.
Memory Alignment
When you allocate memory for an array of structs, ensure the allocator respects the alignment requirements of the struct members.
Buffer Overflow
Writing past the capacity of your dynamic array. This is the source of many security vulnerabilities in C.
Frequently Asked Questions
Is a dynamic array the same as a vector in C++?
The concept is the same, but C lacks the built-in std::vector. In C, you must implement the growth and management logic yourself.
Does realloc always move the data?
No. If there is enough contiguous free space immediately following the current block, realloc will simply expand the block in place, which is much faster.
can I use [] notation with a dynamic array?
Yes! If ptr is a pointer to heap memory, ptr[5] is exactly the same as *(ptr + 5).
how do I create a 2D dynamic array?
You can either allocate an array of pointers (where each pointer points to a row) or allocate a single 1D block and calculate offsets manually (row * width + col). The 1D approach is better for performance.
what is the maximum size of a dynamic array?
It's limited by your system's size_t and available RAM. On 64-bit systems, this is effectively hundreds of gigabytes (or terabytes if you have the hardware).
should I use malloc or calloc for the initial allocation?
Use calloc if you want all elements to be zeroed out initially. Use malloc if you're about to fill the array with data anyway.
how often should I shrink the array?
Only when memory usage is a desperate concern. For most applications, it's better to keep the capacity to avoid the overhead of future re-allocations.
can I realloc a NULL pointer?
Yes. Calling realloc(NULL, size) is identical to calling malloc(size).
Quick Reference Card
| Operation | Function | Performance | Caveat |
|---|---|---|---|
| Create | malloc | Fast | Check for NULL |
| Grow | realloc | O(n) or O(1) | Use a temp pointer |
| Clean | free | Fast | Set to NULL after |
| Access | ptr[i] | O(1) | No bounds checking |
Try MemC to Visualize Dynamic Arrays
Want to see realloc in action? MemC lets you step through your code and watch as a small block of memory is "magically" moved and expanded to accommodate new data. It's the best way to visualize how your vector grows.
Try Dynamic Array Visualization
Summary
Dynamic arrays are an essential tool for any C programmer, providing a flexible way to handle variable amounts of data. By understanding the interaction between malloc, realloc, and free, you can build data structures that are both efficient and powerful.
- Always use a struct to keep pointer, size, and capacity together.
- Use exponential growth (doubling) for O(1) average insertions.
- Be careful with realloc failure to avoid memory leaks.
- Always free your memory when finished.
With these techniques, your C programs will be able to handle any volume of data with confidence and speed.