Working with Memory

I have been fortunate enough to learn programming languages that let me manage my own memory. I find it shocking that introductory computer science classes have stopped teaching C/C++ and pointers. This choice has robbed new graduates of the opportunity to royally screw up their first programs, and of the valuable experience they gain when they finally dig themselves out and understand, at a low level, how computers work.

Basics

To allocate heap memory in C, call malloc, then release the pointer with free when you no longer need it:

#include <stdlib.h>

void* p = malloc(32);
if( p ) {
  // Success!  You now have 32 bytes of storage.

  free(p);  // release the block once you are done with it
}

If you are using C++, you can also use new and delete. You cannot mix malloc with delete or new with free: doing so is undefined behavior. Newer compilers may generate a warning telling you it is a bad idea.
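As a minimal sketch of keeping each allocation paired with its matching release, consider the following. The Widget type and the pairing_demo function are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical type used only for illustration.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
};

// Each allocation below is released by the call that matches it.
int pairing_demo() {
    Widget* w = new Widget(7);       // new    -> delete
    int result = w->value;
    delete w;

    int* arr = new int[4]();         // new[]  -> delete[]; () zero-initializes
    assert(arr[0] == 0);
    result += arr[0];
    delete[] arr;

    void* raw = std::malloc(32);     // malloc -> free
    if (raw) {
        std::free(raw);
    }
    return result;
}
```

Calling pairing_demo() returns the value stored in the Widget, and every block is released by the same family of calls that created it.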

Placement New

In C++, placement new lets you construct an object at a specific memory address, initializing that memory via the class constructor:

void* addr = ...;
MyType* p = new(addr) MyType(...);

It is worth noting that you cannot delete this pointer, because the memory did not come from new and you don't own it. Call the destructor manually, as in p->~MyType(), if you need to.
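Here is a small sketch of the full life cycle, assuming the storage comes from malloc. Counter and placement_demo are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>      // declares placement new

// Hypothetical type used only for illustration.
struct Counter {
    int count;
    explicit Counter(int start) : count(start) {}
    ~Counter() {}
};

int placement_demo() {
    // Storage we obtained ourselves. malloc returns memory suitably aligned
    // for any fundamental type, which is enough for Counter.
    void* addr = std::malloc(sizeof(Counter));
    if (!addr) return -1;

    Counter* p = new (addr) Counter(41);    // construct in place, no allocation
    assert(static_cast<void*>(p) == addr);  // the object lives exactly at addr
    p->count += 1;
    int result = p->count;

    p->~Counter();     // run the destructor by hand; do NOT `delete p`
    std::free(addr);   // release the storage with the call that matches malloc
    return result;
}
```

The storage and the object have separate lifetimes here: the destructor ends the object, and free ends the storage.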

Memory Pages

When you request memory, the request is handled differently depending on its size. Typically, when you call malloc for a small block, the system library calls sbrk for you and extends your heap; large requests are usually served by mapping fresh pages with mmap instead. Either way, memory comes from the operating system in multiples of the virtual memory page size, which you can find out via:

#include <unistd.h>
long page_size = sysconf(_SC_PAGESIZE);

In some environments getpagesize() is an equivalent API. Many processor architectures use 4096-byte pages, and each range of pages can have different permissions. For example, constants are stored in read-only regions, program code has the execute bit set, and the heap has the write bit set.
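Since the system hands out memory in whole pages, it is often handy to round a size up to a page multiple before asking for it. A minimal sketch follows; round_up_to_page is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <unistd.h>

// Round `size` up to the next multiple of the system page size.
std::size_t round_up_to_page(std::size_t size) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    std::size_t page_size = static_cast<std::size_t>(page);
    return (size + page_size - 1) / page_size * page_size;
}
```

On a system with 4096-byte pages, a request of 1 byte rounds up to 4096 and a request of 4097 bytes rounds up to 8192.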

Some debugging tools, such as libduma, place pages with no access permissions right next to your allocations to detect program code that attempts to access memory beyond the allocated range.
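The guard-page idea can be sketched with mmap and mprotect. This is not how libduma itself is implemented, only an illustration of the technique; GuardedPage, guarded_alloc and guarded_free are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type: one usable page followed by an inaccessible guard page.
struct GuardedPage {
    void* base;             // start of the usable page
    std::size_t page_size;  // size of the usable page (and of the guard)
};

bool guarded_alloc(GuardedPage* out) {
    assert(out != nullptr);
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0) return false;
    std::size_t page = static_cast<std::size_t>(ps);

    // Map two adjacent pages, both readable and writable at first.
    void* base = mmap(nullptr, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return false;

    // Revoke all access to the second page. Any read or write that runs past
    // the end of the first page now faults immediately (typically SIGSEGV).
    if (mprotect(static_cast<char*>(base) + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return false;
    }
    out->base = base;
    out->page_size = page;
    return true;
}

void guarded_free(const GuardedPage& g) {
    munmap(g.base, 2 * g.page_size);  // releases the usable page and the guard
}
```

To catch overruns at the exact byte, a tool would place the allocation so that it ends right at the page boundary; this sketch only shows the permission change.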

Fragmentation

Because the operating system manages memory in pages, fragmentation (partially used pages) reduces the memory available to other processes running on the same system. With glibc, you can try to reclaim some of those pages by calling:

#include <malloc.h>
malloc_trim(0);

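If you know you'll be allocating memory again, you can pass a non-zero value to malloc_trim. That value is the number of bytes of free space to leave at the top of the heap, which avoids the extra work of growing the heap again later.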
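A rough sketch of where such a call might sit, assuming glibc (malloc_trim is a GNU extension and is not available in every C library). The function name churn_and_trim and the block counts are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <malloc.h>   // glibc-specific: declares malloc_trim

// Allocate and free many small blocks, then ask the allocator to hand
// unused heap pages back to the operating system.
// Returns malloc_trim's result: 1 if some memory was released, 0 otherwise.
int churn_and_trim(std::size_t pad) {
    const int kBlocks = 1000;
    void* blocks[kBlocks];
    for (int i = 0; i < kBlocks; ++i) {
        blocks[i] = std::malloc(1024);
    }
    for (int i = 0; i < kBlocks; ++i) {
        std::free(blocks[i]);        // free(NULL) is a harmless no-op
    }
    // `pad` bytes of free space are left at the top of the heap.
    int released = malloc_trim(pad);
    assert(released == 0 || released == 1);
    return released;
}
```

Whether anything is actually released depends on the allocator's state, so the return value is informational rather than something to rely on.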

Using mmap and tmpfs to Manage Memory

You can avoid fragmentation by mapping memory at a virtual address that isn't part of your heap, using the mmap call. This way the entire range can be released with munmap without worrying about partially used pages.
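A minimal sketch of allocating outside the heap with an anonymous mapping. The helper names map_region and unmap_region are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Map `size` bytes of zero-filled memory outside the malloc heap.
// Returns nullptr on failure.
void* map_region(std::size_t size) {
    assert(size > 0);  // mmap rejects zero-length mappings
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Hand the whole range back to the operating system in one call.
bool unmap_region(void* p, std::size_t size) {
    return munmap(p, size) == 0;
}
```

Unlike free, munmap needs the length of the mapping, so the caller has to keep track of it.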

On Linux and other operating systems, it is possible to create file systems that are backed by memory instead of persistent storage, such as tmpfs. An example of this on many systems is the /tmp directory, which is often used to store files that do not need to survive a reboot.

Another consideration is pointer consistency. You may want the memory to come in and out as required without having to refresh every pointer into it. mmap can help with that too.

The general procedure:

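// Create (or open) a file on the memory-backed file system and size it.
int fd = open("/tmp/buffer", ...);
ftruncate(fd, ...);

// Reserve an address range, then map the file over it at the same address.
// MAP_SHARED makes writes land in the file's pages rather than in private copies.
void* p = mmap(0, ..., ..., MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
p = mmap(p, ..., ..., MAP_FIXED | MAP_SHARED, fd, 0);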

First, open will create a file in /tmp that is backed by memory. Later we can unlink this file to free the memory; the pages are actually released once the file is also unmapped and closed. Use ftruncate to effectively allocate memory by resizing the file. The operating system will not actually back a page with memory until that page is first accessed.

Then, the first anonymous mmap reserves an address range where we can map in the memory. We use the second mmap call, with MAP_FIXED, to tie that address to the memory-mapped file in /tmp.
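The procedure can be sketched end to end as follows. This is an illustration under a few assumptions rather than a definitive implementation: it uses mkstemp instead of open so the example does not clobber a fixed path, the file name template is made up, it reserves the range with PROT_NONE, and it maps the file with MAP_SHARED so that written data stays in the file while the range is mapped out. Whether /tmp is really memory-backed depends on the system; the code behaves the same either way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

// Returns true if data written through the mapping is still there after the
// file has been mapped out and mapped back in at the same address.
bool remap_demo() {
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    const std::size_t size = static_cast<std::size_t>(ps);

    // Hypothetical name template; mkstemp replaces the XXXXXX part.
    char path[] = "/tmp/buffer-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // storage lives on until fd is closed and mappings are gone
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) { close(fd); return false; }

    // 1. Reserve an address range without making it accessible.
    void* addr = mmap(nullptr, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) { close(fd); return false; }

    // 2. Map the file over the reservation at exactly that address.
    void* p = mmap(addr, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    bool ok = (p == addr);
    if (ok) {
        std::strcpy(static_cast<char*>(p), "hello");

        // 3. Map the file out: the range goes back to a bare reservation.
        void* q = mmap(addr, size, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        // 4. Map it back in; pointers into the range never changed value.
        void* r = mmap(addr, size, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_FIXED, fd, 0);
        ok = (q == addr) && (r == addr) &&
             std::strcmp(static_cast<char*>(addr), "hello") == 0;
    }
    munmap(addr, size);
    close(fd);
    return ok;
}
```

Between steps 3 and 4 any access through an old pointer faults, so a real program would need to make sure nothing touches the range while it is mapped out.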

There are alternatives to temporary files that work the same way, such as shared memory segments.