You always hear people talking about how C programmers have to manage their own memory. I think that I've understood what that meant for a while. It means that if you want to manipulate data, you must ask the machine to find a place for you to put it. Furthermore, that space, of whatever size you specified in your request, is going to be reserved by the machine until you specifically tell it that you are done using it (or, for data on the stack, until the function returns and its stack frame is discarded). Today I came a bit closer to understanding what that means in terms of C programming style.
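To make those two lifetimes concrete, here is a minimal sketch. The helper names copy_to_heap and length_via_stack are made up for illustration; only malloc, free, and the string functions are standard C.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s.
 * The memory stays reserved until the caller passes it to free(). */
char *copy_to_heap(const char *s)
{
    size_t len = strlen(s) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);     /* ask the machine for len bytes */

    if (copy == NULL)
        return NULL;              /* the request can fail */

    memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: buf lives in this function's stack frame,
 * so it is released automatically when the function returns. */
size_t length_via_stack(const char *s)
{
    char buf[64];

    assert(strlen(s) < sizeof buf);   /* the copy must fit in buf */
    strcpy(buf, s);
    return strlen(buf);
}
```

The caller of copy_to_heap is the one who has to call free on the result; nobody has to do anything about buf.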

I've been messing around a bit with C lately just for fun. I was exposed to the language for the first time when I was in college but I never really got the hang of pointers. Suddenly, I feel like I'm starting to get it, and it's not because I've been spending a lot of time writing or reading C. I think that I'm just a better programmer now and things somehow make more sense to me. Thanks to Hacker School and Perka, I've grown tremendously as a programmer in the last year.

Anyhow, enough gushing. I was playing around with mmap today. mmap is the system call, callable from C, that maps files and devices into virtual memory. Crack open the mmap(2) man page and you'll see this function signature:

void *
mmap(void *addr, size_t len, int prot, int flags, int fd,
        off_t offset);

I decided that I was going to try to wrap this function in my own function called char* mapfile(char* filename) which would take a filename and return a pointer to the first character mapped in the file. So I started writing my function:

char* mapfile(char* filename)
{
    int fd;
    char *addr;
    size_t size;
    struct stat sb;

    /* Open the input file. */
    fd = open(filename, O_RDONLY);

    if (fd == -1)
        handle_error("open");

    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    size = sb.st_size;

    /* Memory map the file. */
    addr = mmap(NULL,
            size,
            PROT_READ,
            MAP_PRIVATE,
            fd,
            0);

    return addr;
}

Okay, so fstat is a bit strange, who knows what that is actually about, but fine. I got the size of the file. So far so good. I return the address of the mapped chunk of memory, which is also the address of the first character of the file. Life is good.
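For what it's worth, fstat is less mysterious than it looks: it fills in a struct stat with metadata about an open file, and st_size is the field that holds the size in bytes. Here is a minimal sketch of just that step. The function name file_size is made up for illustration, and it returns -1 on error instead of calling handle_error.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the size of the file in bytes,
 * or -1 if the file cannot be opened or inspected. */
off_t file_size(const char *path)
{
    struct stat sb;   /* fstat writes the file's metadata into this */
    off_t size;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }

    size = sb.st_size;   /* total size in bytes for a regular file */
    close(fd);
    return size;
}
```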

But wait, I'm writing C so if I (in some sense) allocated some memory using mmap, then I need to free it with its corresponding freeing function. For mmap this is munmap which has the following function signature:

int
munmap(void *addr, size_t len);
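As a rough sketch of how the pair is used, here is a function that maps a file, reads it, and unmaps it all in one place, so the length is still in scope when munmap needs it. The name count_newlines is made up for illustration, and errors are reported by returning -1 rather than through handle_error.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: counts the '\n' bytes in a file, or returns -1 on error. */
long count_newlines(const char *path)
{
    struct stat sb;
    long count = 0;
    size_t size;
    char *addr;
    int fd;

    assert(path != NULL);

    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }

    size = (size_t)sb.st_size;
    if (size == 0) {          /* mmap rejects a zero length */
        close(fd);
        return 0;
    }

    addr = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                /* the mapping stays valid after close */
    if (addr == MAP_FAILED)
        return -1;

    for (size_t i = 0; i < size; i++)
        if (addr[i] == '\n')
            count++;

    munmap(addr, size);       /* same address, same length as the mmap call */
    return count;
}
```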

Just to be crystal clear: munmap takes two parameters: the pointer and the size of the mapped thing. That's a problem. My mapfile function only returns the pointer. The size_t is scoped within the function and is not available to the caller. How is the caller going to munmap? Maybe I can figure out a hack which has mapfile basically return two values:

return addr, size;

I hear that the Go programming language allows you to do things like this, but you certainly can't in C. Can you return an array? Not really, because the two values have different types. An array of pointers? Ehh no, pointers are also typed in C. Oh oh! Maybe you can make the data global in the program. Yeah, that seems like a great idea! Ha!
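As an aside, C does have one way to hand back values of different types together: wrap them in a struct and return it by value. This is not the route this post takes, and the names struct mapping and make_mapping are made up for illustration, so treat it as a sketch of the idea only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: bundles an address with the length that goes with it. */
struct mapping {
    char  *addr;
    size_t size;
};

/* Hypothetical helper: returns both values at once, as one struct by value. */
struct mapping make_mapping(char *addr, size_t size)
{
    struct mapping m;

    assert(addr != NULL || size == 0);   /* a NULL address only makes sense with size 0 */
    m.addr = addr;
    m.size = size;
    return m;
}
```

A caller would read m.addr and m.size from the returned value.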

How can anyone program in C without the ability to write a reasonable mapfile function? A coworker enlightened me with the answer:

In C, the caller has to manage the memory for the functions that it uses.

As the caller of mapfile, you need to allocate a size_t variable. We do it on the stack by simply declaring it. Then you give mapfile a pointer to that variable and allow it to do its magic and set the value:

size_t size;
char *addr = mapfile(filename, &size);

So you have to change the signature of mapfile to also take a size_t*. The implementation is much the same, except that at the very end you take size_p, the pointer provided by the caller, and write the value you want to expose into the variable it points to. I have no idea if this is good style but it seems reasonable enough to me.

char* mapfile(char* filename, size_t* size_p)
{
    size_t size;   /* set from sb.st_size in the skipped part */

    //
    // skipping over some stuff ...
    //

    /* Memory map the file. */
    addr = mmap(NULL,
            size,
            PROT_READ,
            MAP_PRIVATE,
            fd,
            0);

    /* Make the size available to the caller */
    *size_p = size;

    return addr;
}

And here's the full function. There's a bit more to the story because you need to hand back the file descriptor as well so that the caller can close the file, but that's just a variation on a theme.

char* mapfile(char* filename, int* fd_p, size_t* size_p)
{
    int fd;
    char *addr;
    size_t size;
    struct stat sb;

    /* Open the input file. */
    fd = open(filename, O_RDONLY);

    if (fd == -1)
        handle_error("open");

    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    size = sb.st_size;

    /* Memory map the file. */
    addr = mmap(NULL,
            size,
            PROT_READ,
            MAP_PRIVATE,
            fd,
            0);

    if (addr == MAP_FAILED)
        handle_error("mmap");

    /* Make these available to the caller */
    *fd_p = fd;
    *size_p = size;

    return addr;
}
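Here is a sketch of what the caller's side might look like. Two things in it are assumptions: handle_error is never defined in this post, so the macro below is a guess at a reasonable definition, and dump_file is a made-up caller. mapfile itself is a condensed copy of the function above, with the result of mmap checked against MAP_FAILED.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumed definition: print the failing call's error and exit. */
#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

/* Condensed copy of mapfile, with mmap's result checked. */
char *mapfile(char *filename, int *fd_p, size_t *size_p)
{
    int fd;
    char *addr;
    struct stat sb;

    fd = open(filename, O_RDONLY);
    if (fd == -1)
        handle_error("open");

    if (fstat(fd, &sb) == -1)
        handle_error("fstat");

    addr = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED)
        handle_error("mmap");

    /* Write the results into the caller's variables. */
    *fd_p = fd;
    *size_p = (size_t)sb.st_size;
    return addr;
}

/* Hypothetical caller: writes the file to stdout and returns its size.
 * It owns fd and size, so it is the one that can release the resources. */
size_t dump_file(char *filename)
{
    int fd;        /* storage for mapfile's extra results */
    size_t size;
    char *addr;

    assert(filename != NULL);
    addr = mapfile(filename, &fd, &size);

    fwrite(addr, 1, size, stdout);

    if (munmap(addr, size) == -1)   /* needs both the pointer and the size */
        handle_error("munmap");
    if (close(fd) == -1)
        handle_error("close");

    return size;
}
```

Every resource that mapfile acquires is released in dump_file, using values that live in dump_file's own stack frame.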

Memory allocation and pointer manipulation are clearly at the core of how C works, but it's more than that. Managing memory affects your style. It fundamentally changes the way you think about the design of your programs. It's not an annoyance, it's at the core of how everyone must think about and write C code.