Sunday, July 25, 2010

Forking File Descriptors

When you create a new process with a call to fork, one thing the child process retains is a copy of the parent's open file descriptors:
The child process shall have its own copy of the parent's file descriptors. Each of the child's file descriptors shall refer to the same open file description with the corresponding file descriptor of the parent.
This can be useful. For example, it enables IPC: the standard input, output, and error descriptors are remapped to pipes that were created before the fork, and exec then replaces the child image with a new command. However, this behavior can also be an annoyance if one is not careful. An example I hit recently was an application that occasionally hung when it shouldn't have. Poking at the system showed that it was blocked on a read from a pipe that should have been closed. Debugging further, the file descriptor for writing to the pipe had indeed been closed in the parent process, but a fork and exec had occurred in a different part of the code before it was closed, and the child process still had an open handle for writing to that pipe. A read on a pipe only reports end-of-file once every copy of the write end has been closed, so the reader kept waiting. This failure was of course sporadic and difficult to reproduce because it only occurred if the timing was just right.
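To make the failure mode concrete, here is a minimal sketch of a leaked write end. It is not the code from the application in question: leaked_writer_demo and at_eof are hypothetical names, and the read end is made non-blocking purely so that the demo reports "would block" instead of hanging the way the real application did.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if a non-blocking read on fd reports
// end-of-file, 0 if the read would block (some writer still exists),
// and -1 for any other result.
static int
at_eof(int fd) {
    char c;
    ssize_t n = read(fd, &c, 1);
    if (n == 0) {
        return 1;
    }
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0;
    }
    return -1;
}

// Hypothetical demo: *before receives the at_eof() result while a child
// still holds a copy of the write end, *after receives it once the child
// has exited. Returns 0 on success and -1 if setup fails.
int
leaked_writer_demo(int *before, int *after) {
    int data[2];  // the pipe whose write end leaks into the child
    int gate[2];  // only used to keep the child alive until released

    if (pipe(data) == -1 || pipe(gate) == -1) {
        return -1;
    }
    int flags = fcntl(data[0], F_GETFL);
    if (flags == -1 || fcntl(data[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        return -1;
    }
    if (pid == 0) {
        // Child: never uses data[1], but its inherited copy stays open.
        char c;
        close(gate[1]);
        if (read(gate[0], &c, 1) == -1) {  // returns when the parent closes gate[1]
            _exit(1);
        }
        _exit(0);
    }

    close(gate[0]);
    close(data[1]);             // the parent's write end is closed...
    *before = at_eof(data[0]);  // ...but the child's copy keeps the pipe open
    close(gate[1]);             // release the child
    waitpid(pid, NULL, 0);
    *after = at_eof(data[0]);   // the last write end is gone
    close(data[0]);
    return 0;
}
```

Under these assumptions, a call such as leaked_writer_demo(&before, &after) should leave before at 0, because the child's copy of the write end is still open, and after at 1 once the child has exited.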

Once the problem was diagnosed, though, it seemed like it should be trivial to fix: just make sure the child closes all of the file descriptors it isn't interested in. In the past I did this using a closefrom system call. This call does not seem to be widely supported, though. The naive alternative is to simply close everything in the range of possible values for file descriptors. On POSIX systems the upper limit of that range can be determined via the _SC_OPEN_MAX setting:

int
closefrom_v1(int fd) {
    // sysconf returns a long; valid descriptors are 0 .. max - 1
    long max = sysconf(_SC_OPEN_MAX);
    for (int i = fd; i < max; ++i) {
        close(i);
    }
    return 0;
}
This method is simple and it basically works, but it bothers me that it closes lots of file descriptors that aren't open. More importantly, with this implementation we cannot tell whether closing some descriptor failed. So here is an improved version that checks the error codes, ignoring EBADF for descriptors that were never open, and returns -1 on failure and the number of closed descriptors on success:
int
closefrom_v2(int fd) {
    int count = 0;
    // sysconf returns a long; valid descriptors are 0 .. max - 1
    long max = sysconf(_SC_OPEN_MAX);
    for (int i = fd; i < max; ++i) {
        if (close(i) == -1) {
            if (errno != EBADF) {
                return -1;
            }
        } else {
            ++count;
        }
    }
    return count;
}
Notice that we are still closing file descriptors that are not open. I'm not aware of a simple and portable scheme for listing the set of open descriptors for a process, so I'll live with the annoyance for now. To make sure it works as expected, we can create a simple test program that opens some files, forks a child, has the child use closefrom, and then execs a process that lists the file descriptors that are open. Here is the example:
#include <unistd.h>
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>

int
closefrom_v1(int fd) {
    // sysconf returns a long; valid descriptors are 0 .. max - 1
    long max = sysconf(_SC_OPEN_MAX);
    for (int i = fd; i < max; ++i) {
        close(i);
    }
    return 0;
}

int
closefrom_v2(int fd) {
    int count = 0;
    // sysconf returns a long; valid descriptors are 0 .. max - 1
    long max = sysconf(_SC_OPEN_MAX);
    for (int i = fd; i < max; ++i) {
        if (close(i) == -1) {
            if (errno != EBADF) {
                return -1;
            }
        } else {
            ++count;
        }
    }
    return count;
}

int
main(int argc, char **argv) {
    if (argc != 2) {
        printf("Usage: %s <cmd>\n", argv[0]);
        exit(1);
    }

    // Open a bunch of files so the child will have something to close
    int numFDs = 42;
    for (int i = 0; i < numFDs; ++i) {
        FILE *f = fopen("/dev/null", "w");
        if (f == NULL) {
            perror("open failed");
            exit(1);
        }
        assert(f != NULL);
    }
    printf("opened %d files\n", numFDs);
    // Flush before forking so the child does not inherit buffered output
    fflush(stdout);

    // Fork a child and exec
    pid_t pid = fork();
    if (pid == -1) {
        perror("could not fork");
    } else if (pid == 0) {
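        printf("closed %d files\n", closefrom_v2(3));
        // exec discards anything still sitting in the stdio buffer
        fflush(stdout);
        // argv[1] appears twice: the file to run, then the new argv[0]
        execlp(argv[1], argv[1], (char *) NULL);
        // Only reached if the exec failed
        perror("could not exec");
        _exit(127);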
    } else {
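        int status;
        waitpid(pid, &status, 0);
        // status is an encoded value; WEXITSTATUS extracts the exit code
        printf("child %d exited with status %d\n", (int) pid,
               WIFEXITED(status) ? WEXITSTATUS(status) : -1);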
    }
    return 0;
}
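As an aside, the result can also be checked from inside the child, before the exec, by probing each descriptor with fcntl. This is only a sketch: count_open_in_range is a hypothetical helper that is not part of the test program above, and like closefrom_v2 it still walks a range of values rather than listing the open descriptors directly.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: counts the open descriptors fd with
// lowfd <= fd < highfd. fcntl(fd, F_GETFD) fails with EBADF when fd is
// not an open descriptor, so a successful call means fd is open.
int
count_open_in_range(int lowfd, int highfd) {
    int count = 0;
    for (int fd = lowfd; fd < highfd; ++fd) {
        if (fcntl(fd, F_GETFD) != -1) {
            ++count;
        }
    }
    return count;
}
```

In the child, a call along the lines of count_open_in_range(3, (int) sysconf(_SC_OPEN_MAX)) after closefrom_v2(3) would be expected to return 0, assuming the limit fits in an int.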
For the program to exec, I'll use a simple shell script that lists the contents of the procfs directory /proc/PID/fd:
#!/bin/sh
ls /proc/$$/fd
Note that using procfs to get the listing of file descriptors is not portable and may not work for you. If you need a more general scheme, looking at what lsof does is probably a good starting point. Back to the main topic: if the script works, you should get output like:
$ ./testclose listfds.sh
opened 42 files
closed 42 files
0  1  2  255
child 31588 exited with status 0
Note that file descriptor 255 shown in the listing is added by the shell that is executing the script. It was not open until after the exec, so it was not closed.
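On systems that do have procfs, the same directory can also be used to avoid closing descriptors that are not open. The following is a rough, Linux-specific sketch under that assumption. closefrom_procfs is a hypothetical name, and the function reads /proc/self/fd, the calling process's own view of the directory the script lists.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical, non-portable variant: closes only the descriptors that
// /proc/self/fd reports as open and that are >= lowfd. Returns the number
// of descriptors closed, or -1 if the directory cannot be opened.
int
closefrom_procfs(int lowfd) {
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL) {
        return -1;
    }
    int dfd = dirfd(dir);  // the directory stream uses a descriptor too
    int count = 0;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);
        if (end == ent->d_name || *end != '\0') {
            continue;  // skip "." and ".."
        }
        if (fd < lowfd || fd == dfd) {
            continue;  // keep low descriptors and the one we are reading from
        }
        if (close((int) fd) == 0) {
            ++count;
        }
    }
    closedir(dir);
    return count;
}
```

Unlike closefrom_v2, this sketch does not report individual close failures, so it trades the error checking for a shorter loop.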
