eventfd definition

The eventfd() function creates an “eventfd object” that user-space applications can use as an event wait/notify mechanism, and that the kernel can use to notify user-space applications of events.

#include <sys/eventfd.h>
int eventfd(unsigned int initval, int flags);

Initialization parameters

initval

The eventfd object contains a uint64_t counter that is maintained by the kernel. The initval argument sets the initial value of this counter when the eventfd is created.

The eventfd() call returns a new fd that points to this newly created eventfd object.
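As a minimal sketch of how initval seeds the counter, the helper below creates an eventfd and immediately reads the counter back. The function name initial_counter_roundtrip and the use of UINT64_MAX as a failure marker are assumptions made for illustration only.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create an eventfd whose counter starts at `initval`,
 * read the counter once, and return the value that was read.
 * Returns UINT64_MAX if the eventfd cannot be created or the read fails
 * (a read fails with EAGAIN when the counter is 0, because the fd is
 * created non-blocking here). */
uint64_t initial_counter_roundtrip(unsigned int initval)
{
    int efd = eventfd(initval, EFD_NONBLOCK);
    if (efd == -1)
        return UINT64_MAX;

    uint64_t value = 0;
    ssize_t n = read(efd, &value, sizeof value);
    close(efd);

    return n == (ssize_t)sizeof value ? value : UINT64_MAX;
}
```

Calling initial_counter_roundtrip(5) should hand back 5, while an initial value of 0 leaves nothing to read and yields the failure marker.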

flags

The flags parameter is formed by bitwise ORing zero or more of the following options, which determine the behavior of the eventfd.

  • EFD_CLOEXEC (Linux 2.6.27 and later): sets the close-on-exec flag on the new fd, so the fd is closed automatically when exec is called
  • EFD_NONBLOCK (Linux 2.6.27 and later): sets the fd to non-blocking mode
  • EFD_SEMAPHORE (Linux 2.6.30 and later): gives reads from the eventfd semaphore-like semantics; see the description of read below
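The flag combination described above can be sketched as follows. The helper name flags_take_effect is hypothetical; it uses fcntl(2) to check that the two flags really ended up on the descriptor.

```c
#include <fcntl.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create an eventfd with two flags ORed together and
 * confirm through fcntl() that both were applied.
 * Returns 1 if both are set, 0 if either is missing, -1 on error. */
int flags_take_effect(void)
{
    int efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
    if (efd == -1)
        return -1;

    int fd_flags = fcntl(efd, F_GETFD);     /* descriptor flags: FD_CLOEXEC */
    int status_flags = fcntl(efd, F_GETFL); /* status flags: O_NONBLOCK */
    close(efd);

    if (fd_flags == -1 || status_flags == -1)
        return -1;
    return (fd_flags & FD_CLOEXEC) && (status_flags & O_NONBLOCK);
}
```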

read

On success, read returns 8, the size of the 8-byte integer it stores in the buffer. If the supplied buffer is smaller than 8 bytes, read returns -1 and sets errno to EINVAL.

The result of read depends on whether the eventfd’s counter is 0 and on whether EFD_SEMAPHORE was set in flags when the eventfd object was created.

  • If EFD_SEMAPHORE is not set and the counter is non-zero, read returns an 8-byte integer containing the counter value and resets the counter to 0
  • If EFD_SEMAPHORE is set and the counter is non-zero, read returns an 8-byte integer with the value 1 and decrements the counter by 1
  • If the counter is 0, read either blocks until the counter becomes non-zero or fails with errno set to EAGAIN, depending on whether the fd is non-blocking
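The three cases above can be compared side by side with a small sketch. The helper names drain, first_read_value and successful_reads are hypothetical; the eventfd is always created non-blocking so that the final read on an empty counter fails instead of blocking.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: post `posted` to a fresh non-blocking eventfd created
 * with `extra_flags`, then read until a read fails (EAGAIN on a zero counter).
 * Stores the first value read in *first when `first` is non-NULL.
 * Returns the number of successful reads, or -1 on setup failure. */
static int drain(int extra_flags, uint64_t posted, uint64_t *first)
{
    int efd = eventfd(0, EFD_NONBLOCK | extra_flags);
    if (efd == -1)
        return -1;
    if (write(efd, &posted, sizeof posted) != (ssize_t)sizeof posted) {
        close(efd);
        return -1;
    }

    int reads = 0;
    uint64_t value;
    while (read(efd, &value, sizeof value) == (ssize_t)sizeof value) {
        if (reads == 0 && first != NULL)
            *first = value;
        reads++;
    }
    close(efd);
    return reads;
}

/* Value returned by the first read, or 0 if no read succeeded. */
uint64_t first_read_value(int extra_flags, uint64_t posted)
{
    uint64_t first = 0;
    return drain(extra_flags, posted, &first) > 0 ? first : 0;
}

/* Number of reads that succeed before the counter is empty. */
int successful_reads(int extra_flags, uint64_t posted)
{
    return drain(extra_flags, posted, NULL);
}
```

With a posted value of 3, the default mode should need a single read that returns 3, whereas EFD_SEMAPHORE mode should need three reads that each return 1.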

write

write adds the 8-byte integer supplied in the buffer to the eventfd’s counter. The maximum value the counter can hold is the largest uint64_t value minus 1, i.e. 0xfffffffffffffffe. If the addition would exceed that maximum, write either blocks until a read makes room or fails with errno set to EAGAIN, depending on whether the fd is non-blocking.

If the buffer provided to the write call is smaller than 8 bytes, or if an attempt is made to write the value 0xffffffffffffffff, write fails with errno set to EINVAL.
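The limits on write can be probed with a sketch like the one below. The helper name write_errno is hypothetical, and the eventfd is created non-blocking so that an overflowing write reports EAGAIN rather than blocking.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create a non-blocking eventfd whose counter starts at
 * `initial`, try to add `value` to it, and report the outcome.
 * Returns 0 if the write succeeded, otherwise the errno of the failing call
 * (which may also be the errno from eventfd() itself). */
int write_errno(unsigned int initial, uint64_t value)
{
    int efd = eventfd(initial, EFD_NONBLOCK);
    if (efd == -1)
        return errno;

    int result = 0;
    if (write(efd, &value, sizeof value) != (ssize_t)sizeof value)
        result = errno; /* saved before close(), which may change errno */

    close(efd);
    return result;
}
```

Writing UINT64_MAX should be rejected with EINVAL, writing UINT64_MAX - 1 to an empty counter should succeed, and writing the same value to a counter that already holds 1 should fail with EAGAIN.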

poll, select and other similar operations

eventfd supports poll, select, epoll, and other similar operations.

  • eventfd is readable when the value of counter is greater than 0
  • eventfd is writable when the counter is less than 0xfffffffffffffffe, i.e., a value of at least 1 can be written without blocking
  • When the counter overflows, select considers the eventfd both writable and readable, and poll returns a POLLERR event. As mentioned above, write can never cause the counter to overflow. However, if the KAIO subsystem performs 2^64 eventfd “signal posts”, an overflow may occur (theoretically possible, but unlikely in practice). If an overflow has occurred, read returns the maximum uint64_t value (i.e. 0xffffffffffffffff).

The eventfd file descriptor also supports other file descriptor multiplexing APIs: pselect and ppoll.
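A minimal sketch of the readiness rules above, using poll(2) with a zero timeout. The helper name poll_events is an assumption for illustration.

```c
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create an eventfd whose counter starts at `initial`
 * and ask poll() which of POLLIN / POLLOUT are ready right now.
 * Returns the revents bitmask, or -1 on error. */
int poll_events(unsigned int initial)
{
    int efd = eventfd(initial, EFD_NONBLOCK);
    if (efd == -1)
        return -1;

    struct pollfd pfd;
    pfd.fd = efd;
    pfd.events = POLLIN | POLLOUT;
    pfd.revents = 0;

    int ready = poll(&pfd, 1, 0); /* timeout 0: report state without waiting */
    close(efd);

    return ready < 0 ? -1 : pfd.revents;
}
```

An eventfd with a zero counter should report only POLLOUT, while one with a counter of 1 should report both POLLIN and POLLOUT.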

close

File descriptors should be closed when they are no longer needed. When all file descriptors associated with the same eventfd object have been closed, the kernel releases the object’s resources.

A copy of the file descriptor created by eventfd() is inherited by a child process created by fork(2). The duplicate file descriptor is associated with the same eventfd object. File descriptors created by eventfd() are preserved across execve(2) unless the close-on-exec flag is set.
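The sharing of one eventfd object between duplicate descriptors can be sketched with dup(2), which creates a second descriptor in the same process much as fork does in a child. The helper name read_via_duplicate is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: write `value` through the original descriptor, close
 * that descriptor, then read the counter through a dup() copy.
 * Returns the value read, or UINT64_MAX on any failure. */
uint64_t read_via_duplicate(uint64_t value)
{
    int efd = eventfd(0, EFD_NONBLOCK);
    if (efd == -1)
        return UINT64_MAX;

    int copy = dup(efd);
    if (copy == -1) {
        close(efd);
        return UINT64_MAX;
    }

    ssize_t written = write(efd, &value, sizeof value);
    close(efd); /* the object stays alive: `copy` still refers to it */

    uint64_t seen = UINT64_MAX;
    if (written != (ssize_t)sizeof value ||
        read(copy, &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = UINT64_MAX;

    close(copy); /* last descriptor closed: the kernel can free the object */
    return seen;
}
```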

Return Value

On success, eventfd() returns a new eventfd file descriptor. On error, the return value is -1 and errno is set to the corresponding error code.

Error codes

  • EINVAL: an unsupported value was specified in flags
  • EMFILE: the per-process limit on the number of open file descriptors has been reached
  • ENFILE: the system-wide limit on the total number of open files has been reached
  • ENODEV: the (internal) anonymous inode device could not be mounted
  • ENOMEM: there was insufficient memory to create a new eventfd file descriptor
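The EINVAL case is the easiest of these errors to trigger on purpose. The sketch below assumes a glibc that passes flags through to eventfd2(); the helper name eventfd_errno is hypothetical.

```c
#include <errno.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: try to create an eventfd with the given flags.
 * Returns 0 on success (the descriptor is closed again immediately),
 * otherwise the errno set by eventfd(). */
int eventfd_errno(int flags)
{
    int efd = eventfd(0, flags);
    if (efd == -1)
        return errno;

    close(efd);
    return 0;
}
```

Passing a bit outside the three documented flags, for example ~(EFD_CLOEXEC | EFD_NONBLOCK | EFD_SEMAPHORE), should produce EINVAL.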

Version Compatibility

eventfd() is available in Linux kernel 2.6.22 and later, with glibc support from version 2.8 onwards. The eventfd2() system call is available from kernel 2.6.27 onwards. From version 2.9 onwards, glibc’s eventfd() wrapper is implemented on top of eventfd2() where the kernel supports it.

Notes

In all cases where the pipe is used only to signal events, the application can use the eventfd file descriptor instead of the pipe (see pipe(2)). The eventfd file descriptor has a much lower kernel overhead than a pipe and requires only one file descriptor (while a pipe requires two).

When used in the kernel, the eventfd file descriptor can provide a bridge from the kernel to user space, for example, by allowing functions such as KAIO (kernel AIO) to signal to the file descriptor that certain operations have completed.

A key point about the eventfd file descriptor is that it can be monitored using select(2), poll(2) or epoll(7) just like any other file descriptor. This means that an application can simultaneously monitor the readiness of “traditional” files and the readiness of other kernel mechanisms that support the eventfd interface. (Without the eventfd() interface, these mechanisms could not be multiplexed via select(2), poll(2), or epoll(7).)

The current value of the eventfd counter can be viewed through the entry for the corresponding file descriptor in the process’s /proc/[pid]/fdinfo directory. For more details, see proc(5).
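As a rough sketch of that inspection path, the helper below reads the fdinfo entry of its own process. It assumes that /proc is mounted and that the entry contains an eventfd-count line holding the counter in hexadecimal, as proc(5) describes for Linux 3.8 and later. The helper name counter_from_fdinfo is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create an eventfd whose counter starts at `initval`
 * and recover the counter from /proc/self/fdinfo/<fd> without reading the fd.
 * Returns UINT64_MAX if the entry cannot be opened or parsed. */
uint64_t counter_from_fdinfo(unsigned int initval)
{
    int efd = eventfd(initval, 0);
    if (efd == -1)
        return UINT64_MAX;

    char path[64];
    snprintf(path, sizeof path, "/proc/self/fdinfo/%d", efd);

    uint64_t count = UINT64_MAX;
    FILE *fp = fopen(path, "r");
    if (fp != NULL) {
        char line[128];
        while (fgets(line, sizeof line, fp) != NULL) {
            /* Assumed line format: "eventfd-count: <hexadecimal value>" */
            if (sscanf(line, "eventfd-count: %" SCNx64, &count) == 1)
                break;
        }
        fclose(fp);
    }

    close(efd);
    return count;
}
```

Because the fdinfo file is only read, the counter itself is left untouched, unlike a read on the eventfd.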

There are two underlying Linux system calls: eventfd() and, more recently, eventfd2(). The former system call does not implement the flags argument. The latter system call implements the above flag values. The glibc wrapper function will use eventfd2() where available.

Example

#include <errno.h>    /* Definition of errno & EAGAIN */
#include <sys/eventfd.h>
#include <unistd.h>
#include <inttypes.h> /* Definition of PRIu64 & PRIx64 */
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> /* Definition of uint64_t */

#define handle_error(msg)   \
    do                      \
    {                       \
        perror(msg);        \
        exit(EXIT_FAILURE); \
    } while (0)

int main(int argc, char *argv[])
{
    int efd;
    uint64_t u = 0; /* initialized because the parent prints it before its first read */
    ssize_t s;

    if (argc < 2)
    {
        fprintf(stderr, "Usage: %s <num>...\n", argv[0]);
        exit(EXIT_FAILURE);
    }
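    // Default blocking mode: one read returns the whole counter and resets it to zero; if the counter is already zero, the read blocks
    //efd = eventfd(0, 0);
    // Non-blocking semaphore mode: each read returns 1 and decrements the counter; a read on a zero counter fails with EAGAIN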
    efd = eventfd(0, EFD_SEMAPHORE|EFD_NONBLOCK);
    if (efd == -1)
        handle_error("eventfd");

    switch (fork())
    {
    case 0:
        for (int j = 1; j < argc; j++)
        {
            printf("Child writing %s to efd\n", argv[j]);
            u = strtoull(argv[j], NULL, 0);
            /* strtoull() allows various bases */
            s = write(efd, &u, sizeof(uint64_t));
            if (s != sizeof(uint64_t))
                handle_error("write");
        }
        printf("Child completed write loop\n");

        exit(EXIT_SUCCESS);

    default:
        while (1)
        {
            sleep(1);
            printf("Parent about to read, u: %" PRIu64 "\n", u);
            s = read(efd, &u, sizeof(uint64_t));
            printf("Parent read s:%zd, u:%" PRIu64 "\n", s, u);
            if (s != sizeof(uint64_t)){
                if(errno == EAGAIN){
                    exit(EXIT_SUCCESS);
                }
                handle_error("read");
            }
            printf("Parent read %" PRIu64 " (%#" PRIx64 ") from efd\n", u, u);
        }
    case -1:
        handle_error("fork");
    }
}