golang io/fs

1. Background on designing io/fs

Interfaces are one of gophers’ favorite features of the Go language. Their implicit satisfaction, together with their status (at the time of writing) as the only generic mechanism available in the language, makes them a powerful tool for composition-oriented programming. They are also the foundation, and the primary means, by which Go builds abstractions.

Two of the most successful interface definitions in Go since its inception are io.Writer and io.Reader.

type Writer interface {
    Write(p []byte) (n int, err error)
}

type Reader interface {
    Read(p []byte) (n int, err error)
}

These two interfaces provide a good abstraction over reading from and writing to data sources: any data source that satisfies one of them can be read or written through it. For example:

  • String
r := strings.NewReader("hello, go")
r.Read(...)
  • Byte sequences
r := bytes.NewReader([]byte("hello, go"))
r.Read(...)
  • Data within the file
f, err := os.Open("foo.txt") // f satisfies io.Reader
f.Read(...)
  • Network sockets
r, err := net.DialTCP("tcp", nil, raddr) // raddr is a *net.TCPAddr; r (a *net.TCPConn) satisfies io.Reader
r.Read(...)
  • Constructing HTTP requests
req, err := http.NewRequestWithContext(ctx, "POST", url, bytes.NewReader([]byte("hello, go")))
  • Read the contents of a compressed file
package main

import (
    "compress/gzip"
    "io"
    "log"
    "os"
)

func main() {
    f, err := os.Open("hello.txt.gz")
    if err != nil {
        log.Fatal(err)
    }

    zr, err := gzip.NewReader(f)
    if err != nil {
        log.Fatal(err)
    }

    if _, err := io.Copy(os.Stdout, zr); err != nil {
        log.Fatal(err)
    }

    if err := zr.Close(); err != nil {
        log.Fatal(err)
    }
}

The io.Reader and io.Writer abstractions reflect the deep Unix background of the original Go core team and were probably influenced by the design philosophy that “in UNIX, everything is a stream of bytes”.

Unix has another design philosophy: everything is a file. In Unix, any device that performs I/O, whether a file, a socket, or a driver, gets a file descriptor once it is opened, and Unix reduces operations on all of these devices to operations on an abstract file. The user only needs to open the file and pass the resulting file descriptor to the appropriate function; the OS kernel uses the descriptor to locate the specific device, hiding the details of reading and writing each kind of device.

Unix also uses a tree structure to organize various abstract files (data files, sockets, disk drives, external devices, etc.) and access them through file paths, so that a tree structure constitutes a file system.

But for some unknown historical reason, the Go language doesn’t have abstractions for files and file systems built into the standard library! We know today that os.File is a concrete struct type, not an abstract type.

// $GOROOT/src/os/types.go

// File represents an open file descriptor.
type File struct {
        *file // os specific
}

The only field of the os.File struct, the file pointer, is itself an OS-specific type. Take os/file_unix.go as an example: on Unix, file is defined as follows.

// file is the real representation of *File.
// The extra level of indirection ensures that no clients of os
// can overwrite this data, which could cause the finalizer
// to close the wrong file descriptor.
type file struct {
        pfd         poll.FD
        name        string
        dirinfo     *dirInfo // nil unless directory being read
        nonblock    bool     // whether we set nonblocking mode
        stdoutOrErr bool     // whether this is stdout or stderr
        appendMode  bool     // whether file is opened for appending
}

Rob Pike, one of the creators of Go, has expressed regret that os.File was not defined as an interface in the first place.


But as Russ Cox commented in the above issue: “I guess I would think io.File should be the interface, but that’s all moot now”.

During the design of the file embedding feature for Go 1.16, however, the Go core team and the gophers involved in the discussion decided that an abstraction of file systems and files would benefit Go code as much as io.Reader and io.Writer have. So Rob Pike and Russ Cox took it upon themselves to complete the design of io/fs.

2. Exploring the io/fs package

The addition of io/fs was not “spur of the moment”, as the need for an abstract file system interface was raised and implemented many years ago with the godoc implementation.


This final implementation has been available for a long time in the form of the vfs package of the godoc tool. Although its implementation is somewhat complex and not sufficiently abstract, it is an important reference for the io/fs package design.

Go’s abstraction of file systems and files rests on two interface types in io/fs: FS and File. Their design follows Go’s usual “small interface” principle and conforms to the open-closed principle (open to extension, closed to modification).

// $GOROOT/src/io/fs/fs.go
type FS interface {
        // Open opens the named file.
        //
        // When Open returns an error, it should be of type *PathError
        // with the Op field set to "open", the Path field set to name,
        // and the Err field describing the problem.
        //
        // Open should reject attempts to open names that do not satisfy
        // ValidPath(name), returning a *PathError with Err set to
        // ErrInvalid or ErrNotExist.
        Open(name string) (File, error)
}

// A File provides access to a single file.
// The File interface is the minimum implementation required of the file.
// A file may implement additional interfaces, such as
// ReadDirFile, ReaderAt, or Seeker, to provide additional or optimized functionality.
type File interface {
        Stat() (FileInfo, error)
        Read([]byte) (int, error)
        Close() error
}

The FS interface is the minimal abstraction of a virtual file system and contains only one method, Open. The File interface is the minimal abstraction of a virtual file and contains only the three methods needed to abstract a file, no more and no fewer. Both can be extended by embedding them in new interface types, Go’s usual technique, just as io.ReadWriter is built from io.Reader and io.Writer. The design proposal calls this approach an extension interface: one or more new methods are added to a basic interface type to form a new interface. For example, the following extension interface, StatFS, is based on FS.

// A StatFS is a file system with a Stat method.
type StatFS interface {
        FS

        // Stat returns a FileInfo describing the file.
        // If there is an error, it should be of type *PathError.
        Stat(name string) (FileInfo, error)
}

For the basic File interface, the fs package provides only one extension interface, ReadDirFile, which adds a ReadDir method to File. Naming the new interface type by combining the added method’s name with the name of the basic interface is also a Go convention.

For the FS interface, the fs package provides several common extension interfaces: GlobFS, ReadDirFS, ReadFileFS, StatFS, and SubFS.


Take the ReadDirFS interface of the fs package as an example.

// $GOROOT/src/io/fs/readdir.go
type ReadDirFS interface {
    FS

    // ReadDir reads the named directory
    // and returns a list of directory entries sorted by filename.
    ReadDir(name string) ([]DirEntry, error)
}

// ReadDir reads the named directory
// and returns a list of directory entries sorted by filename.
//
// If fs implements ReadDirFS, ReadDir calls fs.ReadDir.
// Otherwise ReadDir calls fs.Open and uses ReadDir and Close
// on the returned file.
func ReadDir(fsys FS, name string) ([]DirEntry, error) {
    if fsys, ok := fsys.(ReadDirFS); ok {
        return fsys.ReadDir(name)
    }

    file, err := fsys.Open(name)
    if err != nil {
        return nil, err
    }
    defer file.Close()

    dir, ok := file.(ReadDirFile)
    if !ok {
        return nil, &PathError{Op: "readdir", Path: name, Err: errors.New("not implemented")}
    }

    list, err := dir.ReadDir(-1)
    sort.Slice(list, func(i, j int) bool { return list[i].Name() < list[j].Name() })
    return list, err
}

Along with ReadDirFS, the standard library provides a helper function, ReadDir, whose first parameter is a value of the FS interface type. Internally, ReadDir first uses a type assertion to check whether fsys implements ReadDirFS. If it does, ReadDir calls the fsys.ReadDir method directly; if not, it falls back to the generic implementation, which opens the directory and reads it through ReadDirFile. The other FS extension interfaces have helper functions of their own, which is a Go convention. If you implement your own FS extension, follow it: provide a helper function to accompany your extension interface.

Some packages in the standard library that involve virtual file systems were adapted to io/fs in Go 1.16, such as: os, net/http, html/template, text/template, archive/zip, etc.
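
As one example of this adaptation, *zip.Reader in archive/zip implements fs.FS, so the generic helpers in io/fs work on an archive directly. The sketch below builds a small archive in memory and reads a file back through fs.ReadFile; the function names buildZip and readFromZip are hypothetical.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io/fs"
)

// buildZip (hypothetical) creates an in-memory archive holding one file.
func buildZip(name, content string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte(content)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readFromZip (hypothetical) treats the archive as an fs.FS and reads
// one file from it with the generic fs.ReadFile helper.
func readFromZip(archive []byte, name string) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return "", err
	}
	data, err := fs.ReadFile(zr, name) // *zip.Reader satisfies fs.FS
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	archive, err := buildZip("hello.txt", "hello, zip")
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	text, err := readFromZip(archive, "hello.txt")
	fmt.Println(text, err)
}
```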

Take http.FileServer as an example. Before Go 1.16, a static file server was generally written like this

// github.com/bigwhite/experiments/blob/master/iofs/fileserver_classic.go
package main

import "net/http"

func main() {
    http.ListenAndServe(":8080", http.FileServer(http.Dir(".")))
}

Since Go 1.16, the http package can adapt the FS and File interfaces of io/fs through http.FS, so we can write it like this.

// github.com/bigwhite/experiments/blob/master/iofs/fileserver_iofs.go
package main

import (
    "net/http"
    "os"
)

func main() {
    http.ListenAndServe(":8080", http.FileServer(http.FS(os.DirFS("./"))))
}

The DirFS function, new in the os package, returns an implementation of fs.FS: a file system for the tree of files rooted at the directory dir.

We can refer to DirFS to implement a goFilesFS implementation that returns only files with the .go suffix.

// github.com/bigwhite/experiments/blob/master/iofs/gofilefs/gofilefs.go

package gfs

import (
    "io/fs"
    "os"
    "strings"
)

func GoFilesFS(dir string) fs.FS {
    return goFilesFS(dir)
}

type goFile struct {
    *os.File
}

func Open(name string) (*goFile, error) {
    f, err := os.Open(name)
    if err != nil {
        return nil, err
    }
    return &goFile{f}, nil
}

func (f goFile) ReadDir(count int) ([]fs.DirEntry, error) {
    entries, err := f.File.ReadDir(count)
    if err != nil {
        return nil, err
    }
    var newEntries []fs.DirEntry

    for _, entry := range entries {
        if !entry.IsDir() {
            ss := strings.Split(entry.Name(), ".")
            if ss[len(ss)-1] != "go" {
                continue
            }
        }
        newEntries = append(newEntries, entry)
    }
    return newEntries, nil
}

type goFilesFS string

func (dir goFilesFS) Open(name string) (fs.File, error) {
    f, err := Open(string(dir) + "/" + name)
    if err != nil {
        return nil, err // nil fs.File
    }
    return f, nil
}

In the GoFilesFS implementation above:

  • goFilesFS implements the FS interface of io/fs, and the fs.File returned by its Open method is our custom goFile struct.
  • The goFile struct satisfies the File interface of io/fs by embedding *os.File.
  • goFile defines its own ReadDir method, which shadows the promoted method of the same name from *os.File and filters out files that do not have the .go suffix.

With the GoFilesFS implementation in place, we can then pass it to http.FileServer.

// github.com/bigwhite/experiments/blob/master/iofs/fileserver_gofilefs.go
package main

import (
    "net/http"

    gfs "github.com/bigwhite/testiofs/gofilefs"
)

func main() {
    http.ListenAndServe(":8080", http.FileServer(http.FS(gfs.GoFilesFS("./"))))
}

Opening localhost:8080 in a browser, we can see a file tree that contains only directories and Go source files!

3. Improve code testability with io/fs

Now that Go 1.16 has added abstractions for file systems and files, we can use io/fs to improve the testability of code that works with files.

We have a function like this.

func FindGoFiles(dir string) ([]string, error)

This function finds the paths of all Go source files under dir and returns them in a []string. A first version is easy to write.

// github.com/bigwhite/experiments/blob/master/iofs/gowalk/demo1/gowalk.go

package demo

import (
    "os"
    "path/filepath"
    "strings"
)

func FindGoFiles(dir string) ([]string, error) {
    var goFiles []string
    err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return err // info may be nil when Walk reports an error
        }
        if info.IsDir() {
            return nil
        }

        ss := strings.Split(path, ".")
        if ss[len(ss)-1] != "go" {
            return nil
        }

        goFiles = append(goFiles, path)
        return nil
    })
    if err != nil {
        return nil, err
    }

    return goFiles, nil
}

This version uses filepath.Walk directly, which ties it to the os package: to test the function we have to construct a real file tree on disk, like the following.

$tree testdata
testdata
└── foo
    ├── 1
    │   └── 1.txt
    ├── 1.go
    ├── 2
    │   ├── 2.go
    │   └── 2.txt
    └── bar
        ├── 3
        │   └── 3.go
        └── 4.go

Following Go convention, we place the external data files that the test depends on under testdata. Here is the test file for the above function.

// github.com/bigwhite/experiments/blob/master/iofs/gowalk/demo1/gowalk_test.go
package demo

import (
    "testing"
)

func TestFindGoFiles(t *testing.T) {
    m := map[string]bool{
        "testdata/foo/1.go":       true,
        "testdata/foo/2/2.go":     true,
        "testdata/foo/bar/3/3.go": true,
        "testdata/foo/bar/4.go":   true,
    }

    files, err := FindGoFiles("testdata/foo")
    if err != nil {
        t.Errorf("want nil, actual %s", err)
    }

    if len(files) != 4 {
        t.Errorf("want 4, actual %d", len(files))
    }

    for _, f := range files {
        _, ok := m[f]
        if !ok {
            t.Errorf("want [%s], actual not found", f)
        }
    }
}

The first version of FindGoFiles is clearly not very testable: it depends on a specific layout of files on disk, even though testdata is committed to the code repository alongside the source.

With the io/fs package, we can use the FS interface to improve the testability of FindGoFiles, so we redesign the function.

// github.com/bigwhite/experiments/blob/master/iofs/gowalk/demo2/gowalk.go

package demo

import (
    "io/fs"
    "strings"
)

func FindGoFiles(dir string, fsys fs.FS) ([]string, error) {
    var newEntries []string
    err := fs.WalkDir(fsys, dir, func(path string, entry fs.DirEntry, err error) error {
        if err != nil {
            return err // entry may be nil when WalkDir reports an error
        }

        if !entry.IsDir() {
            ss := strings.Split(entry.Name(), ".")
            if ss[len(ss)-1] != "go" {
                return nil
            }
            newEntries = append(newEntries, path)
        }
        return nil
    })

    if err != nil {
        return nil, err
    }

    return newEntries, nil
}

This time we add a parameter fsys of type fs.FS to FindGoFiles, which is the key to decoupling the function from any specific FS implementation. Of course, the test approach of demo1 also works for this version of FindGoFiles.

// github.com/bigwhite/experiments/blob/master/iofs/gowalk/demo2/gowalk_test.go
package demo

import (
    "os"
    "testing"
)

func TestFindGoFiles(t *testing.T) {
    m := map[string]bool{
        "testdata/foo/1.go":       true,
        "testdata/foo/2/2.go":     true,
        "testdata/foo/bar/3/3.go": true,
        "testdata/foo/bar/4.go":   true,
    }

    files, err := FindGoFiles("testdata/foo", os.DirFS("."))
    if err != nil {
        t.Errorf("want nil, actual %s", err)
    }

    if len(files) != 4 {
        t.Errorf("want 4, actual %d", len(files))
    }

    for _, f := range files {
        _, ok := m[f]
        if !ok {
            t.Errorf("want [%s], actual not found", f)
        }
    }
}

Since we use the fs.FS interface, any type that implements fs.FS can be used to construct tests for FindGoFiles. Implementing the fs.FS interface and the fs.File-related interfaces yourself is still a bit tricky, though. The Go standard library has anticipated this and provides the testing/fstest package, so we can use the in-memory FS implemented in fstest to test FindGoFiles directly.

// github.com/bigwhite/experiments/blob/master/iofs/gowalk/demo3/gowalk_test.go
package demo

import (
    "testing"
    "testing/fstest"
)

/*
$tree testdata
testdata
└── foo
    ├── 1
    │   └── 1.txt
    ├── 1.go
    ├── 2
    │   ├── 2.go
    │   └── 2.txt
    └── bar
        ├── 3
        │   └── 3.go
        └── 4.go

5 directories, 6 files

*/

func TestFindGoFiles(t *testing.T) {
    m := map[string]bool{
        "testdata/foo/1.go":       true,
        "testdata/foo/2/2.go":     true,
        "testdata/foo/bar/3/3.go": true,
        "testdata/foo/bar/4.go":   true,
    }

    mfs := fstest.MapFS{
        "testdata/foo/1.go":       {Data: []byte("package foo\n")},
        "testdata/foo/1/1.txt":    {Data: []byte("1111\n")},
        "testdata/foo/2/2.txt":    {Data: []byte("2222\n")},
        "testdata/foo/2/2.go":     {Data: []byte("package bar\n")},
        "testdata/foo/bar/3/3.go": {Data: []byte("package zoo\n")},
        "testdata/foo/bar/4.go":   {Data: []byte("package zoo1\n")},
    }

    files, err := FindGoFiles("testdata/foo", mfs)
    if err != nil {
        t.Errorf("want nil, actual %s", err)
    }

    if len(files) != 4 {
        t.Errorf("want 4, actual %d", len(files))
    }

    for _, f := range files {
        _, ok := m[f]
        if !ok {
            t.Errorf("want [%s], actual not found", f)
        }
    }
}

Because FindGoFiles accepts an fs.FS value as a parameter, it is significantly more testable: we can construct test scenarios in code instead of building complex and changeable file trees on a real physical disk.

4. Summary

The addition of io/fs makes it easy to program against interfaces rather than against os.File as a concrete implementation. The package does not feel out of place at all; it is as if it and its abstractions had existed since Go 1.0. This is a benefit of the implicit satisfaction of Go interfaces, and it makes the result feel very natural!

The code covered in this article can be downloaded here.