I recently ran into an issue: irssi 1.4.1 fails to build on darwin arm64. The symptom is an error at link time.

Undefined symbols for architecture arm64:
  "_current_theme", referenced from:
      _format_get_text_theme in libfe_common_core.a(formats.c.o)
      _format_get_text in libfe_common_core.a(formats.c.o)
      _strip_codes in libfe_common_core.a(formats.c.o)
      _format_send_as_gui_flags in libfe_common_core.a(formats.c.o)
      _window_print_daychange in libfe_common_core.a(fe-windows.c.o)
      _printformat_module_dest_charargs in libfe_common_core.a(printtext.c.o)
      _printformat_module_gui_args in libfe_common_core.a(printtext.c.o)
      ...
  "_default_formats", referenced from:
      _format_find_tag in libfe_common_core.a(formats.c.o)
      _format_get_text_theme_args in libfe_common_core.a(formats.c.o)
      _printformat_module_dest_args in libfe_common_core.a(printtext.c.o)
      _printformat_module_gui_args in libfe_common_core.a(printtext.c.o)
ld: symbol(s) not found for architecture arm64

The source file themes.c defines these two global variables.

THEME_REC *current_theme;
GHashTable *default_formats;

And themes.c.o, the object file compiled from themes.c, is indeed in the archive file.

$ ar t src/fe-common/core/libfe_common_core.a | grep themes.c.o
themes.c.o

And themes.c.o also defines these two symbols.

$ objdump -t src/fe-common/core/libfe_common_core.a.p/themes.c.o | grep COM
0000000000000008         01 COM    00 0300 _current_theme
0000000000000008         01 COM    00 0300 _default_formats
0000000000000008         01 COM    00 0300 _themes

So what is the problem? The link command is given libfe_common_core.a, the .a contains themes.c.o, and themes.c.o defines the symbols we are looking for. Why are the symbols still reported as undefined?

The answer lies in the COMMON symbols.

COMMON symbols

The background and mechanics of COMMON symbols are covered in MaskRay’s blog post All about COMMON symbols, which describes the topic in great detail from the linker’s point of view.

In short, COMMON symbols were introduced to interoperate with Fortran. In C, a COMMON symbol corresponds to a global variable defined without an initializer. In the end it is still placed in the .bss segment and zero-initialized. So, given:

int common_symbol;
int not_common_symbol = 0;

The two statements end up with similar results, except that the first one is a COMMON Symbol and the second one is a normal GLOBAL Symbol.
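To make the difference concrete, here is a minimal sketch in C. The helper name check_both_zero is made up for illustration. Whether the first variable is actually emitted as a COMMON symbol depends on the compiler and on -fcommon / -fno-common, so the comments describe the typical case rather than a guarantee.

```c
#include <assert.h>

/* Tentative definition (no initializer). When compiled with -fcommon,
 * compilers typically emit this as a COMMON symbol. */
int common_symbol;

/* Definition with an initializer: an ordinary defined global symbol. */
int not_common_symbol = 0;

/* Hypothetical helper: both objects have static storage duration, so the
 * C language guarantees that both read as zero at program start. */
void check_both_zero(void) {
    assert(common_symbol == 0);
    assert(not_common_symbol == 0);
}
```

At run time the two variables behave the same. The difference only shows up in the object file's symbol table, for example in the COM marker in the objdump output shown earlier.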

This still doesn’t seem to be related to the Undefined symbols error. What is the problem?

Archive

A static library is usually shipped as an Archive, with the suffix .a. It is a collection of .o files packed together, plus an index: a separate table that records which symbols each .o defines. The advantage is that when looking for a symbol, the linker can look it up directly in the index instead of scanning every .o.

To create an Archive on Linux, use the ar command.

ar cr libxxx.a a.o b.o c.o

where c means create and r means insert (and overwrite).

On macOS, you have to use libtool -static to create an Archive.

libtool -static -o libxxx.a a.o b.o c.o

Otherwise an error will be reported when linking.

ld: warning: ignoring file libxxx.a, building for macOS-arm64 but attempting to link with file built for unknown-unsupported file format

Then you can use the ar t command to see what is in the Archive.

$ ar t libxxx.a
a.o
b.o
c.o

The index of the Archive can be viewed with the nm --print-armap command.

$ nm --print-armap libxxx.a
Archive index:
symbol1 in a.o
symbol2 in b.o
symbol3 in c.o

a.o:
0000000000000000 T symbol1

So we already know about Archive: it is a collection of multiple .o files, and it implements an index. When linking, the index is used to find .o instead of traversing all .o files.
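The index lookup can be pictured with a small toy model in C. This is only a sketch of the idea, not how any real linker stores or searches the table. The names index_entry, toy_index, find_member and example_lookup are hypothetical, and the entries simply mirror the nm --print-armap output above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of a toy archive index: a symbol and the member that defines it. */
struct index_entry {
    const char *symbol;
    const char *member;
};

/* Toy index mirroring the "Archive index" listing above. */
static const struct index_entry toy_index[] = {
    { "symbol1", "a.o" },
    { "symbol2", "b.o" },
    { "symbol3", "c.o" },
};

/* Return the member that the index lists for `symbol`, or NULL if the index
 * has no entry for it. A linear scan keeps the sketch short. */
const char *find_member(const char *symbol) {
    size_t n = sizeof toy_index / sizeof toy_index[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_index[i].symbol, symbol) == 0)
            return toy_index[i].member;
    }
    return NULL;
}

/* Usage: an indexed symbol resolves to its member, an unindexed one does not. */
void example_lookup(void) {
    assert(strcmp(find_member("symbol2"), "b.o") == 0);
    assert(find_member("not_in_index") == NULL);
}
```

The point of the model is the NULL case: if a symbol has no row in the index, the lookup never reaches the member, regardless of what that member contains.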

Linking problem

So, going back to the linking problem at the beginning, since we have confirmed that the symbol is defined in the .o file and that this .o is indeed in the .a file, there is only one last possibility left: the symbol is not in the index.

Checking with the nm --print-armap command, I found that _default_formats and _current_theme are defined in the corresponding .o but do not appear in the Archive index.

@ailin-nemui pointed out the problem and provided a link: OS X linker unable to find symbols from a C file which only contains variables. The important point it makes is that the macOS versions of ar/ranlib/libtool do not create index entries for COMMON symbols by default. So there are three possible solutions:

  1. Do not create COMMON symbols: add the compile option -fno-common, which is the default in newer compilers (GCC 10 and Clang 11 onward).
  2. Create index entries for COMMON symbols: use the libtool -static -c command, where the -c option turns on indexing of COMMON symbols.
  3. Modify the code: give the global variables explicit initial values.

Any of these solves the problem properly.
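As an illustration of the third option, the two definitions from themes.c could be given explicit initializers. This is a sketch only and not necessarily the change irssi made. The typedefs are stand-ins so the snippet compiles on its own (the real THEME_REC comes from irssi and GHashTable from GLib), and globals_start_null is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in opaque types, assumed here only so the snippet is self-contained. */
typedef struct theme_rec_stub THEME_REC;
typedef struct hash_table_stub GHashTable;

/* With an explicit initializer these are no longer tentative definitions,
 * so the compiler emits ordinary defined symbols instead of COMMON ones. */
THEME_REC *current_theme = NULL;
GHashTable *default_formats = NULL;

/* Hypothetical helper: the run-time behavior is unchanged, both pointers
 * still start out as NULL. */
int globals_start_null(void) {
    return current_theme == NULL && default_formats == NULL;
}

void example_fix(void) {
    assert(globals_start_null());
}
```

The program behaves the same as before. Only the symbol type in themes.c.o changes, which is what lets the archive tools list the symbols in the index.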

Appendix

The macOS libtool manpage documents the -c option as follows.

-c     Include common symbols as definitions with respect to the table of contents. This is seldom the intended behavior for linking from
       a library, as it forces the linking of a library member just because it uses an uninitialized global that is undefined at that point
       in the linking. This option is included only because this was the original behavior of ranlib. This option is not the default.