Yu-Chin Juan, the author of FFM, has open-sourced libffm, the C++ implementation, on GitHub. Since my day-to-day data processing is done in Python, I wanted a Python version of FFM. There are many related projects on GitHub, such as this one: A Python wrapper for LibFFM.

## Installation of libffm in Windows+Anaconda environment

### Installation of libffm-python package

The project is installed on Windows as follows.

• Download the project locally and unzip it.
• Install the mingw32 environment: conda install mingw32
• Add the mingw32 path to the PATH environment variable: C:\RBuildTools\3.5\mingw_32\bin
• Modify the Python compilation settings in D:\ProgramData\Anaconda3\Lib\distutils\distutils.cfg. If the file does not exist, create it with the following content:

    [build]
    compiler=mingw32

• Execute python setup.py install in the project directory.
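The distutils.cfg step above can also be scripted. The following is a minimal sketch, not part of the libffm-python project: `write_distutils_cfg` is a hypothetical helper name, and it assumes the standard-library `configparser` output format is acceptable to distutils.

```python
import configparser
from pathlib import Path


def write_distutils_cfg(cfg_path, compiler="mingw32"):
    """Create or update distutils.cfg so that [build] compiler=<compiler>."""
    cfg_path = Path(cfg_path)
    parser = configparser.ConfigParser()
    if cfg_path.exists():
        parser.read(cfg_path)  # keep any settings that are already there
    if not parser.has_section("build"):
        parser.add_section("build")
    parser.set("build", "compiler", compiler)
    cfg_path.parent.mkdir(parents=True, exist_ok=True)
    with cfg_path.open("w") as fh:
        parser.write(fh)
    return cfg_path


# Example call (path taken from the step above; adjust to your own install):
# write_distutils_cfg(r"D:\ProgramData\Anaconda3\Lib\distutils\distutils.cfg")
```

Existing sections in the file are preserved; only the `compiler` key under `[build]` is set.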

However, importing the package then fails with the following error.

    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
     in
    ----> 1 import ffm

    D:\ProgramData\Anaconda3\lib\site-packages\ffm-7e8621d-py3.6-win-amd64.egg\ffm\__init__.py in
    ----> 1 from .ffm import FFMData, FFM, read_model

    D:\ProgramData\Anaconda3\lib\site-packages\ffm-7e8621d-py3.6-win-amd64.egg\ffm\ffm.py in
         70 FFM_Problem_ptr = ctypes.POINTER(FFM_Problem)
         71
    ---> 72 _lib = ctypes.cdll.LoadLibrary(get_lib_path())
         73
         74 _lib.ffm_convert_data.restype = FFM_Problem

    D:\ProgramData\Anaconda3\lib\ctypes\__init__.py in LoadLibrary(self, name)
        424
        425     def LoadLibrary(self, name):
    --> 426         return self._dlltype(name)
        427
        428 cdll = LibraryLoader(CDLL)

    D:\ProgramData\Anaconda3\lib\ctypes\__init__.py in __init__(self, name, mode, handle, use_errno, use_last_error)
        346
        347         if handle is None:
    --> 348             self._handle = _dlopen(self._name, mode)
        349         else:
        350             self._handle = handle

    OSError: [WinError 87] The parameter is incorrect.

The root cause is that the libffm.so file was never compiled during installation on Windows, so the installation had in fact failed.
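A quick way to confirm this diagnosis is to look for a compiled library inside the installed package folder before importing it. The sketch below is an illustration only: `find_compiled_libs` and the suffix list are assumptions of mine, not something the ffm package provides.

```python
from pathlib import Path

# Extensions a compiled shared library could plausibly have (an assumption).
LIB_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def find_compiled_libs(package_dir):
    """Return the shared-library files found under package_dir (recursively)."""
    root = Path(package_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in LIB_SUFFIXES
    )


# Example call, using the egg path shown in the traceback:
# find_compiled_libs(r"D:\ProgramData\Anaconda3\lib\site-packages"
#                    r"\ffm-7e8621d-py3.6-win-amd64.egg\ffm")
```

An empty result would be consistent with the library never having been built.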

### Compilation of Libffm on Windows

Since the Python package did not work, I decided to compile the C++ code directly. According to the project README, v1.21 is the latest version of libffm that supports Windows:

    Building Windows Binaries
    =========================

    The Windows part is maintained by different maintainer, so it may not always support the latest version.

    The latest version it supports is: v1.21

    To build them via command-line tools of Visual C++, use the following steps:

    1. Open a DOS command box (or Developer Command Prompt for Visual Studio) and go to LIBFFM directory. If environment variables of VC++ have not been set, type

    "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\vcvars64.bat"

    You may have to modify the above command according which version of VC++ or where it is installed.

    2. Type

    nmake -f Makefile.win clean all

Following the procedure above, the first error I hit was that “nmake” could not be found:

    nmake : The term 'nmake' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
    At line:1 char:1
    + nmake -f Makefile.win clean all
    + ~~~~~
        + CategoryInfo          : ObjectNotFound: (nmake:String) [], CommandNotFoundException
        + FullyQualifiedErrorId : CommandNotFoundException
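Whether a build tool is reachable through PATH can be checked from Python before starting a build. This is a small sketch using the standard-library `shutil.which`; the helper name `missing_tools` and the default tool list are my own choices.

```python
import shutil


def missing_tools(tools=("nmake", "cl")):
    """Return the tools from `tools` that cannot be found via PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Example call:
# print(missing_tools())
```

An empty list means every listed tool was located on PATH.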

My first fix was to add the directory containing “nmake” to the PATH environment variable. The build still failed, however, this time because an included header file could not be found.

    PS E:\Download\libffm-121> nmake -f Makefile.win clean all

    Microsoft (R) Program Maintenance Utility Version 14.00.24210.0
    Copyright (C) Microsoft Corporation. All rights reserved.

    erase /Q *.obj *.exe windows\.
    rd windows
    mkdir windows
    cl.exe /nologo /O2 /EHsc /D "_CRT_SECURE_NO_DEPRECATE" /D "USEOMP" /D "USESSE" /openmp -c ffm.cpp
    ffm.cpp
    ffm.cpp(21): warning C4068: unknown pragma
    ffm.cpp(22): fatal error C1034: algorithm: no include path set
    NMAKE : fatal error U1077: '"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\cl.exe"' : return code '0x2'
    Stop.

After searching online, I found that setting up the VC++ environment variables is more involved than it looks: PATH, LIB and INCLUDE all have to be set. The main reason is that VS2015 introduced the Universal CRT (ucrt), which requires the Windows 10 SDK, while uuid.lib has to be taken from the Windows 8.x SDK, so the configuration is fairly troublesome.

• PATH C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin;C:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE
• LIB C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\lib;C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x86;C:\Program Files (x86)\Windows Kits\8.1\Lib\winv6.3\um\x86
• INCLUDE C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include;C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt
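Because a single mistyped directory in these variables is enough to break the build, it can help to verify that every entry exists before running nmake again. The sketch below is a hypothetical helper (`missing_dirs` and `check_build_env` are names I made up); it only checks that directories exist, not that they contain the right files.

```python
import os


def missing_dirs(value, sep=os.pathsep):
    """Split a PATH-style string and return entries that are not existing directories."""
    entries = [entry.strip() for entry in value.split(sep) if entry.strip()]
    return [entry for entry in entries if not os.path.isdir(entry)]


def check_build_env(env=os.environ, names=("PATH", "LIB", "INCLUDE")):
    """Map each variable name to its missing directories (None if the variable is unset)."""
    report = {}
    for name in names:
        value = env.get(name)
        report[name] = None if value is None else missing_dirs(value)
    return report


# Example call:
# for name, problems in check_build_env().items():
#     print(name, "unset" if problems is None else problems)
```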

Adjust the paths to match your own installation. With the variables set, running the build again succeeds, and only a few warnings appear:

    PS E:\Download\libffm-121> nmake -f Makefile.win clean all

    Microsoft (R) Program Maintenance Utility Version 14.00.24210.0
    Copyright (C) Microsoft Corporation. All rights reserved.

    erase /Q *.obj *.exe windows\.
    rd windows
    mkdir windows
    cl.exe /nologo /O2 /EHsc /D "_CRT_SECURE_NO_DEPRECATE" /D "USEOMP" /D "USESSE" /openmp -c ffm.cpp
    ffm.cpp
    ffm.cpp(21): warning C4068: unknown pragma
    cl.exe /nologo /O2 /EHsc /D "_CRT_SECURE_NO_DEPRECATE" /D "USEOMP" /D "USESSE" /openmp -c timer.cpp
    timer.cpp
    cl.exe /nologo /O2 /EHsc /D "_CRT_SECURE_NO_DEPRECATE" /D "USEOMP" /D "USESSE" /openmp ffm-train.cpp ffm.obj timer.obj -Fewindows\ffm-train.exe
    ffm-train.cpp
    ffm-train.cpp(1): warning C4068: unknown pragma
    cl.exe /nologo /O2 /EHsc /D "_CRT_SECURE_NO_DEPRECATE" /D "USEOMP" /D "USESSE" /openmp ffm-predict.cpp ffm.obj timer.obj -Fewindows\ffm-predict.exe
    ffm-predict.cpp

After compilation, a new windows folder is created under the source folder, containing two executables:

• ffm-predict.exe
• ffm-train.exe

### Use of ffm-train.exe and ffm-predict.exe

The simplest approach is to call the executables directly from the command line, as described in the project documentation.

    Command Line Usage
    ==================

    - `ffm-train'

      usage: ffm-train [options] training_set_file [model_file]

      options:
      -l : set regularization parameter (default 0.00002)
      -k : set number of latent factors (default 4)
      -t : set number of iterations (default 15)
      -r : set learning rate (default 0.2)
      -s : set number of threads (default 1)
      -p : set path to the validation set
      --quiet: quiet model (no output)
      --no-norm: disable instance-wise normalization
      --auto-stop: stop at the iteration that achieves the best validation loss (must be used with -p)

      By default we do instance-wise normalization. That is, we normalize the 2-norm of each instance to 1. You can use `--no-norm' to disable this function.

      A binary file `training_set_file.bin' will be generated to store the data in binary format.

      Because FFM usually need early stopping for better test performance, we provide an option `--auto-stop' to stop at the iteration that achieves the best validation loss. Note that you need to provide a validation set with `-p' when you use this option.

    - `ffm-predict'

      usage: ffm-predict test_file model_file output_file
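The options in this usage text map naturally onto a small command builder. The sketch below is only an illustration: `build_train_command` and its keyword names are hypothetical, and it covers just the flags listed in the usage text quoted above.

```python
def build_train_command(train_file, model_file=None, *, exe="ffm-train",
                        lambda_=None, factors=None, iterations=None, eta=None,
                        threads=None, validation_file=None, quiet=False,
                        no_norm=False, auto_stop=False):
    """Assemble an ffm-train argument list from the options in the usage text."""
    if auto_stop and validation_file is None:
        # The usage text states that --auto-stop must be used with -p.
        raise ValueError("--auto-stop must be used together with -p")
    cmd = [exe]
    for flag, value in (("-l", lambda_), ("-k", factors), ("-t", iterations),
                        ("-r", eta), ("-s", threads), ("-p", validation_file)):
        if value is not None:  # omitted options fall back to the tool's defaults
            cmd += [flag, str(value)]
    if quiet:
        cmd.append("--quiet")
    if no_norm:
        cmd.append("--no-norm")
    if auto_stop:
        cmd.append("--auto-stop")
    cmd.append(train_file)
    if model_file is not None:
        cmd.append(model_file)
    return cmd
```

Returning a list rather than a single string keeps file names with spaces intact when the list is later passed to `subprocess`.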

Alternatively, the executables can be invoked from Python through the command line:

    import os
    import subprocess

    # Switch to the folder that contains the compiled executables
    os.getcwd()
    os.chdir(r'E:\Download\libffm-121\windows')
    os.getcwd()

    # Launch the executables directly (each opens in its own window)
    os.system("start ffm-train.exe")
    os.startfile("ffm-train.exe")
    os.system("start ffm-predict.exe")
    os.startfile("ffm-predict.exe")

    # Train a model with the default parameters
    cmd = 'ffm-train bigdata.tr.txt model'
    subprocess.call(cmd, shell=True)

    # Use bigdata.te.txt as the validation data
    cmd = 'ffm-train -p bigdata.te.txt bigdata.tr.txt model'
    subprocess.call(cmd, shell=True)

    # 5-fold cross-validation
    # (note: -v is not listed in the v1.21 usage text quoted above)
    cmd = 'ffm-train -v 5 bigdata.tr.txt'
    subprocess.call(cmd, shell=True)

    # Train with --quiet so that no training information is printed
    cmd = 'ffm-train --quiet bigdata.tr.txt'
    subprocess.call(cmd, shell=True)

    # Predict
    cmd = 'ffm-predict bigdata.te.txt model output.txt'
    subprocess.call(cmd, shell=True)

    # Disk-based training
    # (note: --no-rand and --on-disk are not listed in the v1.21 usage text quoted above)
    cmd = 'ffm-train --no-rand --on-disk bigdata.tr.txt'
    subprocess.call(cmd, shell=True)

    # Use --auto-stop to stop training once the best validation loss is reached
    cmd = 'ffm-train --auto-stop -p bigdata.te.txt -t 100 bigdata.tr.txt'
    subprocess.call(cmd, shell=True)
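`subprocess.call` with `shell=True` silently ignores failures unless the return code is inspected. A hedged alternative is to pass the arguments as a list to `subprocess.run` with `check=True`, so that a non-zero exit code raises an exception. `run_tool` is a hypothetical helper name, and `capture_output` needs Python 3.7 or later.

```python
import subprocess


def run_tool(args, cwd=None):
    """Run a command given as an argument list and return its captured stdout.

    Raises subprocess.CalledProcessError if the exit code is non-zero.
    """
    result = subprocess.run(args, cwd=cwd, capture_output=True, text=True, check=True)
    return result.stdout


# Example call (folder taken from the script above; adjust as needed):
# folder = r"E:\Download\libffm-121\windows"
# print(run_tool([folder + r"\ffm-train.exe", "bigdata.tr.txt", "model"], cwd=folder))
```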

The training files used in the sample code are available at:

https://github.com/keyunluo/python-ffm/tree/master/example/libffm-format
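For orientation, the sketch below parses one line of libffm-format text. It assumes the usual layout of a label followed by space-separated `field:feature:value` triples; `parse_libffm_line` is a hypothetical helper and not part of libffm or python-ffm.

```python
def parse_libffm_line(line):
    """Parse 'label field:feature:value ...' into (label, [(field, feature, value), ...])."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty line")
    label = float(tokens[0])
    features = []
    for token in tokens[1:]:
        field, feature, value = token.split(":")
        features.append((int(field), int(feature), float(value)))
    return label, features
```

A malformed triple raises `ValueError`, which makes it easy to spot broken lines before handing a file to ffm-train.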

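Since calling the executables this way is cumbersome, I found a separate open source project that wraps them in a scikit-learn style estimator. Note that the options in its docstring (for example --norm and --no-rand) differ from the v1.21 usage text, so it appears to target a different libffm version. The wrapper code is: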

    from __future__ import print_function, absolute_import

    import math
    import os, sys, subprocess, shlex, tempfile, time

    import numpy as np
    import pandas as pd
    import sklearn.base

    # Helper modules that belong to the wrapper project (not to libffm itself)
    from pandas_extensions import *
    from ExeEstimator import *


    class LibFFMClassifier(ExeEstimator, sklearn.base.ClassifierMixin):
        '''
        options:
        -l : set regularization parameter (default 0)
        -k : set number of latent factors (default 4)
        -t : set number of iterations (default 15)
        -r : set learning rate (default 0.1)
        -s : set number of threads (default 1)
        -p : set path to the validation set
        --quiet: quiet model (no output)
        --norm: do instance-wise normalization
        --no-rand: disable random update

        `--norm' helps you to do instance-wise normalization. When it is enabled,
        you can simply assign `1' to `value' in the data.
        '''

        def __init__(self, columns, lambda_v=0, factor=4, iteration=15, eta=0.1,
                     nr_threads=1, quiet=False, normalize=None, no_rand=None):
            ExeEstimator.__init__(self)

            self.columns = columns.tolist() if hasattr(columns, 'tolist') else columns
            self.lambda_v = lambda_v
            self.factor = factor
            self.iteration = iteration
            self.eta = eta
            self.nr_threads = nr_threads
            self.quiet = quiet
            self.normalize = normalize
            self.no_rand = no_rand

        def fit(self, X, y=None):
            # X is either a path to an existing libffm file or in-memory data
            if type(X) is str:
                train_file = X
            else:
                if not hasattr(X, 'values'):
                    X = pd.DataFrame(X, columns=self.columns)
                train_file = self.save_reusable('_libffm_train', 'to_libffm', X, y)

            # self._model_file = self.save_tmp_file(X, '_libffm_model', True)
            self._model_file = self.tmpfile('_libffm_model')

            # Build the ffm-train command line from the estimator's parameters
            command = 'utils/lib/ffm-train.exe' + ' -l ' + repr(self.lambda_v) + \
                ' -k ' + repr(self.factor) + ' -t ' + repr(self.iteration) + \
                ' -r ' + repr(self.eta) + ' -s ' + repr(self.nr_threads)
            if self.quiet:
                command += ' --quiet'
            if self.normalize:
                command += ' --norm'
            if self.no_rand:
                command += ' --no-rand'
            command += ' ' + train_file
            command += ' ' + self._model_file

            running_process = self.make_subprocess(command)
            self.close_process(running_process)
            return self

        def predict(self, X):
            if type(X) is str:
                test_file = X
            else:
                if not hasattr(X, 'values'):
                    X = pd.DataFrame(X, columns=self.columns)
                test_file = self.save_reusable('_libffm_test', 'to_libffm', X)

            output_file = self.tmpfile('_libffm_predictions')
            command = 'utils/lib/ffm-predict.exe ' + test_file + ' ' + \
                self._model_file + ' ' + output_file
            running_process = self.make_subprocess(command)
            self.close_process(running_process)
            preds = list(self.read_predictions(output_file))
            return preds

        def predict_proba(self, X):
            # A list comprehension instead of map(), so this also works on Python 3
            predictions = np.asarray([1 / (1 + math.exp(-p)) for p in self.predict(X)])
            return np.vstack([1 - predictions, predictions]).T
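The wrapper relies on a `read_predictions` method inherited from ExeEstimator, which is not shown here. As a rough stand-in, and assuming the prediction file holds one number per line, such a reader could look like the sketch below; it is not the wrapper project's actual implementation.

```python
def read_predictions(path):
    """Yield one float per non-empty line of a prediction file."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield float(line)


# Example call:
# preds = list(read_predictions("output.txt"))
```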

In summary, libffm is quite difficult to use on Windows, both to compile and to call. If your setup permits, use it in a Linux environment instead.

## Installation of libffm in Linux+Anaconda environment

Installing the libffm-python package under Anaconda on Linux also runs into problems. The error reported is as follows.

    ➜ libffm-python git:(master) python setup.py install
    /home/qw/anaconda3/lib/python3.7/site-packages/setuptools/dist.py:481: UserWarning: The version specified ('7e8621d') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
      "details." % self.metadata.version
    running install
    running bdist_egg
    running egg_info
    creating ffm.egg-info
    writing ffm.egg-info/PKG-INFO
    writing dependency_links to ffm.egg-info/dependency_links.txt
    writing requirements to ffm.egg-info/requires.txt
    writing top-level names to ffm.egg-info/top_level.txt
    writing manifest file 'ffm.egg-info/SOURCES.txt'
    reading manifest file 'ffm.egg-info/SOURCES.txt'
    writing manifest file 'ffm.egg-info/SOURCES.txt'
    installing library code to build/bdist.linux-x86_64/egg
    running install_lib
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.7
    creating build/lib.linux-x86_64-3.7/ffm
    copying ffm/__init__.py -> build/lib.linux-x86_64-3.7/ffm
    copying ffm/ffm.py -> build/lib.linux-x86_64-3.7/ffm
    running build_ext
    building 'ffm.libffm' extension
    creating build/temp.linux-x86_64-3.7
    gcc -pthread -B /home/qw/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/qw/anaconda3/include/python3.7m -c ffm.cpp -o build/temp.linux-x86_64-3.7/ffm.o -Wall -O3 -std=c++0x -march=native -DUSESSE -DUSEOMP
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    ffm.cpp:578: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
      578 | #pragma omp parallel for schedule(static) reduction(+: loss)
          |
    ffm.cpp:726: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
      726 | #pragma omp parallel for schedule(static) reduction(+: loss)
          |
    gcc -pthread -B /home/qw/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I. -I/home/qw/anaconda3/include/python3.7m -c timer.cpp -o build/temp.linux-x86_64-3.7/timer.o -Wall -O3 -std=c++0x -march=native -DUSESSE -DUSEOMP
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    g++ -pthread -shared -B /home/qw/anaconda3/compiler_compat -L/home/qw/anaconda3/lib -Wl,-rpath=/home/qw/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/ffm.o build/temp.linux-x86_64-3.7/timer.o -o build/lib.linux-x86_64-3.7/ffm/libffm.cpython-37m-x86_64-linux-gnu.so -fopenmp
    /home/qw/anaconda3/compiler_compat/ld: build/temp.linux-x86_64-3.7/ffm.o: unable to initialize decompress status for section .debug_info
    /home/qw/anaconda3/compiler_compat/ld: build/temp.linux-x86_64-3.7/ffm.o: unable to initialize decompress status for section .debug_info
    /home/qw/anaconda3/compiler_compat/ld: build/temp.linux-x86_64-3.7/ffm.o: unable to initialize decompress status for section .debug_info
    /home/qw/anaconda3/compiler_compat/ld: build/temp.linux-x86_64-3.7/ffm.o: unable to initialize decompress status for section .debug_info
    build/temp.linux-x86_64-3.7/ffm.o: file not recognized: file format not recognized
    collect2: error: ld returned 1 exit status
    error: command 'g++' failed with exit status 1

At first I thought the libffm code was at fault, so I replaced it with the latest version online, but the error persisted. After checking again, I found that the code itself was fine and compiled normally outside Anaconda. The cause is that Anaconda ships its own linker, ld, in the ~/anaconda3/compiler_compat directory. The fix is simple: rename the ld file in ~/anaconda3/compiler_compat and run the installation again.
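The rename can be done by hand or with a few lines of Python. The sketch below is one possible way to do it: `disable_bundled_ld` and the `ld.bak` backup name are my own choices, and the function refuses to overwrite an existing backup.

```python
from pathlib import Path


def disable_bundled_ld(compiler_compat_dir, backup_name="ld.bak"):
    """Rename <dir>/ld to <dir>/<backup_name>.

    Returns the new path, or None if there is no ld in the directory.
    """
    directory = Path(compiler_compat_dir).expanduser()
    ld = directory / "ld"
    # ld may be a regular file or a symlink, so check for both
    if not (ld.is_symlink() or ld.exists()):
        return None
    backup = directory / backup_name
    if backup.is_symlink() or backup.exists():
        raise FileExistsError(f"{backup} already exists; refusing to overwrite it")
    ld.rename(backup)
    return backup


# Example call (directory taken from the text above):
# disable_bundled_ld("~/anaconda3/compiler_compat")
```

Keeping the original file under a backup name makes it easy to restore the bundled linker later by renaming it back.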