AIML Introduction

AIML (Artificial Intelligence Markup Language) is an XML dialect for creating natural language software agents. It was developed by Dr. Richard S. Wallace and the Alicebot open source community between 1995 and 2000. An AIML file defines rules that match patterns in the user's input and determine the responses.

The design goals of AIML are as follows.

  • AIML should be easy for people to learn.
  • AIML should encode the minimal set of concepts needed to support a stimulus-response knowledge system modeled on the original A.L.I.C.E.
  • AIML should be compatible with XML.
  • It should be easy to write programs that process AIML documents.
  • AIML objects should be human-readable and reasonably clear.
  • The design of AIML should be formal and concise.
  • AIML should not depend on any other language.

For a detailed primer on AIML, you can turn to Alice Bot’s AIML Primer. You can also learn more about AIML and what it can do at the AIML Wikipedia page. With Python’s AIML package, it is easy to implement artificial intelligence chatbots.

Building a chatbot with AIML

Install the Python aiml library

  • Python 2: pip install aiml
  • Python 3: pip install python-aiml

Getting the alice resources

After the aiml package is installed, there is an alice subdirectory under Lib/site-packages/aiml in the Python installation directory. It contains a simple corpus that ships with the package.

Loading alice under Python

Once you have obtained the alice resource, you can load the alice brain directly using the Python aiml library.

# -*- coding: utf-8 -*-
import os
import sys

import aiml


def get_module_dir(name):
    """Return the directory that contains the already-imported module `name`."""
    path = getattr(sys.modules[name], '__file__', None)
    if not path:
        raise AttributeError('module %s has no attribute __file__' % name)
    return os.path.dirname(os.path.abspath(path))


alice_path = os.path.join(get_module_dir('aiml'), 'alice')
# Change the working directory to where the corpus lives
os.chdir(alice_path)

alice = aiml.Kernel()
alice.learn("startup.xml")
alice.respond('LOAD ALICE')

while True:
    # Python 3; on Python 2 use raw_input() instead of input()
    print(alice.respond(input("Enter your message >> ")))

The process above is very simple. Next, we will build our own bot from scratch.

Creating a standard startup file

It is standard practice to create a startup file called std-startup.xml as the main entry point for loading AIML files. In this example, we create a basic file that matches a single pattern and takes a single action: when the input matches LOAD AIML B, the bot loads our AIML brain in response. We will create the basic_chat.aiml file that it loads in the next step.

<aiml version="1.0.1" encoding="UTF-8">
    <!-- std-startup.xml -->

    <!-- A category is an atomic AIML unit -->
    <category>

        <!-- The pattern is matched against the user's input -->
        <!-- If the user enters "LOAD AIML B" -->
        <pattern>LOAD AIML B</pattern>

        <!-- The template is the response to the pattern -->
        <!-- Here it learns an AIML file -->
        <template>
            <learn>basic_chat.aiml</learn>
            <!-- You can add more AIML files here -->
            <!--<learn>more_aiml.aiml</learn>-->
        </template>

    </category>

</aiml>

Creating an AIML file

The startup file above handles only one pattern, LOAD AIML B, and responds by trying to learn basic_chat.aiml, so that file has to exist before the bot can load it. The following basic_chat.aiml matches two simple patterns and returns a fixed response for each.

<aiml version="1.0.1" encoding="UTF-8">
<!-- basic_chat.aiml -->

    <category>
        <pattern>HELLO</pattern>
        <template>
            Well, hello!
        </template>
    </category>

    <category>
        <pattern>WHAT ARE YOU</pattern>
        <template>
            I'm a bot, silly!
        </template>
    </category>

</aiml>
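
To make the category/pattern/template structure concrete, here is a minimal sketch, using only the standard library, of the lookup an interpreter performs for exact patterns without wildcards. It is not how the aiml package is implemented; the names load_categories, normalize and respond, and the simplified normalization (uppercase, punctuation dropped), are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# The same two categories as basic_chat.aiml, inlined so the sketch is self-contained.
BASIC_CHAT = """<aiml version="1.0.1" encoding="UTF-8">
    <category>
        <pattern>HELLO</pattern>
        <template>Well, hello!</template>
    </category>
    <category>
        <pattern>WHAT ARE YOU</pattern>
        <template>I'm a bot, silly!</template>
    </category>
</aiml>"""


def load_categories(aiml_text):
    """Map the text of each <pattern> to the text of its <template>."""
    root = ET.fromstring(aiml_text)
    categories = {}
    for category in root.iter("category"):
        pattern = category.findtext("pattern").strip()
        template = category.findtext("template").strip()
        categories[pattern] = template
    return categories


def normalize(message):
    """Uppercase the input and drop punctuation (a simplified normalization)."""
    return " ".join(re.findall(r"[A-Za-z0-9]+", message.upper()))


def respond(categories, message):
    """Return the template for an exact pattern match, or '' if nothing matches."""
    return categories.get(normalize(message), "")


categories = load_categories(BASIC_CHAT)
print(respond(categories, "hello"))          # Well, hello!
print(respond(categories, "What are you?"))  # I'm a bot, silly!
```

The normalization step is why the user can type "hello" in lowercase even though the pattern is written as HELLO.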

Random Response

You can also add random responses like the one below. The bot picks one of the listed replies at random when it receives a message that starts with “One time I”. The * is a wildcard that matches one or more words.

<category>
    <pattern>ONE TIME I *</pattern>
    <template>
        <random>
            <li>Go on.</li>
            <li>How old are you?</li>
            <li>Be more specific.</li>
            <li>I did not know that.</li>
            <li>Are you telling the truth?</li>
            <li>I don't know what that means.</li>
            <li>Try to tell me that another way.</li>
            <li>Are you talking about an animal, vegetable or mineral?</li>
            <li>What is it?</li>
        </random>
    </template>
</category>
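
As a rough illustration of what the category above does, the following standard-library sketch turns the pattern into a regular expression in which * stands for one or more words, and then picks a reply with random.choice. The helper names pattern_to_regex and random_reply are hypothetical, only three of the replies are reproduced, and a real interpreter's matching is more elaborate.

```python
import random
import re

REPLIES = [
    "Go on.",
    "How old are you?",
    "Be more specific.",
]


def pattern_to_regex(pattern):
    """Translate an AIML pattern into a regex; '*' stands for one or more words."""
    parts = []
    for word in pattern.split():
        parts.append(r"(.+)" if word == "*" else re.escape(word))
    return re.compile(r"^" + r"\s+".join(parts) + r"$")


def random_reply(pattern, message, replies, rng=random):
    """Return a random reply if the message matches the pattern, else None."""
    if pattern_to_regex(pattern).match(message.upper().strip()):
        return rng.choice(replies)
    return None


print(random_reply("ONE TIME I *", "One time I saw a comet", REPLIES))
print(random_reply("ONE TIME I *", "Hello", REPLIES))  # None: no match
```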

Use an existing AIML file

Writing your own AIML files is a lot of fun, but it takes a lot of work. By my estimate, it takes about 10,000 patterns before a bot starts to feel realistic. Fortunately, the ALICE Foundation offers a large number of free AIML files; you can browse them on the Alice Bot website.

Testing the newly created robot

At this point all the AIML files are ready. They are an important part of the robot’s brain, but for now they are just data; the robot still has to be brought to life. Any language with an AIML interpreter would work, but here we use Python.

# -*- coding: utf-8 -*-
import aiml
import os


mybot_path = './mybot'
# Change the working directory to where the corpus lives
os.chdir(mybot_path)

mybot = aiml.Kernel()
mybot.learn("std-startup.xml")
mybot.respond('load aiml b')

while True:
    # Python 3; on Python 2 use raw_input() instead of input()
    print(mybot.respond(input("Enter your message >> ")))

This is the simplest program we can start with. It changes into the ./mybot directory, which is assumed to hold the XML files created above, creates an aiml Kernel object, learns the startup file, and then sends the load aiml b command so that the remaining AIML files are loaded. The bot is then ready to chat, and the program enters an infinite loop that prompts the user for messages. You need to enter input that matches a pattern the bot recognizes, which depends on the AIML files you have loaded. We keep the startup file separate so that more AIML files can be added to the bot later without modifying the program source: just add more learn entries to the startup XML file.

Accelerated Brain Loading

Once you have a lot of AIML files, learning them takes a long time. That is where brain files come in. After the robot has learned all the AIML files, it can save its brain directly to a file, which dramatically speeds up loading on subsequent runs.

# -*- coding: utf-8 -*-
import aiml
import os


mybot_path = './mybot'
# Change the working directory to where the corpus lives
os.chdir(mybot_path)

mybot = aiml.Kernel()

if os.path.isfile("mybot_brain.brn"):
    mybot.bootstrap(brainFile="mybot_brain.brn")
else:
    mybot.bootstrap(learnFiles="std-startup.xml", commands="load aiml b")
    mybot.saveBrain("mybot_brain.brn")

while True:
    # Python 3; on Python 2 use raw_input() instead of input()
    print(mybot.respond(input("Enter your message >> ")))

Remember that with the brain approach as written above, anything the bot learns at runtime after the brain was saved is not written back to the brain file. You will either need to delete the brain file so that it is rebuilt on the next start, or modify the code so that it saves the brain at some point after reloading the AIML files.
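
One way to avoid deleting the brain file by hand is to rebuild it whenever an AIML source file is newer than the saved brain. The sketch below is only one possible approach: brain_is_stale and load_bot are hypothetical helpers, the "*.aiml" glob is an assumption about how the files are named, and kernel is expected to be an aiml.Kernel, of which only the bootstrap and saveBrain calls already used above are needed.

```python
import glob
import os


def brain_is_stale(brain_file, aiml_files):
    """True if the brain file is missing or older than any AIML source file."""
    if not os.path.isfile(brain_file):
        return True
    brain_mtime = os.path.getmtime(brain_file)
    return any(os.path.getmtime(path) > brain_mtime for path in aiml_files)


def load_bot(kernel, brain_file="mybot_brain.brn", startup="std-startup.xml",
             command="load aiml b", pattern="*.aiml"):
    """Load from the brain file when it is fresh; otherwise relearn and resave."""
    sources = glob.glob(pattern) + [startup]
    if brain_is_stale(brain_file, sources):
        # Sources changed (or no brain yet): learn from AIML and save a new brain.
        kernel.bootstrap(learnFiles=startup, commands=command)
        kernel.saveBrain(brain_file)
    else:
        # Brain is at least as new as every source file: take the fast path.
        kernel.bootstrap(brainFile=brain_file)
    return kernel
```

With this in place, the if/else block in the listing above could be replaced by a single call such as load_bot(mybot).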

Add Python commands

If you want to give your bot special commands that run Python functions, capture the input message and process it before sending it to mybot.respond(). In the examples above the input comes from the console prompt (raw_input in Python 2, input in Python 3), but it could come from anywhere, such as a TCP socket or a speech recognition engine. Because the message is processed before it reaches AIML, you can also skip AIML processing entirely for specific messages.

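while True:
    # Python 3; on Python 2 use raw_input() instead of input()
    message = input("Enter your message >> ")
    if message == "quit":
        # Leave the loop and end the program
        break
    elif message == "save":
        # Write the current brain to disk without involving AIML
        mybot.saveBrain("mybot_brain.brn")
    else:
        bot_response = mybot.respond(message)
        # Do something with bot_response
        print(bot_response)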
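
The loop above can also be factored so that special commands live in a table instead of an if/elif chain. This is a minimal sketch under stated assumptions: handle_message and make_commands are hypothetical names, and respond is any callable that behaves like mybot.respond (a stand-in is used here so the example runs without the aiml package).

```python
def handle_message(message, respond, commands):
    """Run a Python command if the message names one; otherwise ask the bot.

    Returns the text to show the user, or None when the command means "quit".
    """
    action = commands.get(message.strip().lower())
    if action is not None:
        return action()
    return respond(message)


def make_commands(save_brain):
    """Build the command table.

    save_brain is a callable such as lambda: mybot.saveBrain("mybot_brain.brn").
    """
    def save():
        save_brain()
        return "Brain saved."
    return {"save": save, "quit": lambda: None}


# Demo with stand-ins for mybot.respond and mybot.saveBrain (assumptions).
saved = []
commands = make_commands(lambda: saved.append("mybot_brain.brn"))
echo = lambda text: "bot heard: " + text

print(handle_message("hello", echo, commands))  # bot heard: hello
print(handle_message("save", echo, commands))   # Brain saved.
print(handle_message("quit", echo, commands))   # None
```

Adding a new command then only requires a new entry in the dictionary, and the AIML interpreter never sees messages that were handled as commands.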

Sessions and Predicates

By specifying a session, the bot can keep separate conversations with different people. For example, if one person tells the bot that their name is Alice and another tells it that their name is Bob, the bot can tell the two apart. To specify the session you are using, pass it as the second argument to respond():

sessionId = 12345
# Python 3; on Python 2 use raw_input() instead of input()
print(mybot.respond(input(">>> "), sessionId))

This is helpful for customizing personalized conversations for each client. You will have to generate your own session ID in some form and keep track of it. Note that saving the brain file will not save all the session values.
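
One simple way to generate and keep track of session IDs is to map each client identifier to a random UUID. The SessionRegistry class and the client keys below are hypothetical; the resulting ID would be passed as the second argument to respond().

```python
import uuid


class SessionRegistry:
    """Hand out one stable session ID per client identifier."""

    def __init__(self):
        self._ids = {}

    def session_for(self, client_key):
        """Return the client's session ID, creating it on first use."""
        if client_key not in self._ids:
            self._ids[client_key] = uuid.uuid4().hex
        return self._ids[client_key]


registry = SessionRegistry()
alice_id = registry.session_for("alice@example.com")
bob_id = registry.session_for("bob@example.com")

# Each client keeps the same ID across messages, and IDs differ between clients.
print(alice_id == registry.session_for("alice@example.com"))  # True
print(alice_id != bob_id)                                     # True
# mybot.respond(message, alice_id) would then use Alice's session.
```

The registry lives in memory only, so it would need to be persisted separately if sessions must survive a restart.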

sessionId = 12345

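# Session data is returned as a dictionary. It contains the input and
# output history as well as any known predicates
sessionData = mybot.getSessionData(sessionId)

# Each session ID needs to be a unique value.
# A predicate name is the name of something the bot has learned about
# you during the session. The bot might know that you are "Billy" and
# that your dog's name is "Brandy"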
mybot.setPredicate("dog", "Brandy", sessionId)
clients_dogs_name = mybot.getPredicate("dog", sessionId)

mybot.setBotPredicate("hometown", "127.0.0.1")
bot_hometown = mybot.getBotPredicate("hometown")

In AIML, a predicate can be set with the set tag inside a template and read back with the get tag:

<aiml version="1.0.1" encoding="UTF-8">
   <category>
      <pattern>MY DOGS NAME IS *</pattern>
      <template>
         That is interesting that you have a dog named <set name="dog"><star/></set>
      </template>  
   </category>  
   <category>
      <pattern>WHAT IS MY DOGS NAME</pattern>
      <template>
         Your dog's name is <get name="dog"/>.
      </template>  
   </category>  
</aiml>

Using the AIML above, you can tell the robot:

My dogs name is Max

The robot will answer:

That is interesting that you have a dog named Max

Then, if you ask the robot:

What is my dogs name?

The robot will answer:

Your dog’s name is Max.
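
To show how the two categories cooperate, the following standard-library sketch mimics them with a per-session dictionary: the first branch stores the wildcard text the way the set tag does, and the second reads it back the way the get tag does. The names predicates and dog_bot_respond are hypothetical, and the regular expressions are a simplified stand-in for AIML pattern matching.

```python
import re
from collections import defaultdict

# One predicate dictionary per session ID (a simplification of per-session storage).
predicates = defaultdict(dict)


def dog_bot_respond(message, session_id):
    """Answer the two dog-name patterns for a single session."""
    text = message.strip().rstrip("?.!")
    named = re.match(r"^MY DOGS NAME IS (.+)$", text, re.IGNORECASE)
    if named:
        # Counterpart of <set name="dog"><star/></set>: store the wildcard text.
        predicates[session_id]["dog"] = named.group(1)
        return "That is interesting that you have a dog named " + named.group(1)
    if re.match(r"^WHAT IS MY DOGS NAME$", text, re.IGNORECASE):
        # Counterpart of <get name="dog"/>: read the stored value back.
        return "Your dog's name is " + predicates[session_id].get("dog", "") + "."
    return ""


print(dog_bot_respond("My dogs name is Max", 12345))
print(dog_bot_respond("What is my dogs name?", 12345))
print(dog_bot_respond("What is my dogs name?", 67890))  # other session: no name stored
```

Because the value is stored under the session ID, a second client asking the same question does not see the first client's dog name.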

AIML can be used to implement conversational bots, but Chinese input raises the following problems.

  • The Chinese rule base is small. In general, the richer the rule base, the more human-like the robot's responses. The English rule base is very rich, covers a wide range of topics, and is publicly available, whereas publicly available Chinese rule bases are practically nonexistent.
  • AIML interpreters do not handle Chinese well. The PyAIML module (parser) for Python already supports Chinese relatively well, but one problem remains. English words are separated by spaces or punctuation, so English text has a kind of “natural word segmentation”. Chinese input is not separated by spaces, which causes some inconvenience in practice. For example, to match input both with and without spaces, the rule base has to include both forms of each pattern.

Solutions.

  • Build your own corpus (e.g. extract training data from subtitle files).
  • Use a Chinese word segmentation tool (e.g. jieba).
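
As a crude stand-in for a real segmenter such as jieba, the sketch below puts a space around every Chinese character so that input typed with or without spaces normalizes to the same form before matching. The function name space_cjk and the character range (the CJK Unified Ideographs block only) are assumptions, and the patterns in the rule base would need the same normalization for this to work.

```python
import re

# CJK Unified Ideographs block only; extend the range for other scripts.
_CJK = re.compile(r"([\u4e00-\u9fff])")


def space_cjk(text):
    """Separate each CJK character with spaces and collapse repeated whitespace."""
    spaced = _CJK.sub(r" \1 ", text)
    return " ".join(spaced.split())


print(space_cjk("你好吗"))      # 你 好 吗
print(space_cjk("你好 吗"))     # 你 好 吗
print(space_cjk("hello 世界"))  # hello 世 界
```

Splitting by character avoids keeping two forms of every pattern, at the cost of losing word boundaries; a word-level segmenter would keep multi-character words together.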