Advanced use of argparse in Python

argparse is the standard-library module for parsing command-line arguments. Basic usage looks roughly like this.

import argparse

parser = argparse.ArgumentParser(description='Greet somebody')
parser.add_argument(
    '-n', '--name', default='John Doe', help='name of the person to greet')

args = parser.parse_args()
print(f'Hello, {args.name}!')

This is unremarkable, and the same functionality can be written more concisely with click.

import click

@click.command()
@click.option("-n", "--name", default="John Doe", help="name of the person to greet")
def cli(name):
    print(f'Hello, {name}!')

cli()

The difference between the two is that argparse only parses the arguments into a result object and leaves the processing to you, while click passes the argument values directly to the decorated function. The latter approach is more conducive to decoupled code and is easier to maintain.

I also chose click at first when I was working on PDM. PDM’s command line consists of a series of subcommands, and click’s nested command groups (click.Group) supported that well. However, as the project grew and I tried to add more complex functionality, I ran into the shortcomings of click and eventually switched to argparse. So far, the capabilities provided by argparse have done the job very well.

Inheritance and extensions

Suppose we have written a command line interface like the following with click.

# bot.py
import click

@click.group()
def cli():
    pass

@cli.command()
@click.option("-n", "--name", default="John Doe", help="name of the person to greet")
def greet(name):
    print(f'Hello, {name}!')

@cli.command()
@click.option("-n", "--name", default="John Doe", help="name of the person to say goodbye")
def goodbye(name):
    print(f'Goodbye, {name}!')

This command line contains two subcommands, greet and goodbye. Now suppose I have released this bot library and want users to be able to add new commands to it with click. That is easy.

# test.py
from bot import cli

@cli.command()
def test():
    print('test')

cli()

This command line now has a new test subcommand, which is how the Flask CLI can be extended. But I would like to go further and let users add options to existing commands, for example a --verbose option on the original greet command that prints a verbose hello when set and a concise hello otherwise. How is this done? It requires adding a parameter to the original greet function and changing the function’s behavior to read that parameter. From the API documentation I learned that this function is stored in the callback attribute of the generated Command object, so the only way is to write a new function that replaces it. And if I do not want to copy the original function’s body but merely inherit and extend it, I have to keep a reference to the original function and call it from the new one.

To me, this whole process amounts to monkey patching, which should not be necessary in a language that supports OOP, so I started looking for alternatives. In the end I settled on argparse, and here is how I used it to implement the command-line interface of PDM.

Advanced argparse

Subcommands of argparse

argparse also supports subcommands, and subcommands can have their own subcommands.

parser = argparse.ArgumentParser()

subparsers = parser.add_subparsers()
greet_parser = subparsers.add_parser('greet')
greet_parser.add_argument('-n', '--name', default='John Doe', help='name of the person to greet')
goodbye_parser = subparsers.add_parser('goodbye')
goodbye_parser.add_argument('-n', '--name', default='John Doe', help='name of the person to say goodbye')
args = parser.parse_args()
...

This looks much more laborious than click, and it still only yields the parsed result without processing it. But this drawback is also what makes argparse more flexible: we get to control how the matching handler is found. Inheritance and extension are exactly what OOP is about, so can I turn this spaghetti-style code into OOP?
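
To make the "only gets the parsed result" point concrete, here is a minimal sketch of what hand-written dispatch tends to look like. The dest="command" setting, the build_parser and main functions, and the prog name are my own additions for illustration; the snippet above does not include them.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="bot")
    # dest="command" stores the matched subcommand name on the result
    # (an illustrative choice, not something the snippet above does).
    subparsers = parser.add_subparsers(dest="command")
    greet_parser = subparsers.add_parser("greet")
    greet_parser.add_argument("-n", "--name", default="John Doe")
    goodbye_parser = subparsers.add_parser("goodbye")
    goodbye_parser.add_argument("-n", "--name", default="John Doe")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    # Every new subcommand needs another branch here.
    if args.command == "greet":
        return f"Hello, {args.name}!"
    if args.command == "goodbye":
        return f"Goodbye, {args.name}!"
    return "no subcommand given"


print(main(["greet", "-n", "Ann"]))  # prints: Hello, Ann!
```

The if chain is the part that does not scale: a third party cannot add a branch without editing main.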

OOPification of argparse

The principle is to put each subcommand into its own class. First, let me mark which part of the code above belongs to which command.

# Root command
parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
# greet subcommand
greet_parser = subparsers.add_parser('greet')
greet_parser.add_argument('-n', '--name', default='John Doe', help='name of the person to greet')
# goodbye subcommand
goodbye_parser = subparsers.add_parser('goodbye')
goodbye_parser.add_argument('-n', '--name', default='John Doe', help='name of the person to say goodbye')
# Root command
args = parser.parse_args()

You can see that the two subcommand sections in the middle follow the same pattern and perform only one kind of operation, add_argument. So I move that into a method implemented by each subcommand class and apply a little inversion of control (IoC) to get the following code.

class Command:
    """Base class"""
    def add_arguments(self, parser):
        pass  # optional to implement; by default the command takes no arguments

class GreetCommand(Command):
    """Implementation of the greet command"""
    def add_arguments(self, parser):
        parser.add_argument('-n', '--name', default='John Doe', help='name of the person to greet')

The commands are then mounted on the root parser as follows.

parser = argparse.ArgumentParser()
subparsers = parser.add_subparsers()
for name, command in subcommands.items():  # subcommands: Dict[str, Type[Command]]
    cmd_instance = command()
    subparser = subparsers.add_parser(name)
    # subparser is a parser object just like parser
    cmd_instance.add_arguments(subparser)

Here I instantiate the command class rather than calling a classmethod on it directly, so that information related to the root parser can be passed in at instantiation time. With this, command parsing is decoupled: the arguments of each subcommand are added in the add_arguments method of its own class.
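
The snippets above leave subcommands undefined, so here is a minimal runnable sketch that fills in the gap. The plain-dict registry, the GoodbyeCommand class, and the build_parser helper are assumptions made for this illustration; PDM's real registration code may look different.

```python
import argparse
from typing import Dict, Type


class Command:
    """Base class; by default a command takes no arguments."""

    def add_arguments(self, parser: argparse.ArgumentParser) -> None:
        pass


class GreetCommand(Command):
    def add_arguments(self, parser):
        parser.add_argument("-n", "--name", default="John Doe",
                            help="name of the person to greet")


class GoodbyeCommand(Command):
    def add_arguments(self, parser):
        parser.add_argument("-n", "--name", default="John Doe",
                            help="name of the person to say goodbye")


# Hypothetical registry: command name -> command class.
subcommands: Dict[str, Type[Command]] = {
    "greet": GreetCommand,
    "goodbye": GoodbyeCommand,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="bot")
    subparsers = parser.add_subparsers()
    for name, command in subcommands.items():
        cmd_instance = command()
        subparser = subparsers.add_parser(name)
        # Each command populates its own subparser.
        cmd_instance.add_arguments(subparser)
    return parser


args = build_parser().parse_args(["goodbye", "-n", "Ann"])
print(args.name)  # prints: Ann
```

The root parser no longer knows anything about individual options; it only iterates over the registry.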

Handling method routing

So far we have only implemented adding arguments to subcommands; we still need to pick a different handler for each subcommand. We do not know how to route to it yet, so for now let’s just put the handler method on the Command class.

class Command:
    """Base class"""
    ...
    def handle(self, args):
        pass  # optional to implement; by default the command does nothing

class GreetCommand(Command):
    ...
    def handle(self, args):
        print(f'Hello {args.name}!')

How do we route to a subcommand’s handler once it has been parsed? That requires understanding how argparse parses. argparse takes sys.argv and walks through it in order: when it finds an option, it stores that option’s value in the result; when it finds a subcommand name, it fetches that subcommand’s parser and recursively calls it to parse the remaining command-line arguments. In other words, if a subcommand is not matched, nothing related to it is executed and none of its arguments end up in the result. Subcommands at the same level are necessarily mutually exclusive, so multiple subcommands can never match at the same time. For example, python cli.py greet goodbye matches the greet command, and goodbye is handed to greet’s own parser as an argument to greet.
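
The greet goodbye case can be checked directly. The following is a small sketch, using parse_known_args so that the leftover token is returned rather than causing an exit; dest="matched" is an illustrative name I chose to record which subparser ran.

```python
import argparse

parser = argparse.ArgumentParser(prog="cli.py")
# dest="matched" records the name of the subcommand that was matched.
subparsers = parser.add_subparsers(dest="matched")
for command_name in ("greet", "goodbye"):
    subparser = subparsers.add_parser(command_name)
    subparser.add_argument("-n", "--name", default="John Doe")

# parse_known_args returns unrecognized tokens instead of exiting.
args, leftover = parser.parse_known_args(["greet", "goodbye"])
print(args.matched, leftover)  # prints: greet ['goodbye']
```

Only greet is matched; the second token is passed to greet's parser, which has no positional argument to absorb it. With a plain parse_args call the same input would end in an "unrecognized arguments" error.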

So all we need to do is attach the subcommand’s handler to the parse result when the subcommand is matched. This takes only a slight modification to the mounting procedure.

for name, command in subcommands.items():
    cmd_instance = command()
    subparser = subparsers.add_parser(name)
    subparser.set_defaults(handle=cmd_instance.handle)
    cmd_instance.add_arguments(subparser)

set_defaults registers cmd_instance.handle as the default value of handle: after parsing, if the result has no handle attribute, it is set to cmd_instance.handle. Because the default is registered on subparser, it only takes effect when that subcommand is matched.

Then the final processing logic is very natural.

args = parser.parse_args()
if hasattr(args, 'handle'):
    args.handle(args)
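
Putting the pieces together, a minimal end-to-end sketch might look like the following. The main function, the dict registry, and returning strings instead of printing are assumptions made so the example is easy to run and check; they are not taken from PDM.

```python
import argparse


class Command:
    def add_arguments(self, parser):
        pass

    def handle(self, args):
        pass


class GreetCommand(Command):
    def add_arguments(self, parser):
        parser.add_argument("-n", "--name", default="John Doe")

    def handle(self, args):
        # Returns instead of printing so the result is easy to check.
        return f"Hello, {args.name}!"


subcommands = {"greet": GreetCommand}  # hypothetical registry


def main(argv=None):
    parser = argparse.ArgumentParser(prog="bot")
    subparsers = parser.add_subparsers()
    for name, command in subcommands.items():
        cmd_instance = command()
        subparser = subparsers.add_parser(name)
        # The bound method travels with the parse result as a default.
        subparser.set_defaults(handle=cmd_instance.handle)
        cmd_instance.add_arguments(subparser)

    args = parser.parse_args(argv)
    if hasattr(args, "handle"):
        return args.handle(args)
    # No subcommand matched, so no handle default was applied.
    return "no subcommand given"


print(main(["greet", "-n", "Ann"]))  # prints: Hello, Ann!
```

The hasattr check matters because subcommands are optional by default in Python 3: running the program with no arguments yields a result without a handle attribute.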

Parameter reuse

With the power of OOP, I can also cut down on repeated code. Notice that greet and goodbye both take a -n/--name argument of the same type. Since arguments are added in add_arguments, IoC helps again.

class Argument:
    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def add_to_parser(self, parser):
        parser.add_argument(*self.args, **self.kwargs)

name_option = Argument("-n", "--name", help="name of the person", default="John Doe")

Next, add an arguments class attribute to the Command class.

class Command:
    arguments = [name_option]
    def add_arguments(self, parser):
        for arg in self.arguments:
            arg.add_to_parser(parser)
        self.subcommand_add_arguments(parser)

    def subcommand_add_arguments(self, parser):
        # the original add_arguments is renamed to this method
        pass
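
As a sketch of how this reuse could play out, the example below shares name_option across two commands and lets one of them extend the list. The shout_option argument and the two standalone parsers are hypothetical, introduced only for this illustration.

```python
import argparse


class Argument:
    """Stores add_argument parameters so they can be replayed later."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    def add_to_parser(self, parser):
        parser.add_argument(*self.args, **self.kwargs)


name_option = Argument("-n", "--name", help="name of the person", default="John Doe")
# Hypothetical extra option, used by only one command below.
shout_option = Argument("--shout", action="store_true", help="use upper case")


class Command:
    arguments = [name_option]

    def add_arguments(self, parser):
        for arg in self.arguments:
            arg.add_to_parser(parser)
        self.subcommand_add_arguments(parser)

    def subcommand_add_arguments(self, parser):
        pass


class GreetCommand(Command):
    # Build a new list instead of mutating the inherited one in place.
    arguments = Command.arguments + [shout_option]


class GoodbyeCommand(Command):
    pass


greet_parser = argparse.ArgumentParser(prog="greet")
GreetCommand().add_arguments(greet_parser)
goodbye_parser = argparse.ArgumentParser(prog="goodbye")
GoodbyeCommand().add_arguments(goodbye_parser)

greet_args = greet_parser.parse_args(["-n", "Ann", "--shout"])
goodbye_args = goodbye_parser.parse_args([])
print(greet_args.name, greet_args.shout, goodbye_args.name)
```

Building a new list in the subclass is a deliberate choice: appending to the inherited arguments list would modify the base class attribute and leak the extra option into every command.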

Upgraded argparse usage

Now back to the requirement I started with: inheritance and extension. To add a new subcommand, I just need to inherit from the base class Command, implement the subcommand_add_arguments and handle methods, and add the class to subcommands (a method for registering it would be exposed to users).

If you want to modify the existing commands, you only need to inherit from the original command class.

class MyGreetCommand(GreetCommand):
    def subcommand_add_arguments(self, parser):
        super().subcommand_add_arguments(parser)
        parser.add_argument("-t", "--test", action="store_true", help="run under test environment")

    def handle(self, args):
        if args.test:
            print("I'm under testing, no time to greet")
        else:
            super().handle(args)

When mounting, register it under the original command name greet so that it overrides the original command.
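
A minimal sketch of the override, assuming the registry is a plain dict keyed by command name (an assumption for illustration; the real registration API may differ). The handlers return strings rather than printing so the behavior is easy to verify.

```python
import argparse


class Command:
    def add_arguments(self, parser):
        parser.add_argument("-n", "--name", default="John Doe")
        self.subcommand_add_arguments(parser)

    def subcommand_add_arguments(self, parser):
        pass

    def handle(self, args):
        pass


class GreetCommand(Command):
    def handle(self, args):
        return f"Hello, {args.name}!"


class MyGreetCommand(GreetCommand):
    def subcommand_add_arguments(self, parser):
        super().subcommand_add_arguments(parser)
        parser.add_argument("-t", "--test", action="store_true")

    def handle(self, args):
        if args.test:
            return "I'm under testing, no time to greet"
        return super().handle(args)


# Hypothetical registry shipped by the library ...
subcommands = {"greet": GreetCommand}
# ... and the user's override, registered under the same name.
subcommands["greet"] = MyGreetCommand


def main(argv=None):
    parser = argparse.ArgumentParser(prog="bot")
    subparsers = parser.add_subparsers()
    for name, command in subcommands.items():
        cmd_instance = command()
        subparser = subparsers.add_parser(name)
        subparser.set_defaults(handle=cmd_instance.handle)
        cmd_instance.add_arguments(subparser)
    args = parser.parse_args(argv)
    if hasattr(args, "handle"):
        return args.handle(args)
    return "no subcommand given"


print(main(["greet", "-t"]))
print(main(["greet", "-n", "Ann"]))
```

Because the key is reused, the mounting loop never sees the original GreetCommand; the subclass still reaches its behavior through super().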

Conclusion

We have taken advantage of Python’s dynamic nature and, with a modest amount of IoC, given argparse an object-oriented structure. PDM uses this approach to implement extensible command-line parsing. The complete command classes are in pdm/cli/commands, and the command-parsing assembly is in pdm/core.py. In fact, the command lines of pip and Django are written in a similar way; only the implementation details differ.
