This post demonstrates an approach to command line application development that minimizes errors, provides ready access to configurable parameters and is straightforward to implement. As Python was originally created to serve as an improved scripting language and environment, in-built support for command line application development is robust. In particular, argparse simplifies the implementation of command line options, arguments and sub-commands, and configparser makes it possible to specify program inputs in a file independent of the companion script.
Application Description
The application, cmdapp.py, filters and subsets the Linux dictionary file, usually found at /usr/share/dict/words or /var/lib/dict/words. A copy of this file (with all words containing digits removed) can be obtained here.
cmdapp.py can accept command line options which result in subsetting the word list, and after filtering has been applied, the remaining subset of words can be (optionally) written to file. What follows is the full list of available command line options:
--words-file
The file system location of words file
--output-directory
Directory of exported file (if applicable)
--output-filename
Name of exported file (if applicable)
--export
Should matching words be written to file (False)?
--startswith=PATTERN
Only return words starting with PATTERN
--endswith=PATTERN
Only return words ending with PATTERN
--contains=PATTERN
Only return words containing specified PATTERN
--suppress
Should matching words be suppressed (False)?
Implementing these options with argparse is straightforward. We initialize an instance of the ArgumentParser class, and then proceed as follows:
import argparse

parser = argparse.ArgumentParser()

# Specify command line arguments.
parser.add_argument(
    "--words-file", type=str,
    help="Absolute filepath of words(.txt) file",
    default=None
    )
parser.add_argument(
    "--output-directory", type=str,
    help="Directory of exported file, if applicable",
    default=None
    )
# None means "not given", so that a configured default can apply later.
parser.add_argument(
    "--output-filename", type=str,
    help="Name of exported file, if applicable (matches.txt)",
    default=None
    )
parser.add_argument(
    "--export",
    help="Should matching words be written to file (False)?",
    action="store_true"
    )
parser.add_argument(
    "--startswith", type=str,
    help="Only return words starting with specified char(s)",
    default=None
    )
parser.add_argument(
    "--endswith", type=str,
    help="Only return words ending with specified char(s)",
    default=None
    )
parser.add_argument(
    "--contains", type=str,
    help="Only return words containing specified char(s)",
    default=None
    )
parser.add_argument(
    "--suppress", help="Should matching words be suppressed (False)?",
    action="store_true"
    )
parser.add_argument(
    "-v", "--verbose", help="Should additional runtime details be displayed (False)?",
    action="store_true"
    )

# Parse command line arguments.
argvs = parser.parse_args()

# Bind reference to specified options.
argvs_words_file = argvs.words_file
argvs_output_directory = argvs.output_directory
argvs_output_filename = argvs.output_filename
argvs_suppress = argvs.suppress
argvs_export = argvs.export
argvs_startswith = argvs.startswith
argvs_endswith = argvs.endswith
argvs_contains = argvs.contains

# Optionally view specified command line options and associated values.
print(vars(argvs))
add_argument’s action parameter can be set to “store_true” or “store_false”. If set to “store_true”, the option evaluates to True when the flag is present and to False when it is absent. If set to “store_false”, the option evaluates to False when the flag is present and to True when it is absent. In the preceding example, the options --suppress and --export are logical indicators with action set to “store_true”.
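The behavior of the two actions can be checked directly by handing parse_args an explicit argument list instead of reading sys.argv. The following is a minimal sketch; the --no-color option is hypothetical and is not part of cmdapp.py:

```python
import argparse

# Hypothetical parser used only to illustrate the two actions.
parser = argparse.ArgumentParser()
parser.add_argument("--export", action="store_true")     # False unless the flag is given
parser.add_argument("--no-color", action="store_false")  # True unless the flag is given

absent = parser.parse_args([])
present = parser.parse_args(["--export", "--no-color"])

print(absent.export, absent.no_color)    # False True
print(present.export, present.no_color)  # True False
```

Note that an absent “store_true” flag yields False rather than None, which matters when deciding whether a value was actually supplied on the command line.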
All other options require a value to follow the option. By default, argparse interprets every specified value as a string; if an option should be interpreted as something other than a string, specify the datatype using add_argument’s type parameter. The value can follow the command line option separated by an equal sign (whitespace also works):
$ python3 cmdapp.py --export --startswith="a" --endswith="e" --contains="p"
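As a small illustration of the type parameter and of the two ways to attach a value, consider the sketch below. The --min-length option is a hypothetical addition used only to show integer conversion; cmdapp.py does not define it:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--contains", type=str, default=None)
# Hypothetical option, not part of cmdapp.py: converted to int by argparse.
parser.add_argument("--min-length", type=int, default=0)

# The value may be attached with "=" or passed as the next argument.
with_equals = parser.parse_args(["--contains=p", "--min-length=5"])
with_space = parser.parse_args(["--contains", "p", "--min-length", "5"])

print(with_equals.contains, with_equals.min_length)
print(vars(with_equals) == vars(with_space))
```

Both invocations produce the same namespace, and min_length arrives as an int rather than a string.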
When developing command line applications, a popular convention has command line flags overriding default values, with defaults obtained from environment variables or configuration files. configparser helps to facilitate the latter. We create a configuration file that consists of sections, each containing key-value pairs. The pairs are read into cmdapp.py and are set as the defaults for each modifiable option. The configuration file, defaults.cfg, contains the following:
[SETTINGS]
WORDS_FILE=dictionary.txt
OUTPUT_DIRECTORY=/temp
OUTPUT_FILENAME=matches.txt
[PARAMETERS]
SUPPRESS=N
EXPORT=N
STARTSWITH=
ENDSWITH=
CONTAINS=
When specifying character strings in a configuration file, it is not necessary to surround the content in quotes. Key-value pairs can be separated by = or :. We next demonstrate how to read defaults.cfg into cmdapp.py and extract the content using an instance of the ConfigParser class:
import configparser
# Path to defaults.cfg; an absolute path avoids depending on the working directory.
DEFAULTS = "defaults.cfg"
config = configparser.ConfigParser()
config.read(DEFAULTS)
def_words_file = config["SETTINGS"]["WORDS_FILE"]
def_output_directory = config["SETTINGS"]["OUTPUT_DIRECTORY"]
def_output_filename = config["SETTINGS"]["OUTPUT_FILENAME"]
def_suppress = config["PARAMETERS"]["SUPPRESS"]
def_export = config["PARAMETERS"]["EXPORT"]
def_startswith = config["PARAMETERS"]["STARTSWITH"]
def_endswith = config["PARAMETERS"]["ENDSWITH"]
def_contains = config["PARAMETERS"]["CONTAINS"]
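A few configparser behaviors are worth seeing in isolation: both delimiters are accepted, option names are case-insensitive, an empty value is read as an empty string, and reading a nonexistent file does not raise an error. The sketch below uses an inline string in place of defaults.cfg so that it runs on its own; the sample content and the file name no_such_file.cfg are illustrative assumptions:

```python
import configparser

# Inline stand-in for a configuration file (illustrative content only).
SAMPLE = """
[SETTINGS]
WORDS_FILE = dictionary.txt
OUTPUT_DIRECTORY: /temp

[PARAMETERS]
EXPORT = N
STARTSWITH =
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

words_file = config["SETTINGS"]["WORDS_FILE"]
output_dir = config["SETTINGS"]["output_directory"]   # option names are case-insensitive
startswith = config["PARAMETERS"]["STARTSWITH"]       # empty value is read as ""
contains = config.get("PARAMETERS", "CONTAINS", fallback="")  # key absent, fallback used

# read() skips files it cannot open and returns the list of files it parsed.
files_read = configparser.ConfigParser().read("no_such_file.cfg")

print(words_file, output_dir, repr(startswith), repr(contains), files_read)
```

Because read() returns an empty list for a missing file instead of raising, checking its return value is one way to detect a misplaced defaults.cfg before a KeyError appears later.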
The next step is to determine the final parameter values for each option. We have the def_-prefixed values read from defaults.cfg and the argvs_-prefixed values containing command line overrides. We set the final_-prefixed variables to the command line overrides (if specified), otherwise fall back to the key-value pairs specified in defaults.cfg:
# Bind reference to final parameter values.
final_output_directory = argvs_output_directory if argvs_output_directory else def_output_directory
final_output_filename = argvs_output_filename if argvs_output_filename else def_output_filename

# store_true flags are False (never None) when absent, so a flag given on
# the command line wins; otherwise the Y/N value from defaults.cfg applies.
def_suppress_ind = def_suppress.upper() == "Y"
final_suppress = argvs_suppress or def_suppress_ind
def_export_ind = def_export.upper() == "Y"
final_export = argvs_export or def_export_ind

# Determine patterns to match against, if applicable.
if argvs_startswith is not None:
    final_startswith = argvs_startswith
else:
    final_startswith = def_startswith if len(def_startswith) > 0 else None

if argvs_endswith is not None:
    final_endswith = argvs_endswith
else:
    final_endswith = def_endswith if len(def_endswith) > 0 else None

if argvs_contains is not None:
    final_contains = argvs_contains
else:
    final_contains = def_contains if len(def_contains) > 0 else None
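The same precedence rule (command line first, configuration file second) can be factored into two small helpers, which makes it easier to test. This is one possible sketch rather than part of cmdapp.py; the names resolve_text and resolve_flag are hypothetical:

```python
def resolve_text(cli_value, cfg_value):
    """Return the command line value if given, else a non-empty config value, else None."""
    if cli_value is not None:
        return cli_value
    return cfg_value if cfg_value else None


def resolve_flag(cli_flag, cfg_value):
    """Return True if the flag was given, or if the config value is Y (case-insensitive)."""
    # A store_true flag is False when absent, so `or` falls through to the config value.
    return cli_flag or cfg_value.strip().upper() == "Y"


print(resolve_text("s", ""))     # command line override wins
print(resolve_text(None, "ab"))  # falls back to the config value
print(resolve_text(None, ""))    # nothing specified anywhere
print(resolve_flag(False, "y"))  # config default enables the flag
```

One consequence of using “store_true” flags this way is that the command line can switch a setting on but cannot switch off a setting that the configuration file enables.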
The full implementation of cmdapp.py follows:
import argparse
import configparser
import os.path

# Path to defaults.cfg; an absolute path avoids depending on the working directory.
DEFAULTS = "defaults.cfg"


if __name__ == "__main__":

    parser = argparse.ArgumentParser()
    config = configparser.ConfigParser()

    # Specify command line arguments.
    parser.add_argument(
        "--words-file", type=str,
        help="Absolute filepath of words(.txt) file",
        default=None
        )
    parser.add_argument(
        "--output-directory", type=str,
        help="Directory of exported file, if applicable",
        default=None
        )
    parser.add_argument(
        "--output-filename", type=str,
        help="Name of exported file, if applicable (matches.txt)",
        default=None
        )
    parser.add_argument(
        "--export",
        help="Should matching words be written to file (False)?",
        action="store_true"
        )
    parser.add_argument(
        "--startswith", type=str,
        help="Only return words starting with specified char(s)",
        default=None
        )
    parser.add_argument(
        "--endswith", type=str,
        help="Only return words ending with specified char(s)",
        default=None
        )
    parser.add_argument(
        "--contains", type=str,
        help="Only return words containing specified char(s)",
        default=None
        )
    parser.add_argument(
        "--suppress", help="Should matching words be suppressed (False)?",
        action="store_true"
        )

    # Parse command line arguments.
    argvs = parser.parse_args()

    # Read in `defaults.cfg` to obtain default configuration values.
    config.read(DEFAULTS)
    def_words_file = config["SETTINGS"]["WORDS_FILE"]
    def_output_directory = config["SETTINGS"]["OUTPUT_DIRECTORY"]
    def_output_filename = config["SETTINGS"]["OUTPUT_FILENAME"]
    def_suppress = config["PARAMETERS"]["SUPPRESS"]
    def_export = config["PARAMETERS"]["EXPORT"]
    def_startswith = config["PARAMETERS"]["STARTSWITH"]
    def_endswith = config["PARAMETERS"]["ENDSWITH"]
    def_contains = config["PARAMETERS"]["CONTAINS"]

    # Parse parameter overrides specified as command line arguments.
    argvs_words_file = argvs.words_file
    argvs_output_directory = argvs.output_directory
    argvs_output_filename = argvs.output_filename
    argvs_suppress = argvs.suppress
    argvs_export = argvs.export
    argvs_startswith = argvs.startswith
    argvs_endswith = argvs.endswith
    argvs_contains = argvs.contains

    final_words_file = argvs_words_file if argvs_words_file else def_words_file

    # Verify existence of final_words_file.
    if not os.path.isfile(final_words_file):
        raise FileNotFoundError(f"File not found: {final_words_file}")

    # Bind reference to final parameter values.
    final_output_directory = argvs_output_directory if argvs_output_directory else def_output_directory
    final_output_filename = argvs_output_filename if argvs_output_filename else def_output_filename

    # store_true flags are False (never None) when absent, so a flag given on
    # the command line wins; otherwise the Y/N value from defaults.cfg applies.
    def_suppress_ind = def_suppress.upper() == "Y"
    final_suppress = argvs_suppress or def_suppress_ind
    def_export_ind = def_export.upper() == "Y"
    final_export = argvs_export or def_export_ind

    # Determine patterns to match against, if applicable.
    if argvs_startswith is not None:
        final_startswith = argvs_startswith
    else:
        final_startswith = def_startswith if len(def_startswith) > 0 else None

    if argvs_endswith is not None:
        final_endswith = argvs_endswith
    else:
        final_endswith = def_endswith if len(def_endswith) > 0 else None

    if argvs_contains is not None:
        final_contains = argvs_contains
    else:
        final_contains = def_contains if len(def_contains) > 0 else None

    # Begin processing words list.
    with open(final_words_file, "r") as f:
        ll_init = f.read().split("\n")

    # Remove whitespace from words and convert to lowercase.
    ll = [i.strip().lower() for i in ll_init]
    len_words_init = len(ll)

    # Filter ll by final_startswith.
    if final_startswith is not None:
        ll = filter(lambda w: w.startswith(final_startswith), ll)

    # Filter ll by final_endswith.
    if final_endswith is not None:
        ll = filter(lambda w: w.endswith(final_endswith), ll)

    # Filter ll by final_contains.
    if final_contains is not None:
        ll = filter(lambda w: final_contains in w, ll)

    ll_final = list(ll)
    len_words_final = len(ll_final)
    pct_retained = round(100 * (len_words_final / len_words_init), 2)

    if not final_suppress:
        print(f"Number of words in total : {len_words_init}")
        print(f"Number of words in subset: {len_words_final}")
        print(f"Percent of words retained: {pct_retained}%")
        print("")
        for idx, word in enumerate(ll_final, start=1):
            print(f"{idx}. {word}")
        print("")

    # Construct filepath and write matches if export was requested.
    if final_export:
        # os.path.join copes with a trailing separator on the directory.
        export_fullpath = os.path.join(final_output_directory, final_output_filename)
        with open(export_fullpath, "w") as fw:
            for word in ll_final:
                fw.write(f"{word}\n")
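The filtering step at the heart of the script can also be expressed as a standalone function, which is convenient for experimenting without a dictionary file or command line. The following is a minimal sketch under stated assumptions: the function name filter_words is hypothetical, and unlike the script above it skips blank lines:

```python
def filter_words(words, startswith=None, endswith=None, contains=None):
    """Return lowercased words satisfying every pattern that is not None."""
    matches = []
    for raw in words:
        word = raw.strip().lower()
        if not word:
            continue  # assumption: ignore blank lines (the script keeps them)
        if startswith is not None and not word.startswith(startswith):
            continue
        if endswith is not None and not word.endswith(endswith):
            continue
        if contains is not None and contains not in word:
            continue
        matches.append(word)
    return matches


sample = ["Sur", "sour ", "star", "apple", ""]
result = filter_words(sample, startswith="s", endswith="r", contains="u")
print(result)  # ['sur', 'sour']
```

Here “star” is rejected because it lacks a “u”, and “apple” because it does not start with “s”.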
Another useful feature of argparse is the automatic creation of help menus. To demonstrate, we run cmdapp.py with the --help flag. Recall that we didn’t include any logic to produce a command line help menu:
$ python3 cmdapp.py --help
usage: cmdapp.py [-h] [--words-file WORDS_FILE]
                 [--output-directory OUTPUT_DIRECTORY]
                 [--output-filename OUTPUT_FILENAME] [--export]
                 [--startswith STARTSWITH] [--endswith ENDSWITH]
                 [--contains CONTAINS] [--suppress]

optional arguments:
  -h, --help            show this help message and exit
  --words-file WORDS_FILE
                        Absolute filepath of words(.txt) file
  --output-directory OUTPUT_DIRECTORY
                        Directory of exported file, if applicable
  --output-filename OUTPUT_FILENAME
                        Name of exported file, if applicable (matches.txt)
  --export              Should matching words be written to file (False)?
  --startswith STARTSWITH
                        Only return words starting with specified char(s)
  --endswith ENDSWITH   Only return words ending with specified char(s)
  --contains CONTAINS   Only return words containing specified char(s)
  --suppress            Should matching words be suppressed (False)?
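The help text is assembled from the help strings passed to add_argument, and it can be retrieved as a string with format_help, for example to include it in documentation. A minimal sketch, using a single option and setting prog explicitly so the usage line does not depend on how the snippet is run:

```python
import argparse

# prog is set explicitly here; by default argparse derives it from sys.argv[0].
parser = argparse.ArgumentParser(prog="cmdapp.py")
parser.add_argument(
    "--export",
    help="Should matching words be written to file (False)?",
    action="store_true"
    )

help_text = parser.format_help()
print(help_text)
```

The -h/--help entry appears in the output even though no such option was declared, since argparse adds it automatically. The exact section headings and layout can vary between Python versions.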
Let’s demonstrate cmdapp.py in action: Match all words in the dictionary file beginning with “s”, ending with “r” and containing “u”:
$ python3 cmdapp.py --startswith="s" --endswith="r" --contains="u"
Number of words in total : 4319
Number of words in subset: 15
Percent of words retained: 0.35%
1. santour
2. sauncier
3. secutor
4. sejour
5. semicircular
6. sinistrocular
7. souder
8. spitefuller
9. squidder
10. strummer
11. stubborner
12. suffixer
13. sunbather
14. superoposterior
15. sur
If we include the --export option, the results will be written to a file named matches.txt in the output directory, which can be overridden with --output-directory. Also, inclusion of --suppress prevents matches from being written to stdout:
$ python3 cmdapp.py --startswith="s" --endswith="r" --contains="u" --export --output-directory="/temp" --suppress