Configuration¶
The configuration file is in toml format. The file location follows the freedesktop.org standard. Check the location on your system by running mario --help
:
% mario --help
Usage: mario [OPTIONS] COMMAND1 [ARGS]... [COMMAND2 [ARGS]...]...
Mario: Python pipelines for your shell.
GitHub: https://github.com/python-mario/mario
Configuration:
Declarative config: /home/user/.config/mario/config.toml
Python modules: /home/user/.config/mario/m/
Config modules¶
Mario will make the m
package available at startup. Define any functions you want for your commands in a file in the m/
directory. For example, if you define a file called m/code.py
in your config directory,
# m/code.py
def increment(number):
return number + 1
you can use m.code.increment
in your commands, like this:
% mario map 'int ! m.code.increment' <<EOF
1
2
3
EOF
2
3
4
Any code that needs to run at startup, such as defining a new command, can be placed in m/__init__.py
(or in the declarative config; see Declarative configuration).
You also can add functions directly to the m
namespace by placing them in m/__init__.py
. For example, defining increment
in m/__init__.py
# m/__init__.py
def increment(number):
return number + 1
allows invoking m.increment
, like this:
% mario map 'int ! m.increment' <<EOF
1
2
3
EOF
2
3
4
But note that Mario executes m/__init__.py
at startup, so code placed in that file may affect startup time.
Declarative config¶
The declarative configuration is in mario/mario.toml
. For example, on Ubuntu we use ~/.config/mario/config.toml
.
In the declarative configuration you can:
- set default values for the
mario
command-line options, and - define your own mario commands, like
map
,filter
, orread-csv
. See Command configuration schema for the command format specification.
You can set any of the mario
command-line options in your config. For example, to set a different default value for the concurrency maximum mario --max-concurrent
, add max_concurrent
to your config file. Note the configuration file uses underscores as word separators, not hyphens.
# ~/.config/mario/config.toml
max_concurrent = 10
then just use mario
as normal.
The base_exec_before
option allows you to define any Python code you want to execute before your commands run. Your commands can reference names defined in the base_exec_before
. This option can be supplemented by using the --exec-before
option on the command line to run additional code before your commands.
# ~/.config/mario/config.toml
base_exec_before = """
from itertools import *
from collections import Counter
"""
Then you can directly use the imported objects without referencing the module.
% mario map 'Counter ! json.dumps' <<<$'hello\nworld'
{"h": 1, "e": 1, "l": 2, "o": 1}
{"w": 1, "o": 1, "r": 1, "l": 1, "d": 1}
Custom commands¶
Define new commands in your config file which provide commands to other commands. For example, this config adds a jsonl
command for reading jsonlines streams into Python objects, by calling calling out to the map
traversal.
Load jsonlines¶
[[command]]
name = "jsonl"
help = "Load jsonlines into python objects."
[[command.stages]]
command = "map"
params = {code="json.loads"}
Now we can use it like a regular command:
% mario jsonl <<< $'{"a":1, "b":2}\n{"a": 5, "b":9}'
{'a': 1, 'b': 2}
{'a': 5, 'b': 9}
The new command jsonl
can be used in pipelines as well. To get the maximum value in a sequence of jsonlines objects:
$ mario jsonl map 'x["a"]' apply max <<< $'{"a":1, "b":2}\n{"a": 5, "b":9}'
5
Convert yaml to json¶
Convenient for removing trailing commas.
% mario yml2json <<<'{"x": 1,}'
{"x": 1}
[[command]]
name = "yml2json"
help = "Convert yaml to json"
[[command.stages]]
command = "read-text"
[[command.stages]]
command = "map"
params = {code="yaml.safe_load ! json.dumps"}
Search for xml elements with xpath¶
Pull text out of xml documents.
% mario xpath '//' map 'x.text' <<EOF
<slide type="all">
<title>Overview</title>
<item>Anything <em>can be</em> in here</item>
<item>Or <em>also</em> in here</item>
</slide>
EOF
Overview
Anything
can be
Or
also
[[command]]
name="xpath"
help = "Find xml elements matching xpath query."
arguments = [{name="query", type="str"}]
inject_values=["query"]
[[command.stages]]
command = "map"
[[command.stages]]
command = "map"
params = {code="x.encode() ! io.BytesIO ! lxml.etree.parse ! x.findall(query) ! list" }
[[command.stages]]
command="chain"
Generate json objects¶
% mario jo 'name=Alice age=21 hobbies=["running"]'
{"name": "Alice", "age": 21, "hobbies": ["running"]}
[[command]]
name="jo"
help="Make json objects"
arguments=[{name="pairs", type="str"}]
inject_values=["pairs"]
[[command.stages]]
command = "eval"
params = {code="pairs"}
[[command.stages]]
command = "map"
params = {code="shlex.split(x, posix=False)"}
[[command.stage]]
command = "chain"
[[command.stages]]
command = "map"
params = {code="x.partition('=') ! [x[0], ast.literal_eval(re.sub(r'^(?P<value>[A-Za-z]+)$', r'\"\\g<value>\"', x[2]))]"}
[[command.stages]]
command = "apply"
params = {"code"="dict"}
[[command.stages]]
command = "map"
params = {code="json.dumps"}
Read csv file¶
Read a csv file into Python dicts. Given a csv like this:
% cat names.csv
name,age
Alice,21
Bob,25
try:
% mario csv < names.csv
{'name': 'Alice', 'age': '21'}
{'name': 'Bob', 'age': '25'}
base_exec_before = '''
import csv
import typing as t
def read_csv(
file, header: bool, **kwargs
) -> t.Iterable[t.Dict[t.Union[str, int], str]]:
"Read csv rows into an iterable of dicts."
rows = list(file)
first_row = next(csv.reader(rows))
if header:
fieldnames = first_row
reader = csv.DictReader(rows, fieldnames=fieldnames, **kwargs)
return list(reader)[1:]
fieldnames = range(len(first_row))
return csv.DictReader(rows, fieldnames=fieldnames, **kwargs)
'''
[[command]]
name = "csv"
help = "Load csv rows into python dicts. With --no-header, keys will be numbered from 0."
inject_values=["delimiter", "header"]
[[command.options]]
name = "--delimiter"
default = ","
help = "field delimiter character"
[[command.options]]
name = "--header/--no-header"
default=true
help = "Treat the first row as a header?"
[[command.stages]]
command = "apply"
params = {code="read_csv(x, header=header, delimiter=delimiter)"}
[[command.stages]]
command = "chain"
[[command.stages]]
command = "map"
params = {code="dict(x)"}
Command configuration schema¶
At the top level, add new commands with a [[command]]
heading, documented as CommandSpecschema
in the tables.
CommandSpecSchema¶
type | object | ||
definition | |||
|
|||
properties | |||
|
arguments | ||
Arguments accepted by the new command. | |||
type | array | ||
items | |||
type | object | ||
ArgumentSchema | |||
|
help | ||
Long-form documentation of the command. Will be interpreted as ReStructuredText markup. | |||
type | string | ||
default | None | ||
|
hidden | ||
Hide this command on the help page. | |||
type | boolean | ||
|
inject_values | ||
CLI parameters to be injected into the local namespace, accessible by the executing commands. | |||
type | array | ||
items | |||
inject_values | |||
type | string | ||
|
name | ||
Name of the new command. | |||
type | string | ||
|
options | ||
Options accepted by the new command. | |||
type | array | ||
items | |||
type | object | ||
OptionSchema | |||
|
section | ||
Name of the documentation section in which the new command should appear. | |||
type | string | ||
|
short_help | ||
Single-line CLI description. | |||
type | string | ||
default | None | ||
|
stages | ||
List of pipeline command stages that input will go through. | |||
type | array | ||
items | |||
type | object | ||
CommandStageSchema | |||
|
tests | ||
List of specifications to test the new command. | |||
type | array | ||
items | |||
type | object | ||
CommandTestSchema |
ArgumentSchema¶
type | object | ||
definition | |||
|
|||
properties | |||
|
choices | ||
List of allowed string values. | |||
type | array | ||
default | None | ||
items | |||
choices | |||
type | string | ||
|
nargs | ||
Number of instances expected. Pass -1 for variadic. | |||
type | number | ||
default | None | ||
format | integer | ||
|
Name of the argument. | ||
type | string | ||
|
required | ||
Whether the argument is required. | |||
type | boolean | ||
default | True | ||
|
Name of the type. int, str, bool, float accepted. | ||
type | string |
CommandStageSchema¶
type | object | ||
definition | |||
|
|||
properties | |||
|
command | ||
Name of the base command | |||
type | string | ||
|
params | ||
Mapping from new command param name (str) to value (any json type). | |||
type | object | ||
|
remap_params | ||
Provide new names for the parameters, different from the base command parameters’ names | |||
type | array | ||
items | |||
type | object | ||
RemapParamSchema |
RemapParamSchema¶
type | object | |
definition | ||
|
||
properties | ||
|
new | |
New name of the parameter. | ||
type | string | |
|
old | |
Old name of the parameter. | ||
type | string |
OptionSchema¶
type | object | ||
definition | |||
|
|||
properties | |||
|
choices | ||
List of allowed string values. | |||
type | array | ||
default | None | ||
items | |||
choices | |||
type | string | ||
|
Default value. | ||
type | string | ||
|
help | ||
Documentation for the option. | |||
type | string | ||
default | None | ||
|
hidden | ||
Whether the option is hidden from help. | |||
type | boolean | ||
default | False | ||
|
is_flag | ||
Whether the option is a boolean flag. | |||
type | boolean | ||
default | False | ||
|
multiple | ||
Whether multiple values can be passed. | |||
type | boolean | ||
|
nargs | ||
Number of instances expected. Pass -1 for variadic. | |||
type | number | ||
format | integer | ||
|
Name of the option. Usually prefixed with - or –. | ||
type | string | ||
|
required | ||
Whether the option is required. | |||
type | boolean | ||
default | False | ||
|
Name of the type. int, str, bool, float accepted. | ||
type | string |
CommandTestSchema¶
type | object | ||
definition | |||
|
|||
properties | |||
|
input | ||
String passed on stdin to the program. | |||
type | string | ||
|
invocation | ||
Command line arguments to mario. (Don’t include mario.) | |||
type | array | ||
items | |||
invocation | |||
type | string | ||
|
output | ||
Expected string output from the program. | |||
type | string |