.. This document is UTF-8 encoded and marked up using reStructured text.

NAME
====

**mkroesti** - generate different kinds of hashes


SYNOPSIS
========

| **mkroesti** [**-a** *LIST*] [**-d**] [**-x**] [**-p LIST**] [**-c** CODEC] [**-e**]
| **mkroesti** [**-a** *LIST*] [**-d**] [**-x**] [**-p LIST**] [**-c** CODEC] **-b** *input*
| **mkroesti** [**-a** *LIST*] [**-d**] [**-x**] [**-p LIST**] [**-c** CODEC] **-f** *FILE*
| **mkroesti** **-l** [**-x**] [**-p LIST**]
| **mkroesti** **-V**
| **mkroesti** **-h**


DESCRIPTION
===========

The **mkroesti** command takes an input and generates different kinds of hashes from that input. Without special instructions, **mkroesti** takes its input either from standard input or asks for it interactively. Unless echo mode is enabled (**--echo**), **mkroesti** does not display input typed by the user, assuming that the user enters a secret password.

Additional sources for the input are: a file (**--file**) or the command line (**--batch**). Note that the latter option should be used with extreme care, since if the input is a password, it will be visible to any programm or user looking at the system's list of processes at the time when **mkroesti** is run.

The user usually specifies one or more algorithms (**--algorithms**) that should be used to generate hashes. If no specific algorithm is selected, **mkroesti** generates hashes for all algorithms that it knows about.

**mkroesti** knows about a number of built-in hash algorithms and aliases. These are listed further down in the **ALGORITHMS** and **ALIASES** section. Using the **--providers** option, it is possible to extend **mkroesti** with new hash algorithms and aliases. Details are listed in the **ALGORITHM PROVIDERS** section.


OPTIONS
=======

-a LIST, --algorithms LIST
  Comma separated list of algorithms and/or aliases that should be used to generate hashes. See **ALGORITHMS** and **ALIASES** below.

-b, --batch
  Use batch mode; i.e., get the input from the command line rather than prompting for it. This option should be used with extreme care, since if the input is a password, it will be visible to any program or user looking at the system's list of processes at the time when **mkroesti** is run.

-c CODEC, --codec CODEC
  If necessary, use the character encoding named CODEC for internal conversion between binary and string data. See **ENCODINGS** below. This option has no effect if **mkroesti** is run under Python 2.6.

-d, --duplicate-hashes
  Allow duplicate hashes; i.e. if the same algorithm is available from multiple implementation sources, generate a hash for each implementation.

-e, --echo
  Enable echo mode; i.e. when the user is prompted for input, the characters she types are echoed on the screen. This option cannot be combined with **--batch** or **--file**.

-f FILE, --file FILE
  Read the input from **FILE**.

-l, --list
  List all supported algorithms, together with the information which algorithms are actually available, and which implementation sources exist for them.

-p LIST, --providers LIST
  Comma separated list of third party Python modules that provide hash algorithms. This option can be used to extend mkroesti with new algorithms. See **ALGORITHM PROVIDERS** below.

-x, --exclude-builtin
  Exclude built-in algorithms from the operation of **mkroesti**. This is useful if you want to test your own algorithm providing modules without interference from built-in algorithms.

-V, --version
  Print the version number and some diagnostic data.

-h, --help
  Show a short usage summary.       


ALGORITHMS
==========

**mkroesti** defines the following keywords to identify built-in hash algorithms:

base16
  Base16 encoding as specified in RFC 3548. Strictly speaking, this is not a hash but an encoding.
base32
  Base32 encoding as specified in RFC 3548. Strictly speaking, this is not a hash but an encoding.
base64
  Base64 encoding as specified in RFC 3548. Strictly speaking, this is not a hash but an encoding.
adler32
  Adler-32 checksum. Strictly speaking, this is not a hash but a checksum.
crc32
  CRC-32 checksum. Strictly speaking, this is not a hash but a checksum.
crc32b
  .. empty definition
crypt-system
  Local crypt() system call. Note that crypt hashes are salted, i.e. repeated calls with the same input will yield different results.
crypt-md5
  MD5-based crypt (hash starts with **$1$**). Note that crypt hashes are salted, i.e. repeated calls with the same input will yield different results.
crypt-apr1
  MD5-based crypt, Apache variant (hash starts with **$apr1$**). Note that crypt hashes are salted, i.e. repeated calls with the same input will yield different results.
crypt-blowfish
  blowfish-based crypt (hash starts with **$2a$**). Note that crypt hashes are salted, i.e. repeated calls with the same input will yield different results.
md2
  Message-Digest algorithm 2.
md4
  Message-Digest algorithm 4, generates a 128 bit hash.
md5
  Message-Digest algorithm 5, generates a 128 bit hash.
sha-0
  The original SHA algorithm. Superseded by **sha-1**.
sha-1
  SHA-1 algorithm, generates a 160 bit hash. Supersedes **sha-0**.
sha-224
  SHA-2 algorithm, generates a 224 bit hash.
sha-256
  SHA-2 algorithm, generates a 256 bit hash.
sha-384
  SHA-2 algorithm, generates a 384 bit hash.
sha-512
  SHA-2 algorithm, generates a 512 bit hash.
ripemd-original
  The original RIPEMD algorithm. Superseded by **ripemd-160**.
ripemd-128
  RIPEMD algorithm, generates a 128 bit hash.
ripemd-160
  RIPEMD algorithm, generates a 160 bit hash. Supersedes **ripemed**.
ripemd-256
  RIPEMD algorithm, generates a 256 bit hash.
ripemd-320
  RIPEMD algorithm, generates a 320 bit hash.
haval-128-[345]
  HAVAL algorithm, generates a 128 bit hash. The number of rounds is either 3, 4 or 5.
haval-160-[345]
  HAVAL algorithm, generates a 160 bit hash. The number of rounds is either 3, 4 or 5.
haval-192-[345]
  HAVAL algorithm, generates a 192 bit hash. The number of rounds is either 3, 4 or 5.
haval-224-[345]
  HAVAL algorithm, generates a 224 bit hash. The number of rounds is either 3, 4 or 5.
haval-256-[345]
  HAVAL algorithm, generates a 256 bit hash. The number of rounds is either 3, 4 or 5.
whirlpool
  WHIRLPOOL algorithm.
tiger-128
  Tiger algorithm, generates a 128 bit hash.
tiger-160
  Tiger algorithm, generates a 160 bit hash.
tiger-192
  Tiger algorithm, generates a 192 bit hash. This is the variant commonly used, and commonly referred to as "Tiger".
tiger2
  .. empty definition
snefru-128
  Snefru algorithm, generates a 128 bit hash.
snefru-256
  Snefru algorithm, generates a 256 bit hash.
gost
  GOST algorithm, generates a 256 bit hash.
windows-lm
  The Windows LanManager password hash.
windows-nt
  The Windows NT password hash.
mysql-password
  MySQL's PASSWORD() function.


ALIASES
=======

**mkroesti** defines the following keywords to identify built-in aliases. Each alias refers to a set of hash algorithms:

all
  All algorithms known to **mkroesti**. The default if **--algorithms** is not specified.
chksum
  All algorithms that generate a checksum instead of a hash.
encoding
  All algorithms that generate an encoding instead of a hash.
crypt
  All algorithms that generate a hash based on crypt().
sha
  All algorithms of the SHA (Secure Hash Algorithm) family.
ripemd
  All algorithms of the RIPEMD (RACE Integrity Primitives Evaluation Message Digest) family.
haval
  All algorithms of the HAVAL family.
tiger
  All algorithms of the Tiger family.
snefru
  All algorithms of the Snefru family.


ALGORITHM PROVIDERS
===================

It is possible to extend **mkroesti** with new algorithms by specifying third party Python modules that provide these new algorithms. The option to use is **--providers**, the option value is a list of comma separated Python modules that provide the new algorithms. The following example adds algorithms provided by the three new modules *mymodule1*, *mymodule2* and *mypackage.mymodule3* to whatever operation is performed by **mkroesti**:

  **mkroesti** **--providers** *mymodule1*,*mymodule2,*mypackage.mymodule3* [...]

The specified modules must be available for importing via the regular Python module search path. If the modules are not part of the system, it might be necessary to make them known to Python by setting the PYTHONPATH environment variable.

Every module must contain a callable named "getProviders"; this will usually be a function. **mkroesti** invokes the callable and expects to get a list of instances of algorithm provider classes in return. **mkroesti** registers these provider instances in its internal system, just as it does for its own built-in providers. **mkroesti** then goes on to perform whatever operation was requested.

For the gritty details, use **pydoc** to view the API documentation inside the modules **mkroesti.algorithm** and **mkroesti.provider**. It might also be useful to have a look at an example, a good one would be **mkroesti.algorithm.Base64Algorithms** and **mkroesti.provider.Base64Provider**.


ENCODINGS
=========

This section is not relevant if **mkroesti** is run under Python 2.6, because Python 2.6 does not have a binary data type. If **mkroesti** is run under Python 2.6, it always stores the raw, uninterpreted input data inside a string object and passes that string object directly to algorithms - without any conversion. Because no conversion is necessary, the **--codec** command line argument is always ignored.

From Python 3 onwards, **mkroesti** treats the input data specified by the user either as binary data, or as string data, depending on the input method that the user chooses:

- Reading from a file (i.e. using **--file**): Input data is treated as binary data.
- Reading from stdin, when stdin is not a tty (i.e. piping the input into **mkroesti**, or redirecting stdin): Input data is treated as binary data.
- Reading from the command line (i.e. using **--batch**): Input data is treated as string data.
- Specifying the input interactively (with or without echo mode): Input data is treated as string data.

While most hash algorithms require their input to be binary data, some algorithms require their input to be string data (**crypt-system** is a notable case). Whenever an algorithm requires its input to be of a type that is different from the type how **mkroesti** treats its input data, **mkroesti** is forced to convert between the two types. So if, for instance, the user specifies a file (input data is binary data) and selects the **crypt-system** algorithm (requires string data), **mkroesti** needs to convert from binary to string data. Another example would be if the user specifies the input on the command line using **--batch** (input data is string data), and selects the **md5** algorithm (requires binary data): In this case **mkroesti** needs to convert from string to binary data.

Because strings are involved in converting from binary to string data, or vice versa, **mkroesti** must employ a character encoding (e.g. UTF-8, Latin-1) to interpret the data. Without any special measures, **mkroesti** implicitly uses Python's default character encoding to perform the conversion. Invoke **mkroesti** with the **--version** argument to see which default encoding would be used right now.

The **--codec** command line argument tells **mkroesti** to use the specified encoding instead of Python's default encoding. The encoding is specified as a "codec name", which can be the name of any codec registered via the Python Standard Library module "codecs". See the module's documentation for an extensive list of possible names. Often used encodings would be "utf-8", "latin_1" or "iso8859-1" (both equivalent), and "mac_roman".

**NOTE 1:** We have already seen that in some cases **mkroesti** initially gets its input data from the Python interpreter as string data (i.e. when using **--batch**, or entering the input data interactively). In those cases, the raw input data has already been converted into (Unicode) string data, before **mkroesti** gets any chance to do something with the data. This initial conversion is performed by the Python interpreter; to determine the encoding to be used in the conversion, the interpreter looks at the user's locale (e.g. the content of the *LANG* environment variable). If you use **--codec** in such a situation, and you specify a new encoding that is different from the initial locale-based encoding, **mkroesti** will use your new encoding to convert the string input data into binary input data. Due to this re-interpretation, the final byte sequence given to the hash algorithm may therefore very well be different from the original byte sequence that you typed on the command line, which will result in a different hash than the one you may have expected.

**NOTE 2:** If **mkroesti** does not need to convert between binary and string data, it will ignore the **--codec** command line argument! If **mkroesti** generates multiple hashes, it will use the specified encoding for those algorithms that require conversion, but ignore it for all others.


EXAMPLES
========

(1) Prompt the user to interactively provide the input, not echoing the input on the screen. Hashes for all algorithms are generated.

  **mkroesti** 


(2) Take the input from file *foo* and generate hashes for the algorithms **md5**, **sha-1** and **haval-128** (4 rounds variant).

  **mkroesti** **-f** *foo* **-a** **md5**,**sha-1**,**haval-128-4**


(3) Take the input from standard input and generate a single hash using the local crypt() system function. Make sure that the input (i.e. the content of file *foo*) is interpreted using the *mac_roman* character encoding (because **crypt-system** requires string input data). 

  **cat** *foo* | **mkroesti** **-a** **crypt-system** **-c** *mac_roman*


(4) Use the string *secret* as input and generate hashes for all algorithms of the SHA and RIPEMD family.

  **mkroesti** **-b** **-a** **sha**,**ripemd** *secret*


(5) List all supported algorithms, which in this case are: built-in algorithms, and algorithms provided by the Python module *mymodule*.

  **mkroesti** **-l** **-p** *mymodule*


(6) Use the string *secret* as input and generate a single hash using the algorithm *myalgorithm*. Algorithms provided by the Python module *mymodule* are included in the operation (which is probably what the user wants, since *myalgorithm* is not a built-in algorithm).

  **mkroesti** **-b** **-p** *mymodule* **-a** *myalgorithm* *secret*


(7) Use the string *secret* as input and generate hashes for all algorithms provided by the Python module *mymodule*.

  **mkroesti** **-b** **-x** **-p** *mymodule* *secret*


(8) Use the string *αβγ* as input and generate a single hash using the **md5** algorithm. Make sure that the input string is re-interpreted using the *utf_16* character encoding before it is passed to the hash algorithm.

  **mkroesti** **-b** **-a** **md5** **-c** *utf_16* *αβγ*


EXIT CODES
==========

| 0  success
| 1  runtime error
| 2  error during parsing of command line arguments


BUGS
====

Not all hash algorithms advertised in this manual are actually provided. Use "**mkroesti** **--list**" to find out which algorithms are actually available on your system.

**mkroesti** is not good at handling large files, because it tries to read a file's entire content into memory before hash generation commences.

The handling of encodings is probably incomplete, and certainly awkward in some situations. You may find it easier to ignore encodings in **mkroesti** altogether, and instead use a different tool that is better at handling encodings, to pre-process your input data.

If you find any other bugs, please report them to <herzbube@herzbube.ch>.


DEPENDENCIES
============

The built-in algorithms provided by *mkroesti** depend on the following Python modules:

base64
  From the Python Standard Library
hashlib
  From the Python Standard Library
crypt
  From the Python Standard Library
random
  From the Python Standard Library
string
  From the Python Standard Library
smbpasswd
  See http://barryp.org/software/py-smbpasswd/
mhash
  See http://labix.org/python-mhash
bcrypt
  See http://www.mindrot.org/projects/py-bcrypt/


SEE ALSO
========

-


AUTHOR
======

Patrick Näf (herzbube@herzbube.ch)


LICENSE
=======

GPLv3


VERSION
=======

0.2 (Jun 03 2009)
