The Python `multiprocessing`

library exposes an interface that simplifies
distributing tasks to multiple cores. The `multiprocessing.Pool`

class
provides access to a pool of worker processes to which jobs can be submitted.
It supports asynchronous results with timeouts and callbacks and
has a parallel map implementation. Leveraging `multiprocessing.Pool`

is
straightforward. To demonstrate, we will solve
Project Euler Problem #14 in a
distributed fashion. The problem states:

```
The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains
10 terms. Although it has not been proved yet (Collatz Problem), it is thought
that all starting numbers finish at 1.
Which starting number, under one million, produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million.
```

To start, we define two functions: `collatz_test`

and `chain_length`

.
`collatz_test`

contains the logic that either divides the input by 2 (if even)
or multiplies it by 3 and adds 1 (if odd). `chain_length`

returns a tuple
consisting of the initial integer along with the length of the collatz chain:

```
def collatz_test(n):
"""
If n is even, return (n/2), else return (3n+1).
"""
return((n/2) if n%2==0 else (3*n+1))
def chain_length(n):
"""
Return the length of the collatz chain along
with the input value `n`.
"""
if n<=0: return(None)
cntr, tstint = 0, n
while tstint!=1:
cntr+=1
tstint = collatz_test(tstint)
return(n, cntr)
```

One thing to keep in mind when using the `multiprocessing`

library is that
instances of the `Pool`

and `Process`

classes can only be initialized after
the `if __name__ == "__main__"`

statement, and as a consequence
`multiprocessing.Pool`

cannot be called from within an interactive Python session.

Next we present our declarations from earlier along with the distributed
logic, which sets up our `chain_length`

function for parallel dispatch:

```
"""
Parallel solution to Project Euler Problem # 14.
"""
import multiprocessing
def collatz_test(n):
"""
If n is even, return (n/2), else return (3n+1).
"""
return((n/2) if n%2==0 else (3*n+1))
def chain_length(n):
"""
Return the length of the collatz chain along
with the input value `n`.
"""
if n<=0: return(None)
cntr, tstint = 0, n
while tstint!=1:
cntr+=1
tstint = collatz_test(tstint)
return(n, cntr)
if __name__ == "__main__":
# Initialize array of values to test.
arr = multiprocessing.Array('L', range(1, 1000000))
pool = multiprocessing.Pool()
all_lengths = pool.map(chain_length, arr, chunksize=1000)
pool.close(); pool.join()
# Search pool_results for longest chain.
longest_chain = max((i for i in all_lengths), key=lambda x: x[1])
```

We first declare our sequence of test values as `multiprocessing.Array`

, which
prevents the same 1,000,000 element sequence from being replicated in each
process. Instead, the array will be created once, and all processes will have
access to it. The `L`

typecode is from the `array`

module in the Python
Standard Library, which indicates the datatype of the elements contained in
the sequence. We initialize the `multiprocessing.Pool`

instance, then call its
`map`

method, which works similiarly to the builtin `map`

function, only in
parallel. Within `pool.map`

, We set `chunksize=1000`

due to the following
commentary in `multiprocessing`

‘s documentation:

For very long iterables using a large value for chunksize can make the job complete much faster than using the default value of 1.

Upon execution, we find that **837,799** produces the longest sequence, and it
is of length 524. By distributing the tasks to four cores
(Intel i5 - Windows 10 - Python 3.6), the script completes in 25 seconds,
whereas the sequential implementation requires approx. 55 seconds. This
disparity would only grow as the range of evaluation increases from 1M to 5M
or 10M.

For more information on the `multiprocessing`

module, be sure to check out the
documentation.
In addition, the Python Standard Library includes the `concurrent.futures`

module, which exposes an even higher-level interface that facilitates both
thread and process-based parallelism via `Executor`

objects. Until next time,
happy coding!