I understand there is overhead when using the multiprocessing module, but this seems like an unusually high amount, and the level of IPC should be fairly low as far as I can gather.
Say I generate a large-ish list of random numbers between 1 and 1000 and want to obtain a list of the primes among them. This code is meant to test multiprocessing on CPU-intensive tasks, so ignore the overall inefficiency of the primality test.
The bulk of the code looks like this:
```python
from random import SystemRandom
from math import sqrt
from timeit import default_timer
from time import time
from multiprocessing import Pool, Process, Manager, cpu_count

rdev = SystemRandom()
num_cnt = 0x5000
nums = [rdev.randint(0, 1000) for _ in range(num_cnt)]
primes = []

def chunk(l, n):
    # Split l into n roughly equal parts; the last part absorbs the remainder.
    i = int(len(l) / float(n))
    for j in range(0, n - 1):
        yield l[j*i:j*i+i]
    yield l[n*i-i:]

def is_prime(n):
    if n <= 2:
        return True
    if not n % 2:
        return False
    for i in range(3, int(sqrt(n)) + 1, 2):
        if n % i == 0:
            return False
    return True
```
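As a quick illustration (a small check of my own, assuming the definitions above), chunk() yields n roughly equal slices, with the last slice absorbing any remainder:

```python
# Hypothetical demo of chunk(): a 10-element list split 3 ways.
data = list(range(10))
print([len(part) for part in chunk(data, 3)])  # -> [3, 3, 4]
```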
It seems to me that I should be able to split this work among multiple processes. I have 8 logical cores, so I should be able to use cpu_count() as the number of processes.
The serial version:
```python
def serial():
    global primes
    primes = []
    for num in nums:
        if is_prime(num):
            primes.append(num)  # primes will contain the values
```
The following sizes of num_cnt correspond to the following speeds:
- 0x500 = 0.00100 sec.
- 0x5000 = 0.01723 sec.
- 0x50000 = 0.27573 sec.
- 0x500000 = 4.31746 sec.
This is the way I chose to do the multiprocessing. It uses the chunk() function to split nums into cpu_count() (roughly equal) parts. It passes each chunk into a new process, which iterates through it and then assigns its result to one entry of a shared dict variable. The IPC should really only occur when I assign the value to the shared variable. Why would it occur otherwise?
```python
def loop(ret, id, numbers):
    # Find the primes in this chunk locally, then write the result
    # to the shared dict in a single assignment.
    l_primes = []
    for num in numbers:
        if is_prime(num):
            l_primes.append(num)
    ret[id] = l_primes

def parallel():
    man = Manager()
    ret = man.dict()
    num_procs = cpu_count()
    procs = []
    for i, l in enumerate(chunk(nums, num_procs)):
        p = Process(target=loop, args=(ret, i, l))
        p.daemon = True
        p.start()
        procs.append(p)
    [proc.join() for proc in procs]
    return sum(ret.values(), [])
```
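For comparison, here is a minimal sketch of a lower-overhead variant using multiprocessing.Pool, assuming the Manager proxy is a contributor to the cost; each worker's result comes back once through the pool's own pickling channel rather than through a proxied dict. loop_local() and parallel_pool() are names of my own, not from the code above:

```python
from multiprocessing import Pool, cpu_count

def loop_local(numbers):
    # Build the per-chunk result locally; it is pickled back to the
    # parent once, when the worker function returns.
    return [num for num in numbers if is_prime(num)]

def parallel_pool():
    # chunk(), nums, and is_prime() are assumed defined as above.
    with Pool(cpu_count()) as pool:
        results = pool.map(loop_local, chunk(nums, cpu_count()))
    return sum(results, [])
```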
Again, I expect some overhead, but the time seems to be increasing exponentially faster than in the serial version.
- 0x500 = 0.37199 sec.
- 0x5000 = 0.91906 sec.
- 0x50000 = 8.38845 sec.
- 0x500000 = 119.37617 sec.
What could be causing this? Is it IPC? The initial setup made me expect some overhead, but this is an insane amount.
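One way to separate the fixed process-startup cost from the per-item cost is to time empty worker processes; a minimal sketch (noop() is a placeholder of my own, not from the question):

```python
from multiprocessing import Process
from time import time

def noop():
    pass

if __name__ == '__main__':
    t1 = time()
    procs = [Process(target=noop) for _ in range(8)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Roughly the price of spawning and joining 8 processes alone.
    print("Took {:.05f} sec.".format(time() - t1))
```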
Edit:

Here's how I'm timing the execution of the functions:
```python
if __name__ == '__main__':
    print(hex(num_cnt))
    for func in (serial, parallel):
        t1 = time()
        vals = func()
        t2 = time()
        if vals is None:  # serial has no return value
            print(len(primes))
        else:  # parallel
            print(len(vals))
        print("Took {:.05f} sec.".format(t2 - t1))
```
The same list of numbers is used each time.

Example output:
```
0x5000
3442
Took 0.01828 sec.
3442
Took 0.93016 sec.
```
Hmm. How do you measure the time? On my computer, the parallel version is faster than the serial one.
I'm measuring it using time.time(), this way: if we assume tt is an alias for time.time():

```python
t1 = int(round(tt() * 1000))
serial()
t2 = int(round(tt() * 1000))
print(t2 - t1)
parallel()
t3 = int(round(tt() * 1000))
print(t3 - t2)
```
For the 0x500000 input, I get:

- 5519 ms for the serial version
- 3351 ms for the parallel version
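As an aside, timeit.default_timer (already imported in the question's code) is generally a safer interval timer than time.time(), since it uses the highest-resolution clock available; a minimal sketch:

```python
from timeit import default_timer

t1 = default_timer()
serial()  # assumed defined as in the question
t2 = default_timer()
print("{:.0f} ms".format((t2 - t1) * 1000))
```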
I believe your mistake is caused by the inclusion of the number generation process inside the parallel timing, but not inside the serial one. On my computer, the generation of the random numbers alone takes 4-5 seconds (it's a slow process). So that can explain the difference between the two values; I don't think your computer uses a different architecture.
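To verify that on your own machine, you could time the generation step in isolation; a minimal sketch of my own (the PRNG comparison is an addition, and exact numbers will vary):

```python
from random import SystemRandom, Random
from time import time

num_cnt = 0x500000

rdev = SystemRandom()  # reads the OS entropy source: comparatively slow
t1 = time()
nums = [rdev.randint(0, 1000) for _ in range(num_cnt)]
print("SystemRandom: {:.02f} sec.".format(time() - t1))

prng = Random()  # Mersenne Twister: much faster for bulk generation
t1 = time()
nums = [prng.randint(0, 1000) for _ in range(num_cnt)]
print("Random:       {:.02f} sec.".format(time() - t1))
```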