Parallel Runs ============= Almost all computers are parallel and have multiple cores available. In a polynomial homotopy, all solution paths can be computed independently from each other. :: from phcpy.dimension import get_core_count nbcores = get_core_count() nbcores In the experiment we use the cyclic 7-roots problem. :: from phcpy.families import cyclic c7 = cyclic(7) for pol in c7: print(pol) which shows the polynomials :: x0 + x1 + x2 + x3 + x4 + x5 + x6; x0*x1 + x1*x2 + x2*x3 + x3*x4 + x4*x5 + x5*x6 + x6*x0; x0*x1*x2 + x1*x2*x3 + x2*x3*x4 + x3*x4*x5 + x4*x5*x6 + x5*x6*x0 + x6*x0*x1; x0*x1*x2*x3 + x1*x2*x3*x4 + x2*x3*x4*x5 + x3*x4*x5*x6 + x4*x5*x6*x0 + x5*x6*x0*x1 + x6*x0*x1*x2; x0*x1*x2*x3*x4 + x1*x2*x3*x4*x5 + x2*x3*x4*x5*x6 + x3*x4*x5*x6*x0 + x4*x5*x6*x0*x1 + x5*x6*x0*x1*x2 + x6*x0*x1*x2*x3; x0*x1*x2*x3*x4*x5 + x1*x2*x3*x4*x5*x6 + x2*x3*x4*x5*x6*x0 + x3*x4*x5*x6*x0*x1 + x4*x5*x6*x0*x1*x2 + x5*x6*x0*x1*x2*x3 + x6*x0*x1*x2*x3*x4; x0*x1*x2*x3*x4*x5*x6 - 1; The mixed volume provides a generically sharp upper bound on the number of isolated solutions. :: from phcpy.volumes import mixed_volume mixed_volume(c7) For this problem, there are as many solutions as the mixed volume. So, there are ``924`` paths to track. :: from phcpy.solver import solve speeding up the solver on multiple cores ---------------------------------------- To measure the speedup, the elapsed time between the start and the end of the run has to be computed. The most honest time measurement is the *wall clock time* which as suggested uses the time on the wall clock. The timers provided by Python do not measure the CPU time of compiled code that is executed by the solver. :: from datetime import datetime Then the code cell below does a run on one core and prints the elapsed wall clock time. :: timestart = datetime.now() s = solve(c7) timestop = datetime.now() elapsed_onecore = timestop - timestart print('elapsed wall clock time on 1 core :', elapsed_onecore) We check on ``len(s)`` whether we have as many solutions as the mixed volume. Now we solve again, using all available cores, computed earlier in ``nbcores``. :: timestart = datetime.now() s = solve(c7, tasks=nbcores) timestop = datetime.now() elapsed_manycores = timestop - timestart print('elapsed wall clock time on', nbcores, 'cores:', elapsed_manycores) We observe the reduction in the elapsed wall clock time and compute the speedup as follows. :: speedup = elapsed_onecore/elapsed_manycores speedup quality up ---------- Can multithreading compensate for the overhead of double double arithmetic? If we can afford the time for a sequential run, by how much can we increase the precision in a multithreaded run in the same time or less? The code cell below computes all solutions in double double precision, on all avaible cores. :: timestart = datetime.now() s = solve(c7, tasks=nbcores, precision='dd') timestop = datetime.now() elapsed = timestop - timestart print('elasped wall clock time on', nbcores, 'cores :', elapsed) Again, we check whether we have as many solutions as the mixed volume. If ``elapsed < elapsed_onecore`` evaluates to ``True``, then we have achieved quality up in the multicore run. With the multicore run, we compensate for the cost overhead of double double arithmetic, if the elapsed wall clock time on many cores in double double precision is less than the run on one core in double precision."