Python very slow as compared to Java for this algorithm

Tags: , ,



I’m studying algorithms and decided to port the Java Programs from the textbook to Python, since I dislike the Java overhead, especially for small programs, and as an exercise.

The algorithm itself is very simple, It just takes all triplets out of an array, in a bruteforce kinda way, and counts how many of the triplets sum up to zero (eg: [-2,7,-5])

 public static int count(int[] a) {
        int N = a.length;
        int cnt = 0;
        for (int i = 0; i < N; i++) {
            for (int j = i+1; j < N; j++) {
                for (int k = j+1; k < N; k++) {
                    if (a[i] + a[j] + a[k] == 0) {
                        cnt++;
                    }
                }
            }
        }
        return cnt;
    } 

I ported it to :

def count(a):
    cnt = 0
    ln = len(a)
    for i in xrange(0,ln): 
        for j in xrange(i + 1,ln):
            for k in xrange(j + 1,ln): 
                if a[i] + a[j] + a[k] == 0:
                    cnt+=1
    return cnt

Now measuring just these functions are taking :

java :   array of 2000 elements --> 3 seconds 
python : array of 2000 elements --> 2 minutes, 19 seconds

UPDATE 
python (pypy) : array of 2000 elements --> 4 seconds ( :-) )

Of course this is not a good algorithm, it just goes to show, both here and in the textbook. I have done some programming both in Java and Python before, but was not aware of this huge difference.

The question boils down to : how te overcome this? More specifically :

  1. Is this code a good port, or am I missing something trivial?
  2. Is switching to another runtime Jython for example a solution? Is it easy to keep my codebase in eclipse and just add an interpreter (compiler?) ? Or will switching to another interpreter/compiler only make things slightly better?

Right now I am using python 2.7.3 and Java 1.7 32ibts on windows 7.

I know there are similar questions out there on SO about java/python performance, but the answers like there are different runtime environments for python out there are not helpfull for me at the moment.

What I want to know is if some of these runtimes can close this huge gap and are worth epxloring?

UPDATE :

I installed pypy and the differences now are enormous…

UPDATE 2 :

Some very interesting things I noticed : the islice method in an answer here is faster on ‘regular’ python, but a lot slower on pypy. Even so, pypy still remains a lot faster using no matter it uses regular loops or islices in this algoritm

As Bakuriu notices in a remark runtime environments can matter a whole lot, but a runtime environment faster for this algoritm is not necessarily faster for any algoritm…

Answer

Try running it with PyPy instead of CPython. It will very likely go much faster.



Source: stackoverflow