Java collections faster than C++ containers?

Question

I was reading the comments on this answer and I saw this quote: "Object instantiation and object-oriented features are blazing fast to use (faster than C++ in many cases) because they're designed in from the beginning." and "Collections are fast. Standard Java beats standard C/C++ in this area, even for most optimized C code." One user (with really high rep

Accepted Answer

This sort of statement is ridiculous; people making it are either incredibly uninformed, or incredibly dishonest. In particular:

The speed of dynamic memory allocation in the two cases will depend on the pattern of dynamic memory use, as well as the implementation. It is trivial for someone familiar with the algorithms used in both cases to write a benchmark proving whichever one he wanted to be faster. (Thus, for example, programs using large, complex graphs that are built, then torn down and rebuilt, will typically run faster under garbage collection. As will programs that never use enough dynamic memory to trigger the collector. Programs using few, large, long-lived allocations will often run faster with manual memory management.)

When comparing the collections, you have to consider what is in the collections. If you're comparing large vectors of double, for example, the difference between Java and C++ will likely be slight, and could go either way. If you're comparing large vectors of Point, where Point is a value class containing two doubles, C++ will probably blow Java out of the water, because it uses pure value semantics (with no additional dynamic allocation), whereas Java needs to dynamically allocate each Point (and no dynamic allocation is always faster than even the fastest dynamic allocation). If the Point class in Java is correctly designed to act as a value (and thus immutable, like java.lang.String), then doing a translation on the Point in a vector will require a new allocation for every Point; in C++, you could just assign.

Much depends on the optimizer. In Java, the optimizer works with perfect knowledge of the actual use cases, in this particular run of the program, and perfect knowledge of the actual processor it is running on, in this run.
In C++, the optimizer must work with data from a profiling run, which will never correspond exactly to any one run of the program, and the optimizer must (usually) generate code that will run (and run quickly) on a wide variety of processor versions. On the other hand, the C++ optimizer may take significantly more time analysing the different paths (and effective optimization can require a lot of CPU); the Java optimizer has to be fairly quick.

Finally, although not relevant to all applications, C++ can be single threaded. In which case, no locking is needed in the allocator, which is never the case in Java.

With regard to the two numbered points: C++ can use more or less the same algorithms as Java in its heap allocator. I've used C++ programs where the ::operator delete() function was empty, and the memory was garbage collected. (If your application allocates lots of short-lived, small objects, such an allocator will probably speed things up.) And as for the second: the really big advantage C++ has is that its memory model doesn't require everything to be dynamically allocated. Even if allocation in Java takes only a tenth of the time it would take in C++ (which could be the case, if you only count the allocation, and not the time needed for the collector sweeps), with large vectors of Point, as above, you're comparing two or three allocations in C++ with millions of allocations in Java.

And finally: "why is Java's heap allocation so much faster?" It isn't, necessarily, if you amortise the time for the collection phases. The time for the allocation itself can be very cheap, because Java (or at least most Java implementations) uses a relocating collector, which results in all of the free memory being in a single contiguous block. This is at least partially offset by the time needed in the collector: to get that contiguity, you've got to move data, which means a lot of copying.
In most implementations, it also means an additional indirection in the pointers, and a lot of special logic to avoid issues when one thread has the address in a register, or such.
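
To make the vectors-of-Point comparison above concrete, here is a minimal C++ sketch that only counts allocations; it does not measure time. The names `Point`, `CountingAllocator`, `allocations_for_values` and `allocations_for_boxed` are made up for this illustration. The "boxed" variant uses `std::shared_ptr` as a rough stand-in for a container of references to individually allocated, immutable objects. That is an assumption about layout only, not a model of how any JVM actually allocates or collects.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical value class from the discussion: just two doubles.
struct Point {
    double x;
    double y;
};

// Global counter bumped on every allocate() call made through CountingAllocator.
static std::size_t g_allocations = 0;

// Minimal allocator that forwards to std::allocator and counts requests.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Value semantics: the Points live inside the vector's single buffer.
std::size_t allocations_for_values(std::size_t n) {
    const std::size_t before = g_allocations;
    std::vector<Point, CountingAllocator<Point>> pts;
    pts.reserve(n);  // one buffer for all n Points
    for (std::size_t i = 0; i < n; ++i) {
        pts.push_back(Point{static_cast<double>(i), static_cast<double>(i)});
    }
    // Translating every Point is a plain in-place update: no allocation.
    for (Point& p : pts) {
        p.x += 1.0;
        p.y += 1.0;
    }
    return g_allocations - before;
}

// Reference-style layout: one separately allocated object per element.
std::size_t allocations_for_boxed(std::size_t n) {
    using Boxed = std::shared_ptr<Point>;
    const std::size_t before = g_allocations;
    std::vector<Boxed, CountingAllocator<Boxed>> pts;
    pts.reserve(n);  // buffer of pointers only
    for (std::size_t i = 0; i < n; ++i) {
        pts.push_back(std::allocate_shared<Point>(
            CountingAllocator<Point>{},
            Point{static_cast<double>(i), static_cast<double>(i)}));
    }
    // Treating Point as immutable: translating means building a new object
    // for each element and dropping the old one.
    for (Boxed& p : pts) {
        p = std::allocate_shared<Point>(CountingAllocator<Point>{},
                                        Point{p->x + 1.0, p->y + 1.0});
    }
    return g_allocations - before;
}
```

On a typical implementation, `allocations_for_values(n)` should report a single allocation (the reserved buffer), while `allocations_for_boxed(n)` grows with `n`: one allocation per Point when the vector is filled, and another per Point for the translation. How much each of those allocations costs is a separate question, and that depends on the allocator or collector in use.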

Answer