PHP vs Python array memory allocation

I’ve been writing some import scripts for a system at work that will take data from CSV, XLS or XML files, or from a database (via Zend_Db), then store that data in a common format for us to manipulate and use for generating charts and tables.

I quickly ran into a problem using PHP: the memory limit. If I import an Excel file or database with over 25,000 rows of data, I soon hit my memory limit (which on my dev box is set at 256MB!).

At first I looked for problems in my classes, then for memory leaks related to various PHP bugs. In the end, although I’d managed to cut memory usage down by a quarter, the app still used way too much memory.

I decided to write a script to test how much memory PHP needed just to store this data in a plain array, without my class-based data sets. I used the following script:

$data = array();
for ($i = 0; $i < 10000; $i++) {
    $data[] = array('one', 'two', 'three', 'four', 'five', 'six');
}

The result: 19MB!

If I compare the same sort of data structure in Python:

data = []
count = 0

while count < 10000:
    data.append(['one', 'two', 'three', 'four', 'five', 'six'])
    count = count + 1

The result: 1.39MB
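For anyone who wants to reproduce the Python side, here is a minimal sketch of one way to measure it. It uses the standard library's tracemalloc module, which is an assumption on my part and not necessarily how the figure above was obtained, so the numbers it reports may differ. The function names build_rows and measure are made up for this example.

```python
import tracemalloc


def build_rows(n):
    """Build n rows, each a fresh six-element list, like the loop above."""
    data = []
    for _ in range(n):
        data.append(['one', 'two', 'three', 'four', 'five', 'six'])
    return data


def measure(n=10000):
    """Return (rows, current_bytes, peak_bytes) traced while building the rows."""
    tracemalloc.start()
    data = build_rows(n)
    # get_traced_memory() returns (current, peak) in bytes since start()
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return data, current, peak


if __name__ == "__main__":
    data, current, peak = measure()
    mb = 1024 * 1024
    print(f"rows: {len(data)}")
    print(f"current: {current / mb:.2f} MB, peak: {peak / mb:.2f} MB")
```

One thing worth keeping in mind when reading the result: in CPython the six string literals are constants that every row refers to, so each row is a new list but the strings themselves are not copied 10,000 times. Real imported data, where every value is a distinct string, would take more memory than this test suggests.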

The difference is huge. Unfortunately, I think I’m going to have to use a different programming language to handle this part of the project, because I don’t think PHP is up to the task. I am a huge fan of PHP and use it for most things, but even allowing for its need to remain flexible, I can’t see why it needs so much memory!

I also produced a graph, because I thought it would be pretty:

[Chart: PHP vs Python memory usage benchmark]