Your list implementation uses (depending on your architecture) 8 bytes per list element.
>>> import sys >>> b = [False] * 100001 >>> sys.getsizeof(b) 800064
Note: This is just the memory of the list structure itself. In general, the contents of the list will use additional memory. In the "original" version, this would be pointers to two integer constants (
1); in the "original_improved" version, it would be storing pointers to the two boolean singletons (
True). Since we're only storing references to two objects in each case, this additional memory is insignificant, so we'll ignore it.
800kB of memory is not a huge amount, but to be polite, we can reduce it:
import array def aneufeld_array(a): b = array.array('b', (0,)) * (len(a) + 1) for num in a: if b[num]: return num b[num] = 1 return -1
bytearray is a better choice than
def aneufeld_bytearray(a): b = bytearray(len(a) + 1) for num in a: if b[num]: return num b[num] = 1 return -1
bytearray(size) creates a tightly packed of bytes. Unlike lists, which can store different kinds of things in each element of the list, the bytearray, as its name implies, will only store bytes.
With this new structure, we now use only 1 byte per flag, so around 100kB of memory:
>>> b = bytearray(100001) >>> sys.getsizeof(b) 100058
Performance wise, this solution is close to the original speed, so we're not giving up any significant speed while reducing the memory load to around 12.5%.
We can still do better. Using an entire byte for a single flag is wasteful; we can squeeze 8 flags into each byte. The
bitarray class does all of the heavy lifting for us:
import bitarray def aneufeld_bitarray(a): b = bitarray.bitarray(len(a) + 1) b.setall(False) for num in a: if b[num]: return num b[num] = True return -1
This gets us down to 12.5kB for the bit-array of flags. Unfortunately, this additional memory optimization comes with an additional speed hit, due to the bit packing and unpacking. The performance is still better than "Sriv_improved" performance, and we're using only 1/64th of the original memory requirement.
Timing, on my machine:
4.94 ms 4.62 ms 4.55 ms original 3.89 ms 3.85 ms 3.84 ms original_improved 20.05 ms 20.03 ms 19.78 ms hjpotter92 9.59 ms 9.69 ms 9.75 ms Reinderien 8.60 ms 8.68 ms 8.75 ms Reinderien_improved 19.69 ms 19.69 ms 19.40 ms Sriv 13.92 ms 13.99 ms 13.98 ms Sriv_improved 6.84 ms 6.84 ms 6.86 ms aneufeld_array 4.76 ms 4.80 ms 4.77 ms aneufeld_bytearray 12.71 ms 12.65 ms 12.57 ms aneufeld_bitarray