Coding conventions - how to do specific things in Python

Started by Ryex, October 20, 2011, 06:11:29 pm


Ryex

Python is a bit strange as a language. In many ways it is very similar to Ruby, and yet at its core it is fundamentally different. As such, many of the ways you would achieve things in Ruby or another language will either not work or not be the fastest in Python.

Here I will record specific cases and how they should be dealt with in Python.

Looping a specific number of times:

In Ruby you can do this:

a = 10
a.times {|i| do_somthing(i)}


Python has no equivalent of times, so to do this you must use a for loop over something of that length: a list, a tuple, or a lazy range.


for i in xrange(10):
    do_somthing(i)


I use xrange because it produces its numbers lazily, one at a time. This is better than range, because range actually builds the full list up front (this applies to Python 2; in Python 3, range is already lazy and xrange is gone). You won't really see any speed increase in normal usage like this case. The benefit is that if for some reason you break out of the loop early, the rest of the numbers are never created, and if you are using a very large range (say 1,000,000 iterations) the memory consumption is much lower because you are only creating one number at a time. Iteration also starts immediately instead of waiting for the whole list to be built.
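Here is a small sketch of the early-exit and memory points. It assumes Python 3, where range is lazy in the same way xrange is in Python 2 (on Python 2, swap in xrange). The function first_multiple and its arguments are hypothetical names used only for illustration.

```python
import sys

def first_multiple(n, limit=1000000):
    """Return the first positive multiple of n below limit, or None."""
    # range() is lazy here: numbers past the one we stop at are never produced
    for i in range(1, limit):
        if i % n == 0:
            return i
    return None

lazy = range(1000000)
full = list(lazy)  # roughly what Python 2's range() builds up front

# The lazy object stays small; the list holds a million references
lazy_size = sys.getsizeof(lazy)
full_size = sys.getsizeof(full)
```

Comparing lazy_size and full_size on your own machine shows the gap in memory between the two approaches.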

Finding the difference between two lists:
In Ruby you can do this:

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [3, 4, 6, 8]
c = a - b
c => [1, 2, 5, 7, 9, 10]


But in Python the same thing raises a TypeError, because lists do not have a - operator (they have a + but no -, weird I know).
So to get the difference you have two options.

Use the set type:

c = list(set(a) - set(b))

Or use a list comprehension (about the same speed as a for loop where you add each item individually):

c = [x for x in a if x not in b]

The first will be FAR faster than the second on large lists, but it's still pretty slow, it won't preserve list order, and it drops duplicate items. The second can be 10 times slower than the first or more, because x not in b has to scan the list b for every item in a, but it will preserve list order.
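To make the trade-off concrete, here is a short sketch, assuming Python 3. The lists are made-up sample data with a deliberate duplicate and a non-sorted order, and the variable names are only for illustration.

```python
a = [10, 9, 1, 2, 2, 3, 4]
b = [3, 4]

# Lists do not support subtraction, so this raises TypeError
try:
    a - b
    minus_works = True
except TypeError:
    minus_works = False

# Set difference: order is not guaranteed and the duplicate 2 is dropped
by_set = list(set(a) - set(b))

# List comprehension: order and duplicates are both kept
by_comprehension = [x for x in a if x not in b]
```

Since a set has no defined order, by_set should only be compared after sorting it.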
Ironically, if we combine these two methods we get something that is FASTER than the first AND preserves list order.

s = set(b)
c = [x for x in a if x not in s]

This third method is faster than the first and stays that way across item types, whereas the first gets slower as the item type gets more complex, i.e. numbers to strings to objects.
Thus this third, combined method should be used to get the difference between two lists.
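Wrapped up as a reusable function, the combined method might look like the sketch below. The name list_difference is a hypothetical helper, not something from the standard library, and it assumes the items in b are hashable so they can go into a set. The timeit calls are there so you can check the speed claims on your own machine; the numbers will vary with list size and item type.

```python
import timeit

def list_difference(a, b):
    """Return the items of a that are not in b, keeping a's order."""
    exclude = set(b)  # built once; membership tests are O(1) on average
    return [x for x in a if x not in exclude]

a = list(range(2000))
b = list(range(0, 2000, 2))

result = list_difference(a, b)

# Time each of the three approaches on the same data
t_set = timeit.timeit(lambda: list(set(a) - set(b)), number=5)
t_list = timeit.timeit(lambda: [x for x in a if x not in b], number=5)
t_combined = timeit.timeit(lambda: list_difference(a, b), number=5)
```

Printing t_set, t_list, and t_combined gives a rough comparison; for a serious benchmark, use lists shaped like your real data.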