Notes
Check the major differences between Python 2 and Python 3 in What’s New in Python 3.0:
- Views and iterators instead of lists
- PEP 238: changing the behavior of division
- All text is Unicode
 
REPL (read-eval-print loop) and doctests; writing tests first, as TDD suggests.
“Dunder methods”: double-underscore methods, e.g. __getitem__.
Use collections.namedtuple to construct a simple class to represent individual cards:
    Card = collections.namedtuple('Card', ['rank', 'suit'])
Python Data Models
- Use len(some_object) rather than some_object.len().
- Implement __getitem__ to make an object iterable, and therefore usable with much of the standard library.
- The CPython interpreter calls the __len__ method for user-defined classes, but takes a shortcut for built-in objects like list and str by reading ob_size from the PyVarObject C struct.
- Avoid creating arbitrary, custom attributes with the __foo__ syntax, because such names may acquire special meanings in the future, even if they are unused today.
- If we did not implement __repr__, Vector instances would be shown in the console like <Vector object at 0x10e100070>. The string returned by __repr__ should be unambiguous and, if possible, match the source code necessary to recreate the object being represented. __str__ should return a string suitable for display to end users. Check the difference between __str__ and __repr__.
- By default, instances of user-defined classes are considered truthy, unless either __bool__ or __len__ is implemented.
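The points above can be sketched with a small class. This is only an illustration: the Deck class, its attribute names and the card layout are assumptions made for the example, not code from the notes.

```python
import collections

Card = collections.namedtuple('Card', ['rank', 'suit'])


class Deck:
    """Hypothetical card deck used only to illustrate special methods."""

    ranks = [str(n) for n in range(2, 11)] + list('JQKA')
    suits = 'spades diamonds clubs hearts'.split()

    def __init__(self, cards=None):
        if cards is None:
            cards = [Card(rank, suit) for suit in self.suits for rank in self.ranks]
        self._cards = list(cards)

    def __len__(self):
        # Called by len(deck); also decides truthiness because __bool__ is absent.
        return len(self._cards)

    def __getitem__(self, position):
        # Enables indexing, slicing, iteration and the `in` operator.
        return self._cards[position]

    def __repr__(self):
        # Unambiguous, and valid source code for recreating the object.
        return 'Deck({!r})'.format(self._cards)


deck = Deck()
print(len(deck))                     # 52
print(deck[0])                       # Card(rank='2', suit='spades')
print(Card('Q', 'hearts') in deck)   # True, via the __getitem__ fallback
print(bool(Deck([])))                # False, because __len__ returns 0
```

Note that the class never defines __iter__ or __contains__; Python falls back to __getitem__ for both.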
Data structures
Array of sequences
Strings, lists, byte sequences, arrays, XML elements and database results share a rich set of common operations including iteration, slicing, sorting and concatenation.
Container sequences (list, tuple and collections.deque) hold references to the objects they contain, which may be of any type, while flat sequences (str, bytes, bytearray, memoryview and array.array) physically store the value of each item within their own memory space, and not as distinct objects.
Mutable sequences: list, bytearray, array.array, collections.deque and memoryview. Immutable sequences: tuple, str and bytes.
List comprehension = listcomp, generator expression = genexp.
Use a listcomp only when you are building a list; for anything else, a plain for loop is more readable and less prone to unnoticed side effects.
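A minimal sketch of the difference between the two forms, using made-up sample data: a listcomp builds the whole list at once, while a genexp feeds items one by one to whatever consumes it.

```python
symbols = 'abc'

# Listcomp: the result is a list, built fully in memory.
codes = [ord(ch) for ch in symbols]

# Genexp: items are produced lazily, so no intermediate list is created
# when the consumer is another constructor or an aggregate function.
codes_tuple = tuple(ord(ch) for ch in symbols)
total = sum(ord(ch) for ch in symbols)

print(codes)        # [97, 98, 99]
print(codes_tuple)  # (97, 98, 99)
print(total)        # 294
```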
In Python code, line breaks are ignored inside pairs of [], {} or (), so you can build multi-line lists, listcomps, genexps, dictionaries etc. without using the ugly \ line continuation escape.
Use tuples as records, where the position of each item is vital to its meaning.
    traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567'),
                    ('ESP', 'XDA205856')]
Unpacking: tuple unpacking is widely used in Python. Python also provides extended iterable unpacking, where we use * to grab excess items.
    >>> a, b, *rest = range(5)
    >>> a, b, rest
    (0, 1, [2, 3, 4])
    >>> a, b, *rest = range(3)
    >>> a, b, rest
    (0, 1, [2])
    >>> a, b, *rest = range(2)
    >>> a, b, rest
    (0, 1, [])
Nested tuple unpacking: the tuple that receives an expression to unpack can have nested tuples, like (a, b, (c, d)), and Python will do the right thing if the expression matches the nesting structure.
Named tuples: the collections.namedtuple function is a factory that produces subclasses of tuple enhanced with field names and a class name, which helps debugging.
    >>> from collections import namedtuple
    >>> City = namedtuple('City', 'name country population coordinates')
    >>> tokyo = City('Tokyo', 'JP', 36.933, (35.689722, 139.691667))
    >>> tokyo
    City(name='Tokyo', country='JP', population=36.933, coordinates=(35.689722, 139.691667))
Other useful attributes of a named tuple: _fields, _make and _asdict.
    >>> City._fields
    ('name', 'country', 'population', 'coordinates')
    >>> LatLong = namedtuple('LatLong', 'lat long')
    >>> delhi_data = ('Delhi NCR', 'IN', 21.935, LatLong(28.613889, 77.208889))
    >>> delhi = City._make(delhi_data)
    >>> delhi._asdict()
    OrderedDict([('name', 'Delhi NCR'), ('country', 'IN'), ('population',
    21.935), ('coordinates', LatLong(lat=28.613889, long=77.208889))])
Since Python 3.8, _asdict() returns a regular dict instead of an OrderedDict.
Slice objects: slice(start, stop, step) can be used to get sliced sequences from a list; most importantly, it lets us give a slice a name.
Assigning to a slice replaces the corresponding items of the sequence.
    >>> l = list(range(10))
    >>> l
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> l[2:5] = [20, 30]
    >>> l
    [0, 1, 20, 30, 5, 6, 7, 8, 9]
    >>> del l[5:7]
    >>> l
    [0, 1, 20, 30, 5, 8, 9]
    >>> l[3::2] = [11, 22]
Beware of expressions like a * n when a is a sequence containing mutable items.
    >>> l = [[1], [2], [3]]
    >>> b = l * 5
    >>> b[2][0] = 10
    >>> b
    [[1], [2], [10], [1], [2], [10], [1], [2], [10], [1], [2], [10], [1], [2], [10]]
Something weird happens with a list inside a tuple; see the example below.
    >>> t = (1, 2, [30, 40])
    >>> t[2] += [50, 60]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
    >>> t
    (1, 2, [30, 40, 50, 60])
Takeaways from the example:
- Putting mutable items in tuples is not a good idea.
- Augmented assignment is not an atomic operation: we just saw it throw an exception after doing part of its job.
 
Python API convention: functions or methods that change an object in place should return None to make it clear to the caller that the object itself was changed, and no new object was created.
Because they return None, such methods do not naturally support a fluent interface. A fluent interface is normally implemented by using method chaining to implement method cascading (in languages that do not natively support cascading), concretely by having each method return this (self).
bisect.insort: inserts item into seq so as to keep seq in ascending order.
array.array is better than list when all items in the sequence are numbers. An array is backed by a C array and supports all mutable sequence operations (pop, insert, extend), plus additional methods for fast saving and loading (fromfile, tofile). Notably, loading 10 million floating-point numbers with fromfile is about 60 times faster than parsing them from a text file, because the data is written and read as binary (the binary file is also 50-60% smaller than the text file).
Memory views: the built-in memoryview is a shared-memory sequence type that lets you handle slices of arrays without copying bytes. It is inspired by the NumPy library and is useful when dealing with large datasets.
deque: a double-ended queue, optimized for inserting and removing elements at both ends.
queue provides the synchronized (i.e. thread-safe) classes Queue, LifoQueue and PriorityQueue. These are used for safe communication between threads. The multiprocessing package has its own queue, designed for inter-process communication, and asyncio also implements similar queues, adapted for managing tasks in asynchronous programming.
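A short sketch of three of the tools mentioned above (bisect.insort, deque and array with memoryview). The variable names and sample values are made up for illustration.

```python
import bisect
from array import array
from collections import deque

# bisect.insort keeps an already-sorted list sorted on each insertion.
scores = [10, 20, 40]
bisect.insort(scores, 30)
print(scores)            # [10, 20, 30, 40]

# A deque with maxlen discards items from the opposite end when it is full.
recent = deque(range(5), maxlen=5)
recent.appendleft(-1)    # pushes 4 out on the right
recent.rotate(1)         # moves the rightmost item to the left end
print(recent)            # deque([3, -1, 0, 1, 2], maxlen=5)

# array stores packed C doubles ('d') instead of Python float objects.
floats = array('d', [1.5, 2.5, 3.5])
# memoryview shares the array's buffer: writing through it changes the array.
view = memoryview(floats)
view[0] = 9.0
print(floats)            # array('d', [9.0, 2.5, 3.5])
```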
Dictionaries and Sets
- Dictionaries are everywhere in Python. The built-in functions live in __builtins__.__dict__.
- Dictionary keys must be hashable, i.e. implement the __hash__ method. The atomic immutable types (str, bytes, numeric types) are all hashable. A frozenset (the built-in set type that cannot be changed after creation) is always hashable, because its elements must be hashable by definition. A tuple is hashable only if all its items are hashable.
- Dict comprehension (dictcomp): builds a dict instance by producing key:value pairs from any iterable.
- Every Pythonista knows that d.get(k, default) is an alternative to d[k] whenever a default value is more convenient than handling KeyError. However, I didn’t know that before reading this paragraph…
- setdefault:
    my_dict.setdefault(key, []).append(new_value)
    # is equivalent to
    if key not in my_dict:
        my_dict[key] = []
    my_dict[key].append(new_value)
- Use defaultdict, or another mapping that implements __missing__, to properly handle missing keys on lookup.
- A search like k in my_dict.keys() is efficient in Python 3 even for very large mappings, because dict.keys() returns a view, which is similar to a set, and containment checks in sets are as fast as in dicts. Check this for explanations about dictionary views.
- Other types of mapping:
  - collections.OrderedDict: maintains keys in insertion order, allowing iteration over items in a predictable order;
  - collections.ChainMap: holds a list of mappings which can be searched as one. The lookup is performed on each mapping in order, and succeeds if the key is found in any of them;
  - collections.Counter: a mapping that holds an integer count for each key;
  - collections.UserDict: a pure Python implementation of a mapping that works like a standard dict, usually designed to be subclassed.
- We should subclass UserDict rather than the built-in dict, because dict takes implementation shortcuts that can cause problems in subclasses. Here is a comparison of a new dictionary class built on dict and on UserDict:
    class StrKeyDict0(dict):

        def __missing__(self, key):
            if isinstance(key, str):
                raise KeyError(key)
            return self[str(key)]

        def get(self, key, default=None):
            try:
                return self[key]
            except KeyError:
                return default

        def __contains__(self, key):
            return key in self.keys() or str(key) in self.keys()
    import collections

    class StrKeyDict(collections.UserDict):

        def __missing__(self, key):
            if isinstance(key, str):
                raise KeyError(key)
            return self[str(key)]

        def __contains__(self, key):
            return str(key) in self.data

        def __setitem__(self, key, item):
            self.data[str(key)] = item
- types.MappingProxyType builds a read-only but dynamic view of a dictionary, which prevents careless updates to it.
- The set class implements infix operators for union, intersection, difference etc., which can replace explicit loops and so reduce running time. Here is an example that counts the occurrences of needles in haystack (both must be sets):
    found = len(needles & haystack)
- Set comprehensions (setcomp) are similar to list / dict comprehensions.
- The hash tables behind dict and set are powerful and should give constant-time key lookup, although in practice it might not always be constant: for example, if the data does not fit in memory, fetch time may grow with the size of the data.
- Implementation of the hash table used by Python: a sparse array of cells/buckets, where each cell contains references to a key and a value. Python tries to keep at least 1/3 of the cells empty.
- Due to the hash table implementation, adding items to a dict/set may change the order of existing keys (this still holds for sets; dicts preserve insertion order since Python 3.7). So DO NOT modify a dict or set while iterating through it.
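The mapping and set tools listed in this section can be sketched together as follows. All data and variable names here are invented for the example.

```python
from collections import ChainMap, Counter, defaultdict
from types import MappingProxyType

# defaultdict calls its factory (list) from __missing__ when a key is absent.
index = defaultdict(list)
for position, word in enumerate(['a', 'b', 'a']):
    index[word].append(position)
print(dict(index))                            # {'a': [0, 2], 'b': [1]}

# ChainMap searches each mapping in order and returns the first hit.
defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}
settings = ChainMap(overrides, defaults)
print(settings['user'], settings['color'])    # admin red

# Counter holds an integer count per key.
print(Counter('abracadabra').most_common(2))  # [('a', 5), ('b', 2)]

# MappingProxyType is read-only, but reflects later changes to the source dict.
source = {1: 'A'}
proxy = MappingProxyType(source)
source[2] = 'B'
print(proxy[2])                               # B
try:
    proxy[3] = 'C'
except TypeError as exc:
    print('read-only:', exc)

# Set intersection replaces an explicit counting loop.
needles = {1, 5, 9}
haystack = set(range(8))
found = len(needles & haystack)
print(found)                                  # 2
```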
Text and bytes
str is a sequence of characters, but “character” has more than one definition, which leads to problems if not handled properly.
The actual bytes that represent a character depend on the encoding in use. Converting from the code points of characters to their byte representation is encoding; the reverse is decoding. The byte representation is used for storage and transmission, while str is for human readability.
UnicodeEncodeError usually occurs when the source string contains characters that cannot be mapped to a byte sequence by the specified encoding.
    >>> city = 'São Paulo'
    >>> city.encode('utf_8')
    b'S\xc3\xa3o Paulo'
    >>> city.encode('utf_16')
    b'\xff\xfeS\x00\xe3\x00o\x00 \x00P\x00a\x00u\x00l\x00o\x00'
    >>> city.encode('iso8859_1')
    b'S\xe3o Paulo'
    >>> city.encode('cp437')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/.../lib/python3.4/encodings/cp437.py", line 12, in encode
        return codecs.charmap_encode(input,errors,encoding_map)
    UnicodeEncodeError: 'charmap' codec can't encode character '\xe3' in
    position 1: character maps to <undefined>
    >>> city.encode('cp437', errors='ignore')
    b'So Paulo'
    >>> city.encode('cp437', errors='replace')
    b'S?o Paulo'
    >>> city.encode('cp437', errors='xmlcharrefreplace')
    b'S&#227;o Paulo'
UnicodeDecodeError occurs when a byte sequence cannot be recognized by the specified decoder.
    >>> octets = b'Montr\xe9al'
    >>> octets.decode('cp1252')
    'Montréal'
    >>> octets.decode('iso8859_7')
    'Montrιal'
    >>> octets.decode('koi8_r')
    'MontrИal'
    >>> octets.decode('utf_8')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 5:
    invalid continuation byte
    >>> octets.decode('utf_8', errors='replace')
    'Montr�al'
SyntaxError occurs when a Python source file contains bytes that are not valid UTF-8 (the default source encoding in Python 3) and declares no other encoding.
When you are not sure about the encoding of a byte sequence, use chardet to detect it.
The Unicode sandwich is the best practice for text processing: (1) decode bytes to str as early as possible on input; (2) process text only as str; (3) encode str to bytes as late as possible on output.
Garbled characters appear when you read a file with the wrong decoder; typically, Windows defaults to “Windows 1252”, which is not compatible with some characters in UTF-8.
Be careful about the different default encodings on Unix and Windows systems. Check locale.getpreferredencoding() for the default encoding.
The os and re modules accept both str and bytes arguments, and handle each differently. Be careful about the type before passing it to functions in os and re.
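The Unicode sandwich described above can be sketched in a few lines. The sample text and the temporary file name are assumptions for the example; the point is that decoding and encoding happen only at the edges, and that file encodings are passed explicitly instead of relying on the platform default.

```python
import os
import tempfile

raw = 'café São Paulo'.encode('utf-8')   # bytes as they would arrive from disk or network

text = raw.decode('utf-8')      # (1) decode bytes -> str on input
shout = text.upper()            # (2) process as str only
out = shout.encode('utf-8')     # (3) encode str -> bytes on output
print(shout)                    # CAFÉ SÃO PAULO

# With files, pass encoding explicitly so Unix and Windows behave the same.
path = os.path.join(tempfile.mkdtemp(), 'cafe.txt')
with open(path, 'w', encoding='utf-8') as fp:
    fp.write(text)
with open(path, encoding='utf-8') as fp:
    round_trip = fp.read()
print(round_trip == text)       # True
```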
Functions as objects
Functions
- Functions in Python are first-class objects, just like integers, strings and dictionaries. Programming language theorists define a first-class object as a program entity that can be:
- created at runtime;
 - assigned to a variable or element in a data structure;
 - passed as an argument to a function;
 - returned as the result of a function.
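Each of the four properties in the list above can be shown with a function object. The function names below are made up for illustration.

```python
def factorial(n):
    """Return n!"""
    return 1 if n < 2 else n * factorial(n - 1)

fact = factorial                     # assigned to a variable
ops = {'fact': fact}                 # stored in a data structure
results = list(map(ops['fact'], range(5)))   # passed as an argument
print(results)                       # [1, 1, 2, 6, 24]

def make_multiplier(k):
    # A new function object is created at runtime and returned as a result.
    return lambda x: k * x

double = make_multiplier(2)
print(double(21))                    # 42
```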
 
Higher-order functions take a function as an argument or return a function as a result. The built-in sorted is an example.
    >>> # sorted takes the function len as its key argument
    >>> fruits = ['strawberry', 'fig', 'apple', 'cherry', 'raspberry', 'banana']
    >>> sorted(fruits, key=len)
    ['fig', 'apple', 'cherry', 'banana', 'raspberry', 'strawberry']
In the functional programming paradigm, some of the best-known higher-order functions are map, filter, reduce and apply. apply was deprecated in Python 2.3 and removed in Python 3. For the others there are better alternatives for most use cases: map and filter can be replaced by listcomps and genexps, which are more readable and sometimes faster according to the link (which also has many good explanations of the difference between map and listcomp).
The best use of anonymous functions is in the context of an argument list. Otherwise it is better to use a normal function definition. Check here for tips on refactoring bad lambda functions in Python.
Python 3 provides flexible argument passing; check here for more information. Keyword-only arguments are specified by PEP 3102.
    >>> # how to write functions with variable positional and keyword arguments
    >>> def f(a, *pargs, **kargs):
    ...     print(a, pargs, kargs)
    ...
    >>> f(1, 2, 3, x=4, y=5)
    1 (2, 3) {'x': 4, 'y': 5}  # the arguments are bound to the matching parameters in the signature
    >>> f(1, 2, 3, x=4, x=5)
      File "<stdin>", line 1
    SyntaxError: keyword argument repeated  # duplicate keyword arguments are rejected
    >>> f(1, 2, 3, x=4, y=5, 10)
      File "<stdin>", line 1
    SyntaxError: positional argument follows keyword argument
    >>> def f(a, **kargs, *pargs):
      File "<stdin>", line 1
        def f(a, **kargs, *pargs):
                        ^
    SyntaxError: invalid syntax  # *pargs cannot be declared after **kargs
Function annotations: Python 3 provides syntax to attach metadata to the parameters of a function declaration and its return value. Here is an example; the only difference is the signature in the declaration.
    def clip(text, max_len=80):  # without annotations
        ...

    def clip(text: str, max_len: 'int > 0' = 80) -> str:  # with annotations
        ...

    >>> clip.__annotations__
    {'text': <class 'str'>, 'max_len': 'int > 0', 'return': <class 'str'>}
Note: The only thing Python does with annotations is to store them in the __annotations__ attribute of the function. Nothing else: no checks, enforcement, validation, or any other action is performed. They are just metadata that may be used by tools, such as IDEs, frameworks and decorators.
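As a hedged sketch of the two ideas above (PEP 3102 keyword-only arguments and annotations as plain metadata), here is a simplified clip. The one-line body is an assumption for the example and is not the original implementation; inspect.signature is the standard way for tools to read the metadata.

```python
import inspect

def clip(text: str, *, max_len: 'int > 0' = 80) -> str:
    """Hypothetical helper: truncate text to at most max_len characters."""
    return text[:max_len]

print(clip('fluent python', max_len=6))    # fluent

try:
    clip('fluent python', 6)               # max_len after * is keyword-only
except TypeError as exc:
    print('TypeError:', exc)

# Annotations are only stored; nothing checks that max_len is really > 0.
print(clip('fluent python', max_len=-7))   # fluent

sig = inspect.signature(clip)
for name, param in sig.parameters.items():
    print(param.kind, name, repr(param.annotation), repr(param.default))
```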
The operator module implements many common functions needed in functional programming, for example mul == lambda a, b: a*b. itemgetter builds a function that retrieves items by index or key from any object that implements __getitem__, and attrgetter builds a function that retrieves attributes by name. The names exported by operator are listed below:
    >>> [name for name in dir(operator) if not name.startswith('_')]
    ['abs', 'add', 'and_', 'attrgetter', 'concat', 'contains',
    'countOf', 'delitem', 'eq', 'floordiv', 'ge', 'getitem', 'gt',
    'iadd', 'iand', 'iconcat', 'ifloordiv', 'ilshift', 'imod', 'imul',
    'index', 'indexOf', 'inv', 'invert', 'ior', 'ipow', 'irshift',
    'is_', 'is_not', 'isub', 'itemgetter', 'itruediv', 'ixor', 'le',
    'length_hint', 'lshift', 'lt', 'methodcaller', 'mod', 'mul', 'ne',
    'neg', 'not_', 'or_', 'pos', 'pow', 'rshift', 'setitem', 'sub',
    'truediv', 'truth', 'xor']
Most of the 52 names above are self-evident. The group of names prefixed with i and the name of another operator (e.g. iadd, iand etc.) correspond to the augmented assignment operators (e.g. +=, &= etc.).
Also check methodcaller in operator, which creates a function that calls a named method on the object given as its argument.
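A short sketch of itemgetter, attrgetter and methodcaller. The metro_data rows are sample values chosen for the example.

```python
from collections import namedtuple
from operator import attrgetter, itemgetter, methodcaller

# Sample rows: (name, country code, population in millions).
metro_data = [
    ('Tokyo', 'JP', 36.933),
    ('Delhi NCR', 'IN', 21.935),
    ('Sao Paulo', 'BR', 19.649),
]

# itemgetter(1) behaves like lambda fields: fields[1]
by_country = sorted(metro_data, key=itemgetter(1))
print([city[0] for city in by_country])   # ['Sao Paulo', 'Delhi NCR', 'Tokyo']

# Several indexes produce a function that returns a tuple.
print(itemgetter(1, 0)(metro_data[0]))    # ('JP', 'Tokyo')

City = namedtuple('City', 'name country population')
cities = [City(*row) for row in metro_data]
# attrgetter('name') behaves like lambda obj: obj.name
names = list(map(attrgetter('name'), cities))
print(names)                              # ['Tokyo', 'Delhi NCR', 'Sao Paulo']

# methodcaller('replace', ' ', '-') behaves like lambda s: s.replace(' ', '-')
hyphenate = methodcaller('replace', ' ', '-')
print(hyphenate('The time has come'))     # The-time-has-come
```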
functools.partial creates partial functions, in which some of the arguments of the original function are frozen. Here is an example:
    >>> from operator import mul
    >>> from functools import partial
    >>> triple = partial(mul, 3)
    >>> triple(7)
    21
    >>> list(map(triple, range(1, 10)))
    [3, 6, 9, 12, 15, 18, 21, 24, 27]
partial takes a callable as its first argument, followed by an arbitrary number of positional and keyword arguments to bind. On a side note, partialmethod (introduced in Python 3.4) works like partial, but is designed for methods; refer to this.
Decorators and closures
The first crucial fact about decorators is that they have the power to replace the decorated function with a different one. The second crucial fact is that they are executed immediately when a module is loaded.
Check the following decorators and their output to understand this better.
    # registration.py
    registry = []

    def register(func):
        print('running register(%s)' % func)
        registry.append(func)
        return func

    @register
    def f1():
        print('running f1()')

    @register
    def f2():
        print('running f2()')

    def f3():
        print('running f3()')

    def main():
        print('running main()')
        print('registry ->', registry)
        f1()
        f2()
        f3()

    if __name__=='__main__':
        main()

    ## running this as a python file
    $ python3 registration.py
    running register(<function f1 at 0x100631bf8>)
    running register(<function f2 at 0x100631c80>)
    running main()
    registry -> [<function f1 at 0x100631bf8>, <function f2 at 0x100631c80>]
    running f1()
    running f2()
    running f3()
Closure: as in other programming languages, a closure is a function that keeps access to variables of the enclosing scope in which it was defined, even after that scope has finished executing. Consider the following example:
    def make_averager():
        series = []

        def averager(new_value):
            series.append(new_value)
            total = sum(series)
            return total/len(series)

        return averager

    # load the module and run
    >>> avg = make_averager()
    >>> avg(10)
    10.0
    >>> avg(11)
    10.5
    >>> avg(12)
    11.0
Here series is a free variable (a variable that is not bound in the local scope). Although it is not in the local scope of averager, series is bound and kept in the __closure__ attribute of the returned function avg.
Here is how we can inspect the free variable (continued from the last code block).
    >>> avg.__code__.co_varnames
    ('new_value', 'total')
    >>> avg.__code__.co_freevars
    ('series',)
    >>> avg.__closure__
    (<cell at 0x107a44f78: list object at 0x107a91a48>,)
    >>> avg.__closure__[0].cell_contents
    [10, 11, 12]
To keep it simple: we only need to deal with those “external variables” when we define a function nested in another function.
nonlocal declares that a name refers to a free variable even when it is assigned in the local scope; without it, assigning to an immutable free variable would create a new local variable instead. Here is a function that gives the same output as the previous averager.
    def make_averager():
        count = 0
        total = 0

        def averager(new_value):
            # without this line, count += 1 raises UnboundLocalError,
            # because the assignment would make count a local variable
            nonlocal count, total
            count += 1
            total += new_value
            return total / count

        return averager
A drawback of a simple wrapping decorator is that it masks the __name__ and __doc__ of the decorated function. We can use functools.wraps to copy that information to the function returned by the decorator.
Check the singledispatch decorator here.
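A minimal sketch of both tools mentioned above. The logged decorator and the describe function are hypothetical names invented for this example; functools.wraps and functools.singledispatch are the standard library pieces being illustrated.

```python
import functools

def logged(func):
    """Hypothetical decorator that records the arguments of each call."""
    @functools.wraps(func)           # copies __name__, __doc__ etc. onto wrapper
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)
    wrapper.calls = []
    return wrapper

@logged
def add(a, b):
    """Return a + b."""
    return a + b

print(add(1, 2))                     # 3
print(add.__name__, add.__doc__)     # add Return a + b.
print(add.calls)                     # [((1, 2), {})]

# singledispatch picks an implementation based on the type of the first argument.
@functools.singledispatch
def describe(obj):
    return 'object: %r' % (obj,)

@describe.register(int)
def _(n):
    return 'int: %d' % n

@describe.register(list)
def _(items):
    return 'list of %d items' % len(items)

print(describe(3))                   # int: 3
print(describe([1, 2]))              # list of 2 items
print(describe('x'))                 # object: 'x'
```

Without the functools.wraps line, add.__name__ would be 'wrapper' and add.__doc__ would be None.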
Stacked decorators are applied from the inside out:
    @d1
    @d2
    def f():
        ...

    # is the same as
    f = d1(d2(f))
A decorator factory lets the decorator accept arguments of its own. Note that the factory returns the actual decorator, rather than the inner function defined inside the decorator. Example:
    registry = set()

    def register(active=True):
        def decorate(func):
            print('running register(active=%s)->decorate(%s)'
                  % (active, func))
            if active:
                registry.add(func)
            else:
                registry.discard(func)
            return func
        return decorate

    @register(active=False)
    def f1():
        print('running f1()')

    # here we need the parentheses: calling register() returns the real decorator!
    @register()
    def f2():
        print('running f2()')
Check the Python Decorator Library for more practical examples of decorators.
Object Oriented Idioms
Object references, mutability, recycling
Python variables are references to objects.
    >>> a = [1, 2, 3]
    >>> b = a
    >>> a.append(4)
    >>> b
    [1, 2, 3, 4]
Use is to check whether two names refer to the same object (identity), and == to check whether two objects have equal values. is is faster than == because it cannot be overloaded, so Python does not have to find and invoke special methods to evaluate it. Note that a == b is syntactic sugar for a.__eq__(b).
Copies are shallow by default: the outermost container is duplicated, but the copy is filled with references to the same items held by the original. Here is an example:
    >>> l1 = [3, [55, 44], (7, 8, 9)]
    >>> l2 = list(l1)
    >>> l2
    [3, [55, 44], (7, 8, 9)]
    >>> l2 == l1
    True
    >>> l2 is l1
    False
    >>> l2[1].append(100)
    >>> l1
    [3, [55, 44, 100], (7, 8, 9)]
    >>> l2
    [3, [55, 44, 100], (7, 8, 9)]
Use deepcopy to create copies that share no references with the source object. We can also customize the behavior of copy and deepcopy by implementing the __copy__() and __deepcopy__() methods, as described in the docs.
Parameter passing in Python is call by sharing: each formal parameter of the function gets a copy of each reference in the arguments. The result is that a function may change any mutable object passed as a parameter, but it cannot change the identity of those objects, i.e. it cannot replace an object altogether with another. Here is an example:
    >>> def f(a, b):
    ...     a += b
    ...     return a
    ...
    >>> x = 1
    >>> y = 2
    >>> f(x, y)
    3
    >>> # x and y do not change: int is immutable, so a += b inside f rebinds the local a to a new int object
    >>> x, y
    (1, 2)
    >>> a = [1, 2]
    >>> b = [3, 4]
    >>> f(a, b)
    [1, 2, 3, 4]
    >>> # the identity of a does not change, but its value is changed in place
    >>> a, b
    ([1, 2, 3, 4], [3, 4])
    >>> t = (10, 20)
    >>> u = (30, 40)
    >>> f(t, u)
    (10, 20, 30, 40)
    >>> t, u
    ((10, 20), (30, 40))
    >>> # tuples are immutable, so they behave like the int objects discussed above
Avoid using mutable objects (list, dict, …) as default parameter values. Check the example below to understand the side effects.
    class HauntedBus:
        """A bus model haunted by ghost passengers"""

        def __init__(self, passengers=[]):
            self.passengers = passengers

        def pick(self, name):
            self.passengers.append(name)

        def drop(self, name):
            self.passengers.remove(name)

    >>> bus1 = HauntedBus(['Alice', 'Bill'])
    >>> bus1.passengers
    ['Alice', 'Bill']
    >>> bus1.pick('Charlie')
    >>> bus1.drop('Alice')
    >>> bus1.passengers
    ['Bill', 'Charlie']
    >>> bus2 = HauntedBus()
    >>> bus2.pick('Carrie')
    >>> bus2.passengers
    ['Carrie']
    >>> bus3 = HauntedBus()
    >>> bus3.passengers
    ['Carrie']
    >>> # bus3 actually shares its passengers list with bus2!
    >>> bus3.pick('Dave')
    >>> bus2.passengers
    ['Carrie', 'Dave']
    >>> bus2.passengers is bus3.passengers
    True
    >>> bus1.passengers
    ['Bill', 'Charlie']
The problem is that each default value is evaluated when the function is defined (i.e. usually when the module is loaded) and the default values become attributes of the function object. So if a default value is a mutable object, and you change it, the change will affect every future call of the function.
The defensive way to handle the above problem is to use None as the default value:
    def __init__(self, passengers=None):
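A fuller sketch of the defensive version, assuming the same pick/drop interface as HauntedBus. The class name Bus and the decision to copy the incoming list are choices made for this example.

```python
class Bus:
    """Hypothetical fix for HauntedBus: None as default, copy on input."""

    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []                 # a fresh list for every instance
        else:
            self.passengers = list(passengers)   # copy, so the caller's list is not aliased

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)


bus2 = Bus()
bus2.pick('Carrie')
bus3 = Bus()
print(bus3.passengers)                       # []
print(bus2.passengers is bus3.passengers)    # False

team = ['Sue', 'Tina']
bus4 = Bus(team)
bus4.drop('Sue')
print(team)                                  # ['Sue', 'Tina'], the argument is untouched
```

Copying the argument also avoids the related surprise where dropping a passenger from the bus would remove it from the caller's list.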
The del statement deletes names, not objects. An object may be garbage collected as a result of a del statement, but only if the variable deleted holds the last reference to the object, or if the object becomes unreachable. In CPython the primary algorithm for garbage collection is reference counting. Other implementations of Python have more sophisticated garbage collectors that do not rely on reference counting. We can use weakref.finalize to register a callback function to be called when an object is destroyed.
Weak references are references that do not increase the reference count. They are a low-level mechanism underlying the more useful collections WeakValueDictionary, WeakKeyDictionary and WeakSet, and the finalize function from the weakref module.
Some strings and integers are interned: creating two variables with the same string or integer value separately makes both refer to the same object. This is an implementation detail that differs between Python implementations, so do not rely on it. In CPython, watch out for these common cases: (1) integers in range(-5, 257); (2) often-used, single-character, and empty strings (source: slide 22 from Wesley Chun’s talk). Interning is quite common, for example string interning.
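A small sketch of del, weak references and weakref.finalize. The Node class and the events list are made up for the example; the immediate destruction shown relies on CPython's reference counting, and the gc.collect() call is only there as a nudge for other implementations.

```python
import gc
import weakref

class Node:
    """Hypothetical class; instances of ordinary classes can be weakly referenced."""

events = []

node = Node()
alias = node                     # a second strong reference to the same object
ender = weakref.finalize(node, events.append, 'node destroyed')
ref = weakref.ref(node)          # a weak reference does not keep node alive

del node                         # deletes a name; the object survives via alias
print(ender.alive)               # True

alias = None                     # the last strong reference is gone
gc.collect()                     # not needed on CPython, harmless elsewhere
print(ender.alive, ref(), events)
```

On CPython the last line should print False None ['node destroyed'], because the callback runs as soon as the reference count drops to zero.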
Pythonic object
- Difference between the object representations repr() and str():
  - repr(): return a string representing the object as the developer wants to see it.
  - str(): return a string representing the object as the user wants to see it.
Difference between @classmethod and @staticmethod: the former receives the class (not the instance) as its first argument, so it can change its behavior based on which class it is called on; the latter receives no implicit first argument at all, and is mainly used for a plain function that logically belongs to the class but needs neither the class nor an instance. A good reference article discusses the difference in detail.
How format() works: check here for the format mini-language. We can add custom formatting to a class by implementing the __format__ method.
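The representation, formatting and classmethod/staticmethod notes above can be sketched in one small class. Vector2d here is a simplified, hypothetical class, and from_text and is_valid_text are names invented for the example.

```python
class Vector2d:
    """Hypothetical 2-D vector used to illustrate the notes above."""

    def __init__(self, x, y):
        self.x = float(x)
        self.y = float(y)

    def __repr__(self):
        # For developers: looks like the constructor call.
        return '{}({!r}, {!r})'.format(type(self).__name__, self.x, self.y)

    def __str__(self):
        # For end users.
        return '({}, {})'.format(self.x, self.y)

    def __format__(self, fmt_spec=''):
        # Apply the float format spec to each component.
        return '({}, {})'.format(format(self.x, fmt_spec), format(self.y, fmt_spec))

    @classmethod
    def from_text(cls, text):
        # cls is the class the method was called on, so subclasses get their own type.
        x, y = text.split(',')
        return cls(x, y)

    @staticmethod
    def is_valid_text(text):
        # No implicit first argument: a plain function in the class namespace.
        return text.count(',') == 1


class Point(Vector2d):
    pass


v = Vector2d.from_text('3,4')
print(repr(v))                          # Vector2d(3.0, 4.0)
print(str(v))                           # (3.0, 4.0)
print(format(v, '.2f'))                 # (3.00, 4.00)
print(repr(Point.from_text('1,2')))     # Point(1.0, 2.0)
print(Vector2d.is_valid_text('1;2'))    # False
```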
Name mangling: if we name an attribute inside a class with two leading underscores and zero or one trailing underscore, Python stores the name in the instance __dict__ prefixed with a leading underscore and the class name. Note that it is about safety, not security: it’s designed to prevent accidental access and not intentional wrongdoing.
The single underscore prefix has no special meaning to the Python interpreter when used in attribute names, but it’s a very strong convention among Python programmers that you should not access such attributes from outside the class. In modules, a single _ in front of a top-level name does have an effect: if you write from mymod import *, the names with a _ prefix are not imported from mymod. But you can still write from mymod import _privatefunc. According to this.
To define __slots__, you create a class attribute with that name and assign it an iterable of str with identifiers for the instance attributes. The author prefers using a tuple for that, because it conveys the message that the __slots__ definition cannot change. That helps to reduce the overhead of the per-instance __dict__. However, there are a few drawbacks:
- You must remember to redeclare __slots__ in each subclass, since the inherited attribute is ignored by the interpreter.
- Instances will only be able to have the attributes listed in __slots__, unless you include ‘__dict__’ in __slots__, but doing so may negate the memory savings.
- Instances cannot be targets of weak references unless you remember to include ‘__weakref__’ in __slots__.
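A brief sketch of name mangling and of the __slots__ drawbacks listed above. The class names Account, Pixel and OpenPixel are hypothetical.

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance      # stored as _Account__balance

acct = Account(10)
print(vars(acct))                     # {'_Account__balance': 10}


class Pixel:
    __slots__ = ('x', 'y')            # instances get no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Pixel(1, 2)
print(hasattr(p, '__dict__'))         # False
try:
    p.color = 'red'                   # not listed in __slots__
except AttributeError as exc:
    print('AttributeError:', exc)


class OpenPixel(Pixel):
    pass                              # __slots__ not redeclared, so __dict__ is back

q = OpenPixel(3, 4)
q.color = 'red'                       # allowed again in the subclass
print(vars(q))                        # {'color': 'red'}
```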
Sequence hacking, hashing and slicing
repr() is conventionally used for debugging output, so it should stay short even for large collections. Use reprlib.repr in a custom class to get a limited-length representation, which abbreviates long containers with ... (by default after about 6 items for most container types).
How do slices work? The slice index becomes a slice object:
    >>> class MySeq:
    ...     def __getitem__(self, index):
    ...         return index
    ...
    >>> s = MySeq()
    >>> s[1]
    1
    >>> s[1:4]
    slice(1, 4, None)
    >>> s[1:4:2]
    slice(1, 4, 2)
    >>> s[1:4:2, 9]
    (slice(1, 4, 2), 9)
    >>> s[1:4:2, 7:9]
    (slice(1, 4, 2), slice(7, 9, None))
To handle slicing properly, the __getitem__ method of a custom class should treat slice and numbers.Integral indexes separately.
__getattr__ is called when normal lookup fails to find an attribute. Dynamic attributes can be created with specific logic in this method. Remember to raise AttributeError for unsupported names.
__setattr__ is the corresponding setter; it can be used to make attributes read-only, which is useful for hashable objects.
When using reduce, it’s good practice to provide the third argument, reduce(function, iterable, initializer), to prevent this exception: TypeError: reduce() of empty sequence with no initial value.
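The slicing, reprlib and reduce advice above can be sketched in one class. This Vector is a simplified, hypothetical version written for the example; the attribute name _components is an assumption.

```python
import functools
import numbers
import operator
import reprlib
from array import array


class Vector:
    """Hypothetical n-dimensional vector, sketched after the notes above."""

    def __init__(self, components):
        self._components = array('d', components)

    def __len__(self):
        return len(self._components)

    def __repr__(self):
        # reprlib.repr abbreviates long arrays with '...'
        components = reprlib.repr(self._components)
        components = components[components.find('['):-1]
        return 'Vector({})'.format(components)

    def __getitem__(self, index):
        cls = type(self)
        if isinstance(index, slice):
            return cls(self._components[index])    # a slice gives a new Vector
        if isinstance(index, numbers.Integral):
            return self._components[index]         # an integer gives one item
        raise TypeError('{} indices must be integers or slices'.format(cls.__name__))

    def __eq__(self, other):
        return (isinstance(other, Vector)
                and tuple(self._components) == tuple(other._components))

    def __hash__(self):
        hashes = (hash(x) for x in self._components)
        # The initializer 0 keeps reduce from failing on an empty Vector.
        return functools.reduce(operator.xor, hashes, 0)


v = Vector(range(10))
print(v[1])                # 1.0
print(repr(v[1:4]))        # Vector([1.0, 2.0, 3.0])
print(repr(v))             # Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])
print(hash(Vector([])))    # 0
```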
Interfaces: from protocols to ABCs
Warning about over-engineering: ABCs, like descriptors and metaclasses, are tools for building frameworks. Therefore, only a very small minority of Python developers can create ABCs without imposing unreasonable limitations and needless work on fellow programmers.
Every class has an interface: the set of public attributes (methods or data attributes) implemented or inherited by the class, for example __getitem__ or __add__.
Monkey patching means changing a class or module at runtime without touching the source code.
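A minimal sketch of monkey patching: a sequence class that lacks __setitem__ gains it at runtime, which is enough for random.shuffle to work. The Hand class and the set_card function are hypothetical names for this example.

```python
import random

class Hand:
    """Hypothetical read-only sequence: no __setitem__ at first."""

    def __init__(self):
        self._cards = list(range(10))

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, position):
        return self._cards[position]

hand = Hand()
try:
    random.shuffle(hand)             # shuffle needs item assignment
except TypeError as exc:
    print('TypeError:', exc)

def set_card(hand, position, card):
    hand._cards[position] = card

Hand.__setitem__ = set_card          # monkey patch: the class gains mutability at runtime
random.shuffle(hand)                 # now works, even for the existing instance
print(sorted(hand) == list(range(10)))   # True
```

The patch relies on knowing that Hand keeps its items in _cards, which shows both the power and the fragility of the technique.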
Other information
Python in the book: CPython 3.4
GitHub: fluentpython
Useful visualization of running Python code: pythontutor
Use the dir function to list the attributes of an object.
Use the dis.dis function to get the bytecode of a Python function, which helps when comparing subtle differences between implementations.
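A quick sketch of both helpers. The add_inplace function is made up for the example, and no bytecode listing is shown because the exact opcodes differ between Python versions.

```python
import dis

# dir lists attribute names; here only the public ones of a list.
public = [name for name in dir([]) if not name.startswith('_')]
print(public[:3])                  # ['append', 'clear', 'copy']

def add_inplace(a, b):
    a += b
    return a

# dis.dis prints a human-readable bytecode listing to stdout.
dis.dis(add_inplace)

# dis.get_instructions gives the same information as objects.
opnames = [ins.opname for ins in dis.get_instructions(add_inplace)]
print(opnames)
```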