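Classes and Interfaces
As an object-oriented language, Python has good support for inheritance, polymorphism, and encapsulation. How do you use them to write maintainable code?
- Item37: Compose classes instead of nesting many levels of built-in types
Suppose I want to record the grades of a set of students whose names aren't known in advance. I can define a class that stores the names in a dictionary.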
class SimpleGradebook:
    def __init__(self):
        self._grades = {}

    def add_student(self, name):
        self._grades[name] = []

    def report_grade(self, name, score):
        self._grades[name].append(score)

    def average_grade(self, name):
        grades = self._grades[name]
        return sum(grades) / len(grades)

book = SimpleGradebook()
book.add_student('Isaac Newton')
book.report_grade('Isaac Newton', 90)
book.report_grade('Isaac Newton', 95)
book.report_grade('Isaac Newton', 85)
print(book.average_grade('Isaac Newton'))
>>>
90.0
Dictionaries and the related built-in types are so easy to use that there is a danger of overextending them. For example, suppose I now want to keep the grades per subject rather than as one overall list:
from collections import defaultdict

class BySubjectGradebook:
    def __init__(self):
        self._grades = {}                       # Outer dict

    def add_student(self, name):
        self._grades[name] = defaultdict(list)  # Inner dict

This is straightforward enough, and the multi-level dictionary still seems manageable. The remaining methods of the class change accordingly:
    def report_grade(self, name, subject, grade):
        by_subject = self._grades[name]
        grade_list = by_subject[subject]
        grade_list.append(grade)

    def average_grade(self, name):
        by_subject = self._grades[name]
        total, count = 0, 0
        for grades in by_subject.values():
            total += sum(grades)
            count += len(grades)
        return total / count

book = BySubjectGradebook()
book.add_student('Albert Einstein')
book.report_grade('Albert Einstein', 'Math', 75)
book.report_grade('Albert Einstein', 'Math', 65)
book.report_grade('Albert Einstein', 'Gym', 90)
book.report_grade('Albert Einstein', 'Gym', 95)
print(book.average_grade('Albert Einstein'))
>>>
81.25
Now the requirements change again: different tests carry different weights, so each entry has to store a weight along with the score:
class WeightedGradebook:
    def __init__(self):
        self._grades = {}

    def add_student(self, name):
        self._grades[name] = defaultdict(list)

    def report_grade(self, name, subject, score, weight):
        by_subject = self._grades[name]
        grade_list = by_subject[subject]
        grade_list.append((score, weight))

    def average_grade(self, name):
        by_subject = self._grades[name]
        score_sum, score_count = 0, 0
        for subject, scores in by_subject.items():
            subject_avg, total_weight = 0, 0
            for score, weight in scores:
                subject_avg += score * weight
                total_weight += weight
            score_sum += subject_avg / total_weight
            score_count += 1
        return score_sum / score_count

book = WeightedGradebook()
book.add_student('Albert Einstein')
book.report_grade('Albert Einstein', 'Math', 75, 0.05)
book.report_grade('Albert Einstein', 'Math', 65, 0.15)
book.report_grade('Albert Einstein', 'Math', 70, 0.80)
book.report_grade('Albert Einstein', 'Gym', 100, 0.40)
book.report_grade('Albert Einstein', 'Gym', 85, 0.60)
print(book.average_grade('Albert Einstein'))
>>>
80.25
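Avoid going beyond one level of nesting. Dictionaries that contain dictionaries (or lists of tuples) quickly turn into a maintenance nightmare.
At that point the bookkeeping should be refactored into classes. Starting from the bottom, a single grade is simple enough that a tuple looks sufficient:
grades = []
grades.append((95, 0.45))
grades.append((85, 0.55))
total = sum(score * weight for score, weight in grades)
total_weight = sum(weight for _, weight in grades)
average_grade = total / total_weight
If I also want to store teacher comments, every place that unpacks the tuple has to change, and more and more underscores (_) appear to skip the unused fields:
grades = []
grades.append((95, 0.45, 'Great job'))
grades.append((85, 0.55, 'Better next time'))
total = sum(score * weight for score, weight, _ in grades)
total_weight = sum(weight for _, weight, _ in grades)
average_grade = total / total_weight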
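This is exactly the case namedtuple is meant for: it defines a small, immutable data class whose fields are accessed by name:
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))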
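As a minimal sketch of what the named fields buy, the weighted average computed with plain tuples earlier can be rewritten with Grade. The two sample grades are the same illustrative values used above; nothing beyond the standard library is assumed.

```python
from collections import namedtuple

Grade = namedtuple('Grade', ('score', 'weight'))

# Instances can be built positionally or with keyword arguments.
grades = [Grade(95, 0.45), Grade(score=85, weight=0.55)]

# Fields are read by name, so adding another field later would not
# force this code to change the way tuple unpacking did.
total = sum(g.score * g.weight for g in grades)
total_weight = sum(g.weight for g in grades)
average_grade = total / total_weight
print(round(average_grade, 2))

# namedtuple instances are immutable: assigning to a field fails.
try:
    grades[0].score = 100
    mutated = True
except AttributeError:
    mutated = False
print('Mutated:', mutated)
```

Because a Grade is still a tuple, `grades[0][0]` and unpacking keep working, which is convenient but is also the source of the limitations listed next.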
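However, namedtuple also has limitations:
Default values are awkward. (Since Python 3.7 namedtuple accepts a defaults argument, but it only applies to the rightmost fields.)
This becomes unwieldy when the data has many optional attributes. Once there are more than a handful of attributes, the built-in dataclasses module may be a better fit.
The attribute values of a namedtuple are still accessible by numerical index and by iteration. If you don't control all of the code that uses it, it is better to explicitly define a new class.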
class Subject:
    def __init__(self):
        self._grades = []

    def report_grade(self, score, weight):
        self._grades.append(Grade(score, weight))

    def average_grade(self):
        total, total_weight = 0, 0
        for grade in self._grades:
            total += grade.score * grade.weight
            total_weight += grade.weight
        return total / total_weight

class Student:
    def __init__(self):
        self._subjects = defaultdict(Subject)

    def get_subject(self, name):
        return self._subjects[name]

    def average_grade(self):
        total, count = 0, 0
        for subject in self._subjects.values():
            total += subject.average_grade()
            count += 1
        return total / count

class Gradebook:
    def __init__(self):
        self._students = defaultdict(Student)

    def get_student(self, name):
        return self._students[name]

book = Gradebook()
albert = book.get_student('Albert Einstein')
math = albert.get_subject('Math')
math.report_grade(75, 0.05)
math.report_grade(65, 0.15)
math.report_grade(70, 0.80)
gym = albert.get_subject('Gym')
gym.report_grade(100, 0.40)
gym.report_grade(85, 0.60)
print(albert.average_grade())
>>>
80.25
- Item38: Accept functions instead of classes for simple interfaces
Many built-in APIs let you customize their behavior by passing in a function. The API calls these hooks back while it runs. For example, the key parameter of the list sort method accepts a function:
names = ['Socrates', 'Archimedes', 'Plato', 'Aristotle']
names.sort(key=len)
print(names)
>>>
['Plato', 'Socrates', 'Aristotle', 'Archimedes']
There are many other examples. defaultdict, for instance, accepts any callable (a class or a function) that takes no arguments and returns the default value for a missing key.
Here the hook logs each missing key and returns 0 as the default:
def log_missing():
    print('Key added')
    return 0

Start the result from the current dictionary, then apply the increments to it. Missing keys get the 0 returned by log_missing.
from collections import defaultdict

current = {'green': 12, 'blue': 3}
increments = [
    ('red', 5),
    ('blue', 17),
    ('orange', 9),
]
result = defaultdict(log_missing, current)
print('Before:', dict(result))
for key, amount in increments:
    result[key] += amount
print('After: ', dict(result))
>>>
Before: {'green': 12, 'blue': 3}
Key added
Key added
After:  {'green': 12, 'blue': 20, 'red': 5, 'orange': 9}
Now suppose I want to count how many keys were missing while the increments are applied. A stateful closure can keep that count internally:
def increment_with_report(current, increments):
    added_count = 0

    def missing():
        nonlocal added_count  # Stateful closure
        added_count += 1
        return 0

    result = defaultdict(missing, current)
    for key, amount in increments:
        result[key] += amount

    return result, added_count

Although defaultdict has no idea that the missing hook maintains state, the count still comes out as the expected 2.
result, count = increment_with_report(current, increments)
assert count == 2
In other languages the usual approach is to define a small class that holds the state, and then pass a method of an instance as the hook:
class CountMissing:
    def __init__(self):
        self.added = 0

    def missing(self):
        self.added += 1
        return 0

This achieves the same result:
counter = CountMissing()
result = defaultdict(counter.missing, current)  # Method ref
for key, amount in increments:
    result[key] += amount
assert counter.added == 2
The class is clearer than the closure, but in isolation the purpose of CountMissing is not obvious until you see it used with defaultdict. (Who constructs it? Who calls missing? Will the class need other public methods in the future?)
Python lets a class define the __call__ special method so that its instances can be called like functions. The callable built-in returns True for such an instance.
class BetterCountMissing:
    def __init__(self):
        self.added = 0

    def __call__(self):
        self.added += 1
        return 0

counter = BetterCountMissing()
assert counter() == 0
assert callable(counter)
Each time a key is missing, defaultdict calls counter, which runs its __call__ method.
counter = BetterCountMissing()
result = defaultdict(counter, current)  # Relies on __call__
for key, amount in increments:
    result[key] += amount
assert counter.added == 2
This meets the requirement above, and __call__ makes it clear that the class exists to act as a stateful hook.
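To illustrate why accepting a plain callable keeps an interface simple, here is a small sketch of an API that takes a hook and works equally well with a built-in function or a stateful `__call__` instance. The names `merge_counts`, `on_missing`, and `MissingCounter` are hypothetical and used only for illustration.

```python
from collections import defaultdict


class MissingCounter:
    """Hypothetical stateful hook: counts how often a default is requested."""

    def __init__(self, default=0):
        self.default = default
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.default


def merge_counts(current, increments, on_missing=int):
    """Apply (key, amount) increments; on_missing supplies defaults.

    on_missing can be any zero-argument callable: a function, a class,
    a bound method, or an instance that defines __call__.
    """
    result = defaultdict(on_missing, current)
    for key, amount in increments:
        result[key] += amount
    return dict(result)


# A built-in works as the hook: int() returns 0.
plain = merge_counts({'a': 1}, [('a', 2), ('b', 3)])
print(plain)

# A stateful callable works through the same parameter.
counter = MissingCounter()
tracked = merge_counts({'a': 1}, [('a', 2), ('b', 3), ('c', 4)],
                       on_missing=counter)
print(tracked, counter.calls)
```

The function never needs to know whether the hook carries state, which is the point of preferring a callable parameter over a dedicated interface class.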
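- Item39: Use @classmethod polymorphism to construct objects generically
Polymorphism is supported not only by objects but also by classes. What is that good for?
Polymorphism lets multiple classes in a hierarchy implement their own versions of a method. Many classes can then fulfill the same interface or abstract base class while providing different functionality.
For example, say I'm writing a MapReduce implementation and want a common abstract class to represent the input data:
class InputData:
    def read(self):
        raise NotImplementedError

A concrete subclass reads the data from a file on disk:
class PathInputData(InputData):
    def __init__(self, path):
        super().__init__()
        self.path = path

    def read(self):
        with open(self.path) as f:
            return f.read()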
There could be any number of InputData subclasses, such as a NetworkInputData. The MapReduce worker needs a similarly abstract interface for taking in and consuming the data:
class Worker:
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

Here is a concrete Worker that counts lines:
class LineCountWorker(Worker):
    def map(self):
        data = self.input_data.read()    # Read the data
        self.result = data.count('\n')   # Number of lines in this input

    def reduce(self, other):
        self.result += other.result      # Merge in the other worker's result
What is still missing is something to connect the pieces. A helper function can build the inputs:
import os

def generate_inputs(data_dir):
    for name in os.listdir(data_dir):
        yield PathInputData(os.path.join(data_dir, name))

Then create a worker for each input:
def create_workers(input_list):
    workers = []
    for input_data in input_list:
        workers.append(LineCountWorker(input_data))
    return workers

Then fan the map step out to one thread per worker, and call reduce repeatedly to combine everything into the final result:
from threading import Thread

def execute(workers):
    threads = [Thread(target=w.map) for w in workers]
    for thread in threads: thread.start()
    for thread in threads: thread.join()

    first, *rest = workers
    for worker in rest:
        first.reduce(worker)
    return first.result

Finally, connect the helpers in a single function that returns the result:
def mapreduce(data_dir):
    inputs = generate_inputs(data_dir)
    workers = create_workers(inputs)
    return execute(workers)
Running it on a set of randomly generated files works well (the exact count differs from run to run):
import os
import random

def write_test_files(tmpdir):
    os.makedirs(tmpdir)
    for i in range(100):
        with open(os.path.join(tmpdir, str(i)), 'w') as f:
            f.write('\n' * random.randint(0, 100))

tmpdir = 'test_inputs'
write_test_files(tmpdir)
result = mapreduce(tmpdir)
print(f'There are {result} lines')
>>>
There are 4360 lines
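So what is the problem? The mapreduce function is not generic at all. If I write another InputData or Worker subclass, I also have to rewrite generate_inputs, create_workers, and mapreduce to match.
The best way to solve this is class method polymorphism. (Python allows only the single constructor method __init__, and requiring every InputData subclass to have a compatible constructor is unreasonable.)
Here a @classmethod is responsible for creating new InputData instances through a common interface:
class GenericInputData:
    def read(self):
        raise NotImplementedError

    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError

generate_inputs takes a dictionary of configuration parameters, and each concrete subclass looks up the values it needs:
class PathInputData(GenericInputData):
    ...

    @classmethod
    def generate_inputs(cls, config):
        data_dir = config['data_dir']
        for name in os.listdir(data_dir):
            yield cls(os.path.join(data_dir, name))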
Similarly, create_workers can become part of a generic Worker class, with cls() constructing the specific subclass.
class GenericWorker:
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

    @classmethod
    def create_workers(cls, input_class, config):
        workers = []
        for input_data in input_class.generate_inputs(config):
            workers.append(cls(input_data))
        return workers

Note that the call to input_class.generate_inputs is the class polymorphism at work. create_workers calling cls() also provides an alternative way to construct GenericWorker objects besides calling __init__ directly.
class LineCountWorker(GenericWorker):
    ...

Finally, rewrite the mapreduce function to be completely generic:
def mapreduce(worker_class, input_class, config):
    workers = worker_class.create_workers(input_class, config)
    return execute(workers)

config = {'data_dir': tmpdir}
result = mapreduce(LineCountWorker, PathInputData, config)
print(f'There are {result} lines')
>>>
There are 4360 lines
As this shows, cls in a @classmethod is what connects the generic code to the concrete classes.
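To make the benefit concrete, here is a self-contained sketch in which a second input source is plugged in without touching the generic code. `InMemoryInputData` and `mapreduce_serial` are hypothetical names introduced for illustration; the serial driver replaces the threaded `execute` only to keep the example short.

```python
class GenericInputData:
    def read(self):
        raise NotImplementedError

    @classmethod
    def generate_inputs(cls, config):
        raise NotImplementedError


class InMemoryInputData(GenericInputData):
    """Hypothetical input source: reads strings listed in the config."""

    def __init__(self, text):
        super().__init__()
        self.text = text

    def read(self):
        return self.text

    @classmethod
    def generate_inputs(cls, config):
        for text in config['texts']:
            yield cls(text)  # cls is the concrete subclass


class GenericWorker:
    def __init__(self, input_data):
        self.input_data = input_data
        self.result = None

    def map(self):
        raise NotImplementedError

    def reduce(self, other):
        raise NotImplementedError

    @classmethod
    def create_workers(cls, input_class, config):
        return [cls(data) for data in input_class.generate_inputs(config)]


class LineCountWorker(GenericWorker):
    def map(self):
        self.result = self.input_data.read().count('\n')

    def reduce(self, other):
        self.result += other.result


def mapreduce_serial(worker_class, input_class, config):
    # Serial stand-in for the threaded execute(), to stay self-contained.
    workers = worker_class.create_workers(input_class, config)
    for worker in workers:
        worker.map()
    first, *rest = workers
    for worker in rest:
        first.reduce(worker)
    return first.result


config = {'texts': ['a\nb\n', 'c\n', '']}
total = mapreduce_serial(LineCountWorker, InMemoryInputData, config)
print(f'There are {total} lines')
```

Only the new subclass and the config change; `create_workers` and the driver stay as they are, because every construction goes through `cls`.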
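- Item40: Initialize parent classes with super
The old, simple way to initialize a parent class is to call its __init__ method directly:
class MyBaseClass:
    def __init__(self, value):
        self.value = value

class MyChildClass(MyBaseClass):
    def __init__(self):
        MyBaseClass.__init__(self, 5)

This works for basic hierarchies but breaks in many cases, especially with multiple inheritance. For example, define two classes that operate on the instance's value field:
class TimesTwo:
    def __init__(self):
        self.value *= 2

class PlusFive:
    def __init__(self):
        self.value += 5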
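This class lists its parent classes in one order, and constructing it produces a result that matches that order.
class OneWay(MyBaseClass, TimesTwo, PlusFive):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        TimesTwo.__init__(self)
        PlusFive.__init__(self)

The result is:
foo = OneWay(5)
print('First ordering value is (5 * 2) + 5 =', foo.value)
>>>
First ordering value is (5 * 2) + 5 = 15
Here is another class with the same parent classes listed in a different order (PlusFive before TimesTwo), while the __init__ calls stay in the original order:
class AnotherWay(MyBaseClass, PlusFive, TimesTwo):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        TimesTwo.__init__(self)
        PlusFive.__init__(self)

The order in the class definition no longer matches the order of the __init__ calls, yet the result is unchanged. This kind of mismatch is hard to spot and is especially unfriendly to newcomers to the code.
bar = AnotherWay(5)
print('Second ordering value is', bar.value)
>>>
Second ordering value is 15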
Another problem occurs with diamond inheritance, for example when two classes inherit from the same parent class:
class TimesSeven(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value *= 7

class PlusNine(MyBaseClass):
    def __init__(self, value):
        MyBaseClass.__init__(self, value)
        self.value += 9

Then define a class that inherits from both:
class ThisWay(TimesSeven, PlusNine):
    def __init__(self, value):
        TimesSeven.__init__(self, value)
        PlusNine.__init__(self, value)

foo = ThisWay(5)
print('Should be (5 * 7) + 9 = 44 but is', foo.value)
>>>
Should be (5 * 7) + 9 = 44 but is 14
The call to PlusNine.__init__ runs MyBaseClass.__init__ a second time, which resets self.value to 5, so the result is 5 + 9 = 14. In a more complex hierarchy this is very hard to debug.
To solve these problems, Python has the super built-in function and a standard method resolution order (MRO). super ensures that a common parent class in a diamond hierarchy runs only once. The MRO defines the order in which parent classes are initialized, following the C3 linearization algorithm.
class TimesSevenCorrect(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value *= 7

class PlusNineCorrect(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value += 9

Now the diamond behaves correctly:
class GoodWay(TimesSevenCorrect, PlusNineCorrect):
    def __init__(self, value):
        super().__init__(value)

foo = GoodWay(5)
print('Should be 7 * (5 + 9) = 98 and is', foo.value)
>>>
Should be 7 * (5 + 9) = 98 and is 98
The order looks backwards at first, but it follows the MRO. Each __init__ first calls the next class in the MRO and only then does its own work, so PlusNineCorrect adds 9 before TimesSevenCorrect multiplies by 7:
mro_str = '\n'.join(repr(cls) for cls in GoodWay.mro())
print(mro_str)
>>>
<class '__main__.GoodWay'>
<class '__main__.TimesSevenCorrect'>
<class '__main__.PlusNineCorrect'>
<class '__main__.MyBaseClass'>
<class 'object'>
super can also be called with two parameters: the class whose MRO parent view you want to access, and the instance on which to access that view.
class ExplicitTrisect(MyBaseClass):
    def __init__(self, value):
        super(ExplicitTrisect, self).__init__(value)
        self.value /= 3

These parameters are not required for ordinary object initialization. (When super() is called with no arguments inside a class definition, the compiler automatically supplies the correct parameters, __class__ and self, so the following forms are all equivalent.)
class AutomaticTrisect(MyBaseClass):
    def __init__(self, value):
        super(__class__, self).__init__(value)
        self.value /= 3

class ImplicitTrisect(MyBaseClass):
    def __init__(self, value):
        super().__init__(value)
        self.value /= 3

assert ExplicitTrisect(9).value == 3
assert AutomaticTrisect(9).value == 3
assert ImplicitTrisect(9).value == 3
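The following sketch traces the order in which the `__init__` bodies actually run in a diamond hierarchy that uses `super`. The class names and the `calls` list are hypothetical additions for illustration; the arithmetic mirrors the 7 * (5 + 9) example above.

```python
class Base:
    def __init__(self, value):
        self.calls = ['Base']  # Runs exactly once, even in a diamond
        self.value = value


class TimesSevenTraced(Base):
    def __init__(self, value):
        super().__init__(value)          # Next class in the MRO runs first
        self.calls.append('TimesSeven')
        self.value *= 7


class PlusNineTraced(Base):
    def __init__(self, value):
        super().__init__(value)
        self.calls.append('PlusNine')
        self.value += 9


class Diamond(TimesSevenTraced, PlusNineTraced):
    def __init__(self, value):
        super().__init__(value)


d = Diamond(5)
print([cls.__name__ for cls in Diamond.mro()])
print(d.calls)
print(d.value)
```

The bodies complete in the reverse of the MRO: `Base` sets the value, `PlusNineTraced` adds 9, and `TimesSevenTraced` multiplies last, which is why the result is 98 rather than 44.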
- Item41: Consider composing functionality with mix-in classes
It is generally best to avoid multiple inheritance. Consider writing a mix-in instead: a class that defines only a small set of additional methods for its child classes to use.
For example, say I want to convert a Python object from its in-memory representation into a dictionary that is ready for serialization:
class ToDictMixin:
    def to_dict(self):
        return self._traverse_dict(self.__dict__)

The implementation uses hasattr for dynamic attribute access, isinstance for dynamic type inspection, and the instance dictionary __dict__:
    def _traverse_dict(self, instance_dict):
        output = {}
        for key, value in instance_dict.items():
            output[key] = self._traverse(key, value)
        return output

    def _traverse(self, key, value):
        if isinstance(value, ToDictMixin):
            return value.to_dict()
        elif isinstance(value, dict):
            return self._traverse_dict(value)
        elif isinstance(value, list):
            return [self._traverse(key, i) for i in value]
        elif hasattr(value, '__dict__'):
            return self._traverse_dict(value.__dict__)
        else:
            return value
Here a class uses the mix-in to produce a dictionary representation of a binary tree:
class BinaryTree(ToDictMixin):
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

# Converting a large number of related objects into a dictionary becomes easy:
tree = BinaryTree(10,
    left=BinaryTree(7, right=BinaryTree(9)),
    right=BinaryTree(13, left=BinaryTree(11)))
print(tree.to_dict())
>>>
{'value': 10,
 'left': {'value': 7,
          'left': None,
          'right': {'value': 9, 'left': None, 'right': None}},
 'right': {'value': 13,
           'left': {'value': 11, 'left': None, 'right': None},
           'right': None}}
Here is a subclass of BinaryTree that holds a reference to its parent. This circular reference would make the default ToDictMixin.to_dict loop forever:
class BinaryTreeWithParent(BinaryTree):
    def __init__(self, value, left=None,
                 right=None, parent=None):
        super().__init__(value, left=left, right=right)
        self.parent = parent

The solution is to override the _traverse method in this class so that it only processes values that matter, which avoids the cycle the mix-in would otherwise run into. For the parent attribute the override returns the parent's value; everything else goes to the mix-in's default implementation through super.
    def _traverse(self, key, value):
        if (isinstance(value, BinaryTreeWithParent) and
                key == 'parent'):
            return value.value  # Prevent cycles
        else:
            return super()._traverse(key, value)

Calling BinaryTreeWithParent.to_dict now works, because the circular reference is no longer followed:
root = BinaryTreeWithParent(10)
root.left = BinaryTreeWithParent(7, parent=root)
root.left.right = BinaryTreeWithParent(9, parent=root.left)
print(root.to_dict())
>>>
{'value': 10,
 'left': {'value': 7,
          'left': None,
          'right': {'value': 9,
                    'left': None,
                    'right': None,
                    'parent': 7},
          'parent': 10},
 'right': None,
 'parent': None}
With BinaryTreeWithParent._traverse defined, any class that has an attribute of type BinaryTreeWithParent automatically works with ToDictMixin as well.
class NamedSubTree(ToDictMixin):
    def __init__(self, name, tree_with_parent):
        self.name = name
        self.tree_with_parent = tree_with_parent

my_tree = NamedSubTree('foobar', root.left.right)
print(my_tree.to_dict())  # No infinite loop
>>>
{'name': 'foobar',
 'tree_with_parent': {'value': 9,
                      'left': None,
                      'right': None,
                      'parent': 7}}
Mix-ins can be composed. For example, say I want a mix-in that provides generic JSON serialization:
import json

class JsonMixin:
    @classmethod
    def from_json(cls, data):
        kwargs = json.loads(data)
        return cls(**kwargs)

    def to_json(self):
        return json.dumps(self.to_dict())

JsonMixin defines two methods, a class method and an instance method. As the code shows, it only requires that the class provide a to_dict method and that its __init__ accept keyword arguments. The classes below describe a datacenter topology:
class DatacenterRack(ToDictMixin, JsonMixin):
    def __init__(self, switch=None, machines=None):
        self.switch = Switch(**switch)
        self.machines = [
            Machine(**kwargs) for kwargs in machines]

class Switch(ToDictMixin, JsonMixin):
    def __init__(self, ports=None, speed=None):
        self.ports = ports
        self.speed = speed

class Machine(ToDictMixin, JsonMixin):
    def __init__(self, cores=None, ram=None, disk=None):
        self.cores = cores
        self.ram = ram
        self.disk = disk
This tests the full round trip: load an object from JSON, then serialize it back to JSON:
serialized = """{
    "switch": {"ports": 5, "speed": 1e9},
    "machines": [
        {"cores": 8, "ram": 32e9, "disk": 5e12},
        {"cores": 4, "ram": 16e9, "disk": 1e12},
        {"cores": 2, "ram": 4e9, "disk": 500e9}
    ]
}"""
deserialized = DatacenterRack.from_json(serialized)
roundtrip = deserialized.to_json()
assert json.loads(serialized) == json.loads(roundtrip)
As this shows, plug-in style mix-in classes provide a lot of flexibility.
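As a further small sketch of composing mix-ins, the two classes below each contribute one narrow behavior and neither defines `__init__` or holds state of its own. `ReprMixin`, `EqMixin`, and `Point` are hypothetical names used only for illustration.

```python
class ReprMixin:
    """Builds a readable repr from the instance dictionary."""

    def __repr__(self):
        fields = ', '.join(f'{k}={v!r}' for k, v in vars(self).items())
        return f'{type(self).__name__}({fields})'


class EqMixin:
    """Compares instances of the same class by their attributes.

    Defining __eq__ without __hash__ makes instances unhashable,
    which is acceptable for this sketch.
    """

    def __eq__(self, other):
        if type(other) is not type(self):
            return NotImplemented
        return vars(self) == vars(other)


class Point(ReprMixin, EqMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
print(repr(p))
print(p == Point(1, 2), p == Point(2, 1))
```

Because each mix-in relies only on `vars(self)`, they can be combined in any order and reused by unrelated classes.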
- Item42: Prefer public attributes over private ones
In Python, class attributes have only two kinds of visibility: public and private.
class MyObject:
    def __init__(self):
        self.public_field = 5
        self.__private_field = 10

    def get_private_field(self):
        return self.__private_field

Public attributes can be accessed directly:
foo = MyObject()
assert foo.public_field == 5
Private fields are reached through a method of the class:
assert foo.get_private_field() == 10
Accessing one directly from outside the class raises an exception:
foo.__private_field
>>>
Traceback ...
AttributeError: 'MyObject' object has no attribute '__private_field'
Class methods also have access to private attributes, because they are declared inside the class block:
class MyOtherObject:
    def __init__(self):
        self.__private_field = 71

    @classmethod
    def get_private_field_of_instance(cls, instance):
        return instance.__private_field

bar = MyOtherObject()
assert MyOtherObject.get_private_field_of_instance(bar) == 71
A subclass cannot access its parent class's private fields:
class MyParentObject:
    def __init__(self):
        self.__private_field = 71

class MyChildObject(MyParentObject):
    def get_private_field(self):
        return self.__private_field

baz = MyChildObject()
baz.get_private_field()
>>>
Traceback ...
AttributeError: 'MyChildObject' object has no attribute '_MyChildObject__private_field'
Private attributes are implemented by a simple transformation of the attribute name. Inside MyChildObject.get_private_field, __private_field is translated to _MyChildObject__private_field. The attribute that MyParentObject.__init__ assigned is actually named _MyParentObject__private_field. Once you know this rule, you can reach the attribute directly:
assert baz._MyParentObject__private_field == 71
Or inspect the instance's attributes through __dict__:
print(baz.__dict__)
>>>
{'_MyParentObject__private_field': 71}
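Python deliberately does not strictly enforce private visibility, so users can get around it when they need to.
Following the PEP 8 style guide (see Item2), a field prefixed with a single underscore, such as _protected_field, is protected by convention: external users of the class should proceed with caution. A private field, by contrast, signals that the attribute should not be used by external code or by subclasses.
class MyStringClass:
    def __init__(self, value):
        self.__value = value

    def get_value(self):
        return str(self.__value)

foo = MyStringClass(5)
assert foo.get_value() == '5'
This is the wrong approach. Sooner or later someone will want to subclass the class, and the private field forces them to reach into it anyway:
class MyIntegerSubclass(MyStringClass):
    def get_value(self):
        return int(self._MyStringClass__value)

foo = MyIntegerSubclass('5')
assert foo.get_value() == 5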
If the class hierarchy later changes underneath it, the subclass breaks because the hard-coded private name is no longer valid. Here __value moves from MyStringClass up into a new MyBaseClass:
class MyBaseClass:
    def __init__(self, value):
        self.__value = value

    def get_value(self):
        return self.__value

class MyStringClass(MyBaseClass):
    def get_value(self):
        return str(super().get_value())         # Updated

class MyIntegerSubclass(MyStringClass):
    def get_value(self):
        return int(self._MyStringClass__value)  # Not updated

foo = MyIntegerSubclass(5)
foo.get_value()
>>>
Traceback ...
AttributeError: 'MyIntegerSubclass' object has no attribute '_MyStringClass__value'
It is better to use protected attributes and document each one with a comment that tells other people it is internal.
class MyStringClass:
    def __init__(self, value):
        # This stores the user-supplied value for the object.
        # It should be coercible to a string. Once assigned in
        # the object it should be treated as immutable.
        self._value = value

    ...
The one case where private attributes deserve serious consideration is avoiding naming conflicts with subclasses. The conflict looks like this:
class ApiClass:
    def __init__(self):
        self._value = 5

    def get(self):
        return self._value

class Child(ApiClass):
    def __init__(self):
        super().__init__()
        self._value = 'hello'  # Conflicts

a = Child()
print(f'{a.get()} and {a._value} should be different')
>>>
hello and hello should be different
To reduce the risk of the name being overwritten, a private attribute in the parent class is a reasonable choice:
class ApiClass:
    def __init__(self):
        self.__value = 5       # Double underscore

    def get(self):
        return self.__value    # Double underscore

class Child(ApiClass):
    def __init__(self):
        super().__init__()
        self._value = 'hello'  # OK!

a = Child()
print(f'{a.get()} and {a._value} are different')
>>>
5 and hello are different
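The sketch below shows why the double underscore avoids the clash: the name is rewritten per class, so a parent and a child can both use `__value` without overwriting each other. The class names `Parent` and `Kid` are hypothetical and used only for illustration.

```python
class Parent:
    def __init__(self):
        self.__value = 5            # Stored as _Parent__value

    def get(self):
        return self.__value


class Kid(Parent):
    def __init__(self):
        super().__init__()
        self.__value = 'hello'      # Stored as _Kid__value

    def get_own(self):
        return self.__value


k = Kid()
print(k.get(), k.get_own())
print(sorted(vars(k)))
```

Both attributes live side by side in the instance dictionary under their mangled names, which also shows that the protection is a naming convention rather than access control.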
- Item43: Inherit from collections.abc for custom container types
Every Python class is a container of some kind, encapsulating attributes and functionality. Python also provides built-in container types (such as list, tuple, set, and dict), and for simple cases you can subclass them directly. For example, to count the frequency of the members of a list:
class FrequencyList(list):
    def __init__(self, members):
        super().__init__(members)

    def frequency(self):
        counts = {}
        for item in self:
            counts[item] = counts.get(item, 0) + 1
        return counts

Subclassing list provides all of list's standard functionality, and additional methods can supply the custom behavior:
foo = FrequencyList(['a', 'b', 'a', 'c', 'b', 'a', 'd'])
print('Length is', len(foo))
foo.pop()
print('After pop:', repr(foo))
print('Frequency:', foo.frequency())
>>>
Length is 7
After pop: ['a', 'b', 'a', 'c', 'b', 'a']
Frequency: {'a': 3, 'b': 2, 'c': 1}
Now suppose I want to offer list-style indexing on something that is not a list subclass, for example a binary tree node:
class BinaryNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

How can this class be made to act like a sequence? Indexing such as:
bar = [1, 2, 3]
bar[0]
# is really a call to:
bar.__getitem__(0)
So it is enough to provide a __getitem__ implementation. This one walks the tree depth first, in order (left subtree, node, right subtree), counting positions until it reaches the requested index.
class IndexableNode(BinaryNode):
    def _traverse(self):
        if self.left is not None:
            yield from self.left._traverse()
        yield self
        if self.right is not None:
            yield from self.right._traverse()

    def __getitem__(self, index):
        for i, item in enumerate(self._traverse()):
            if i == index:
                return item.value
        raise IndexError(f'Index {index} is out of range')
The binary tree can be built as usual:
tree = IndexableNode(
    10,
    left=IndexableNode(
        5,
        left=IndexableNode(2),
        right=IndexableNode(
            6,
            right=IndexableNode(7))),
    right=IndexableNode(
        15,
        left=IndexableNode(11)))
It can then be accessed like a list, in addition to being walked through left and right:
print('LRR is', tree.left.right.right.value)
print('Index 0 is', tree[0])
print('Index 1 is', tree[1])
print('11 in the tree?', 11 in tree)
print('17 in the tree?', 17 in tree)
print('Tree is', list(tree))
>>>
LRR is 7
Index 0 is 2
Index 1 is 5
11 in the tree? True
17 in the tree? False
Tree is [2, 5, 6, 7, 10, 11, 15]
The problem is that implementing __getitem__ alone does not provide everything a list offers. For example:
len(tree)
>>>
Traceback ...
TypeError: object of type 'IndexableNode' has no len()
That requires implementing __len__ as well:
class SequenceNode(IndexableNode):
    def __len__(self):
        for count, _ in enumerate(self._traverse(), 1):
            pass
        return count

tree = SequenceNode(
    10,
    left=SequenceNode(
        5,
        left=SequenceNode(2),
        right=SequenceNode(
            6,
            right=SequenceNode(7))),
    right=SequenceNode(
        15,
        left=SequenceNode(11))
)
print('Tree length is', len(tree))
>>>
Tree length is 7
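Unfortunately, the count and index methods that a sequence is expected to have are still missing, which is what makes defining your own container types hard. To avoid this difficulty, the collections.abc module provides a set of abstract base classes. If a subclass forgets to implement a required method, instantiating it fails with a message that names what is missing:
from collections.abc import Sequence

class BadType(Sequence):
    pass

foo = BadType()
>>>
Traceback ...
TypeError: Can't instantiate abstract class BadType with abstract methods __getitem__, __len__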
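Once the required methods are implemented (as SequenceNode does), also inheriting from Sequence supplies the additional methods, such as index and count, for free:
class BetterNode(SequenceNode, Sequence):
    pass

tree = BetterNode(
    10,
    left=BetterNode(
        5,
        left=BetterNode(2),
        right=BetterNode(
            6,
            right=BetterNode(7))),
    right=BetterNode(
        15,
        left=BetterNode(11))
)
print('Index of 7 is', tree.index(7))
print('Count of 10 is', tree.count(10))
>>>
Index of 7 is 3
Count of 10 is 1
There are more abstract base classes, such as Set and MutableMapping, that can be implemented to match Python's built-in container types. The same applies to the special methods used for comparison and sorting (see Item73).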
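As a minimal sketch of the same idea with another abstract base class, the container below implements only the three methods that `collections.abc.Set` requires and inherits the set operators from it. `SortedUniqueList` is a hypothetical name; the constructor accepts any iterable because the inherited operators build their results by calling the class with an iterable.

```python
from collections.abc import Set


class SortedUniqueList(Set):
    """Hypothetical set-like container that keeps its items sorted."""

    def __init__(self, iterable=()):
        self._items = sorted(set(iterable))

    # The three abstract methods required by collections.abc.Set:
    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __repr__(self):
        return f'SortedUniqueList({self._items})'


a = SortedUniqueList([3, 1, 2, 3])
b = SortedUniqueList([2, 3, 4])

# None of these operators are defined above; Set provides them.
both = a & b
either = a | b
print(both, either)
print(a <= either, a.isdisjoint(SortedUniqueList([9])))
```

The operators return new `SortedUniqueList` instances, so the results stay sorted without any extra code.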