My Approach to Reading Source Code - 六虎

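The project read in this article is web-server, a subproject of 500lines. 500 Lines or Less is not only a project but also a book of the same name, with both source code and prose. It is made up of independent chapters; in each one, an expert in the field tries to show a simple implementation of some feature or requirement in 500 lines of code or less. This article has the following parts:

Introduction
Project structure
Minimal HTTP service
Echo service
File service
Directory service and CGI service
Refactoring the service
Summary
Tips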

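Introduction

We have already worked heads-down through the source code of twelve projects, so it is time to step back and talk about how to read source code.

There are many Python projects, and quite a few excellent ones. Studying their source gives us a deeper understanding of their APIs and of how they work, both in principle and in detail. Merely knowing how to call a project's API will not satisfy those of us who want to advance. Personally, I find that reading books, doing exercises, and reinventing wheels are all less effective than reading source code. Learning is a progression from imitation to creation: read excellent source code, imitate it, then surpass it.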

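Choosing a suitable project also takes some skill. Here is my approach:

Keep the project small. Early on your skill is limited, and a project with little code is easier to read to the end. For the early stage, I suggest staying under 5,000 lines.
Pick projects that run vertically through one area, and gradually connect the whole chain. For example, around the different stages of an HTTP service we read gunicorn, wsgi, http-server, bottle and mako: from the server to the WSGI standard, from the web framework to the template engine.
Pick projects that can be compared horizontally. For the CLI, compare getopt and argparse; or compare how blinker differs from flask/django-signal.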

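Once a project is chosen, the next question is how to read its source. I call the method we have used so far skim reading. Concretely, it means analyzing only the core implementation of the project's main features, and ignoring auxiliary and enhancement features for the time being, so as not to get bogged down in detail. A simple example: "Rsaeecrh sowhs taht the oredr of ltteers deos not ncesesraliy afefct raednig; olny atfer fniishnig tihs snetnece do you ntoice taht the ltteers are all jmulbed." In the same way, once we understand a project's main features, we have reached a first goal.

Haha, happy April Fools' Day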

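Skim reading has one drawback: we learn that the code is implemented a certain way, but not why. So it is time to introduce another way of reading code: historical comparison. It compares changes in requirements against the version history, and from that learns how each requirement was implemented. In most projects, the commit messages in git log show the history and the requirements. The 500lines web-server project in this article ships its evolution as a series of examples, which makes it ideal for demonstrating historical comparison.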

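Project structure

This reading uses version fba689d1. The directory layout is shown in the table below:

Directory	Description
00-hello-web	Minimal HTTP service
01-echo-request-info	HTTP service that displays the request
02-serve-static	Static file service
03-handlers	HTTP file service with directory listings
04-cgi	CGI implementation
05-refactored	Refactored HTTP service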

Minimal HTTP service

The HTTP service is very simple. It is started like this:

serverAddress = ('', 8080)
server = BaseHTTPServer.HTTPServer(serverAddress, RequestHandler)
server.serve_forever()

A handler that responds only to GET requests:

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
    ...
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(self.Page)))
        self.end_headers()
        self.wfile.write(self.Page)
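
The snippets in this article target Python 2 (BaseHTTPServer). As a rough sketch only, assuming Python 3, where the same classes live in http.server, the hello server might look like the following. Binding to port 0 (so the OS picks a free port), the background thread, the silenced logging and the fetch helper are my own additions to make the demo self-contained; they are not part of the project.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class RequestHandler(BaseHTTPRequestHandler):
    # In Python 3 the response body must be bytes, not str.
    Page = b"<html>\n<body>\n<p>Hello, web!</p>\n</body>\n</html>\n"

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(self.Page)))
        self.end_headers()
        self.wfile.write(self.Page)

    def log_message(self, format, *args):
        # Silence the default per-request logging for this demo.
        pass

def fetch(handler_class):
    """Hypothetical helper: serve one GET / on a free port, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), handler_class)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

status, body = fetch(RequestHandler)
print(status)
print(body.decode())
```

For a long-running server, the three start-up lines shown earlier carry over unchanged apart from the module name.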

The following sample request shows the service in action:

# curl -v http://127.0.0.1:8080
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.64.1
> Accept: */*
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Server: BaseHTTP/0.3 Python/2.7.16
< Date: Wed, 31 Mar 2021 11:57:03 GMT
< Content-type: text/html
< Content-Length: 49
<
<html>
<body>
<p>Hello, web!</p>
</body>
</html>
* Closing connection 0

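This article does not go into how the details of the HTTP protocol are implemented. If you want those details, see the second blog post, or my earlier [python http source code reading].

Echo service

The echo service evolves from the minimal HTTP service and echoes the user's request back. Comparing the two files shows what changed:

The change is concentrated in the implementation of do_GET. The picture may not be very clear, so I have pasted the code below: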

# hello
def do_GET(self):
    self.send_response(200)
    ...
    self.wfile.write(self.Page)

# echo
def do_GET(self):
    page = self.create_page()
    self.send_page(page)

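As you can see, echo's do_GET calls two methods, create_page and send_page. Those two short lines show the difference between echo and hello very clearly. Because echo has to read the client's request and output it back, a fixed page certainly cannot meet the requirement: the page must first be created from a template and then sent to the user. The body of hello's do_GET is refactored into the body of send_page, and the new create_page is very simple:

def create_page(self):
    values = {
        'date_time'   : self.date_time_string(),
        'client_host' : self.client_address[0],
        'client_port' : self.client_address[1],
        'command'     : self.command,
        'path'        : self.path
    }
    page = self.Page.format(**values)
    return page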
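
To look at create_page in isolation, here is a minimal sketch that fills a template with str.format. The template text and the sample values are assumptions of mine; only the placeholder names come from the method above, and the project's real Page template is not reproduced in this article.

```python
# Hypothetical template; only the placeholder names match create_page.
PAGE = """\
<html>
<body>
<table>
<tr><td>Date and time</td><td>{date_time}</td></tr>
<tr><td>Client host</td><td>{client_host}</td></tr>
<tr><td>Client port</td><td>{client_port}</td></tr>
<tr><td>Command</td><td>{command}</td></tr>
<tr><td>Path</td><td>{path}</td></tr>
</table>
</body>
</html>
"""

def create_page(values):
    # ** unpacks the dict so each key fills the placeholder of the same name.
    return PAGE.format(**values)

# Made-up request data standing in for the handler's attributes.
sample = {
    'date_time': 'Wed, 31 Mar 2021 11:57:03 GMT',
    'client_host': '127.0.0.1',
    'client_port': 54321,
    'command': 'GET',
    'path': '/index.html',
}
page = create_page(sample)
print(page)
```

Because the function depends only on its input dict, it can be tested without starting a server, which is the testing benefit the refactoring buys.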

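Read on its own, the echo code looks flat and unremarkable. Only by comparing hello and echo can you feel the master's craft. The code shows how to write readable code and how to implement a new requirement:

The function names create_page and send_page are clear and readable; the name alone tells you what each one does.
The create and send steps sit naturally at the same level. As a counterexample, rename the functions to create_page and _do_GET: the behavior is unchanged, yet it reads awkwardly.
The five lines that implemented do_GET in hello did not change at all; they were only moved into the new send_page function. From a testing point of view, test cases only need to be added for the part that changed (create_page).

The comparison was made with the command vimdiff 00-hello-web/server.py 01-echo-request-info/server.py; the diff tool provided by an IDE works too.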
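
If neither vimdiff nor an IDE is at hand, the same kind of comparison can be sketched with Python's standard difflib module. The two strings below are abbreviated stand-ins for the hello and echo versions of do_GET, not the full files.

```python
import difflib

hello = """\
def do_GET(self):
    self.send_response(200)
    self.wfile.write(self.Page)
"""

echo = """\
def do_GET(self):
    page = self.create_page()
    self.send_page(page)
"""

# unified_diff works on lists of lines; keepends=True preserves the newlines.
diff = list(difflib.unified_diff(
    hello.splitlines(keepends=True),
    echo.splitlines(keepends=True),
    fromfile='00-hello-web/server.py',
    tofile='01-echo-request-info/server.py',
))
print(''.join(diff))
```

To compare the real files, read each one with open(path).readlines() and pass the two lists in place of the split strings.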

File service

The file service can serve local HTML pages:

# Classify and handle request.
def do_GET(self):
    try:
        # Figure out what exactly is being requested.
        full_path = os.getcwd() + self.path
        # The file does not exist.
        if not os.path.exists(full_path):
            raise ServerException("'{0}' not found".format(self.path))
        # Serve the HTML file.
        elif os.path.isfile(full_path):
            self.handle_file(full_path)
        ...
    # Handle exceptions.
    except Exception as msg:
        self.handle_error(msg)

Handling files and errors:

def handle_file(self, full_path):
    try:
        with open(full_path, 'rb') as reader:
            content = reader.read()
        self.send_content(content)
    except IOError as msg:
        msg = "'{0}' cannot be read: {1}".format(self.path, msg)
        self.handle_error(msg)

def handle_error(self, msg):
    content = self.Error_Page.format(path=self.path, msg=msg)
    self.send_content(content)

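The directory also provides a status-code version. Again, compare the two:

If the file does not exist, the HTTP specification says a 404 error should be returned:

def handle_error(self, msg):
    content = ...
    self.send_content(content, 404)

def send_content(self, content, status=200):
    self.send_response(status)
    ...

This uses Python's support for default argument values to keep send_content stable: even if 30x or 50x responses are needed later, send_content itself does not have to change.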
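
The effect of the default value can be sketched without a real server. RecordingHandler below is a hypothetical stand-in that only records what would be sent; handle_server_error is an invented later requirement, included to show that a new status code adds a caller rather than a change to send_content.

```python
class RecordingHandler(object):
    """Hypothetical stand-in for the request handler: records instead of sending."""

    def __init__(self):
        self.sent = []

    def send_content(self, content, status=200):
        # status defaults to 200, so callers that predate the parameter still work.
        self.sent.append((status, content))

    def handle_file(self, content):
        self.send_content(content)            # implicit 200

    def handle_error(self, msg):
        self.send_content(msg, 404)           # explicit 404

    def handle_server_error(self, msg):
        # Assumed future requirement: only a new caller is needed.
        self.send_content(msg, status=500)

handler = RecordingHandler()
handler.handle_file('<p>ok</p>')
handler.handle_error('not found')
handler.handle_server_error('boom')
print(handler.sent)
```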

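Directory service and CGI service

The file service needs to be upgraded to support directories. Usually, if a directory contains index.html, that file is shown; if it does not, a directory listing is shown so that users can browse without typing file names by hand.

Again I put the iterations side by side in the figure below, mainly to show the changes to RequestHandler:

do_GET now has three kinds of logic to handle: HTML files, directories, and errors. Continuing with if-else would make the code ugly and hard to extend, so the strategy pattern is used here instead: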

# Strategies, in order
Cases = [case_no_file(),
         case_existing_file(),
         case_always_fail()]

# Classify and handle request.
def do_GET(self):
    try:
        # Figure out what exactly is being requested.
        self.full_path = os.getcwd() + self.path
        # Pick the first strategy that applies.
        for case in self.Cases:
            if case.test(self):
                case.act(self)
                break
    # Handle errors.
    except Exception as msg:
        self.handle_error(msg)

The three strategies, for a missing file, an HTML file, and the fallback error:

class case_no_file(object):
    '''File or directory does not exist.'''
    def test(self, handler):
        return not os.path.exists(handler.full_path)
    def act(self, handler):
        raise ServerException("'{0}' not found".format(handler.path))

class case_existing_file(object):
    '''File exists.'''
    def test(self, handler):
        return os.path.isfile(handler.full_path)
    def act(self, handler):
        handler.handle_file(handler.full_path)

class case_always_fail(object):
    '''Base case if nothing else worked.'''
    def test(self, handler):
        return True
    def act(self, handler):
        raise ServerException("Unknown object '{0}'".format(handler.path))
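
Because the loop stops at the first case whose test passes, the order of Cases is part of the logic. Here is a minimal sketch of that dispatch. FakeHandler and the dispatch function are hypothetical stand-ins for the real handler and do_GET, and each act is reduced to returning a label instead of sending a response.

```python
import os
import tempfile

class FakeHandler(object):
    """Hypothetical stand-in carrying only the attribute the cases read."""
    def __init__(self, full_path):
        self.full_path = full_path

class case_no_file(object):
    def test(self, handler):
        return not os.path.exists(handler.full_path)
    def act(self, handler):
        return 'not found'

class case_existing_file(object):
    def test(self, handler):
        return os.path.isfile(handler.full_path)
    def act(self, handler):
        return 'serve file'

class case_always_fail(object):
    def test(self, handler):
        return True          # catch-all, so it has to come last
    def act(self, handler):
        return 'unknown object'

CASES = [case_no_file(), case_existing_file(), case_always_fail()]

def dispatch(handler, cases=CASES):
    # The first matching case wins, mirroring the loop in do_GET.
    for case in cases:
        if case.test(handler):
            return case.act(handler)

with tempfile.TemporaryDirectory() as root:
    file_path = os.path.join(root, 'index.html')
    with open(file_path, 'w') as f:
        f.write('<p>hi</p>')
    results = [
        dispatch(FakeHandler(os.path.join(root, 'missing.html'))),
        dispatch(FakeHandler(file_path)),
        dispatch(FakeHandler(root)),   # a directory: only the catch-all matches
    ]
    # Putting the catch-all first shadows every other case.
    shadowed = dispatch(FakeHandler(file_path),
                        cases=[case_always_fail(), case_existing_file()])
print(results, shadowed)
```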

Directories are then easy to implement: just add the case_directory_index_file and case_directory_no_index_file strategies. CGI support works the same way: add a case_cgi_file strategy.

class case_directory_index_file(object):
    ...

class case_directory_no_index_file(object):
    ...

class case_cgi_file(object):
    ...
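
The bodies of these classes are elided in this article. Purely as a guess at their shape, and not as the project's actual code, the index-file case might test for a directory that contains index.html and then hand that file to handle_file. FakeHandler is a hypothetical stand-in that remembers which file it was asked to serve.

```python
import os
import tempfile

class FakeHandler(object):
    """Hypothetical stand-in: remembers which file it was asked to serve."""
    def __init__(self, full_path):
        self.full_path = full_path
        self.served = None
    def handle_file(self, full_path):
        self.served = full_path

class case_directory_index_file(object):
    '''Sketch: serve index.html for a directory that has one.'''
    def index_path(self, handler):
        return os.path.join(handler.full_path, 'index.html')
    def test(self, handler):
        return os.path.isdir(handler.full_path) and \
               os.path.isfile(self.index_path(handler))
    def act(self, handler):
        handler.handle_file(self.index_path(handler))

case = case_directory_index_file()
with tempfile.TemporaryDirectory() as root:
    matched_before = case.test(FakeHandler(root))   # no index.html yet
    with open(os.path.join(root, 'index.html'), 'w') as f:
        f.write('<p>index</p>')
    with_index = FakeHandler(root)
    matched_after = case.test(with_index)
    if matched_after:
        case.act(with_index)
    served_name = os.path.basename(with_index.served)
print(matched_before, matched_after, served_name)
```

In this sketch the case would need to sit before the catch-all in Cases; otherwise it would never be reached.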

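Refactoring the service

After the features were complete, the author refactored the code:

After refactoring, RequestHandler is much shorter and contains only the handling of HTTP protocol details. handle_error handles exceptions and returns a 404 error; send_content builds the HTTP response.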

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):

    # Classify and handle request.
    def do_GET(self):
        try:
            # Figure out what exactly is being requested.
            self.full_path = os.getcwd() + self.path
            # Figure out how to handle it.
            for case in self.Cases:
                if case.test(self):
                    case.act(self)
                    break
        # Handle errors.
        except Exception as msg:
            self.handle_error(msg)

    # Handle unknown objects.
    def handle_error(self, msg):
        content = self.Error_Page.format(path=self.path, msg=msg)
        self.send_content(content, 404)

    # Send actual content.
    def send_content(self, content, status=200):
        self.send_response(status)
        self.send_header("Content-type", "text/html")
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)

The request-handling strategies were refactored too: a base_case parent class fixes the template and the steps of handling a request, and provides a default method for reading HTML files.

class base_case(object):
    '''Parent for case handlers.'''

    def handle_file(self, handler, full_path):
        try:
            with open(full_path, 'rb') as reader:
                content = reader.read()
            handler.send_content(content)
        except IOError as msg:
            msg = "'{0}' cannot be read: {1}".format(full_path, msg)
            handler.handle_error(msg)

    def index_path(self, handler):
        return os.path.join(handler.full_path, 'index.html')

    def test(self, handler):
        assert False, 'Not implemented.'

    def act(self, handler):
        assert False, 'Not implemented.'
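
One caveat about assert False: Python strips assert statements when it is run with the -O flag, so under -O an unimplemented test or act would quietly return None. A common alternative, shown here as a sketch and not as the project's code, is to raise NotImplementedError. Both class names below are hypothetical.

```python
class base_case_strict(object):
    '''Hypothetical variant of base_case that fails loudly even under -O.'''
    def test(self, handler):
        raise NotImplementedError('test() must be overridden')
    def act(self, handler):
        raise NotImplementedError('act() must be overridden')

class case_half_done(base_case_strict):
    '''Overrides test() but forgets act().'''
    def test(self, handler):
        return True

case = case_half_done()
try:
    if case.test(None):
        case.act(None)
    outcome = 'no error'
except NotImplementedError as err:
    outcome = str(err)
print(outcome)
```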

The strategy for HTML files is then very simple: it implements the test function and the act function, and the act function reuses the parent class's file-handling method.

class case_existing_file(base_case):
    '''File exists.'''
    def test(self, handler):
        return os.path.isfile(handler.full_path)
    def act(self, handler):
        self.handle_file(handler, handler.full_path)

The longest strategy is the one for a directory without an index.html page:

class case_directory_no_index_file(base_case):
    '''Serve listing for a directory without an index.html page.'''

    # How to display a directory listing.
    Listing_Page = '''\
<html>
<body>
<ul>
{0}
</ul>
</body>
</html>
'''

    def list_dir(self, handler, full_path):
        try:
            entries = os.listdir(full_path)
            bullets = ['<li>{0}</li>'.format(e) for e in entries if not e.startswith('.')]
            page = self.Listing_Page.format('\n'.join(bullets))
            handler.send_content(page)
        except OSError as msg:
            # Use handler.path: the case object itself has no path attribute.
            msg = "'{0}' cannot be listed: {1}".format(handler.path, msg)
            handler.handle_error(msg)

    def test(self, handler):
        return os.path.isdir(handler.full_path) and \
               not os.path.isfile(self.index_path(handler))

    def act(self, handler):
        self.list_dir(handler, handler.full_path)

list_dir dynamically generates an HTML page that lists the entries of the directory.
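
The listing logic can be pulled out into a pure function for a closer look. The sketch below follows list_dir but takes the entries as an argument, and it additionally escapes each name with html.escape (Python 3). The escaping is my addition, since a file name containing < or & would otherwise be emitted as raw HTML; render_listing is a hypothetical name.

```python
import html

LISTING_PAGE = '''\
<html>
<body>
<ul>
{0}
</ul>
</body>
</html>
'''

def render_listing(entries):
    # Skip dotfiles, as list_dir does, and escape names before embedding them.
    bullets = ['<li>{0}</li>'.format(html.escape(e))
               for e in entries if not e.startswith('.')]
    return LISTING_PAGE.format('\n'.join(bullets))

page = render_listing(['.git', 'index.html', 'a<b>.txt'])
print(page)
```

With a real directory, the argument would be os.listdir(full_path), as in list_dir.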

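Summary

Using historical comparison, we read through the evolution of the 500lines web-server code and saw clearly how a file and directory service is built step by step.

The do_GET method of RequestHandler handles the HTTP request.
send_content writes the response, including the status code, the response headers and the body.
Read an HTML file and show the HTML page.
Show a directory.
Support CGI.

Along the way we also picked up examples of how to extend code, write maintainable code, and refactor. I hope you gained as much from it as I did.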

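Tips

As described above, request handling uses the strategy pattern. First, take a look at the strategy pattern implementation from the python-patterns project: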

class Order:
    def __init__(self, price, discount_strategy=None):
        self.price = price
        self.discount_strategy = discount_strategy

    def price_after_discount(self):
        if self.discount_strategy:
            discount = self.discount_strategy(self)
        else:
            discount = 0
        return self.price - discount

    def __repr__(self):
        fmt = "<Price: {}, price after discount: {}>"
        return fmt.format(self.price, self.price_after_discount())


def ten_percent_discount(order):
    return order.price * 0.10


def on_sale_discount(order):
    return order.price * 0.25 + 20


def main():
    """
    >>> Order(100)
    <Price: 100, price after discount: 100>
    >>> Order(100, discount_strategy=ten_percent_discount)
    <Price: 100, price after discount: 90.0>
    >>> Order(1000, discount_strategy=on_sale_discount)
    <Price: 1000, price after discount: 730.0>
    """

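ten_percent_discount gives 10% off, and on_sale_discount gives 25% off and then a further 20 off. Different orders can use different discount strategies. For example, the sample can be adapted as follows:

order_amount_list = [80, 100, 1000]
for amount in order_amount_list:
    # if/elif/else picks exactly one strategy per order; a break here
    # would leave the loop after the first order.
    if amount < 100:
        order = Order(amount)
    elif amount < 1000:
        order = Order(amount, discount_strategy=ten_percent_discount)
    else:
        order = Order(amount, discount_strategy=on_sale_discount)
    print(order)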

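The corresponding business rules are:

Orders under 100 get no discount.
Orders from 100 up to, but not including, 1000 get 10% off.
Orders of 1000 or more get 25% off and a further 20 off.

If we put the discount condition and the discount calculation into one class, the result looks much like web-server: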

class case_discount(object):
    def test(self, handler):
        # Discount condition
        ...
    def act(self, handler):
        # Compute the discount
        ...
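
Filling in that skeleton for the three discount rules gives a sketch like the following. The class names, the final_price function, and passing the bare amount in place of a handler are all assumptions made for this illustration.

```python
class case_small_order(object):
    '''Below 100: no discount.'''
    def test(self, amount):
        return amount < 100
    def act(self, amount):
        return 0

class case_medium_order(object):
    '''Below 1000: 10% off.'''
    def test(self, amount):
        return amount < 1000
    def act(self, amount):
        return amount * 0.10

class case_large_order(object):
    '''Everything else: 25% off plus a further 20.'''
    def test(self, amount):
        return True
    def act(self, amount):
        return amount * 0.25 + 20

# Ordered from the most specific case to the catch-all, like Cases in the web server.
CASES = [case_small_order(), case_medium_order(), case_large_order()]

def final_price(amount):
    # The first case whose test passes computes the discount.
    for case in CASES:
        if case.test(amount):
            return amount - case.act(amount)

prices = [final_price(amount) for amount in [80, 100, 1000]]
print(prices)
```

As in the web server, a new rule becomes a new class plus one entry in the list, and the loop itself stays untouched.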

Reference links

github.com/aosabook/50…
github.com/HT524/500Li…
shuhari.dev/blog/2020/0…
