CentOS 7 with Nginx+uWSGI+Flask

Introduction

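Flask is a lightweight web application framework written in Python; because of its small footprint it is also called a micro-framework. Although a finished Flask application can be started with python3 api.py, that is only suitable for a test environment. A production service needs a stable, long-running server such as Apache or Nginx. Nginx is an HTTP server designed for performance; compared with Apache and lighttpd it uses less memory and is highly stable. Its other advantages include a highly modular design, modules that are simple to write, and concise configuration files.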

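In this article we build on the earlier post PyTorch 實戰 - 高鐵驗證碼辨識 (PyTorch in Practice: THSR Captcha Recognition) to implement a Flask API that recognizes captchas: it accepts an image and, once recognition finishes, returns the captcha code. We then deploy it on CentOS 7.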

[Figure: architecture]

Preparation

  • CentOS 7
  • Python 3.6

Implementation

Before starting, install Python 3.6 on CentOS 7 and upgrade pip3:

# Install Python 3
$ sudo yum -y install epel-release
$ sudo yum -y install python36

# Update pip3
$ sudo pip3 install --upgrade pip

Flask

Flask is a micro web framework written in Python. It is classified as a microframework because it does not require particular tools or libraries. It has no database abstraction layer, form validation, or any other components where pre-existing third-party libraries provide common functions. However, Flask supports extensions that can add application features as if they were implemented in Flask itself. Extensions exist for object-relational mappers, form validation, upload handling, various open authentication technologies and several common framework related tools. Extensions are updated far more frequently than the core Flask program.

Wikipedia: Flask (web framework)

Install Flask

First, install Flask with pip3:

$ sudo pip3 install flask

Create Flask Application

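Next, create a directory named ocr.holey.cc and, inside it, a file named api.py. The client converts the image to base64 and sends it to api.py in a GET request; once recognition finishes, the API returns the captcha code. Note that base64 output contains the characters +, / and =, which are not safe to place in a URL as-is (for example, + in a query string is decoded as a space), so before sending we apply the following substitution rules: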

Src-Char Dst-Char
+ -
/ _
= <empty>
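
The substitution table can be sketched in Python as follows. This is a minimal illustration, not code from the project: the helper names encode_for_url and decode_from_url are hypothetical. The result happens to match what the standard library's base64.urlsafe_b64encode produces once its trailing = padding is stripped.

```python
import base64


def encode_for_url(raw: bytes) -> str:
    """Apply the table: '+' -> '-', '/' -> '_', and drop '=' padding."""
    standard = base64.b64encode(raw).decode("ascii")
    return standard.replace("+", "-").replace("/", "_").rstrip("=")


def decode_from_url(token: str) -> bytes:
    """Reverse the table and restore the padding that was dropped."""
    standard = token.replace("-", "+").replace("_", "/")
    # Pad to a multiple of 4; adds nothing when the length already fits.
    standard += "=" * (-len(standard) % 4)
    return base64.b64decode(standard)


# Every byte value once, so all 64 base64 symbols are exercised.
sample = bytes(range(256))
token = encode_for_url(sample)
restored = decode_from_url(token)
```

The padding is computed as -len % 4 rather than 4 - len % 4, because the latter yields 4 (instead of 0) when the length is already a multiple of four.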

These substitutions are reversed in api.py:

import base64
import os
import re
import uuid
from flask import Flask, request
from io import BytesIO
from PIL import Image

from thsrc.captcha_ocr import CaptchaOCR as thsrc_co
from thsrc.image_transformer import ImageTransformer as thsrc_it


IMAGE_DOWNLOAD_DIR = './__tmp__/images/downloads'


app = Flask(__name__)


def base64_to_image(base64_str, download_dir):
    # Decode a base64 string and save it as a uniquely named PNG file.
    os.makedirs(download_dir, exist_ok=True)
    filename = '{}/{}.png'.format(download_dir, str(uuid.uuid4()))

    base64_data = base64_str.encode()
    # Restore the '=' padding removed by the client (0 to 3 characters).
    missing_padding = -len(base64_data) % 4
    if missing_padding:
        base64_data += b'=' * missing_padding

    with open(filename, 'wb') as f:
        f.write(base64.b64decode(base64_data))
    return filename


@app.route('/thsrc', methods=['GET'])
def thsrc():
    data = request.args.get('base64_str')
    if not data:
        return '----'

    # Reverse the URL-safe substitutions: '-' -> '+', '_' -> '/'.
    data = str(data).replace('-', '+').replace('_', '/')
    src_captcha = base64_to_image(data, IMAGE_DOWNLOAD_DIR)
    dst_captcha = src_captcha.replace('.png', '_dst.png')

    # Preprocess the image, then run the recognition model on the result.
    it = thsrc_it()
    it.Transform(src_captcha, dst_captcha)
    co = thsrc_co(dst_captcha, './thsrc/checkpoint/cnn_ckpt.t7')
    captcha_code = co.recognizing()

    # Clean up the temporary files.
    if os.path.isfile(src_captcha):
        os.remove(src_captcha)
    if os.path.isfile(dst_captcha):
        os.remove(dst_captcha)

    return '{}'.format(captcha_code)


if __name__ == '__main__':
    # Debug mode is for local testing only.
    app.debug = True
    app.run()

After creating api.py, run python3 api.py to check for errors, then open a browser and send a recognition request to the endpoint:

http://127.0.0.1:5000/thsrc?base64_str=
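
Instead of pasting the base64 string into a browser by hand, the request can be built from a script. The following is a rough sketch using only the standard library; the function names build_request_url and recognize are hypothetical, and the address assumes the Flask development server is listening on its default port as in the URL above.

```python
import base64
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Flask development server from api.py on its default port.
API_URL = "http://127.0.0.1:5000/thsrc"


def build_request_url(image_bytes, api_url=API_URL):
    """Encode the image as URL-safe, unpadded base64 and build the GET URL."""
    token = base64.urlsafe_b64encode(image_bytes).decode("ascii").rstrip("=")
    return "{}?{}".format(api_url, urlencode({"base64_str": token}))


def recognize(image_bytes, api_url=API_URL, timeout=10):
    """Send the request and return the response body as text."""
    url = build_request_url(image_bytes, api_url)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Three bytes chosen so that standard base64 would contain '+' and '/'.
example_url = build_request_url(b"\xfb\xff\xfe")
```

With the server running, recognize(open('captcha.png', 'rb').read()) would return the recognized code; captcha.png is a placeholder file name.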

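At this point, if the request passes through Nginx, Nginx will most likely respond with 414 Request-URI Too Large. The cause is that the GET request line is too long and exceeds large_client_header_buffers. To fix this, add the following parameters to the http block of /etc/nginx/nginx.conf and restart Nginx: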

$ sudo vi /etc/nginx/nginx.conf

...
http {
    ...
    # Fix 414 Request-URI Too Large
    client_header_buffer_size 512k;
    large_client_header_buffers 4 512k;
    client_max_body_size 100m;
    ...
}
...

$ sudo systemctl restart nginx
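
To get a feel for why the default is too small, the length of the request line can be estimated from the image size. The sketch below is an approximation under stated assumptions: it counts only the request line GET /thsrc?base64_str=... HTTP/1.1, treats one buffer as the limit for that line, and the function names are hypothetical. Nginx's documented default for large_client_header_buffers is 4 8k.

```python
def unpadded_base64_length(num_bytes):
    """Number of base64 characters for num_bytes of input, without '=' padding."""
    return (num_bytes * 4 + 2) // 3


def request_line_length(num_bytes, path="/thsrc?base64_str="):
    """Approximate length of 'GET <path><base64> HTTP/1.1' for an image."""
    return len("GET ") + len(path) + unpadded_base64_length(num_bytes) + len(" HTTP/1.1")


def fits_in_buffer(num_bytes, buffer_bytes):
    """True when the estimated request line fits in a single header buffer."""
    return request_line_length(num_bytes) <= buffer_bytes


DEFAULT_BUFFER = 8 * 1024      # Nginx default: large_client_header_buffers 4 8k
RAISED_BUFFER = 512 * 1024     # value used in the configuration above

small_image_ok = fits_in_buffer(4096, DEFAULT_BUFFER)
large_image_ok_default = fits_in_buffer(10000, DEFAULT_BUFFER)
large_image_ok_raised = fits_in_buffer(10000, RAISED_BUFFER)
```

Base64 inflates the payload by roughly a third, so a 10,000-byte image already needs more than 13,000 characters in the URL, well past a single 8k buffer.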

Once testing succeeds, install uWSGI and start the deployment.

uWSGI

uWSGI is a software application that “aims at developing a full stack for building hosting services”. It is named after the Web Server Gateway Interface (WSGI), which was the first plugin supported by the project.

uwsgi (all lowercase) is the native binary protocol that uWSGI uses to communicate with other servers.

uWSGI is often used for serving Python web applications in conjunction with web servers such as Cherokee and Nginx, which offer direct support for uWSGI’s native uwsgi protocol. For example, data may flow like this: HTTP client ↔ Nginx ↔ uWSGI ↔ Python app.

Wikipedia: uWSGI

Install uWSGI

Before installing uWSGI, install the libraries it depends on:

$ sudo yum -y install gcc python3-devel
$ sudo pip3 install uwsgi

Create uwsgi.ini File

Create a uwsgi.ini file in the project directory:

[uwsgi]
chdir = /home/wwwroot/ocr.holey.cc
socket = 127.0.0.1:8400
chmod-socket = 666
master = 1
processes = 2
callable = app
wsgi-file = api.py
daemonize = /home/wwwroot/ocr.holey.cc/uwsgi.log
stats = 127.0.0.1:8401
buffer-size = 32768
vacuum = true
die-on-term = true
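
The wsgi-file and callable options tell uWSGI which file to load and which WSGI callable inside it to invoke; a Flask application object is itself such a callable. As a rough illustration of that interface, here is a hand-written WSGI callable driven once with the standard library's wsgiref helpers. This is a stand-in, not the Flask app from api.py, and call_once is a hypothetical helper.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Minimal WSGI callable: receives the request environ, returns body chunks."""
    body = b"----"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_once(wsgi_app, path="/thsrc"):
    """Invoke a WSGI callable the way a server would, for a single GET request."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible GET request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body


status, body = call_once(app)
```

uWSGI performs the equivalent of call_once for every request it receives from Nginx over the socket configured above.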

Setup Website

Create Nginx conf File

Create an ocr.holey.cc.conf file under /etc/nginx/conf.d/:

server {
    listen 80;
    listen [::]:80;

    server_name ocr.holey.cc;
    root /home/wwwroot/ocr-thsrc;

    location / {
        include uwsgi_params;
        # Must match the socket address in uwsgi.ini
        uwsgi_pass 127.0.0.1:8400;
    }
}
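
Nginx can only reach the application if uwsgi_pass points at the same address as the socket option in uwsgi.ini. A small consistency check could look like the sketch below; the embedded configuration strings are abbreviated copies for illustration, and the function names are hypothetical.

```python
import configparser
import re

# Abbreviated sample configurations, for illustration only.
UWSGI_INI = """\
[uwsgi]
socket = 127.0.0.1:8400
callable = app
wsgi-file = api.py
"""

NGINX_CONF = """\
server {
    listen 80;
    location / {
        include uwsgi_params;
        uwsgi_pass 127.0.0.1:8400;
    }
}
"""


def uwsgi_socket(ini_text):
    """Return the socket address from the [uwsgi] section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return parser["uwsgi"]["socket"].strip()


def nginx_upstreams(conf_text):
    """Return every address given to a uwsgi_pass directive."""
    pattern = r"^\s*uwsgi_pass\s+([^;\s]+)\s*;"
    return re.findall(pattern, conf_text, flags=re.MULTILINE)


def addresses_match(ini_text, conf_text):
    """True when at least one uwsgi_pass exists and all equal the uWSGI socket."""
    target = uwsgi_socket(ini_text)
    upstreams = nginx_upstreams(conf_text)
    return bool(upstreams) and all(addr == target for addr in upstreams)


consistent = addresses_match(UWSGI_INI, NGINX_CONF)
```

In practice the two strings would be read from uwsgi.ini and the file under /etc/nginx/conf.d/ rather than embedded in the script.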

Once it is created, reload the Nginx configuration:

$ sudo nginx -s reload

Create Service

Next, for convenience, create an ocr.holey.cc.service unit to manage the website:

$ sudo vi /etc/systemd/system/ocr.holey.cc.service
[Unit]
Description=Captcha OCR

[Service]
WorkingDirectory=/home/wwwroot/ocr.holey.cc
ExecStart=/usr/local/bin/uwsgi --ini /home/wwwroot/ocr.holey.cc/uwsgi.ini
Restart=always
# Restart service after 10 seconds if the uwsgi service crashes:
RestartSec=10
SyslogIdentifier=ocr-python
Type=forking
User=wwwroot

Once it is created, reload the systemd unit files:

$ sudo systemctl daemon-reload

The website can now be started, stopped, and restarted through systemctl:

# Start website
$ sudo systemctl start ocr.holey.cc.service

# Stop website
$ sudo systemctl stop ocr.holey.cc.service

# Restart website
$ sudo systemctl restart ocr.holey.cc.service

References