I wanted to download a file with the requests library.
Environment: Python 3.5.2 + requests + Windows 10
import requests

# Download a file with the requests library
url = 'https://www.gipsa.usda.gov/fgis/exportgrain/CY2016.csv'
r = requests.get(url)
print(r.content)
with open("myCY2016.csv", "wb") as code:
    code.write(r.content)
But it failed with an error.
File "C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\adapters.py", line 497, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)
I found this explanation on Stack Overflow:
The problem you are having is caused by an untrusted SSL certificate.
Like @dirk mentioned in a previous comment, the *quickest* fix is setting verify=False.
Please note that this will cause the certificate not to be verified. **This will expose your application to security risks, such as man-in-the-middle attacks.**
Of course, apply judgment. As mentioned in the comments, this *may* be acceptable for quick/throwaway applications/scripts, *but really should not go to production software*.
If just skipping the certificate check is not acceptable in your particular context, consider the following options. Your best option is to set the verify parameter to a string that is the path of the .pem file of the certificate (which you should obtain by some sort of secure means).
So, as of version 2.0, the verify parameter accepts the following values, with their respective semantics:
True: causes the certificate to be validated against the library's own trusted certificate authorities (Note: you can see which Root Certificates Requests uses via the Certifi library, a trust database of RCs extracted from Requests: [Certifi - Trust Database for Humans](http://certifiio.readthedocs.org/en/latest/)).
False: bypasses certificate validation *completely*.
Path to a CA_BUNDLE file for Requests to use to validate the certificates.
Source: [Requests - SSL Cert Verification](http://docs.python-requests.org/en/master/user/advanced/?highlight=ssl#ssl-cert-verification)
Also take a look at the cert parameter on the same link.
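As a minimal sketch of the third option in the quoted answer, the download can leave verification on and point verify at a CA bundle file. Here `certifi.where()` supplies the bundle path and `download_file` is a hypothetical helper name. Whether this particular server's certificate chain validates against that bundle is an assumption I have not checked; if it does not, you would pass the path of a .pem file obtained by secure means instead.

```python
import certifi
import requests


def download_file(url, dest, ca_bundle=None, timeout=30):
    """Download url to dest with certificate verification left on."""
    # Fall back to certifi's CA bundle when no explicit .pem path is given.
    verify = ca_bundle or certifi.where()
    response = requests.get(url, verify=verify, timeout=timeout)
    # Raise an exception for 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()
    with open(dest, "wb") as out:
        out.write(response.content)
    return dest


# Example call (performs a real network request, so it is left commented out):
# download_file("https://www.gipsa.usda.gov/fgis/exportgrain/CY2016.csv", "myCY2016.csv")
```

Passing a path keeps the protection against man-in-the-middle attacks that verify=False gives up.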
OK, I did not fully understand all of that, but I set the parameter verify=False and carried on.
C:\Users\admin\AppData\Local\Programs\Python\Python35-32\lib\site-packages\requests\packages\urllib3\connectionpool.py:843:
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised.
See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
Then I found this on Stack Overflow as well:
The reason doing urllib3.disable_warnings() didn't work for you is because it looks like you're using a separate instance of urllib3 vendored inside of requests.
I gather this based on the path here: /usr/lib/python2.6/site-packages/requests/packages/urllib3/connectionpool.py
To disable warnings in requests' vendored urllib3, you'll need to import that specific instance of the module:
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
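The quoted snippet switches the warning off for the whole process. A narrower alternative, sketched here with the standard-library warnings module, ignores it only around one call. The name `insecure_warning_silenced` is hypothetical, and this assumes the warning is raised through the normal `warnings.warn` mechanism.

```python
import contextlib
import warnings

from requests.packages.urllib3.exceptions import InsecureRequestWarning


@contextlib.contextmanager
def insecure_warning_silenced():
    """Ignore InsecureRequestWarning only inside the with-block."""
    # catch_warnings restores the previous filter list when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", InsecureRequestWarning)
        yield


# Usage (performs a real network request, so it is left commented out):
# with insecure_warning_silenced():
#     r = requests.get(url, verify=False)
```

Outside the with-block the warning still appears, so any other unverified request in the program stays visible.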
OK, it ran successfully.
Here is the final code:
import requests
from requests.packages.urllib3.exceptions import InsecureRequestWarning

# Silence the warning that verify=False triggers
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

# Download a file with the requests library
url = 'https://www.gipsa.usda.gov/fgis/exportgrain/CY2016.csv'
r = requests.get(url, verify=False)
print(r.content)
with open("myCY2016.csv", "wb") as code:
    code.write(r.content)
However, the download seemed extremely slow.
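So here come the questions, and there are plenty of them. What exactly is SSL? Why does the same GET request trigger a download straight away in the browser, while in my simulated request the data just ends up in the body? Perhaps some header field tells the browser to save the contents of the body to disk? And why is HTTP data sometimes said to be transferred as bytes and sometimes as characters, when it all becomes a byte stream by the time it reaches the network layer anyway? Perhaps the conversion happens at some layer along the way?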
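On the download question, one likely candidate is the Content-Disposition response header: a value of attachment asks the browser to save the body instead of displaying it, while requests simply hands the body over either way. The sketch below parses such a header with the standard library only. The header value is made up for illustration, since I did not inspect what this server actually sends, and `describe_disposition` is a hypothetical helper name. The last lines touch on the bytes question: the body always travels as bytes, and text is those bytes decoded with a charset.

```python
from email.message import Message


def describe_disposition(header_value):
    """Return (disposition, filename) parsed from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_content_disposition(), msg.get_filename()


# Hypothetical header value; the real server's headers were not inspected.
print(describe_disposition('attachment; filename="CY2016.csv"'))

# The body is always bytes on the wire. Decoding with a charset turns it into
# text, which is the difference between r.content (bytes) and r.text (str).
raw = "grain,穀物\n".encode("utf-8")
print(type(raw).__name__, type(raw.decode("utf-8")).__name__)
```

With requests, the actual header can be checked with `r.headers.get("Content-Disposition")` after the call.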
Also, Stack Overflow really is amazing, isn't it!?