I am trying to use the proxy feature of the Python requests library, but the data returned from requests made via proxies is incorrect: the page text is still in English when it should be in the localized language. Is there a way to verify that the proxy is actually being used?
import requests

agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/XX.X.XXXX.XX Safari/537.36"
proxy_list = {
    'South Korea': 'http://1.241.102.9:3128',
    'Sweden': 'http://79.136.65.150:80',
    'Russia': 'http://77.236.87.175:80',
    'Japan': 'http://153.149.158.149:3128',
    'Germany': 'http://213.136.89.121:80',
}
# Check app availability via each proxy
# (url is assumed to be defined earlier in the script)
for proxy_country, proxy_val in proxy_list.items():
    proxyDict = {"http": proxy_val}
    try:
        req = requests.get(url, headers={'user-agent': agent}, proxies=proxyDict, timeout=5)
    except requests.exceptions.RequestException:
        # Connection errors, timeouts and other request failures end up here
        print("COULD NOT DETERMINE AVAILABILITY FOR: %s" % proxy_country)
    else:
        print("%s : %s" % (proxy_country, req.status_code))
The easiest way to verify whether requests is using a proxy is to enable debug logging. The requests module logs a variety of interesting information at DEBUG priority, so just do:
import logging
logging.basicConfig(level='DEBUG')
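If turning on DEBUG for the whole program is too noisy, the same messages can be enabled for the HTTP layer only. The following is a minimal sketch, not part of the original answer: the helper name `enable_connection_logging` is hypothetical, and it assumes the connection messages come from the `urllib3` logger (recent requests releases) or from `requests.packages.urllib3` (older releases that bundled urllib3, as in the log output shown in this answer).

```python
import logging


def enable_connection_logging(stream=None):
    """Send DEBUG messages from the HTTP connection layer to `stream`.

    Hypothetical helper: only the urllib3 loggers are switched to DEBUG,
    the rest of the program keeps its current log level.
    """
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    # Logger names are an assumption covering both old and new requests releases.
    for name in ("urllib3", "requests.packages.urllib3"):
        logger = logging.getLogger(name)
        logger.setLevel(logging.DEBUG)
        logger.addHandler(handler)
    return handler


if __name__ == "__main__":
    enable_connection_logging()
    # Any requests.get(...) call made after this point logs its connections.
```

Calling the helper once at startup should be enough, since handlers attached to a parent logger also receive records from child loggers such as `urllib3.connectionpool`.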
Here's my simple test script:
#!/usr/bin/env python
import sys
import logging
import requests
logging.basicConfig(level='DEBUG')
res = requests.get(sys.argv[1])
res.raise_for_status()
If I run this:
$ python reqtest.py http://lwn.net/
I see:
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): lwn.net
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 9098
But if I enable a proxy:
$ http_proxy=http://squid.corp.example.com:3128 python reqtest.py http://lwn.net/
I clearly see that requests is connecting to the proxy, rather than directly to the remote system:
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): squid.corp.example.com
DEBUG:requests.packages.urllib3.connectionpool:"GET http://lwn.net/ HTTP/1.1" 200 9098
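The telling difference between the two log excerpts is the request target: a direct request logs a relative path (`GET /`), while a request sent to an HTTP proxy logs the absolute URL (`GET http://lwn.net/`). Here is a rough sketch of checking that automatically. The names `REQUEST_LINE` and `went_through_proxy` are made up for illustration, and the heuristic assumes plain http:// URLs; HTTPS requests are tunnelled through the proxy with CONNECT, so their logged request line stays relative.

```python
import re

# Matches the quoted request line found in the DEBUG log messages,
# for example "GET http://lwn.net/ HTTP/1.1".
REQUEST_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+"')


def went_through_proxy(log_line):
    """Classify one log line (hypothetical helper, plain HTTP only).

    Returns True if the request target is an absolute URL (proxy style),
    False if it is a relative path (direct connection), and None if the
    line contains no request line at all.
    """
    match = REQUEST_LINE.search(log_line)
    if match is None:
        return None
    return match.group("target").startswith(("http://", "https://"))


direct = 'DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 200 9098'
proxied = 'DEBUG:requests.packages.urllib3.connectionpool:"GET http://lwn.net/ HTTP/1.1" 200 9098'

print(went_through_proxy(direct))   # False
print(went_through_proxy(proxied))  # True
```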
I see the same behavior if I modify the code like this:
#!/usr/bin/env python
import sys
import logging
import requests
logging.basicConfig(level='DEBUG')
res = requests.get(sys.argv[1],
                   proxies=dict(http='http://squid.corp.example.com:3128'))
res.raise_for_status()
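One thing worth checking when a proxy seems to be ignored: the `proxies` mapping is keyed by URL scheme, so a dictionary with only an `http` entry does not apply to `https://` URLs. The sketch below is a simplified model of that lookup, not the library's own code. `proxy_for` is a hypothetical helper, and it ignores the more specific keys and the environment variables that requests also honours.

```python
from urllib.parse import urlsplit


def proxy_for(url, proxies):
    """Return the proxy a scheme-keyed mapping provides for `url`, or None.

    Simplified illustration: only the plain scheme key ("http", "https")
    is consulted.
    """
    scheme = urlsplit(url).scheme.lower()
    return proxies.get(scheme)


proxy = "http://squid.corp.example.com:3128"
http_only = {"http": proxy}
both = {"http": proxy, "https": proxy}

print(proxy_for("http://lwn.net/", http_only))   # http://squid.corp.example.com:3128
print(proxy_for("https://lwn.net/", http_only))  # None
print(proxy_for("https://lwn.net/", both))       # http://squid.corp.example.com:3128
```

If the target URL uses https, passing a mapping like `both` to `requests.get(..., proxies=both)` is the likely fix.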