Study Notes on "Foundations of Python Network Programming": Fetching a JSON Document with the Google Geocoding API

2018-01-13 20:33:20 | Source: cnblogs.com | Author: takefetter

Foundations of Python Network Programming, Third Edition (Chinese edition: 《Python网络编程》). The book's code can be downloaded by searching for fopnp on GitHub.

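Chapter 1 of the book uses the Google Maps API to look up the latitude and longitude of an address. For well-known reasons the API cannot be reached directly, so we need to go through a proxy.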

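The code in the book therefore has to be adapted to the local setup. The proxy on my machine listens on 127.0.0.1:1080. My code is below; change the proxy settings to match your own machine.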

Environment: Windows 10, Anaconda3, Python 3.6.3, PyCharm Edu 2017.3

Using a library:

#search1.py
from pygeocoder import Geocoder

if __name__ == '__main__':
    a = Geocoder()
    a.proxy = "127.0.0.1:1080"
    address = '207 N. Defiance St,Archbold,OH'
    print(a.geocode(address)[0].coordinates)

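This sets the proxy through the proxy attribute of Geocoder (install pygeocoder with pip first). Because the attribute lives on an instance, Geocoder has to be instantiated first; you cannot call print directly on the class as the book does.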

Application layer:

#search2.py
import requests

proxies = {"http": "http://127.0.0.1:1080", "https": "http://127.0.0.1:1080", }

def geocode(address):
    parameters = {'address': address, 'sensor': 'false'}
    base = 'http://maps.googleapis.com/maps/api/geocode/json'
    response = requests.get(base, params=parameters, proxies=proxies)
    answer = response.json()
    print(answer['results'][0]['geometry']['location'])

if __name__ == '__main__':
    geocode('207 N. Defiance St,Archbold, OH')

Here the proxy is set through the proxies parameter of requests.get.
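The requests version indexes straight into answer['results'][0], which raises IndexError when the API returns no results. Below is a minimal sketch of a more defensive extraction. It assumes the reply has the 'status' and 'results' shape seen in the JSON printed by search4.py further down; extract_location is a hypothetical helper name, not part of requests or of the book's code.

```python
def extract_location(answer):
    """Return (lat, lng) from a parsed geocoding reply, or None.

    Hypothetical helper: assumes the reply is a dict with a 'status'
    string and a 'results' list, as in the JSON shown in these notes.
    """
    if answer.get('status') != 'OK':
        return None
    results = answer.get('results') or []
    if not results:
        return None
    location = results[0]['geometry']['location']
    return location['lat'], location['lng']


if __name__ == '__main__':
    # Sample data trimmed from the reply shown in these notes.
    sample = {
        'status': 'OK',
        'results': [{'geometry': {'location': {'lat': 41.5219645,
                                               'lng': -84.3066496}}}],
    }
    print(extract_location(sample))
    # A reply without results yields None instead of raising IndexError.
    print(extract_location({'status': 'ZERO_RESULTS', 'results': []}))
```

In search2.py this could be called as extract_location(response.json()) in place of the direct indexing.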

Using the HTTP protocol directly:

# search3.py
import http.client
import json
from urllib.parse import quote_plus

base = '/maps/api/geocode/json'

def geocode(address):
    path = '{}?address={}&sensor=false'.format(base, quote_plus(address))
    connection = http.client.HTTPSConnection('127.0.0.1', 1080)
    connection.set_tunnel('map.google.com')
    connection.request('GET', path)
    rawreply = connection.getresponse().read()
    reply = json.loads(rawreply.decode('utf-8'))
    print(reply['results'][0]['geometry']['location'])

if __name__ == '__main__':
    geocode('207 N. Defiance St,Archbold, OH')

Running it produces:

Traceback (most recent call last):
  File "E:/Learn Python/Python网络编程/search3.py", line 21, in <module>
    geocode('207 N. Defiance St,Archbold, OH')
  File "E:/Learn Python/Python网络编程/search3.py", line 16, in geocode
    reply = json.loads(rawreply.decode('utf-8'))
  File "D:/Anaconda3/lib/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "D:/Anaconda3/lib/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:/Anaconda3/lib/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Process finished with exit code 1

This is clearly a json.decoder.JSONDecodeError: the request did not reach the API correctly, so the reply could not be decoded as JSON.

print(rawreply) shows that rawreply is an HTML document like this:

b'<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">\n<TITLE>301 Moved</TITLE></HEAD><BODY>\n<H1>301 Moved</H1>\nThe document has moved\n<A HREF="https://maps.google.com/maps/api/geocode/json?address=207+N.+Defiance+St%2CArchbold%2C+OH&sensor=false">here</A>.\r\n</BODY></HTML>\r\n'

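The server returned a 301 status, which means the request has to be redirected. Note that the tunnel was opened to map.google.com, while the link in the reply points to maps.google.com. http.client is a low-level library and, unlike a browser, does not follow redirects automatically. (My first guess was that this was some kind of anti-crawler behavior on Google's side.)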
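Comparing the host that was requested with the host in the redirect link can help explain a 301 like this one. The following is a small sketch using the URL from the HTML above; is_cross_host_redirect is a hypothetical helper name introduced only for illustration.

```python
from urllib.parse import urlsplit

# Host passed to set_tunnel() in search3.py.
requested_host = 'map.google.com'
# Link taken from the HREF attribute of the 301 page.
redirect_target = ('https://maps.google.com/maps/api/geocode/json'
                   '?address=207+N.+Defiance+St%2CArchbold%2C+OH&sensor=false')


def is_cross_host_redirect(requested_host, target_url):
    """Return True if the redirect target lives on a different host."""
    return urlsplit(target_url).hostname != requested_host


print(urlsplit(redirect_target).hostname)                       # maps.google.com
print(is_cross_host_redirect(requested_host, redirect_target))  # True
```

If the hosts differ, tunnelling to the target host from the start may avoid the redirect altogether, although that was not tested in these notes.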

So we extract the link from the reply with a regular expression (the method comes from https://www.cnblogs.com/rj81/p/5933838.html). The modified code:

# search3.py
import http.client
import json
from urllib.parse import quote_plus
import re

base = '/maps/api/geocode/json'

def geocode(address):
    path = '{}?address={}&sensor=false'.format(base, quote_plus(address))
    connection = http.client.HTTPSConnection('127.0.0.1', 1080)
    connection.set_tunnel('map.google.com')
    connection.request('GET', path)
    rawreply = connection.getresponse().read().decode()
    newweb = re.findall(r"HREF=\"(.+?)\"", string=rawreply)
    # print(newweb)
    connection.request('GET', newweb[0])
    rawreply = connection.getresponse().read()
    # print(path)
    # print(rawreply)
    reply = json.loads(rawreply.decode('utf-8'))
    print(reply['results'][0]['geometry']['location'])

if __name__ == '__main__':
    geocode('207 N. Defiance St, Archbold, OH')




Now the result is printed correctly:

{'lat': 41.5219645, 'lng': -84.3066496}

Process finished with exit code 0

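One thing to note: at first I assumed newweb was a str and passed it directly, as connection.request('GET', newweb).

That raised AttributeError: 'list' object has no attribute 'startswith', because re.findall returns a list. After changing the argument to newweb[0], the output was normal.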
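The difference between the list returned by re.findall and a single string can be seen in isolation. This is a minimal sketch using the link from the 301 page; re.search is shown only as one possible alternative when just the first match is wanted, not as the method used in search3.py.

```python
import re

html = ('<A HREF="https://maps.google.com/maps/api/geocode/json'
        '?address=207+N.+Defiance+St%2CArchbold%2C+OH&sensor=false">here</A>')

# re.findall always returns a list, even when there is exactly one match.
links = re.findall(r'HREF="(.+?)"', html)
print(type(links).__name__, len(links))  # list 1

# re.search returns a match object (or None); group(1) is the captured str.
match = re.search(r'HREF="(.+?)"', html)
first_link = match.group(1) if match else None
print(first_link == links[0])  # True
```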

Talking to Google Maps directly over a socket:

How to set the proxy (taken from http://www.jb51.net/article/50510.htm):

urllib2 (Python 2):

proxy_handler = urllib2.ProxyHandler({'http' : 'http://address:port'})
opener = urllib2.build_opener(proxy_handler, urllib2.HTTPHandler)
urllib2.install_opener(opener)

socket:

import socks, socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, "address", port)
socket.socket = socks.socksocket

The code:

#!/usr/bin/env python3
#search4.py
import socket
import socks
from urllib.parse import quote_plus

request_text = """\
GET /maps/api/geocode/json?address={}&sensor=false HTTP/1.1\r\n\
Host: maps.google.com:80\r\n\
User-Agent: search4.py (Foundations of Python Network Programming)\r\n\
Connection: close\r\n\
\r\n\
"""

def geocode(address):
    socks.set_default_proxy(socks.PROXY_TYPE_SOCKS5, "127.0.0.1", 1080)
    socket.socket = socks.socksocket
    sock = socket.socket()
    sock.connect(('maps.google.com', 80))
    request = request_text.format(quote_plus(address))
    sock.sendall(request.encode('ascii'))
    raw_reply = b''
    while True:
        more = sock.recv(4096)
        if not more:
            break
        raw_reply += more
    print(raw_reply.decode('utf-8'))

if __name__ == '__main__':
    geocode('207 N. Defiance St, Archbold, OH')

Output:

HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Date: Fri, 12 Jan 2018 07:21:20 GMT
Expires: Sat, 13 Jan 2018 07:21:20 GMT
Cache-Control: public, max-age=86400
Access-Control-Allow-Origin: *
Server: mafe
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Accept-Ranges: none
Vary: Accept-Language,Accept-Encoding
Connection: close

{
   "results" : [
      {
         "address_components" : [
            {
               "long_name" : "207",
               "short_name" : "207",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "North Defiance Street",
               "short_name" : "N Defiance St",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Archbold",
               "short_name" : "Archbold",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "German Township",
               "short_name" : "German Township",
               "types" : [ "administrative_area_level_3", "political" ]
            },
            {
               "long_name" : "Fulton County",
               "short_name" : "Fulton County",
               "types" : [ "administrative_area_level_2", "political" ]
            },
            {
               "long_name" : "Ohio",
               "short_name" : "OH",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "43502",
               "short_name" : "43502",
               "types" : [ "postal_code" ]
            },
            {
               "long_name" : "1160",
               "short_name" : "1160",
               "types" : [ "postal_code_suffix" ]
            }
         ],
         "formatted_address" : "207 N Defiance St, Archbold, OH 43502, USA",
         "geometry" : {
            "bounds" : {
               "northeast" : {
                  "lat" : 41.521994,
                  "lng" : -84.30646179999999
               },
               "southwest" : {
                  "lat" : 41.521935,
                  "lng" : -84.30683739999999
               }
            },
            "location" : {
               "lat" : 41.5219645,
               "lng" : -84.3066496
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 41.5233134802915,
                  "lng" : -84.30530061970849
               },
               "southwest" : {
                  "lat" : 41.5206155197085,
                  "lng" : -84.3079985802915
               }
            }
         },
         "place_id" : "ChIJk4BHnIy0PYgRXbKj5GjFe_U",
         "types" : [ "premise" ]
      }
   ],
   "status" : "OK"
}

Process finished with exit code 0
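Unlike the earlier versions, search4.py prints the whole raw reply, headers included. A minimal sketch of how the headers could be separated from the JSON body follows. parse_raw_reply is a hypothetical helper, and the sketch assumes the body is neither chunked nor compressed, which a real server does not guarantee; the sample bytes are a shortened version of the reply above.

```python
import json


def parse_raw_reply(raw_reply):
    """Split a raw HTTP/1.1 reply into (status_line, headers, body).

    Hypothetical helper: assumes no chunked transfer encoding and no
    content compression, so the body is everything after the blank line.
    """
    head, _, body = raw_reply.partition(b'\r\n\r\n')
    lines = head.decode('iso-8859-1').split('\r\n')
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Shortened sample of the reply printed by search4.py.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: application/json; charset=UTF-8\r\n'
          b'Connection: close\r\n'
          b'\r\n'
          b'{"results": [{"geometry": {"location": '
          b'{"lat": 41.5219645, "lng": -84.3066496}}}], "status": "OK"}')

status_line, headers, body = parse_raw_reply(sample)
reply = json.loads(body.decode('utf-8'))
print(status_line)                                   # HTTP/1.1 200 OK
print(headers['content-type'])
print(reply['results'][0]['geometry']['location'])
```

In search4.py the same function could be applied to raw_reply once the receive loop has finished.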
