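While working on a project that requests EPG data from a server, I found that the titles and descriptions in the JSON response come back garbled. At first I thought I might be running into an encoding problem, but decoding/encoding the text in various formats did not help.
Anyway, here is the code used to make the request, with some debug output: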
import os
import requests

# build_url is the asker's own helper (not shown here)
def get_short_epg(profile, stream_id, limit=1):
    epg_url = build_url("{0}/player_api.php".format(profile['server']),
                        {'username': profile['username'], 'password': profile['password'],
                         'action': 'get_short_epg', 'stream_id': stream_id, 'limit': limit})
    response = requests.get(epg_url)
    if response.status_code == 200:
        if profile['debug']:
            print(response.encoding)
            print(response.apparent_encoding)
            print(response.request.headers)
            print(response.json())
            print(response.content)
            # Dump the raw response text to data/short_epg.json for inspection
            data_dir = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'data')
            output_file = os.path.join(data_dir, 'short_epg.json')
            with open(output_file, 'w') as f:
                f.write(response.text)
    else:
        print("Failed with status code {0}".format(response.status_code))
It produces the following output:
None
ascii
{'User-Agent': 'python-requests/2.24.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
{'epg_listings': [{'id': '1037932301', 'epg_id': '34', 'title': 'QnJpdGFpbidzIEdvdCBUYWxlbnQ=', 'lang': 'en', 'start': '2020-10-03 20:00:00', 'end': '2020-10-03 22:00:00', 'description': 'QW50IGFuZCBEZWMgaG9zdCB0aGUgZmlmdGggc2VtaS1maW5hbCBvZiB0aGlzIHllYXIncyB0YWxlbnQgY29udGVzdCwgd2VsY29taW5nIGJhY2sgdGhlIGFjdHMgdGhhdCBpbXByZXNzZWQgdGhlIGp1ZGdlcyBkdXJpbmcgdGhlIGF1ZGl0aW9ucyBlcGlzb2RlcyBpbiB0aGUgc3ByaW5nLiBBbWFuZGEgSG9sZGVuLCBBbGVzaGEgRGl4b24sIERhdmlkIFdhbGxpYW1zIGFuZCBBc2hsZXkgQmFuam8gYXJlIG9uIHRoZSBqdWRnaW5nIGRlc2su', 'channel_id': 'ITV London', 'start_timestamp': '1601751600', 'stop_timestamp': '1601758800'}]}
b'{"epg_listings":[{"id":"1037932301","epg_id":"34","title":"QnJpdGFpbidzIEdvdCBUYWxlbnQ=","lang":"en","start":"2020-10-03 20:00:00","end":"2020-10-03 22:00:00","description":"QW50IGFuZCBEZWMgaG9zdCB0aGUgZmlmdGggc2VtaS1maW5hbCBvZiB0aGlzIHllYXIncyB0YWxlbnQgY29udGVzdCwgd2VsY29taW5nIGJhY2sgdGhlIGFjdHMgdGhhdCBpbXByZXNzZWQgdGhlIGp1ZGdlcyBkdXJpbmcgdGhlIGF1ZGl0aW9ucyBlcGlzb2RlcyBpbiB0aGUgc3ByaW5nLiBBbWFuZGEgSG9sZGVuLCBBbGVzaGEgRGl4b24sIERhdmlkIFdhbGxpYW1zIGFuZCBBc2hsZXkgQmFuam8gYXJlIG9uIHRoZSBqdWRnaW5nIGRlc2su","channel_id":"ITV London","start_timestamp":"1601751600","stop_timestamp":"1601758800"}]}'
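Suspecting that only the title and description text in the JSON might be encoded differently, I saved the JSON output to a file and tried encoding/decoding the loaded title value with the following: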
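import json
import encodings.aliases

with open('data/short_epg.json') as f:
    data = json.load(f)  # renamed from `list` to avoid shadowing the builtin
title = data['epg_listings'][0]['title']
print(title.encode())
# Brute-force every pair of codec aliases and print any result that differs
for enc in encodings.aliases.aliases:
    for enc2 in encodings.aliases.aliases:
        try:
            test = title.encode(enc2).decode(enc)
            if test not in title:
                print('{0} to {1} = {2}'.format(enc2, enc, test))
        except Exception:
            pass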
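But the only results that produced different output were strings of Chinese- or Arabic-looking characters. I also tried different servers in case this was an isolated problem, but got the same garbled data every time.
Just for reference, the title value "QnJpdGFpbidzIEdvdCBUYWxlbnQ=" should be "Britain's Got Talent". Does anyone know what kind of mis-encoding this is, or how to get the correct text value?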
It is base64-encoded. QnJpdGFpbidzIEdvdCBUYWxlbnQ=
becomes Britain's Got Talent
You can decode it with Python's base64 library.
import base64
print(base64.b64decode('QnJpdGFpbidzIEdvdCBUYWxlbnQ=').decode('utf-8'))
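To apply the same decoding to a whole response, something along these lines may work. This is only a sketch: the helper names decode_b64_text and decode_epg_listings are made up for illustration, and it assumes that the title and description fields are always base64-encoded UTF-8, which the sample output suggests but the server does not document here. Values that fail strict base64 or UTF-8 decoding are returned unchanged.

```python
import base64


def decode_b64_text(value):
    """Return the base64-decoded UTF-8 text, or the input unchanged if decoding fails."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:
        # binascii.Error and UnicodeDecodeError are both ValueError subclasses
        return value


def decode_epg_listings(payload, fields=("title", "description")):
    """Return copies of the listings with the given fields base64-decoded (hypothetical helper)."""
    decoded = []
    for listing in payload.get("epg_listings", []):
        item = dict(listing)  # shallow copy so the original payload is untouched
        for field in fields:
            if field in item:
                item[field] = decode_b64_text(item[field])
        decoded.append(item)
    return decoded


sample = {"epg_listings": [{"title": "QnJpdGFpbidzIEdvdCBUYWxlbnQ=", "lang": "en"}]}
print(decode_epg_listings(sample)[0]["title"])  # Britain's Got Talent
```

In the question's code this would be called as decode_epg_listings(response.json()). Note that the fallback is a heuristic: a short plain-text value that happens to be valid base64 would still be decoded, so it is safer to restrict fields to the ones known to be encoded.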