I have query results from Elasticsearch in the following format:
[
  {
    "_index": "product",
    "_type": "_doc",
    "_id": "23234sdf",
    "_score": 2.2295187,
    "_source": {
      "SERP_KEY": "",
      "r_variant_info": "",
      "s_asin": "",
      "pid": "394",
      "r_gtin": "00838128000547",
      "additional_attributes_remarks": "publisher:0|size:0",
      "s_gtin": "",
      "r_category": "",
      "confidence_score": "2.4545",
      "title_match": "45.45"
    }
  },
  {
    "_index": "product",
    "_type": "_doc",
    "_id": "23234sdf",
    "_score": 2.2295187,
    "_source": {
      "SERP_KEY": "",
      "r_variant_info": "",
      "s_asin": "",
      "pid": "394",
      "r_gtin": "00838128000547",
      "additional_attributes_remarks": "publisher:0|size:0",
      "s_gtin": "",
      "r_category": "",
      "confidence_score": "2.4545",
      "title_match": "45.45"
    }
  }
]
I am trying to load the _source fields, together with _id, into a DataFrame.
I tried this:
import pandas as pd

def fetch_records_from_elasticsearch_index(index, filter_json):
    search_param = prepare_es_body(filter_json_dict=filter_json)
    response = settings.ES.search(index=index, body=search_param, size=10)
    if len(response['hits']['hits']) > 0:
        all_hits = response['hits']['hits']
        # return all_hits
        # export es hits to pandas dataframe
        df = pd.concat(map(pd.DataFrame.from_dict, all_hits), axis=1)['_source'].T
        return df
    else:
        return 0
The resulting df contains only the _source fields, but I also want to add the _id field to it.
This is the df output format:
{
  "AdminEdit": [
    "False",
    "False",
    "False",
    "False"
  ],
  "Group": [
    "Grp2",
    "Grp2",
    "Grp2",
    "Grp2"
  ]
}
How can I add _id?
There are two ways to solve this:
Direct approach
import pandas as pd
df = pd.json_normalize(all_hits)
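To illustrate what the direct approach produces, here is a minimal sketch. The two-record sample (trimmed to a couple of fields from the question) and the column clean-up step are assumptions added for illustration, not part of the original answer. json_normalize flattens the nested _source keys into columns prefixed with "_source.", so the prefix can be stripped if plain field names are preferred.

```python
import pandas as pd

# Hypothetical sample hits, trimmed to a few fields from the question's data.
all_hits = [
    {"_index": "product", "_id": "23234sdg", "_score": 2.2295187,
     "_source": {"pid": "394", "r_gtin": "00838128000547"}},
    {"_index": "product", "_id": "23234sdf", "_score": 2.2295187,
     "_source": {"pid": "394", "r_gtin": "00838128000547"}},
]

# Flatten every hit; nested keys become dotted column names such as "_source.pid".
df = pd.json_normalize(all_hits)

# Keep _id plus the _source fields, then drop the "_source." prefix.
prefix = "_source."
source_cols = [c for c in df.columns if c.startswith(prefix)]
df = df[["_id"] + source_cols]
df = df.rename(columns=lambda c: c[len(prefix):] if c.startswith(prefix) else c)
```

If the other metadata columns (_index, _score) are also wanted, the column selection step can simply be skipped.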
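Improving your code
import pandas as pd
# Concatenate the per-hit frames, keep only the _source columns, and transpose so each hit is one row
df = pd.concat(map(pd.DataFrame.from_dict, all_hits), axis=1)['_source'].T
# Add the _id of each hit as a new column (same order as all_hits)
df["_id"] = [i["_id"] for i in all_hits]
The JSON used is: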
all_hits = [
    {
        "_index": "product",
        "_type": "_doc",
        "_id": "23234sdg",
        "_score": 2.2295187,
        "_source": {
            "SERP_KEY": "",
            "r_variant_info": "",
            "s_asin": "",
            "pid": "394",
            "r_gtin": "00838128000547",
            "additional_attributes_remarks": "publisher:0|size:0",
            "s_gtin": "",
            "r_category": "",
            "confidence_score": "2.4545",
            "title_match": "45.45"
        }
    },
    {
        "_index": "product",
        "_type": "_doc",
        "_id": "23234sdf",
        "_score": 2.2295187,
        "_source": {
            "SERP_KEY": "",
            "r_variant_info": "",
            "s_asin": "",
            "pid": "394",
            "r_gtin": "00838128000547",
            "additional_attributes_remarks": "publisher:0|size:0",
            "s_gtin": "",
            "r_category": "",
            "confidence_score": "2.4545",
            "title_match": "45.45"
        }
    },
]
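As a further sketch that is not part of the original answer, the rows can also be built directly by merging each hit's _id into its _source dict before constructing the DataFrame. The helper name hits_to_dataframe and the trimmed sample data are hypothetical. This version assumes _source itself has no "_id" key (if it did, that value would overwrite the hit's _id), and it returns an empty DataFrame rather than 0 when there are no hits, so callers always get the same type back.

```python
import pandas as pd

def hits_to_dataframe(hits):
    """Build a DataFrame with one row per hit: _id plus all _source fields."""
    # Hypothetical helper: copy _id into a new dict alongside the _source fields.
    rows = [{"_id": hit["_id"], **hit.get("_source", {})} for hit in hits]
    return pd.DataFrame(rows)

# Hypothetical sample, trimmed to a few fields from the question's data.
sample_hits = [
    {"_index": "product", "_id": "23234sdg",
     "_source": {"pid": "394", "confidence_score": "2.4545"}},
    {"_index": "product", "_id": "23234sdf",
     "_source": {"pid": "394", "confidence_score": "2.4545"}},
]

df = hits_to_dataframe(sample_hits)
empty_df = hits_to_dataframe([])  # no hits: an empty DataFrame instead of 0
```

Inside the question's function, this would be called on response['hits']['hits'].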