导入XML以查询Google表格

M·法赞·法鲁克

当我使用importxml应用类时,我没有数据,我已经通过使用右键单击页面源选项查看了源代码,并且sendencs在那里,我使用ctrl + f查找行并按如下所述尝试了xpath,但是它给出了#不适用,由于是首次导入XML谷歌表格查询,因此我需要在单元格E55中的单元格I55和J55中的日语和英语句子下面的快照中附加结果。

这些句子来自下面的搜索,第一句话总是用日语和英语使用,这是快照

我尝试了几种组合,但是在#NA节目中;

  1. =IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='s-jp']")

  2. =IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='entry entry-border sentences undefined ']/dd[@class='s-jp']")

Can anyone please assist as can I cant copy for 2000 letters both Japnese & English sentences which I need for translation class, much appreciated thanks

E.Wiest

Output :

JapEng

3 formulas to get the data (url is in cell D2).

For Kana in D4 :

=TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN("",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//text()[not(parent::rt)]|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))

For Romaji in E4 :

=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN(" ",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//rt|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))))

For English in F4 :

=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML(D2,"//div[@class='results-main-container']//dd/span|(//dd[@class='s-en'])[not(position()=1)]/@class")),"(\w)(  )(\w)","$1'$3"),"s-en","¤")," , ",", ")," . ",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),"¤"))))

To limit the array to 1 result, you can use something like :

=INDEX(one of the preceding formulas,1,1)

Output :

Index

EDIT : If you need something like this (word in a cell and first example retrieved. /!\ Limit the number of words to search. Each word = 3 IMPORTXML requests. So, for 20 words => 60 requests, leading to a slow sheet.)

Cell

In column B, copy-paste the words to search.

For Kana in cells C3,C4,C5,... the following formula :

=TEXTJOIN("",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//text()[not(parent::rt)]"))

For Romaji in cells D3,D4,D5,... the following formula :

=TEXTJOIN(" ",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//rt"))

For English in cells E3,E4,E5,... the following formula :

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dd/span)[1]")),"(\w)(  )(\w)","$1'$3"),"s-en","¤")," ,  ",", ")," .",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),". ",".")

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章