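I have an XML file containing a number of entries identified by specific keywords. I need to run a for loop over the entries and extract two different keywords from each one, so that they can be used as variables inside the loop.
Here is a sample of list.xml: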
<?xml version="1.0" encoding="UTF-8"?>
<responses type="C-FIND">
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 192</element>
<element tag="0008,0052" vr="CS" vm="1" len="6" name="QueryRetrieveLevel">STUDY</element>
<element tag="0008,0054" vr="AE" vm="1" len="8" name="RetrieveAETitle">PLATONE</element>
<element tag="0010,0010" vr="PN" vm="1" len="16" name="PatientName">Anon^1600373003</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20181217085753.1484038.1</element>
</data-set>
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0008,0005" vr="CS" vm="1" len="10" name="SpecificCharacterSet">ISO_IR 192</element>
<element tag="0008,0052" vr="CS" vm="1" len="6" name="QueryRetrieveLevel">STUDY</element>
<element tag="0008,0054" vr="AE" vm="1" len="8" name="RetrieveAETitle">PLATONE</element>
<element tag="0010,0010" vr="PN" vm="1" len="16" name="PatientName">Anon^1599844862</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20180925142630.1456727.1</element>
</data-set>
</responses>
I need to extract the keywords "PatientName" and "StudyInstanceUID". I tried something like this:
grep -A2 -i "PatientName" list.xml | while read -r string ; do
PatientName="$(echo $string | grep -i "PatientName" | cut -d ">" -f 2 | cut -d "<" -f 1)"
StudyInstanceUID="$(echo $string | grep -i "StudyInstanceUID" | cut -d ">" -f 2 | cut -d "<" -f 1)"
echo "$PatientName"
echo "$StudyInstanceUID"
done
The problem is that I get a lot of empty lines! What is wrong?
[EDIT] From this sample I would like to get the following:
Anon^1600373003
1.3.76.13.99972.2.20181217085753.1484038.1
Anon^1599844862
1.3.76.13.99972.2.20180925142630.1456727.1
Thanks a lot.
Ivan
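On the empty lines asked about above: `grep -A2` emits each matching line plus the two lines that follow it, and GNU grep also prints a `--` separator between non-adjacent groups. The loop runs once per emitted line and always executes both `echo` commands, so every line yields at least one empty line, and the `</data-set>` and `--` lines yield two. The sketch below reproduces this on a trimmed two-entry sample; the file name sample.xml and the trimmed content are assumptions made for illustration, and the `--` separator is grep behavior that may differ between implementations.

```shell
# Trimmed stand-in for list.xml (hypothetical file name: sample.xml).
cat > sample.xml <<'EOF'
<responses type="C-FIND">
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0010,0010" vr="PN" vm="1" len="16" name="PatientName">Anon^1600373003</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20181217085753.1484038.1</element>
</data-set>
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0010,0010" vr="PN" vm="1" len="16" name="PatientName">Anon^1599844862</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20180925142630.1456727.1</element>
</data-set>
</responses>
EOF

# Number of lines grep hands to the while loop (matches, context, separator).
grep_lines=$(grep -A2 -i "PatientName" sample.xml | wc -l)

# Run the loop from the question unchanged and count its empty output lines.
blank_lines=$(grep -A2 -i "PatientName" sample.xml | while read -r string ; do
  PatientName="$(echo $string | grep -i "PatientName" | cut -d ">" -f 2 | cut -d "<" -f 1)"
  StudyInstanceUID="$(echo $string | grep -i "StudyInstanceUID" | cut -d ">" -f 2 | cut -d "<" -f 1)"
  echo "$PatientName"
  echo "$StudyInstanceUID"
done | grep -c '^$')

echo "lines fed to the loop: $grep_lines"
echo "empty lines printed: $blank_lines"
```

With GNU grep this should report 7 lines fed to the loop and 10 empty lines printed, against only 4 useful values.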
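As Raman mentioned in the comments, using an XML-aware tool to parse XML data is probably your best bet, especially if some of your XML may not be formatted as nicely as shown in the question (e.g., everything on a single line).
Assumptions:
- PatientName and StudyInstanceUID do not appear inside larger strings (e.g., LastPatientName or PreviousStudyInstanceUID)
- the PatientName element is always listed before the StudyInstanceUID element
One awk solution that eliminates the need for all of the sub-process calls to echo, grep and cut: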
awk -F'[<>]' ' # define input field separators as "<" and ">"
/PatientName/ || /StudyInstanceUID/ { print $3 } # if we find one of our search strings then print field #3
' list.xml
The same as a one-liner, without comments:
awk -F'[<>]' '/PatientName/ || /StudyInstanceUID/ { print $3 }' list.xml
The above generates:
Anon^1600373003
1.3.76.13.99972.2.20181217085753.1484038.1
Anon^1599844862
1.3.76.13.99972.2.20180925142630.1456727.1
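If the keywords might appear inside longer names, one possible tightening is to test the attribute itself rather than the whole line. The sketch below assumes each element sits on its own line and carries a `name="..."` attribute as in the question's sample; the file strict_sample.xml and its LastPatientName element (tag, length and value included) are made up purely to exercise the match.

```shell
# Small stand-in for list.xml; the "LastPatientName" element is a made-up
# decoy used only to exercise the stricter match.
cat > strict_sample.xml <<'EOF'
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0010,0010" vr="PN" vm="1" len="16" name="PatientName">Anon^1600373003</element>
<element tag="9999,0001" vr="PN" vm="1" len="6" name="LastPatientName">Decoy1</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20181217085753.1484038.1</element>
</data-set>
EOF

# Field 2 holds the tag name and its attributes; require the exact
# name="..." attribute instead of matching the keyword anywhere on the line.
strict_out=$(awk -F'[<>]' '$2 ~ / name="(PatientName|StudyInstanceUID)"/ { print $3 }' strict_sample.xml)
printf '%s\n' "$strict_out"
```

The looser `/PatientName/` pattern would also print the decoy value here, while the attribute test skips it. This is still line-oriented text matching, not XML parsing, so it does not help with input that puts several elements on one line.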
As for capturing the output in variables (e.g., inside a while loop), we can make a few small changes, e.g.:
awk -F'[<>]' '
/PatientName/ { pn=$3 } # store field #3 in variable "pn"
/StudyInstanceUID/ { printf "%s %s\n", pn, $3 } # print data to stdout
' list.xml
This generates:
Anon^1600373003 1.3.76.13.99972.2.20181217085753.1484038.1
Anon^1599844862 1.3.76.13.99972.2.20180925142630.1456727.1
Feeding this to a while loop:
while read -r PatientName StudyInstanceUID
do
echo "+++++++++++++++++++"
echo "PatientName: ${PatientName}"
echo "StudyInstanceUID: ${StudyInstanceUID}"
done < <(awk -F'[<>]' ' /PatientName/ { pn=$3 } /StudyInstanceUID/ { printf "%s %s\n", pn, $3 } ' list.xml)
This generates:
+++++++++++++++++++
PatientName: Anon^1600373003
StudyInstanceUID: 1.3.76.13.99972.2.20181217085753.1484038.1
+++++++++++++++++++
PatientName: Anon^1599844862
StudyInstanceUID: 1.3.76.13.99972.2.20180925142630.1456727.1
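One caveat with the space-separated output: `read` splits on whitespace, so a patient name containing a space would spill into the second variable. A possible variant, sketched here under the assumption that the values never contain a tab, is to have awk emit a tab and to restrict `IFS` to a tab for `read`. The file tab_sample.xml and the name "Doe^John Q" are made-up test data, and the pipe is used only to keep the sketch runnable in any POSIX shell; the same `IFS` setting should work with the `< <(awk ...)` form.

```shell
# Stand-in for list.xml; the patient name containing a space is a made-up
# value used to show why the delimiter matters.
cat > tab_sample.xml <<'EOF'
<data-set xfer="1.2.840.10008.1.2.1" name="Little Endian Explicit">
<element tag="0010,0010" vr="PN" vm="1" len="10" name="PatientName">Doe^John Q</element>
<element tag="0020,000d" vr="UI" vm="1" len="42" name="StudyInstanceUID">1.3.76.13.99972.2.20181217085753.1484038.1</element>
</data-set>
EOF

tab=$(printf '\t')   # literal tab character, portable across POSIX shells

# awk joins the two values with a tab; read then splits on the tab only.
tab_out=$(awk -F'[<>]' '
/PatientName/      { pn=$3 }
/StudyInstanceUID/ { printf "%s\t%s\n", pn, $3 }
' tab_sample.xml |
while IFS="$tab" read -r PatientName StudyInstanceUID
do
    echo "PatientName: ${PatientName}"
    echo "StudyInstanceUID: ${StudyInstanceUID}"
done)
printf '%s\n' "$tab_out"
```

With the original space delimiter, the same input would leave PatientName as "Doe^John" and push "Q" onto the front of StudyInstanceUID.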