How should I parse this XML string in Python?

Question

My XML string is:

    xmlData = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
      <Cancelled>false</Cancelled>
      <MessageID>00000000-0000-0000-0000-000000000000</MessageID>
      <Queued>false</Queued>
      <SMSError>NoError</SMSError>
      <SMSIncomingMessages i:nil="true"/>
      <Sent>false</Sent>
      <SentDateTime>0001-01-01T00:00:00</SentDateTime>
    </SMSResponse>"""

I am trying to parse it and get the values of the tags (Cancelled, MessageID, SMSError, etc.). I am using Python's ElementTree library. So far I have tried something like this:

    import xml.etree.ElementTree as ET

    root = ET.fromstring(xmlData)
    print(root.find('Sent'))  # gives None
    for child in root:
        print(child.find('MessageID'))  # also gives None

However, I can print the tags with:

    for child in root:
        print(child.tag)  # for the Cancelled tag this prints {http://example.com}Cancelled

and their respective values with:

    for child in root:
        print(child.text)

How can I get something like this?

    print(child.Queued)  # would print false

That is, the way PHP lets me access them directly from the root:

    $xml = simplexml_load_string($data);
    $status = $xml->SMSError;

Martijn Pieters · Accepted Answer

Your document declares a default namespace, so every element belongs to it. You need to include that namespace when searching:

    root = ET.fromstring(xmlData)
    print(root.find('{http://example.com}Sent'))
    print(root.find('{http://example.com}MessageID'))

Output:

    <Element '{http://example.com}Sent' at 0x1043e0690>
    <Element '{http://example.com}MessageID' at 0x1043e0350>

The find() and findall() methods also take a namespace map. Any prefix you use in the search path is looked up in that map, which saves you some typing:

    nsmap = {'n': 'http://example.com'}
    print(root.find('n:Sent', namespaces=nsmap))
    print(root.find('n:MessageID', namespaces=nsmap))
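As a self-contained sketch of the namespace-map approach, the snippet below reads element text directly with findtext(), which takes the same `namespaces` argument. The variable names and the `NoSuchTag` lookup are only illustrative; the prefix `n` is arbitrary, and only the URI has to match the document.

```python
import xml.etree.ElementTree as ET

# Same document as in the question.
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Cancelled>false</Cancelled>
  <MessageID>00000000-0000-0000-0000-000000000000</MessageID>
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
  <SMSIncomingMessages i:nil="true"/>
  <Sent>false</Sent>
  <SentDateTime>0001-01-01T00:00:00</SentDateTime>
</SMSResponse>"""

# The prefix 'n' is arbitrary; it is resolved through this map.
nsmap = {'n': 'http://example.com'}

root = ET.fromstring(xml_data)

# findtext() returns the text of the first matching child,
# or the given default when nothing matches.
sms_error = root.findtext('n:SMSError', namespaces=nsmap)
missing = root.findtext('n:NoSuchTag', default='<missing>', namespaces=nsmap)

print(sms_error)  # NoError
print(missing)    # <missing>
```

Using findtext() with a default avoids the AttributeError you would get from calling `.text` on the None that find() returns for a missing element.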

tuomur · Answer

If you are using Python's standard XML library, you could use something like this:

    root = ET.fromstring(xmlData)
    namespace = 'http://example.com'

    def query(tree, nodename):
        # Build the fully qualified '{namespace}tag' name expected by find()
        return tree.find('{{{ex}}}{nodename}'.format(ex=namespace, nodename=nodename))

    queued = query(root, 'Queued')
    print(queued.text)
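If you would rather not spell out the namespace at all, newer versions of ElementTree (Python 3.8 and later) accept a `{*}` wildcard in the tag name. This is a different technique from the helper above, shown here as a minimal sketch on a trimmed copy of the question's document; it matches the tag in any namespace, so it is only appropriate when that looseness is acceptable.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document, for illustration only.
xml_data = (
    '<SMSResponse xmlns="http://example.com">'
    '<Queued>false</Queued>'
    '<SMSError>NoError</SMSError>'
    '</SMSResponse>'
)

root = ET.fromstring(xml_data)

# '{*}Queued' matches a Queued element in any (or no) namespace.
# Requires Python 3.8+.
queued = root.find('{*}Queued')
print(queued.text)  # false
```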

root · Answer

With lxml.etree:

    In [8]: import lxml.etree as et

    In [9]: doc = et.fromstring(xmlData)

    In [10]: ns = {'n': 'http://example.com'}

    In [11]: doc.xpath('n:Queued/text()', namespaces=ns)
    Out[11]: ['false']

With ElementTree you can do:

    import xml.etree.ElementTree as ET

    root = ET.fromstring(xmlData)
    ns = {'n': 'http://example.com'}
    root.find('n:Queued', namespaces=ns).text
    # Out[13]: 'false'

ATOzTOA · Answer

You can build a dictionary and get the values directly from it:

    tree = ET.fromstring(xmlData)
    root = {}
    for child in tree:
        # Drop the leading '{namespace}' part and keep only the local tag name
        root[child.tag.split("}")[1]] = child.text
    print(root["Queued"])
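To get closer to the attribute-style access asked for in the question (`child.Queued`), the same dictionary can be wrapped in a `types.SimpleNamespace`. This is only a sketch: the `local_name` helper and the `response` name are hypothetical, and it assumes every child tag is a valid Python identifier and appears once.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace

# Same document as in the question.
xml_data = """<SMSResponse xmlns="http://example.com" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
  <Cancelled>false</Cancelled>
  <MessageID>00000000-0000-0000-0000-000000000000</MessageID>
  <Queued>false</Queued>
  <SMSError>NoError</SMSError>
  <SMSIncomingMessages i:nil="true"/>
  <Sent>false</Sent>
  <SentDateTime>0001-01-01T00:00:00</SentDateTime>
</SMSResponse>"""


def local_name(tag):
    """Strip a leading '{uri}' from an ElementTree tag, if there is one."""
    return tag.rsplit('}', 1)[-1]


root = ET.fromstring(xml_data)

# Map each direct child's local tag name to its text, then expose
# the mapping as attributes.
response = SimpleNamespace(**{local_name(child.tag): child.text for child in root})

print(response.Queued)    # false
print(response.SMSError)  # NoError
```

Note that the values are still strings (`'false'`, not `False`), and an empty element such as `SMSIncomingMessages` ends up as `None`. Unlike `split("}")[1]`, `rsplit('}', 1)[-1]` also works for tags that carry no namespace.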