Generating dynamically nested JSON objects and arrays - Python

Question

As the title says, I am trying to generate a nested JSON object. In this case I have a for loop that reads data from a dictionary dic. Here is the code:

    f = open("test_json.txt", 'w')
    flag = False
    temp = ""
    start = "{\n\t\"filename\"" + " : \"" +initial_filename+"\",\n\t\"data\"" +" : " +" [\n"
    end = "\n\t]" +"\n}"
    f.write(start)
    for i, (key,value) in enumerate(dic.iteritems()):
        f.write("{\n\t\"keyword\":"+"\""+str(key)+"\""+",\n")
        f.write("\"term_freq\":"+str(len(value))+",\n")
        f.write("\"lists\":[\n\t")
        for item in value:
            f.write("{\n")
            f.write("\t\t\"occurance\" :"+str(item)+"\n")
            #Check last object
            if value.index(item)+1 == len(value):
                f.write("}\n")
                f.write("]\n")
            else:
                f.write("},") # close occurrence object
        # Check last item in dic
        if i == len(dic)-1:
            flag = True
        if(flag):
            f.write("}")
        else:
            f.write("},") #close lists object
            flag = False
    #check for flag
    f.write("]") #close lists array
    f.write("}")

The expected output is:

    {
        "filename": "abc.pdf",
        "data": [{
            "keyword": "irritation",
            "term_freq": 5,
            "lists": [{ "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 2 }]
        }, {
            "keyword": "bomber",
            "lists": [{ "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 2 }],
            "term_freq": 5
        }]
    }

But currently I am getting the following output:

    {
        "filename": "abc.pdf",
        "data": [{
            "keyword": "irritation",
            "term_freq": 5,
            "lists": [{ "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 2 },] // Here lies the problem "," before array(last element)
        }, {
            "keyword": "bomber",
            "lists": [{ "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 1 }, { "occurance": 2 },], // Here lies the problem "," before array(last element)
            "term_freq": 5
        }]
    }

Please help. I have tried to solve this but failed. Please do not mark it as a duplicate: I have already checked the other answers and they did not help at all.

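Edit 1: The input is basically taken from a dictionary dic whose mapping type is <String, List>, for example "irritation" => [1,3,5,7,8], where irritation is the key and maps to a list of page numbers. This is what the outer for loop reads: the key is the keyword and the value is the list of pages on which that keyword occurs.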

Edit 2:

    dic = collections.defaultdict(list) # declaring the variable dictionary
    dic[key].append(value) # inserting the values - useless to tell here
    for key in dic:
        # Here dic[key] represents list - each value of key
        print key,":",dic[key],"\n" #prints the data in dictionary

Kruupös · Accepted Answer

@andrea-f's answer looks good to me; here is another solution.

Feel free to pick either one :)

    import json

    dic = {
        "bomber": [1, 2, 3, 4, 5],
        "irritation": [1, 3, 5, 7, 8]
    }

    filename = "abc.pdf"

    json_dict = {}
    data = []

    for k, v in dic.iteritems():
        tmp_dict = {}
        tmp_dict["keyword"] = k
        tmp_dict["term_freq"] = len(v)
        tmp_dict["lists"] = [{"occurrance": i} for i in v]
        data.append(tmp_dict)

    json_dict["filename"] = filename
    json_dict["data"] = data

    with open("abc.json", "w") as outfile:
        json.dump(json_dict, outfile, indent=4, sort_keys=True)

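It is the same idea: first build one big json_dict, then save it directly as JSON. I use the with statement to write the file, which guarantees it is closed even if an exception is raised, so no explicit try/except is needed for cleanup.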
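For readers on Python 3, the same idea can be sketched as follows. This is a minimal sketch rather than the answerer's code: `build_report` and `save_report` are hypothetical helper names, `dict.items()` stands in for the Python 2 `iteritems()`, and the key spelling `"occurrance"` is taken from the answer.

```python
import json


def build_report(filename, dic):
    """Build the whole nested structure in memory (hypothetical helper)."""
    return {
        "filename": filename,
        "data": [
            {
                "keyword": keyword,
                "term_freq": len(pages),
                "lists": [{"occurrance": page} for page in pages],
            }
            for keyword, pages in dic.items()
        ],
    }


def save_report(report, path):
    """Serialize the structure in one call; json takes care of commas and quoting."""
    # The with block closes the file even if json.dump raises.
    with open(path, "w") as outfile:
        json.dump(report, outfile, indent=4, sort_keys=True)


report = build_report("abc.pdf", {"irritation": [1, 3, 5, 7, 8]})
```

Calling `save_report(report, "abc.json")` would then write the file; because no commas are written by hand, the trailing-comma problem from the question cannot occur.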

Also, if you need to tune the JSON output later on, see the documentation for json.dumps().
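As a quick illustration of the kind of tuning that documentation covers, the sketch below shows how `indent`, `sort_keys` and `separators` change the text returned by `json.dumps()`. The sample dictionary is made up for the example, and Python 3 is assumed.

```python
import json

sample = {
    "keyword": "bomber",
    "term_freq": 2,
    "lists": [{"occurrance": 1}, {"occurrance": 2}],
}

# Most compact form: no spaces after "," or ":".
compact = json.dumps(sample, separators=(",", ":"))

# Human-readable form: 4-space indentation, keys in alphabetical order.
pretty = json.dumps(sample, indent=4, sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same data; only the layout and key order of the text differ.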

EDIT

Just for fun, if you do not like the tmp variable, you can build all of the data in a single-line for loop :)

    json_dict["data"] = [{"keyword": k, "term_freq": len(v), "lists": [{"occurrance": i} for i in v]} for k, v in dic.iteritems()]

Taken to its conclusion, this gives a final solution that is not entirely readable:

    import json

    json_dict = {
        "filename": "abc.pdf",
        "data": [{
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v]
        } for k, v in dic.iteritems()]
    }

    with open("abc.json", "w") as outfile:
        json.dump(json_dict, outfile, indent=4, sort_keys=True)

EDIT 2

It seems that you do not need the JSON to be saved in the desired layout, only to be able to read it.

In fact, you can also use json.dumps() to print your JSON.

    with open('abc.json', 'r') as handle:
        new_json_dict = json.load(handle)
        print json.dumps(new_json_dict, indent=4, sort_keys=True)

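There is still one problem, though: "filename": is printed last, because with sort_keys=True the d of data sorts before the f of filename.

To force the order, you have to use an OrderedDict when building the dict. Be aware that the syntax is ugly (imo) in Python 2.X.

Here is the new complete solution ;)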

    import json
    from collections import OrderedDict

    dic = {
        'bomber': [1, 2, 3, 4, 5],
        'irritation': [1, 3, 5, 7, 8]
    }

    json_dict = OrderedDict([
        ('filename', 'abc.pdf'),
        ('data', [
            OrderedDict([
                ('keyword', k),
                ('term_freq', len(v)),
                ('lists', [{'occurrance': i} for i in v])
            ]) for k, v in dic.iteritems()])
    ])

    with open('abc.json', 'w') as outfile:
        json.dump(json_dict, outfile)

    # Now to read the ordered json file
    with open('abc.json', 'r') as handle:
        new_json_dict = json.load(handle, object_pairs_hook=OrderedDict)
        print json.dumps(new_json_dict, indent=4)

This will output:

    {
        "filename": "abc.pdf",
        "data": [
            {
                "keyword": "bomber",
                "term_freq": 5,
                "lists": [
                    {
                        "occurrance": 1
                    },
                    {
                        "occurrance": 2
                    },
                    {
                        "occurrance": 3
                    },
                    {
                        "occurrance": 4
                    },
                    {
                        "occurrance": 5
                    }
                ]
            },
            {
                "keyword": "irritation",
                "term_freq": 5,
                "lists": [
                    {
                        "occurrance": 1
                    },
                    {
                        "occurrance": 3
                    },
                    {
                        "occurrance": 5
                    },
                    {
                        "occurrance": 7
                    },
                    {
                        "occurrance": 8
                    }
                ]
            }
        ]
    }

But be aware that, most of the time, it is better to save a regular .json file so that it stays usable across languages.
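On Python 3.7 and later, the language guarantees that a plain `dict` keeps insertion order, so the same key ordering can be obtained without `OrderedDict`, as long as `sort_keys` is left off. A minimal sketch under that assumption (the variable names simply mirror the answer's):

```python
import json

dic = {"bomber": [1, 2, 3, 4, 5], "irritation": [1, 3, 5, 7, 8]}

# Keys are written in the order they are inserted: filename first, then data.
json_dict = {
    "filename": "abc.pdf",
    "data": [
        {
            "keyword": k,
            "term_freq": len(v),
            "lists": [{"occurrance": i} for i in v],
        }
        for k, v in dic.items()
    ],
}

# No sort_keys here, so the insertion order is what gets serialized.
text = json.dumps(json_dict, indent=4)

# json.loads also builds plain dicts in document order on these versions.
round_trip = json.loads(text)
print(list(round_trip.keys()))
```

Any consumer that relies on key order is still depending on something the JSON format itself does not promise, which is one more reason to treat ordering as a display concern only.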

andrea-f · Answer

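The current code does not work because the loop goes through the next-to-last item and adds }, and when the loop runs again it sets the flag to false; but on that previous pass it had already written a , because it assumed another element would follow.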
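If the JSON really had to be assembled by hand, one way to sidestep this kind of last-item bookkeeping is to place separators between items with `str.join` instead of writing one after each item. The sketch below assumes Python 3; `render_lists` is a hypothetical helper written for illustration, and `json.dumps` remains the safer choice in practice.

```python
import json


def render_lists(pages):
    """Render the "lists" array by hand, joining the items with commas."""
    items = ['{"occurance": %d}' % page for page in pages]
    # join() puts ", " only between items, never after the last one.
    return "[" + ", ".join(items) + "]"


# Repeated values are not a problem here, whereas value.index(item) returns
# the position of the first match and can misidentify the last element
# when that element's value also appears earlier in the list.
fragment = render_lists([1, 1, 1, 1, 2])
print(fragment)

# The hand-built fragment parses as valid JSON.
parsed = json.loads(fragment)
```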

If this is your dictionary: a = {"bomber":[1,2,3,4,5]} then you can do:

    import json

    file_name = "a_file.json"
    file_name_input = "abc.pdf"
    new_output = {}
    new_output["filename"] = file_name_input

    new_data = []
    i = 0
    for key, val in a.iteritems():
        new_data.append({"keyword":key, "lists":[], "term_freq":len(val)})
        for p in val:
            new_data[i]["lists"].append({"occurrance":p})
        i += 1

    new_output['data'] = new_data

Then save the data with:

    f = open(file_name, 'w+')
    f.write(json.dumps(new_output, indent=4, sort_keys=True, default=unicode))
    f.close()
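Note that `unicode` only exists in Python 2. A rough Python 3 equivalent of the save step could look like the sketch below, assuming `default=str` is an acceptable fallback for values the encoder does not know how to serialize. The sample data and the temporary directory are only there to keep the example self-contained.

```python
import json
import os
import tempfile

new_output = {
    "filename": "abc.pdf",
    "data": [{"keyword": "bomber", "lists": [{"occurrance": 1}], "term_freq": 1}],
}

# Write into a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "a_file.json")

    # The with block closes the file even if serialization fails,
    # so no explicit f.close() is needed.
    with open(file_name, "w") as f:
        json.dump(new_output, f, indent=4, sort_keys=True, default=str)

    # Reading the file back confirms that it contains valid JSON.
    with open(file_name) as f:
        loaded = json.load(f)
```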