Is there a way to get Elasticsearch results as a CSV file using the curl API?

Question

I am using Elasticsearch and need the search results as a CSV file. Is there a curl URL or a plugin that can do this?

anarchivist · Accepted Answer

I have done this using cURL and jq ("like sed, but for JSON"). For example, to get CSV output of the top 20 values of a given facet, you can do the following:

$ curl -X GET 'http://localhost:9200/myindex/item/_search?from=0&size=0' -d '
{
  "from": 0,
  "size": 0,
  "facets": {
    "sourceResource.subject.name": {
      "global": true,
      "terms": {
        "order": "count",
        "size": 20,
        "all_terms": true,
        "field": "sourceResource.subject.name.not_analyzed"
      }
    }
  },
  "sort": [ { "_score": "desc" } ],
  "query": { "filtered": { "query": { "match_all": {} } } }
}' | jq -r '.facets["sourceResource.subject.name"].terms[] | [.term, .count] | @csv'

"United States",33755
"Charities--Massachusetts",8304
"Almshouses--Massachusetts--Tewksbury",8304
"Shields",4232
"Coat of arms",4214
"Springfield College",3422
"Men",3136
"Trees",3086
"Session Laws--Massachusetts",2668
"Baseball players",2543
"Animals",2527
"Books",2119
"Women",2004
"Landscape",1940
"Floral",1821
"Architecture, Domestic--Lowell (Mass)--History",1785
"Parks",1745
"Buildings",1730
"Houses",1611
"Snow",1579
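If jq is not available, the same reshaping can be done in a few lines of Python. This is only a sketch: it assumes the response has the legacy facets shape requested above (each entry under `terms` carries `term` and `count`), and `facet_to_csv` is a hypothetical helper name, not part of any library. The sample response is a hand-built fragment for illustration.

```python
import csv
import io
import json


def facet_to_csv(response_text, facet_name):
    """Return CSV text with one "term,count" row per facet term."""
    response = json.loads(response_text)
    out = io.StringIO()
    # QUOTE_NONNUMERIC quotes strings and leaves numbers bare, like jq's @csv
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for entry in response["facets"][facet_name]["terms"]:
        writer.writerow([entry["term"], entry["count"]])
    return out.getvalue()


# Hand-built response fragment (assumption: only the keys used here matter)
sample = json.dumps({
    "facets": {
        "sourceResource.subject.name": {
            "terms": [
                {"term": "United States", "count": 33755},
                {"term": "Shields", "count": 4232},
            ]
        }
    }
})
print(facet_to_csv(sample, "sourceResource.subject.name"), end="")
```

In a pipeline you would read the curl output from standard input instead of the hard-coded sample.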

Jeff Steinmetz · Answer

I have used Python successfully. The scripting approach is intuitive and concise, and the Python ES client makes life easier. Get the Python client here:
http://www.elasticsearch.org/blog/unleash-the-clients-Ruby-python-php-Perl/#python

Then your Python script can include calls like the following:

import csv

import elasticsearch

# newer client versions require the URL scheme
es = elasticsearch.Elasticsearch(["http://10.1.1.1:9200"])

# this returns up to 500 rows, adjust to your needs
res = es.search(index="YourIndexName",
                body={"query": {"match": {"title": "elasticsearch"}}},
                size=500)
sample = res['hits']['hits']

# then open a csv file, and loop through the results, writing to the csv
# (Python 3: open in text mode with newline='' rather than 'wb')
with open('outputfile.tsv', 'w', newline='', encoding='utf-8') as csvfile:
    # we use TAB delimited, to handle cases where freeform text may have a comma
    filewriter = csv.writer(csvfile, delimiter='\t',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    # create column header row
    filewriter.writerow(["column1", "column2", "column3"])  # change the column labels here
    for hit in sample:
        # fill columns 1, 2, 3 with your data; document fields live under "_source"
        # replace these nested key names with your own
        col1 = hit["_source"]["some"]["deeply"]["nested"]["field"]
        col1 = col1.replace('\n', ' ')
        col2 = ""  # placeholder: fill in the same way as col1
        col3 = ""  # placeholder: fill in the same way as col1
        filewriter.writerow([col1, col2, col3])

Because documents are unstructured and may sometimes lack a field (depending on your index), you may want to wrap the column['key'] lookups in try/except error handling.
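One way to do that wrapping is a small lookup helper, so that a missing field yields an empty cell instead of aborting the export. This is a sketch, not the code from the linked script; `get_nested` and the sample hit are hypothetical.

```python
def get_nested(doc, keys, default=""):
    """Walk doc along keys; return default if any step is missing."""
    current = doc
    try:
        for key in keys:
            current = current[key]
    except (KeyError, IndexError, TypeError):
        # KeyError/IndexError: the key or position is absent
        # TypeError: an intermediate value is not a dict or list (e.g. None)
        return default
    return current


# Hypothetical hit, shaped like one element of res['hits']['hits']
hit = {"_source": {"title": "elasticsearch", "author": {"name": "Ann"}}}
row = [
    get_nested(hit, ["_source", "title"]),
    get_nested(hit, ["_source", "author", "name"]),
    get_nested(hit, ["_source", "publisher", "city"]),  # absent in this document
]
print(row)
```

Catching only the lookup errors, rather than a bare `except`, keeps unrelated bugs visible.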

A complete sample Python script that uses the latest ES Python client is available here:

https://github.com/jeffsteinmetz/pyes2csv

Tikki · Answer

You can use the Elasticsearch head plugin. Once it is installed, it is served at http://localhost:9200/_plugin/head/. Go to the Structured Query tab, enter your query details, and select the "csv" format from the "Output Results" dropdown.

jamesc · Answer

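I don't think there is a plugin that gives you CSV results directly from the search engine, so you will have to query Elasticsearch to retrieve the results and then write them to a CSV file yourself.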

Command line

If you are on a Unix-like OS, es2unix returns search results in raw text format on the command line, so it should be scriptable.

You can then dump those results to a text file, or pipe them to awk or a similar tool to format them as CSV. There is an -o flag available, but it only gives "raw" format at the moment.
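The reformatting step can also be scripted in Python instead of awk. The sketch below assumes the raw output is one record per line with columns separated by whitespace; the sample lines are made up and are not real es2unix output, and `text_to_csv` is a hypothetical helper.

```python
import csv
import io


def text_to_csv(lines, out):
    """Split each non-empty line on runs of whitespace and write it as a CSV row."""
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            writer.writerow(fields)


# Made-up whitespace-delimited input for illustration
raw = ["1 first,doc 0.9\n", "\n", "2 second 0.5\n"]
buf = io.StringIO()
text_to_csv(raw, buf)
print(buf.getvalue(), end="")
```

To use it in a pipeline, call `text_to_csv(sys.stdin, sys.stdout)`. Note that a value containing spaces would be split into several columns, so this only suits output whose fields are free of whitespace.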

Java

I found an example that uses Java, but I have not tested it.

Python

You can query Elasticsearch with something like pyes and write the result set to a file using the standard csv writer library.

Perl

With Perl, you can use Clinton Gormley's Gist linked by Rakesh: https://Gist.github.com/clintongormley/2049562

miku · Answer

Shameless plug: I wrote estab, a command-line program that exports Elasticsearch documents to tab-separated values.

Example:

$ export MYINDEX=localhost:9200/test/default/
$ curl -XPOST $MYINDEX -d '{"name": "Tim", "color": {"fav": "red"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Alice", "color": {"fav": "yellow"}}'
$ curl -XPOST $MYINDEX -d '{"name": "Brian", "color": {"fav": "green"}}'
$ estab -indices "test" -f "name color.fav"
Brian	green
Tim	red
Alice	yellow

estab can handle exports from multiple indices, custom queries, missing values, lists of values, and nested fields, and it is reasonably fast.
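To illustrate what a dotted field spec such as "color.fav" means for nested documents, missing values, and lists, here is a rough Python sketch. It is not estab's implementation: the helper names, the empty string for missing values, and the "|" list separator are all assumptions made for this example.

```python
def pick(doc, path, missing=""):
    """Follow a dotted path such as "color.fav" through nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return missing
        current = current[part]
    return current


def render(value, list_sep="|"):
    """Turn a value into one cell; lists are joined (separator is an assumption)."""
    if isinstance(value, list):
        return list_sep.join(str(item) for item in value)
    return str(value)


def to_tsv_line(doc, paths):
    """Build one tab-separated line from the selected paths."""
    return "\t".join(render(pick(doc, path)) for path in paths)


docs = [
    {"name": "Tim", "color": {"fav": "red"}},
    {"name": "Alice", "color": {"fav": "yellow"}},
    {"name": "Brian", "color": {"fav": ["green", "blue"]}},  # hypothetical list value
    {"name": "Dana"},                                        # hypothetical missing field
]
lines = [to_tsv_line(doc, ["name", "color.fav"]) for doc in docs]
print("\n".join(lines))
```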

pandaadb · Answer

I use stash-query for this: https://github.com/robbydyer/stash-query

I find it quite convenient and it works well, although I struggle with the installation every time I redo it (because I am not very fluent with gems and Ruby).

On Ubuntu 16.04, though, what seemed to work was:

apt install ruby
sudo apt-get install libcurl3 libcurl3-gnutls libcurl4-openssl-dev
gem install stash-query

And you should be good to go. In order, these commands:

Install Ruby.
Install the curl dependencies for Ruby, because the stash-query tool works through the Elasticsearch REST API.
Install stash-query.

This blog post also describes how to build it:

https://robbydyer.wordpress.com/2014/08/25/exporting-from-kibana/

Maoz Zadok · Answer

You can use elasticsearch2csv. It is a small and effective Python 3 script that uses the Elasticsearch scroll API and can handle large query responses.
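For readers who want to see the scroll pattern itself, here is a minimal sketch. It is not taken from elasticsearch2csv. It assumes a client object that exposes `search`, `scroll`, and `clear_scroll` with the keyword arguments used by the official Python client (7.x style); `scroll_to_csv` and its parameters are hypothetical names.

```python
import csv


def scroll_to_csv(client, index, query, fields, out, page_size=1000, keep_alive="2m"):
    """Page through all hits with the scroll API; write selected _source fields as CSV."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)  # header row
    page = client.search(index=index, body=query, scroll=keep_alive, size=page_size)
    total = 0
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            source = hit.get("_source", {})
            # missing fields become empty cells
            writer.writerow([source.get(field, "") for field in fields])
            total += 1
        # fetch the next page; an empty page ends the loop
        page = client.scroll(scroll_id=page["_scroll_id"], scroll=keep_alive)
    client.clear_scroll(scroll_id=page["_scroll_id"])  # release server-side resources
    return total


# Usage with the official client (needs a reachable cluster; names are placeholders):
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch(["http://localhost:9200"])
#   with open("out.csv", "w", newline="") as f:
#       scroll_to_csv(es, "myindex", {"query": {"match_all": {}}}, ["name"], f)
```

Because each page is written as soon as it arrives, memory use stays bounded by `page_size` rather than by the size of the full result set.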