Write comments to a CSV file with pandas

Question

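I would like to write comments in a CSV file created with pandas. I have not found an option for this in DataFrame.to_csv (even though read_csv can skip comments), nor in the standard csv module. I can open the file, write the comments (lines starting with #), and then pass it to to_csv. Does anybody have a better option?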

Vor · Accepted Answer

df.to_csv accepts a file object. So you can open a file in a mode (append), write your comments, and then pass the file object to the data frame's to_csv function.

For example:

    In [36]: df = pd.DataFrame({'a':[1,2,3], 'b':[1,2,3]})
    In [37]: f = open('foo', 'a')
    In [38]: f.write('# My awesome comment\n')
    In [39]: f.write('# Here is another one\n')
    In [40]: df.to_csv(f)
    In [41]: f.close()
    In [42]: more foo
    # My awesome comment
    # Here is another one
    ,a,b
    0,1,1
    1,2,2
    2,3,3
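The session above can also be written as a small script that reads the file back. This is only a minimal sketch: the temporary directory and the `foo.csv` file name are assumptions made for illustration, and it opens the file with `'w'` so that each run starts from an empty file (use `'a'` to append to an existing one). Passing `comment='#'` to `read_csv` is what lets pandas skip the comment lines again.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})

# Hypothetical scratch location so the example does not touch real files
path = os.path.join(tempfile.mkdtemp(), 'foo.csv')

# 'w' starts a fresh file; use 'a' instead to append to an existing one.
# newline='' lets pandas control the line endings of the rows it writes.
with open(path, 'w', newline='') as f:
    f.write('# My awesome comment\n')
    f.write('# Here is another one\n')
    df.to_csv(f)

# comment='#' makes read_csv skip the comment lines written above
restored = pd.read_csv(path, comment='#', index_col=0)
print(restored)
```

Opening the handle with `newline=''` follows the recommendation in the pandas documentation for non-binary file objects passed to `to_csv`.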

joelostblom · Answer

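An alternative to @Vor's solution is to first write the comment to a file, and then use mode='a' with to_csv() to add the content of the data frame to the same file. According to my benchmarks (below), this takes about as long as opening the file in append mode, adding the comment, and then passing the file handler to pandas (as per @Vor's answer). The similar timings make sense considering that this is what pandas does internally: DataFrame.to_csv() calls CSVFormatter.save(), which uses _get_handles() to open the file via open().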

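On a separate note, it is convenient to do file IO via the with statement, which ensures that an opened file is closed when you are done with it and leave the with block. See the examples in the benchmarks below.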

Read in test data

    import pandas as pd

    # Read in the iris data frame from the seaborn GitHub location
    iris = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')

    # Create a bigger data frame by doubling it until it has at least 100000 rows
    # (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
    while iris.shape[0] < 100000:
        iris = pd.concat([iris, iris])
    # `iris.shape` is now (153600, 5)

1. Append with the same file handler

    %%timeit -n 5 -r 5
    # Open a file in append mode to add the comment
    # Then pass the file handle to pandas
    with open('test1.csv', 'a') as f:
        f.write('# This is my comment\n')
        iris.to_csv(f)

972 ms ± 31.9 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)

2. Reopen the file with `to_csv(mode='a')`

    %%timeit -n 5 -r 5
    # Open a file in write mode to add the comment
    # Then close the file and reopen it with pandas in append mode
    with open('test2.csv', 'w') as f:
        f.write('# This is my comment\n')
    iris.to_csv('test2.csv', mode='a')

949 ms ± 19.3 ms per loop (mean ± std. dev. of 5 runs, 5 loops each)
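To check that the two approaches produce the same file and not just similar timings, the sketch below writes a small data frame both ways and compares the results. The tiny data frame and the temporary directory are assumptions for illustration, not part of the benchmark. Approach 1 opens the file with `'w'` here, because with `'a'` every repeated run (for example, each `%%timeit` loop) would append another copy of the comment and data to the same file.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [1, 2, 3]})

# Hypothetical scratch directory so the example does not touch real files
tmp_dir = tempfile.mkdtemp()
path1 = os.path.join(tmp_dir, 'test1.csv')
path2 = os.path.join(tmp_dir, 'test2.csv')

# Approach 1: write the comment and the data through one file handle
with open(path1, 'w', newline='') as f:
    f.write('# This is my comment\n')
    df.to_csv(f)

# Approach 2: write the comment, close the file, let pandas reopen it
with open(path2, 'w') as f:
    f.write('# This is my comment\n')
df.to_csv(path2, mode='a')

# Text-mode reads normalise line endings, so the comparison is portable
with open(path1) as f1, open(path2) as f2:
    same = f1.read() == f2.read()
print(same)
```

If the contents match, choosing between the two approaches is mostly a matter of taste, which is consistent with the near-identical timings reported above.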
