Plotting error bars from a dataframe using Seaborn FacetGrid

Question

I want to plot error bars from a column of a pandas dataframe on a Seaborn FacetGrid.

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import seaborn as sns

    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'] * 2,
                       'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                       'C': np.random.randn(8),
                       'D': np.random.randn(8)})
    df

Example dataframe:

         A      B         C         D
    0  foo    one  0.445827 -0.311863
    1  bar    one  0.862154 -0.229065
    2  foo    two  0.290981 -0.835301
    3  bar  three  0.995732  0.356807
    4  foo    two  0.029311  0.631812
    5  bar    two  0.023164 -0.468248
    6  foo    one -1.568248  2.508461
    7  bar  three -0.407807  0.319404

This code works for fixed-size error bars:

    # note: seaborn 0.9 and later call the `size` parameter `height`
    g = sns.FacetGrid(df, col="A", hue="B", size=5)
    g.map(plt.errorbar, "C", "D", yerr=0.5, fmt='o');

However, I can't get it to work using values from the dataframe:

    df['E'] = abs(df['D']*0.5)
    g = sns.FacetGrid(df, col="A", hue="B", size=5)
    g.map(plt.errorbar, "C", "D", yerr=df['E']);

or

    g = sns.FacetGrid(df, col="A", hue="B", size=5)
    g.map(plt.errorbar, "C", "D", yerr='E');

Both produce a screed of errors.

Edit:

After a lot of reading of the matplotlib docs and assorted Stack Overflow answers, here is a pure matplotlib solution:

    # define a color palette index based on column 'B'
    df['cind'] = pd.Categorical(df['B']).codes  # was `.labels` in old pandas versions

    # how many categories in column 'A'
    cats = df['A'].unique()
    cats.sort()

    # get the seaborn colour palette and convert to array
    cp = sns.color_palette()
    cpa = np.array(cp)

    # draw a subplot for each category in column "A"
    fig, axs = plt.subplots(nrows=1, ncols=len(cats), sharey=True)
    for i, ax in enumerate(axs):
        df_sub = df[df['A'] == cats[i]]
        col = cpa[df_sub['cind'].values]
        ax.scatter(df_sub['C'], df_sub['D'], c=col)
        # fmt='none' draws only the bars (older matplotlib accepted fmt=None)
        eb = ax.errorbar(df_sub['C'], df_sub['D'], yerr=df_sub['E'], fmt='none')
        # eb.lines is (data_line, caplines, barlinecols); recolor the bar collection
        eb.lines[2][0].set_color(col)

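Apart from the labels and axis limits, it's OK. It plots a separate subplot for each category in column 'A', colored by the category in column 'B'. (Note that the random data is different from that shown above.)

I would still like a pandas/seaborn solution if anyone has any ideas.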

mwaskom · Accepted Answer

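When using FacetGrid.map, anything that refers to the data DataFrame must be passed as a positional argument. This works in your case because yerr is the third positional argument of plt.errorbar, though to demonstrate I'm going to use the tips dataset:

    from scipy import stats

    tips_all = sns.load_dataset("tips")
    tips_grouped = tips_all.groupby(["smoker", "size"])
    # numeric_only skips the non-numeric columns (required on recent pandas)
    tips = tips_grouped.mean(numeric_only=True)
    tips["CI"] = tips_grouped.total_bill.apply(stats.sem) * 1.96
    tips.reset_index(inplace=True)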

Then I can plot using FacetGrid and errorbar:

    g = sns.FacetGrid(tips, col="smoker", size=5)
    g.map(plt.errorbar, "size", "total_bill", "CI", marker="o")

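However, keep in mind that there are seaborn plotting functions for going from a full dataset to plots with error bars (using bootstrapping), so for a lot of applications this may not be necessary. For example, you could use factorplot:

    # note: seaborn 0.9 and later rename factorplot to catplot
    sns.factorplot("size", "total_bill", col="smoker", data=tips_all, kind="point")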

Or lmplot:

    sns.lmplot("size", "total_bill", col="smoker", data=tips_all, fit_reg=False, x_estimator=np.mean)
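
For intuition about what "using bootstrapping" means here, a percentile bootstrap of the mean can be sketched in plain numpy. This is only an illustration of the general idea, not seaborn's internal code; the helper name bootstrap_ci, the resample count, the seed, and the sample values are all assumptions made for the example.

```python
import numpy as np

def bootstrap_ci(values, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement n_boot times and record each resample's mean.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    # The interval is the central `ci` percent of the resampled means.
    half = (100 - ci) / 2
    low, high = np.percentile(boot_means, [half, 100 - half])
    return low, high

# Arbitrary illustrative bill amounts
bills = np.array([16.99, 10.34, 21.01, 23.68, 24.59, 25.29, 8.77, 26.88])
low, high = bootstrap_ci(bills)
print(low, bills.mean(), high)
```

The pair (low, high) plays the role of the error bar around the plotted mean, so no precomputed error column is needed.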

user3834473 · Answer

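You aren't showing what df['E'] actually is, or whether it is a list of the same length as df['C'] and df['D'].

The yerr keyword argument (kwarg) takes either a single value that is applied to every element of the lists for keys C and D from the dataframe, or a list of values of the same length as those lists.

So C, D, and E must all be associated with lists of the same length, or C and D must be lists of the same length and E must be associated with a single float or int. If that single float or int is inside a list, you must extract it, as in df['E'][0].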
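
The two accepted forms of yerr can be compared with a small matplotlib-only sketch. The data values below are invented for illustration, half_lengths is a hypothetical helper that reads the bar heights back from the drawn artists, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 1.5, 3.0, 2.5])
per_point = np.array([0.1, 0.2, 0.3, 0.4])  # one error size per data point

fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
same = ax1.errorbar(x, y, yerr=0.5, fmt='o')        # scalar: every bar is +/- 0.5
vary = ax2.errorbar(x, y, yerr=per_point, fmt='o')  # array: bar i is +/- per_point[i]

def half_lengths(container):
    """Half the height of each vertical bar drawn by errorbar (hypothetical helper)."""
    # container.lines is (data_line, caplines, barlinecols)
    segments = container.lines[2][0].get_segments()
    return np.array([(seg[1, 1] - seg[0, 1]) / 2 for seg in segments])

print(half_lengths(same))
print(half_lengths(vary))
```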

Example matplotlib code that uses yerr: http://matplotlib.org/1.2.1/examples/pylab_examples/errorbar_demo.html

Bar plot API documentation explaining yerr: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.bar