Fastest way to count the number of occurrences in a Python list

Question

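I have a Python list and want to know the quickest way to count the number of occurrences of an item, '1', in this list. In my actual case the item can occur tens of thousands of times, which is why I want a fast way.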

['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

Would the collections module help? I'm using Python 2.7.

Jakob Bowyer · Accepted Answer

    a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
    print(a.count("1"))  # 6

It's probably optimized heavily at the C level.

Edit: I randomly generated a large list.

    In [8]: len(a)
    Out[8]: 6339347

    In [9]: %timeit a.count("1")
    10 loops, best of 3: 86.4 ms per loop

Edit edit: This could also be done with collections.Counter

    from collections import Counter

    a = Counter(your_list)
    print(a['1'])

Using the same list as in my last timing example:

    In [17]: %timeit Counter(a)['1']
    1 loops, best of 3: 1.52 s per loop

My timing is simplistic and depends on many different factors, but it gives a good clue as to performance.

Here is some profiling:

    In [24]: profile.run("a.count('1')")
             3 function calls in 0.091 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    0.091    0.091 <string>:1(<module>)
            1    0.091    0.091    0.091    0.091 {method 'count' of 'list' objects}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

    In [25]: profile.run("b = Counter(a); b['1']")
             6339356 function calls in 2.143 seconds

       Ordered by: standard name

       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
            1    0.000    0.000    2.143    2.143 <string>:1(<module>)
            2    0.000    0.000    0.000    0.000 _weakrefset.py:68(__contains__)
            1    0.000    0.000    0.000    0.000 abc.py:128(__instancecheck__)
            1    0.000    0.000    2.143    2.143 collections.py:407(__init__)
            1    1.788    1.788    2.143    2.143 collections.py:470(update)
            1    0.000    0.000    0.000    0.000 {getattr}
            1    0.000    0.000    0.000    0.000 {isinstance}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      6339347    0.356    0.000    0.356    0.000 {method 'get' of 'dict' objects}
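The comparison above can be tried outside IPython with the standard timeit module. The following is only a rough sketch: the list here is synthetic (200,000 items drawn from four values, which is an assumption and not how the original list was built), and the absolute numbers will vary with the machine and the Python version. It also illustrates the trade-off: list.count scans the list once per query, while a Counter scans it once and then answers any number of lookups.

```python
import random
import timeit
from collections import Counter

# Assumption: a synthetic list of string items; the list used in the
# answer above was generated differently and was much larger.
random.seed(0)
values = ['1', '2', '7', '10']
data = [random.choice(values) for _ in range(200000)]

# One query: list.count walks the whole list once.
single = data.count('1')

# Many queries: Counter walks the list once, then each lookup is a dict access.
counts = Counter(data)

# Items that never occur give 0 instead of raising KeyError.
missing = counts['not present']

# Time both approaches; treat the numbers as indicative only.
t_count = timeit.timeit(lambda: data.count('1'), number=5)
t_counter = timeit.timeit(lambda: Counter(data)['1'], number=5)
print('list.count: %.3f s, Counter: %.3f s' % (t_count, t_counter))
```

If only one item needs counting, list.count is the simpler choice; if counts for several different items are needed, building the Counter once may pay off.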

Surya Prakash Singh · Answer

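With a Counter dictionary you can efficiently count the occurrences of every element in a Python list, and also find the most common element together with its count.

If our Python list is: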

l=['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']

To find the occurrences of every item in the Python list, use the following:

    >>> from collections import Counter
    >>> c = Counter(l)
    >>> print(c)
    Counter({'1': 6, '2': 4, '7': 3, '10': 2})

To rank the items in the Python list from most to least frequent:

    >>> k = c.most_common()
    >>> k
    [('1', 6), ('2', 4), ('7', 3), ('10', 2)]

For the highest count:

    >>> k[0][1]
    6

For the item itself, use k[0][0]:

    >>> k[0][0]
    '1'

For the nth most frequent item and its number of occurrences in the list, use the following:

**For n = 2**

    >>> n = 2
    >>> print(k[n-1][0])  # the item
    2
    >>> print(k[n-1][1])  # its count
    4
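The indexing steps above could be wrapped in a small helper. This is a minimal sketch: nth_most_common is a hypothetical name introduced here for illustration, and it relies on Counter.most_common(n), which returns only the n most frequent (item, count) pairs.

```python
from collections import Counter


def nth_most_common(items, n):
    """Return (item, count) for the n-th most frequent item, counting from 1.

    Hypothetical helper for illustration; not part of the collections module.
    """
    if n < 1:
        raise ValueError('n must be at least 1')
    ranked = Counter(items).most_common(n)
    if len(ranked) < n:
        raise ValueError('fewer than n distinct items')
    # The last of the top-n pairs is the n-th most frequent one.
    return ranked[-1]


l = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
print(nth_most_common(l, 1))  # ('1', 6)
print(nth_most_common(l, 2))  # ('2', 4)
```

When several items share the same count, which of them lands in position n depends on how most_common orders ties, so the result is only unambiguous when the counts differ, as they do in this list.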

Mahdi Ghelichi · Answer

A combination of lambda and the map function can also do the job:

    list_ = ['a', 'b', 'b', 'c']
    sum(map(lambda x: x == "b", list_))  # 2
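The map/lambda version works because each comparison yields True or False, and Python's bool values add up as 1 and 0. As a small sketch for comparison, the same count can be written with a generator expression or with list.count; all three should agree.

```python
list_ = ['a', 'b', 'b', 'c']

# map + lambda: sums the True/False results of each comparison.
via_map = sum(map(lambda x: x == 'b', list_))

# Generator expression: adds 1 for every matching item.
via_genexp = sum(1 for x in list_ if x == 'b')

# Built-in method: counts matching items directly.
via_count = list_.count('b')

print(via_map, via_genexp, via_count)  # 2 2 2
```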

J. Doe · Answer

With pandas, you can convert the list to a pd.Series and then simply call .value_counts():

    import pandas as pd

    a = ['1', '1', '1', '1', '1', '1', '2', '2', '2', '2', '7', '7', '7', '10', '10']
    a_cnts = pd.Series(a).value_counts().to_dict()

    a_cnts["1"], a_cnts["10"]  # (6, 2)