Extracting data from lines of a text file

Question

I need to extract data from the lines of a text file. The data is name and scoring information, formatted like this:

    Shyvana - 12/4/5 - Loss - 2012-11-22
    Fizz - 12/4/5 - Win - 2012-11-22
    Miss Fortune - 12/4/3 - Win - 2012-11-22

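This file is generated by another part of my small Python program. That part asks the user for a name, looks the entered name up in a list of names to make sure it is valid, then asks for kills, deaths, assists, and whether they won or lost. It then asks for confirmation and writes the data to the file on a new line, appending the date at the end as shown. The code that prepares that data: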

    data = "%s - %s/%s/%s - %s - %s\n" % (
        champname, kills, deaths, assists, winloss, timestamp)

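Basically, I want to read that data back in another part of the program, display it to the user, and do calculations with it, such as averages over time for a specific name.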

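I'm new to Python and don't have much programming experience in general, so most of the string splitting and formatting examples I find are too cryptic for me to work out how to adapt them to what I need here. Could someone help? I could format the written data differently so that finding the tokens is easier, but I want it to stay simple and directly in a file.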

martineau · Accepted Answer

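The following reads everything into a dictionary keyed by player name. The value associated with each player is itself a dictionary that acts as a record, with named fields whose values have been converted to a format suitable for further processing.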

    info = {}
    with open('scoring_info.txt') as input_file:
        for line in input_file:
            # maxsplit=3 keeps the hyphens inside the date intact
            player, stats, outcome, date = (
                item.strip() for item in line.split('-', 3))
            stats = dict(zip(('kills', 'deaths', 'assists'),
                             map(int, stats.split('/'))))
            date = tuple(map(int, date.split('-')))
            info[player] = dict(zip(('stats', 'outcome', 'date'),
                                    (stats, outcome, date)))

    print('info:')
    for player, record in info.items():
        print('  player %r:' % player)
        for field, value in record.items():
            print('    %s: %s' % (field, value))

    # sample usage
    player = 'Fizz'
    print('\n%s had %s kills in the game' % (
        player, info[player]['stats']['kills']))

Output:

    info:
      player 'Shyvana':
        date: (2012, 11, 22)
        outcome: Loss
        stats: {'assists': 5, 'kills': 12, 'deaths': 4}
      player 'Miss Fortune':
        date: (2012, 11, 22)
        outcome: Win
        stats: {'assists': 3, 'kills': 12, 'deaths': 4}
      player 'Fizz':
        date: (2012, 11, 22)
        outcome: Win
        stats: {'assists': 5, 'kills': 12, 'deaths': 4}

    Fizz had 12 kills in the game

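Alternatively, rather than holding most of the data in dictionaries, which makes nested-field access a little awkward (info[player]['stats']['kills']), you could use a slightly more advanced "generic" class to hold it, which lets you write info2[player].stats.kills instead.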

To illustrate, here is almost the same thing using a class I've named Struct, since it is somewhat like the C language's struct data type.

    class Struct(object):
        """ Generic container object """
        def __init__(self, **kwds):  # keyword args define attribute names and values
            self.__dict__.update(**kwds)

    info2 = {}
    with open('scoring_info.txt') as input_file:
        for line in input_file:
            player, stats, outcome, date = (
                item.strip() for item in line.split('-', 3))
            stats = dict(zip(('kills', 'deaths', 'assists'),
                             map(int, stats.split('/'))))
            victory = (outcome.lower() == 'win')  # change to boolean T/F
            date = dict(zip(('year', 'month', 'day'),
                            map(int, date.split('-'))))
            info2[player] = Struct(champ_name=player, stats=Struct(**stats),
                                   victory=victory, date=Struct(**date))

    print('info2:')
    for rec in info2.values():
        print('  player %r:' % rec.champ_name)
        print('    stats: kills=%s, deaths=%s, assists=%s' % (
            rec.stats.kills, rec.stats.deaths, rec.stats.assists))
        print('    victorious: %s' % rec.victory)
        print('    date: %d-%02d-%02d' % (
            rec.date.year, rec.date.month, rec.date.day))

    # sample usage
    player = 'Fizz'
    print('\n%s had %s kills in the game' % (player, info2[player].stats.kills))

Output:

    info2:
      player 'Shyvana':
        stats: kills=12, deaths=4, assists=5
        victorious: False
        date: 2012-11-22
      player 'Miss Fortune':
        stats: kills=12, deaths=4, assists=3
        victorious: True
        date: 2012-11-22
      player 'Fizz':
        stats: kills=12, deaths=4, assists=5
        victorious: True
        date: 2012-11-22

    Fizz had 12 kills in the game
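As a side note, on Python 3.3 and later the standard library's types.SimpleNamespace behaves much like the hand-written generic container above. The following is only a minimal sketch of that substitution for a single line, not part of the original answer; the variable name record is chosen for illustration.

```python
from types import SimpleNamespace

# One sample line from the question, parsed the same way as above.
line = "Fizz - 12/4/5 - Win - 2012-11-22"
player, stats, outcome, date = (item.strip() for item in line.split('-', 3))
kills, deaths, assists = map(int, stats.split('/'))
year, month, day = map(int, date.split('-'))

# SimpleNamespace stores keyword arguments as attributes.
record = SimpleNamespace(
    champ_name=player,
    stats=SimpleNamespace(kills=kills, deaths=deaths, assists=assists),
    victory=(outcome.lower() == 'win'),
    date=SimpleNamespace(year=year, month=month, day=day),
)

print(record.stats.kills)  # 12
```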

Calvin Cheng · Answer

There are two ways to read the data from your example text file.

First method

You can use Python's csv module and specify that the delimiter is -.

See http://www.doughellmann.com/PyMOTW/csv/
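One thing to watch with this approach: the csv delimiter must be a single character, so the hyphens inside the date are split as well and the fields keep their surrounding spaces. The sketch below shows one way to handle that. It uses io.StringIO as a stand-in for the real file, and the variable names are chosen for illustration.

```python
import csv
import io

# Stand-in for the real file; the sample lines come from the question.
sample = io.StringIO(
    "Shyvana - 12/4/5 - Loss - 2012-11-22\n"
    "Fizz - 12/4/5 - Win - 2012-11-22\n"
)

rows = []
for fields in csv.reader(sample, delimiter="-"):
    # The date's own hyphens are split too, so each row has six fields:
    # name, score, outcome, year, month, day.
    name, score, outcome = (field.strip() for field in fields[:3])
    date = "-".join(field.strip() for field in fields[3:])
    rows.append((name, score, outcome, date))

print(rows[0])  # ('Shyvana', '12/4/5', 'Loss', '2012-11-22')
```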

Second method

Alternatively, if you don't want to use the csv module, you can simply read each line of the file as a string and then use the split method.

    with open('myTextFile.txt', "r") as f:
        for line in f:
            words = line.split("-")
            # words is a list (of strings from a line), delimited by "-".

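So, in the example above, champname is the first item in the words list, i.e. words[0]. Splitting on a bare "-" leaves the surrounding spaces in place, so call strip() on each item before using it.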

Bas Wijnen · Answer

You want to use split(' - ') to get the parts, then split again to get the numbers.

    for line in yourfile.readlines():
        data = line.split(' - ')
        nums = [int(x) for x in data[1].split('/')]

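That should give you everything you need in data[] and nums[]. Alternatively, you can use the re module and write a regular expression for it. This doesn't look complex enough to warrant that, though.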
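For comparison, a regular-expression version might look roughly like the sketch below. The pattern and the name LINE_RE are illustrative assumptions, and the pattern only accepts the exact "name - k/d/a - Win|Loss - YYYY-MM-DD" layout shown in the question.

```python
import re

# One named group per field; anything that deviates from the expected
# layout simply fails to match.
LINE_RE = re.compile(
    r"^(?P<name>.+?) - (?P<kills>\d+)/(?P<deaths>\d+)/(?P<assists>\d+)"
    r" - (?P<outcome>Win|Loss) - (?P<date>\d{4}-\d{2}-\d{2})\s*$"
)

match = LINE_RE.match("Miss Fortune - 12/4/3 - Win - 2012-11-22\n")
if match:
    name = match.group("name")
    nums = [int(match.group(key)) for key in ("kills", "deaths", "assists")]
    print(name, nums, match.group("outcome"), match.group("date"))
```

One benefit of the regex over plain split is that malformed lines are rejected (match is None) instead of raising an unpacking error.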

BoppreH · Answer

    # Iterate over the lines in the file.
    for line in open('data_file.txt'):
        # Split the line into four elements separated by dashes. Each element
        # is then unpacked to the correct variable name. strip() removes the
        # trailing newline so it does not end up in timestamp.
        champname, score, winloss, timestamp = line.strip().split(' - ')

        # Since 'score' holds the string with the three values joined,
        # we need to split them again, this time using a slash as separator.
        # This results in a list of strings, so we apply the 'int' function
        # to each of them to convert to integer. This list of integers is
        # then unpacked into the kills, deaths and assists variables.
        kills, deaths, assists = map(int, score.split('/'))

        # Now you are free to use the variables read for whatever you want.
        # Since kills, deaths and assists are integers, you can sum, multiply
        # and add them easily.

Dmitry Shevchenko · Answer

First, split the line into data fragments:

    >>> name, score, result, date = "Fizz - 12/4/5 - Win - 2012-11-22".split(' - ')
    >>> name
    'Fizz'
    >>> score
    '12/4/5'
    >>> result
    'Win'
    >>> date
    '2012-11-22'

Then parse the score:

    >>> k, d, a = map(int, score.split('/'))
    >>> k, d, a
    (12, 4, 5)

And finally, convert the date string into a date object (note the lowercase %m for month; uppercase %M means minutes):

    >>> from datetime import datetime
    >>> datetime.strptime(date, '%Y-%m-%d').date()
    datetime.date(2012, 11, 22)

Now all the parts are parsed and normalized to proper data types.
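Putting these steps together, per-champion averages like the ones the question asks about could be computed along the following lines. This is only a sketch: parse_line, average_stats and games_by_champ are hypothetical names, and the second Fizz line is made up so that there is something to average.

```python
from collections import defaultdict
from datetime import datetime


def parse_line(line):
    """Turn one 'name - k/d/a - outcome - date' line into a dict."""
    name, score, result, date = line.strip().split(' - ')
    kills, deaths, assists = map(int, score.split('/'))
    return {
        'name': name,
        'kills': kills,
        'deaths': deaths,
        'assists': assists,
        'won': result == 'Win',
        'date': datetime.strptime(date, '%Y-%m-%d').date(),
    }


def average_stats(games):
    """Return mean kills, deaths and assists over a non-empty list of games."""
    count = len(games)
    return tuple(sum(game[key] for game in games) / count
                 for key in ('kills', 'deaths', 'assists'))


# Sample lines; the second Fizz game is invented for illustration.
lines = [
    "Shyvana - 12/4/5 - Loss - 2012-11-22\n",
    "Fizz - 12/4/5 - Win - 2012-11-22\n",
    "Fizz - 6/2/9 - Loss - 2012-11-23\n",
]

# Keep a list of games per champion so repeated names are not overwritten.
games_by_champ = defaultdict(list)
for line in lines:
    game = parse_line(line)
    games_by_champ[game['name']].append(game)

print(average_stats(games_by_champ['Fizz']))  # (9.0, 3.0, 7.0)
```

Storing a list per champion matters once the same name appears on more than one line of the file, which is what makes an average meaningful in the first place.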