Using temporary tables with SQLAlchemy

Question

I am trying to use a temporary table with SQLAlchemy and join it against an existing table. This is what I have so far:

    engine = db.get_engine(db.app, 'MY_DATABASE')
    df = pd.DataFrame({"id": [1, 2, 3], "value": [100, 200, 300], "date": [date.today(), date.today(), date.today()]})
    temp_table = db.Table('#temp_table', db.Column('id', db.Integer), db.Column('value', db.Integer), db.Column('date', db.DateTime))
    temp_table.create(engine)
    df.to_sql(name='tempdb.dbo.#temp_table', con=engine, if_exists='append', index=False)
    query = db.session.query(ExistingTable.id).join(temp_table, temp_table.c.id == ExistingTable.id)
    out_df = pd.read_sql(query.statement, engine)
    temp_table.drop(engine)
    return out_df.to_dict('records')

This returns no results because the insert statements that _to_sql_ issues never take effect (I think this is because they are executed using _sp_prepexec_, but I am not entirely sure about that).

I then tried writing the SQL statements by hand (_CREATE TABLE #temp_table..._, _INSERT INTO #temp_table..._, _SELECT [id] FROM..._) and running pd.read_sql(query, engine). That gives the error message:

This result object does not return rows. It has been closed automatically.

I guess this is because the statement does more than just SELECT?

How can I fix this? Either solution would work, although the first is preferable because it avoids hard-coded SQL. To be clear, I cannot modify the schema of the existing database (it is a vendor database).

van · Accepted Answer

If the number of records to be inserted into the temporary table is small to moderate, one option is to use a literal subquery or a values CTE instead of creating a temporary table.

    # MODEL
    class ExistingTable(Base):
        __tablename__ = 'existing_table'
        id = sa.Column(sa.Integer, primary_key=True)
        name = sa.Column(sa.String)
        # ...

Assume also that the following data is what would have been inserted into the temp table:

    # This data retrieved from another database and used for filtering
    rows = [
        (1, 100, datetime.date(2017, 1, 1)),
        (3, 300, datetime.date(2017, 3, 1)),
        (5, 500, datetime.date(2017, 5, 1)),
    ]

Create a CTE or subquery that contains that data:

    stmts = [
        # @NOTE: optimization to reduce the size of the statement:
        # make type cast only for first row, for other rows DB engine will infer
        sa.select([
            sa.cast(sa.literal(i), sa.Integer).label("id"),
            sa.cast(sa.literal(v), sa.Integer).label("value"),
            sa.cast(sa.literal(d), sa.DateTime).label("date"),
        ]) if idx == 0 else
        sa.select([sa.literal(i), sa.literal(v), sa.literal(d)])  # no type cast

        for idx, (i, v, d) in enumerate(rows)
    ]
    subquery = sa.union_all(*stmts)

    # Choose one option below.
    # I personally prefer B because one could reuse the CTE multiple times in the same query
    # subquery = subquery.alias("temp_table")  # option A
    subquery = subquery.cte(name="temp_table")  # option B

Create the final query with the required joins and filters:

    query = (
        session
        .query(ExistingTable.id)
        .join(subquery, subquery.c.id == ExistingTable.id)
        # .filter(subquery.c.date >= XXX_DATE)
    )

    # TEMP: Test result output
    for res in query:
        print(res)

Finally, get the pandas data frame:

    out_df = pd.read_sql(query.statement, engine)
    result = out_df.to_dict('records')
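To see the shape of the SQL this approach produces without needing SQL Server, here is a minimal sketch that swaps in Python's built-in sqlite3 module for SQLAlchemy. The helper names `build_values_cte` and `ids_matching` and the sample contents of `existing_table` are made up for illustration, and the type casts on the first row are left out because SQLite does not need them here.

```python
import sqlite3
from datetime import date


def build_values_cte(rows):
    """Return (sql, params) for a UNION ALL of literal SELECTs (hypothetical helper)."""
    if not rows:
        raise ValueError("rows must not be empty")
    # Only the first SELECT names the columns; the remaining rows inherit them.
    selects = ["SELECT ? AS id, ? AS value, ? AS date"]
    selects += ["SELECT ?, ?, ?"] * (len(rows) - 1)
    # Flatten the row tuples into one parameter list, in placeholder order.
    params = [item for row in rows for item in row]
    return " UNION ALL ".join(selects), params


def ids_matching(conn, rows):
    """Join existing_table against the in-statement data, no temp table involved."""
    cte_sql, params = build_values_cte(rows)
    sql = (
        f"WITH temp_table AS ({cte_sql}) "
        "SELECT e.id FROM existing_table AS e "
        "JOIN temp_table AS t ON t.id = e.id "
        "ORDER BY e.id"
    )
    return [r[0] for r in conn.execute(sql, params)]


# Stand-in for the vendor table; its contents are assumed for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE existing_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO existing_table VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

rows = [
    (1, 100, date(2017, 1, 1).isoformat()),
    (3, 300, date(2017, 3, 1).isoformat()),
    (5, 500, date(2017, 5, 1).isoformat()),
]
matching_ids = ids_matching(conn, rows)
print(matching_ids)
```

Because everything travels in a single SELECT statement, there is nothing to create or drop afterwards, and the statement always returns rows. The cost is that each value becomes a bound parameter, so this only suits small to moderate row counts.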

Mikhail Lobanov · Answer

You can try another solution: a process-keyed table.

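A process-keyed table is simply a permanent table that serves as a temp table. To permit several processes to use the table simultaneously, the table has an extra column that identifies the process. The simplest way to do this is the global variable @@spid (@@spid is the process id in SQL Server).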

...

One alternative for the process key is to use a GUID (data type uniqueidentifier).

http://www.sommarskog.se/share_data.html#prockeyed
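As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module and the GUID variant of the key. The table name `keyed_temp`, the column `process_key`, and the helper `join_via_keyed_table` are hypothetical, and it assumes you are allowed to create the shared table somewhere you control.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Stand-in for the existing table; its contents are assumed for this sketch.
conn.execute("CREATE TABLE existing_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO existing_table VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# Permanent shared table: process_key separates one caller's rows from another's.
conn.execute(
    "CREATE TABLE keyed_temp ("
    "process_key TEXT NOT NULL, id INTEGER NOT NULL, value INTEGER)"
)


def join_via_keyed_table(conn, rows):
    """Insert rows under a fresh key, join on them, then remove them again."""
    key = str(uuid.uuid4())  # GUID used as the process key
    try:
        conn.executemany(
            "INSERT INTO keyed_temp (process_key, id, value) VALUES (?, ?, ?)",
            [(key, i, v) for i, v in rows],
        )
        cur = conn.execute(
            "SELECT e.id FROM existing_table AS e "
            "JOIN keyed_temp AS t ON t.id = e.id "
            "WHERE t.process_key = ? ORDER BY e.id",
            (key,),
        )
        return [r[0] for r in cur.fetchall()]
    finally:
        # Always clean up this caller's rows, even if the query fails.
        conn.execute("DELETE FROM keyed_temp WHERE process_key = ?", (key,))
        conn.commit()


result_ids = join_via_keyed_table(conn, [(1, 100), (3, 300), (5, 500)])
leftover = conn.execute("SELECT COUNT(*) FROM keyed_temp").fetchone()[0]
print(result_ids, leftover)
```

Every statement filters on the key, so concurrent callers never see each other's rows. The cleanup in `finally` matters because, unlike a real temp table, nothing removes the rows automatically when the connection closes.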