python - How to TABLESAMPLE with SQLAlchemy? -
is possible tablesample
on of tables in orm query in sqlalchemy?
the closest find query.suffix_with(*suffixes)
, places suffix on end of query, rather on end of table.
if helps, use case:
my query currently:
query = session.query(a).options( subqueryload(a.rel1), subqueryload(a.rel2) ).filter(a.id >= min_id, a.id < max_id, [...])
and tablesample
on table a
efficiently load subset of a
, corresponding relations, filters. using postgresql 9.5 database.
update
sqlalchemy 1.1 , newer support tablesample
selectables. fromclause
elements have method fromclause.tablesample()
(think table
s) , sqlalchemy.tablesample()
function can used selectables, example declarative model classes:
from sqlalchemy import tablesample, func sqlalchemy.orm import aliased # create alias uses system sampling method (default) a_sampled = aliased(a, tablesample(a, 2.5)) # create alias uses bernoulli sampling method a_bernoulli = aliased(a, tablesample(a, func.bernoulli(2.5)))
there's slight asymmetry in tablesample(a, ...)
, a
declarative class, returns tablesample
from-clause, has aliased if used model. otherwise acts fromclause
, e.g. table
underlying model class.
before
there seems no discussion tablesample
support @ time of writing this. sqlalchemy extensible, here's simple (read: may not work use cases) implementation of tablesample support:
from sqlalchemy.ext.compiler import compiles sqlalchemy.sql import alias, fromclause sqlalchemy.orm import aliased class tablesample(alias): __visit_name__ = 'tablesample' def __init__(self, selectable, argument=none, method=none, seed=none): super(tablesample, self).__init__(selectable) self.method = method self.argument = argument self.seed = seed def tablesample(element, argument=none, method=none, seed=none): if isinstance(element, fromclause): return tablesample(element, argument=argument, method=method, seed=seed) else: return aliased(element, tablesample(element.__table__, argument=argument, method=method, seed=seed)) @compiles(tablesample) def compile_tablesample(element, compiler, **kwargs): s = "%s tablesample %s(%s)" % ( compiler.visit_alias(element, **kwargs), element.method or 'system', element.argument) if element.seed: s += " repeatable (%s)" % compiler.process(element.seed, **kwargs) return s
the argument
should time being float representing percentage between 0 , 100 convenience, though postgresql accept real-valued expression. seed
on other hand compiled , literal python values have wrapped literal()
or like.
it should used in similar fashion aliased
:
a_sampled = tablesample(a, 2.5) query = session.query(a_sampled).options( subqueryload(a_sampled.rel1), subqueryload(a_sampled.rel2) ).filter(a_sampled.id >= min_id, a_sampled.id < max_id, [...])
Comments
Post a Comment