python - Speeding up templates in GAE-Py by aggregating RPC calls
Here's my problem:
    from google.appengine.ext import db

    class City(db.Model):
        name = db.StringProperty()

    class Author(db.Model):
        name = db.StringProperty()
        city = db.ReferenceProperty(City)

    class Post(db.Model):
        author = db.ReferenceProperty(Author)
        content = db.StringProperty()
The code itself isn't important... here is the Django template:
    {% for post in posts %}
    <div>{{post.content}}</div>
    <div>by {{post.author.name}} {{post.author.city.name}}</div>
    {% endfor %}
Now let's say I get the first 100 posts using Post.all().fetch(limit=100) and pass this list to the template - what happens?
It makes 200 more datastore gets - 100 to get each author, and 100 to get each author's city.
This is understandable, actually, since the post only holds a reference to the author, and the author only holds a reference to the city. The __get__ accessors on post.author and author.city transparently do a get and pull the data back (see this question).
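To make the access pattern concrete, here is a minimal pure-Python sketch of a lazily resolved reference. It is not the App Engine API: `FakeDatastore`, `LazyReference`, and the `render` function are hypothetical stand-ins that only mimic how a `__get__` accessor triggers one get per entity on first access.

```python
class FakeDatastore:
    """Dict-backed stand-in for the datastore that counts single gets."""

    def __init__(self):
        self.entities = {}
        self.get_calls = 0

    def get(self, key):
        self.get_calls += 1
        return self.entities[key]


class LazyReference:
    """Descriptor that stores a key and fetches the entity on first access."""

    def __init__(self, name):
        self.key_attr = "_%s_key" % name
        self.cache_attr = "_%s_entity" % name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if not hasattr(instance, self.cache_attr):
            # One round trip per instance, the first time the attribute is read.
            entity = instance.store.get(getattr(instance, self.key_attr))
            setattr(instance, self.cache_attr, entity)
        return getattr(instance, self.cache_attr)


class City:
    def __init__(self, name):
        self.name = name


class Author:
    city = LazyReference("city")

    def __init__(self, store, name, city_key):
        self.store, self.name, self._city_key = store, name, city_key


class Post:
    author = LazyReference("author")

    def __init__(self, store, content, author_key):
        self.store, self.content, self._author_key = store, content, author_key


def render(posts):
    # Touches the same attributes as the template above.
    return ["%s by %s %s" % (p.content, p.author.name, p.author.city.name)
            for p in posts]


store = FakeDatastore()
for i in range(100):
    store.entities["city%d" % i] = City("City %d" % i)
    store.entities["author%d" % i] = Author(store, "Author %d" % i, "city%d" % i)
posts = [Post(store, "post %d" % i, "author%d" % i) for i in range(100)]

lines = render(posts)
print(store.get_calls)  # 200: one get per author plus one per city
```

Each post resolves its author once and each author resolves its city once, so rendering 100 posts costs 200 individual gets in this model.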
Some ways around this are:
- Use Post.author.get_value_for_datastore(post) to collect the author keys (see the link above), and then do a batch get to fetch them all - the trouble here is that we need to re-construct a template data object... something that needs code and maintenance for each model and handler.
- Write an accessor, say cached_author, that checks memcache for the author first and returns that - the problem here is that post.cached_author is going to be called 100 times, which could mean 100 memcache calls.
- Hold a static key-to-object map (and refresh it maybe once every 5 minutes) if the data doesn't have to be very up to date. The cached_author accessor can then just refer to this map.
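The third option, a static map refreshed every few minutes, could look roughly like the sketch below. `RefreshingMap`, `load_authors`, and the injectable `clock` are hypothetical names introduced for illustration; in a real handler the loader would be whatever batch fetch returns the authors.

```python
import time


class RefreshingMap:
    """Process-local key -> object map, reloaded at most once per `ttl` seconds."""

    def __init__(self, loader, ttl=300, clock=time.time):
        self._loader = loader  # hypothetical callable returning {key: object}
        self._ttl = ttl
        self._clock = clock
        self._data = {}
        self._loaded_at = None

    def get(self, key):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            # One batch load for the whole map, not one call per key.
            self._data = self._loader()
            self._loaded_at = now
        return self._data.get(key)


load_count = [0]


def load_authors():
    # Stand-in for a batch fetch of all authors.
    load_count[0] += 1
    return {"author1": "Alice", "author2": "Bob"}


fake_now = [0.0]  # fake clock so the expiry can be shown without waiting
authors = RefreshingMap(load_authors, ttl=300, clock=lambda: fake_now[0])

first = [authors.get("author1") for _ in range(100)]  # a single load
fake_now[0] = 301.0
after_expiry = authors.get("author2")  # map is stale, so it reloads once
```

The trade-off is the one already noted: lookups are free between refreshes, but readers can see data up to `ttl` seconds old, and each process instance keeps its own copy.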
All these ideas need extra code and maintenance, and they're not very transparent. What if we could do this:

    @prefetch
    def render_template(path, data):
        return template.render(path, data)
Turns out we can... hooks and Guido's instrumentation module both prove it. If the @prefetch method wraps a template render and captures which keys are requested, we can (at least to one level of depth) capture the keys being requested, return mock objects, and do a batch get on them. This could be repeated for all depth levels, until no new keys are being requested. The final render could intercept the gets and return the objects from a map.
This would change a total of 200 gets into 3, transparently and without any extra code. Not to mention it would greatly cut down the need for memcache, and help in situations where memcache can't be used.
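The multi-pass idea can be sketched in plain Python, under heavy assumptions: nothing here hooks into App Engine or Django. `FakeDatastore`, `Mock`, `Prefetcher`, `Reference`, and `Entity` are hypothetical stand-ins, and the "template" is an ordinary function. The sketch only shows the control flow: render with mocks, collect the missing keys, batch get them, and repeat until a pass requests nothing new.

```python
import functools


class FakeDatastore:
    """Dict-backed stand-in for the datastore that counts batch gets."""

    def __init__(self):
        self.entities = {}
        self.batch_gets = 0

    def get_multi(self, keys):
        self.batch_gets += 1
        return {key: self.entities[key] for key in keys}


class Mock:
    """Placeholder for an entity that has not been loaded yet."""

    def __getattr__(self, name):
        return self  # post.author.city.name on a mock stays a mock

    def __str__(self):
        return ""


class Prefetcher:
    """Decorator object: re-runs a render until no new keys are requested."""

    def __init__(self, store):
        self.store = store
        self.loaded = {}      # key -> entity fetched so far
        self.missing = set()  # keys requested but not yet loaded

    def resolve(self, key):
        if key in self.loaded:
            return self.loaded[key]
        self.missing.add(key)
        return Mock()

    def __call__(self, render):
        @functools.wraps(render)
        def wrapper(*args, **kwargs):
            self.missing = set()
            while True:
                output = render(*args, **kwargs)
                if not self.missing:
                    return output
                # One batch get per depth level of references.
                self.loaded.update(self.store.get_multi(sorted(self.missing)))
                self.missing = set()
        return wrapper


class Reference:
    """Descriptor that asks the prefetcher instead of doing its own get."""

    def __init__(self, name):
        self.key_attr = "%s_key" % name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.resolver.resolve(getattr(instance, self.key_attr))


class Entity:
    def __init__(self, resolver, **fields):
        self.resolver = resolver
        self.__dict__.update(fields)


class Author(Entity):
    city = Reference("city")


class Post(Entity):
    author = Reference("author")


store = FakeDatastore()
prefetch = Prefetcher(store)
for i in range(100):
    store.entities["city%d" % i] = Entity(prefetch, name="City %d" % i)
    store.entities["author%d" % i] = Author(
        prefetch, name="Author %d" % i, city_key="city%d" % i)
posts = [Post(prefetch, content="post %d" % i, author_key="author%d" % i)
         for i in range(100)]


@prefetch
def render_template(posts):
    return ["%s by %s %s" % (p.content, p.author.name, p.author.city.name)
            for p in posts]


html = render_template(posts)
```

In this model the render runs three times and issues two batch gets (authors, then cities) on top of the initial query. Two caveats follow directly from the design: the render must be free of side effects because it is executed repeatedly, and any template logic that branches on a mocked value may request different keys than the final pass would.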
Trouble is, I don't know how to do it (yet). Before I start trying, has anyone else done this? Or does anyone want to help? Or do you see a massive flaw in the plan?
I have been in a similar situation. Instead of ReferenceProperty, I had parent/child relationships, but the basics are the same. My current solution is not polished, but it is at least efficient enough for reports and things with 200-1,000 entities, each with several subsequent child entities that require fetching.
You can manually fetch the data in batches and set it on the entities yourself if you want.
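    from google.appengine.ext import db

    # Given the posts, fetch all the data the template will need
    # with just 2 batch gets by key from the datastore.
    posts = get_the_posts()

    # Read the stored keys without triggering the lazy per-entity get.
    author_keys = [Post.author.get_value_for_datastore(x) for x in posts]
    authors = db.get(author_keys)

    city_keys = [Author.city.get_value_for_datastore(x) for x in authors]
    cities = db.get(city_keys)

    # The three lists are aligned by position, so zip pairs them correctly.
    for post, author, city in zip(posts, authors, cities):
        post.author = author
        author.city = city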
Now when you render the template, no additional queries or fetches will be done. It's rough around the edges, but I could not live without the pattern I just described.
Also, you might consider validating that none of your entities are None, because db.get() will return None if a key is bad. That is getting into basic data validation, though. Similarly, you need to retry db.get() if there is a timeout, etc.
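The validation and retry points can be sketched as follows. This is a generic illustration, not App Engine code: `Timeout` is a local stand-in for whatever timeout error the datastore client raises, and `get_with_retry`, `fetch_valid`, and `flaky_batch_get` are hypothetical helpers. The sketch also de-duplicates keys, which is an addition of mine and not part of the pattern above.

```python
import time


class Timeout(Exception):
    """Stand-in for a datastore timeout error; not the App Engine class."""


def get_with_retry(batch_get, keys, attempts=3, delay=0.0):
    """Call batch_get(keys), retrying on Timeout up to `attempts` times."""
    for attempt in range(attempts):
        try:
            return batch_get(keys)
        except Timeout:
            if attempt == attempts - 1:
                raise  # out of attempts, let the caller see the error
            time.sleep(delay)


def fetch_valid(batch_get, keys):
    """Batch-get the unique keys and return {key: entity}, dropping None results."""
    unique_keys = list(dict.fromkeys(keys))  # de-duplicate, keep first-seen order
    entities = get_with_retry(batch_get, unique_keys)
    return {k: e for k, e in zip(unique_keys, entities) if e is not None}


data = {"a1": "Alice", "a2": "Bob"}
calls = []


def flaky_batch_get(keys):
    # Fails on the first call, then behaves like a batch get that
    # returns None for keys it cannot find.
    calls.append(list(keys))
    if len(calls) == 1:
        raise Timeout()
    return [data.get(k) for k in keys]


authors = fetch_valid(flaky_batch_get, ["a1", "a2", "a1", "bad"])
```

Because the result is a map instead of a position-aligned list, entities would be attached by key lookup here, and posts whose author key is absent from the map can be skipped or reported.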
(Finally, I don't think memcache would work as a primary solution. Maybe as a secondary layer to speed up datastore calls, but your app needs to work well even if memcache is empty. Also, memcache has several quotas of its own, such as the number of memcache calls and total data transferred. Overusing memcache is a great way to kill your app dead.)