python - Speeding up templates in GAE-Py by aggregating RPC calls


Here's my problem:

    from google.appengine.ext import db

    class City(db.Model):
        name = db.StringProperty()

    class Author(db.Model):
        name = db.StringProperty()
        city = db.ReferenceProperty(City)

    class Post(db.Model):
        author = db.ReferenceProperty(Author)
        content = db.StringProperty()

The code isn't important... it's this Django template that matters:

    {% for post in posts %}
    <div>{{post.content}}</div>
    <div>by {{post.author.name}} from {{post.author.city.name}}</div>
    {% endfor %}

Now let's say I get the first 100 posts using Post.all().fetch(limit=100), and pass this list to the template - what happens?

It makes 200 more datastore gets - 100 to get each author, and 100 to get each author's city.

This is perfectly understandable, actually, since the post only has a reference to the author, and the author only has a reference to the city. The __get__ accessor on the post.author and author.city objects transparently does a get and pulls the data back (see this question).
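To make the access pattern concrete, here is a minimal sketch in plain Python that imitates the lazy lookup with a descriptor and counts the round trips. It does not use the App Engine API: `FAKE_DATASTORE`, `fake_get`, `LazyReference` and `demo` are hypothetical names invented for the illustration, and the real ReferenceProperty is more involved.

```python
# Stand-in for the datastore: key -> entity, plus a log of single-key gets.
FAKE_DATASTORE = {}
GET_CALLS = []


def fake_get(key):
    """Simulates one datastore round trip for a single key (hypothetical)."""
    GET_CALLS.append(key)
    return FAKE_DATASTORE[key]


class LazyReference(object):
    """Descriptor that stores a key and fetches the entity on first access."""

    def __init__(self, name):
        self.key_attr = "_%s_key" % name
        self.cache_attr = "_%s_resolved" % name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self.cache_attr not in instance.__dict__:
            # First access: one round trip, then cached on the instance.
            key = instance.__dict__[self.key_attr]
            instance.__dict__[self.cache_attr] = fake_get(key)
        return instance.__dict__[self.cache_attr]

    def __set__(self, instance, key):
        instance.__dict__[self.key_attr] = key
        instance.__dict__.pop(self.cache_attr, None)


class City(object):
    def __init__(self, name):
        self.name = name


class Author(object):
    city = LazyReference("city")

    def __init__(self, name, city_key):
        self.name = name
        self.city = city_key


class Post(object):
    author = LazyReference("author")

    def __init__(self, content, author_key):
        self.content = content
        self.author = author_key


def render(posts):
    """Mimics the template loop: touches author and city for every post."""
    return ["%s by %s %s" % (p.content, p.author.name, p.author.city.name)
            for p in posts]


def demo(n=100):
    """Builds n posts with distinct authors and cities, returns the get count."""
    del GET_CALLS[:]
    FAKE_DATASTORE.clear()
    for i in range(n):
        FAKE_DATASTORE["city%d" % i] = City("City %d" % i)
        FAKE_DATASTORE["author%d" % i] = Author("Author %d" % i, "city%d" % i)
    posts = [Post("post %d" % i, "author%d" % i) for i in range(n)]
    render(posts)
    return len(GET_CALLS)
```

With 100 posts whose authors and cities are all distinct, `demo(100)` counts one get per author and one per city, which is the 200 extra gets described above.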

Some ways around this are:

  1. Use Post.author.get_value_for_datastore(post) to collect the author keys (see link above), and then do a batch get to fetch them all - the trouble here is that we need to re-construct a template data object... something that needs extra code and maintenance for each model and handler.
  2. Write an accessor, say cached_author, that checks memcache for the author first and returns that - the problem here is that post.cached_author is going to be called 100 times, which could mean 100 memcache calls.
  3. Hold a static key-to-object map (and refresh it maybe once every 5 minutes) if the data doesn't have to be very up to date. The cached_author accessor can then just refer to this map.
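As a rough illustration of the third option, the sketch below keeps a key-to-object map in process memory and reloads it once it is older than a configurable age. It is plain Python and makes several assumptions: `RefreshingMap`, `load_all` and the injectable `clock` are hypothetical names, and `load_all` stands in for whatever query would load all authors.

```python
import time


class RefreshingMap(object):
    """Keeps a key -> object map in memory and reloads it when it is stale."""

    def __init__(self, load_all, max_age_seconds=300, clock=time.time):
        self._load_all = load_all        # hypothetical: returns a dict of key -> object
        self._max_age = max_age_seconds  # 300 seconds = the "once in 5 minutes" refresh
        self._clock = clock              # injectable so the refresh can be tested
        self._data = {}
        self._loaded_at = None

    def get(self, key):
        now = self._clock()
        stale = self._loaded_at is None or now - self._loaded_at >= self._max_age
        if stale:
            # One bulk load replaces the whole map.
            self._data = dict(self._load_all())
            self._loaded_at = now
        return self._data.get(key)
```

A cached_author accessor could then return something like `AUTHORS.get(author_key)`, where `AUTHORS` is a module-level `RefreshingMap`. The trade-off is that readers may see data up to `max_age_seconds` old, and each process instance holds its own copy.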

All these ideas need extra code and maintenance, and they're not very transparent. What if we could do this:

    @prefetch
    def render_template(path, data):
        return template.render(path, data)

Turns out we can... hooks and Guido's instrumentation module both prove it. If the @prefetch method wraps a template render by capturing which keys are requested, we can (at least to one level of depth) capture which keys are being requested, return mock objects, and do a batch get on them. This could be repeated for all depth levels, till no new keys are being requested. The final render could intercept the gets and return the objects from a map.

This would change a total of 200 gets into 3, transparently and without any extra code. Not to mention it would greatly cut down the need for memcache and help in situations where memcache can't be used.
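The repeated-render idea can be tried out as a toy simulation in plain Python. This is only a sketch under stated assumptions, not the datastore hooks or the instrumentation module: `Ref`, `Placeholder`, `Entity`, `PrefetchSession`, `batch_get` and `demo` are hypothetical names, the decorator takes a session argument (unlike the bare @prefetch above), and a list comprehension stands in for the template.

```python
class Ref(object):
    """Reference to another entity, held as a key only."""
    def __init__(self, key):
        self.key = key


class Placeholder(object):
    """Mock for a not-yet-loaded entity; every attribute renders as ''."""
    def __getattr__(self, name):
        return Placeholder()

    def __str__(self):
        return ""


class Entity(object):
    """Field container whose Ref fields are resolved through the session."""
    def __init__(self, session, fields):
        self._session = session
        self._fields = fields

    def __getattr__(self, name):
        try:
            value = self._fields[name]
        except KeyError:
            raise AttributeError(name)
        if isinstance(value, Ref):
            return self._session.resolve(value.key)
        return value


class PrefetchSession(object):
    """Records requested keys and loads them in one batch per pass."""
    def __init__(self, batch_get):
        self._batch_get = batch_get  # hypothetical: list of keys -> list of field dicts
        self.loaded = {}
        self.wanted = set()
        self.batch_calls = 0

    def resolve(self, key):
        if key in self.loaded:
            return self.loaded[key]
        self.wanted.add(key)         # remember the key, hand back a mock
        return Placeholder()

    def flush(self):
        """Batch-loads all wanted keys; returns False if nothing was wanted."""
        keys = sorted(self.wanted)
        self.wanted.clear()
        if not keys:
            return False
        self.batch_calls += 1
        for key, fields in zip(keys, self._batch_get(keys)):
            self.loaded[key] = Entity(self, fields)
        return True


def prefetch(session):
    """Re-runs the render until a pass requests no new keys."""
    def decorator(render):
        def wrapper(*args, **kwargs):
            while True:
                output = render(*args, **kwargs)
                if not session.flush():
                    return output
        return wrapper
    return decorator


def demo(n=100):
    store = {}
    for i in range(n):
        store["city%d" % i] = {"name": "City %d" % i}
        store["author%d" % i] = {"name": "Author %d" % i, "city": Ref("city%d" % i)}
    session = PrefetchSession(lambda keys: [store[k] for k in keys])
    posts = [Entity(session, {"content": "post %d" % i, "author": Ref("author%d" % i)})
             for i in range(n)]

    @prefetch(session)
    def render_posts(items):
        return ["%s by %s %s" % (p.content, p.author.name, p.author.city.name)
                for p in items]

    return render_posts(posts), session.batch_calls
```

In this simulation the first pass collects the author keys, the second collects the city keys, and the third renders with everything loaded, so `demo()` reports two batch gets on top of the initial query. The render runs several times, so this only makes sense if rendering has no side effects.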

Trouble is, I don't know how to do it (yet). Before I start trying, has anyone else done this? Or does anyone want to help? Or do you see a massive flaw in the plan?

I have been in a similar situation. Instead of ReferenceProperty, I had parent/child relationships, but the basics are the same. My current solution is not polished, but it is at least efficient enough for reports and things with 200-1,000 entities, each with several subsequent child entities that require fetching.

You can manually fetch the data in batches and set it on the entities yourself if you want.

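    # Given the posts, fetch all the data the template will need
    # with just 2 batched gets by key from the datastore.
    posts = get_the_posts()

    # Read the stored keys without triggering a get per reference.
    author_keys = [Post.author.get_value_for_datastore(x) for x in posts]
    authors = db.get(author_keys)

    city_keys = [Author.city.get_value_for_datastore(x) for x in authors]
    cities = db.get(city_keys)

    # Attach the fetched entities so the template finds them already resolved.
    for post, author, city in zip(posts, authors, cities):
        post.author = author
        author.city = city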

Now when you render the template, no additional queries or fetches will be done. It's rough around the edges, but I could not live without the pattern I just described.

Also, you might consider validating that none of your entities are None, because db.get() will return None if the key is bad. That is getting into basic data validation, though. Similarly, you need to retry db.get() if there is a timeout, etc.
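The validation and retry steps could be wrapped in a small helper along these lines. This is a sketch in plain Python, not part of the App Engine API: `get_with_retry`, its parameters and the choice of LookupError are all assumptions, and `batch_get` stands in for a call such as db.get. In a real handler you would probably pass the datastore timeout exception as `retry_on` instead of the catch-all default.

```python
import time


def get_with_retry(batch_get, keys, attempts=3, delay_seconds=0.0,
                   retry_on=(Exception,)):
    """Calls batch_get(keys), retrying on the given exceptions (attempts >= 1),
    and raises LookupError if any key came back as None."""
    last_error = None
    for attempt in range(attempts):
        try:
            entities = batch_get(keys)
            break
        except retry_on as error:
            last_error = error
            if attempt + 1 < attempts:
                time.sleep(delay_seconds)  # simple fixed pause between tries
    else:
        # Every attempt failed: surface the last error to the caller.
        raise last_error

    # A None result means the key did not match any stored entity.
    missing = [key for key, entity in zip(keys, entities) if entity is None]
    if missing:
        raise LookupError("no entity for keys: %r" % (missing,))
    return entities
```

Raising on a missing entity is one possible policy; skipping the affected posts or substituting a default author would be equally reasonable, depending on what the template should show.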

(Finally, I don't think memcache will work as a primary solution. Maybe as a secondary layer to speed up datastore calls, but your app needs to work well if memcache is empty. Also, memcache has several quotas of its own, such as memcache calls and total data transferred. Overusing memcache is a great way to kill your app dead.)
