Automatic Admin Systems - Semantics with Rails & Django
January 18th, 2008
The Magically Appearing Admin
Web developers using an MVC framework build their websites by working with models, views and controllers. Then, by adding a few lines of magic, an admin system appears which allows users to add, edit, delete, view and search their models.
Examples:
Django’s Magic Admin (also NewFormsAdmin, a branch of Django focused on making it easier to customise the auto-admin)
Ruby on Rails plugins:
- Streamlined framework - http://streamlinedframework.org/
  - Admin magic/config: outside of the models
- Auto-Admin - http://code.trebex.net/auto-admin
  - Admin magic/config: inside the models
Latent Semantic Analysis in Python
December 19th, 2007
Latent Semantic Analysis (LSA) is a mathematical method that tries to bring out latent relationships within a collection of documents. Rather than looking at each document isolated from the others it looks at all the documents as a whole and the terms within them to identify relationships.
An example of LSA:
Using a search engine, search for “sand”.
Documents are returned which do not contain the search term “sand” but do contain terms like “beach”.
LSA has identified a latent relationship: “sand” is semantically close to “beach”.
There are some very good papers which describe LSA in detail:
- An introduction to LSA: http://lsa.colorado.edu/papers/dp1.LSAintro.pdf
- Creating your own LSA space: http://www.andrew.cmu.edu/user/jquesada/pdf/bookSpacesRev1.pdf
- Latent semantic analysis: http://en.wikipedia.org/wiki/Latent_semantic_indexing
This is an implementation of LSA in Python (2.4+). Thanks to SciPy it's rather simple!
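The rank-reduction step at the heart of LSA can be sketched in a few lines. The following is a minimal illustration written for Python 3, not the implementation referred to above: it assumes NumPy is installed, and the four tiny documents, the `reduce_rank` helper and the choice of two dimensions are all made up for the example.

```python
import numpy as np

terms = ["sand", "beach", "code", "python"]

# Rows are terms, columns are four made-up documents:
# d0 = "sand beach", d1 = "beach", d2 = "code python", d3 = "python python"
counts = np.array([
    [1, 0, 0, 0],   # sand
    [1, 1, 0, 0],   # beach
    [0, 0, 1, 0],   # code
    [0, 0, 1, 2],   # python
], dtype=float)


def reduce_rank(matrix, k):
    """Return the best rank-k approximation of matrix via a truncated SVD."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


reduced = reduce_rank(counts, 2)

sand = terms.index("sand")
print("raw weight of 'sand' in d1:    ", counts[sand, 1])
print("reduced weight of 'sand' in d1:", round(reduced[sand, 1], 3))
```

In the raw counts, “sand” has a weight of zero in the document that only mentions “beach”. After the reduction that weight is positive, while the weights in the two programming documents stay at (approximately) zero, which is the kind of latent relationship described above.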
Building a Vector Space Search Engine in Python
November 27th, 2007
A vector space search involves converting documents into vectors. Each dimension within the vectors represents a term. If a document contains that term then the value in that dimension is greater than zero.
Here is an implementation of vector space searching using Python (2.4+).
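As a small illustration of the idea (a sketch, not the implementation mentioned above), the example below builds term-count vectors with the standard library and ranks documents by cosine similarity to the query. The function names, the whitespace tokeniser and the use of cosine similarity as the ranking measure are assumptions made for the example.

```python
import math
from collections import Counter


def to_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents):
    """Return (score, document) pairs with a non-zero score, best first."""
    # One dimension per distinct term found in the collection.
    vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
    query_vector = to_vector(query, vocabulary)
    scored = [(cosine(query_vector, to_vector(doc, vocabulary)), doc)
              for doc in documents]
    return sorted((pair for pair in scored if pair[0] > 0), reverse=True)


documents = ["the cat sat", "the dog barked", "cat and dog"]
for score, doc in search("cat", documents):
    print(round(score, 3), doc)
```

Only the two documents that contain “cat” are returned. Unlike LSA, a plain vector space search cannot match a document that shares no terms with the query.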
Funkload Build script
November 23rd, 2007
Funkload is an open source, Python-based unit testing tool. It also serves as a good tool for load testing. We can use it to create a unit test which simulates a user browsing through a site. To test load, run two simultaneous instances of the unit test, then keep scaling up the number of concurrent instances.
Official Site: http://funkload.nuxeo.org/
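The scaling-up approach can be illustrated without Funkload itself. The sketch below is plain Python, not Funkload's API: `simulated_user`, `run_cycle` and `ramp_up` are hypothetical names, and `visit_page` stands in for whatever fetches a page in a real test.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_user(visit_page, pages):
    """Visit each page in order and return the total time taken in seconds."""
    start = time.perf_counter()
    for page in pages:
        visit_page(page)
    return time.perf_counter() - start


def run_cycle(visit_page, pages, concurrent_users):
    """Run the same browsing session for several users at the same time."""
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        futures = [pool.submit(simulated_user, visit_page, pages)
                   for _ in range(concurrent_users)]
        return [future.result() for future in futures]


def ramp_up(visit_page, pages, cycles=(1, 2, 4)):
    """Return the slowest session time for each level of concurrency."""
    return {users: max(run_cycle(visit_page, pages, users)) for users in cycles}


# A stand-in "fetch" that only records which page was requested.
visited = []
timings = ramp_up(visited.append, ["/", "/about"], cycles=(1, 2, 3))
print(sorted(timings), len(visited))
```

Comparing the slowest session time across cycles shows how response times degrade as the number of concurrent users grows.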
I have written a Python-based Funkload build script which:
- Builds the Funkload configuration for multiple sites
- Uses wget to generate a sample of pages for load testing
- Runs load tests
- Builds HTML documentation from test results.
Keeping the Cache Hot
November 15th, 2007
Problem
The expiry of content within caching architectures is only identified when a user makes a request for expired data. Hence a percentage of the visitors to the site will not be able to take advantage of caching. Many different caching architectures are used within a typical dynamic site, so the solution needs to be cache agnostic.
Architecture
Emmao bot is the name given to the Python program which is used to keep the cache hot.
Figure 1: Emmaobot Server UML Model

Solution
Emmao bot has been built to act as a user agent and request pages. mod_python is used to make the Apache children log their requests in a special format. Emmao bot runs in the background as a daemon process, either on the webhead or on an alternative server. It examines the special Apache log files and adds an event for when each logged page will expire. libevent is used to manage these events. Pages are ranked based on analysis performed as Emmao bot runs. The rank is used to ensure that the most important, popular or heavy pages never expire. If there is a limit on the number of pages to focus on, rank can also be used to decide which pages to ignore.
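The scheduling and ranking logic described above can be sketched as follows. This is an illustration under stated assumptions, not Emmao bot's code: it uses plain dictionaries rather than libevent, ranks pages by request count (the description only says ranking is based on analysis), and the `RefreshScheduler` class, its method names and the `ttl` and `lead_time` parameters are all hypothetical.

```python
import heapq


class RefreshScheduler:
    """Track page expiry times and pick pages to re-request before they expire."""

    def __init__(self, max_pages):
        self.max_pages = max_pages
        self.rank = {}    # url -> number of logged requests (assumed ranking)
        self.expiry = {}  # url -> time at which the cached copy expires

    def record_request(self, url, now, ttl):
        """Called for each log entry: bump the rank and note the new expiry."""
        self.rank[url] = self.rank.get(url, 0) + 1
        self.expiry[url] = now + ttl

    def managed_pages(self):
        """Only the highest-ranked pages are kept hot; the rest are ignored."""
        return set(heapq.nlargest(self.max_pages, self.rank, key=self.rank.get))

    def due(self, now, lead_time):
        """Managed pages that will expire within lead_time seconds of now."""
        return sorted(url for url in self.managed_pages()
                      if self.expiry[url] - lead_time <= now)


scheduler = RefreshScheduler(max_pages=2)
for url, hits in (("/home", 3), ("/news", 2), ("/rare", 1)):
    for _ in range(hits):
        scheduler.record_request(url, now=0, ttl=60)

print(scheduler.due(now=10, lead_time=10))
print(scheduler.due(now=55, lead_time=10))
```

A daemon would call `due` periodically and request each returned page as a normal user agent, so the cache is refilled before a real visitor hits expired content. The `max_pages` limit is what keeps the extra load on the webserver bounded.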
The Cost
Although the number of pages that Emmao bot manages can be capped to limit load on the webserver, there is still an increase in traffic due to Emmao bot.
In live production environments with Emmao bot managing 10,000 pages, I have not found the performance cost to outweigh the benefit of reducing maximum user fetch time.
Links
LibEvent http://monkey.org/~provos/libevent
ModPython http://www.modpython.org/