The Django Book

Chapter 13: Caching

Static Web sites, in which simple files are served directly to the Web, scale like crazy. But a fundamental tradeoff in dynamic Web sites is, well, they're dynamic. Each time a user requests a page, the Web server makes all sorts of calculations, from database queries to template rendering to business logic, to create the page that your site's visitor sees. From a processing-overhead perspective, this is quite expensive.

For most Web applications, this overhead isn't a big deal. Most Web applications aren't washingtonpost.com or Slashdot; they're simply small- to medium-sized sites with so-so traffic. But for medium- to high-traffic sites, it's essential to cut as much overhead as possible. That's where caching comes in.

To cache something is to save the result of an expensive calculation so that you don't have to perform the calculation next time. Here's some pseudocode explaining how this would work for a dynamically generated Web page:

given a URL, try finding that page in the cache
if the page is in the cache:
    return the cached page
else:
    generate the page
    save the generated page in the cache (for next time)
    return the generated page
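
The pseudocode above translates almost line for line into Python. The following is only a minimal sketch: a plain dictionary stands in for a real cache back-end, and the names page_cache, generate_page, and get_page are made up for illustration (they are not part of Django).

```python
# Hypothetical in-process page cache; a dict stands in for a real back-end.
page_cache = {}


def generate_page(url):
    """Stand-in for the expensive work (queries, template rendering)."""
    return "<html><body>Content for %s</body></html>" % url


def get_page(url):
    """Return the page for `url`, generating and caching it on a miss."""
    if url in page_cache:
        # Cache hit: skip the expensive work entirely.
        return page_cache[url]
    page = generate_page(url)
    page_cache[url] = page  # save the generated page for next time
    return page
```

A real cache also needs expiration and a size limit, which this sketch leaves out.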

Django comes with a robust cache system that lets you save dynamic pages so they don't have to be calculated for each request. For convenience, Django offers different levels of cache granularity. You can cache the response of specific views, you can cache only the pieces that are difficult to produce, or you can cache your entire site.

Django also works well with upstream caches, such as Squid (http://www.squid-cache.org/) and browser-based caches. These are the types of caches that you don't directly control but to which you can provide hints (via HTTP headers) about which parts of your site should be cached, and how.

Read on to discover how to use Django's caching system. When your site gets Slashdotted, you'll be happy you understand this material.

Setting Up the Cache

The cache system requires a small amount of setup. Namely, you have to tell it where your cached data should live: in a database, on the filesystem, or directly in memory. This is an important decision that affects your cache's performance (yes, some cache types are faster than others). In-memory caching will generally be much faster than filesystem or database caching, because it lacks the overhead of hitting the filesystem or database.

Your cache preference goes in the CACHE_BACKEND setting in your settings file. If you use caching and do not specify CACHE_BACKEND, Django will use simple:/// by default. The following sections explain all available values for CACHE_BACKEND.

Memcached

By far the fastest, most efficient type of cache available to Django, Memcached is an entirely memory-based cache framework originally developed to handle high loads at LiveJournal (http://www.livejournal.com/) and subsequently open-sourced by Danga Interactive (http://danga.com/). It's used by sites such as Slashdot and Wikipedia to reduce database access and dramatically increase site performance.

Memcached is available for free at http://danga.com/memcached/. It runs as a daemon and is allotted a specified amount of RAM. Its primary feature is to provide an interface (a super-lightning-fast interface) for adding, retrieving, and deleting arbitrary data in the cache. All data is stored directly in memory, so there's no overhead of database or filesystem usage.

After installing Memcached itself, you'll need to install the Memcached Python bindings, which are not bundled with Django directly. These bindings are in a single Python module, memcache.py, which is available at http://www.tummy.com/Community/software/python-memcached/.

To use Memcached with Django, set CACHE_BACKEND to memcached://ip:port/, where ip is the IP address of the Memcached daemon and port is the port on which Memcached is running.

In this example, Memcached is running on localhost (127.0.0.1) port 11211:

CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

One excellent feature of Memcached is its ability to share cache over multiple servers. This means you can run Memcached daemons on multiple machines, and the program will treat the group of machines as a single cache, without the need to duplicate cache values on each machine. To take advantage of this feature with Django, include all server addresses in CACHE_BACKEND, separated by semicolons.

In this example, the cache is shared over Memcached instances running on the IP addresses 172.19.26.240 and 172.19.26.242, both of which are on port 11211:

CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'

In the following example, the cache is shared over Memcached instances running on the IP addresses 172.19.26.240 (port 11211), 172.19.26.242 (port 11212), and 172.19.26.244 (port 11213):

CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11212;172.19.26.244:11213/'
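
To get a feel for how several servers can act as a single cache without duplicating values, here is a rough sketch. It assumes each key is assigned to exactly one server by hashing the key; that is one common approach, not necessarily what the Memcached bindings do internally. The helper names parse_servers and pick_server are hypothetical.

```python
import zlib

CACHE_BACKEND = 'memcached://172.19.26.240:11211;172.19.26.242:11211/'


def parse_servers(backend_uri):
    """Extract the list of 'ip:port' strings from a memcached:// URI."""
    scheme, rest = backend_uri.split('://', 1)
    if scheme != 'memcached':
        raise ValueError('not a memcached back-end: %r' % backend_uri)
    # Drop any query-string arguments and the trailing slash.
    host_part = rest.split('?', 1)[0].rstrip('/')
    return host_part.split(';')


def pick_server(key, servers):
    """Map a key to one server with a stable hash (illustrative only)."""
    index = zlib.crc32(key.encode('utf-8')) % len(servers)
    return servers[index]


servers = parse_servers(CACHE_BACKEND)
```

Because the same key always hashes to the same server, every Web process looks for a given value in the same place, and no value has to be stored twice.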

A final point about Memcached is that memory-based caching has one important disadvantage. Because the cached data is stored only in memory, the data will be lost if your server crashes. Clearly, memory isn't intended for permanent data storage, so don't rely on memory-based caching as your only data storage. Without a doubt, none of the Django caching back-ends should be used for permanent storage (they're all intended to be solutions for caching, not storage), but we point this out here because memory-based caching is particularly temporary.

Database Caching

To use a database table as your cache back-end, create a cache table in your database and point Django's cache system at that table.

First, create a cache table by running this command:

python manage.py createcachetable [cache_table_name]

where [cache_table_name] is the name of the database table to create. This name can be whatever you want, as long as it's a valid table name that's not already being used in your database. This command creates a single table in your database that is in the proper format Django's database-cache system expects.

Once you've created that database table, set your CACHE_BACKEND setting to "db://tablename", where tablename is the name of the database table. In this example, the cache table's name is my_cache_table:

CACHE_BACKEND = 'db://my_cache_table'

The database caching back-end uses the same database as specified in your settings file. You can't use a different database back-end for your cache table.

Filesystem Caching

To store cached items on a filesystem, use the "file://" cache type for CACHE_BACKEND, specifying the directory on your filesystem that should store the cached data.

For example, to store cached data in /var/tmp/django_cache, use this setting:

CACHE_BACKEND = 'file:///var/tmp/django_cache'

Note that there are three forward slashes toward the beginning of the preceding example. The first two are for file://, and the third is the first character of the directory path, /var/tmp/django_cache. If you're on Windows, put the drive letter after the file://, like so: file://c:/foo/bar.

The directory path should be absolute; that is, it should start at the root of your filesystem. It doesn't matter whether you put a slash at the end of the setting.

Make sure the directory pointed to by this setting exists and is readable and writable by the system user under which your Web server runs. Continuing the preceding example, if your server runs as the user apache, make sure the directory /var/tmp/django_cache exists and is readable and writable by the user apache.

Each cache value will be stored as a separate file whose contents are the cache data saved in a serialized (pickled) format, using Python's pickle module. Each file's name is the cache key, escaped for safe filesystem use.
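
The one-file-per-key layout is easy to picture with a small sketch. This is not Django's implementation; it assumes URL-style quoting as the escaping scheme, ignores timeouts, and uses a temporary directory so it can run anywhere. The helpers file_set and file_get are hypothetical names.

```python
import os
import pickle
import tempfile
from urllib.parse import quote


def _path_for(cache_dir, key):
    # Escape the key so characters such as '/' are safe in a file name.
    # (URL quoting is an assumption made for this sketch.)
    return os.path.join(cache_dir, quote(key, safe=''))


def file_set(cache_dir, key, value):
    """Pickle `value` into its own file inside `cache_dir`."""
    with open(_path_for(cache_dir, key), 'wb') as f:
        pickle.dump(value, f)


def file_get(cache_dir, key, default=None):
    """Unpickle the value stored for `key`, or return `default` on a miss."""
    try:
        with open(_path_for(cache_dir, key), 'rb') as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        return default


cache_dir = tempfile.mkdtemp()  # stands in for /var/tmp/django_cache
file_set(cache_dir, 'views/home', [1, 2, 3])
```

The sketch also shows why the directory must be readable and writable by the Web server's user: every get and set is an ordinary file operation.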

Local-Memory Caching

If you want the speed advantages of in-memory caching but don't have the capability of running Memcached, consider the local-memory cache back-end. This cache is per-process and thread-safe, but it isn't as efficient as Memcached due to its simplistic locking and memory allocation strategies.

To use it, set CACHE_BACKEND to 'locmem:///', for example:

CACHE_BACKEND = 'locmem:///'

Simple Caching (for Development)

A simple, single-process memory cache is available as 'simple:///', for example:

CACHE_BACKEND = 'simple:///'

This cache merely saves cached data in process, which means it should be used only in development or testing environments.

Dummy Caching (for Development)

Finally, Django comes with a dummy cache that doesn't actually cache; it just implements the cache interface without doing anything.

This is useful if you have a production site that uses heavy-duty caching in various places and a development/test environment on which you don't want to cache. In that case, set CACHE_BACKEND to 'dummy:///' in the settings file for your development environment, for example:

CACHE_BACKEND = 'dummy:///'

As a result, your development environment won't use caching, but your production environment still will.

CACHE_BACKEND Arguments

Each cache back-end may take arguments. They're given in query-string style on the CACHE_BACKEND setting. Valid arguments are as follows:

  • timeout: The default timeout, in seconds, to use for the cache. This argument defaults to 300 seconds (5 minutes).

  • max_entries: For the simple, local-memory, and database back-ends, the maximum number of entries allowed in the cache before old values are deleted. This argument defaults to 300.

  • cull_frequency: The ratio of entries that are culled when max_entries is reached. The actual ratio is 1/cull_frequency, so set cull_frequency=2 to cull half of the entries when max_entries is reached. A value of 0 for cull_frequency means that the entire cache will be dumped when max_entries is reached. This makes culling much faster at the expense of more cache misses. This argument defaults to 3.
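
The interplay between max_entries and cull_frequency is easier to see in code. The sketch below is only an illustration under stated assumptions: entries live in a plain dictionary, and the entries to drop are simply every Nth key (which entries a real back-end removes is up to that back-end). The function names cull and set_with_limit are hypothetical.

```python
def cull(entries, cull_frequency):
    """Remove roughly 1/cull_frequency of the entries, in place."""
    if cull_frequency == 0:
        # A value of 0 dumps the entire cache.
        entries.clear()
        return
    # Every Nth key is dropped, i.e. about 1/N of all entries.
    doomed = list(entries)[::cull_frequency]
    for key in doomed:
        del entries[key]


def set_with_limit(entries, key, value, max_entries=300, cull_frequency=3):
    """Store a value, culling first if the cache is already full."""
    if len(entries) >= max_entries:
        cull(entries, cull_frequency)
    entries[key] = value
```

With cull_frequency=2, a full cache of six entries shrinks to three before the new value is stored; with cull_frequency=0, it shrinks to none.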

In this example, timeout is set to 60:

CACHE_BACKEND = "locmem:///?timeout=60"

In this example, timeout is 30 and max_entries is 400:

CACHE_BACKEND = "locmem:///?timeout=30&max_entries=400"

Invalid arguments are silently ignored, as are invalid values of known arguments.
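
Since the arguments use ordinary query-string syntax, the standard library can take a CACHE_BACKEND value apart. The following is a sketch rather than Django's own parsing code; parse_backend_uri and int_option are hypothetical helpers, and falling back to the default on a bad value is this sketch's reading of "silently ignored".

```python
from urllib.parse import parse_qsl


def parse_backend_uri(uri):
    """Split a CACHE_BACKEND value into (scheme, location, params)."""
    scheme, rest = uri.split('://', 1)
    location, _, query = rest.partition('?')
    params = dict(parse_qsl(query))
    return scheme, location.rstrip('/'), params


def int_option(params, name, default):
    """Read an integer argument, ignoring missing or invalid values."""
    try:
        return int(params.get(name, default))
    except (TypeError, ValueError):
        return default


scheme, location, params = parse_backend_uri(
    "locmem:///?timeout=30&max_entries=400")
timeout = int_option(params, 'timeout', 300)
max_entries = int_option(params, 'max_entries', 300)
```

Note that parsed values arrive as strings, which is why the sketch converts them explicitly.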

The Per-Site Cache

Once you've specified CACHE_BACKEND, the simplest way to use caching is to cache your entire site. This means each page that doesn't have GET or POST parameters will be cached for a specified amount of time the first time it's requested.

To activate the per-site cache, just add 'django.middleware.cache.CacheMiddleware' to your MIDDLEWARE_CLASSES setting, as in this example:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.CacheMiddleware',
    'django.middleware.common.CommonMiddleware',
)

Note

The order of MIDDLEWARE_CLASSES matters. See the section "Order of MIDDLEWARE_CLASSES" later in this chapter.

Then, add the following required settings to your Django settings file:

  • CACHE_MIDDLEWARE_SECONDS: The number of seconds each page should be cached.

  • CACHE_MIDDLEWARE_KEY_PREFIX: If the cache is shared across multiple sites using the same Django installation, set this to the name of the site, or some other string that is unique to this Django instance, to prevent key collisions. Use an empty string if you don't care.

The cache middleware caches every page that doesn't have GET or POST parameters. That is, if a user requests a page and passes GET parameters in a query string, or passes POST parameters, the middleware will not attempt to retrieve a cached version of the page. If you intend to use the per-site cache, keep this in mind as you design your application; don't use URLs with query strings, for example, unless it is acceptable for your application not to cache those pages.

The cache middleware supports another setting, CACHE_MIDDLEWARE_ANONYMOUS_ONLY. If you've defined this setting, and it's set to True, then the cache middleware will only cache anonymous requests (i.e., those requests made by a non-logged-in user). This is a simple and effective way of disabling caching for any user-specific pages, such as Django's admin interface. Note that if you use CACHE_MIDDLEWARE_ANONYMOUS_ONLY, you should make sure you've activated AuthenticationMiddleware and that AuthenticationMiddleware appears before CacheMiddleware in your MIDDLEWARE_CLASSES.

Finally, note that CacheMiddleware automatically sets a few headers in each HttpResponse:

  • It sets the Last-Modified header to the current date/time when a fresh (uncached) version of the page is requested.

  • It sets the Expires header to the current date/time plus the defined CACHE_MIDDLEWARE_SECONDS.

  • It sets the Cache-Control header to give a maximum age for the page, again from the CACHE_MIDDLEWARE_SECONDS setting.
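
To see what those three headers look like on the wire, here is a small standard-library sketch. It is not the middleware's actual code; cache_headers is a hypothetical helper, and its cache_seconds argument plays the role of CACHE_MIDDLEWARE_SECONDS.

```python
import time
from email.utils import formatdate


def cache_headers(cache_seconds, now=None):
    """Build the three cache-related headers for a freshly generated page."""
    if now is None:
        now = time.time()
    return {
        # HTTP dates are always expressed in GMT.
        'Last-Modified': formatdate(now, usegmt=True),
        'Expires': formatdate(now + cache_seconds, usegmt=True),
        'Cache-Control': 'max-age=%d' % cache_seconds,
    }


headers = cache_headers(600, now=0)
```

Passing now=0 pins the clock to the Unix epoch, which makes the output predictable: Last-Modified is midnight on 1 January 1970 and Expires is ten minutes later.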

The Per-View Cache

A more granular way to use the caching framework is by caching the output of individual views. This has the same effects as the per-site cache (including the omission of caching on requests with GET and POST parameters). It applies to whichever views you specify, rather than the whole site.

Do this by using a decorator, which is a wrapper around your view function that alters its behavior to use caching. The per-view cache decorator is called cache_page and is located in the django.views.decorators.cache module, for example:

from django.views.decorators.cache import cache_page

def my_view(request, param):
    # ...
my_view = cache_page(my_view, 60 * 15)

Alternatively, if you're using Python 2.4 or greater, you can use decorator syntax. This example is equivalent to the preceding one:

from django.views.decorators.cache import cache_page

@cache_page(60 * 15)
def my_view(request, param):
    # ...

cache_page takes a single argument: the cache timeout, in seconds. In the preceding example, the result of the my_view() view will be cached for 15 minutes. (Note that we've written it as 60 * 15 for the purpose of readability. 60 * 15 evaluates to 900; that is, 15 minutes multiplied by 60 seconds per minute.)

The per-view cache, like the per-site cache, is keyed off of the URL. If multiple URLs point at the same view, each URL will be cached separately. Continuing the my_view example, if your URLconf looks like this:

urlpatterns = patterns('',
    (r'^foo/(\d{1,2})/$', my_view),
)

then requests to /foo/1/ and /foo/23/ will be cached separately, as you may expect. But once a particular URL (e.g., /foo/23/) has been requested, subsequent requests to that URL will use the cache.
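
The "one cache entry per URL" behavior can be imitated in a few lines. What follows is a toy decorator, not Django's cache_page: it keeps responses in a dictionary keyed on request.path, and FakeRequest, simple_cache_page, and the calls list are all invented for this sketch.

```python
import functools
import time


class FakeRequest(object):
    """Minimal stand-in for a request object; only `path` is needed here."""
    def __init__(self, path):
        self.path = path


def simple_cache_page(timeout):
    """Toy per-view cache keyed on the request path (illustrative only)."""
    def decorator(view):
        store = {}  # path -> (expiry timestamp, response)

        @functools.wraps(view)
        def wrapper(request, *args, **kwargs):
            entry = store.get(request.path)
            if entry is not None and entry[0] > time.time():
                return entry[1]  # still fresh: reuse the cached response
            response = view(request, *args, **kwargs)
            store[request.path] = (time.time() + timeout, response)
            return response
        return wrapper
    return decorator


calls = []  # records each time the view body really runs


@simple_cache_page(60 * 15)
def my_view(request, param):
    calls.append(param)
    return "page for %s" % param
```

Requesting /foo/1/ and then /foo/23/ twice runs the view body only twice: once per distinct URL.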

Specifying Per-View Cache in the URLconf

The examples in the previous section have hard-coded the fact that the view is cached, because cache_page alters the my_view function in place. This approach couples your view to the cache system, which is not ideal for several reasons. For instance, you might want to reuse the view functions on another, cacheless site, or you might want to distribute the views to people who might want to use them without being cached. The solution to these problems is to specify the per-view cache in the URLconf rather than next to the view functions themselves.

Doing so is easy: simply wrap the view function with cache_page when you refer to it in the URLconf. Here's the old URLconf from earlier:

urlpatterns = patterns('',
    (r'^foo/(\d{1,2})/$', my_view),
)

Here's the same thing, with my_view wrapped in cache_page:

from django.views.decorators.cache import cache_page

urlpatterns = patterns('',
    (r'^foo/(\d{1,2})/$', cache_page(my_view, 60 * 15)),
)

If you take this approach, don't forget to import cache_page within your URLconf.

The Low-Level Cache API

Sometimes, caching an entire rendered page doesn't gain you very much and is, in fact, inconvenient overkill.

Perhaps, for instance, your site includes a view whose results depend on several expensive queries, the results of which change at different intervals. In this case, it would not be ideal to use the full-page caching that the per-site or per-view cache strategies offer, because you wouldn't want to cache the entire result (since some of the data changes often), but you'd still want to cache the results that rarely change.

For cases like this, Django exposes a simple, low-level cache API, which lives in the module django.core.cache. You can use the low-level cache API to store objects in the cache with any level of granularity you like. You can cache any Python object that can be pickled safely: strings, dictionaries, lists of model objects, and so forth. (Most common Python objects can be pickled; refer to the Python documentation for more information about pickling.)

Here's how to import the API:

>>> from django.core.cache import cache

The basic interface is set(key, value, timeout_seconds) and get(key):

>>> cache.set('my_key', 'hello, world!', 30)
>>> cache.get('my_key')
'hello, world!'

The timeout_seconds argument is optional and defaults to the timeout argument in the CACHE_BACKEND setting explained earlier.

If the object doesn't exist in the cache, or the cache back-end is unreachable, cache.get() returns None:

# Wait 30 seconds for 'my_key' to expire...

>>> cache.get('my_key') is None
True

>>> cache.get('some_unset_key') is None
True

We advise against storing the literal value None in the cache, because you won't be able to distinguish between your stored None value and a cache miss signified by a return value of None.

cache.get() can take a default argument. This specifies which value to return if the object doesn't exist in the cache:

>>> cache.get('my_key', 'has expired')
'has expired'

To retrieve multiple cache values in a single shot, use cache.get_many(). If possible for the given cache back-end, get_many() will hit the cache only once, as opposed to hitting it once per cache key. get_many() returns a dictionary with all of the keys you asked for that exist in the cache and haven't expired:

>>> cache.set('a', 1)
>>> cache.set('b', 2)
>>> cache.set('c', 3)
>>> cache.get_many(['a', 'b', 'c'])
{'a': 1, 'b': 2, 'c': 3}

If a cache key doesn't exist or is expired, it won't be included in the dictionary. The following is a continuation of the example:

>>> cache.get_many(['a', 'b', 'c', 'd'])
{'a': 1, 'b': 2, 'c': 3}

Finally, you can delete keys explicitly with cache.delete(). This is an easy way of clearing the cache for a particular object:

>>> cache.delete('a')

cache.delete() has no return value, and it works the same way whether or not a value with the given cache key exists.
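
Putting get(), set(), and delete() together gives the usual "fetch from the cache, or compute and store" pattern. So that the sketch runs without a Django project, a tiny stand-in class (TinyCache, hypothetical) mimics the interface described in this section; with Django you would use the cache object imported from django.core.cache instead. The sentinel default is one way to tell a miss apart from a stored value, and expensive_top_stories is an invented placeholder for a slow query.

```python
import time


class TinyCache(object):
    """Hypothetical stand-in with the get/set/delete interface shown above."""
    def __init__(self, default_timeout=300):
        self._data = {}
        self._default_timeout = default_timeout

    def set(self, key, value, timeout_seconds=None):
        if timeout_seconds is None:
            timeout_seconds = self._default_timeout
        self._data[key] = (time.time() + timeout_seconds, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires, value = entry
        if expires <= time.time():
            del self._data[key]  # expired entries count as misses
            return default
        return value

    def delete(self, key):
        self._data.pop(key, None)  # no error if the key is absent


cache = TinyCache()
_MISSING = object()  # unique sentinel: cannot collide with a cached value
query_count = [0]


def expensive_top_stories():
    """Placeholder for a slow database query."""
    query_count[0] += 1
    return ['story-1', 'story-2']


def get_top_stories():
    stories = cache.get('top_stories', _MISSING)
    if stories is _MISSING:
        stories = expensive_top_stories()
        cache.set('top_stories', stories, 60 * 10)
    return stories
```

Calling cache.delete('top_stories') after the underlying data changes forces the next call to recompute the result.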

Upstream Caches

So far, this chapter has focused on caching your own data. But another type of caching is relevant to Web development, too: caching performed by upstream caches. These are systems that cache pages for users even before the request reaches your Web site.

Here are a few examples of upstream caches:

  • Your ISP may cache certain pages, so if you requested a page from http://example.com/, your ISP would send you the page without having to access example.com directly. The maintainers of example.com have no knowledge of this caching; the ISP sits between example.com and your Web browser, handling all of the caching transparently.

  • Your Django Web site may sit behind a proxy cache, such as Squid Web Proxy Cache (http://www.squid-cache.org/), that caches pages for performance. In this case, each request first would be handled by the proxy, and it would be passed to your application only if needed.

  • Your Web browser caches pages, too. If a Web page sends out the appropriate headers, your browser will use the local cached copy for subsequent requests to that page, without even contacting the Web server again to see whether it has changed.

Upstream caching is a nice efficiency boost, but there's a danger to it. The content of many Web pages differs based on authentication and a host of other variables, and cache systems that blindly save pages based purely on URLs could expose incorrect or sensitive data to subsequent visitors to those pages.

For example, say you operate a Web e-mail system, and the contents of the inbox page obviously depend on which user is logged in. If an ISP blindly cached your site, then the first user who logged in through that ISP would have his or her user-specific inbox page cached for subsequent visitors to the site. That's not cool.

Fortunately, HTTP provides a solution to this problem. A number of HTTP headers exist to instruct upstream caches to vary their cache contents depending on designated variables, and to tell caching mechanisms not to cache particular pages. We'll look at some of these headers in the sections that follow.

Using Vary Headers

The Vary header defines which request headers a cache mechanism should take into account when building its cache key. For example, if the contents of a Web page depend on a user's language preference, the page is said to vary on language.

By default, Django's cache system creates its cache keys using the requested path (e.g., "/stories/2005/jun/23/bank_robbed/"). This means every request to that URL will use the same cached version, regardless of user-agent differences such as cookies or language preferences. However, if this page produces different content based on some difference in request headers (such as a cookie, a language, or a user-agent), you'll need to use the Vary header to tell caching mechanisms that the page output depends on those things.

To do this in Django, use the convenient vary_on_headers view decorator, like so:

from django.views.decorators.vary import vary_on_headers

# Python 2.3 syntax.
def my_view(request):
    # ...
my_view = vary_on_headers('User-Agent')(my_view)

# Python 2.4+ decorator syntax.
@vary_on_headers('User-Agent')
def my_view(request):
    # ...

In this case, a caching mechanism (such as Django's own cache middleware) will cache a separate version of the page for each unique user-agent.

The advantage to using the vary_on_headers decorator rather than manually setting the Vary header (using something like response['Vary'] = 'user-agent') is that the decorator adds to the Vary header (which may already exist), rather than setting it from scratch and potentially overriding anything that was already in there.

You can pass multiple headers to vary_on_headers():

@vary_on_headers('User-Agent', 'Cookie')
def my_view(request):
    # ...

This tells upstream caches to vary on both, which means each combination of user-agent and cookie will get its own cache value. For example, a request with the user-agent Mozilla and the cookie value foo=bar will be considered different from a request with the user-agent Mozilla and the cookie value foo=ham.
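
One way to picture "each combination gets its own cache value" is to fold the varied header values into the cache key. The sketch below is an assumption about how a cache might do this, not a description of Django's or any proxy's real key format; make_cache_key is a hypothetical helper.

```python
import hashlib


def make_cache_key(path, request_headers, vary_on):
    """Build a key from the path plus the values of the varied headers."""
    # Header names are case-insensitive, so normalize them first.
    lowered = dict((name.lower(), value)
                   for name, value in request_headers.items())
    digest = hashlib.md5()
    for name in vary_on:
        value = lowered.get(name.lower(), '')
        digest.update(('%s=%s;' % (name.lower(), value)).encode('utf-8'))
    return 'page:%s:%s' % (path, digest.hexdigest())


key_bar = make_cache_key(
    '/inbox/', {'User-Agent': 'Mozilla', 'Cookie': 'foo=bar'},
    ['User-Agent', 'Cookie'])
key_ham = make_cache_key(
    '/inbox/', {'User-Agent': 'Mozilla', 'Cookie': 'foo=ham'},
    ['User-Agent', 'Cookie'])
```

The two keys differ only because the cookie values differ, so the two requests never share a cached page.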

Because varying on cookie is so common, there's a vary_on_cookie decorator. These two views are equivalent:

@vary_on_cookie
def my_view(request):
    # ...

@vary_on_headers('Cookie')
def my_view(request):
    # ...

The headers you pass to vary_on_headers are not case sensitive; "User-Agent" is the same thing as "user-agent".

You can also use a helper function, django.utils.cache.patch_vary_headers, directly. This function sets, or adds to, the Vary header, for example:

from django.utils.cache import patch_vary_headers

def my_view(request):
    # ...
    response = render_to_response('template_name', context)
    patch_vary_headers(response, ['Cookie'])
    return response

patch_vary_headers takes an HttpResponse instance as its first argument and a list or tuple of case-insensitive header names as its second argument.
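The "sets, or adds to" behaviour can be sketched in plain Python. The merge_vary function below is a hypothetical stand-in that works on a header string rather than an HttpResponse; it only illustrates the merging idea and should not be read as Django's actual code.

```python
def merge_vary(existing_value, new_headers):
    """Return a Vary value that keeps existing entries and appends new ones.

    Hypothetical helper for illustration only.
    """
    # Split the current value (e.g. "Accept-Encoding, Cookie") into names.
    current = [h.strip() for h in existing_value.split(",") if h.strip()]
    seen = {h.lower() for h in current}
    for header in new_headers:
        # Compare case-insensitively so "cookie" and "Cookie" count as one.
        if header.lower() not in seen:
            current.append(header)
            seen.add(header.lower())
    return ", ".join(current)


merged = merge_vary("Accept-Encoding", ["Cookie", "accept-encoding"])
from_empty = merge_vary("", ["Cookie"])
```

Here an existing Accept-Encoding entry survives, Cookie is appended, and the differently cased duplicate is ignored, which is the difference from assigning response['Vary'] directly.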

Other Cache Headers

Other problems with caching are the privacy of data and the question of where data should be stored in a cascade of caches.

A user usually faces two kinds of caches: his or her own browser cache (a private cache) and his or her provider's cache (a public cache). A public cache is used by multiple users and controlled by someone else. This poses problems with sensitive data: you don't want, say, your bank account number stored in a public cache. So Web applications need a way to tell caches which data is private and which is public.

The solution is to indicate that a page's cache should be private. To do this in Django, use the cache_control view decorator:

from django.views.decorators.cache import cache_control

@cache_control(private=True)
def my_view(request):
    # ...

This decorator takes care of sending out the appropriate HTTP header behind the scenes.
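As a rough illustration of why that header matters, the sketch below shows how a shared (public) cache might decide whether it is allowed to store a response. The shared_cache_may_store function is hypothetical and deliberately simplified; real caches apply many more rules from the HTTP specification.

```python
def shared_cache_may_store(response_headers):
    """Return False when Cache-Control marks the response as private.

    Hypothetical, simplified check for illustration only.
    """
    value = response_headers.get("Cache-Control", "")
    # Reduce "private, max-age=60" to the set {"private", "max-age"}.
    directives = {
        part.strip().split("=")[0].lower()
        for part in value.split(",")
        if part.strip()
    }
    return "private" not in directives


private_page = {"Cache-Control": "private"}
public_page = {"Cache-Control": "public, max-age=600"}
```

With these inputs, the private page is rejected by the shared cache while the public one may be stored; a browser's own cache would still be free to keep both.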

There are a few other ways to control cache parameters. For example, HTTP allows applications to do the following:

  • Define the maximum time a page should be cached.

  • Specify whether a cache should always check for newer versions, only delivering the cached content when there are no changes. (Some caches might deliver cached content even if the server page changed, simply because the cached copy hasn't yet expired.)

In Django, use the cache_control view decorator to specify these cache parameters. In this example, cache_control tells caches to revalidate the cache on every access and to store cached versions for, at most, 3,600 seconds:

from django.views.decorators.cache import cache_control

@cache_control(must_revalidate=True, max_age=3600)
def my_view(request):
    ...

Any valid Cache-Control HTTP directive is valid in cache_control(). Here's a full list:

  • public=True

  • private=True

  • no_cache=True

  • no_transform=True

  • must_revalidate=True

  • proxy_revalidate=True

  • max_age=num_seconds

  • s_maxage=num_seconds
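The keyword arguments above correspond to Cache-Control directives whose names use hyphens instead of underscores. The following sketch shows one way such keyword arguments could be turned into a header value. The build_cache_control function is a hypothetical illustration, not the code Django runs, and the order of directives in a real response may differ.

```python
def build_cache_control(**directives):
    """Turn keyword arguments into a Cache-Control header value.

    Hypothetical helper for illustration only.
    """
    parts = []
    for name, value in directives.items():
        # Python identifiers use underscores; HTTP directives use hyphens.
        token = name.replace("_", "-")
        if value is True:
            parts.append(token)
        elif value is not False and value is not None:
            parts.append("%s=%s" % (token, value))
    return ", ".join(parts)


header = build_cache_control(must_revalidate=True, max_age=3600)
private_header = build_cache_control(private=True, s_maxage=0)
```

Boolean directives appear as bare tokens, while directives that carry a number are written as name=value pairs.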

Tip

For an explanation of Cache-Control HTTP directives, see the specification at http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.

Note

The caching middleware already sets the cache header's max-age with the value of the CACHE_MIDDLEWARE_SECONDS setting. If you use a custom max_age in a cache_control decorator, the decorator will take precedence, and the header values will be merged correctly.

Other Optimizations

Django comes with a few other pieces of middleware that can help optimize your application's performance:

  • django.middleware.http.ConditionalGetMiddleware adds support for modern browsers to conditionally GET responses based on the ETag and Last-Modified headers.

  • django.middleware.gzip.GZipMiddleware compresses responses for all modern browsers, saving bandwidth and transfer time.
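The idea behind a conditional GET can be sketched without Django: the server labels each response with an ETag, and when the browser sends that tag back in If-None-Match, the server answers 304 with an empty body instead of resending the page. The conditional_get function below is a hypothetical, simplified illustration (it ignores Last-Modified and multi-value If-None-Match headers) and is not the middleware's actual code.

```python
import hashlib


def conditional_get(request_headers, body):
    """Return (status, headers, body), answering 304 when the ETag matches.

    Hypothetical helper for illustration only.
    """
    # Derive a tag from the content, so it changes whenever the page does.
    etag = '"%s"' % hashlib.sha256(body).hexdigest()
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


page = b"<html>hello</html>"
status, headers, sent = conditional_get({}, page)
status_again, _, sent_again = conditional_get({"If-None-Match": headers["ETag"]}, page)
```

The first request receives the full page with status 200; the repeat request, which echoes the ETag, receives status 304 and no body, which is where the bandwidth saving comes from.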

Order of MIDDLEWARE_CLASSES

If you use CacheMiddleware, it's important to put it in the right place within the MIDDLEWARE_CLASSES setting, because the cache middleware needs to know the headers by which to vary the cache storage.

Put the CacheMiddleware after any middleware that might add something to the Vary header, including the following:

  • SessionMiddleware, which adds Cookie

  • GZipMiddleware, which adds Accept-Encoding

What's Next?

Django ships with a number of "contrib" packages, which are cool, optional features. We've already covered a few of them: the admin system (Chapter 6) and the session/user framework (Chapter 11).

The next chapter covers the rest of the contributed subframeworks. There are a lot of cool tools available; you won't want to miss any of them.

Copyright 2006 Adrian Holovaty and Jacob Kaplan-Moss.
This work is licensed under the GNU Free Document License.