The goal of this article
What is caching
What is Memcached
Installation (Linux, Windows)
Basic get/set operations
Where to go next
Conclusion

The goal of this article
This article provides a basic introduction to server-side
caching with memcached, an open-source caching system. If you are a developer looking for a caching
solution for your (web) applications, it gives you an overview
of how to install memcached and use it from several programming environments.
The aim is a quick overview for developers with little or no memcached experience,
not a comprehensive reference of all memcached features.

What is caching
Performance matters, and it matters more and more in an era of web sites
with thousands, millions or hundreds of millions of users.
Caching is one of the most important performance strategies, and it applies regardless of the type of application.
The basic idea is simple: run expensive tasks less often by keeping their results in storage that is faster to access.

You can find caching in many software products you use on a daily basis:
- Web browsers cache images, scripts, or whole pages retrieved from a server, storing them on the client’s disk
- Text processors read the content of a file from disk into memory for better performance while you work with the document
- Server-side applications store results read from a database in memory for faster access

What is Memcached
Memcached is a free, open-source in-memory caching system with a focus on high performance,
distributed deployment and scalability.
Memcached started as a project for LiveJournal with a very clear goal: speed up the website by caching frequently
used items loaded from the database in memory. Today, memcached is a core infrastructure component for some of the biggest players on the web, including Google/YouTube, Digg, Facebook, Wikipedia, Amazon and many others.

By design, memcached acts as a key-value store that runs separately from your application.
This independence from the consumer makes deployment easy and keeps
memcached generic from the consumer’s perspective.
Memcached runs as a separate service/daemon listening for incoming connections over TCP/IP.
In the simplest case, a client opens a TCP connection to memcached, sends a storage or retrieval command, gets the response and closes the connection. In practice, clients usually use socket pools: after performing an operation they return the connection to the pool instead of closing it, which avoids opening a separate connection for every memcached request.
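
To make the pooling idea more concrete, here is a minimal sketch of a connection pool in Python. It is not how any particular memcached client implements pooling; the class name ConnectionPool, the factory argument and the open_memcached_socket helper are all assumptions made for this illustration.

```python
import queue
import socket


class ConnectionPool:
    """Keeps at most `size` idle connections; new ones come from `factory`."""

    def __init__(self, factory, size=4):
        self._factory = factory
        self._idle = queue.LifoQueue(maxsize=size)

    def acquire(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            return self._factory()          # none idle: open a new one

    def release(self, conn):
        try:
            self._idle.put_nowait(conn)     # keep it for the next request
        except queue.Full:
            conn.close()                    # enough idle ones already: really close


def open_memcached_socket(host="127.0.0.1", port=11211):
    """Hypothetical factory: a plain TCP connection to a local memcached."""
    return socket.create_connection((host, port), timeout=2)
```

A caller would create the pool once, for example ConnectionPool(open_memcached_socket), then acquire a connection for each request and release it afterwards instead of closing it.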

Installation (Linux, Windows)
This chapter guides you through the basic installation procedure on Linux and Windows,
with special focus on common pitfalls of the Windows memcached port.

Linux
Linux is probably the most commonly used OS for memcached, and ready-to-use binary packages are widely available for various distributions.
Memcached depends on the libevent library, so you also need libevent to get memcached up and running.
On Debian and other .deb-based distros, installation is easy:

$ sudo apt-get install memcached

This installs memcached and its required dependencies (libevent).
On Debian, the config file is /etc/memcached.conf. It contains a commented list of command-line options.

To install from source, get the latest source from the memcached homepage and proceed
with the standard ./configure && make commands. Note that libevent (and libevent-dev on Debian) must be available.

Windows
Installation on Windows can be tricky because there are several builds of memcached compiled for Windows. They differ mainly in their age and in the number of potential issues. I recommend using the most recent build from jellycan, which is basically memcached 1.2.6 compiled for Windows.
Get it, unzip it, execute memcached.exe and you have a minimal memcached setup ready.
Windows is not an officially supported target platform; memcached is not continuously built and tested for it.
If you are interested in the reasons for this, this thread gives an interesting description
of the actual problems.

The tricky part about memcached on Windows is running it as a service with non-default settings.
The default configuration allows memcached to use only 64MB of RAM for items, so a very
common requirement is to run it as a service with more than 64MB of memory allowed.
The default memcached switch for installing as a service on Windows is ‘-d’, which ignores additional command-line options.
This problem can be solved by using ‘sc.exe’ as the service installer.

@echo off
set ramSize=512

sc create memcached binPath= "c:\memcached\memcached.exe -d runservice -m %ramSize%" start= auto DisplayName= "memcached"
sc start memcached

Save this as install.bat, change the path to memcached.exe and ramSize if needed, then run the bat file.

Test if memcached is running and accepting connections
By default, memcached listens on TCP port 11211.
After installation, you can test whether memcached is accepting incoming connections with plain telnet.

$ telnet localhost 11211
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
version
VERSION 1.2.8
quit
Connection closed by foreign host.
$

Note that on Windows Server 2008, telnet is not installed by default and must be added as an additional component.
However, I recommend using PuTTY on Windows instead.
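
If neither telnet nor PuTTY is at hand, the same check can be scripted. The following is a minimal sketch that sends the version command from the telnet session over a raw socket; the function names are my own, and it assumes the whole reply arrives in a single read, which is reasonable for such a short line but not guaranteed.

```python
import socket


def parse_version_reply(reply):
    """Extract the version string from a raw VERSION reply line."""
    text = reply.decode("ascii").strip()
    prefix = "VERSION "
    if not text.startswith(prefix):
        raise ValueError("unexpected reply: %r" % text)
    return text[len(prefix):]


def memcached_version(host="localhost", port=11211, timeout=2.0):
    """Send the `version` command over TCP and return the reported version."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"version\r\n")  # commands end with CRLF
        reply = sock.recv(1024)       # assumption: the short reply fits in one read
    return parse_version_reply(reply)
```

Calling memcached_version() against a running server should return the same version string that the telnet session shows; a connection error means nothing is listening on that host and port.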

Basic get/set operations
Before we jump into real code, there are a few things you should keep in mind from the start:

- The maximum size of an item in memcached is 1MB. This limitation is by design, mostly because of memory allocation optimizations. It can be partially worked around by compressing items on the client side before putting them into the cache, but not in all cases. Memcached is not the right place to cache large images or documents; it is optimized to store small chunks of data (strings, objects etc.). Stored objects can be anything, because memcached follows the ‘What you set is what you get’ rule, but keep the size limitation in mind.

- Memcached does not provide a security/authentication mechanism. Again, this is by design: memcached is built to be as fast as possible, and you should delegate security to a different layer. The simplest way is to firewall the memcached server(s) so that only the machines that connect to memcached directly (the web servers) are allowed to open a connection. In common scenarios, memcached servers are not part of the publicly visible network, and only the web servers know the IPs of the memcached servers in the internal network.

- Memcached is not a database; it is a key-value store. You can’t run something like ‘select * from items’ on memcached to dump all items in the cache. Normally, you can access values in the cache only by knowing their key(s).
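
The size limit from the first point can be handled on the client side roughly as follows. This is a sketch under stated assumptions: prepare_for_cache and MAX_ITEM_SIZE are names invented here, zlib is just one possible compressor, and the threshold is approximate because the real limit also has to accommodate the key and some per-item overhead.

```python
import zlib

MAX_ITEM_SIZE = 1024 * 1024  # the 1MB per-item limit mentioned above


def prepare_for_cache(data, limit=MAX_ITEM_SIZE):
    """Return (payload, was_compressed); raise if the item cannot fit under `limit`."""
    if len(data) < limit:
        return data, False              # small enough, store as-is
    packed = zlib.compress(data)
    if len(packed) < limit:
        return packed, True             # fits only after compression
    raise ValueError("item is too large for the cache even after compression")
```

The caller has to remember the was_compressed flag (for example in the key or in the item flags) so that the value can be decompressed with zlib.decompress after reading it back.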

The most common operations with memcached are getting items from the cache and setting them into the cache.
If you have code which queries a database and then sends the results out, you can basically
rewrite it to query memcached first. If the item is found in the cache, you send it out and avoid the DB query.
If the item is not in the cache (which can happen for several reasons described below), you get the results from the DB query first
and then set them into the cache. On the next request, only the cache is hit and you get the performance boost.

Speaking in code, if you have something like:

result = query("select a,b,c from x");
return result;

You can rewrite it into:

result = get_from_memcache("mykey1");
if (result == NULL)
{
 result = query("select a,b,c from x"); // get result from DB
 set_to_memcache("mykey1", result); // set result into memcache
}
return result;

It’s that simple. Note that there are several reasons why a client may return a NULL/null/undefined/None value for a get operation:
- the item is not yet in the cache under the specified key
- the item may have expired. Memcached supports expiration of items, which is very useful if you need to invalidate an item automatically after some period of time. You can also set the expiry parameter to ‘0’ to tell memcached not to expire the item.
- the item may have been deleted
- memcached has been restarted, or the ‘flush_all’ command was sent to memcached, clearing the whole cache

I’d recommend the following libraries for Python, Java and .NET.

Python
The python-memcached package, available through apt-get or easy_install.
You can also use this package with Django.

>>> import memcache
>>> mc = memcache.Client(['127.0.0.1:11211'], debug=0)
>>> mc.set("a", "X1")
True
>>> mc.get("a")
'X1'
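
The get/set pattern described earlier can be wrapped into a small helper. This is a sketch, not part of python-memcached: get_or_load and its loader argument are hypothetical names, and FakeClient is a dictionary-backed stand-in (which ignores expiry) so the example runs without a memcached server. The time argument mirrors the expiry parameter that python-memcached accepts on set.

```python
def get_or_load(mc, key, loader, ttl=60):
    """Return the cached value for `key`, loading and caching it on a miss.

    `mc` is any client with get(key) and set(key, value, time=seconds);
    `loader` is a hypothetical callable standing in for the database query.
    """
    value = mc.get(key)
    if value is None:                 # miss: never set, expired, deleted or flushed
        value = loader()
        mc.set(key, value, time=ttl)  # expire after `ttl` seconds (0 = no expiry)
    return value


class FakeClient:
    """Dict-backed stand-in for a memcached client (assumption for this sketch)."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value, time=0):
        self.store[key] = value       # expiry is ignored in this stand-in
        return True
```

With a real server, the mc object from the session above could be passed in place of FakeClient(). Note that a loader which legitimately returns None would be called on every request, because None is also what a miss looks like.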

Java
spymemcached is a library developed and maintained by a memcached core contributor.
It’s also a good choice if you use Hibernate for your projects, because the hibernate-memcached second-level cache provider is based on spymemcached. I would recommend using spymemcached even if you don’t use Hibernate.

.NET
The enyim.com Memcached Client seems to be the best option for a .NET client and is under active development.
However, if you would like to use NHibernate with memcached, the NHibernate memcached provider ships with a different library (far older and not under active development).
If you do not use NHibernate, I would recommend going with Enyim.

Where to go next
Beyond the basic get/set operations, there is a lot of other magic available in memcached.
A very detailed description of all commands and options can be found in the protocol documentation, for example statistics, increment/decrement commands, multi-get retrieval, UDP protocol usage and other built-in features.
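
To give a taste of what the text protocol looks like for two of these features, here is a rough sketch that builds a multi-get and an increment command and parses the reply to a get. The helper names are invented for this example, and real client libraries handle many more cases (partial reads, error replies, the binary protocol), so treat it as an illustration of the wire format rather than a usable client.

```python
def build_get(*keys):
    """Text-protocol retrieval command; several keys make it a multi-get."""
    return ("get " + " ".join(keys) + "\r\n").encode("ascii")


def build_incr(key, delta=1):
    """Text-protocol command to increment a numeric item by `delta`."""
    return ("incr %s %d\r\n" % (key, delta)).encode("ascii")


def parse_get_reply(raw):
    """Parse 'VALUE <key> <flags> <bytes>' blocks up to the END marker."""
    items = {}
    pos = 0
    while True:
        eol = raw.index(b"\r\n", pos)     # end of the current header line
        header = raw[pos:eol].split()
        if header == [b"END"]:
            return items
        if len(header) < 4 or header[0] != b"VALUE":
            raise ValueError("unexpected line: %r" % raw[pos:eol])
        key, size = header[1], int(header[3])
        start = eol + 2                   # data block starts after the CRLF
        items[key.decode("ascii")] = raw[start:start + size]
        pos = start + size + 2            # skip the CRLF after the data block
```

For example, build_get("a", "b") produces the bytes for "get a b" followed by CRLF, and the server answers with one VALUE block per key it found, followed by END.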

There is a large community around memcached and a lot of useful resources.
The following links are a good starting point:
Memcached official site
Memcached support mailing list
Memcached wiki, FAQ, HowTos
List of client APIs/libraries
Forks/reimplementations and other projects related to memcached

Conclusion
Memcached may be the silver bullet for many scalability problems of web applications that frequently read from a slower store (database, disk, etc.).
Although memcached can help you ‘make things faster’, keep in mind that optimization rule #1 is to optimize application logic where possible.
Sooner or later, nothing will save you from poorly designed DB queries or suboptimal algorithms.
If you pass this ‘optimize yourself’ routine and decide you need caching, memcached can be very handy because
of its stability, strong adoption, excellent community support and distributed nature.

Slovak translation of this article at Zdrojak.



COMMENTS / 5 COMMENTS

Humm… interesting,
This is some great advice on what caching is…
Anyway, thanks for the post

bespoke software added these pithy words on Jan 27 10 at 12:53 pm

provide one live example on memcached with hibernate

kumar added these pithy words on Mar 08 11 at 1:34 pm

Kumar,
I’m sure you can find some on hibernate-memcached site

jsk added these pithy words on Mar 08 11 at 1:36 pm

I want to implement memcached in my application. I am using a MySQL DB and Java Hibernate to connect to the underlying DB. I need one executable program.

kumar added these pithy words on Mar 08 11 at 1:38 pm

Should be easy if you browse and study site I mentioned :)

jsk added these pithy words on Mar 08 11 at 1:39 pm
