Async IO is a bit less well known than its tried-and-true cousins, multiprocessing and threading. This section will give you a fuller picture of what async IO is and how it fits into its surrounding landscape.
In this miniature example, the pool is range(3). In a fuller example presented later, it is a set of URLs that need to be requested, parsed, and processed concurrently, and main() encapsulates that entire routine for each URL.
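The miniature example itself is not reproduced in this text, so here is a minimal sketch of the pattern it describes, under stated assumptions: the pool is range(3), main() drives the whole pool, and handle() is a hypothetical stand-in for the per-item work (in the fuller example this would be requesting, parsing, and processing one URL).

```python
import asyncio


async def handle(item: int) -> int:
    # Hypothetical per-item routine. asyncio.sleep(0) only yields control
    # to the event loop; a real version would await network I/O here.
    await asyncio.sleep(0)
    return item * 2


async def main() -> list:
    # Schedule handle() for every item in the pool and wait for all of them.
    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(*(handle(i) for i in range(3)))


results = asyncio.run(main())
print(results)
```

Swapping range(3) for a collection of URLs changes the pool but not the shape of main().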
FULL AIO RUNTIMES
#3. Event loops are pluggable. That is, you could, if you really wanted, write your own event loop implementation and have it run tasks just the same. This is wonderfully demonstrated in the uvloop package, which is an implementation of the event loop in Cython.
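As a rough illustration of pluggability, the sketch below swaps in a custom loop class and runs a coroutine on it unchanged. CountingLoop is a hypothetical name, and subclassing asyncio.SelectorEventLoop is a lighter stand-in for writing a full implementation from scratch the way uvloop does; uvloop itself is not used here.

```python
import asyncio


class CountingLoop(asyncio.SelectorEventLoop):
    """Hypothetical event loop that counts callbacks scheduled via call_soon()."""

    def __init__(self):
        # Set the counter first so it exists before any scheduling happens.
        self.scheduled = 0
        super().__init__()

    def call_soon(self, callback, *args, context=None):
        self.scheduled += 1
        return super().call_soon(callback, *args, context=context)


async def work() -> str:
    await asyncio.sleep(0)  # yield to the loop once
    return "done"


loop = CountingLoop()
try:
    # The coroutine neither knows nor cares which loop implementation runs it.
    result = loop.run_until_complete(work())
finally:
    loop.close()

print(result, loop.scheduled)
```

The coroutine code is identical to what would run on the default loop; only the object that drives it has changed.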
Twisted is a high-level networking framework that is built around event-driven asynchronous I/O. It supports TCP, SSL, UDP and other network transports. Twisted supports a wide variety of network protocols (including IMAP, SSH, HTTP, DNS). It is designed in a way that makes it easy to use with other event-driven toolkits such as GTK+, Qt, Tk, wxPython and Win32. Implemented mostly in Python, Twisted makes it possible to create network applications without having to worry about low-level platform specific details. However, unlike many other network toolkits, Twisted still allows developers to access platform specific features if necessary. Twisted has been used to develop a wide variety of applications, including messaging clients, distributed hash tables, web applications and both open source projects and commercial applications.
1 Introduction
Networked infrastructure development currently exhibits a curious asymmetry. Very high-level infrastructure, such as web servers, and very low-level infrastructure, such as OS-level I/O, have both been actively developed. However, there is no popular middle layer between the two; each high-level abstraction implements all the infrastructure from the base OS level all the way up to its application needs in a specific way.
Development in the area of high performance multiplexing has continued on many platforms, yielding ever-increasing performance. These mechanisms are many and varied: /dev/poll, epoll, select, poll, kqueue[1], completion ports, and POSIX AIO, to name a few. On the other end of the spectrum, many frameworks provide high-level constructs for specific application domains. For example, web application servers provide infrastructure for developing HTTP-based applications. There are few frameworks that provide access to both the low-level and high-level infrastructure required in real networking applications.
Web application servers don't provide access to their low-level networking event loop for extension with new protocols, while low-level libraries require far too much work to be usable out of the box.
Twisted is a networking framework suitable for building a wide range of networked servers and clients. Twisted provides portability by using high-level abstractions of protocols, various transports (such as TCP and UDP) and an event loop, allowing deployment of the same code across multiple platforms, primarily Unix and Windows NT.
However, access to platform-specific functionality is a real requirement for many applications. For example, without the ability to access file descriptors directly, it isn't possible to write programs that integrate access to the serial port into the event loop.
Twisted provides low-level access to operating system and event loop specific functionality. On UNIX, for example, it is possible to register file descriptors with select and poll, even though this functionality is not available on Windows. At the same time, one can use the Win32 API to access a serial port through different mechanisms when using Windows. Twisted makes it possible for a user to abstract such a feature themselves if the framework does not provide it already.
Twisted also provides many high-level facilities commonly used by networking applications. A mail server shares a large number of requirements with a web server, such as I/O, protocol parsing, logging and daemonization. In addition, many standardized protocols are shared between applications; web mail, for example, requires both HTTP and SMTP. Including basic implementations of such protocols saves developers from having to write them from scratch.
By providing the full spectrum of functionality for networking applications, both low-level and high-level, Twisted makes developing networking applications a far simpler task, allowing the developer to concentrate on developing their application rather than reinventing the wheel.
Twisted also provides a middle layer so that high-level networking applications may take advantage of low-level advances in functionality and scalability.
This approach contrasts with the two main approaches taken by most networking frameworks. One approach is to use a low-level framework and language (C or C++). The low-level approach gives access to all of the capabilities and APIs provided by the operating system, and if done correctly can result in a very fast program. On the other hand, pervasively using platform-specific functionality results in a platform-specific program, so portability is hindered. Unless carefully audited, C and C++ code is more prone to buffer overflows and system-crashing bugs than its high-level counterparts. This results either in a much longer development process as testing locates all the problems, or in a more fragile system.
Another approach is to provide functionality which only provides access to the lowest common denominator between all supported platforms. This approach is taken by most high-level frameworks. While the "lowest common denominator" approach makes the framework very portable, it means one can't do many basic tasks that only work on specific platforms. For example, the Java platform, which takes this approach, does not support running a server as a daemon on UNIX or an NT Service on Windows, since neither of these features is available on other platforms. This decision pleases no one by trying to please everyone.
High performance is not a major goal of the Twisted framework. While efficiency and scalability are taken into account during development, flexibility and clean design are more important. This focus stems from the belief that in the real world, as long as performance meets the user's requirements, other factors are more important when choosing a platform.
For example, there are a large number of open source and commercial web servers and academic papers describing architectures all of which are significantly faster than the Apache web server. Nevertheless, as of February 2003 Apache runs more than 60% of the sites on the web[3].
To achieve scalable performance that will meet most users' requirements, Twisted uses multiplexing for all I/O operations, and a single thread for almost all computation. As the progression of the Java language has shown[2], blocking, threaded I/O libraries simply do not scale to meet more than the most basic demand. In addition, performance costs associated with context-switching and synchronization, which are exacerbated by Python's global interpreter lock, are eliminated. As others have shown[11], threading is useful only in certain circumstances and should be regarded as a low-level tool. As with all such tools, Twisted has a high-level wrapper that provides a portable and convenient way to integrate threaded code with the event loop when necessary.
Twisted is implemented mostly in Python, with a few parts also available as C extension modules for performance reasons. Python is a very high-level programming language with a great deal of run-time flexibility that allows rapid development of dynamic systems. Despite its high-level nature, it offers access to many system calls necessary for networking, such as select(), socket() and poll(). It provides an excellent base for porting Twisted's functionality to new operating systems.
Thanks to the facilities provided by Python, Twisted is rather small: at the time of this writing Twisted has only 89,000 lines of code in Python, including all of its protocol implementations. The C portion, which only duplicates certain performance-critical Python functionality, is even smaller, only 4,000 lines of code in C.
Twisted is currently at version 1.0.3, and is being used by a variety of commercial and open source applications.
It is a mature project with a large and active community.
2 The Python Programming Language
Python is a high-level object-oriented programming language, with runtimes written in C and Java (and an experimental .NET CLR runtime). Python is highly portable, and runs on a large number of operating systems, including UNIX and UNIX-like systems, Windows and others. However, unlike Java, Python does not go down the path of the lowest common denominator. Instead, Python supports platform specific features in addition to common functionality. On Windows, Python can be used with COM and Win32 APIs. On UNIX, Python has access to a large range of the POSIX functionality, from fork() to signals. At the same time, where necessary Python provides common wrappers for low-level functionality: threads in Python use a common API that provides the same functionality on different platforms.
Because of its high-level orientation, Python alleviates the need to deal with memory allocation, array bounds checking and pointers, speeding development and preventing common security issues. Moreover, Python scales upwards when designing complex systems, allowing well designed libraries to provide powerful functionality with simple interfaces.
Twisted builds on Python's flexibility, power and clean design. Python wraps GTK+, Qt, Tk, wxPython and Win32 GUI toolkits, and thus allows Twisted to integrate with these toolkits' event loops. Python's support for sockets, select() and other low-level APIs is wrapped to create Twisted's networking core. Additional Python libraries which wrap C code are also used for various functionality (PyOpenSSL for SSL and TLS, PyCrypto for cryptographic algorithms and so on). Additional low-level functionality is easy to integrate due to the simplicity of extending Python with C or C++.
3 The Event Loop
The event loop is the core of Twisted.
It implements a pluggable interface to OS-level functionality such as networking, timers and certain commonly available utilities such as SSL encryption. There are two ways of implementing networking event loops. In one approach, event handlers are called in response to readable or writable events on sockets and then an attempt is made to read or write as much as possible. This method is commonly used with non-blocking sockets. The other approach used is fully asynchronous I/O, where event notification happens when a read or write is finished (e.g. POSIX AIO or Windows' I/O Completion Ports). Currently Twisted implements several non-blocking event loops. The event loop APIs are designed to accommodate fully asynchronous I/O as well, but as yet, no implementations have been released.
An object that implements an event loop in Twisted is called a "reactor". Twisted provides reactors that run on top of select(), poll(), kqueue and Win32 Events. Additionally, it provides reactors that use these same low-level mechanisms, but access them through the APIs of the graphical toolkits, such as GTK+ and Qt. This enables Twisted to run within a graphical application written with these toolkits with no performance impact.
For toolkits which do not provide networking APIs, such as Tk and wxWindows, Twisted provides support modules which will run any reactor at brief intervals as the GUI event loop is running.
On Jython (an implementation of Python written in Java) Twisted provides a Java API based reactor that emulates an event loop using threads.
The reactor implementation may be chosen at runtime, depending on which ones are available and what functionality is required. A Twisted reactor object implements functionality for working with at least some of the following systems:
TCP
SSL/TLS
UDP
multicast
Unix sockets
generic file descriptors
Win32 events
process running
scheduling
threading.
For example, all of these interfaces are supported on Unix when using the default select() based reactor, except Win32 event support. However, when running on Windows, the same reactor will not support generic file descriptors and Unix sockets.
This support for reactor-specific functionality does not mean that applications written with Twisted are not portable. The programmer can choose to use platform specific functionality (e.g. use the file descriptor support on Unix to write a curses-based console interface), or to use only those interfaces that are cross-platform and supported in all reactors (e.g. use TCP support to write a telnet-based console interface). The choice is made by the programmer, not the framework.
4 Networking
Twisted has a small networking core, twisted.internet (a Python package), which aims to provide a highly portable socket multiplexing API. This API is based around four fundamental principles.
All methods should be named in platform neutral, self consistent and semantically clear ways.
The API should be as small and abstract as possible.
Low-level functionality should not be disabled or obscured in any way.
The same events should be provided from multiple different sources, to allow the same objects to communicate with any semantically identical objects.
Each of these four features has an important impact on the resulting framework's ease of use and implementation. Each will be explored in detail.
4.1 Method Naming
While this may sound like a small detail, clear method naming is important to provide an API that developers familiar with event-based programming can pick up quickly. Obscure names like "kqueue" and "/dev/poll" litter the multiplexing landscape, which can make the already-intimidating concept of event-based programming seem even more arcane.
Since the idea of a method call maps very neatly onto that of a received event, all event handlers are simply methods named after past-tense verbs. All class names are descriptive nouns, designed to mirror the is-a relationship of the abstractions they implement. All requests for notification or transmission are present-tense imperative verbs (see Fig. 1).
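The naming convention, together with the readiness-based style of event loop described in section 3, can be sketched with a toy reactor built only on the standard library. Everything here is an illustrative assumption: ToyReactor, Transport and UppercaseProtocol are hypothetical classes, and although the method names follow the convention (past-tense handlers such as dataReceived, imperative requests such as write), the signatures are simplified and are not Twisted's actual API.

```python
import selectors
import socket


class Transport:
    """Descriptive noun: wraps a socket; requests are imperative verbs."""

    def __init__(self, sock):
        self.sock = sock

    def write(self, data):
        self.sock.sendall(data)


class UppercaseProtocol:
    """Event handlers are named after past-tense verbs."""

    def __init__(self):
        self.received = []
        self.closed = False

    def dataReceived(self, data, transport):
        self.received.append(data)
        transport.write(data.upper())

    def connectionLost(self):
        self.closed = True


class ToyReactor:
    """Readiness-based loop: wait until a socket is readable, then read."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()

    def addReader(self, sock, protocol):
        sock.setblocking(False)
        self.selector.register(sock, selectors.EVENT_READ, protocol)

    def runOnce(self, timeout=1.0):
        for key, _events in self.selector.select(timeout):
            sock, protocol = key.fileobj, key.data
            data = sock.recv(4096)
            if data:
                protocol.dataReceived(data, Transport(sock))
            else:
                # An empty read means the peer closed the connection.
                self.selector.unregister(sock)
                sock.close()
                protocol.connectionLost()


# Usage: a connected socket pair stands in for a real network connection.
server_side, client_side = socket.socketpair()
reactor = ToyReactor()
protocol = UppercaseProtocol()
reactor.addReader(server_side, protocol)

client_side.sendall(b"ping")
reactor.runOnce()                 # delivers dataReceived
reply = client_side.recv(4096)
client_side.close()
reactor.runOnce()                 # delivers connectionLost
reactor.selector.close()
print(reply, protocol.closed)
```

Reading the protocol class top to bottom shows the point of the convention: handlers describe what has happened, while calls on the transport say what should happen next.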