Why I Dislike Threads
Most people who have talked to me about programming know that I am not a big fan of threads and that I advocate event-driven programming. The reason is not so much that I like event-driven architecture, but more that I dislike threads.
Threads break one of the fundamental principles in programming: isolation. Having two or more execution threads in a process means that they are no longer isolated memory-wise. A thread can mess with other threads' data, which in turn can create inconsistent state, or simply lead to crashes.
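For instance, the classic lost-update race is easy to sketch in plain Python (whether the final count actually comes up short depends on the interpreter and timing, but the data race is there either way):

```python
import threading

counter = 0  # shared, unprotected state

def bump(times):
    global counter
    for _ in range(times):
        counter += 1  # read-modify-write: not atomic, two threads can interleave here

threads = [threading.Thread(target=bump, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # may print less than 200000 -- updates were silently lost
```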
Additionally, threads cannot be controlled. There is no way to stop or kill a runaway thread without dragging the entire process down with it, and it does not even make sense to kill a thread, because doing so can leave the process in an inconsistent state. This also makes it impossible to use the fork() call, as there is no way to fork a process deterministically while another execution thread is running. Processes, on the other hand, are well isolated and can be easily controlled.
Another issue with threads is that they are hard to keep simple. Often a program will start with a few locks here and there, but soon they are split into more fine-grained locks, and a locking order is imposed. This makes it difficult to change or extend the program, as not only do new APIs have to be learned, but the locking order as well.
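A small sketch of why the locking order becomes part of the API (plain Python, illustrative names; because the two threads take the same two locks in opposite order, the program can deadlock):

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def a_then_b():
    for _ in range(100_000):
        with lock_a:          # this thread takes A, then B
            with lock_b:
                pass

def b_then_a():
    for _ in range(100_000):
        with lock_b:          # this thread takes B, then A -- opposite order
            with lock_a:
                pass

t1 = threading.Thread(target=a_then_b, daemon=True)
t2 = threading.Thread(target=b_then_a, daemon=True)
t1.start(); t2.start()
t1.join(timeout=5); t2.join(timeout=5)

# If both threads are still alive here, each holds one lock and waits forever
# for the other. The cure is a program-wide locking order (always A before B),
# which every new piece of code must also learn and respect.
print("deadlocked" if t1.is_alive() or t2.is_alive() else "finished")
```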
Sometimes threads are necessary, though. Kernels are a prime example (and are what threads were originally created for). When performing parallel computations where a big data set needs to be read and updated continuously, threads are more or less impossible to avoid. This, however, is relatively rare. The most common use of threads is to handle blocking IO calls. Instead of using blocking IO it is possible to use non-blocking IO, but unfortunately a program must be heavily modified to use this programming model, because the IO calls are interleaved. An example of such an interleaving IO framework is Twisted. While Twisted is great, it suffers from having to use deferreds, which can make the program flow complicated to follow. This is also partly a problem with Python, as the language does not support anonymous code blocks (only expressions are supported).
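A minimal sketch of the callback style that deferreds impose (the fetch_page and handle_page names are made up for the example; a real non-blocking request would come from something like twisted.web.client):

```python
from twisted.internet import defer, reactor

def fetch_page(url):
    # Stand-in for a non-blocking network call; here the Deferred simply
    # fires a little later with a fake body.
    d = defer.Deferred()
    reactor.callLater(0.1, d.callback, "<html>%s</html>" % url)
    return d

def handle_page(body):
    # The "rest of the program" now lives here, not after the call site.
    print("got %d bytes" % len(body))
    reactor.stop()

def handle_error(failure):
    print("request failed: %s" % failure)
    reactor.stop()

d = fetch_page("http://example.com/")
d.addCallbacks(handle_page, handle_error)   # flow continues in callbacks
reactor.run()
```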
An alternative to the interleaving model is to use shared-nothing micro threads. In this model threads are created, but they do not share memory and must communicate through message passing. This may sound similar to a multi-process scheme; however, micro threads exist only in user space, i.e., they are application-level only. This means that context switching and message passing can be made very fast. Two examples of such languages are Stackless Python and Erlang. In Erlang, creating a new thread is basically a function call, and uses 300 bytes of memory.
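A rough sketch of the shared-nothing style, written against Stackless Python's tasklet/channel API from memory (treat the exact calls as approximate; Erlang's spawn and message sends have the same shape):

```python
import stackless

def worker(channel):
    # The tasklet owns all of its state; the channel is the only way in.
    total = 0
    while True:
        item = channel.receive()   # suspends just this tasklet, cheaply
        if item is None:
            print("worker summed", total)
            return
        total += item

def producer(channel):
    for i in range(5):
        channel.send(i)            # hands the value over; nothing is shared
    channel.send(None)

ch = stackless.channel()
stackless.tasklet(worker)(ch)      # creating a micro thread is just a call
stackless.tasklet(producer)(ch)
stackless.run()                    # cheap user-space scheduling between them
```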
Both Stackless and Erlang have very good mechanisms for sending messages; serialization and deserialization are completely transparent. This makes it much easier to split a program into several execution units than it is with processes, where a message-passing scheme has to be constructed by hand. Using micro threads also retains the "normal" program flow, which makes programs more intuitive to understand. Finally, micro threads make it possible to parallelize the program, as the micro threads can be executed independently. This is in big contrast to interleaved IO, which can be difficult to parallelize.
I guess it is not really threads I dislike, but shared memory. Joe Armstrong, one of the people who created Erlang, wrote a nice entry about this some time ago: Why I don't like shared memory.
An event-driven or micro-thread-based architecture is by no means a silver bullet, though. In a recent conversation with Gerd we talked about solving a problem which is in itself concurrent. In that case, going event-driven did not solve anything. In such cases it is necessary to either bite the bullet or rethink the solution.
Labels: io, microthreads, threads, twisted