Threading and concurrency overview

The Java virtual machine maintains a dedicated stack and program counter (PC) register for each thread. These, along with the Central Processing Unit (CPU) register values, are quickly stored and reloaded on every context switch (moving from one thread to another), and this is a primary cost of having multiple threads.

A second cost of multiple threads is synchronization, but this is largely a historical concern. If you design your application to use a few simple synchronized locks, with event-driven wait() and notifyAll() calls, the direct cost of synchronization on a modern pipelined CPU is zero or very close to zero. Excessive synchronization in a poor architecture may still slow your application indirectly: if threads are frequently blocked, context switches become more frequent.

Each Java Thread maintains a call stack, where method arguments and local variables are allocated and de-allocated by moving the stack pointer. The stack is allocated from the same 2 MB of main application memory, and allocating a new stack is part of the cost of creating an additional thread. Recursive algorithms, however elegant in theory, may be relatively slow and expensive in terms of precious thread stack space on Series 40, so they should generally be avoided.
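As an illustration of the point about recursion, the following is a minimal sketch of the same computation written recursively and iteratively. The Node and TreeSum classes are hypothetical and exist only for this example. The iterative version keeps its pending work in a heap-allocated Vector, so the depth of the call stack does not grow with the depth of the tree. Generics are omitted on the assumption that the code targets a CLDC-era compiler.

```java
import java.util.Vector;

// Hypothetical binary tree node, used only for this illustration.
class Node {
    int value;
    Node left;
    Node right;

    Node(int value, Node left, Node right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class TreeSum {
    // Recursive version: every level of the tree adds a frame to the thread's call stack.
    static int sumRecursive(Node n) {
        if (n == null) {
            return 0;
        }
        return n.value + sumRecursive(n.left) + sumRecursive(n.right);
    }

    // Iterative version: pending nodes live in a Vector on the heap,
    // so the call stack depth stays constant regardless of tree depth.
    static int sumIterative(Node root) {
        int total = 0;
        Vector pending = new Vector();
        if (root != null) {
            pending.addElement(root);
        }
        while (!pending.isEmpty()) {
            int last = pending.size() - 1;
            Node n = (Node) pending.elementAt(last);
            pending.removeElementAt(last);
            total += n.value;
            if (n.left != null) {
                pending.addElement(n.left);
            }
            if (n.right != null) {
                pending.addElement(n.right);
            }
        }
        return total;
    }
}
```

Both methods return the same result. The iterative form trades stack frames for heap objects, which is usually the better trade when thread stack space is the scarcer resource.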

The ARM processor will periodically interrupt the current Java thread, store register values and switch to another thread. This context switch is quite efficient, but your code may force it to occur more often than necessary, decreasing overall throughput. Things to watch out for include:

  • If you are using more than 5 or 6 threads, you probably have an architectural flaw and should re-design your MIDlet to be more appropriate for a mobile device with limited resources. A Series 40 phone supports many more threads, but designs using so many threads, with the associated context switching overhead, are suboptimal for a mobile application.

  • Avoid using Thread.sleep() statements to slow one activity down. There are places where this is appropriate, but each sleep also forces a context switch to another thread, and this can quickly add up. Synchronization with wait() and notifyAll() is often a more elegant solution for event-driven architectures and results in noticeably faster MIDlets.
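The wait() and notifyAll() approach can be sketched as follows. This is a minimal illustration, not a prescribed design: the Mailbox class and its method names are hypothetical. The consumer blocks inside take() until a producer calls post(), so it neither polls a flag nor sleeps for a fixed interval.

```java
// Hypothetical one-slot mailbox shared between a producer and a consumer thread.
class Mailbox {
    private Object message; // null means "empty"

    // Called by the producer, for example when a background task completes.
    synchronized void post(Object newMessage) {
        message = newMessage;
        notifyAll(); // wake any thread blocked in take()
    }

    // Called by the consumer. Blocks, without polling, until a message arrives.
    synchronized Object take() throws InterruptedException {
        while (message == null) { // re-check the condition after every wake-up
            wait();               // releases the lock while waiting
        }
        Object result = message;
        message = null;
        return result;
    }
}
```

Note that this sketch holds only one message: a second post() before take() overwrites the first. A queue would be needed if events must not be lost.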

The theoretical ideal number of threads for a Nokia Series 40 application is based on the CPU and associated hardware which can operate in parallel to complete input-output tasks. The thread count you should use as a guideline is thus:

  1. Screen operations, meaning painters and incoming user events such as keyboard and touch screen events, must be on the phone’s Event Dispatch Thread (EDT) to avoid side-effects. In games with a GameCanvas, it is possible to also provide a separate render thread which takes care of painting, but Canvas and Form user interfaces do all painting on the EDT.

  2. Purely CPU-intensive operations, such as math and application logic, complete most quickly on a single thread. Multiple CPU-intensive threads add some context switching overhead, but this rarely causes a measurable performance penalty, so you should do what makes sense for your algorithm.

  3. Flash memory operations, meaning loading from and storing to the Record Management System (RMS) and the memory card. Only one such operation can be active at any given time, so additional threads do not increase and may slightly decrease performance. Note that while flash read operations are about ten times faster than flash write operations, a slow write triggered by one thread may significantly slow down a read on another thread. You therefore cannot assume that a flash read will always complete quickly. Performed on the EDT, for example, such a pause would be noticeable.

  4. Network operations, sending and receiving data from web sites. Here there is no strict rule, but around four network operations in parallel will balance the excessive resource allocation associated with the network against the long delays of any one network connection waiting for data.

  5. Other IO operations, such as Bluetooth and serial port connections, each deserve their own thread. It is best to have one thread sending and one thread receiving because, unlike HTTP, which alternates between send and receive, these are bidirectional communication channels.

  6. Media players should each have their own thread to smooth over buffering and resource allocation and deallocation delays. Do not reuse media threads: this can have side-effects, as there are many circumstances under which a media thread may be delayed for many seconds.
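One way to apply the flash memory and network guidelines above is a small task queue served by a fixed number of background threads. The following is a sketch under stated assumptions: TaskQueue, submit() and shutdown() are hypothetical names, not part of any Series 40 API. A queue built with one thread would serialize flash operations, and one built with around four threads would cap parallel network requests.

```java
import java.util.Vector;

// Hypothetical helper: runs queued tasks on a fixed number of background threads.
class TaskQueue implements Runnable {
    private final Vector tasks = new Vector();
    private boolean running = true; // guarded by the tasks lock

    TaskQueue(int threadCount) {
        for (int i = 0; i < threadCount; i++) {
            new Thread(this).start();
        }
    }

    // Only enqueues and returns, so it does not wait for the task itself to run.
    void submit(Runnable task) {
        synchronized (tasks) {
            tasks.addElement(task);
            tasks.notifyAll();
        }
    }

    // Lets the worker threads exit once the queue has drained.
    void shutdown() {
        synchronized (tasks) {
            running = false;
            tasks.notifyAll();
        }
    }

    // Worker loop: wait for a task, remove it under the lock, run it outside the lock.
    public void run() {
        while (true) {
            Runnable next;
            synchronized (tasks) {
                while (tasks.isEmpty()) {
                    if (!running) {
                        return;
                    }
                    try {
                        tasks.wait();
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                next = (Runnable) tasks.elementAt(0);
                tasks.removeElementAt(0);
            }
            next.run(); // a slow task never blocks submit()
        }
    }
}
```

With this sketch, `new TaskQueue(1)` could serve flash operations and `new TaskQueue(4)` network requests, while the EDT only calls submit(). A task that throws an unchecked exception would end its worker thread, so production code would likely wrap next.run() in a try/catch.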