

The Visual C++ project file, source code, and executable for this application can be found on my Web site at www.nt-guru.com/books/coriolis, in case you want to try it out on your system. On a uniprocessor system, this little code snippet will use most of your available processor cycles, but on a multiprocessor system, the other processors will remain available to process other application requests. If you have a multiprocessor system, run the previous application and add the Processor object's % Processor Time counter for each processor to your chart. One of these processors will show 80 to 98 percent utilization while the other processors continue to function normally. To see how a multithreaded processor-intensive application affects performance on a multiprocessor system, take a look at BadExp3.EXE, whose source code is shown in Listing 6.2.

Listing 6.2 The BadExp3 example application source code.

//------------------------------------------------------------------------------------
// THIS CODE AND INFORMATION IS PROVIDED "AS IS" WITHOUT WARRANTY OF
// ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO
// THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
// PARTICULAR PURPOSE.
//
//
//   BadExp3.C      Sample Multithreaded Application to demonstrate
//                    multithreaded processor-intensive application.
//
//
// Copyright (c) 1993-1997  Knowles Consulting. All rights Reserved.
//
//------------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

HANDLE hEvent1;
HANDLE hEvent2;
HANDLE hEvent3;

int i1=1;
int i2=2;
int i3=3;

// Thread entry point; the signature must match LPTHREAD_START_ROUTINE.
DWORD WINAPI Thread(LPVOID);

int main (void)
   {
   DWORD Tid;
   HANDLE aHandles[3];

   // Create multiple auto-reset event objects, initially nonsignaled.
   hEvent1=CreateEvent(NULL, FALSE, FALSE, TEXT("Event1"));
   hEvent2=CreateEvent(NULL, FALSE, FALSE, TEXT("Event2"));
   hEvent3=CreateEvent(NULL, FALSE, FALSE, TEXT("Event3"));
   if (hEvent1==NULL || hEvent2==NULL || hEvent3==NULL)
      {
      printf("CreateEvent failed, error %lu\n", GetLastError());
      return (1);
      }

   // Assign event object handles to an array of handles.
   aHandles[0]=hEvent1;
   aHandles[1]=hEvent2;
   aHandles[2]=hEvent3;

   // Create multiple threads.
   (void) CreateThread(NULL,0,Thread,&i1,0,&Tid);
   (void) CreateThread(NULL,0,Thread,&i2,0,&Tid);
   (void) CreateThread(NULL,0,Thread,&i3,0,&Tid);

   // Wait until all three threads have signaled their events.
   WaitForMultipleObjects(3,aHandles,TRUE,INFINITE);

   return (0);
   }

// Thread processing procedure.
DWORD WINAPI Thread(LPVOID lpParam)
   {
   int *i=(int *)lpParam;
   // volatile keeps an optimizing compiler from discarding the idle loop.
   volatile unsigned long uMaxNumber=0;
   unsigned long x=0;

   // Idle loop to eat CPU time. The test uses < rather than <= because
   // a 32-bit unsigned long can never exceed 4294967295, so <= would
   // never become false.
   for(x=0;x<4294967295UL;x++)
      {
      uMaxNumber=x;
      }

   // Displays thread identification number, and
   // waits for user to press a key.
   printf(" Thread = %d, Counter = %lu\n", *i, uMaxNumber);
   (void) getchar();

   // Sets (signals) this thread's event object so that main can finish.
   switch (*i)
      {
      case 1:
         SetEvent(hEvent1);
         break;

      case 2:
         SetEvent(hEvent2);
         break;

      case 3:
         SetEvent(hEvent3);
         break;
      }

   return (0);
   }

This is a multithreaded version of the application. It spawns three threads, and the processor load is distributed equally among them, as Figure 6.6 demonstrates. My system is only a uniprocessor machine, so each thread receives between 30 and 33 percent of the processor. If you have three or more processors, three of them should show 80 to 90 percent utilization. Notice that the Thread object has been selected in this example. This is the object whose counters you will use to determine which thread of a process is the bottleneck, that is, which thread is using more of your processor than you would like.


Figure 6.6  A multithreaded example of a processor-intensive application displayed with Performance Monitor.
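
The per-thread figure charted for the Thread object is, in essence, the processor time a thread consumed divided by the time that elapsed. As a rough sketch of that arithmetic (not a description of how Performance Monitor is implemented), the helpers below read a thread's accumulated processor time with the Win32 GetThreadTimes call. The names cpu_share_percent, filetime_ticks, and thread_busy_ticks are hypothetical, and the Win32 portion is compiled only when _WIN32 is defined.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Percentage of a sampling interval that a thread spent executing.
   Both arguments must use the same unit; FILETIME values count
   100-nanosecond ticks. Returns 0 for an empty interval. */
double cpu_share_percent(unsigned long long busy, unsigned long long elapsed)
{
    if (elapsed == 0)
        return 0.0;
    return 100.0 * (double)busy / (double)elapsed;
}

#ifdef _WIN32
/* Combine the two 32-bit halves of a FILETIME into one tick count. */
static unsigned long long filetime_ticks(FILETIME ft)
{
    return ((unsigned long long)ft.dwHighDateTime << 32) | ft.dwLowDateTime;
}

/* Kernel-mode plus user-mode time consumed so far by hThread,
   in 100-nanosecond ticks. Returns 0 if the query fails. */
unsigned long long thread_busy_ticks(HANDLE hThread)
{
    FILETIME ftCreate, ftExit, ftKernel, ftUser;

    if (!GetThreadTimes(hThread, &ftCreate, &ftExit, &ftKernel, &ftUser))
        return 0;
    return filetime_ticks(ftKernel) + filetime_ticks(ftUser);
}
#endif
```

Sampling thread_busy_ticks twice for the same thread handle, subtracting the two results, and passing the difference to cpu_share_percent together with the elapsed interval (in the same 100-nanosecond units) gives a number comparable to the chart. Three equally busy threads sharing one processor each come out near 33 percent.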

So, once you have isolated a processor-intensive application, what can you do about it? There are several options, depending on your resources and the application. They fall into two categories, depending on whether you have access to the source code.

If you have access to the source code, you can try the following:

  Use a profiler to find the processor-intensive parts of the application, and then modify them to be less processor intensive. This could include making more efficient use of critical sections or semaphores. For example, if your application spawns multiple threads, set the thread priorities to make more efficient use of these control mechanisms. Otherwise, the threads may be constantly switched in and out of the processor queue and perform less work than the overhead involved in switching them.
  Rewrite the application to spread the load by distributing it among several computers.
  Rewrite the application to use thread affinities so that it uses a specific processor (or processors) in a multiprocessor system. This can achieve higher system performance because the threads are not constantly switched to different processors. When a thread is switched to a different processor, the processor cache often must be flushed to maintain data coherency, and that slows overall system performance.
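
As one possible illustration of the priority and affinity suggestions above, the sketch below pins a thread to a single processor with SetThreadAffinityMask and lowers its priority with SetThreadPriority. Both are real Win32 calls, but the helper names single_cpu_mask and pin_and_deprioritize are hypothetical, the choice of THREAD_PRIORITY_BELOW_NORMAL is only an assumption made for the example, and the Win32 portion is compiled only when _WIN32 is defined.

```c
#include <limits.h>
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Affinity mask with only the bit for processor index cpu set
   (processor 0 is bit 0). Returns 0 if the index does not fit
   in an unsigned long. */
unsigned long single_cpu_mask(unsigned int cpu)
{
    if (cpu >= CHAR_BIT * sizeof(unsigned long))
        return 0UL;
    return 1UL << cpu;
}

#ifdef _WIN32
/* Hypothetical helper: restrict hThread to one processor and run it
   just below normal priority. Returns 1 on success, 0 on failure. */
int pin_and_deprioritize(HANDLE hThread, unsigned int cpu)
{
    DWORD_PTR mask = (DWORD_PTR)single_cpu_mask(cpu);

    if (mask == 0)
        return 0;

    /* SetThreadAffinityMask returns the previous mask, or 0 on failure
       (for example, when the requested processor is not available). */
    if (SetThreadAffinityMask(hThread, mask) == 0) {
        fprintf(stderr, "SetThreadAffinityMask failed: %lu\n", GetLastError());
        return 0;
    }

    /* Assumption for this sketch: a slightly lower priority lets other
       work on the same processor run ahead of the CPU-bound thread. */
    if (!SetThreadPriority(hThread, THREAD_PRIORITY_BELOW_NORMAL)) {
        fprintf(stderr, "SetThreadPriority failed: %lu\n", GetLastError());
        return 0;
    }
    return 1;
}
#endif
```

For example, calling pin_and_deprioritize(GetCurrentThread(), 0) from inside a worker thread would keep that thread on the first processor. Whether pinning actually helps depends on the workload, so compare the Performance Monitor charts before and after the change.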

