ArcGIS Server creates cache tiles using a geoprocessing service named CachingTools. This service is configured for you in the System folder when you create the ArcGIS Server site. The number of instances you allow for the CachingTools service determines how much processing power your machines can dedicate to caching jobs.
Additionally, at least one instance of the map or image service that you are caching must always be running. Increasing the number of instances of that map or image service does not affect how fast tiles are created.
Legacy:
In 10.0 and earlier versions, to increase the number of operating system processes working on a caching job, you increased the number of instances of the map service being cached. In 10.1 and later releases, you increase the number of instances of the CachingTools geoprocessing service instead.
Choosing the number of instances to allow for the CachingTools service
At any time, you can use Manager to adjust the maximum number of instances of the CachingTools geoprocessing service that you want to make available for working on caching jobs. The minimum and maximum values apply to each individual GIS server; thus, if your maximum is set to a value of 3 and you have four GIS servers running the CachingTools service, you could have up to 12 instances of CachingTools running.
This behavior allows you to add and remove GIS servers from the site to increase or reduce the number of resources dedicated to caching. You can add a GIS server even when the caching job is running and it will be detected and assigned tiles to create.
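The per-server arithmetic above can be sketched in a few lines of Python. This is only an illustration of the multiplication described in the text; the function name site_capacity is hypothetical and is not part of any ArcGIS API.

```python
def site_capacity(max_per_server: int, server_count: int) -> int:
    """Upper bound on CachingTools instances across the whole site.

    Hypothetical helper: the maximum applies to each GIS server,
    so the site-wide ceiling is the product of the two values.
    """
    if max_per_server < 0 or server_count < 0:
        raise ValueError("values must be non-negative")
    return max_per_server * server_count


# Maximum of 3 on each of four GIS servers, as in the text above.
print(site_capacity(3, 4))
# Adding a fifth GIS server raises the ceiling without changing the setting.
print(site_capacity(3, 5))
```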
If you choose to allow too many instances of the CachingTools service, your machine can become overwhelmed and inefficient. If you choose to allow too few instances, your machine may be underutilized. Finding the best number can be a process of trial and error. A good starting point is to allow a maximum of n + 1 instances, where n is the number of CPU cores on a single machine in your cluster. If you're deploying your site on Amazon Web Services, use 2n + 1 where n is the number of virtual cores on a single EC2 instance in your site.
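As a rough illustration of the starting-point rule, the following Python sketch computes n + 1, or 2n + 1 for Amazon Web Services. The function name and the on_aws flag are assumptions made for this example; the result is only a first guess to refine by trial and error.

```python
def suggested_max_instances(cores: int, on_aws: bool = False) -> int:
    """Starting point for the CachingTools maximum on one machine.

    Hypothetical helper: n + 1 for a regular machine, 2n + 1 when
    n counts the virtual cores of a single EC2 instance.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 2 * cores + 1 if on_aws else cores + 1


print(suggested_max_instances(4))               # four CPU cores
print(suggested_max_instances(4, on_aws=True))  # four virtual cores on EC2
```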
The CachingTools service must run with its execution mode as Asynchronous. This is the default value.
Choosing the number of instances to work on a caching job
Tools such as Manage Map Server Cache Tiles allow you to choose how many instances of CachingTools will work on the job. You can choose to divide the available instances of CachingTools among several running jobs. A job might not utilize its maximum number of instances of CachingTools if those instances are being used by other jobs. If a caching job is using all the CachingTools instances, other requested jobs are queued until the first job finishes.
Scenarios
Suppose you want to create a cache and you have four GIS servers in a site. You've configured each server to allow a maximum of five instances of CachingTools. The maximum number of instances you can dedicate toward any caching job is 20.
If you want to run two simultaneous caching jobs on this site and maintain an evenly distributed load, you can dedicate 10 instances toward each job.
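The even split in this scenario can be worked out with a short Python sketch. The helper split_evenly is hypothetical; it simply divides the site-wide total among the jobs and hands any remainder to the first jobs in the list.

```python
def split_evenly(total_instances: int, job_count: int) -> list[int]:
    """Divide the available instances among jobs as evenly as possible.

    Hypothetical helper for planning only; it does not talk to the server.
    """
    if job_count < 1:
        raise ValueError("job_count must be at least 1")
    if total_instances < 0:
        raise ValueError("total_instances must be non-negative")
    base, extra = divmod(total_instances, job_count)
    # The first `extra` jobs each receive one additional instance.
    return [base + 1 if i < extra else base for i in range(job_count)]


total = 4 * 5  # four GIS servers, maximum of five instances each
print(split_evenly(total, 2))
print(split_evenly(total, 3))
```

The values returned are the numbers you would enter as the instance count for each job.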
Allowing for elasticity
You may have configured your site in a cloud environment that can automatically add GIS servers in response to demand. In this case, you might not want a fixed maximum to limit the number of instances that can work on the job. Enter a value of -1 to indicate that there is no limit. All the available instances of CachingTools will then be used for the job, no matter how many GIS servers are added to your site.
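The meaning of the -1 value can be modeled with a small Python sketch. This is a simplified illustration, not the server's actual scheduling logic; the function name effective_instances and its parameters are assumptions made for the example.

```python
UNLIMITED = -1  # the value the text describes for "no limit"


def effective_instances(requested: int, available: int) -> int:
    """Instances a job can actually use, in this simplified model.

    Hypothetical helper: -1 means "use every available instance";
    otherwise the job is capped by both its own limit and by what
    is currently available on the site.
    """
    if available < 0:
        raise ValueError("available must be non-negative")
    if requested == UNLIMITED:
        return available
    if requested < 1:
        raise ValueError("requested must be -1 or a positive number")
    return min(requested, available)


print(effective_instances(10, 20))         # fixed limit of 10
print(effective_instances(UNLIMITED, 20))  # four servers with five instances each
print(effective_instances(UNLIMITED, 30))  # after two more servers are added
```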
Setting the number of jobs that can run simultaneously
If too many publishers request that caches be built at the same time, the server can become overwhelmed, even if you dedicate only a small number of instances to each job. The CachingControllers service (in the System folder) determines how many jobs can run at the same time.
The default maximum number of instances for the CachingControllers service is 3, meaning only three caching jobs can run at once. If the server receives a request for a fourth caching job, it will be queued until one of the other jobs has finished. If you want to allow four jobs to run at once, you can set the maximum number of instances of CachingControllers to 4.
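The queuing behavior can be pictured with a minimal Python sketch. It only models the rule stated above (one running job per CachingControllers instance, the rest wait); the function admit_jobs and the job names are hypothetical.

```python
def admit_jobs(requested_jobs: list[str], max_controllers: int = 3):
    """Split requested jobs into those that run now and those that queue.

    Hypothetical model: each running job occupies one CachingControllers
    instance, and jobs are admitted in the order they were requested.
    """
    if max_controllers < 1:
        raise ValueError("max_controllers must be at least 1")
    running = requested_jobs[:max_controllers]
    queued = requested_jobs[max_controllers:]
    return running, queued


jobs = ["job1", "job2", "job3", "job4"]
print(admit_jobs(jobs))                     # default maximum of 3
print(admit_jobs(jobs, max_controllers=4))  # maximum raised to 4
```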