Hardware Sizing Guidelines
These sizing guidelines offer an approximation of the hardware resources required to deploy an AEM project. Sizing estimates depend on the architecture of the project, the complexity of the solution, expected traffic and the project requirements. This guide helps you to determine the hardware needs for a specific solution, or to find an upper and lower estimate for the hardware requirements.
Basic factors to consider are (in this order):
- Network speed
- Network latency
- Available bandwidth
- Computational speed
- Caching efficiency
- Expected traffic
- Complexity of templates, applications and components
- Concurrent authors
- Complexity of the authoring operation (simple content editing, MSM rollout, etc)
- I/O performance
- Performance and efficiency of the file or database storage
- Hard drive: at least two or three times larger than the repository size
- Size of website (number of content-object, pages, and users)
- Number of users/sessions that are active at the same time
A typical AEM setup consists of an author and a publish environment. These environments have different requirements regarding the underlying hardware size and system configuration. Detailed considerations for both environments are described in the author environment and publish environment sections.
In a typical project setup, you have several environments on which to stage project phases:
- Development environment: To develop new features or make significant changes. Best practice is to use one development environment per developer (usually a local installation on the developer's own system).
- Author test environment: To verify changes. The number of test environments can vary depending on the project requirements (for example, separate environments for QA, integration testing, or user acceptance testing).
- Publish test environment: Primarily for testing social collaboration use cases and/or the interaction between author and multiple publishing instances.
- Author production environment: For authors to edit content.
- Publish production environment: To serve published content.
Additionally, the environments may vary, ranging from a single-server system running AEM and an application server to a highly scaled set of multi-server, multi-CPU clustered instances. We recommend that you use a separate computer for each production system and that you do not run other applications on these computers.
Generic hardware sizing considerations
The sections below provide guidance on how to calculate hardware requirements, taking various considerations into account. For large systems we suggest that you perform a simple set of in-house benchmark tests on a reference configuration.
Performance optimization is a fundamental task that needs to be performed before any benchmarking for a specific project can be done. Please make sure to apply the advice provided in the Performance Optimization documentation before performing any benchmark tests and using their results for any hardware sizing calculations.
Hardware sizing requirements for advanced use cases need to be based on a detailed performance assessment of the project. Characteristics of advanced use cases requiring exceptional hardware resources include combinations of:
- high content payload / throughput
- extensive use of customized code, custom workflows or 3rd party software libraries
- integration with unsupported external systems
Disk Space / Hard Drive
The disk space required depends heavily on both the volume and type of your web application. The calculations should take into account:
- the quantity and size of pages, assets and other repository-stored entities such as workflows, profiles etc.
- the estimated frequency of content changes and therefore the creation of content versions
- the volume of DAM asset renditions that will be generated
- the overall growth of content over time
Disk space is continuously monitored during online and offline revision cleanup. If the available disk space drops below a critical value, the process is cancelled. The critical value is 25% of the current disk footprint of the repository and is not configurable. Size the disk at least two to three times larger than the repository size, including the estimated growth.
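As a rough illustration of the two rules above, the following sketch computes a lower and upper disk size (two and three times the projected repository size) and the free-space threshold below which revision cleanup is cancelled (25% of the current repository footprint). The function name `disk_sizing` and the example figures are assumptions chosen for illustration; they are not part of any AEM tooling.

```python
def disk_sizing(repo_gb: float, growth_gb: float = 0.0) -> dict:
    """Return disk-size bounds and the cleanup threshold, all in GB.

    repo_gb   -- current repository footprint on disk
    growth_gb -- estimated growth over the planning period (assumption)
    """
    if repo_gb < 0 or growth_gb < 0:
        raise ValueError("sizes must be non-negative")
    projected = repo_gb + growth_gb
    return {
        "projected_repository_gb": projected,
        # Guideline: two to three times the repository size incl. growth.
        "min_disk_gb": 2 * projected,
        "max_disk_gb": 3 * projected,
        # Cleanup is cancelled when free space drops below 25% of the
        # *current* repository footprint.
        "cleanup_critical_free_gb": 0.25 * repo_gb,
    }


# Hypothetical example: 200 GB repository, 100 GB expected growth.
sizing = disk_sizing(repo_gb=200, growth_gb=100)
print(sizing)
```

Note that the critical value tracks the current footprint, so it rises as the repository grows; recomputing it periodically is one way to keep monitoring thresholds realistic.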
Consider a setup of redundant arrays of independent disks (RAID, e.g. RAID10) for data redundancy.
The temporary directory of a production instance should have at least 6 GB of available space.
AEM runs well in virtualized environments, but there can be factors such as CPU or I/O that cannot be directly equated to physical hardware. A recommendation is to choose a higher I/O speed (in general) as this is a critical factor in most cases. Benchmarking your environment is necessary to get a precise understanding of what resources will be required.
Parallelization of AEM Instances
A fail-safe website is deployed on at least two separate systems. If one system breaks down, another system can take over and thus compensate for the failure.
System resources scalability
While all systems are running, an increased computational performance is available. That additional performance is not necessarily linear with the number of cluster nodes as the relationship is highly dependent on the technical environment; please see the Cluster documentation for more information.
The estimation of how many cluster nodes are necessary is based on the basic requirements and specific use-cases of the particular web project:
- From the perspective of fail-safeness it is necessary to determine, for all environments, how critical failure is and the failure compensation time based on how long it takes for a cluster node to recover.
- For the aspect of scalability, the number of write operations is basically the most important factor; see Authors Working in Parallel for the author environment and Social Collaboration for the publish environment. Load balancing can be established for operations that access the system solely to process read operations; see Dispatcher for details.
Publish environment specific calculations
Caching efficiency and traffic
Cache efficiency is crucial for the website speed. The following table shows how many pages per second an optimized AEM system can handle using a reverse proxy, such as the dispatcher:
[Table not reproduced here. Surviving column header: Million pages/day (average).]
Disclaimer: The numbers are based on a default hardware configuration and may vary depending on the specific hardware used.
The cache ratio is the percentage of pages that the dispatcher can return without having to access AEM. 100% indicates that the dispatcher answers all requests, 0% means that AEM computes every single page.
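The definition above can be turned into a small calculation. The sketch below derives the cache ratio from two request counts; how you obtain those counts (for example, from web server and AEM request logs) is left open, and the function and parameter names are hypothetical.

```python
def cache_ratio(total_requests: int, requests_reaching_aem: int) -> float:
    """Fraction of requests answered by the dispatcher without AEM.

    1.0 means every request was served from the cache;
    0.0 means AEM computed every page.
    """
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= requests_reaching_aem <= total_requests:
        raise ValueError("requests_reaching_aem must be within 0..total_requests")
    return (total_requests - requests_reaching_aem) / total_requests


# Hypothetical example: 10,000 page requests, 1,500 of them rendered by AEM.
ratio = cache_ratio(10_000, 1_500)
print(f"cache ratio: {ratio:.0%}")
```

The result is a fraction between 0 and 1, which is the form the `cacheRatio` variable takes in the complexity formula.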
Complexity of templates and applications
If you use complex templates, AEM needs more time to render a page. Pages taken from the cache are not affected by this, but the page size is still relevant when considering the overall response time. Rendering a complex page can easily take ten times longer than rendering a simple page.
Using the following formula, you can compute an estimate for the overall complexity of your AEM solution:
complexity = applicationComplexity + ((1-cacheRatio) * templateComplexity)
Based on the complexity, you can determine the number of servers (or CPU cores) you need for the publish environment as follows:
n = (traffic * complexity / 1000) * activations
The variables in the equation are as follows:
| Variable | Description |
|---|---|
| traffic | The expected peak traffic per second. You can estimate this as the number of page hits per day, divided by 35,000. |
| applicationComplexity | Use 1 for a simple application, 2 for a complex application, or a value in-between. |
| cacheRatio | The percentage of pages that come out of the dispatcher cache. Use 1 if all pages come from the cache, or 0 if every page is computed by AEM. |
| templateComplexity | Use a value between 1 and 10 to indicate the complexity of your templates. Higher numbers indicate more complex templates: use 1 for sites with an average of 10 components per page, 5 for an average of 40 components per page, and 10 for an average of over 100 components per page. |
| activations | The average number of activations (replication of average-sized pages and assets from the author to the publish tier) per hour, divided by x, where x is the number of activations a system can perform without performance side effects on other tasks it processes. You can also predefine a pessimistic initial value such as x = 100. |
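To make the two formulas concrete, here is a minimal worked sketch that applies them to one set of made-up inputs. All input values are hypothetical, the function names are illustrative, and rounding `n` up to a whole number is an assumption of this sketch rather than something the formula prescribes.

```python
import math


def peak_traffic(page_hits_per_day: float) -> float:
    """Estimate peak traffic per second as daily page hits / 35,000."""
    return page_hits_per_day / 35_000


def complexity(application_complexity: float,
               cache_ratio: float,
               template_complexity: float) -> float:
    """complexity = applicationComplexity + (1 - cacheRatio) * templateComplexity"""
    return application_complexity + (1 - cache_ratio) * template_complexity


def publish_servers(traffic: float,
                    complexity_value: float,
                    activations: float) -> float:
    """n = (traffic * complexity / 1000) * activations"""
    return (traffic * complexity_value / 1000) * activations


# Hypothetical inputs for illustration only.
traffic = peak_traffic(7_000_000)          # 7 million page hits per day
c = complexity(application_complexity=1.5,  # between simple (1) and complex (2)
               cache_ratio=0.5,             # half the pages come from the cache
               template_complexity=4)
activations = 200 / 100                     # 200 activations/hour, pessimistic x = 100
n = publish_servers(traffic, c, activations)

# Assumption: round up, because a fraction of a server cannot be provisioned.
servers = math.ceil(n)
print(f"traffic={traffic:.1f}/s complexity={c:.2f} n={n:.2f} -> {servers} server(s)")
```

Because the cache ratio only scales the template term, raising it reduces `complexity` and therefore `n` without any change to the application itself, which is why cache efficiency is usually the first thing to tune.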
If you have a more complex website, you also need more powerful web servers so that AEM can answer a request in an acceptable time.
- Complexity below 4: 1024 MB JVM RAM*, low to mid-performance CPU
- Complexity between 4 and 8: 2048 MB JVM RAM*, mid to high-performance CPU
- Complexity above 8: 4096 MB JVM RAM*, high to high-end-performance CPU
* Reserve enough RAM for your operating system in addition to the memory required for your JVM.
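The tiers above can be expressed as a simple lookup. In this sketch the boundary values 4 and 8 are assigned to the middle tier, which is an interpretation of "between 4 and 8"; the function name is hypothetical.

```python
def jvm_tier(complexity: float) -> tuple[int, str]:
    """Map a complexity value to (JVM RAM in MB, CPU class).

    Boundary handling (4 and 8 fall into the middle tier) is an
    assumption; the guideline does not state it explicitly.
    """
    if complexity < 4:
        return 1024, "low to mid-performance"
    if complexity <= 8:
        return 2048, "mid to high-performance"
    return 4096, "high to high-end-performance"


ram_mb, cpu_class = jvm_tier(3.5)
print(f"{ram_mb} MB JVM RAM, {cpu_class} CPU")
```

The returned RAM figure covers the JVM only; memory for the operating system has to be added on top.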
Additional use-case specific calculations
In addition to the calculation for a default web application, you may need to consider specific factors for the following use-cases. The calculated values are to be added to the default calculation.
Extensive processing of digital assets requires optimized hardware resources; the most relevant factors are image size and the peak throughput of processed images.
Allocate at least 16 GB of heap and configure the DAM Update Asset workflow to use the Camera Raw package for the ingestion of raw images.
A higher throughput of images means that the computing resources need to be able to keep pace with system I/O and vice versa. For example, if workflows are launched by the import of images, then uploading many images via WebDAV could cause a backlog of workflows.
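The backlog effect described above is simple arithmetic: whenever images arrive faster than workflows complete, the difference accumulates. The sketch below assumes constant upload and processing rates, which is a simplification; the function name and the example rates are hypothetical.

```python
def workflow_backlog(upload_rate_per_min: float,
                     processing_rate_per_min: float,
                     minutes: float) -> float:
    """Pending workflows after `minutes`, assuming constant rates.

    Returns 0 when processing keeps pace with uploads.
    """
    surplus = upload_rate_per_min - processing_rate_per_min
    return max(0.0, surplus * minutes)


# Hypothetical bulk upload: 120 images/min arriving, 80 workflows/min completing.
pending = workflow_backlog(120, 80, minutes=30)
print(f"pending workflows after 30 minutes: {pending:.0f}")
```

Measuring the actual processing rate on your own hardware is what makes such an estimate meaningful; the rates here are placeholders.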
The use of separate disks for TarPM, data store and search index can help to optimize the system I/O behavior (however, usually it makes sense to keep the search index locally).
See also the Assets Performance Guide.
The resource consumption when using AEM MSM on an authoring environment depends heavily on the specific use cases. Basic factors are:
- Number of Live-Copies
- Periodicity of rollouts
- Content tree size to be rolled out
- Connected functionality of the rollout actions
Testing the planned use case with a representative content excerpt can help you improve your understanding of the resource consumption. If you extrapolate the results with the planned throughput, you can assess the additional resources required for the AEM MSM.
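One simple way to extrapolate is to assume that rollout cost scales linearly with the number of pages rolled out. That linearity is an assumption of this sketch and should be checked against your own measurements; the function name, the CPU-seconds metric, and the figures are hypothetical.

```python
def extrapolate_rollout_load(test_pages: int,
                             test_cpu_seconds: float,
                             planned_pages_per_hour: float) -> float:
    """Estimate CPU-seconds per hour for the planned rollout throughput.

    Assumes cost grows linearly with the number of pages rolled out.
    """
    if test_pages <= 0:
        raise ValueError("test_pages must be positive")
    cpu_seconds_per_page = test_cpu_seconds / test_pages
    return cpu_seconds_per_page * planned_pages_per_hour


# Hypothetical test: rolling out 500 pages consumed 250 CPU-seconds.
load = extrapolate_rollout_load(500, 250.0, planned_pages_per_hour=6000)
core_share = load / 3600  # share of one CPU core kept busy
print(f"{load:.0f} CPU-seconds/hour, about {core_share:.0%} of one core")
```

If a second test at a different content size gives a noticeably different per-page cost, the linear model does not hold and a broader benchmark is the safer basis.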
Also take into account that authors working in parallel will notice performance side effects if AEM MSM use cases consume more resources than planned.
AEM Communities sizing considerations
AEM sites that include AEM Communities features (community sites) experience a high level of interaction from site visitors (members) in the publish environment.
The sizing considerations for a community site depend on the anticipated interaction by community members and on whether optimal performance for page content is of higher importance.
User generated content (UGC) submitted by members is stored separately from page content. While the AEM platform uses a node store that replicates site content from author to publish, AEM Communities uses a single, common store for UGC that is never replicated.
For the UGC store, it is necessary to choose a storage resource provider (SRP), which influences the chosen deployment. See