Configuring Node Stores and Data Stores in AEM 6.0

Introduction

In AEM 6, binary data can be stored independently from the content nodes. The location where the binary data is stored is referred to as the Data Store, while the location of the content nodes is called the Node Store.

Both data stores and node stores can be configured using OSGi configuration. Each OSGi configuration is referenced by a persistent identifier (PID).

Configuration steps

In order to configure both the node store and the data store, the following steps must be performed:

  1. Copy the AEM 6 quickstart jar into its installation directory.

  2. Create a folder named crx-quickstart\install in the installation directory.

  3. Configure the node store by creating a configuration file in the crx-quickstart\install directory. Name the file after the PID of the node store option you want to use, followed by the .cfg extension.

    For example, the Document Node Store (which is the basis for AEM's MongoMK implementation) uses a file called org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg

  4. Edit the file and set your configuration options.

  5. Create a configuration file with the PID of the data store you wish to use and edit the file in order to set the configuration options.

    Note

    See Node Store Configurations and Data Store Configurations for configuration options.

  6. Start AEM.
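
As an illustration of steps 3 to 5, the sketch below shows what the node store file might contain for a MongoMK setup that uses a separate data store. The option values are taken from the Document Node Store example in this document, and the choice of the File Data Store as the accompanying data store is an assumption made only for the example.

```properties
# File: crx-quickstart/install/org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg
# Steps 3 and 4: the file is named after the node store PID plus the .cfg extension

# Mongo server and database to use
mongouri=mongodb://localhost:27017
db=aem-author

# Set to true because a custom data store is configured in a second file (step 5)
customBlobStore=true

# Step 5 adds a second file next to this one, named after the data store PID, for example:
# crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.cfg
```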

Node Store Configurations

Segment Node Store

The Segment Node Store is the basis of Adobe's TarMK implementation in AEM 6. It uses the org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService PID for configuration.

The following options can be configured:

  • repository.home: Path to the repository home under which repository-related data is stored. By default, segment files are stored under the crx-quickstart/segmentstore directory.
  • tarmk.size: Maximum size of a segment in MB. The default is 256MB.

An example org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg file should look like this:

repository.home=${repository.home}/segmentstore
tarmk.size=256
        

Code samples are intended for illustration purposes only.

Document Node Store

The Document Node Store is the basis of AEM's MongoMK implementation. It uses the org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService PID. The following configuration options are available:

  • mongouri: The MongoURI required to connect to Mongo Database. The default is mongodb://localhost:27017
  • db: Name of the Mongo database. The default is Oak. However, note that new AEM 6 installations use aem-author as the default database name.
  • cache: The cache size in MB. This is distributed among the various caches used in DocumentNodeStore. The default is 256.
  • changesSize: Size in MB of the capped collection used in Mongo for caching the diff output. The default is 256.
  • customBlobStore: Boolean value indicating that a custom data store will be used. The default is false.

Again, an example org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg file should look like this:

#Mongo server details
mongouri=mongodb://localhost:27017

#Name of Mongo database to use
db=aem-author

#Store binaries in custom BlobStore
customBlobStore=false
        

Code samples are intended for illustration purposes only.

Data Store Configurations

Note

In order to enable custom Data Stores, you need to make sure that customBlobStore is set to true in the respective Node Store configuration file.

File Data Store

This is the implementation of FileDataStore present in Jackrabbit 2. It provides a way to store the binary data as normal files on the file system. It uses the org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore PID.

These configuration options are available:

  • repository.home: Path to the repository home under which repository-related data is stored. By default, binary files are stored under the crx-quickstart/repository/datastore directory.
  • path: Path to the directory under which the files are stored. If specified, it takes precedence over the repository.home value.
  • minRecordLength: The minimum size in bytes of a file stored in the data store. Binary content smaller than this value is inlined.
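
A minimal sketch of a File Data Store configuration file follows. The directory and the threshold are hypothetical values chosen for illustration, not defaults stated in this document; only the option names come from the list above.

```properties
# File: crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.cfg

# Hypothetical directory for the binary files; when set, it takes precedence over repository.home
path=/mnt/aem/datastore

# Illustrative threshold in bytes: binary content smaller than this is inlined
minRecordLength=4096
```

As the note at the start of this section says, customBlobStore must also be set to true in the node store configuration file for this data store to be used.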

 

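S3 Data Store

The steps below configure a data store that keeps binaries in an Amazon S3 bucket. It uses the org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore PID.

  1. Extract the contents of the zip file to the <aem-install>\crx-quickstart folder.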

  2. If AEM is already configured to work with the Tar or Mongo microkernels, remove any existing configuration files from the <aem-install>\crx-quickstart\install folder before proceeding. The files that need to be removed are:

    • For MongoMK: org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.cfg
    • For TarMK: org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.cfg
  3. Create a file named org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.cfg in the same <aem-install>\crx-quickstart\install folder.

  4. Edit the file and add the configuration options required by your setup.

  5. Start AEM.
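
As a sketch of step 4, the file might start with the connection options described in the list that follows. The values in angle brackets are placeholders to be replaced with the details of your own AWS account and bucket; none of them are real or default values.

```properties
# File: <aem-install>/crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.S3DataStore.cfg

# AWS credentials (placeholders)
accessKey=<your-access-key>
secretKey=<your-secret-key>

# Target bucket and its region (placeholders)
s3Bucket=<your-bucket-name>
s3Region=<your-bucket-region>
```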

You can use the configuration file with the following options:

  • accessKey: The AWS access key.
  • secretKey: The AWS secret access key.
  • s3Bucket: The bucket name.
  • s3Region: The bucket region.
  • connectionTimeout
  • socketTimeout
  • maxConnections
  • maxErrorRetry
  • maxCachedBinarySize: Size in bytes. The default is 17408 (17 KB). Binaries with a size less than or equal to this value are stored in the in-memory cache.
  • cacheSizeInMB: Size in MB. The default is 16. This is the in-memory cache for storing small files whose size is less than maxCachedBinarySize. It improves performance when many small binaries are accessed frequently.
  • cacheSize: The value is specified in bytes. The default is 64 GB.
  • cachePurgeTrigFactor: The trigger factor that decides the purging of the local cache. A cache purge is triggered if the current size of the cache is bigger than cachePurgeTrigFactor multiplied by cacheSize.
  • cachePurgeResizeFactor: The cache resize factor. The resulting cache size after the purge is the value of cachePurgeResizeFactor multiplied by the value of cacheSize.
  • asyncUploadlimit:
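
To make the interaction of the three cache options concrete, the fragment below works through the arithmetic in comments. The cacheSize value is the 64 GB default stated above, written in bytes; the two factors are assumed values picked for illustration and are not taken from this document.

```properties
# Local cache size in bytes: 64 * 1024 * 1024 * 1024 = 68719476736 (64 GB)
cacheSize=68719476736

# Assumed factor: a purge is triggered once the cache grows beyond 0.95 * 64 GB = 60.8 GB
cachePurgeTrigFactor=0.95

# Assumed factor: the purge shrinks the cache to about 0.85 * 64 GB = 54.4 GB
cachePurgeResizeFactor=0.85
```

Keeping the resize factor below the trigger factor leaves headroom after each purge, so the cache does not reach the trigger threshold again immediately.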