AEM as a Cloud Service Development Guidelines
Code running in AEM as a Cloud Service must be aware of the fact that it is always running in a cluster. This means that there is always more than one instance running. The code must be resilient especially as an instance might be stopped at any point in time.
During the update of AEM as a Cloud Service, there will be instances with old and new code running in parallel. Therefore, old code must not break with content created by new code and new code must be able to deal with old content.
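One way to meet both requirements is to read content tolerantly. The following is a minimal sketch in plain Java, assuming the content properties are available as a Map. The class name and both property names are hypothetical and chosen only for illustration.

```java
import java.util.Map;

/** Hypothetical reader for a content node whose properties are modeled as a plain Map. */
final class TolerantLinkReader {

    // Both property names are made up for this sketch.
    static final String NEW_PROP = "linkTarget"; // written by the new code
    static final String OLD_PROP = "linkUrl";    // written by the old code

    /** Returns the link regardless of which code version wrote the content. */
    static String readLink(Map<String, Object> props) {
        Object value = props.get(NEW_PROP);
        if (value == null) {
            value = props.get(OLD_PROP); // fall back to the old layout
        }
        // Unknown or unexpectedly typed properties are ignored rather than rejected.
        return (value instanceof String) ? (String) value : "";
    }
}
```

Because the reader ignores properties it does not know and falls back to a default, the same code keeps working while old and new instances write to the same repository.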
If you need to identify the primary (leader) instance in the cluster, use the Apache Sling Discovery API to detect it.
State in Memory
State must not be kept in memory but persisted in the repository. Otherwise, this state might get lost if an instance is stopped.
State on the Filesystem
The instance's file system should not be used in AEM as a Cloud Service. The disk is ephemeral and is disposed of when instances are recycled. Limited use of the file system for temporary storage while processing a single request is possible, but it should not be used for huge files, because this counts against the resource usage quota and may run into disk limitations.
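The limited, request-scoped use described above could look like the following minimal sketch. It uses only the JDK. The class and method names are hypothetical, and the caller is assumed to finish all work on the file before the method returns.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Function;

/** Hypothetical helper: a temp file that lives only for the duration of one request. */
final class RequestScopedTempFile {

    /** Spools the request body to a temp file, runs the work, and always deletes the file. */
    static <T> T withTempFile(InputStream body, Function<Path, T> work) {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("request-", ".tmp");
            // createTempFile already created the file, so it must be replaced.
            Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
            return work.apply(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp); // nothing is left behind on the ephemeral disk
                } catch (IOException ignored) {
                    // best effort: the disk is discarded with the instance anyway
                }
            }
        }
    }
}
```

Nothing outlives the request: the file is removed in the finally block even if the work fails, and any result that must survive is returned to the caller rather than left on disk.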
As an example where file system usage is not supported, the Publish tier should ensure that any data that needs to be persisted is shipped off to an external service for longer term storage.
Similarly, anything that happens asynchronously, such as acting on observation events, cannot be guaranteed to execute locally and must therefore be used with care. This is true for both JCR events and Sling resource events. At the moment a change happens, the instance may be taken down and replaced by a different instance. Other instances in the topology that are active at that time will be able to react to the event. In that case, however, it will not be a local event, and there might not even be an active leader if a leader election is in progress when the event is issued.
Background Tasks and Long Running Jobs
Code executed as a background task must assume that the instance it is running on can be brought down at any time. Therefore, the code must be resilient and, most importantly, resumable. That means that if the code is re-executed, it should not start from the beginning again but rather close to where it left off. While this is not a new requirement for this kind of code, in AEM as a Cloud Service an instance is more likely to be taken down.
To minimize the trouble, long running jobs should be avoided if possible, and they should be resumable at a minimum. To execute such jobs, use Sling Jobs, which have an at-least-once guarantee, so an interrupted job is re-executed as soon as possible. Even then, the job should not start from the beginning again. To schedule such jobs, it is best to use the Sling Jobs scheduler, as this again guarantees at-least-once execution.
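The following is a minimal sketch of what "resumable" can mean in practice. It is plain Java and does not use the Sling Jobs API. The CheckpointStore interface is hypothetical and stands in for state persisted in the repository. The map-backed implementation exists only so the sketch can run on its own, and the maxItems parameter exists only to simulate an instance being stopped mid-run.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical batch job that records its progress so a re-execution can resume. */
final class ResumableBatch {

    /** Stand-in for repository-backed persistence; a real job must not keep this in memory. */
    interface CheckpointStore {
        int load(String jobId);            // index of the next item, 0 if nothing was saved
        void save(String jobId, int next);
    }

    /** In-memory implementation, used here only to make the sketch self-contained. */
    static final class MapCheckpointStore implements CheckpointStore {
        private final Map<String, Integer> data = new HashMap<>();
        public int load(String jobId) { return data.getOrDefault(jobId, 0); }
        public void save(String jobId, int next) { data.put(jobId, next); }
    }

    /**
     * Processes items from the saved checkpoint onward.
     * Stops after maxItems items to simulate the instance being stopped.
     * Returns true once every item has been processed.
     */
    static boolean run(String jobId, List<String> items, CheckpointStore store,
                       List<String> sink, int maxItems) {
        int next = store.load(jobId);
        int done = 0;
        while (next < items.size()) {
            if (done == maxItems) {
                return false; // simulated shutdown; progress is already saved
            }
            // The actual work. If the instance stops between this line and save(),
            // the item is processed again on resume, so the work should be idempotent.
            sink.add(items.get(next));
            next++;
            store.save(jobId, next);
            done++;
        }
        return true;
    }
}
```

Saving the checkpoint after each item keeps the amount of repeated work small when the job is re-executed under an at-least-once guarantee.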
The Sling Commons Scheduler should not be used for scheduling, as execution cannot be guaranteed. A scheduled task is likely to run, but nothing ensures that it does.
Similarly, anything that happens asynchronously, such as acting on observation events (whether JCR events or Sling resource events), cannot be guaranteed to execute and must therefore be used with care. This is already true for current AEM deployments.
Outgoing HTTP Connections
It is strongly recommended that any outgoing HTTP connections set reasonable connect and read timeouts. For code that does not apply these timeouts, AEM instances running on AEM as a Cloud Service will enforce global timeouts. These timeout values are 10 seconds for connect calls and 60 seconds for read calls for connections used by the following popular Java libraries:
Adobe recommends the use of the provided Apache HttpComponents Client 4.x library for making HTTP connections.
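As a minimal sketch of setting explicit timeouts, the following uses the JDK's java.net.HttpURLConnection so that the example needs no external dependency. It is not the recommended Apache client, but the same idea applies there. The class name and the timeout values (5 and 30 seconds) are assumptions chosen for illustration; pick values that suit the service being called.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** Hypothetical helper that never opens a connection without explicit timeouts. */
final class TimeoutExample {

    // Illustrative values only, not taken from the guidelines.
    static final int CONNECT_TIMEOUT_MS = 5_000;
    static final int READ_TIMEOUT_MS = 30_000;

    /** Prepares a connection with both timeouts set; no network traffic happens here. */
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setConnectTimeout(CONNECT_TIMEOUT_MS); // time allowed to establish the connection
            connection.setReadTimeout(READ_TIMEOUT_MS);       // time allowed to wait for data
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Setting both values in one place makes it harder for a call site to fall back silently on the global limits.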
Alternatives that are known to work, but may require providing the dependency yourself are:
No Classic UI Customizations
AEM as a Cloud Service only supports the Touch UI for third-party customer code. Classic UI is not available for customization.
Avoid Native Binaries
Code will not be able to download binaries at runtime nor modify them. For example, it will not be able to unpack jar or tar files.
No Streaming Binaries through AEM as a Cloud Service
Binaries should be accessed through the CDN, which will serve binaries outside of the core AEM services.
For example, do not use asset.getOriginal().getStream(), which triggers downloading a binary onto the AEM service's ephemeral disk.
No Reverse Replication Agents
Reverse replication from Publish to Author is not supported in AEM as a Cloud Service. If such a strategy is needed, you can use an external persistence store that is shared amongst the farm of Publish instances and potentially the Author cluster.
Forward Replication Agents Might Need to be Ported
Content is replicated from Author to Publish through a pub-sub mechanism. Custom replication agents are not supported.
Monitoring and Debugging
For local development, log entries are written to local files in the /crx-quickstart/logs folder.
On Cloud environments, developers can download logs through Cloud Manager or use a command line tool to tail the logs.
Setting the Log Level
To change the log levels for Cloud environments, modify the Sling Logging OSGi configuration and then perform a full redeployment. Since this is not instantaneous, be cautious about enabling verbose logs on production environments that receive a lot of traffic. In the future, there may be mechanisms to change the log level more quickly.
To perform the configuration changes listed below, create them on a local development environment and then push them to an AEM as a Cloud Service instance. For more information on how to do this, see Deploying to AEM as a Cloud Service.
Activating the DEBUG Log Level
The default log level is INFO, that is, DEBUG messages are not logged. To activate the DEBUG log level, set the log level property of the Sling Logging OSGi configuration to debug. Do not leave the log at the DEBUG log level longer than necessary, as it generates a lot of logs. A line in the debug file usually starts with DEBUG, followed by the log level, the installer action, and the log message. For example:
DEBUG 3 WebApp Panel: WebApp successfully deployed
The log levels are as follows:
0 (Fatal error): The action has failed, and the installer cannot proceed.
1 (Error): The action has failed. The installation proceeds, but a part of CRX was not installed correctly and will not work.
2 (Warning): The action has succeeded but encountered problems. CRX may or may not work correctly.
3 (Information): The action has succeeded.
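When scanning a debug file for failures, it can help to split each line into its parts. The following is a minimal sketch that assumes the layout shown in the single example above: the word DEBUG, a numeric level, an action that runs up to the first colon, and then the message. The class name is hypothetical, and lines in other layouts are simply not matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for lines shaped like "DEBUG <level> <action>: <message>". */
final class DebugLineParser {

    private static final Pattern LINE =
            Pattern.compile("^DEBUG\\s+(\\d+)\\s+([^:]+):\\s*(.*)$");

    /** Returns {level, action, message}, or null when the line has a different shape. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2).trim(), m.group(3) };
    }
}
```

For the example line, parse returns the level "3", the action "WebApp Panel", and the message "WebApp successfully deployed".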
Thread dumps on Cloud environments are collected on an ongoing basis, but cannot be downloaded in a self-serve manner at this time. In the meantime, contact AEM support if thread dumps are needed for debugging an issue, and specify the exact time window.
CRX/DE Lite and System Console
For local development, developers have full access to CRXDE Lite ( /crx/de ) and the AEM Web Console ( /system/console ).
Note that on local development (using the cloud-ready quickstart), /apps and /libs can be written to directly, which is different from Cloud environments where those top level folders are immutable.
AEM as a Cloud Service Development tools
Customers can access CRXDE Lite on the development environment but not on stage or production. The immutable repository ( /libs , /apps ) cannot be written to at runtime, so attempting to do so will result in errors.
A set of tools for debugging AEM as a Cloud Service developer environments is available in the Developer Console for dev, stage, and production environments. The URL can be determined by adjusting the Author or Publish service URLs as follows:
As a shortcut, the following Cloud Manager CLI command can be used to launch the developer console based on an environment parameter described below:
aio cloudmanager:open-developer-console <ENVIRONMENTID> --programId <PROGRAMID>
See this page for more information.
Developers can generate status information and resolve various resources.
Available status information includes the state of bundles, components, OSGi configurations, Oak indexes, OSGi services, and Sling jobs.
Developers can also resolve package dependencies and servlets.
Also useful for debugging, the Developer Console has a link to the Explain Query tool.
For regular programs, access to the Developer Console is defined by the "Cloud Manager - Developer Role" in the Admin Console, while for sandbox programs, the Developer Console is available to any user with a product profile giving them access to AEM as a Cloud Service. For more information about setting up user permissions, see Cloud Manager Documentation.
AEM Staging and Production Service
Customers will not have access to developer tooling for staging and production environments.
Adobe monitors application performance and takes measures to address deterioration if it is observed. At this time, application metrics cannot be observed.