Amazon DAX manages database spikes for improved performance
The DynamoDB Accelerator reduces database latency without tacking on a separate cache. While the service has some early limitations, DAX benefits enterprises with real-time apps.
As enterprise data piles up, so do management and performance challenges. A new service, Amazon DAX, aims to minimize these issues for AWS workloads.
Amazon DynamoDB, a managed NoSQL offering and one of AWS' earliest storage services, is popular among the cloud provider's largest customers, including Expedia, Eyeview, Genesys and Twilio. But, as with any database implementation, IT teams can struggle to size their infrastructure to handle usage changes and to maintain acceptable performance levels. Cloud services make it easy to scale capacity, but it's still a challenge to scale databases without service disruption.
In-memory database caches, such as Amazon ElastiCache, can smooth out load spikes and accelerate performance without meddling with the underlying database. Although ElastiCache, which supports Memcached and Redis, reduces database performance bottlenecks, it also complicates database design and programming. In addition, ElastiCache is a separate service with its own APIs, and developers typically implement it as a side cache, not in line with the primary database.
To address these issues, AWS introduced DynamoDB Accelerator (DAX), a managed, in-memory cache designed as an extension, not an add-on, to the underlying NoSQL database.
Put Amazon DAX to work
Amazon DAX is beneficial in a number of scenarios. IT teams can use the accelerator to better manage unpredictable database access spikes, which can significantly degrade performance. Caching offloads read traffic from the primary database and improves performance by serving most items directly from memory.
A team can also use Amazon DAX to deliver faster database performance for near-real-time applications, such as online games, internet of things data analysis or dynamic e-commerce pricing analysis. These types of apps need response times measured in microseconds. Amazon DAX addresses these needs in two different ways:
- DAX serves as a write-through cache, in which incoming writes pass through the cache and are committed to the master database as part of the same operation. Because newly written data already resides in the cache, later reads of that data are fast, but this approach typically works best with smaller data sets that won't fill up the cache.
- DAX can also provide a read-through cache. With this approach, writes occur directly on DynamoDB, but data that is read is copied to the cache so that commonly accessed items are served from memory.
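The two caching modes in the list above can be sketched in a few lines of Python. This is only an illustration of the general technique, not DAX code: `BackingStore` and `InlineCache` are hypothetical names, and a plain dictionary stands in for DynamoDB.

```python
class BackingStore:
    """Hypothetical stand-in for the primary database; counts reads."""

    def __init__(self):
        self.items = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.items.get(key)

    def put(self, key, value):
        self.items[key] = value


class InlineCache:
    """Hypothetical in-line cache: the application talks only to the cache."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def get(self, key):
        # Read-through: serve from memory, fall back to the store on a miss.
        if key in self.cache:
            return self.cache[key]
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write-through: persist to the store, then refresh the cached copy.
        self.store.put(key, value)
        self.cache[key] = value


store = BackingStore()
cache = InlineCache(store)

cache.put("player#1", {"score": 10})  # write-through path
store.put("player#2", {"score": 7})   # written directly to the store
first = cache.get("player#2")         # miss: read through to the store
second = cache.get("player#2")        # hit: served from memory
written = cache.get("player#1")       # hit: populated by the write-through
```

In this sketch, only the first read of `player#2` reaches the backing store; every other read is answered from memory.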
Amazon DAX does not currently support a write-back cache -- a type of cache that immediately acknowledges writes and only later writes the changed data to the underlying database. Unlike the two scenarios above, which favor read-intensive applications, write-back caches work best for write-intensive applications. AWS might add support for this approach in the future.
How Amazon DAX speeds up database performance
Unlike DynamoDB databases, DAX runs within a Virtual Private Cloud (VPC), which enables users to control network addressing and access policies with security groups and AWS Identity and Access Management (IAM) permissions. DAX deploys as a scale-out cluster of nodes, which come in five sizes. Each node is an individual cache instance that contains a replica of the cached data.
Clusters are groups of database cache nodes, and DAX manages a cluster as a unit. The primary cluster node is solely responsible for writes, while all other nodes handle reads and memory management. DAX cluster endpoints remove the need for applications to know node names or port numbers, so read replicas can be added behind an endpoint to scale capacity without interrupting the service.
In contrast to ElastiCache, DAX is compatible with DynamoDB APIs. It currently supports read APIs, such as GetItem, BatchGetItem, Query and Scan, and modify APIs, like PutItem, UpdateItem, DeleteItem and BatchWriteItem. But DAX doesn't currently support DynamoDB control plane APIs, such as CreateTable and DeleteTable.
Aside from read-through and write-through techniques, developers can use DAX as a query cache for query text and result sets or as an item cache -- essentially a key-value store. DAX currently doesn't support prepopulating the cache with warm data from DynamoDB; the cache fills over time as data is read from the master database.
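To make the distinction between an item cache and a query cache concrete, here is a rough Python sketch. It does not use any DAX API: the class name, the `pk` attribute and the equality-only `query` method are assumptions made for illustration, and a list of dictionaries stands in for a DynamoDB table. Both caches start empty and fill only as data is read.

```python
import json


class ItemAndQueryCache:
    """Hypothetical cache holding single items and whole query results."""

    def __init__(self, table):
        self.table = table        # list of dict rows, stand-in for a table
        self.item_cache = {}      # primary key -> item
        self.query_cache = {}     # serialized query parameters -> result set
        self.table_reads = 0

    def get_item(self, pk):
        # Item cache: a key-value lookup keyed by the primary key.
        if pk not in self.item_cache:
            self.table_reads += 1
            match = next((row for row in self.table if row["pk"] == pk), None)
            self.item_cache[pk] = match
        return self.item_cache[pk]

    def query(self, **conditions):
        # Query cache: keyed by the query parameters, stores the result set.
        cache_key = json.dumps(conditions, sort_keys=True)
        if cache_key not in self.query_cache:
            self.table_reads += 1
            self.query_cache[cache_key] = [
                row for row in self.table
                if all(row.get(k) == v for k, v in conditions.items())
            ]
        return self.query_cache[cache_key]


table = [
    {"pk": "order#1", "region": "eu", "total": 40},
    {"pk": "order#2", "region": "us", "total": 15},
    {"pk": "order#3", "region": "eu", "total": 22},
]
dax_like = ItemAndQueryCache(table)

eu_orders = dax_like.query(region="eu")        # miss: scans the table
eu_orders_again = dax_like.query(region="eu")  # hit: cached result set
order = dax_like.get_item("order#2")           # miss: item cache is separate
order_again = dax_like.get_item("order#2")     # hit
```

Note that the two caches are independent in this sketch: a cached query result does not populate the item cache, so the first `get_item` call still reads the table.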
Database caches are beneficial because they can increase capacity without disrupting the underlying database. DAX can scale either vertically within a single cache by increasing the memory capacity -- up to 32 virtual CPUs and 244 GB using r3.8xlarge instances -- or horizontally by replicating and distributing the cache across up to 10 nodes accessed via round-robin load balancing. Nodes that scale vertically do so offline, but horizontal scaling can occur online.
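The cluster behavior described above, with one primary node for writes and replicas that share reads in round-robin order, might be modeled as follows. The `ClusterEndpoint` class and the node names are hypothetical; the real service handles this routing internally, and how it rebalances after a node is added is not covered here.

```python
from itertools import cycle


class ClusterEndpoint:
    """Hypothetical endpoint that hides node names from the application."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._reader = cycle(self.replicas)

    def node_for_write(self):
        # All writes go to the single primary node.
        return self.primary

    def node_for_read(self):
        # Reads rotate through the replicas in round-robin order.
        return next(self._reader)

    def add_replica(self, name):
        # Horizontal scaling: callers keep using the same endpoint object.
        # In this sketch the rotation simply restarts from the first replica.
        self.replicas.append(name)
        self._reader = cycle(self.replicas)


endpoint = ClusterEndpoint("node-0", ["node-1", "node-2"])
reads_before = [endpoint.node_for_read() for _ in range(4)]

endpoint.add_replica("node-3")
reads_after = [endpoint.node_for_read() for _ in range(3)]
write_target = endpoint.node_for_write()
```

The application code never changes when `node-3` joins; it only ever calls the endpoint.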
Management tips for Amazon DAX
Amazon DAX clusters cannot span multiple regions, but their nodes can be spread across Availability Zones within a region. To properly size clusters, estimate a database's working set size over the required time to live (TTL) for the data. For example, the working set would be 50 GB for a terabyte database in which you only access 5% of the records over a 24-hour period.
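The sizing rule above is simple arithmetic, sketched here in Python. The function names are hypothetical. The memory table is an assumption based on the EC2 r3 instance family (the article itself confirms only the 244 GB figure for r3.8xlarge), and the sketch ignores any memory a node reserves for its own overhead.

```python
# Assumed memory per node size in GB, taken from the EC2 r3 family.
R3_MEMORY_GB = {
    "r3.large": 15.25,
    "r3.xlarge": 30.5,
    "r3.2xlarge": 61,
    "r3.4xlarge": 122,
    "r3.8xlarge": 244,
}


def working_set_gb(total_gb, percent_accessed):
    """Data touched within the TTL window, as a share of the whole database."""
    return total_gb * percent_accessed / 100


def smallest_fitting_node(required_gb, sizes=R3_MEMORY_GB):
    """Return the smallest node size whose memory covers the working set."""
    for name, memory_gb in sorted(sizes.items(), key=lambda kv: kv[1]):
        if memory_gb >= required_gb:
            return name
    return None  # working set exceeds the largest single node


estimate = working_set_gb(total_gb=1000, percent_accessed=5)
node_size = smallest_fitting_node(estimate)
```

Because each node holds a replica of the cached data, the working set has to fit on a single node; adding nodes raises read throughput, not cache capacity.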
Perform cache garbage collection in one of these three ways:
- Use TTL parameters for cache items to evict data after a preset time in the cache.
- Rely on a least recently used (LRU) algorithm, which first discards the items that have gone untouched the longest when space is needed.
- Use write-through eviction, which replaces cache items with newer data written to the same location.
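The three eviction mechanisms in the list above can be combined in one small Python sketch. This is a generic illustration rather than DAX's implementation: `EvictingCache` is a hypothetical class, and the clock is injected so that TTL expiry can be demonstrated without waiting.

```python
from collections import OrderedDict


class EvictingCache:
    """Hypothetical cache with TTL expiry, LRU eviction and overwrite-on-write."""

    def __init__(self, capacity, ttl_seconds, clock):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.clock = clock              # callable returning the current time
        self.entries = OrderedDict()    # key -> (value, expires_at)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]       # TTL eviction
            return None
        self.entries.move_to_end(key)   # mark as most recently used
        return value

    def put(self, key, value):
        if key in self.entries:
            del self.entries[key]       # write-through eviction of the old copy
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # LRU eviction
        self.entries[key] = (value, self.clock() + self.ttl)


now = [0]
cache = EvictingCache(capacity=2, ttl_seconds=60, clock=lambda: now[0])

cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                 # "a" becomes the most recently used entry
cache.put("c", 3)              # cache is full: "b" is the LRU entry and is dropped
evicted_by_lru = cache.get("b")

now[0] = 30
cache.put("a", 10)             # newer write replaces the cached copy of "a"

now[0] = 61
expired_by_ttl = cache.get("c")    # stored at t=0 with a 60-second TTL
replaced_value = cache.get("a")    # rewritten at t=30, still within its TTL
```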
DAX does not currently support manually flushing the cache, but AWS has received the request from pilot customers and could add an API in the future.
Access controls, such as policies for read/write access to particular tables, are a primary security consideration for DAX configuration, as with any AWS database service. DAX doesn't have any unique security features, but it inherits those available to Amazon services within a VPC, including control over network access policy, integration with IAM and logging through CloudTrail. AWS also automatically handles network setup between the VPC-hosted DAX cluster and its associated DynamoDB database.
Developers manage DAX through either the AWS Management Console or AWS Command Line Interface. They can also use CloudWatch to monitor the service. Additionally, tags can track usage and spending for applications that use DynamoDB via the cache.
Java is currently the only supported language with a DAX software development kit (SDK), although Amazon says on its AWS Blog that it plans to support access to DAX through other languages. DAX is API-compatible with DynamoDB, but existing applications will need some modification using the Java SDK to access a DAX cluster. AWS documentation provides a simple example of the changes required to enable an application reading from a DynamoDB table to use DAX instead.