It’s a wrap: Key findings from AWS re:Invent 2023
Yet another AWS re:Invent has come and gone and made an indelible mark in the technology sector. There were so many fresh developments that the keynotes barely scratched the surface. In this blog, I cover the ones pertaining to data, analytics and AI — areas that I am most passionate about.
I am a veteran of conferences and am used to the craziness these events bring. But AWS re:Invent takes that craziness to new heights. From the jam-packed Expo Hall to oversubscribed sessions, there were lines rivaling Starbucks' everywhere one turned. But this is also the best place to have chance encounters with your acquaintances.
So, with no further delay, I present the key announcements that stood out for me at Amazon Web Services Inc.’s annual conference. As I have done for other conference announcements, I have not mentioned their beta/preview/general availability status. So, please check the relevant AWS sources for the latest release information.
Compute
AWS continued to showcase its proficiency in silicon. Graviton4 is the fourth generation of its Arm-based processors for cloud workloads in the line's five-year lifespan. AWS claims it is 30% faster than its predecessor, with 96 cores instead of Graviton3's 64. A new instance type, R8g, was announced for memory-intensive workloads to take advantage of it.
AWS also has dedicated processors for training (Trainium) and inference (Inferentia), both now in their second generation. The new Trn2 instance promises to cut the training time of large language models to weeks.
Jensen Huang, chief executive of Nvidia Corp., made yet another keynote appearance and regaled us with massive graphics processing unit farms coming to an AWS data center near you. However, AWS is pursuing a dual path: one with Nvidia and the other using its own chips. For example, it has a migration service for moving PyTorch apps from Nvidia GPUs to its own accelerators. Also, Anthropic CEO Dario Amodei mentioned that his company will use Trainium and Inferentia processors for its AI models.
EC2 Capacity Blocks now allow users to reserve GPUs for periods of one to 14 days. This is great news for people who need GPUs for ephemeral workloads, as the chips have been in short supply. The cost of a Capacity Block is based on supply and demand.
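Because Capacity Block pricing floats with supply and demand, budgeting for one is a simple multiplication once you have a quoted rate. A minimal sketch, with an entirely hypothetical per-instance-hour price (not a published AWS rate):

```python
# Sketch: estimating the cost of an EC2 Capacity Block reservation.
# The per-hour price is dynamic (supply and demand), so the rate below
# is a hypothetical input, not a published AWS price.

def capacity_block_cost(price_per_instance_hour: float,
                        instance_count: int,
                        days: int) -> float:
    """Total cost of reserving GPU instances for a 1- to 14-day block."""
    if not 1 <= days <= 14:
        raise ValueError("Capacity Blocks run from one to 14 days")
    return price_per_instance_hour * instance_count * days * 24

# e.g., four instances for seven days at a hypothetical $32.77/instance-hour
estimate = capacity_block_cost(32.77, 4, 7)
```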
Storage and data security
S3 Express One Zone is the newest storage class. It uses custom hardware to deliver up to 10x faster data access than S3 Standard, with request costs that are 50% lower. The tradeoff is that data lives in a single availability zone, which users can choose for the first time. Over the years, S3 has become the de facto object store, with key capabilities such as strong consistency. Now, with single-digit millisecond access times, and coupled with table formats such as Iceberg, this one announcement can significantly transform and simplify future data infrastructure.
S3 Access Grants is the get-out-of-jail card for IAM roles. Identity and access management policies carry complex authorization logic and are hard to express. S3 Access Grants is a new security model for access to S3 objects that maps identities in directories such as Active Directory to datasets in S3. It has a control plane and a data plane and is part of trusted identity propagation. It simplifies attribute-based access control on S3 permissions by reducing the number of policies required.
Operational databases
Amazon RDS for Db2 is the newest addition to the RDS family, which already includes Oracle, SQL Server, MySQL, PostgreSQL and MariaDB. AWS also added Database Migration Service (DMS) support for Db2 LUW on Linux and Windows and for Db2 on z/OS mainframes. But DMS translates Java stored procedures, not COBOL stored procedures; presumably, one can use IBM's Granite LLM to translate COBOL to Java first. On a side note, Db2 is unique among databases in that all its users are operating system users, not database users.
A continuing trend is making various services serverless. The latest salvo is Amazon ElastiCache Serverless for Redis and Memcached, which promises microsecond response times.
Amazon Aurora is the fastest-growing service in AWS' history. One of the biggest announcements in this category was Aurora Serverless Limitless Database, which automates horizontal sharding while maintaining vertical scaling at the partition level. Aurora has traditionally been a single-writer database, but it can now scale writes significantly.
Zero ETL continues its march. Aurora PostgreSQL to Redshift was introduced last year. Now the following have Zero ETL capabilities to Redshift:
- Aurora MySQL
- RDS PostgreSQL
- DynamoDB
In addition to Redshift, DynamoDB also has Zero ETL capability to Amazon OpenSearch. The latter can search (both lexical and semantic) DynamoDB’s data. AWS has expanded Zero ETL to other non-AWS properties such as Salesforce Data Cloud.
Although the feature is called Zero ETL, it is really just "EL": data lands in Redshift with minimal transformation. However, once the data is in Redshift, performance features such as AutoMV can monitor query patterns and automatically create materialized views with incremental refresh.
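The phrase "incremental refresh" is worth unpacking: instead of recomputing an aggregate from scratch, the view is updated with only the newly arrived rows. A pure-Python illustration of the idea (Redshift's AutoMV does this automatically for SQL materialized views; the class below is just a conceptual stand-in):

```python
# Sketch of incremental materialized-view refresh: apply only the delta
# of new rows to an existing aggregate instead of recomputing it all.

class SumByKeyView:
    """A materialized SUM(amount) GROUP BY key, refreshed incrementally."""
    def __init__(self):
        self.totals: dict[str, int] = {}

    def apply_increment(self, new_rows):
        # Only the newly arrived rows are processed; existing totals are kept.
        for key, amount in new_rows:
            self.totals[key] = self.totals.get(key, 0) + amount

view = SumByKeyView()
view.apply_increment([("us-east", 10), ("eu-west", 5)])   # initial load
view.apply_increment([("us-east", 7)])                    # incremental refresh
```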
Analytical database
Redshift Serverless now runs on Graviton processors. It has new settings to prevent runaway queries, a workload slider to choose between cost and performance, and the ability to sort on query predicates. Redshift Serverless AI Optimizations further automate workload management. Redshift spins up a warm pool of nodes for concurrency scaling, and it has now added two new scaling dimensions: data volume and query complexity.
Redshift ML does not get vector support as yet, but it can now create a user-defined function that calls SageMaker JumpStart, a machine learning hub of foundation models, with built-in functions to perform tasks such as summarization, translation and sentiment analysis.
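The pattern here is a SQL UDF that delegates to an external model endpoint. A minimal sketch of that idea using Python's sqlite3 module, with a toy keyword scorer standing in for a hosted SageMaker model (the function and table names are hypothetical, and nothing here touches Redshift or SageMaker):

```python
# The UDF-delegates-to-a-model pattern, sketched with sqlite3.
# fake_sentiment() is a stand-in for a call to a hosted model endpoint.
import sqlite3

def fake_sentiment(text: str) -> str:
    """Toy stand-in for a hosted sentiment-analysis model."""
    return "positive" if "great" in text.lower() else "negative"

conn = sqlite3.connect(":memory:")
conn.create_function("sentiment", 1, fake_sentiment)   # register the UDF
conn.execute("CREATE TABLE reviews (body TEXT)")
conn.executemany("INSERT INTO reviews VALUES (?)",
                 [("The keynote was great!",), ("Long lines again.",)])

# SQL can now call the model per row, just as Redshift ML UDFs do.
rows = conn.execute("SELECT body, sentiment(body) FROM reviews").fetchall()
```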
Redshift Data Sharing now supports writes, besides reads. It can share third-party data via its integration with AWS Data Exchange.
Amazon Neptune Analytics is AWS' latest analytics database engine, enhancing its property graph and RDF graph use cases. Neptune's traditional use cases were customer 360 identity, fraud and security graphs, but knowledge graphs are now also being leveraged for generative AI applications and vector search. It has integrations with tools such as LangChain. Neptune supports three query languages (openCypher, Gremlin and SPARQL) but not yet the upcoming standard, Graph Query Language, or GQL.
AI: LLMs and vectors
Amazon SageMaker is used to build and deploy gen AI apps. SageMaker HyperPod distributes training across parallel workers to speed up model training, and it adds features such as automatic checkpointing and failover.
The coolest word in AI is optionality. Hugging Face, at the time of writing, hosts 420,000 models! Amazon Bedrock provides a range of models from partners such as Anthropic (Claude 2.1), Cohere and Meta, besides its own Titan family of models. The Titan family has grown to become multimodal, and AWS also introduced an image generation model with an invisible, tamper-resistant watermark. Bedrock Model Evaluation helps with selecting the right model for your workload.
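Model selection usually comes down to trading quality against cost. A sketch of the decision Bedrock Model Evaluation informs: score candidates on your task, then pick the cheapest one that clears a quality bar. The model names, scores and prices below are all hypothetical:

```python
# Sketch: pick the cheapest model that meets a task-quality threshold.
# All names and numbers are hypothetical illustrations, not real quotes.

candidates = {
    "model-a": {"task_score": 0.91, "usd_per_1k_tokens": 0.011},
    "model-b": {"task_score": 0.88, "usd_per_1k_tokens": 0.003},
    "model-c": {"task_score": 0.79, "usd_per_1k_tokens": 0.001},
}

def pick_model(candidates: dict, min_score: float) -> str:
    eligible = {name: m for name, m in candidates.items()
                if m["task_score"] >= min_score}
    if not eligible:
        raise ValueError("no model meets the quality bar")
    # Among models that are good enough, take the cheapest.
    return min(eligible, key=lambda n: eligible[n]["usd_per_1k_tokens"])

choice = pick_model(candidates, min_score=0.85)
```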
Knowledge Bases for Bedrock is used for retrieval-augmented generation, or RAG. It fetches, chunks and embeds data from Amazon S3. The embeddings can be stored in your choice of Amazon database, such as Aurora, or externally in Pinecone, Redis and, soon, MongoDB Atlas Vector Search.
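A minimal sketch of the fetch-chunk-embed-retrieve loop that Knowledge Bases automates. The bag-of-words "embedding" below is a toy stand-in for a real embedding model such as Titan Embeddings, and nothing here calls the actual Bedrock API:

```python
# Toy RAG retrieval: chunk a document, embed each chunk, return the
# chunk nearest the query by cosine similarity.
import math

def chunk(text: str, size: int = 8) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> dict[str, int]:
    """Toy bag-of-words 'embedding' (stand-in for a real model)."""
    vec: dict[str, int] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

doc = ("Graviton4 has 96 cores and is 30% faster. "
       "S3 Express One Zone delivers single-digit millisecond access.")
index = [(c, embed(c)) for c in chunk(doc)]      # the "vector store"

query = embed("how many cores does Graviton4 have")
best = max(index, key=lambda pair: cosine(query, pair[1]))[0]
```

A production system would swap `embed` for a real embedding model and the list for a vector database; the control flow stays the same.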
Where are vector embeddings stored in AWS? Thus far:
- RDS and Aurora PostgreSQL editions: They benefit from the open-source pgvector extension. RDS Optimized Reads allows the use of local NVMe-based SSDs in lieu of EBS to store vector embeddings, which AWS says accelerates vector searches by 20%.
- DocumentDB: This is the MongoDB-compatible document store.
- OpenSearch: This is the open-source fork of Elasticsearch.
- MemoryDB for Redis: This in-memory key-value store provides millisecond access.
- Amazon Neptune: This is a graph database.
So far there’s no vector support for MySQL RDS and Aurora.
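For the PostgreSQL-based options, pgvector's `<->` operator is what does the work: it computes Euclidean (L2) distance between a stored embedding and the query vector, so `ORDER BY embedding <-> query LIMIT 1` returns the nearest neighbor. A local re-implementation of that computation, with hypothetical document IDs and vectors:

```python
# What a pgvector nearest-neighbor query computes, re-implemented
# locally. In Aurora/RDS PostgreSQL the SQL would be roughly:
#   SELECT id FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 1;
# where <-> is pgvector's Euclidean (L2) distance operator.
import math

def l2_distance(a, b) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical stored embeddings keyed by document ID.
items = {"doc-1": [3.0, 1.0, 2.1], "doc-2": [9.0, 9.0, 9.0]}
query = [3.0, 1.0, 2.0]

nearest = min(items, key=lambda k: l2_distance(items[k], query))
```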
AI: applications
Amazon Q will probably go down as historically the most significant announcement. Following in the footsteps of Google's Duet AI and Microsoft's Copilot, Q is a natural language assistant that is being embedded across AWS' entire portfolio, from Redshift to AWS Glue to applications such as Amazon Connect for contact centers. Q has long existed inside Amazon's BI tool, QuickSight; now, powered by generative AI, it has spread its wings.
Amazon Q connects to about 40 sources, indexing data and capturing semantics as vectors. It supports sources such as Google Drive, Gmail, Slack, Amazon S3 and Office 365. Using agents, it can open tickets in tools such as Jira and ServiceNow.
The new service can dramatically change the nature of our work. For example, we can ask it to write SQL queries in Redshift or ETL pipelines in Glue. AWS demonstrated how it used Q to migrate 1,000 Java 8 apps to Java 17 in just two days, and it announced that Q will soon have .NET-to-Linux capabilities.
Amazon Q comes in two flavors: business ($20/month) and builder ($25/month). That undercuts Duet AI and Copilot, which cost $30/month.
AWS CodeWhisperer also writes code but, unlike Amazon Q, it lacks Q's organizational context and its security, privacy and IP-safeguarding features. It is still used to write code for services such as DynamoDB.
PartyRock reveals Amazon's fun side: an interactive playground for creating generative AI applications. No AWS account is required, and AWS claims tens of thousands of apps have already been created.
Governance and operations
Amazon DataZone is used to discover, catalog and govern data across AWS, on-premises and third-party sources. Now, the Amazon Titan model automatically creates business descriptions of datasets.
CloudWatch now has an API that eases observability workflows.
Operations can be further simplified as Amazon Q can pick the right instance type, troubleshoot network issues, create policies and firewall rules, and automatically open tickets in Jira and ServiceNow.
Conclusion
An estimated 65,000 people attended the event in Las Vegas from Nov. 27 to Dec. 1. The keynote speakers were informative and inspiring. The Analyst Summit had almost 150 of us analysts and was the most professionally run to date. Although we went nonstop with sessions and one-on-ones, we still could not cover the gamut of announcements.
AWS' maturity is evident in two areas. First, it is no longer releasing a torrent of new services, choosing instead to focus on improving performance, cost, ease of use and reliability. Second, it is embracing the wider external ecosystem via connectors and integrations. On the flip side, its messaging is starting to make more aggressive direct comparisons with the competition. Interestingly, while AWS is starting to support on-premises environments and other clouds, it still shuns the term multicloud!
Finally, I want to summarize the different patterns I learned for performing generative AI tasks:
- Natively store vector embeddings that a gen AI app can use, as in Aurora PostgreSQL.
- Replicate data to a service that provides vector search, such as DynamoDB to OpenSearch.
- Embed a user defined function to perform the needed task in an external service, such as Amazon Redshift ML and SageMaker integration.
- Call an application that performs all the gen AI lifecycle tasks, such as Amazon Q and Knowledge Bases for Bedrock.
My wish for the next re:Invent is to see a unification of the various data stores and the data catalog. For a conversational bot to deliver trusted responses, it needs to be able to see all the places where a particular piece of data is stored and map it to a data catalog that has the business glossary and a semantic layer.
Sanjeev Mohan is an established thought leader in the areas of cloud, modern data architectures, analytics and AI. He researches and advises on changing trends and technologies and is the author of "Data Product for Dummies." Until recently, he was a Gartner vice president known for his prolific and detailed research, while setting the research direction for data and analytics. He has been a principal at SanjMo for more than two years, providing technical advisory to elevate category and brand awareness. He has helped several clients in areas like data governance, generative AI, DataOps, data products and observability.
Photo: Robert Hof/SiliconANGLE