A review of re:Invent 2023
by Craig Tongue, Cloud Platform Architect here at Cloudscaler
Many people are calling this the “GenAI re:Invent”, and it certainly feels like a watershed moment. Even with the somewhat wobbly launch of Amazon Q, it is hard not to notice the number of new products, features and enhancements all built on large language foundation models.
In this review of re:Invent 2023 I want to summarise the announcements that caught my eye and that I feel are worth looking into. I have broken them into two broad areas: those providing AI/ML-based tooling as a service, and those that help you build your own services using AI and ML techniques.
AI as a Service
The elephant in the room here is clearly Q. This boldly named product is really an umbrella term for many different services. While a little late to the GenAI party, AWS have spent the time integrating Q across their entire landscape.
Amazon Q is integrated with the console to provide recommendations on architecture patterns and even EC2 instance sizes, and can help you navigate the encyclopaedia of documentation. It also appears on error pages to help debug problems, and can even create tickets, suggest fixes and raise merge requests.
It also works with CodeCatalyst, integrates with QuickSight to create dashboards and narrative summaries, and plugs into Connect to act as a virtual assistant for contact centre workers and chatbots.
Only time will tell whether the quality of the outputs matches the Claudes and ChatGPTs of this world, but having it integrated into nearly every interaction channel within AWS certainly takes the burden away, and allows powerful, frictionless and context-driven assistance where it is needed most.
Sneaking somewhat under the radar (at least for me) were some of the announcements on CodeWhisperer. I encourage everyone to watch the demo with the indefatigable Rory Richardson (of Lambda and serverless fame) and Doug Seven. They were left somewhat high and dry by the audience, who I don’t think caught any of the applause cues or Queen references (I counted ten in the slide titles alone).
However, the content and demos were really what made it for me. The additional language support (CloudFormation, Terraform and even shell scripting, among others), coupled with Reference Tracking (enabling the attribution or avoidance of licensed code) and Security Scanning to identify bugs before they even make it into a commit, were the big developments for me. That, plus never having to remember bash commands for complicated greps or git repo fu, is really going to make building more fun.
AI Tools
Moving on to building AI-based solutions, the announcements fell into three broad areas. SageMaker and Bedrock obviously have the lion’s share of new functionality, and I will summarise my key points on those below. But first: scattered across the keynotes were several Zero-ETL integrations between common data components that will make an enormous difference to people in the data pipeline and prototyping spaces. Nothing grinds a project to a halt more than slow and painful data integrations, where you map _id to Id and work through a hundred other minute schema differences just to get one service talking to another, and nowhere is this more impactful than in the GenAI space, where data is king.
OpenSearch (popular for vectorised data like documents) is now talking directly to S3 and DynamoDB. A similar batch of integrations landed for Redshift (RDS, Aurora, DynamoDB), and Athena now understands both CloudTrail and AWS’s contact centre product, Connect. With more to follow, the true point-and-click data pipeline dream is not far away, and I cannot wait for the death of ETL.
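To make that pain concrete, here is a purely illustrative sketch (every field name below is invented) of the kind of glue code these Zero-ETL integrations should make redundant:

```python
# Hypothetical glue code: reshaping records from one service's schema
# into another's before loading. Zero ETL aims to remove this entirely.
FIELD_MAP = {
    "_id": "Id",
    "created_ts": "CreatedAt",
    "cust_email": "CustomerEmail",
    # ...and a hundred other minute differences
}

def remap_record(record: dict) -> dict:
    """Rename fields so the target service can ingest the record."""
    return {FIELD_MAP.get(key, key): value for key, value in record.items()}

print(remap_record({"_id": "42", "created_ts": "2023-11-28T09:00:00Z"}))
# {'Id': '42', 'CreatedAt': '2023-11-28T09:00:00Z'}
```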
For Bedrock, the announcements fit into two clear categories: access to new versions of foundation models (Stable Diffusion XL 1.0, Llama 2, Claude 2.1 and, of course, AWS’ own Titan suite), and tools to help you assess and deploy them. The Model Evaluation tooling will help everyone going through the same problem of picking and choosing between the different model families and sizes; the landscape of options and optimisations seems to be changing daily.
With the goalposts moving so fast (I was working with some data scientists who treat the horizon as three months away; beyond that, the landscape will have moved so far that they start again from scratch), tactical and strategic solutions are a thing of the past as change becomes the only constant. Having tooling to automate the assessment (and re-assessment) of these models in a hands-off way will help a lot with churn and sunk-cost bias. The batch inference API will help with testing the models, while Audit Manager will help with governance, ensuring your usage is compliant with your policy (particularly relevant for AI, with policies around sensitive data, bias and other pre- or post-processing checks).
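The batch inference API has its own job-based interface, but the shape of a single Bedrock invocation gives a feel for what any evaluation tooling is wrapping. Here is a minimal sketch using boto3, assuming the Claude 2.1 model ID and Anthropic’s text-completion request format as documented at the time; check the current docs before relying on the details:

```python
import json
import boto3

# Sketch: one on-demand invocation of Claude 2.1 via Bedrock's runtime API.
# The region, model ID and body format are assumptions from contemporary docs.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

body = json.dumps({
    "prompt": "\n\nHuman: Summarise the Bedrock announcements from re:Invent 2023.\n\nAssistant:",
    "max_tokens_to_sample": 300,
})

response = client.invoke_model(
    modelId="anthropic.claude-v2:1",
    contentType="application/json",
    accept="application/json",
    body=body,
)

print(json.loads(response["body"].read())["completion"])
```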
Another innocuous-sounding announcement was SageMaker HyperPod. Think of this as a Kubernetes-style cluster for model training. There is a great description of how it works in the talk by Bratin Saha (VP of AI/ML), but making the enormous process of training models fault-tolerant and cost-sensitive is a huge win for anyone (like me) who has come back to find a multi-day process crashed and all the effort and expense lost. We no longer have to go agane.
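For anyone who has not felt that pain, the pattern HyperPod automates at cluster scale is essentially checkpoint-and-resume. A toy sketch (all names invented, and the “training” is a stand-in):

```python
import os
import pickle

# Hand-rolled checkpoint-and-resume: the pattern HyperPod automates at scale.
CHECKPOINT_PATH = "model_checkpoint.pkl"

def train(total_steps: int = 1000, save_every: int = 100) -> dict:
    state = {"step": 0, "weights": [0.0] * 4}  # stand-in for real model state

    # Resume from the last checkpoint if a previous run crashed.
    if os.path.exists(CHECKPOINT_PATH):
        with open(CHECKPOINT_PATH, "rb") as f:
            state = pickle.load(f)

    while state["step"] < total_steps:
        state["step"] += 1
        state["weights"] = [w + 0.01 for w in state["weights"]]  # fake update

        if state["step"] % save_every == 0:
            with open(CHECKPOINT_PATH, "wb") as f:
                pickle.dump(state, f)  # progress survives a crash

    return state

print(train()["step"])  # picks up where the last run left off
```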
Finally, onto SageMaker. This platform was already mature, so the big announcements were almost all focused on user and developer experience. For business users there is the no-code Canvas launch; for developers, both the Studio enhancements and the JupyterLab integration will be helpful, along with the new onboarding and governance experience that allows guardrails to be applied and enforced more easily. Data scientists will love being able to create their pipelines in Python and attach EFS volumes directly, along with support for even bigger models using the new Deep Learning Container running NVIDIA’s LLM libraries. None of these are ground-breaking, but they all contribute to the maturity of the platform.
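The Python pipeline authoring is worth a quick illustration. Here is a minimal sketch based on the @step decorator in the SageMaker Python SDK; treat the exact parameters and instance type as assumptions and check the SDK docs for the current signature:

```python
from sagemaker.workflow.function_step import step
from sagemaker.workflow.pipeline import Pipeline

# Each decorated function runs as its own SageMaker job; the instance_type
# value here is an assumed, typical choice.
@step(instance_type="ml.m5.large")
def preprocess() -> list:
    return [1.0, 2.0, 3.0]

@step(instance_type="ml.m5.large")
def train(data: list) -> float:
    return sum(data) / len(data)

# Wiring one step's output into the next defines the pipeline DAG.
pipeline = Pipeline(name="plain-python-pipeline", steps=[train(preprocess())])
# pipeline.upsert(role_arn="arn:aws:iam::123456789012:role/SageMakerRole")
# pipeline.start()
```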
For me, the biggest shift overall wasn’t necessarily in the tech, the models or the integrations, but in the paradigm. Using GenAI to solve problems, both as an individual and as an organisation, is now truly pervasive and here to stay, and the tools are maturing rapidly. The question has moved from “should I?” to “how should I?”. This will leave a lot of people running to catch up, both in their own workflows and across their businesses.
If you’re undertaking a transformation programme based on GenAI, you may also be interested in our earlier blog ‘Can your cloud platform cope with Generative AI?’