Notezio
/ AWS Certified AI Practitioner (AIF-C01)

Amazon SageMaker

Overview

Built-in ML Algorithms

Automatic Model Tuning (AMT)

Model Deployment and Inference

SageMaker Model Deployment Comparison

| Feature | Real-Time Inference | Serverless Inference | Asynchronous Inference | Batch Transform |
| --- | --- | --- | --- | --- |
| Latency | Low (milliseconds to seconds) | Low (milliseconds to seconds) | Medium to high (near real-time) | High (minutes to hours) |
| Max payload | Up to 6 MB | Up to 4 MB | Up to 1 GB | Up to 100 MB per invocation (per mini-batch) |
| Timeout | 60 seconds | 60 seconds | Up to 1 hour | Up to 1 hour |
| Example use case | Fast, near-instant predictions for web and mobile apps. Example: online fraud detection, scoring live credit card transactions in milliseconds. | Sporadic, short-lived inference with no infrastructure to manage. Example: a customer support bot handling unpredictable chat volume during product launches. | Large payloads and workloads that need longer processing times. Example: medical imaging analysis of large, high-resolution MRI scans or video files for diagnosis. | Bulk processing of large datasets. Example: e-commerce analytics, such as weekly churn prediction for 1M+ customers or daily product recommendations for an entire user base. |
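The limits in the comparison can be turned into a quick decision aid. The sketch below is only illustrative: it hard-codes the payload and timeout figures from the table above, and the names `Option`, `OPTIONS`, and `candidate_options` are hypothetical helpers, not part of any AWS SDK. It ignores cost, traffic pattern, and cold starts, so treat the result as a shortlist rather than a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One deployment option, with limits copied from the comparison table."""
    name: str
    max_payload_mb: float   # largest request payload, in MB
    max_seconds: int        # longest allowed processing time per request
    interactive: bool       # True when the caller waits for the response


# Figures taken from the table above (1 GB treated as 1024 MB).
OPTIONS = [
    Option("Serverless Inference", 4, 60, True),
    Option("Real-Time Inference", 6, 60, True),
    Option("Asynchronous Inference", 1024, 3600, False),
    Option("Batch Transform", 100, 3600, False),
]


def candidate_options(payload_mb, processing_seconds, needs_immediate_response):
    """Return the names of options whose limits fit the workload.

    An option is kept when the payload and processing time are within its
    limits and, if the caller needs an immediate answer, it is interactive.
    """
    return [
        o.name
        for o in OPTIONS
        if payload_mb <= o.max_payload_mb
        and processing_seconds <= o.max_seconds
        and (o.interactive or not needs_immediate_response)
    ]


if __name__ == "__main__":
    # Small fraud-scoring request that must answer right away.
    print(candidate_options(1, 0.2, True))
    # Large MRI scan that takes about 20 minutes to process.
    print(candidate_options(500, 1200, False))
```

A workload that fits several options (for example a 1 MB request needing an immediate answer) still requires a judgment call: steady traffic usually favors Real-Time Inference, while sporadic traffic favors Serverless Inference, as the example use cases in the table suggest.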

SageMaker Studio

Data Wrangler

ML Features

SageMaker Feature Store

SageMaker Clarify

Model Explainability

SageMaker Ground Truth

ML Governance

SageMaker Model Dashboards

SageMaker Model Monitor

SageMaker Model Registry

SageMaker Pipelines

Pipeline Structure

SageMaker JumpStart

Model Fine-Tuning with JumpStart

SageMaker Canvas

MLFlow for Amazon SageMaker

Summary