A complete end-to-end news classification system using DistilBERT, SageMaker, Lambda, and API Gateway.
## Architecture

```
Client → API Gateway → Lambda → SageMaker Endpoint → DistilBERT Model
```
## Components

- **Training:** `script.py` - Fine-tunes DistilBERT on news data
- **Inference:** `inference.py` - Handles model loading and prediction
- **Deployment:** `Deployment.ipynb` - Deploys model to SageMaker endpoint
- **Lambda:** `aws-lambda-llm-endpoint-invoke-function.py` - API handler
- **API Gateway:** `template.yaml` - REST API infrastructure
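The Lambda layer's job is to forward the API Gateway request body to the SageMaker endpoint and return the model's JSON response. A minimal sketch of that flow is below; the `SAGEMAKER_ENDPOINT_NAME` environment variable, the payload shape, and the helper names are illustrative assumptions, not necessarily what `aws-lambda-llm-endpoint-invoke-function.py` actually does.

```python
import json
import os


def build_payload(body_json: str) -> str:
    """Extract the `query` object from the API Gateway request body and
    re-serialize it as the JSON payload sent to the SageMaker endpoint."""
    body = json.loads(body_json)
    return json.dumps({"query": body["query"]})


def lambda_handler(event, context):
    # boto3 is imported lazily here so the parsing helper above can be
    # exercised without AWS credentials; a real handler usually creates
    # the client once at module load time.
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        # Reading the endpoint name from an environment variable is an
        # assumption; the actual handler may hard-code it.
        EndpointName=os.environ["SAGEMAKER_ENDPOINT_NAME"],
        ContentType="application/json",
        Body=build_payload(event["body"]),
    )
    result = json.loads(response["Body"].read().decode("utf-8"))
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```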
## Prerequisites

- AWS CLI configured (`aws configure`)
- AWS SAM CLI installed (`pip install aws-sam-cli`)
- SageMaker endpoint deployed and running
## Quick Start

1. **Deploy the SageMaker model** (if not already done): run the `Deployment.ipynb` notebook.

2. **Deploy API Gateway + Lambda:**

   ```bash
   ./deploy.sh
   ```

3. **Test the API:**

   ```bash
   # Get the API URL from the deployment output, then:
   python test_api.py <API_URL>

   # Or test a specific headline:
   python test_api.py <API_URL> "Stock market crashes due to inflation"
   ```
## API Usage

**Endpoint:**

```
POST https://{api-id}.execute-api.{region}.amazonaws.com/prod/classify
```

**Request body:**

```json
{
  "query": {
    "headline": "Scientists discover new treatment for cancer"
  }
}
```

**Response:**

```json
{
  "predicted_label": "Health",
  "probabilities": [[0.05, 0.10, 0.05, 0.80]]
}
```

## Classification Labels

- Business (index 0)
- Science (index 1)
- Entertainment (index 2)
- Health (index 3)
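The `probabilities` field lines up with the label indices above, so the predicted label is the argmax of the row. A small sketch of that mapping (the `top_label` helper is illustrative, not part of this repo):

```python
# Labels in index order, matching the model's output head (per this README).
LABELS = ["Business", "Science", "Entertainment", "Health"]


def top_label(probabilities):
    """Map the endpoint's probability matrix to a human-readable label.

    `probabilities` is a list of rows, one per input headline, as returned
    in the response JSON shown above.
    """
    row = probabilities[0]
    best = max(range(len(row)), key=lambda i: row[i])
    return LABELS[best], row[best]


# Using the example response above:
label, score = top_label([[0.05, 0.10, 0.05, 0.80]])
# → ("Health", 0.80)
```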
## Example: cURL

```bash
curl -X POST https://your-api-url/prod/classify \
  -H "Content-Type: application/json" \
  -d '{"query": {"headline": "New vaccine shows 95% effectiveness"}}'
```

## Files

- `script.py` - Training script for SageMaker
- `inference.py` - Model inference logic
- `Deployment.ipynb` - SageMaker deployment notebook
- `aws-lambda-llm-endpoint-invoke-function.py` - Lambda function
- `template.yaml` - SAM infrastructure template
- `deploy.sh` - Deployment script
- `test_api.py` - API testing script
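The same call can be made from Python with only the standard library; this is a hedged sketch (`build_request` and `classify` are illustrative helpers, not the contents of `test_api.py`):

```python
import json
import urllib.request


def build_request(api_url: str, headline: str) -> urllib.request.Request:
    """Build the POST request for the /classify endpoint."""
    payload = json.dumps({"query": {"headline": headline}}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(api_url: str, headline: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(api_url, headline)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```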
## Troubleshooting

- **Lambda timeout:** increase the function timeout in `template.yaml`
- **Permissions error:** check that the Lambda IAM role allows invoking the SageMaker endpoint
- **Endpoint not found:** verify the SageMaker endpoint name matches the one the Lambda function invokes
- **CORS issues:** API Gateway CORS is pre-configured
## Costs

- **SageMaker endpoint:** runs continuously (~$100-200/month for ml.m5.xlarge)
- **Lambda:** pay per request (~$0.0000002 per request)
- **API Gateway:** pay per request (~$0.0000035 per request)

Consider using SageMaker Serverless Inference for lower costs with variable traffic.
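Switching to Serverless Inference mainly means replacing the instance settings in the endpoint config with a `ServerlessConfig` block. A sketch of that production variant is below; the model name and the memory/concurrency values are illustrative assumptions, not tuned recommendations:

```python
# Hypothetical serverless production variant: the ServerlessConfig block
# replaces the InstanceType/InitialInstanceCount fields used for the
# always-on ml.m5.xlarge variant.
serverless_variant = {
    "VariantName": "AllTraffic",
    "ModelName": "news-classifier-model",  # assumed model name
    "ServerlessConfig": {
        "MemorySizeInMB": 4096,  # allowed values: 1024-6144, in 1 GB steps
        "MaxConcurrency": 5,     # concurrent invocations before throttling
    },
}

# This dict would be passed as ProductionVariants=[serverless_variant] to
# boto3.client("sagemaker").create_endpoint_config(...); you pay per
# invocation instead of per instance-hour.
```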