Trying out serverless in 2020
2018 and 2019 were the years when Kubernetes and Serverless became mainstream. 2020 is when we tried Serverless and Microservices in production.
- Kubernetes graduated from CNCF in 2018
- Kubeless FAAS on Kubernetes
- AWS Lambda now supports Java 11
- AWS API Gateway v2 for HTTP (replacement of ALB+Lambda combo)
- AWS Lambda Destinations and Asynchronous Invocation Improvements
- Announcing Serverless CI/CD (2020)
- AWS Native SAM alternative to serverless.com
- AWS Lambda Now Supports Custom Runtimes
- PHP goes serverless with Laravel Vapor in 2019
- The AWS Serverless Application Repository
- AWS App Mesh Service Discovery
- CNCF Cloud Native Interactive Landscape
- AWS Database Migration Service got more mature
- Go finally released modules support in 1.13
- AOT JVM-based micro-frameworks for Serverless started to gain strength, e.g. Micronaut or Quarkus
- Improved VPC networking for AWS Lambda functions
- AWS Savings Plan Update: Save Up to 17% On Your Lambda …
Part 1 - Legacy
We run our backend on PHP 7.3 and upgrade to the new version of PHP every year. This way we stay compatible with the latest releases, but we do not rewrite the whole codebase for each new version.
- Backend is a SOA app more than 5 years old.
- Runs on AWS in Docker.
- Has internal services (modules) that communicate via RPC
- Codebase and features are stable
- Backend has thousands of users, with hundreds of requests per second
Part 3 - Dream
We had wanted to try serverless/microservices since summer 2019; as a small dev team, we saw clear benefits for us (and tradeoffs). In November 2019 a JIRA ticket named “Implement microservice X” made it into the sprint.
High level wish list
- Replace legacy PHP internal service
- Try serverless
- Expose microservice as REST API
- Expose autogenerated documentation
- Try new language/runtime (Java, Kotlin, Go, Python … PHP runs in Kubeless btw.)
- Performance monitoring, increased security/access patterns, create reusable stack for future (micro)services
- Console/cli deployment, turnkey solution with only AWS key & secret
- Try DynamoDB on big scale
Part 4 - Beginning
- Use AWS Lambda
- Benchmarks of cold starts across various runtimes showed good results for Golang and the JVM; we selected the JVM (Java 8)
- Among JVM frameworks we found Micronaut to be really similar to Spring Boot; we used the standard AWS SDK for Java (com.amazonaws:aws-java-sdk-s3:1.11.500)
- We used Java 8 because AWS SDK 2 was not yet stable/released.
- We used the Serverless.com framework, as recommended by another front-end team that had experience with it.
- AWS SAM lacked features we needed (ALB setup out of the box).
- PHP service to be replaced with similar REST API
- RDS database replaced with DynamoDB serverless database
- Micronaut supports generating Swagger documentation (OpenAPI)
- Kotlin with Micronaut
- Monitoring by AWS CloudWatch
- Service scales “ondemand” from 0 to “alot”
- Serverless.com framework only needs AWS Key&Secret to deploy; the app build runs inside a multi-stage Docker build using Corretto.
- Code and deployment hosted in separate git repository (polyrepo)
Part 5 - Reality (spoilers)
- 😒 Kotlin/Micronaut is “large”: a 23.5MB JAR file that uses 230MB of memory (AWS CloudWatch)
- 😒 Kotlin lacked documentation for serverless
- Should have written Kotlin tests from the beginning
- 😥 AWS DMS migration (RDS->DynamoDB) took us 2 weeks (!!!) of trial and error to figure out; there is very little information on debugging DMS migrations, and there are things you will not encounter on small/test datasets but that WILL happen on big production migrations
- 😕 Limited information on how to choose DMS task configuration (defaults not optimized)
- 😕 DMS is slow (defaults not optimized)
- 😕 DMS configuration is really specific to source and destination (Oracle->PostgreSQL, MySQL->DynamoDB etc.); every combination has its own options
- 😕 DMS fine tuning is done via editing JSON configuration (not available via UI)
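Since the tuning happens in a JSON document rather than in the UI, here is a minimal sketch of how dotted setting names (such as the task settings listed under "Migration to DynamoDB") could be turned into nested JSON. It assumes each dotted prefix corresponds to a top-level section of the task settings document; the class and method names (`DmsTaskSettings`, `nest`, `toJson`, `settingsFromPost`) are hypothetical helpers, not part of any AWS SDK, and the code is plain Java 8 with no dependencies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not an AWS API, just an illustration of the JSON shape.
class DmsTaskSettings {

    /** Groups "Section.Key" entries into {"Section": {"Key": value}}, keeping insertion order. */
    static Map<String, Map<String, Object>> nest(Map<String, Object> flat) {
        Map<String, Map<String, Object>> nested = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : flat.entrySet()) {
            String[] parts = entry.getKey().split("\\.", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected Section.Key, got: " + entry.getKey());
            }
            nested.computeIfAbsent(parts[0], k -> new LinkedHashMap<>())
                  .put(parts[1], entry.getValue());
        }
        return nested;
    }

    /** Renders the nested map as JSON. Only numbers and booleans are handled (no string quoting). */
    static String toJson(Map<String, Map<String, Object>> nested) {
        StringBuilder json = new StringBuilder("{\n");
        int sectionsLeft = nested.size();
        for (Map.Entry<String, Map<String, Object>> section : nested.entrySet()) {
            json.append("  \"").append(section.getKey()).append("\": {\n");
            int keysLeft = section.getValue().size();
            for (Map.Entry<String, Object> setting : section.getValue().entrySet()) {
                json.append("    \"").append(setting.getKey()).append("\": ").append(setting.getValue());
                json.append(--keysLeft > 0 ? ",\n" : "\n");
            }
            json.append(--sectionsLeft > 0 ? "  },\n" : "  }\n");
        }
        return json.append("}").toString();
    }

    /** The values quoted in this post; they worked for our migration and are not general advice. */
    static Map<String, Object> settingsFromPost() {
        Map<String, Object> flat = new LinkedHashMap<>();
        flat.put("TargetMetadata.ParallelLoadThreads", 24);
        flat.put("TargetMetadata.BatchApplyEnabled", false);
        flat.put("FullLoadSettings.TransactionConsistencyTimeout", 300);
        flat.put("FullLoadSettings.CommitRate", 20000);
        flat.put("StreamBufferSettings.StreamBufferCount", 12);
        flat.put("StreamBufferSettings.StreamBufferSizeInMB", 40);
        flat.put("StreamBufferSettings.CtrlStreamBufferSizeInMB", 8);
        flat.put("ChangeProcessingTuning.MinTransactionSize", 1000);
        flat.put("ChangeProcessingTuning.CommitTimeout", 5);
        flat.put("ChangeProcessingTuning.MemoryLimitTotal", 1024);
        flat.put("ChangeProcessingTuning.MemoryKeepTime", 60);
        flat.put("ChangeProcessingTuning.StatementCacheSize", 50);
        return flat;
    }

    public static void main(String[] args) {
        System.out.println(toJson(nest(settingsFromPost())));
    }
}
```

Running `main` prints a JSON object with four sections (`TargetMetadata`, `FullLoadSettings`, `StreamBufferSettings`, `ChangeProcessingTuning`) that can be compared against the settings of an existing task before editing it.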
Migration to DynamoDB
- Migration and Migration & Replication are NOT the same; Replication has hidden issues as of 01.2020
- Only 7000…8000 write capacity units with the biggest migration instance and concurrent processes; migration was stable at 4000 capacity units, otherwise errors
- DMS instance CPU bottleneck
- 😥
    Last Error   Task 'MYJHGWWJB54WQBZ76XXXXXXX' was suspended due to 6 successive unexpected failures
    Stop Reason  FATAL_ERROR
    Error Level  FATAL
- StackOverflow tuning has little effect
unloadTimeout ForceUnloadTimeout
- MySQL access permissions needed
    #GRANT REPLICATION CLIENT ...
    REPLICATION CLIENT – This privilege is required for change data capture (CDC) tasks only.
    REPLICATION SLAVE – This privilege is required for change data capture (CDC) tasks only.
    SUPER – This privilege is required only in MySQL versions before 5.6.6.
- AWS Magic required for RDS binlogs for production-size DMS migrations
    call mysql.rds_show_configuration;
    call mysql.rds_set_configuration('binlog retention hours', 24);
- Task settings that worked for us (recommended for an SQL->DynamoDB task); this is mostly poorly documented low-level black magic we had to exercise
    TargetMetadata.ParallelLoadThreads: 24
    TargetMetadata.BatchApplyEnabled: false
    FullLoadSettings.TransactionConsistencyTimeout: 300
    FullLoadSettings.CommitRate: 20000
    StreamBufferSettings.StreamBufferCount: 12
    StreamBufferSettings.StreamBufferSizeInMB: 40
    StreamBufferSettings.CtrlStreamBufferSizeInMB: 8
    ChangeProcessingTuning.MinTransactionSize: 1000
    ChangeProcessingTuning.CommitTimeout: 5
    ChangeProcessingTuning.MemoryLimitTotal: 1024
    ChangeProcessingTuning.MemoryKeepTime: 60
    ChangeProcessingTuning.StatementCacheSize: 50
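To get a feel for what the write-capacity ceiling mentioned in this list means for a full load, here is a rough back-of-the-envelope sketch. It assumes the usual DynamoDB accounting where one write capacity unit covers one standard write per second of an item up to 1 KB, with larger items rounded up per KB. The item count and item size in `main` are hypothetical (the post does not state the dataset size), and `DmsLoadEstimate` is an illustrative name. The estimate ignores retries, throttling and the DMS instance CPU bottleneck.

```java
// Hypothetical estimator: simple arithmetic, not a DMS or DynamoDB API.
class DmsLoadEstimate {

    // Assumption: one write unit covers up to 1 KB (1024 bytes) of item size.
    static final int BYTES_PER_WRITE_UNIT = 1024;

    /** Write units consumed by a single item, rounded up to the next KB. */
    static long writeUnitsPerItem(int itemSizeBytes) {
        return Math.max(1, (itemSizeBytes + BYTES_PER_WRITE_UNIT - 1) / BYTES_PER_WRITE_UNIT);
    }

    /** Seconds needed to write all items at a sustained write capacity, rounded up. */
    static long estimateSeconds(long itemCount, int itemSizeBytes, int sustainedWcu) {
        if (sustainedWcu <= 0) {
            throw new IllegalArgumentException("sustainedWcu must be positive");
        }
        long totalUnits = itemCount * writeUnitsPerItem(itemSizeBytes);
        return (totalUnits + sustainedWcu - 1) / sustainedWcu;
    }

    public static void main(String[] args) {
        long items = 100_000_000L; // hypothetical dataset size
        int itemSizeBytes = 2048;  // hypothetical average item size
        // 4000 WCU was the stable rate in this post; 8000 was the observed ceiling.
        for (int wcu : new int[] {4000, 8000}) {
            long seconds = estimateSeconds(items, itemSizeBytes, wcu);
            System.out.printf("%d WCU -> %.1f hours%n", wcu, seconds / 3600.0);
        }
    }
}
```

With these made-up inputs the full load takes about 13.9 hours at 4000 WCU and about 6.9 hours at 8000 WCU, which gives an idea of why the sustainable write rate matters when planning a migration window.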
Part 6 - Three months later
Overall API response times during migration from PHP+SQL solution to AWS/Serverless + DynamoDB
API response 50/95/99%
Microservice response - timings measured on client side (no percentiles unfortunately, NewRelic)
Microservice response - timings measured on Lambda side (AWS CloudWatch)
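For readers less familiar with the 50/95/99% notation used in these charts, here is a minimal sketch of how such percentiles can be computed from raw response times using the nearest-rank method. The monitoring tools mentioned above may use a different interpolation; the sample values and the `ResponsePercentiles` name are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper showing the nearest-rank percentile definition.
class ResponsePercentiles {

    /** Returns the smallest sample such that at least p percent of samples are <= it. */
    static double percentile(double[] samplesMs, double p) {
        if (samplesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samplesMs.clone(); // do not reorder the caller's array
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Made-up response times in milliseconds, with one slow outlier.
        double[] samples = {12, 15, 11, 14, 13, 18, 16, 250, 17, 19};
        for (double p : new double[] {50, 95, 99}) {
            System.out.printf("p%.0f = %.0f ms%n", p, percentile(samples, p));
        }
    }
}
```

In this toy data the median stays at 15 ms while the 95% and 99% values both land on the 250 ms outlier, which is why tail percentiles are the ones that move when slow calls are taken off the main request path.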
Summary
We will continue learning and trying out new technologies through 2020. We are happy to see that the solution runs stably with zero maintenance overhead.
Achievements unlocked:
- moved memory-hungry processing out into Lambda and freed more memory inside Docker for more PHP workers
- reduced the 95% and 99% response times, and moved long S3 calls into AWS Lambda where they don’t consume limited EC2/Docker resources (memory!)
- deprecated legacy PHP codebase, cleaned up SQL database of unnecessary data (non relational)
- new microservice can be improved/deployed/maintained completely separate of main codebase
- we learned many new things in practice:
  - new language - Kotlin (kotlinVersion=1.3.50) & Java (1.8)
  - new framework - Micronaut (micronautVersion=1.2.9), Micronaut Test (io.micronaut.test:micronaut-test-kotlintest), AWS Java SDK
  - new CI/CD framework - AWS SAM, Serverless.com
  - new runtime - AWS Lambda
  - new database - AWS DynamoDB
  - zero downtime migration of production/big databases from SQL to DynamoDB
Bonus: different points of view
- Microservices guru warns devs that trendy architecture shouldn’t be the default for every app, but ‘a last resort’
- Kubernetes is not a silver bullet
- Monoliths are the future
- Give Me Back My Monolith
- Well Architected Monoliths are Okay
- Don’t ask if a monorepo is good for you – ask if you’re good enough for a monorepo
- The Many Benefits of Using a Monorepo
- Cold start / Warm start with AWS Lambda
Acknowledgments
Carolina A. for pressing the production deployment button during peak load times on Wednesday the 26th.
Update 26 Feb 2020
- Initial release
Update 5 Mar 2020
- Revisit some details and bonus articles
https://moar.sshilko.com/2020/02/26/Serverless-Try