Trying out serverless in 2020

Kubernetes and serverless went mainstream in 2018 and 2019. In 2020 we tried serverless and microservices in production.

Part 1 - Legacy

Our backend runs on PHP 7.3, and we upgrade to the new PHP version every year. This keeps us compatible with the latest releases, but we do not rewrite the whole codebase for each new version.

  • Backend is an SOA app that is more than 5 years old
  • Runs on AWS in Docker
  • Has internal services (modules) that communicate via RPC
  • Codebase and features are stable
  • Backend serves thousands of users, with hundreds of requests per second

Part 3 - Dream

We had wanted to try serverless/microservices since summer 2019; for a small dev team it had clear benefits (with tradeoffs). In November 2019 a JIRA ticket named “Implement microservice X” made it into a sprint.

High level wish list

  • Replace legacy PHP internal service
  • Try serverless
  • Expose microservice as REST API
  • Expose autogenerated documentation
  • Try new language/runtime (Java, Kotlin, Go, Python … PHP runs in Kubeless btw.)
  • Performance monitoring, stronger security/access patterns, and a reusable stack for future (micro)services
  • Console/CLI deployment: a turnkey solution that needs only an AWS key & secret
  • Try DynamoDB at large scale

Part 4 - Beginning

  • Use AWS Lambda
  • Cold-start benchmarks of various runtimes showed good results for Golang and the JVM; we selected the JVM (Java 8)
  • On the JVM we found the Micronaut framework to be very similar to Spring Boot, and we used the standard AWS SDK for Java (com.amazonaws:aws-java-sdk-s3:1.11.500)
  • We used Java 8 because AWS SDK 2 was not yet stable/released
  • We used the Serverless.com framework, as recommended by another front-end team that had experience with it
  • AWS SAM lacked features we needed (ALB setup out of the box)
  • The PHP service is replaced with a similar REST API
  • The RDS database is replaced with DynamoDB, a serverless database
  • Micronaut supports generating Swagger documentation (OpenAPI)
  • Kotlin with Micronaut
  • Monitoring by AWS CloudWatch
  • Service scales on demand from 0 to “a lot”
  • The Serverless.com framework needs only an AWS key & secret to deploy; the app is built inside a multi-stage Docker build using Corretto
  • Code and deployment hosted in separate git repository (polyrepo)

Part 5 - Reality (spoilers)

  • 😒 Kotlin/Micronaut is “large”: a 23.5 MB JAR file that uses 230 MB of memory (AWS CloudWatch)
  • 😒 Kotlin lacked documentation for serverless
  • Should have written Kotlin tests from the beginning
  • 😥 The AWS DMS migration (RDS -> DynamoDB) took us 2 weeks (!!!) of trial and error to figure out. There is very little material on debugging DMS migrations: things you will not encounter on small/test datasets but that WILL happen on big production migrations
  • 😕 Limited information on how to choose a DMS task configuration (defaults are not optimized)
  • 😕 DMS is slow (defaults are not optimized)
  • 😕 DMS configuration is really specific to the source and destination (Oracle -> PostgreSQL, MySQL -> DynamoDB etc.); every combination has its own options
  • 😕 DMS fine tuning is done by editing the JSON configuration (not available via the UI)

Migration to DynamoDB

  • Migration and Migration & Replication are NOT the same; Replication has hidden issues as of 01.2020
  • We reached only 7000…8000 write capacity units using the biggest migration instance and concurrent processes; the migration was stable at 4000 capacity units, otherwise errors occurred
  • The DMS instance CPU is the bottleneck
  • 😥 Last Error Task 'MYJHGWWJB54WQBZ76XXXXXXX' was suspended due to 6 successive unexpected failures Stop Reason FATAL_ERROR Error Level FATAL
  • Tuning suggested on Stack Overflow had little effect
    unloadTimeout
    ForceUnloadTimeout
    
  • MySQL access permissions needed
     #GRANT REPLICATION CLIENT ...
      REPLICATION CLIENT – This privilege is required for change data capture (CDC) tasks only. 
      REPLICATION SLAVE  – This privilege is required for change data capture (CDC) tasks only. 
      SUPER              – This privilege is required only in MySQL versions before 5.6.6.
    
  • AWS magic required for production-size DMS migrations: RDS binlog retention has to be set
    call mysql.rds_show_configuration;
    call mysql.rds_set_configuration('binlog retention hours', 24);
    
  • Task settings that worked for us, recommended for an SQL -> DynamoDB task. This is mostly poorly documented, low-level black magic that we had to work out ourselves
    TargetMetadata.ParallelLoadThreads: 24
    TargetMetadata.BatchApplyEnabled: false
    FullLoadSettings.TransactionConsistencyTimeout: 300
    FullLoadSettings.CommitRate: 20000
    StreamBufferSettings.StreamBufferCount: 12
    StreamBufferSettings.StreamBufferSizeInMB: 40
    StreamBufferSettings.CtrlStreamBufferSizeInMB: 8
    ChangeProcessingTuning.MinTransactionSize: 1000
    ChangeProcessingTuning.CommitTimeout: 5
    ChangeProcessingTuning.MemoryLimitTotal: 1024
    ChangeProcessingTuning.MemoryKeepTime: 60
    ChangeProcessingTuning.StatementCacheSize: 50
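
The settings above are written in dotted form, while the DMS task settings document is nested JSON grouped by section. A minimal Python sketch of that conversion follows; the helper name to_task_settings and the DOTTED_SETTINGS dictionary are ours for illustration, and only the keys and values listed above are included.

```python
import json

# Dotted "Section.Key" settings copied from the list above.
DOTTED_SETTINGS = {
    "TargetMetadata.ParallelLoadThreads": 24,
    "TargetMetadata.BatchApplyEnabled": False,
    "FullLoadSettings.TransactionConsistencyTimeout": 300,
    "FullLoadSettings.CommitRate": 20000,
    "StreamBufferSettings.StreamBufferCount": 12,
    "StreamBufferSettings.StreamBufferSizeInMB": 40,
    "StreamBufferSettings.CtrlStreamBufferSizeInMB": 8,
    "ChangeProcessingTuning.MinTransactionSize": 1000,
    "ChangeProcessingTuning.CommitTimeout": 5,
    "ChangeProcessingTuning.MemoryLimitTotal": 1024,
    "ChangeProcessingTuning.MemoryKeepTime": 60,
    "ChangeProcessingTuning.StatementCacheSize": 50,
}


def to_task_settings(dotted):
    """Expand 'Section.Key' entries into {Section: {Key: value}}."""
    nested = {}
    for dotted_key, value in dotted.items():
        section, key = dotted_key.split(".", 1)
        nested.setdefault(section, {})[key] = value
    return nested


# Serialize to the JSON text that a task settings document would contain.
task_settings_json = json.dumps(to_task_settings(DOTTED_SETTINGS), indent=2)
print(task_settings_json)
```

The printed JSON could then be supplied as the task settings, for example through the --replication-task-settings option of the AWS CLI. Any key not listed here is simply absent from the generated document.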
    
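The write-capacity figures in the list above translate directly into migration time. DynamoDB counts one write capacity unit as one standard write per second for an item of up to 1 KB, and larger items consume one unit per started KB. A rough Python sketch of the arithmetic follows; the item count and item size are made-up example values, not our production numbers, and the function names are ours.

```python
import math


def wcu_per_item(item_size_kb):
    """Write capacity units consumed by one standard write of the given size."""
    return max(1, math.ceil(item_size_kb))


def full_load_seconds(item_count, item_size_kb, sustained_wcu):
    """Lower bound on full-load duration when writes are the only limit."""
    if sustained_wcu <= 0:
        raise ValueError("sustained_wcu must be positive")
    return item_count * wcu_per_item(item_size_kb) / sustained_wcu


# Hypothetical table: 100 million items of about 1 KB each.
ITEMS = 100_000_000
ITEM_KB = 1.0

# 4000 was the stable rate reported above, 8000 the observed ceiling.
for wcu in (4000, 8000):
    hours = full_load_seconds(ITEMS, ITEM_KB, wcu) / 3600
    print(f"{wcu} WCU: about {hours:.1f} hours")
```

This is only a floor: it ignores source read speed, the DMS instance CPU bottleneck, and retries after throttling.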

Part 6 - Three months later

Overall API response times during the migration from the PHP + SQL solution to AWS/Serverless + DynamoDB

[Image: 2020_Clients_SLA]

API response 50/95/99%

[Image: 2020_Clients_SLA_P]

Microservice response - timings measured on the client side (no percentiles unfortunately, NewRelic)

[Image: 2020_AWSLambda_MicronautResponseTimes]

Microservice response - timings measured on the Lambda side (AWS CloudWatch)

[Image: 2020_AWSLambda_MicronautStats]
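
The 50/95/99% figures above are percentiles of the response time: the value that the given share of requests stays at or below. Where a tool reports only raw timings, they can be computed by hand. Below is a minimal Python sketch using the nearest-rank method, which is one common definition among several; the sample latencies are invented for illustration and the function name is ours.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based; multiply before dividing to limit rounding error.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Invented response times in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 14]

for p in (50, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p)} ms")
```

With few samples the high percentiles collapse onto the slowest request, which is why the 95% and 99% lines react so strongly to a handful of long calls.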

Summary

We will continue learning and trying out new technologies through 2020. We are happy to see that the solution runs stably with zero maintenance overhead.

Achievements unlocked:

  • moved memory-hungry processing out into Lambda, which freed memory inside Docker for more PHP workers
  • reduced the 95% and 99% response times, and moved long S3 calls into AWS Lambda, where they don't consume limited EC2/Docker resources (memory!)
  • deprecated the legacy PHP codebase and cleaned unnecessary (non-relational) data out of the SQL database
  • the new microservice can be improved/deployed/maintained completely separately from the main codebase
  • we learned many new things in practice:
    • new language - Kotlin (kotlinVersion=1.3.50) & Java (1.8)
    • new framework - Micronaut (micronautVersion=1.2.9), Micronaut Test (io.micronaut.test:micronaut-test-kotlintest), AWS Java SDK
    • new CI/CD framework - AWS SAM, Serverless.com
    • new runtime - AWS Lambda
    • new database - AWS DynamoDB
    • zero-downtime migration of big production databases from SQL to DynamoDB

Bonus: different points of view

Acknowledgments

Carolina A., for pressing the production deployment button during peak load on Wednesday the 26th.

Update 26 Feb 2020

  • Initial release

Update 5 Mar 2020

  • Revisit some details and bonus articles