Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance degradation during scale-out operations #45

Öffnen Sie
felipmiguel opened this issue Apr 29, 2023 · 2 comments
Öffnen Sie

Performance degradation during scale-out operations #45

felipmiguel opened this issue Apr 29, 2023 · 2 comments
Assignees
Labels

Kommentare

@felipmiguel
Copy link

Describe the bug
During scale out operations the performace of the application is degraded. It happens with auto-scale and also manual scale. It doesn't happen during scale-in operations.

To Reproduce
Steps to reproduce the behavior:

  1. Given an application with constant workload.
  2. In Azure portal go to apps. Select the application to be scaled-out.
  3. Go to scale out settings.
  4. Increase the number of instances of the application and save.
  5. During the scale-out operation the application throughput is reduced significantly. For instance going from 500K Requests per Second (RPS) to less than 100K RPS. This can be measured using Application Insights.
  6. When the scale-out operation completes the throughput recovers and work as expected, increasing accordingly to new number of instances.

Expected behavior
The expected behavior is that the performance of the application stays stable during the scale-out operation. Once the operation is complete, then throughput should improve.
That is specially important for systems under heavy load. If there is auto-scale in place usally is triggered when more performance is needed, but during this scale-out operation process the performance of the solution goes down.

Screenshots
image

Additional context
To generate the load I was using Azure Load Testing and the load was constant, that is why it is possible to detect this situation.

Can we contact you for additional details? Y/N

If yes, please send us your contact information to [email protected] and include the issue number in the email title.

@qingc
Copy link

qingc commented May 23, 2023

Hi @felipmiguel,
Thank you for submitting this issue.
We will try to reproduce this issue and figure out how to fix it.

@qingc
Copy link

qingc commented Jun 5, 2023

Hi @felipmiguel,

I have reproduced this issue. The RPS drops during application scale-out, see metrics from Application Insights.
image
From dashboard on test client(Azure Load Test) side, response time increased during app scale-out.
image
It can also be observed from server side, see metrics from Application Insights
image

From Application Insights live metrics, we can see part of requests are handled by new Pods which have longer response time during new Pods warm-up. The application is using in-memory cache, the response will be longer in newly created Pods than existing Pods because the cache is not available yet.
image

According to RPS calculation formula: Virtual users = RPS * latency in seconds. Given the number of virtual users is consistent, as latency increases the RPS drops. This explains why the RPS drops during application scale-out, it is expected behavior. See more details here: Key concepts for Azure Load Testing | Microsoft Learn

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants