Task simulator #3541

Closed
wants to merge 42 commits into from

nagkumar91
Member

@nagkumar91 nagkumar91 commented Jul 11, 2024

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

All Promptflow Contribution checklist:

  • The pull request does not introduce [breaking changes].
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.
  • Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: suggested workflow.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

@github-actions github-actions bot added examples Improvements on examples promptflow-evals labels Jul 11, 2024

github-actions bot commented Jul 15, 2024

promptflow-evals test result

 12 files ±0     12 suites ±0    1h 21m 17s ⏱️ +55m 8s
 50 tests -60    49 ✅ -61     1 💤 +1     0 ❌ ±0
600 runs -720   588 ✅ -732   12 💤 +12    0 ❌ ±0

Results for commit 5e8cf48. ± Comparison against base commit 4ff9706.

This pull request removes 110 and adds 50 tests. Note that renamed tests count towards both.
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_batch_timeout_custom
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_batch_timeout_default
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_codeclient
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_pfclient
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_empty_string
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_non_string_inputs
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_invalid_citations
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_missing_role
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_normal
…
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[False-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[True-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety_chat[False-False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety_chat[True-False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa[False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa[True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa_for_nans
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_content_safety_evaluator_hate_unfairness
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_content_safety_evaluator_self_harm
…

♻️ This comment has been updated with latest results.

@github-actions github-actions bot removed the examples Improvements on examples label Jul 15, 2024
@nagkumar91 nagkumar91 marked this pull request as ready for review August 2, 2024 00:45
@nagkumar91 nagkumar91 requested a review from a team as a code owner August 2, 2024 00:45
@nagkumar91 nagkumar91 marked this pull request as draft August 2, 2024 01:42

Hi, thank you for your interest in helping to improve the prompt flow experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment.

@github-actions github-actions bot added the no-recent-activity There has been no recent activity on this issue/pull request label Aug 22, 2024

Hi, thank you for your contribution. Since there has not been recent engagement, we are going to close this out. Feel free to reopen if you'd like to continue working on these changes. Please be sure to remove the no-recent-activity label; otherwise, this is likely to be closed again with the next cleanup pass.

@github-actions github-actions bot closed this Aug 30, 2024
Labels
no-recent-activity There has been no recent activity on this issue/pull request promptflow-evals
2 participants