Task simulator #3541

nagkumar91 · 2024-07-11T15:51:12Z

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

All Promptflow Contribution checklist:

The pull request does not introduce [breaking changes].
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.
Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: suggested workflow.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.

github-actions · 2024-07-15T15:24:20Z

promptflow-evals test result

12 files ±0 · 12 suites ±0 · duration 1h 21m 17s (+55m 8s)
50 tests (-60): 49 ✅ (-61), 1 💤 (+1), 0 ❌ (±0)
600 runs (-720): 588 ✅ (-732), 12 💤 (+12), 0 ❌ (±0)

Results for commit 5e8cf48. ± Comparison against base commit 4ff9706.

This pull request removes 110 and adds 50 tests. Note that renamed tests count towards both.

tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_batch_timeout_custom
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_batch_timeout_default
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_codeclient
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_pfclient
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_empty_string
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_non_string_inputs
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_invalid_citations
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_missing_role
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_normal
…

tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[False-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_chat[True-True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety_chat[False-False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_content_safety_chat[True-False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa[False]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa[True]
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_composite_evaluator_qa_for_nans
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_content_safety_evaluator_hate_unfairness
tests.evals.e2etests.test_builtin_evaluators.TestBuiltInEvaluators ‑ test_content_safety_evaluator_self_harm
…
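
The summary figures above report each current value together with its change against the base commit, so the base-commit values can be recovered by subtracting the delta. The following is a minimal sketch of that arithmetic, not part of the PR; the dictionary keys and the `base_counts` helper are hypothetical names chosen for illustration, and the numbers are copied from the summary.

```python
# Figures copied from the test-result summary: (current value, delta vs. base).
# The key names are illustrative only; they are not produced by the CI action.
summary = {
    "tests": (50, -60),
    "tests_passed": (49, -61),
    "tests_skipped": (1, +1),
    "runs": (600, -720),
    "runs_passed": (588, -732),
    "runs_skipped": (12, +12),
}


def base_counts(summary):
    """Recover base-commit values by subtracting each delta from the current value."""
    return {name: current - delta for name, (current, delta) in summary.items()}


base = base_counts(summary)

# Cross-check against "removes 110 and adds 50 tests":
# the base test count, minus the removed tests, plus the added tests,
# should equal the current test count.
removed, added = 110, 50
assert base["tests"] - removed + added == summary["tests"][0]

print(base)
```

Under these figures the base commit had 110 tests, which matches the statement that the pull request removes 110 tests and adds 50.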

♻️ This comment has been updated with latest results.

src/promptflow-evals/promptflow/evals/synthetic/_helpers/_simulator_data_classes.py

github-actions · 2024-08-22T21:33:21Z

Hi, thank you for your interest in helping to improve the prompt flow experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment.

github-actions · 2024-08-30T21:33:15Z

Hi, thank you for your contribution. Since there has not been recent engagement, we are going to close this out. Feel free to reopen if you'd like to continue working on these changes. Please be sure to remove the no-recent-activity label; otherwise, this is likely to be closed again with the next cleanup pass.

nagkumar91 and others added 5 commits July 8, 2024 11:21

Init

788cc6d

Merge branch 'main' into feature/query_response

c6ce91c

Sample working prototype for task simulator

d6d7ef0

Fixing the order of response

b98774f

support for overriding query response generating prompty

d6262ca

github-actions bot added the examples and promptflow-evals labels Jul 11, 2024

Nagkumar Arkalgud and others added 6 commits July 11, 2024 08:51

Merge branch 'main' into feature/query_response

3bcf712

Add a progress bar

df2baa2

Common tracing for adv sim and add UA header to prompty

9a9f6ab

Remove debugger and old sample file

10abe2b

Merge branch 'main' into feature/query_response

a546027

Support for custom prompty for user simulation

6aef44b

Moved the samples to the right location

92feb61

github-actions bot removed the examples label Jul 15, 2024

Nagkumar Arkalgud and others added 14 commits July 18, 2024 11:16

Merge branch 'main' into feature/query_response

564d574

update with json line list and eval

e98ed4b

Merge branch 'main' into feature/query_response

ee3713e

Merge branch 'main' into feature/query_response

6cfc487

Update sample for task simulator

b470418

Add docstrings

31fac59

Added docstrings and copyright

6894e66

W0106 and W1515 fix

3c20cdb

More pylint errors and change TaskSimulator to Simulator

af46a93

More pylint errors and change TaskSimulator to Simulator

777d305

fix broken import

ed3f0b0

ignore pylint issues

6543374

Added unittests

9143cfb

rename simulator sample folder

fd95b3c

Nagkumar Arkalgud added 2 commits July 30, 2024 13:28

User persona is now called tasks

13ed044

updated simulator

38711af

MilesHolland reviewed Aug 1, 2024

src/promptflow-evals/promptflow/evals/synthetic/_helpers/_simulator_data_classes.py (outdated)

nagkumar91 and others added 5 commits August 1, 2024 12:08

Merge branch 'main' into feature/query_response

ca40b69

Persona is now task and extend the conversation in simulator

6b6004f

Refactor code and add tests

82dcbb1

Bugfix and progress bar

27fe51a

Suggestion on PR and removed unnecessary else

170b2e8

nagkumar91 marked this pull request as ready for review August 2, 2024 00:45

nagkumar91 requested a review from a team as a code owner August 2, 2024 00:45

Fix progress bar and add some string manipulation to handle exceptions

8508a0a

nagkumar91 marked this pull request as draft August 2, 2024 01:42

nagkumar91 and others added 8 commits August 2, 2024 13:09

Merge branch 'main' into feature/query_response

b2a0b7d

Update _simulator_data_classes.py

86548df

Update __init__.py to fix imports

a24eb78

Fix the imports

b92e56f

More tests

58c115d

Merge branch 'main' into feature/query_response

159edc2

Merge branch 'main' into feature/query_response

09d71ee

Merge branch 'main' into feature/query_response

5e8cf48

github-actions bot added the no-recent-activity label Aug 22, 2024

github-actions bot closed this Aug 30, 2024
