Auto Prompts Generation¶
Overview¶
This document describes the engineering specification auto prompt generation demo
Flow Diagram¶
flowchart TD
%% Data Generation Stage
A[Start]
A --> |Data Generation| A1[Dataset]
A1 --> A2[DataGenerator]
A2 --> A3[OpenAI Prompt Data Generator]
%% Create Combinations Stage
A1 --> |Create Combinations| B
B --> B1[OpenAI Prompt Based Variation Generator]
B --> B2[Set Combinations Manually]
B --> B3[List of Combinations]
%% Evaluate Stage
B3 --> |Analysis| C
C --> C1[User's Function - Headline Generation]
C --> C2[Results from Function]
A1 --> C1
%% Evaluator Stage
C2 --> |Evaluation| D
D --> D1[OpenAI Prompt Based Evaluator: clarity]
D --> D2[OpenAI Prompt Based Evaluator: catchiness]
D --> D3[OpenAI Prompt Based Evaluator: relevance]
D --> D4[OpenAI Elo Evaluator]
%% Select Stage
D --> |Selection| E
E --> E1[Selection Strategy]
E1 --> E2[AHP Selection Strategy]
%% Enhancer Stage
E --> |Improvement| F
F --> F1[OpenAI PromptBased Combination Enhancer]
F1 --> C
%% Styling
style A fill:#f9d77e,stroke:#f96e5b
style B fill:#a1d4c6,stroke:#f96e5b
style C fill:#f6c3d5,stroke:#f96e5b
style D fill:#b2b1cf,stroke:#f96e5b
style E fill:#f9efaa,stroke:#f96e5b
style F fill:#f2a3b3,stroke:#f96e5b
Specifications¶
Data Generator¶
We use open ai to help us generate test cases
dataset:
data_generators:
openai_prompt_data_generator:
chunk_size: 1000
diversify: false
input_function:
description:
Given an tech startup business, generate a corresponding landing
page headline
name: headline_generation_for_tech_startup_business
parameters:
tech_startup_business: str
number_of_examples: 2
openai_model_name: gpt-4
output_path: test_demo_generated_examples.pkl
prompt: |-
Please provide a concrete and realistic test case as
a dictionary for function invocation using the ** operator.
Only include parameters, excluding description and name and ensure
it's succinct and well-structured.
**Only provide the dictionary.**
source_type: machine_generated
- diversify: Whether to diversify the generated results. If set to true, parallel processing of data generation is not possible.
- input_function: Description and parameters for the function.
- number_of_examples: Number of examples to be generated.
- output_path: Temporary output path for the generated data. If this file exists, the data will be read from it.
- prompt: The prompt used to generate the data.
Custom Function¶
The custom function for this mode is hosted on GitHub. You can find and review it here.
In this function, we take in an input that represents tech startup business, and will otuput headline for the landing page.
We make use of the StringWrapper
to enclose placeholders intended for replacement,
based on the variations configuration. This wrapped string essentially serves
as a namespace:
str(
StringWrapper(
template="""
Generate a landing page headline for {tech_startup_business}
""",
variables={
"tech_startup_business": tech_startup_business,
},
name="task"
)
)
Within the StringWrapper
, instead of a raw string, we utilize a template. This
template integrates both text and variables. The variables are encapsulated
within {}
braces. In the example provided, the variable is tech_startup_business
.
Variations Generation¶
We have configured variations using both ChatGPT and manual settings. Here's the configuration for ChatGPT:
variations:
- name: task
generator_name: openai_prompt_based_variation_generator
generator_config:
diversify: false
max_tokens: 7000
number_of_variations: 2
openai_model_name: gpt-4
output_path: test_demo_generated_prompt.pkl
prompt:
- content: |-
Prompt used to generate the varations.
- generator_name: Represents the class name of the generator.
- number_of_variations: Specifies the number of variations to be generated.
- output_path: This is the temporary storage location for the generated data. If a file at this path exists, the data is read from it.
- prompt: The prompts used to generate the variation.
For manual variations, the configuration is as follows:
variations:
- instantiated_value: Generate landing page headline for {tech_startup_business}
value: Generate landing page headline for {tech_startup_business}
value_type: str
variation_id: null
This manual setup defines a specific way the variation will be presented.
Human Rating Configuration¶
Below is the configuration for human ratings:
human_rating_configs:
- name: clarity
instructions: Evaluate the clarity of the headline: Does it precisely convey the startup's purpose?
scale: [1, 5]
- name: relevance
instructions: Assess the relevance of the headline: Is it pertinent to the subject matter?
scale: [1, 5]
Explanation:
- name: Specifies the criterion for rating.
- instructions: Provides guidelines to the rater on how to evaluate the content based on the defined criterion.
- scale: The rating scale, where
1
is the lowest and5
is the highest.
To better understand the configuration, refer to the screenshot below:
Certainly! Here's the revised description taking into account the context you provided:
Automated Evaluation Configuration¶
Below is the setup for automated evaluations:
evaluators:
- evaluator_type: all
input_description:
Given a tech startup business, produce a single corresponding landing
page headline
name: openai_elo_evaluator
openai_model_name: gpt-4
- evaluator_type: individual
metric_calculators:
- method: AVERAGE
name: openai_prompt_based_evaluator
prompt: |-
You're tasked with evaluating an answer for a specific criterion based on the following information:
- Task: Given a tech startup business, produce a single corresponding landing page headline
- Question: Does the headline clearly convey the startup's mission or the problem it addresses?
The headline should instantly provide clarity about the startup's objective to any reader. Ambiguities can deter potential users or investors.
[Input]: {tech_startup_business}
[Result]: {raw_output}
Kindly rate the response based on the following options:
A. Doesn't meet the criterion.
B. Partially meets the criterion with significant scope for enhancement.
C. Adequately meets the criterion.
D. Meets the criterion impressively.
E. Exemplarily meets the criterion, with negligible areas for improvement.
display_name: clarity
choices: ["A", "B", "C", "D", "E"]
description: Does the headline lucidly articulate the startup's function or the issue it targets?
scale_description: "0-4"
choice_scores:
A: 0
B: 1
C: 2
D: 3
E: 4
Explanation:
- evaluator_type: Designates the type of evaluation.
all
: The evaluator considers all experiment results across all variations. It uses the elo algorithm and employs GPT-4 as the judge.individual
: The evaluator focuses solely on the current variation's results.
- input_description: Describes the type of input the model expects.
- name: Represents the evaluator's name or identifier.
- prompt: Provides the template and context for the automated evaluator to assess a given result.
- display_name: Specifies the displayed criterion name on the user interface.
- choices: Lists all possible rating options for the evaluator.
- description: Offers a brief description of the evaluation criterion.
- scale_description: Details the numeric scoring scale.
- choice_scores: Maps each choice to its respective numeric score.
Certainly! Here's the polished description for the selection configuration:
Selection Configuration¶
Below is the setup detailing the selection strategy:
selection_strategy:
ahp_selection:
criteria:
- openai_elo_evaluator
- average_token_usage
- average_latency
- "openai_prompt_based_evaluator: clarity"
- "openai_prompt_based_evaluator: relevance"
- "openai_prompt_based_evaluator: catchiness"
criteria_maximization:
openai_elo_evaluator: true
average_latency: false
average_token_usage: false
criteria_weights:
openai_elo_evaluator: 0.3
average_latency: 0.2
average_token_usage: 0.2
"openai_prompt_based_evaluator: clarity": 0.1
"openai_prompt_based_evaluator: relevance": 0.1
"openai_prompt_based_evaluator: catchiness": 0.1
Explanation:
- selection_strategy: Represents the overarching approach for making selections.
- ahp_selection: Specifies that the Analytic Hierarchy Process (AHP) algorithm is employed for the selection strategy.
- criteria: Lists the evaluators and metrics that are considered during the selection process.
- criteria_maximization: Indicates whether each criterion should be maximized.
For instance,
while a high score from the
openai_elo_evaluator
is desirable (true
), a loweraverage_latency
oraverage_token_usage
is preferred (false
). - criteria_weights: Assigns a weight to each criterion, determining its importance in the overall evaluation. The weights sum up to 1, indicating the relative significance of each criterion in the final decision-making process.
Auto Enhancer¶
Certainly! Here's the enhanced description, incorporating the additional information:
Auto Enhancer Configuration¶
Below is the setup detailing the auto enhancer strategy:
enhancer:
name: openai_prompt_based_combination_enhancer
max_iterations: 2
openai_model_name: gpt-4
stop_conditions:
"openai_prompt_based_evaluator: catchiness": 3
"openai_prompt_based_evaluator: clarity": 3
"openai_prompt_based_evaluator: relevance": 3
Explanation:
- name: Specifies the identifier or the class of the enhancer.
In this instance,
openai_prompt_based_combination_enhancer
is utilized. - max_iterations: Designates the upper limit for the number of improvement cycles. The process will not exceed 2 iterations, irrespective of other conditions.
- openai_model_name:
Indicates the model to be utilized for the improvement process,
which here is
gpt-4
. - stop_conditions: Outlines conditions under which the Enhancer should halt its operations before reaching the maximum iteration count. The improvement process will terminate if the average score from any of the specified evaluators surpasses 3.
Additionally, it's important to note that during the improvement process, only the best result from the previous selection step will be taken into consideration. The idea is to refine and optimize this top-performing result further.
Full Configuration¶
For a comprehensive view of all configurations related to the basic interactive mode, you can review the full configuration file hosted here.