{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "61a168eb",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"importing Jupyter notebook from cognates_counter.ipynb\n"
]
}
],
"source": [
"from openai import AzureOpenAI\n",
"from envyaml import EnvYAML\n",
"import comet_llm\n",
"import time\n",
"import import_ipynb\n",
"from cognates_counter import TextCognateCounter, calculate_germanic_tendency"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3bb956f7-d613-443e-9cd7-d6ca2d92e67b",
"metadata": {},
"outputs": [],
"source": [
"env = EnvYAML('../.api_keys.yaml')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "fa42e195-72b2-488e-94c8-82c3fb056214",
"metadata": {},
"outputs": [],
"source": [
"client = AzureOpenAI(\n",
" azure_endpoint=env['azure']['AZURE_OPENAI_ENDPOINT'],\n",
" api_key=env['azure']['AZURE_OPENAI_KEY'],\n",
" api_version=\"2024-02-01\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "508ff147-0608-4d4b-9073-cdb95f0ff26a",
"metadata": {},
"outputs": [],
"source": [
"comet_llm.init(project=\"l2-english-generation\", workspace=\"p-acharya\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e98d5922-09bd-4dd3-b5cc-67909bc0fe5e",
"metadata": {},
"outputs": [],
"source": [
"# Define the path to the cognates list file\n",
"cognates_list_file = \"manual_synset_list_with_origin_and_POS.csv\"\n",
"\n",
"# Create an instance of TextCognateCounter\n",
"text_cognate_counter = TextCognateCounter(cognates_list_file)"
]
},
{
"cell_type": "markdown",
"id": "52c18cf9-dd98-4cfd-a62c-c22bcc01c3c5",
"metadata": {},
"source": [
"### L1"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5372fab8-8227-4752-868c-a4498cf7f83d",
"metadata": {},
"outputs": [],
"source": [
"L1_list = [\n",
" # Romance\n",
" \"Spanish\", \n",
" \"Italian\", \n",
" \"French\",\n",
"\n",
" # Germanic\n",
" \"German\", \n",
" \"Swedish\", \n",
"\n",
" # Hellenic\n",
" \"Greek\",\n",
"\n",
" # Slavic\n",
" \"Russian\",\n",
"\n",
" # Indic\n",
" \"Hindi\",\n",
"\n",
" # Dravidian\n",
" \"Kannada\",\n",
"\n",
" # Semitic\n",
" \"Arabic\",\n",
"\n",
" # Sintic\n",
" \"Chinese\"\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "367e0ee3-6ab0-4f17-9945-53685e93a205",
"metadata": {},
"source": [
"### Short stories"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "35175294-218b-4c58-98c0-5470d57a12f9",
"metadata": {},
"outputs": [],
"source": [
"short_story = \"\"\"Once in a quaint village nestled between verdant hills and whispering woods, there lived an old watchmaker named Elias. He was known far and wide not just for his skill in repairing watches and clocks but also for his collection of timepieces, each with a unique story. Among them, the most mysterious was an ancient, ornate pocket watch that Elias never let anyone touch.\n",
"\n",
"One rainy evening, a curious young boy named Theo, who often visited Elias to marvel at the clocks, asked about the story of the special pocket watch. Elias, with a gentle smile, finally decided to share its tale.\n",
"\n",
"“This watch,” Elias began, gently cradling the old pocket watch, “belongs to neither you nor me but to Time itself. It was given to me many years ago by an old wanderer who claimed it could turn back time for just one minute—but only once every ten years.”\n",
"\n",
"Theo, wide-eyed with wonder, listened intently as rain pattered softly against the window panes.\n",
"\n",
"“Why keep such a magical thing hidden and unused?” Theo asked.\n",
"\n",
"Elias chuckled softly. “Because, my boy, knowing when to use such a minute is a weighty decision. Too many choices and changes one might regret.”\n",
"\n",
"Years passed, and Theo grew into a young man, while Elias became frailer with time. One stormy night, much like the one when Theo first learned of the watch, tragedy struck the village. A sudden landslide threatened to bury the village under mud and debris. Theo, now a brave and quick-thinking man, rushed to Elias’s shop amidst the chaos.\n",
"\n",
"“Elias!” Theo cried as he entered the dimly lit shop. “The watch! We can use it to save the village!”\n",
"\n",
"Elias, slow and pensive, handed the watch to Theo. “Yes, it is time. But remember, it can only take us back one minute. Choose the moment wisely.”\n",
"\n",
"As the village alarm bells tolled, signaling imminent danger, Theo held the watch tightly. His mind raced through the events of the past minute, seeking the pivotal moment that could alter their fate. With no time left to ponder, Theo clicked the watch.\n",
"\n",
"Instantly, Theo found himself back at the moment just before the landslide began. He sprinted to the hillside, where he had seen a small boy playing moments before the disaster. Grabbing the boy, he retreated just as the earth began to move, the place where the child had been playing now engulfed in mud.\n",
"\n",
"The village was saved from greater loss by Theo’s quick action, all thanks to the mysterious pocket watch.\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "a7117214-9b8e-4b97-b24a-f7b3c7a0852d",
"metadata": {},
"outputs": [],
"source": [
"short_story2 = \"\"\"In the heart of the old town, nestled between a bustling cafe and a quiet antique store, stood a peculiar little bookshop. Its sign read \"Whimsical Reads,\" and it was known for its endless maze of bookshelves that seemed to stretch into another dimension. The shopkeeper, Mr. Thaddeus Quill, was an elderly man with a twinkle in his eye and a knack for finding the perfect book for every visitor.\n",
"\n",
"One rainy afternoon, a young girl named Eliza stumbled into Whimsical Reads, seeking refuge from the downpour. Her curly hair dripped water onto the worn wooden floor as she wandered deeper into the shop, drawn by the scent of old paper and leather bindings. As she roamed, she noticed something odd: the books seemed to whisper to her.\n",
"\n",
"\"Choose me,\" a dusty tome crooned from a high shelf. \"No, pick me,\" another chimed in from a nearby stack. Eliza's curiosity was piqued, and she reached out for a particularly ancient-looking book with a golden cover. The moment her fingers touched it, the room around her began to shimmer.\n",
"\n",
"Suddenly, Eliza found herself standing in a lush, enchanted forest. The trees were tall and covered in vibrant, glowing moss. Birds with feathers like rainbows flitted between the branches, and the air was filled with the sound of a distant, melodic tune. Bewildered but excited, Eliza clutched the book to her chest and took a step forward.\n",
"\n",
"As she explored, she came across a group of tiny, winged creatures fluttering around a sparkling stream. They introduced themselves as the Book Sprites, guardians of the enchanted stories within Whimsical Reads. They explained that each book in the shop contained a portal to a different magical realm, accessible only to those with a pure heart and an adventurous spirit.\n",
"\n",
"Eliza spent what felt like hours in the forest, meeting fantastical creatures, solving ancient riddles, and discovering hidden treasures. But eventually, she felt a gentle tug, as if the book was calling her back. Reluctantly, she said goodbye to her new friends and opened the golden cover once more.\n",
"\n",
"In an instant, she was back in the cozy bookshop, the rain still pattering against the windows. Mr. Quill stood nearby, watching her with a knowing smile. \"Did you enjoy your journey?\" he asked.\n",
"\n",
"Eliza nodded, her eyes wide with wonder. \"It was amazing! How is this possible?\"\n",
"\n",
"Mr. Quill chuckled. \"Magic, my dear. This shop holds more secrets than anyone can imagine. But remember, the magic only works for those who believe in it.\"\n",
"\n",
"Eliza left Whimsical Reads that day with the golden book in hand, a gift from Mr. Quill. She knew she would return many times, eager to explore new realms and embark on countless adventures. From that day forward, she was never without a story to tell, and the magic of the bookshop stayed with her always.\"\"\""
]
},
{
"cell_type": "markdown",
"id": "8ab1b1d9-105c-48a4-96cc-c0a681f1a8ae",
"metadata": {},
"source": [
"### Functions"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "f6f40f9e-0714-406b-9a6a-720ce422d5af",
"metadata": {},
"outputs": [],
"source": [
"def get_lexical_tendencies(input, output):\n",
" input_tendencies = calculate_germanic_tendency(input, text_cognate_counter, verbose=False)\n",
" output_tendencies = calculate_germanic_tendency(output, text_cognate_counter, verbose=False)\n",
" delta_GT = output_tendencies[\"germanic_tendency\"] - input_tendencies[\"germanic_tendency\"]\n",
" delta_RT = output_tendencies[\"romance_tendency\"] - input_tendencies[\"romance_tendency\"]\n",
" \n",
" return {\n",
" \"input\": input_tendencies,\n",
" \"output\": output_tendencies,\n",
" \"delta\": {\n",
" \"germanic_tendency\": round(delta_GT, 3),\n",
" \"romance_tendency\": round(delta_RT, 3),\n",
" }\n",
"\n",
" }"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "b556c454-4409-4da6-97df-02e99e75794a",
"metadata": {},
"outputs": [],
"source": [
"class Prompt:\n",
" def __init__(self, instructions: str, input_text=None, template_variables={}):\n",
" self.instructions = instructions\n",
" self.input_text = input_text if input_text is not None else \"\"\n",
" self.template_variables = template_variables\n",
"\n",
" def generate_prompt(self):\n",
" return f\"{self.instructions.format(**self.template_variables)}\\n{self.input_text}\""
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "abcf2851-7bdc-4513-a108-008ebc6e6af2",
"metadata": {},
"outputs": [],
"source": [
"default_parameters = {\n",
" \"temperature\": 1,\n",
" \"max_tokens\": 1024,\n",
" \"top_p\": 1,\n",
" \"frequency_penalty\": 0,\n",
" \"presence_penalty\": 0,\n",
" \"stop\": None,\n",
"}\n",
"\n",
"def end_to_end(prompt_instructions, input_text, prompt_template_variables={}, parameters={}):\n",
" prompt = Prompt(prompt_instructions, input_text, prompt_template_variables)\n",
" final_prompt = prompt.generate_prompt()\n",
" parameters = {**default_parameters, **parameters} # fill missing params with default\n",
" \n",
" start_time = time.time()\n",
" completion = client.chat.completions.create(\n",
" # model=env['azure']['CHAT_COMPLETIONS_DEPLOYMENT_NAME'],\n",
" model=\"minerva\",\n",
" messages=[\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": final_prompt,\n",
" }\n",
" ],\n",
" **parameters\n",
" )\n",
" end_time = time.time()\n",
" duration = round(end_time - start_time, 2)\n",
" \n",
" model_output = completion.choices[0].message.content\n",
" comet_llm.log_prompt(\n",
" prompt=final_prompt,\n",
" output=model_output,\n",
" prompt_template=prompt_instructions,\n",
" prompt_template_variables=prompt_template_variables,\n",
" metadata={\n",
" \"model\": completion.model, \n",
" **get_lexical_tendencies(input_text, model_output),\n",
" \"usage\": completion.usage.__dict__, \n",
" \"parameters\": parameters,\n",
" \"content_filter_results\": completion.choices[0].content_filter_results,\n",
" },\n",
" duration=duration\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "e213ae21-52de-46f5-b0d6-b33871304bee",
"metadata": {},
"source": [
"### Parameters"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "9388ee49-f4cb-47f0-89f2-529bc10ae828",
"metadata": {},
"outputs": [],
"source": [
"parameters = {\n",
" \"temperature\": 1.2,\n",
" \"max_tokens\": 800,\n",
" \"top_p\": 0.95,\n",
" \"frequency_penalty\": 0,\n",
" \"presence_penalty\": 0,\n",
" \"stop\": None,\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "28ef9ae3-bd97-49a5-ba04-7632b3f6ca21",
"metadata": {},
"source": [
"### "
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "94e522ef-11f4-4807-81a1-02cd34807929",
"metadata": {},
"outputs": [],
"source": [
"end_to_end(\n",
" prompt_instructions=\"Rewrite this short story as if you were a native {L1} speaker writing in English:\", \n",
" input_text=short_story, \n",
" prompt_template_variables={\"L1\": \"Spanish\", \"short_story\": short_story},\n",
" parameters=parameters\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "c6d63ec0-5cdd-4787-8d7a-7914baedd417",
"metadata": {},
"outputs": [],
"source": [
"for L1 in L1_list:\n",
" end_to_end(\n",
" prompt_instructions=\"Rewrite this short story as if you were a native {L1} speaker writing in English:\", \n",
" input_text=short_story2, \n",
" prompt_template_variables={\"L1\": L1, \"short_story\": short_story2},\n",
" # parameters=\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3edb4b50-20b2-4f7a-9d31-31934fc39a2f",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}