mirror of
https://github.com/0xSojalSec/airllm.git
synced 2026-03-06 22:03:41 +00:00
add eval dataset, eval code, elo rating code
@@ -61,10 +61,14 @@ Anima模型基于QLoRA开源的[33B guanaco](https://huggingface.co/timdettmers/
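
#### Evaluation Methodology

-* **Dataset selection**: As discussed in the [Belle Paper](https://github.com/LianjiaTech/BELLE/blob/main/docs/Towards%20Better%20Instruction%20Following%20Language%20Models%20for%20Chinese.pdf), the distribution of question types in an evaluation set has a large effect on the conclusions drawn from it. As in the story of Tian Ji's horse race, pitting your own strengths against an opponent's weaknesses makes it easy to come out ahead. We therefore chose the [Vicuna benchmark](https://lmsys.org/blog/2023-03-30-vicuna/), which is widely accepted in research on English chatbot models. To evaluate Chinese, we translated the questions with GPT4. The translation code and dataset are as follows:
+* **Dataset selection**: As discussed in the [Belle Paper](https://github.com/LianjiaTech/BELLE/blob/main/docs/Towards%20Better%20Instruction%20Following%20Language%20Models%20for%20Chinese.pdf), the distribution of question types in an evaluation set has a large effect on the conclusions drawn from it. As in the story of Tian Ji's horse race, pitting your own strengths against an opponent's weaknesses makes it easy to come out ahead. We therefore chose the [Vicuna benchmark](https://lmsys.org/blog/2023-03-30-vicuna/), which is widely accepted in research on English chatbot models. To evaluate Chinese, we translated the questions with GPT4. See the translation code and the [dataset](https://github.com/lyogavin/Anima/blob/main/data/translated_vicuna_eval_set.json).
* **Evaluation method**: To keep costs manageable, we mainly use GPT4 as the evaluator. As argued in [QLoRA](https://arxiv.org/abs/2305.14314), comparing models purely by GPT4 scores is subject to large random fluctuations, which matches our own observations. We therefore use the Elo rating tournament method recommended by [QLoRA](https://arxiv.org/abs/2305.14314), which is now widely adopted.
* **Hyperparameter selection**: For cost reasons, we run 300 rounds of random evaluation, randomize the order of the two models in each matchup to cancel out position effects, and use a random seed of 42. The Elo rating implementation and the remaining hyperparameters follow [Vicuna's Elo code](https://raw.githubusercontent.com/lm-sys/FastChat/833d65032a715240a3978f4a8f08e7a496c83cb1/fastchat/serve/monitor/elo_analysis.py): K=32, init rating=1000.
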
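The Elo update behind the hyperparameters above can be sketched as follows. This is a minimal illustration, not the code from the linked notebook: it assumes the standard Elo formula with scale 400 and base 10, uses the stated K=32 and initial rating 1000, and the function name `compute_elo`, the battle record layout, and the model names are hypothetical.

```python
from collections import defaultdict


def compute_elo(battles, k=32, scale=400, base=10, init_rating=1000):
    """Replay (model_a, model_b, winner) records in order and return final ratings.

    `winner` is "model_a", "model_b", or "tie" (an assumed record layout).
    """
    ratings = defaultdict(lambda: float(init_rating))
    for model_a, model_b, winner in battles:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a given the current rating gap
        expected_a = 1 / (1 + base ** ((rb - ra) / scale))
        expected_b = 1 - expected_a
        # Actual score: 1 for a win, 0 for a loss, 0.5 for a tie
        score_a = {"model_a": 1.0, "model_b": 0.0, "tie": 0.5}[winner]
        ratings[model_a] = ra + k * (score_a - expected_a)
        ratings[model_b] = rb + k * ((1 - score_a) - expected_b)
    return dict(ratings)


# Two equally rated models: the winner gains K/2 = 16 points, the loser drops 16.
print(compute_elo([("model_x", "model_y", "model_a")]))
# {'model_x': 1016.0, 'model_y': 984.0}
```

Because each update depends on the ratings at that moment, the final numbers depend on the order of the battles, which is one reason the matchups and the seed are fixed.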
#### Elo rating tournament code

[elo_tournanment_all_models_on_translated_vicuna.ipynb](https://github.com/lyogavin/Anima/blob/main/eval/elo_tournanment_all_models_on_translated_vicuna.ipynb)

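As a rough illustration of how the matchups for such a tournament might be drawn (300 rounds, randomized presentation order, seed 42), consider the sketch below. It is not taken from the notebook: `sample_battles`, `placeholder_judge`, and the model and question names are hypothetical, and the judge stands in for the GPT4 comparison call.

```python
import itertools
import random


def sample_battles(models, questions, judge, rounds=300, seed=42):
    """Draw `rounds` random matchups and record the judge's verdict for each."""
    rng = random.Random(seed)  # fixed seed so the tournament is reproducible
    pairs = list(itertools.combinations(models, 2))
    battles = []
    for _ in range(rounds):
        first, second = rng.choice(pairs)
        # Flip the presentation order at random to cancel out position effects
        if rng.random() < 0.5:
            first, second = second, first
        question = rng.choice(questions)
        winner = judge(question, first, second)  # "model_a", "model_b", or "tie"
        battles.append((first, second, winner))
    return battles


def placeholder_judge(question, model_a, model_b):
    # Stand-in for the GPT4 comparison; a real judge would compare both answers
    return "tie"


battles = sample_battles(["m1", "m2", "m3"], ["q1", "q2"], placeholder_judge)
print(len(battles))  # 300
```

The resulting list of `(model_a, model_b, winner)` records is the kind of input an Elo computation would then replay in order.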
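#### Conclusion

The most important capabilities of an LLM are still logical reasoning and the ability to encode knowledge, so model scale remains the most important factor. QLoRA lets us fine-tune the largest model that fits a given hardware budget at a sufficiently low cost, and thereby achieve the best results.
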
669
data/gpt4_translate_vicuna_eval_set.ipynb
Normal file
@@ -0,0 +1,669 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"id": "6e22cd6d-1226-4a66-9811-e49dac231d98",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"vicuna_eval_set = [{\"question_id\": 1, \"text\": \"How can I improve my time management skills?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 2, \"text\": \"What are the most effective ways to deal with stress?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 3, \"text\": \"What are the main differences between Python and JavaScript programming languages?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 4, \"text\": \"How can I increase my productivity while working from home?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 5, \"text\": \"Can you explain the basics of quantum computing?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 6, \"text\": \"What are the differences between plant-based and animal-based protein sources?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 7, \"text\": \"How can I develop my critical thinking skills?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 8, \"text\": \"What are the major challenges faced by the education sector today?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 9, \"text\": \"What are the primary factors that influence consumer behavior?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 10, \"text\": \"What are the most effective strategies for conflict resolution in the workplace?\", \"category\": \"generic\"},\n",
|
||||
"{\"question_id\": 11, \"text\": \"What are some potential implications of using a single-use plastic bottle versus a reusable bottle on both the environment and human health?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 12, \"text\": \"What factors would you consider when designing an inclusive and accessible public transportation system?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 13, \"text\": \"How can governments utilize fiscal and monetary policies to combat economic recessions?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 14, \"text\": \"How do language and cultural barriers affect the way people communicate and form relationships in multicultural societies?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 15, \"text\": \"Describe a scenario where artificial intelligence could be used to improve the quality and efficiency of healthcare delivery.\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 16, \"text\": \"Explain the process of gene editing using CRISPR-Cas9 technology, and discuss its potential applications and ethical implications.\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 17, \"text\": \"How do vaccinations work to protect individuals and communities from infectious diseases, and what is herd immunity?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 18, \"text\": \"How do social media platforms influence the way people consume and share news, and what are the potential implications for the spread of misinformation?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 19, \"text\": \"How do cultural, social, and economic factors influence people's food choices, and how can this knowledge be used to promote healthier diets?\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 20, \"text\": \"Explain the process of natural selection and how it contributes to the evolution and adaptation of species.\", \"category\": \"knowledge\"},\n",
|
||||
"{\"question_id\": 21, \"text\": \"How would you introduce yourself as a medieval knight at a royal banquet?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 22, \"text\": \"As a pirate captain, what would you say to your crew to motivate them to search for hidden treasure?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 23, \"text\": \"If you were a Shakespearean character, how would you declare your love for someone in a soliloquy?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 24, \"text\": \"As a superhero, how would you explain your origin story to a curious child?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 25, \"text\": \"Imagine you are a time traveler from the year 3000. What technological advancements would you tell people about?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 26, \"text\": \"As a sports commentator, describe the winning play in the final seconds of a championship game.\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 27, \"text\": \"Pretend to be a world-famous chef. How would you describe your signature dish to a panel of judges?\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 28, \"text\": \"You are a mountain climber reaching the summit of Mount Everest. Describe your emotions and the view from the top.\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 29, \"text\": \"As a space colonist on Mars, describe your daily life and the challenges you face living on another planet.\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 30, \"text\": \"Pretend to be a character in a post-apocalyptic world. Describe how you survive and the allies you encounter.\", \"category\": \"roleplay\"},\n",
|
||||
"{\"question_id\": 31, \"text\": \"How can you determine if a restaurant is popular among locals or mainly attracts tourists, and why might this information be useful?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 32, \"text\": \"What are some subtle clues that suggest someone is pretending to understand a topic or conversation when they are actually confused or uninformed?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 33, \"text\": \"Why might someone choose to use a paper map or ask for directions instead of relying on a GPS device or smartphone app?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 34, \"text\": \"How can you determine if a person is genuinely interested in a conversation or simply being polite?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 35, \"text\": \"Why might someone prefer to shop at a small, locally-owned business instead of a large chain store, even if the prices are higher?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 36, \"text\": \"How can you assess the credibility of a source of information, such as a news article or blog post, without relying solely on the reputation of the author or publisher?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 37, \"text\": \"Why do some people enjoy the sensation of being scared, such as by watching horror movies or going on roller coasters, while others avoid these experiences?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 38, \"text\": \"How can observing the behavior of other people in a social situation provide clues about cultural norms and expectations?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 39, \"text\": \"Do we have a moral obligation to explore space, or should we focus on solving Earth's problems first?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 40, \"text\": \"In a world where automation is becoming increasingly prevalent, is it more important to prioritize job creation or technological progress?\", \"category\": \"common-sense\"},\n",
|
||||
"{\"question_id\": 41, \"text\": \"How many times does the average human blink in a lifetime? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 42, \"text\": \"How many atoms are in a grain of salt? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 43, \"text\": \"How many lightning strikes occur on Earth each day? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 44, \"text\": \"How many balloons would it take to lift a house like in the movie \\\"Up\\\"? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 45, \"text\": \"How many text messages are sent globally in a minute? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 46, \"text\": \"How many words are spoken daily on Earth? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 47, \"text\": \"How many snowflakes fall during a typical winter? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 48, \"text\": \"How many pages are in all the books ever written? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 49, \"text\": \"How many times has the Earth orbited the Sun since the beginning of life? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 50, \"text\": \"How many songs have been recorded throughout history? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.\", \"category\": \"fermi\"},\n",
|
||||
"{\"question_id\": 51, \"text\": \"What if the Internet had been invented during the Renaissance period?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 52, \"text\": \"What if the Aztecs had successfully repelled the Spanish conquistadors?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 53, \"text\": \"What if the Black Death had not occurred in the 14th century?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 54, \"text\": \"What if Isaac Newton had focused on biology instead of physics?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 55, \"text\": \"What if the Beatles had never formed as a band?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 56, \"text\": \"What if Alan Turing had not cracked the Enigma code during World War II?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 57, \"text\": \"What if the Suez Canal had never been constructed?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 58, \"text\": \"What if the Maya civilization had never mysteriously collapsed?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 59, \"text\": \"What if Christopher Columbus had not discovered the Americas?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 60, \"text\": \"What if Vincent van Gogh had been a successful artist during his lifetime?\", \"category\": \"counterfactual\"},\n",
|
||||
"{\"question_id\": 61, \"text\": \"Develop a C++ program that reads a text file line by line and counts the number of occurrences of a specific word in the file.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 62, \"text\": \"Implement a Python function to find the longest common subsequence of two input strings using dynamic programming.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 63, \"text\": \"Implement a regular expression in Python to validate an email address.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 64, \"text\": \"Write a program to find the nth Fibonacci number using dynamic programming.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 65, \"text\": \"Implement a binary search algorithm to find a specific element in a sorted array.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 66, \"text\": \"Implement a queue data structure using two stacks in Python.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 67, \"text\": \"Implement a program to find the common elements in two arrays without using any extra data structures.\", \"category\": \"coding\"},\n",
|
||||
"{\"question_id\": 68, \"text\": \"Given that f(x) = 5x^3 - 2x + 3, find the value of f(2).\", \"category\": \"math\"},\n",
|
||||
"{\"question_id\": 69, \"text\": \"Solve for x in the equation 3x + 10 = 5(x - 2).\", \"category\": \"math\"},\n",
|
||||
"{\"question_id\": 70, \"text\": \"If the endpoints of a line segment are (2, -2) and (10, 4), what is the length of the segment?\", \"category\": \"math\"},\n",
|
||||
"{\"question_id\": 71, \"text\": \"Can you help me write a formal email to a potential business partner proposing a joint venture?\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 72, \"text\": \"Can you help me write a resignation letter to my current employer, while leaving on good terms and expressing gratitude for the opportunities provided?\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 73, \"text\": \"Use an appropriate format to structure a formal letter of recommendation for a student applying to a prestigious graduate program in computer science.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 74, \"text\": \"Write a compelling product launch announcement email to inform our customers of our new software solution.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 75, \"text\": \"Draft an apology email to a customer who experienced a delay in their order, and provide reassurance that the issue has been resolved.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 76, \"text\": \"Write a script for a YouTube video exploring the history and cultural significance of jazz.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 77, \"text\": \"Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 78, \"text\": \"Write a captivating movie review for a recently released science fiction film, discussing its plot, characters, and special effects.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 79, \"text\": \"Structure a podcast script for an episode discussing the influence of streaming platforms on the music industry.\", \"category\": \"writing\"},\n",
|
||||
"{\"question_id\": 80, \"text\": \"Write a symphony concert review, discussing the orchestra's performance and overall audience experience.\", \"category\": \"writing\"}]\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1ec188cf-ab4f-4ae6-9237-fdffa9dc39b4",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# translate with gpt4"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "1dfdab33-132f-4d1b-a59d-f797881f9dc2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from tqdm import tqdm"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"id": "f95d2302-596c-413b-b341-28c458d117ae",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# fix this issue:\n",
|
||||
"#TypeError: Descriptors cannot not be created directly.\n",
|
||||
"#If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.\n",
|
||||
"#If you cannot immediately regenerate your protos, some other possible workarounds are:\n",
|
||||
"# 1. Downgrade the protobuf package to 3.20.x or lower.\n",
|
||||
"# 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).#\n",
|
||||
"\n",
|
||||
"#More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"import os\n",
|
||||
"os.environ[\"PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION\"] = \"python\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"id": "18ad735c-dafb-476c-b418-ca73647a45a2",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from tqdm import tqdm\n",
|
||||
"import json"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"id": "af19d74a-ce78-49c5-98bf-0c5580ea2367",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\n",
|
||||
"import backoff\n",
|
||||
"import openai\n",
    "openai.api_key = os.environ['OPENAI_API_KEY']  # read the key from the environment; never commit secrets"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"id": "9ed2fe32-875b-487b-ab44-46376623307d",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
    "\n",
    "def run_gpt4(prompt=\"Hello! What's the capital of China?\", n=1, oa_model_type='gpt-4', max_tokens=None):\n",
    "    # Send the prompt as a single system message and return the text of all n completions.\n",
    "    if max_tokens is None:\n",
    "        completion = openai.ChatCompletion.create(model=oa_model_type,\n",
    "                                                  n=n,\n",
    "                                                  temperature=0.9,\n",
    "                                                  messages=[\n",
    "                                                      {\"role\": \"system\", \"content\": prompt}\n",
    "                                                  ])\n",
    "    else:\n",
    "        completion = openai.ChatCompletion.create(model=oa_model_type,\n",
    "                                                  n=n,\n",
    "                                                  temperature=0.9,\n",
    "                                                  max_tokens=max_tokens,\n",
    "                                                  messages=[\n",
    "                                                      {\"role\": \"system\", \"content\": prompt}\n",
    "                                                  ])\n",
    "\n",
    "    #print(f\"calling openai with params: {(oa_model_type, n, 0.9)}\")\n",
    "\n",
    "\n",
    "    to_ret = []\n",
    "\n",
    "    for c in completion['choices']:\n",
    "        to_ret.append(c['message']['content'])\n",
    "    return to_ret"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "deabea53-dbe0-4785-8353-81acb30d6653",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
    "\n",
    "# Retry with exponential backoff on rate-limit errors. With raise_on_giveup=False,\n",
    "# the wrapper returns None instead of raising once all 10 tries are used up.\n",
    "@backoff.on_exception(backoff.expo, openai.error.RateLimitError,\n",
    "                      max_tries=10,\n",
    "                      raise_on_giveup=False,)\n",
    "def run_gpt4_backoff(*args,**kwargs):\n",
    "    return run_gpt4(*args,**kwargs)\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"id": "aeec108d-be2a-448c-be6c-21c150c5990f",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'中国的首都是北京。'"
|
||||
]
|
||||
},
|
||||
"execution_count": 20,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"run_gpt4_backoff('中国的首都是哪里?')[0]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "cb211f1c-94f8-4859-b354-fb0ffcea91f2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# loop"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "cf78d751-f796-42fd-a187-e011f812c7d6",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"100%|██████████| 80/80 [09:55<00:00, 7.45s/it]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
    "for item in tqdm(vicuna_eval_set, total=len(vicuna_eval_set)):\n",
    "    prompt = \"Translate the following question to Chinese:\\nQuestion:{question}\\nChinese Translation:\"\n",
    "    \n",
    "    prompt = prompt.format(question=item['text'])\n",
    "    \n",
    "    item['translation'] = run_gpt4_backoff(prompt)[0]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "74add969-09b5-4a3a-b98a-20aa4641a2e2",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[{'question_id': 1,\n",
|
||||
" 'text': 'How can I improve my time management skills?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '如何提高我的时间管理技能?'},\n",
|
||||
" {'question_id': 2,\n",
|
||||
" 'text': 'What are the most effective ways to deal with stress?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '问题:应对压力最有效的方法是什么?'},\n",
|
||||
" {'question_id': 3,\n",
|
||||
" 'text': 'What are the main differences between Python and JavaScript programming languages?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': 'Python 和 JavaScript 编程语言之间的主要区别是什么?'},\n",
|
||||
" {'question_id': 4,\n",
|
||||
" 'text': 'How can I increase my productivity while working from home?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '在家工作时,我如何提高我的工作效率?'},\n",
|
||||
" {'question_id': 5,\n",
|
||||
" 'text': 'Can you explain the basics of quantum computing?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '您能解释一下量子计算的基本原理吗?'},\n",
|
||||
" {'question_id': 6,\n",
|
||||
" 'text': 'What are the differences between plant-based and animal-based protein sources?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '植物性蛋白质来源与动物性蛋白质来源之间的差异是什么?'},\n",
|
||||
" {'question_id': 7,\n",
|
||||
" 'text': 'How can I develop my critical thinking skills?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '如何培养我的批判性思维能力?'},\n",
|
||||
" {'question_id': 8,\n",
|
||||
" 'text': 'What are the major challenges faced by the education sector today?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '当今教育部门面临的主要挑战是什么?'},\n",
|
||||
" {'question_id': 9,\n",
|
||||
" 'text': 'What are the primary factors that influence consumer behavior?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '问题:什么是影响消费者行为的主要因素?'},\n",
|
||||
" {'question_id': 10,\n",
|
||||
" 'text': 'What are the most effective strategies for conflict resolution in the workplace?',\n",
|
||||
" 'category': 'generic',\n",
|
||||
" 'translation': '在职场中解决冲突最有效的策略是什么?'},\n",
|
||||
" {'question_id': 11,\n",
|
||||
" 'text': 'What are some potential implications of using a single-use plastic bottle versus a reusable bottle on both the environment and human health?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '使用一次性塑料瓶与可重复使用瓶子在环境和人类健康方面可能产生哪些潜在影响?'},\n",
|
||||
" {'question_id': 12,\n",
|
||||
" 'text': 'What factors would you consider when designing an inclusive and accessible public transportation system?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '在设计一个包容性和无障碍的公共交通系统时,您会考虑哪些因素?'},\n",
|
||||
" {'question_id': 13,\n",
|
||||
" 'text': 'How can governments utilize fiscal and monetary policies to combat economic recessions?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '问题:政府如何利用财政和货币政策来应对经济衰退?'},\n",
|
||||
" {'question_id': 14,\n",
|
||||
" 'text': 'How do language and cultural barriers affect the way people communicate and form relationships in multicultural societies?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '问题:在多元文化社会中,语言和文化障碍如何影响人们的交流方式和建立关系?'},\n",
|
||||
" {'question_id': 15,\n",
|
||||
" 'text': 'Describe a scenario where artificial intelligence could be used to improve the quality and efficiency of healthcare delivery.',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '请描述一个场景,其中可以使用人工智能来提高医疗保健质量和效率。'},\n",
|
||||
" {'question_id': 16,\n",
|
||||
" 'text': 'Explain the process of gene editing using CRISPR-Cas9 technology, and discuss its potential applications and ethical implications.',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '请解释使用CRISPR-Cas9技术进行基因编辑的过程,并讨论其潜在应用和伦理影响。'},\n",
|
||||
" {'question_id': 17,\n",
|
||||
" 'text': 'How do vaccinations work to protect individuals and communities from infectious diseases, and what is herd immunity?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '疫苗接种如何保护个人和社区免受传染病的侵害,以及何为群体免疫?'},\n",
|
||||
" {'question_id': 18,\n",
|
||||
" 'text': 'How do social media platforms influence the way people consume and share news, and what are the potential implications for the spread of misinformation?',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '社交媒体平台如何影响人们消费和分享新闻的方式?以及这对于错误信息传播的潜在影响有哪些?'},\n",
|
||||
" {'question_id': 19,\n",
|
||||
" 'text': \"How do cultural, social, and economic factors influence people's food choices, and how can this knowledge be used to promote healthier diets?\",\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '问题:文化、社会和经济因素如何影响人们的食物选择,以及如何利用这些知识来推广更健康的饮食?'},\n",
|
||||
" {'question_id': 20,\n",
|
||||
" 'text': 'Explain the process of natural selection and how it contributes to the evolution and adaptation of species.',\n",
|
||||
" 'category': 'knowledge',\n",
|
||||
" 'translation': '请解释自然选择的过程以及它如何促进物种的进化和适应性。'},\n",
|
||||
" {'question_id': 21,\n",
|
||||
" 'text': 'How would you introduce yourself as a medieval knight at a royal banquet?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '问题:如果您是一位中世纪骑士参加皇家宴会,您将如何介绍自己?'},\n",
|
||||
" {'question_id': 22,\n",
|
||||
" 'text': 'As a pirate captain, what would you say to your crew to motivate them to search for hidden treasure?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '作为海盗船长,您会对船员说什么来激发他们寻找隐藏的宝藏?'},\n",
|
||||
" {'question_id': 23,\n",
|
||||
" 'text': 'If you were a Shakespearean character, how would you declare your love for someone in a soliloquy?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '如果您是莎士比亚的角色,您将如何在独白中向某人表达爱意?'},\n",
|
||||
" {'question_id': 24,\n",
|
||||
" 'text': 'As a superhero, how would you explain your origin story to a curious child?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '作为超级英雄,你会如何向一个好奇的孩子解释你的起源故事?'},\n",
|
||||
" {'question_id': 25,\n",
|
||||
" 'text': 'Imagine you are a time traveler from the year 3000. What technological advancements would you tell people about?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '假设您是来自公元3000年的时间旅行者,您会告诉人们哪些科技进步?'},\n",
|
||||
" {'question_id': 26,\n",
|
||||
" 'text': 'As a sports commentator, describe the winning play in the final seconds of a championship game.',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '作为一名体育评论员,在冠军比赛最后几秒钟内描述获胜的关键一击。'},\n",
|
||||
" {'question_id': 27,\n",
|
||||
" 'text': 'Pretend to be a world-famous chef. How would you describe your signature dish to a panel of judges?',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '假设自己是一位世界著名的大厨,请问您会如何向评委们介绍您的招牌菜?'},\n",
|
||||
" {'question_id': 28,\n",
|
||||
" 'text': 'You are a mountain climber reaching the summit of Mount Everest. Describe your emotions and the view from the top.',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '问题:作为一名登山者,当你登顶珠穆朗玛峰时,描述一下你的情感以及从顶峰看到的景色。'},\n",
|
||||
" {'question_id': 29,\n",
|
||||
" 'text': 'As a space colonist on Mars, describe your daily life and the challenges you face living on another planet.',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '作为火星上的太空殖民者,请描述您的日常生活以及在另一个星球上生活所面临的挑战。'},\n",
|
||||
" {'question_id': 30,\n",
|
||||
" 'text': 'Pretend to be a character in a post-apocalyptic world. Describe how you survive and the allies you encounter.',\n",
|
||||
" 'category': 'roleplay',\n",
|
||||
" 'translation': '假设您是一个末日后世界的角色。描述你是如何生存下来的,以及你遇到的盟友。'},\n",
|
||||
" {'question_id': 31,\n",
|
||||
" 'text': 'How can you determine if a restaurant is popular among locals or mainly attracts tourists, and why might this information be useful?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '问题:如何判断一家餐厅是当地人喜欢还是主要吸引游客,这个信息为何有用?'},\n",
|
||||
" {'question_id': 32,\n",
|
||||
" 'text': 'What are some subtle clues that suggest someone is pretending to understand a topic or conversation when they are actually confused or uninformed?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '有哪些不易察觉的线索,暗示某人在假装理解一个话题或对话,而实际上他们却很困惑或无知?'},\n",
|
||||
" {'question_id': 33,\n",
|
||||
" 'text': 'Why might someone choose to use a paper map or ask for directions instead of relying on a GPS device or smartphone app?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '为什么有人会选择使用纸质地图或询问路线,而不是依赖GPS设备或智能手机应用程序?'},\n",
|
||||
" {'question_id': 34,\n",
|
||||
" 'text': 'How can you determine if a person is genuinely interested in a conversation or simply being polite?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '您如何判断一个人是真的对谈话感兴趣还是只是在礼貌地应对?'},\n",
|
||||
" {'question_id': 35,\n",
|
||||
" 'text': 'Why might someone prefer to shop at a small, locally-owned business instead of a large chain store, even if the prices are higher?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '为什么有人可能更喜欢在小型、本地拥有的商店购物,而不是在大型连锁商店购物,即使价格更高呢?'},\n",
|
||||
" {'question_id': 36,\n",
|
||||
" 'text': 'How can you assess the credibility of a source of information, such as a news article or blog post, without relying solely on the reputation of the author or publisher?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '问题:在不完全依赖作者或出版商的声誉的情况下,如何评估信息来源(如新闻文章或博客文章)的可信度?'},\n",
|
||||
" {'question_id': 37,\n",
|
||||
" 'text': 'Why do some people enjoy the sensation of being scared, such as by watching horror movies or going on roller coasters, while others avoid these experiences?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '为什么有些人喜欢害怕的感觉,比如观看恐怖电影或玩过山车,而其他人却避免这些体验?'},\n",
|
||||
" {'question_id': 38,\n",
|
||||
" 'text': 'How can observing the behavior of other people in a social situation provide clues about cultural norms and expectations?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '观察社交场合中其他人的行为如何为我们提供有关文化规范和期望的线索?'},\n",
|
||||
" {'question_id': 39,\n",
|
||||
" 'text': \"Do we have a moral obligation to explore space, or should we focus on solving Earth's problems first?\",\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '我们是否有道德义务去探索太空,还是应该先集中精力解决地球上的问题?'},\n",
|
||||
" {'question_id': 40,\n",
|
||||
" 'text': 'In a world where automation is becoming increasingly prevalent, is it more important to prioritize job creation or technological progress?',\n",
|
||||
" 'category': 'common-sense',\n",
|
||||
" 'translation': '在一个自动化日益普及的世界中,是更重视创造就业机会还是技术进步?'},\n",
|
||||
" {'question_id': 41,\n",
|
||||
" 'text': 'How many times does the average human blink in a lifetime? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '一个人一生中平均眨眼多少次?请尝试解释您的答案。您的解释应该引导读者逐步了解您的推理过程。'},\n",
|
||||
" {'question_id': 42,\n",
|
||||
" 'text': 'How many atoms are in a grain of salt? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '一个盐粒中有多少个原子?请尝试解释您的答案。您的解释应该逐步引导读者了解您的推理过程。'},\n",
|
||||
" {'question_id': 43,\n",
|
||||
" 'text': 'How many lightning strikes occur on Earth each day? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:每天地球上发生多少次闪电袭击? 请尝试解释您的答案。您的解释应该一步一步地带领读者了解您的推理过程。'},\n",
|
||||
" {'question_id': 44,\n",
|
||||
" 'text': 'How many balloons would it take to lift a house like in the movie \"Up\"? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:像电影《飞屋环游记》中那样,需要多少气球来使房子升空?请尝试解释您的答案。您的解释应该引导读者逐步了解您的推理过程。'},\n",
|
||||
" {'question_id': 45,\n",
|
||||
" 'text': 'How many text messages are sent globally in a minute? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:全球一分钟内发送了多少条短信?请尝试解释您的答案。您的解释应该引导读者逐步了解您的推理过程。'},\n",
|
||||
" {'question_id': 46,\n",
|
||||
" 'text': 'How many words are spoken daily on Earth? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:每天地球上说了多少单词?尝试解释您的答案。您的解释应该引导读者一步一步了解您的推理过程。'},\n",
|
||||
" {'question_id': 47,\n",
|
||||
" 'text': 'How many snowflakes fall during a typical winter? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '在一个典型的冬天里,会有多少雪花飘落?请尝试解释您的答案。您的解释应该一步步地引导读者了解您的推理过程。'},\n",
|
||||
" {'question_id': 48,\n",
|
||||
" 'text': 'How many pages are in all the books ever written? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:所有写过的书籍共有多少页?尝试解释您的答案。您的解释应该引导读者逐步了解您的推理过程。'},\n",
|
||||
" {'question_id': 49,\n",
|
||||
" 'text': 'How many times has the Earth orbited the Sun since the beginning of life? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:自生命开始以来,地球围绕太阳已经转了多少圈?请尝试解释您的答案。您的解释应该一步一步地引导读者了解您的推理过程。'},\n",
|
||||
" {'question_id': 50,\n",
|
||||
" 'text': 'How many songs have been recorded throughout history? Try to explain your answer. Your explanation should take the reader through your reasoning step-by-step.',\n",
|
||||
" 'category': 'fermi',\n",
|
||||
" 'translation': '问题:有史以来共录制了多少首歌曲?请尝试解释您的答案。您的解释应该引导读者逐步了解您的推理过程。'},\n",
|
||||
" {'question_id': 51,\n",
|
||||
" 'text': 'What if the Internet had been invented during the Renaissance period?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '问题:如果互联网是在文艺复兴时期发明的,会怎么样?'},\n",
|
||||
" {'question_id': 52,\n",
|
||||
" 'text': 'What if the Aztecs had successfully repelled the Spanish conquistadors?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果阿兹特克人成功抵挡住了西班牙征服者,会怎么样?'},\n",
|
||||
" {'question_id': 53,\n",
|
||||
" 'text': 'What if the Black Death had not occurred in the 14th century?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果十四世纪黑死病没有发生,那会怎么样?'},\n",
|
||||
" {'question_id': 54,\n",
|
||||
" 'text': 'What if Isaac Newton had focused on biology instead of physics?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果艾萨克·牛顿专注于生物学而不是物理学,会怎么样?'},\n",
|
||||
" {'question_id': 55,\n",
|
||||
" 'text': 'What if the Beatles had never formed as a band?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果披头士乐队从未组成,会怎么样?'},\n",
|
||||
" {'question_id': 56,\n",
|
||||
" 'text': 'What if Alan Turing had not cracked the Enigma code during World War II?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '问题:如果艾伦·图灵在二战期间没有破解谜机密码,会怎么样?'},\n",
|
||||
" {'question_id': 57,\n",
|
||||
" 'text': 'What if the Suez Canal had never been constructed?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '假如苏伊士运河从未建造,会怎么样?'},\n",
|
||||
" {'question_id': 58,\n",
|
||||
" 'text': 'What if the Maya civilization had never mysteriously collapsed?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '问题:如果玛雅文明从未神秘消失,会发生什么?'},\n",
|
||||
" {'question_id': 59,\n",
|
||||
" 'text': 'What if Christopher Columbus had not discovered the Americas?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果克里斯托弗·哥伦布没有发现美洲会怎么样?'},\n",
|
||||
" {'question_id': 60,\n",
|
||||
" 'text': 'What if Vincent van Gogh had been a successful artist during his lifetime?',\n",
|
||||
" 'category': 'counterfactual',\n",
|
||||
" 'translation': '如果文森特·梵高在他的一生中成为了一位成功的艺术家,那会怎么样?'},\n",
|
||||
" {'question_id': 61,\n",
|
||||
" 'text': 'Develop a C++ program that reads a text file line by line and counts the number of occurrences of a specific word in the file.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '编写一个C++程序,逐行读取文本文件,并统计文件中特定单词出现的次数。'},\n",
|
||||
" {'question_id': 62,\n",
|
||||
" 'text': 'Implement a Python function to find the longest common subsequence of two input strings using dynamic programming.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '问题:使用动态规划实现一个 Python 函数,用于查找两个输入字符串的最长公共子序列。'},\n",
|
||||
" {'question_id': 63,\n",
|
||||
" 'text': 'Implement a regular expression in Python to validate an email address.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '在 Python 中实现一个正则表达式来验证电子邮件地址。'},\n",
|
||||
" {'question_id': 64,\n",
|
||||
" 'text': 'Write a program to find the nth Fibonacci number using dynamic programming.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '编写一个使用动态规划查找第n个斐波那契数的程序。'},\n",
|
||||
" {'question_id': 65,\n",
|
||||
" 'text': 'Implement a binary search algorithm to find a specific element in a sorted array.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '问题:实现一个二分搜索算法,在一个已排序的数组中查找特定元素。'},\n",
|
||||
" {'question_id': 66,\n",
|
||||
" 'text': 'Implement a queue data structure using two stacks in Python.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '问题:使用Python中的两个栈实现一个队列数据结构。'},\n",
|
||||
" {'question_id': 67,\n",
|
||||
" 'text': 'Implement a program to find the common elements in two arrays without using any extra data structures.',\n",
|
||||
" 'category': 'coding',\n",
|
||||
" 'translation': '问题:实现一个程序,找出两个数组中的公共元素,不使用任何额外的数据结构。'},\n",
|
||||
" {'question_id': 68,\n",
|
||||
" 'text': 'Given that f(x) = 5x^3 - 2x + 3, find the value of f(2).',\n",
|
||||
" 'category': 'math',\n",
|
||||
" 'translation': '已知f(x) = 5x^3 - 2x + 3,请求出f(2)的值。'},\n",
|
||||
" {'question_id': 69,\n",
|
||||
" 'text': 'Solve for x in the equation 3x + 10 = 5(x - 2).',\n",
|
||||
" 'category': 'math',\n",
|
||||
" 'translation': '求解方程 3x + 10 = 5(x - 2) 中的 x。'},\n",
|
||||
" {'question_id': 70,\n",
|
||||
" 'text': 'If the endpoints of a line segment are (2, -2) and (10, 4), what is the length of the segment?',\n",
|
||||
" 'category': 'math',\n",
|
||||
" 'translation': '如果线段的端点是(2,-2)和(10,4),那么线段的长度是多少?'},\n",
|
||||
" {'question_id': 71,\n",
|
||||
" 'text': 'Can you help me write a formal email to a potential business partner proposing a joint venture?',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:您能帮我写一封正式的邮件给潜在的商业伙伴,提议共同合作吗?'},\n",
|
||||
" {'question_id': 72,\n",
|
||||
" 'text': 'Can you help me write a resignation letter to my current employer, while leaving on good terms and expressing gratitude for the opportunities provided?',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '您能帮我写一封辞职信给我现在的雇主吗?在保持良好关系的同时,表达对他们提供的机会的感激之情。'},\n",
|
||||
" {'question_id': 73,\n",
|
||||
" 'text': 'Use an appropriate format to structure a formal letter of recommendation for a student applying to a prestigious graduate program in computer science.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:请使用适当的格式来为申请著名计算机科学研究生项目的学生撰写一封正式的推荐信。'},\n",
|
||||
" {'question_id': 74,\n",
|
||||
" 'text': 'Write a compelling product launch announcement email to inform our customers of our new software solution.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '問題:編写一封引人注目的产品发布公告电子邮件,以通知我们的客户我们的新软件解决方案。'},\n",
|
||||
" {'question_id': 75,\n",
|
||||
" 'text': 'Draft an apology email to a customer who experienced a delay in their order, and provide reassurance that the issue has been resolved.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:草拟一封致歉邮件,给一位订单延迟的客户,并向他们保证问题已得到解决。'},\n",
|
||||
" {'question_id': 76,\n",
|
||||
" 'text': 'Write a script for a YouTube video exploring the history and cultural significance of jazz.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:为一个探讨爵士乐历史和文化意义的YouTube视频编写剧本。'},\n",
|
||||
" {'question_id': 77,\n",
|
||||
" 'text': 'Compose an engaging travel blog post about a recent trip to Hawaii, highlighting cultural experiences and must-see attractions.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:请撰写一篇关于最近一次夏威夷之旅的吸引人的旅行博客文章,强调文化体验和必游景点。'},\n",
|
||||
" {'question_id': 78,\n",
|
||||
" 'text': 'Write a captivating movie review for a recently released science fiction film, discussing its plot, characters, and special effects.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:请为最近上映的一部科幻电影撰写一篇引人入胜的影评,讨论其情节、角色和特效。'},\n",
|
||||
" {'question_id': 79,\n",
|
||||
" 'text': 'Structure a podcast script for an episode discussing the influence of streaming platforms on the music industry.',\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:请构建一个播客剧本,用于讨论流媒体平台对音乐产业的影响。'},\n",
|
||||
" {'question_id': 80,\n",
|
||||
" 'text': \"Write a symphony concert review, discussing the orchestra's performance and overall audience experience.\",\n",
|
||||
" 'category': 'writing',\n",
|
||||
" 'translation': '问题:撰写一篇交响音乐会评论,讨论乐团的表现和观众的整体体验。'}]"
|
||||
]
|
||||
},
|
||||
"execution_count": 22,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"vicuna_eval_set"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c4f18782-a499-4893-975b-637cf68257e0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Save the translated Vicuna eval questions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"id": "47634433-a146-44b0-94c8-1f7622694a32",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!mkdir -p /home/ubuntu/cloudfs/ghost_data/anima_eval/"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 24,
|
||||
"id": "f696afce-6cdc-4ce8-8f93-0628a9e69775",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import json\n",
|
||||
"\n",
|
||||
"a = vicuna_eval_set\n",
|
||||
"\n",
|
||||
"save_path = \"/home/ubuntu/cloudfs/ghost_data/anima_eval/translated_vicuna_eval_set.json\"\n",
|
||||
"\n",
|
||||
"with open(save_path, 'w', encoding='utf-8') as handle:\n",
|
||||
" json.dump(a, handle, ensure_ascii=False)\n",
|
||||
"\n",
|
||||
"with open(save_path, 'r', encoding='utf-8') as handle:\n",
|
||||
" b = json.load(handle)\n",
|
||||
"\n",
|
||||
"assert a == b"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "642ba168-cb1b-4cc7-9454-ec0a548bba37",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.8.16"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
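The notebook above saves each question as a record with `question_id`, `text`, `category`, and `translation`, and several of the GPT4 translations in its output begin with a stray "问题:" label (one uses the traditional-character form "問題:"). The following is a minimal sketch, not part of the original notebook, of how the saved file could be loaded and those labels stripped before evaluation. The helper names `strip_stray_prefix`, `load_eval_set`, and `count_by_category` are hypothetical, and the prefix list is an assumption based only on the output shown above.

```python
import json
from collections import Counter

# Assumption: these are the only stray labels, as seen in the notebook output
# ("问题:" and its traditional-character variant "問題:").
STRAY_PREFIXES = ("问题:", "問題:")


def strip_stray_prefix(translation: str) -> str:
    """Remove a leading "问题:"-style label, if present."""
    for prefix in STRAY_PREFIXES:
        if translation.startswith(prefix):
            return translation[len(prefix):].lstrip()
    return translation


def load_eval_set(path: str) -> list:
    """Load the saved records and return copies with cleaned translations."""
    with open(path, "r", encoding="utf-8") as handle:
        records = json.load(handle)
    return [
        {**record, "translation": strip_stray_prefix(record["translation"])}
        for record in records
    ]


def count_by_category(records: list) -> Counter:
    """Count how many questions fall into each category."""
    return Counter(record["category"] for record in records)
```

Calling `load_eval_set` on the notebook's `save_path` (or on `data/translated_vicuna_eval_set.json` from the repository) would return the cleaned records, and `count_by_category` gives a quick check of the category distribution. Whether the labels should be stripped at all is a judgment call, since they are part of what GPT4 actually returned.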
1
data/translated_vicuna_eval_set.json
Normal file
File diff suppressed because one or more lines are too long
12795
eval/elo_tournanment_all_models_on_translated_vicuna.ipynb
Normal file
File diff suppressed because one or more lines are too long