Maximize Your Scientific Writing Potential with Prompting Techniques
Prepare to enhance your scientific writing and produce your ideal text using Large Language Models (LLMs)!
The excitement surrounding AI began several years ago, and the debut of ChatGPT in late 2022 remains fresh in our minds. Since then, the use of LLMs has surged across numerous fields, both personal and professional. While LLMs excel at various tasks, their standout ability is text generation.
Initially, I was not very enthusiastic about writing scientific papers during my undergraduate studies. I was often dissatisfied with my work due to my perfectionist tendencies. Like many newcomers, I was still mastering the craft, and none of us were experts at the onset.
I frequently struggled to articulate my thoughts as I wanted, and my punctuation left much to be desired. Fortunately, after years of practice, I can now produce texts that meet acceptable scientific standards. I write faster today, not only due to my learning but also thanks to LLMs that assist with phrasing and punctuation corrections.
The speed I achieve now is a combination of my university knowledge and the use of LLMs. Although LLM outputs may not always be perfect, possessing linguistic skills and knowing how to craft effective prompts for scientific writing can significantly enhance your productivity.
What Constitutes Scientific Writing?
Scientific writing is a critical step following thorough research on a topic. Its primary purpose is to create a text that objectively conveys information and elucidates facts. Scientific texts are inherently non-fictional and should reflect extensive research or work. The structure must be coherent, and typographical errors are unacceptable. Proper grammar is essential to convey credibility and respectability, alongside logical coherence.
This type of writing is prevalent in professional settings, academia, and scientific research.
Understanding Prompting
In the context of LLMs, a prompt refers to the input provided to generate a specific output. Typically text-based, prompts may include words, special characters, numbers, or links.
Users often employ straightforward inputs akin to those used in Google searches. However, to achieve superior results for our specific needs, it’s crucial to learn how to formulate effective prompts.
What Exactly Is an LLM?
A Large Language Model (LLM) is grounded in Machine Learning and comprises four essential components:
- Features: These are the aspects the model evaluates to predict an outcome or category.
- Labels: These are the target values the model learns to predict, such as the classification of an image or expected visitor numbers.
- Initial Knowledge: The model begins without any prior knowledge, capable only of making guesses.
- Optimizer: This component compares the model's predictions with the labels and adjusts the model to reduce the error.
Through numerous training iterations, the model learns to interpret features accurately and improve its predictions. The introduction of the Attention mechanism, described in the pivotal paper “Attention Is All You Need” by Vaswani et al., revolutionized AI models by enabling them to weigh the relationships between tokens in context.
Afterward, techniques such as Masked Language Modeling, RAG (Retrieval-Augmented Generation), and RLHF (Reinforcement Learning from Human Feedback) built on the Transformer architecture and contributed to the development of contemporary LLMs.
I won’t delve into the technical details here; my focus is on how to use LLMs effectively rather than how to build them. There are numerous LLMs and tools available, such as ChatGPT, Gemini, Claude 3, Phind, and Llama 3.
Don't stress about which LLM to choose; simply select the one that best fits your needs. For this discussion, I’ll use ChatGPT with the free GPT-4o model.
Using Prompting in Scientific Writing
We aim to avoid basic zero-shot prompting, where the LLM receives a vague task without sufficient context.
For effective scientific prompting, the LLM must comprehend the task, which necessitates clear communication to prevent misunderstandings.
We should focus on:
- Clear and logical language
- Avoiding ambiguity
- Outlining task steps
- Defining success criteria
However, the LLM cannot infer this context independently; it must be included in our prompt. We will employ structured prompting, where formal and content-specific prompts are meticulously organized.
To achieve this, our prompt should include:
- Role: What role will the LLM assume?
- Addressee: Who is the intended audience for the text?
- Style: In what style should the text be composed?
- Context: Specify the timeframe and any relevant instructions.
Once these elements are established, we can relay our instructions to the LLM.
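As an illustration, the four elements can be assembled into one structured prompt. The following sketch is hypothetical: the function name, wording, and example values are my own, not a fixed template.

```python
def build_structured_prompt(role, addressee, style, context, task):
    """Assemble a structured prompt from role, addressee, style, and context."""
    return (
        f"Role: You are {role}.\n"
        f"Addressee: The text is written for {addressee}.\n"
        f"Style: Please write in {style}.\n"
        f"Context: {context}\n"
        f"Task: {task} Please limit your answer to 50 words."
    )

prompt = build_structured_prompt(
    role="an experienced scientific editor",
    addressee="readers of a peer-reviewed journal",
    style="a formal, objective scientific register",
    context="The text is the introduction of a paper on soil ecology.",
    task="Rewrite the following paragraph for clarity and grammar.",
)
print(prompt)
```

The closing sentence caps the output length, a point discussed below.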
One further recommendation: cap the length of the LLM's output, for example at 50 words.
Why is this important?
LLMs often provide not only the answer to your question but also additional information. Therefore, capping the output can enhance focus.
Politeness in prompts also matters. Including words like "please" not only shows courtesy but can also lead to better outcomes: research indicates that polite prompts yield higher-quality results.
For further reading, I suggest the following paper:
Ziqi Yin et al. (2024): “Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance,” https://doi.org/10.48550/arXiv.2402.14531
Handling Complex Tasks with Chained Prompting
If your task involves multiple steps, use chained prompting.
For instance, a simple prompt like “Correct my text” may yield superficial feedback. Instead, we want the LLM to address specific aspects such as formatting, voice, and modality. It’s advisable to separate these tasks into distinct prompts rather than combining them.
Sentence Structure
- Use main clauses with limited subordinate clauses
- Establish clear references between clauses
- Write complete sentences
- Avoid overly lengthy lists
Passive/Active Voice
- Excessive passive voice can disrupt readability
- Favor active voice formulations
- To convert passive to active, identify the actor as the subject and choose appropriate verbs
Modality
- Modal verbs and subjunctive forms alter the subjectivity of statements
- Ensure targeted usage
- Evaluate each instance for suitability
Modal verbs: should, can, must, may, need
Subjunctive: could, would have, would be
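Put together, these three review passes can be run as a chain, with each prompt receiving the text produced by the previous step. The sketch below is hypothetical: `ask_llm` stands in for whatever model call you use (here it simply returns the text unchanged, so the chaining logic is runnable without an API).

```python
def ask_llm(prompt: str) -> str:
    # Placeholder for a real model call; extracts and returns the text
    # unchanged so the chaining pattern can be demonstrated offline.
    return prompt.split("TEXT:\n", 1)[1]

# One focused instruction per pass, instead of one combined prompt
steps = [
    "Please fix the sentence structure: prefer main clauses and clear references.",
    "Please replace unnecessary passive voice with active formulations.",
    "Please check every modal verb and subjunctive form for targeted usage.",
]

text = "The experiment was conducted by us, and results could maybe be seen."
for instruction in steps:
    # Each step works on the output of the previous one
    text = ask_llm(f"{instruction}\nTEXT:\n{text}")

print(text)
```

Separating the passes keeps each prompt focused on a single, verifiable change.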
Enhancing Structure with Markdown
Few realize that you can structure prompts using Markdown.
Markdown is an intuitive formatting language. If interested, visit the Markdown guide for a comprehensive cheatsheet of formatting commands: https://www.markdownguide.org/cheat-sheet/
Markdown allows for better organization of prompts through paragraphs and formatting, enabling the inclusion of examples to optimize results.
Note that you can include multiple examples in your prompt.
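A Markdown-structured prompt might look like the following sketch; the headings and the input/output example are my own, purely illustrative. The prompt is held in a Python string here only so it can be printed.

```python
# A prompt organized with Markdown headings and a few-shot example
markdown_prompt = """\
# Role
You are a scientific copy editor.

# Task
Please correct grammar and punctuation without changing the content.

# Example
**Input:** the data shows, that growth were slow
**Output:** The data show that growth was slow.

# Text
Insert the text to be corrected here.
"""
print(markdown_prompt)
```

Additional `# Example` sections can be appended in the same way to give the model several reference cases.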
The Benefits and Risks of Prompting in Scientific Writing
Why should you utilize prompting in your scientific writing? The world is increasingly complex and demands swift adaptation.
Few individuals possess the innate ability to write proficiently without considerable training. However, with practice and the aid of tools like LLMs, anyone can achieve a high level of competence.
Thus, utilizing LLMs for your scientific writing, particularly if you have doubts about your skills, is advisable.
That said, LLMs should not be used to generate complete texts. Instead, they can assist in formulating bullet points, simplifying content, correcting errors, and enhancing clarity.
Please be aware that LLMs should not replace expert knowledge in content-critical areas. We, as users, provide the data, context, and intent, and it is imperative to verify the output. Just because a response appears correct does not guarantee its accuracy.
LLMs only know what is included in their training data; outdated data will yield outdated responses.
“[…] it’s merely stating things that ‘sound right’ based on its training material.” (Stephen Wolfram (2023), “What Is ChatGPT Doing … and Why Does It Work?”)
Lastly, when employing LLMs for scientific writing, always clarify whether and which generative models were used, for what purpose, and to what extent.
Example for this article:
In preparing this text, I used ChatGPT (GPT-4o) and DeepL. ChatGPT helped refine the structural coherence of my sentences and correct grammatical errors, while DeepL served as a lexical tool for translating German terms into English. I affirm that this text is my original work and not solely generated by a computer program.
Important Reminder! Avoid relying on LLMs for writing tasks when short on time, as this increases the likelihood of errors and misapplications in content-sensitive areas.
Using LLMs for scientific writing does not necessarily expedite the writing process; time is still required for corrections and familiarization with both the LLM and the topic.
Conclusion
- Notable LLMs include ChatGPT, Gemini, Claude 3, Phind, and Llama 3
- Steer clear of basic zero-shot prompting
- Implement structured prompting and provide context
- Use chained prompting for multi-step tasks
- Incorporate Markdown for enhanced structure
- Always specify whether and which models were employed for what purpose, and to what extent
- Consistently verify the output generated by the model
- Avoid using LLMs for content-sensitive areas
If this topic interests you, consider following my work on information and knowledge management.
Feel free to share your experiences with LLMs in your work or studies in the comments for an engaging discussion!