Building a Text-to-SQL App with Google Gemini in Just a Few Steps
Written on
Today, we will dive into the process of developing a straightforward application that enables users to generate SQL queries from simple text inputs. This tool permits users to enter plain English and receive the corresponding SQL code, complete with an example output and an explanation of the generated code.
In this tutorial, we will use Streamlit for the frontend, which provides an interactive user experience with minimal coding required. If you're unfamiliar with Streamlit, feel free to check out my previous article for a beginner's overview. Additionally, we will implement Google Gemini, which is the core of our application, utilizing a Large Language Model (LLM) to transform user inputs into SQL queries, serving as the backend of our tool.
The application's workflow is illustrated in the following image, where the user submits a prompt, and Gemini Pro (LLM) interprets it, generates a SQL query, and returns a response.
For the sake of simplicity, in this tutorial, I will assume we have a database and will only convert user input into SQL queries and provide explanations. However, we can also connect to our database to streamline the data retrieval process, eliminating the need for complex SQL query writing.
First, we need to install the required packages using pip as follows:
# Installing the necessary libraries
pip install google-generativeai streamlit
Next, we will import the necessary libraries and start constructing our web application with the following code.
# Importing the required libraries
import streamlit as st
import google.generativeai as genai
# Setting the web app title and selecting a wide layout (optional)
st.set_page_config(page_title='Text-To-SQL APP', page_icon=None)
# Defining column sizes
col1, col2 = st.columns((0.3, 1.7))
col1.image('text_to_sql_logo.jpeg')
col2.markdown("# :rainbow[ SQL QUERY AI ASSISTANT APP]")
st.write("#### :blue[ This is SQL Query Generator Web App Using Google Gemini! ]")
query_input = st.text_area('Please enter your prompt using simple English')
submit = st.button("Generate SQL")
This code constructs the desired interface for the web application. When executed, it produces a user-friendly interface that allows users to enter their SQL requests in straightforward English.
Next, we will utilize Google Gemini, the LLM that converts the input text into SQL queries. To do this, we need to obtain the Google Gemini API key, which can be acquired from makersuite.google.com. After generating our API key, we can write the following code:
# Defining the API key and loading Gemini Pro
genai.configure(api_key="API_KEY") # Replace with your actual API key
model = genai.GenerativeModel('gemini-pro')
# This information assists the LLM model in generating accurate SQL queries
supportive_info1 = ["Based on the prompt text, create a SQL query, and ensure to exclude ''' at the start and end."]
# If submit is clicked
if submit:
# Creating the columns side by side
with st.spinner("Generating.."):
st.write("##### 1. The Generated SQL Query Code :")
response = model.generate_content([supportive_info1[0], query_input])
st.code(response.text)
The code above employs the Gemini model to convert user text into SQL queries. To enhance the accuracy of the output, we provide context or background information, like the supportive_info1 variable defined earlier. Now, let's test our first query as follows:
# Sample query question -1
get the name, age and gender from the attendance table who are only above 25 years old
We successfully generated our first SQL result, as depicted above. To provide an explanation of the generated SQL query alongside example outputs, we can adjust our original code by adding a few more lines.
# Defining supportive information for the Gemini LLM model
supportive_info2 = ["Based on the SQL query code, create an example input dataframe before the SQL query code is applied and the output dataframe after the SQL query is applied."]
supportive_info3 = ["Explain the SQL query in detail without any example output."]
# Using the model to generate content based on the supportive information and query input provided
response = model.generate_content([supportive_info1[0], query_input])
response2 = model.generate_content([supportive_info2[0], response.text])
response3 = model.generate_content([supportive_info3[0], response.text])
# If submit is clicked
if submit:
with st.spinner("Generating.."):
st.write("##### 1. The Generated SQL Query Code :")
response = model.generate_content([supportive_info1[0], query_input])
st.code(response.text)
st.write("##### 2. A Sample Of Expected Output :")
response2 = model.generate_content([supportive_info2[0], response.text])
st.write(response2.text)
st.write("##### 3. Explanation of the SQL Query code generated :")
response3 = model.generate_content([supportive_info3[0], response.text])
st.write(response3.text)
In this updated code, we added further supportive information to include explanations and sample outputs for the generated query. Let's give it a try.
As demonstrated, we received the SQL query, sample output, and a comprehensive explanation of the generated code. Now, let's test it with a slightly more complex query as follows:
# Sample query Question-2
get the genre and rating from the movies table, grouping them by genre and ordering by the average of rating in descending order
Conclusion
That's all for today! Thank you for reading. We have covered the steps involved in creating a Streamlit-based application that converts user text inputs into SQL queries. I hope you found this tutorial helpful. Stay tuned for more engaging and educational articles related to AI, data science, and data analysis. Your support with 10+ claps is greatly appreciated!
You can refer to the complete code below, or find it in my GitHub repository along with all necessary files.