Version: v3.0 print this page

SQL AI - Your SQL Assistant

SQL AI, your intelligent SQL assistant in the Amorphic Cloud Platform! This powerful feature leverages advanced Large Language Models (LLMs) to simplify your data querying experience. Whether you're a SQL expert or new to database queries, SQL AI helps you interact with your datasets using natural language, making data exploration more intuitive and accessible.

With SQL AI, you can:

Convert natural language questions into SQL queries
Generate SQL code without writing complex syntax
Get explanations for existing SQL queries
Receive suggestions for exploring your datasets
Interact with your data using plain English

This guide will walk you through everything you need to know to leverage SQL AI's capabilities effectively.

sql-ai

Getting Started

Overview

SQL AI is a generative AI add-on that bridges the gap between natural language and SQL. It enables you to:

Ask questions about your data in plain English
Generate ready-to-use SQL queries from conversational prompts
Understand existing SQL code through clear explanations

The system supports domain-specific queries, ensuring datasets in different domains are easily distinguishable, even if they share the same name.

Supported Data Sources

SQL AI currently supports:

S3-Athena datasets
- Standard S3 datasets accessible through AWS Athena
- Supports querying structured data in various formats (Parquet, ORC, CSV)
Redshift datasets
- Data stored in Amazon Redshift clusters
- Supports both provisioned and Redshift Serverless workgroups
System datasets
- Datasets created using Amorphic's system features
- Includes sys_cost_analysis_table, api-sys_cloudtrail_logs, etc. tables to query upon (System Datasets)

Note

All datasets must be properly registered in the Amorphic platform to be accessible via SQL AI. For optimal performance, ensure that your datasets have complete and accurate metadata configured.

Key Features

Natural Language to SQL Translation

SQL AI leverages powerful LLMs to interpret your inquiries in natural language, converting them into executable SQL queries. This reduces the need for technical expertise, allowing you to ask questions in plain language while receiving the corresponding SQL code or results.

Feature	Description
Natural language processing	Converts English questions into SQL queries
Domain-specific queries	Distinguishes between datasets with the same name in different domains
Case-sensitive querying	Matches resource names exactly as created

Note

When querying datasets with the same name but in different domains, use [domain-name].[dataset-name] for accurate results. (Querying is case-sensitive; resource names must match their exact case as created.)
Alternatively, you can select the dataset from the dropdown list.
It is recommended to use clear column names and descriptions in datasets to provide better context, improving query accuracy.
Generated queries are based on the schema of the data source and not the actual data.

Interactive Query Categories

SQL AI supports three main categories of interaction:

Category	Description	Example
Natural Language Suggestions	Get insights and suggestions about your data	"Suggest some queries I can run on the sales dataset."
Query Generation	Convert natural language into SQL code	"Generate a query to find the top 10 salespeople by revenue."
Query Explanation	Understand what existing SQL code does	"Explain this query: SELECT * FROM orders WHERE order_date > '2023-01-01'"

Once a query is generated, you can review it and execute it directly from the playground using the Run Query button in the right corner of the query tab.

Model Customization

SQL AI offers flexibility in choosing between multiple LLM models for query generation and explanation:

Available Models	Characteristics
Claude 3 Haiku	Faster response time, good for simple queries
Claude 3 Sonnet	More comprehensive responses, better for complex queries

Select the model that best suits your needs based on query complexity and response time requirements.

Resource Selection

You can prompt SQL AI on resources in the following ways:

Select the resource from the dropdown list
Enter the dataset name in your prompt
Enter the domain name followed by the dataset name in your prompt

Best Practices

At least one resource must be selected for query suggestions or generation-related prompts to work
For optimal results, select only the resources relevant to your question
Selecting more than 20 resources is not recommended, as responses may become inaccurate due to context limitations

Use Cases

Natural Language Suggestions

When you need ideas for exploring your data, SQL AI can suggest useful queries.

Example Prompt: "Suggest some queries I can run on the customer_orders dataset."

SQL AI Response: Here are some queries you might find useful:

"Show me the total number of orders by customer"

"Find the average order value by month"

"Identify customers who haven't placed an order in the last 6 months"

Natural language processing

SQL Generation

When you need to query your data but prefer not to write SQL code manually:

Example Prompt: "Generate a query to retrieve top 10 salespeople sorted in ascending order by sales amount."

SQL AI Response:
SELECT
    salesperson_name,
    SUM(sales_amount) as total_sales
FROM
    sales_domain.sales_data
GROUP BY
    salesperson_name
ORDER BY
    total_sales ASC
LIMIT 10;

SQL Generation

SQL Explanation

When you need to understand an existing SQL query:

Example Prompt: "Explain this query: SELECT * FROM test_domain.test_dataset WHERE DATE_FORMAT(ModifiedDate, '%Y') = '2006';"

SQL AI Response: This query is retrieving all columns from the table test_dataset in the test_domain schema where the year part of the ModifiedDate column equals '2006'.

Breaking it down:

SELECT * retrieves all columns

FROM test_domain.test_dataset specifies the data source

WHERE DATE_FORMAT(ModifiedDate, '%Y') = '2006' filters for records from the year 2006

The DATE_FORMAT function extracts just the year portion from the date

SQL Explanation

Note

Occasionally, implementing the suggested responses verbatim may not produce optimal results. For best outcomes, please incorporate prefixes such as 'Generate a query or SQL' when formulating your questions.

Limitations and Considerations

Current Limitations

The chatbot is limited to SQL-related actions mentioned above and cannot answer questions outside this context
Generated queries might not always be correct; precise prompts lead to more precise queries
Follow-up prompting on query results is not optimal
Error identification and correction in generated SQL queries is not currently supported
For Redshift resources outside the default tenant, you need to manually add the tenant name as the generated queries use only the <domain>.<table_name> format instead of the required <tenant_name>.<domain_name>.<table_name> syntax

Tips for Better Results

To get the most out of SQL AI, consider these best practices:

Be specific in your prompts - Clearly state what data you're looking for
Mention column names when possible to improve query accuracy
Start simple and build complexity through follow-up prompts
Review generated queries before execution to ensure they match your intent
Provide context about your data and what you're trying to achieve

Improving Query Quality

Using clear and descriptive column names in your datasets will significantly improve SQL AI's ability to generate accurate queries. Consider adding comments and descriptions to your data schema when possible.

Getting Started​

Overview​

Supported Data Sources​

Key Features​

Natural Language to SQL Translation​

Interactive Query Categories​

Model Customization​

Resource Selection​

Use Cases​

Natural Language Suggestions​

SQL Generation​

SQL Explanation​

Limitations and Considerations​

Tips for Better Results​

Getting Started

Overview

Supported Data Sources

Key Features

Natural Language to SQL Translation

Interactive Query Categories

Model Customization

Resource Selection

Use Cases

Natural Language Suggestions

SQL Generation

SQL Explanation

Limitations and Considerations

Tips for Better Results