KnowledgeBase - Your Intelligent Data Repository
Welcome to KnowledgeBase, your intelligent data repository in the Amorphic Cloud Platform. This feature uses AI to create, manage, and query knowledge bases that turn your datasets into searchable, AI-ready information repositories. Whether you're building a document search system, creating a Q&A interface, or organizing enterprise knowledge, KnowledgeBase provides the tools to make your data easier to find and query.
With KnowledgeBase, you can:
- Create and manage knowledge bases with multiple data sources
- Sync and index the files in your datasets and domains
- Query knowledge bases using natural language
- Get intelligent responses based on your data content
- Track indexing metrics and sync status
- Manage access permissions and resource associations
This guide will walk you through everything you need to know to leverage KnowledgeBase's capabilities effectively.
Knowledge Base Operations
Amorphic provides the following operations for Knowledge Base:
Operation | Description |
---|---|
Create Knowledge Base | Creates a knowledge base in Amazon Bedrock along with the other AWS resources it needs. |
View Knowledge Base | View the details of an existing knowledge base. |
Update Knowledge Base | Update an existing knowledge base configuration. |
Add Sources | Add new data sources to an existing knowledge base. |
Sync Knowledge Base and Sources | Sync data sources in a knowledge base to update indexed content. |
View Sync Status | View sync status and metrics for a knowledge base. |
Query Knowledge Base | Query a knowledge base using natural language. |
Remove Sources | Remove data sources from a knowledge base. |
Delete Knowledge Base | Delete an existing knowledge base. |
Getting Started
Overview
KnowledgeBase is an AI-powered data repository system that enables you to:
- Transform your datasets and files into searchable knowledge bases
- Query your data using natural language
- Get intelligent responses based on your actual data content
- Sync and index new or updated data
- Manage multiple sources within a single knowledge base
The system integrates with Amazon Bedrock to provide natural language processing and retrieval capabilities, making your data more accessible and useful.
All sources must be properly registered in the Amorphic platform and accessible to your user account. The system automatically handles file format detection and content extraction during the indexing process.
Key Features
Knowledge Base Management
KnowledgeBase provides comprehensive management capabilities for creating, updating, and maintaining your data repositories.
Feature | Description |
---|---|
Knowledge base creation | Create new knowledge bases with custom names and descriptions |
Source association | Attach multiple datasets or domains to a knowledge base |
Sync management | Sync and index sources with status tracking |
Access control | Manage permissions and user access to knowledge bases |
Metrics tracking | Monitor indexing statistics and sync performance |
- Knowledge bases are created with unique identifiers and can contain multiple data sources
- Sync operations are performed sequentially to avoid conflicts
- You can choose to sync either the entire knowledge base or individual sources one at a time
- All operations are logged for audit and compliance purposes
- Access permissions are inherited from the underlying data sources
Natural Language Querying
KnowledgeBase leverages advanced LLMs to enable natural language interactions with your data:
Feature | Description |
---|---|
Natural language processing | Query your data using plain English |
Context-aware responses | Get answers based on your actual data content |
Chunk-based retrieval | Intelligent document chunking for better responses |
Response formatting | Structured responses with source attribution |
Source Synchronization
The system provides robust synchronization capabilities for keeping your knowledge bases up-to-date:
Feature | Description |
---|---|
Indexing | Detect and index new or modified files |
Incremental sync | Only process changed content for efficiency |
Status tracking | Monitor sync progress and completion status |
Error handling | Retry logic with exponential backoff |
Email notifications | Notify owners and editors of sync completion |
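The incremental sync behavior in the table above can be pictured as a comparison between the previously indexed state and the current file listing. The following is a minimal sketch, assuming files are identified by path and a content hash; `plan_incremental_sync` and its inputs are hypothetical and are not part of the Amorphic API.

```python
def plan_incremental_sync(indexed: dict, current: dict) -> dict:
    """Compare two {path: content_hash} snapshots and classify each file.

    `indexed` is the state recorded by the previous sync, `current` is the
    listing seen now. Only new and modified files need to be re-indexed.
    """
    new = sorted(p for p in current if p not in indexed)
    modified = sorted(
        p for p in current if p in indexed and current[p] != indexed[p]
    )
    deleted = sorted(p for p in indexed if p not in current)
    return {"new": new, "modified": modified, "deleted": deleted}


previous = {"a.txt": "h1", "b.txt": "h2", "c.txt": "h3"}
latest = {"a.txt": "h1", "b.txt": "hX", "d.txt": "h4"}
plan = plan_incremental_sync(previous, latest)
print(plan)
```

In this sketch the unchanged file `a.txt` is skipped entirely, which is the efficiency gain that incremental sync aims for. The three buckets loosely correspond to the New Files Indexed, Modified Files Indexed, and Files Deleted metrics.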
Create Knowledge Base
To create a Knowledge Base:
- Navigate to the Explore section in the left sidebar
- Click on AI Space to expand the menu
- Select Knowledge Bases from the available options
- Click on + Create Knowledge Base
- Fill in the details shown in the table:
Attribute | Description |
---|---|
Knowledge Base Name | Give your knowledge base a unique name. |
Description | Describe the knowledge base's purpose and relevant details. |
Models | Select the Embedding Models enabled in your account. These models convert text into numerical vectors for semantic search. |
Keywords | Add relevant keywords to the knowledge base. |
Guardrail | Select a relevant guardrail for the knowledge base. If no guardrail is specified, the system will apply a default guardrail automatically. |
Access Control | Configure access permissions for the knowledge base. By default, the creator has full access. |
- Provide a clear and detailed description that accurately reflects the content and purpose of your knowledge base sources
- A well-written description helps users understand the knowledge base scope and enables agents to effectively access and utilize the information
- Include key topics, data types, and intended use cases in the description for better discoverability
View Knowledge Base
The Knowledge Base details page shows your knowledge base's configuration, metrics, and activity history. The page is organized into three main tabs: Overview, Sources, and Runs. The table below describes these tabs and the Activity Logs.
Tab | Component | Description |
---|---|---|
Overview | Basic Information | • Knowledge Base Name: Unique identifier • Description: Purpose and content details • Created: Creator and creation date • Updated: Last modifier and modification date |
Overview | Model Information | • Model: Embedding model (e.g., amazon.titan-embed-text-v2:0) • Last Synced: Most recent sync timestamp • Last Synced Status: Current sync state (SUCCEEDED/FAILED/IN_PROGRESS) |
Overview | Keywords | Associated tags and owner information |
Overview | Donut Chart Metrics | • Sources Attached: Connected datasets/domains • Files Scanned: Total processed files • Files Deleted: Removed files • Files Failed: Failed indexing attempts • Metadata Files Scanned: Processed metadata files • Metadata Files Modified: Updated metadata files • Modified Files Indexed: Re-indexed existing files • New Files Indexed: Successfully indexed new files |
Overview | Summary Cards | • Sources Added: Total attached sources • Latest Files Processed: Recent processing count • Latest Files Indexed: Recent indexing success count • Latest Files Failed: Recent indexing failure count |
Sources | Source Management | Information about connected data sources and their status |
Runs | Sync Operations | Details about synchronization operations and their outcomes |
Activity Logs | Timeline Events | • Creation events • Source addition records • Sync operation history • Knowledge base modifications. Each log entry includes the user who performed the action, the action description, and the timestamp |
- The Overview tab provides the most comprehensive view of your knowledge base status and performance
- Use the Test Knowledge Base button to verify your knowledge base is working correctly
- Monitor the Activity Logs to track all changes and operations performed on your knowledge base
- The metrics help you understand the scope and health of your indexed content
Update Knowledge Base
To update a Knowledge Base description:
- Navigate to the Knowledge Base details page
- Click on the Edit action button
- Update the description field with your new text
- Click Save to apply the changes
Only the description field can be modified after a knowledge base is created. The name and model configuration cannot be changed.
Add Sources
To add sources to your Knowledge Base:
- Navigate to the Knowledge Base details page
- Click the Add Source button
- Select your source type (Dataset or Domain)
- Configure the required fields
- Click Save to attach the source
The following fields need to be configured when adding a source:
Field | Description |
---|---|
Source Type | Select between Dataset or Domain as the source type |
Name | Provide a unique identifier for the source |
Description | Add details about the source content and purpose |
Chunking Strategy | Select one of FIXED_SIZE, HIERARCHICAL, SEMANTIC, or NONE. Additional parameters are required depending on the selected strategy |
Parsing Strategy | Select the appropriate parsing strategy for your content type |
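To give a feel for what a FIXED_SIZE chunking strategy does, here is a rough sketch. It assumes whitespace-separated words stand in for model tokens, and the parameter names `max_tokens` and `overlap_percentage` are illustrative; the actual parameters requested by the Add Source form may differ.

```python
def fixed_size_chunks(text: str, max_tokens: int = 300,
                      overlap_percentage: int = 20) -> list:
    """Split text into chunks of at most `max_tokens` words.

    Consecutive chunks share roughly `overlap_percentage` percent of their
    words so that sentences cut at a boundary still appear whole somewhere.
    Words are a stand-in for tokens (an assumption of this sketch).
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in [0, 100)")

    tokens = text.split()
    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # always >= 1 given the checks above

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = fixed_size_chunks(sample, max_tokens=4, overlap_percentage=50)
for chunk in chunks:
    print(chunk)
```

Smaller chunks tend to give more precise source references at query time, while larger chunks keep more surrounding context together. HIERARCHICAL and SEMANTIC strategies make that trade-off differently and are not covered by this sketch.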
Important limitations:
- As of version 3.1, only the Default Parsing Strategy is supported for processing source content
- A maximum of 5 sources can be attached per knowledge base
- If a domain is selected as a source, individual datasets from that domain cannot be added separately. This limitation helps optimize query performance across the knowledge base
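The source limits above can be expressed as a small pre-check. This is only a sketch of the documented rules: the `Source` class, its fields, and `can_attach` are hypothetical names, and the platform performs its own validation when you click Save.

```python
from dataclasses import dataclass

MAX_SOURCES = 5  # documented limit per knowledge base


@dataclass(frozen=True)
class Source:
    name: str
    source_type: str  # "Dataset" or "Domain"
    domain: str       # owning domain; for a Domain source, its own name


def can_attach(existing: list, candidate: Source) -> tuple:
    """Return (allowed, reason) for attaching `candidate` to a knowledge base."""
    if len(existing) >= MAX_SOURCES:
        return False, "maximum of 5 sources already attached"
    if candidate.source_type == "Dataset":
        for source in existing:
            if source.source_type == "Domain" and source.name == candidate.domain:
                return False, f"domain '{source.name}' is already attached"
    return True, "ok"


sales_domain = Source("sales", "Domain", "sales")
print(can_attach([sales_domain], Source("orders", "Dataset", "sales")))
print(can_attach([sales_domain], Source("tickets", "Dataset", "support")))
```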
Sync Knowledge Base and Sources
Individual Source Sync
Step | Action | Details |
---|---|---|
1 | Navigate | Go to the Sources tab |
2 | Initiate | Click Sync on your target source |
3 | Monitor | Track progress in the Runs tab |
4 | Review Metrics | View detailed source metrics: • Files scanned, deleted, and failed • Metadata files processed and indexed • New files indexed • Latest processing status |
5 | Verify | Check file status (INDEXED or FAILED) |
Complete Knowledge Base Sync
Step | Action | Details |
---|---|---|
1 | Initiate | Click Sync at knowledge base level |
2 | Monitor | Track progress in Runs tab |
3 | Review | Check metrics for all sources |
Important considerations:
- Only one sync operation can run at a time per knowledge base; syncs run sequentially to prevent conflicts
- Sync duration depends on file count and size, and large files take longer to process
- Email notifications confirm completion
- Failed syncs automatically retry with exponential backoff
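For readers unfamiliar with exponential backoff, the retry behavior mentioned above follows a general pattern like the one sketched below. The attempt count and delays are assumptions chosen for illustration; the values Amorphic uses internally are not documented here.

```python
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    The wait doubles after every failure (1s, 2s, 4s, ...) and is capped at
    `max_delay`. The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))


# Simulated sync that fails twice before succeeding.
calls = {"count": 0}


def flaky_sync():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("sync failed")
    return "SUCCEEDED"


delays = []  # record the waits instead of actually sleeping
status = retry_with_backoff(flaky_sync, sleep=delays.append)
print(status, delays)
```

Production implementations often add random jitter to each delay so that many retries do not fire at the same moment; it is left out here to keep the example deterministic.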
View Sync Status
Monitoring Dashboard
Navigate to the Runs tab to view comprehensive sync details:
Information | Description |
---|---|
Source Name | Individual source or knowledge base |
Execution Scope | Whether the run covered a single data source (Datasource) or the entire knowledge base (KnowledgeBase) |
Status | Current sync status |
Start Time | Operation start timestamp |
End Time | Operation completion timestamp |
Synced By | User who initiated the sync |
Detailed Metrics
Each sync operation provides:
Metric Type | Details Tracked |
---|---|
File Processing | • Files scanned • Files deleted • Failed files |
Metadata Status | • Files scanned • Files modified • Files indexed |
Index Updates | • New files indexed • Processing status • Latest results |
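When a complete knowledge base sync covers several sources, the knowledge-base-level figures can be thought of as the sum of the per-source metrics. A small sketch under that assumption follows; the metric key names are made up for the example and do not reflect an actual response format.

```python
from collections import Counter


def aggregate_metrics(per_source: dict) -> dict:
    """Sum per-source sync metrics into knowledge-base-level totals."""
    totals = Counter()
    for metrics in per_source.values():
        totals.update(metrics)  # adds counts key by key
    return dict(totals)


run_metrics = {
    "contracts": {"files_scanned": 10, "files_failed": 1, "new_files_indexed": 9},
    "policies": {"files_scanned": 5, "new_files_indexed": 5},
}
totals = aggregate_metrics(run_metrics)
print(totals)
```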
Query Knowledge Base
The Knowledge Base provides an intuitive interface for querying your indexed content using natural language.
Step | Field | Description |
---|---|---|
1 | Access Query Interface | • Select your target knowledge base from the list • Click the Test Knowledge Base button in the top right • A chat interface window will appear |
2 | Configure Query Scope | Choose your preferred scope: • Query the entire knowledge base • Select a specific data source • Target an individual file • Combine source and file selection |
3 | Select AI Model | Choose an appropriate AI model for your query. Recommended models for optimal results: • Claude-3.5-Sonnet v2 • Haiku • Other advanced models |
4 | Submit and Review | • Enter your natural language query • Click Submit to process it • Review the AI-generated response • Examine the source references provided as chunks below each response |
Important considerations:
- Queries only work on successfully indexed content
- Use specific, well-formed questions for better accuracy
- All responses include source references for verification
- Access control ensures users only receive information from files they have permission to view
For optimal results:
- Use advanced models like Claude-3.5-Sonnet v2 or Haiku
- Craft clear and specific prompts
- Review source references to validate responses
- Start with broader queries, then refine as needed
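The combination of query scope and access control described above can be illustrated with a simple filter over retrieved chunks. This is a conceptual sketch only: the `Chunk` class and `filter_chunks` are hypothetical, and the platform applies scoping and permissions on its side rather than in client code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Chunk:
    source: str  # data source the chunk came from
    file: str    # file the chunk was extracted from
    text: str


def filter_chunks(chunks: list, readable_files: set,
                  source: Optional[str] = None,
                  file: Optional[str] = None) -> list:
    """Keep chunks the user may read, narrowed to the chosen scope.

    Leaving `source` and `file` as None queries the whole knowledge base;
    setting either (or both) narrows the scope accordingly.
    """
    return [
        c for c in chunks
        if c.file in readable_files
        and (source is None or c.source == source)
        and (file is None or c.file == file)
    ]


retrieved = [
    Chunk("hr", "policy.pdf", "Leave policy"),
    Chunk("hr", "salaries.csv", "Pay bands"),
    Chunk("eng", "runbook.md", "Restart steps"),
]
readable = {"policy.pdf", "runbook.md"}
print([c.file for c in filter_chunks(retrieved, readable, source="hr")])
```

Here the user cannot read `salaries.csv`, so its chunk never reaches the response even when the query targets that file directly.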
Remove Sources
To remove sources from a knowledge base:
- Navigate to Knowledge Base: Go to the knowledge base details page
- Select Remove Sources: Choose the option to remove data sources
- Confirm Removal: The data sources will be detached from the knowledge base
- Clean Up: Associated metadata will be cleaned up automatically
- Removing sources will make their content unavailable for querying
- The operation cannot be undone
Delete Knowledge Base
To delete a knowledge base:
- Navigate to Knowledge Base: Go to the knowledge base details page
- Select Delete Option: Click the delete button in the top right
- Confirm Deletion: Review the warning message and confirm deletion
- Automatic Cleanup: The system automatically removes all associated data sources, deletes the corresponding indexed files, and cleans up related metadata
This action is permanent and cannot be undone. Make sure you want to delete the knowledge base before confirming.
The knowledge base must be in the Active state before it can be deleted.
Access Control and Permissions
The system implements robust access control:
- Owner Access: Full control over knowledge base operations
- Editor Access: Can modify and sync knowledge bases, including adding/removing sources and updating settings, but cannot delete knowledge bases
- Reader Access: Can query knowledge bases
- Resource-level Permissions: Inherited from underlying data sources
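The role descriptions above can be summarized as a permission matrix. The sketch below is one reading of them; the action names are invented for illustration, and it assumes that editors can also query, which the list above does not state explicitly.

```python
# Hypothetical action names; the role-to-action mapping follows the list above.
PERMISSIONS = {
    "owner": {"query", "sync", "update", "add_source", "remove_source", "delete"},
    # Assumption: editors can query as well as modify and sync.
    "editor": {"query", "sync", "update", "add_source", "remove_source"},
    "reader": {"query"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if `role` may perform `action` on a knowledge base."""
    return action in PERMISSIONS.get(role.lower(), set())


for role in ("Owner", "Editor", "Reader"):
    print(role, "can delete:", is_allowed(role, "delete"))
```

Resource-level permissions inherited from the underlying data sources still apply on top of these roles, so a reader only receives answers drawn from files they are allowed to view.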
Best Practices
To get the most out of KnowledgeBase, consider these best practices:
- Organize Sources
  - Group related datasets and files logically
  - Use descriptive names for knowledge bases
  - Consider domain-based organization
- Optimize Sync Operations
  - Monitor sync status and address failures promptly
  - Use incremental syncs when possible
- Query Optimization
  - Be specific in your questions for better results
  - Use context from previous queries when relevant
  - Review source attribution for accuracy verification
- Access Management
  - Regularly review and update access permissions
  - Monitor usage patterns and adjust accordingly
  - Implement least-privilege access principles
Using clear and descriptive file names and metadata will significantly improve KnowledgeBase's ability to provide accurate responses. Consider adding tags and descriptions to your data sources when possible.
File Type | Extension |
---|---|
Plain text (ASCII only) | .txt |
Markdown | .md |
HyperText Markup Language | .html |
Microsoft Word document | .doc/.docx |
Comma-separated values | .csv |
Microsoft Excel spreadsheet | .xls/.xlsx |
Portable Document Format | .pdf |
- Maximum of 100 Knowledge Bases can be created per account
- Individual file size must not exceed 50 MB
- Only S3 datasets are currently supported as data sources
- Maximum 5 data sources can be attached per Knowledge Base
- Knowledge Base names must be unique within your account
- Sync operations run sequentially - only one sync can be active at a time
- Knowledge base queries are limited to indexed content only
- Large files may take significant time to index
- Query responses are based on indexed chunks and may not include full context
- Real-time updates require manual sync operations
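Before uploading files for indexing, it can help to check them against the supported file types and the size limit listed above. The following is a minimal sketch; `check_file` is a hypothetical helper, and it assumes 50 MB means 50 × 1024 × 1024 bytes, which the limits above do not specify.

```python
from pathlib import PurePosixPath

# Extensions from the supported file types table; ".pdf" is assumed for the
# Portable Document Format row.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".html", ".doc", ".docx", ".csv", ".xls", ".xlsx", ".pdf",
}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_file(name: str, size_bytes: int) -> list:
    """Return a list of problems that would prevent indexing (empty if none)."""
    problems = []
    extension = PurePosixPath(name).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 50 MB")
    return problems


print(check_file("report.PDF", 1024))
print(check_file("video.mp4", 10))
print(check_file("big.csv", 51 * 1024 * 1024))
```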