Version: v3.1

KnowledgeBase - Your Intelligent Data Repository

KnowledgeBase is your intelligent data repository in the Amorphic Cloud Platform. It uses AI to create, manage, and query knowledge bases that turn your datasets into searchable, AI-ready information repositories. Whether you're building a document search system, creating a Q&A interface, or organizing enterprise knowledge, KnowledgeBase provides the tools to make your data more accessible.

With KnowledgeBase, you can:

  • Create and manage knowledge bases with multiple data sources
  • Sync and index the files in your datasets and domains
  • Query knowledge bases using natural language
  • Get intelligent responses based on your data content
  • Track indexing metrics and sync status
  • Manage access permissions and resource associations

This guide will walk you through everything you need to know to leverage KnowledgeBase's capabilities effectively.

Knowledge Base Operations

Amorphic provides the following operations for Knowledge Base:

Operation | Description
Create Knowledge Base | Creates a knowledge base in AWS Bedrock and other necessary AWS resources.
View Knowledge Base | View the details of an existing knowledge base.
Update Knowledge Base | Update an existing knowledge base configuration.
Add Sources | Add new data sources to an existing knowledge base.
Sync Knowledge Base and Sources | Sync data sources in a knowledge base to update indexed content.
View Sync Status | View sync status and metrics for a knowledge base.
Query Knowledge Base | Query a knowledge base using natural language.
Remove Sources | Remove data sources from a knowledge base.
Delete Knowledge Base | Delete an existing knowledge base.

Getting Started

Overview

KnowledgeBase is an AI-powered data repository system that enables you to:

  • Transform your datasets and files into searchable knowledge bases
  • Query your data using natural language
  • Get intelligent responses based on your actual data content
  • Sync and index new or updated data
  • Manage multiple sources within a single knowledge base

The system integrates with AWS Bedrock to provide advanced natural language processing and retrieval capabilities, making your data more accessible and useful.

Note

All sources must be properly registered in the Amorphic platform and accessible to your user account. The system automatically handles file format detection and content extraction during the indexing process.

Key Features

Knowledge Base Management

KnowledgeBase provides comprehensive management capabilities for creating, updating, and maintaining your data repositories.

Feature | Description
Knowledge base creation | Create new knowledge bases with custom names and descriptions
Source association | Attach multiple datasets or domains to a knowledge base
Sync management | Sync and index sources with status tracking
Access control | Manage permissions and user access to knowledge bases
Metrics tracking | Monitor indexing statistics and sync performance

Note
  • Knowledge bases are created with unique identifiers and can contain multiple data sources
  • Sync operations are performed sequentially to avoid conflicts
  • You can choose to sync either the entire knowledge base or individual sources one at a time
  • All operations are logged for audit and compliance purposes
  • Access permissions are inherited from the underlying data sources

Natural Language Querying

KnowledgeBase leverages advanced LLMs to enable natural language interactions with your data:

Feature | Description
Natural language processing | Query your data using plain English
Context-aware responses | Get answers based on your actual data content
Chunk-based retrieval | Intelligent document chunking for better responses
Response formatting | Structured responses with source attribution

Source Synchronization

The system provides robust synchronization capabilities for keeping your knowledge bases up-to-date:

Feature | Description
Indexing | Detect and index new or modified files
Incremental sync | Only process changed content for efficiency
Status tracking | Monitor sync progress and completion status
Error handling | Retry logic with exponential backoff
Email notifications | Notify owners and editors of sync completion
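
The indexing and incremental sync rows above can be pictured as a comparison between the file listing seen at the previous sync and the current one. The following is a minimal sketch of that idea, not Amorphic's implementation: the function name `plan_incremental_sync` and the path-to-fingerprint snapshot format (for example, a content hash per file) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SyncPlan:
    """Files grouped by the action an incremental sync would take."""
    new: list = field(default_factory=list)
    modified: list = field(default_factory=list)
    deleted: list = field(default_factory=list)
    unchanged: list = field(default_factory=list)


def plan_incremental_sync(previous: dict, current: dict) -> SyncPlan:
    """Compare two {path: fingerprint} snapshots (hypothetical format)."""
    plan = SyncPlan()
    for path, fingerprint in sorted(current.items()):
        if path not in previous:
            plan.new.append(path)          # never indexed before
        elif previous[path] != fingerprint:
            plan.modified.append(path)     # content changed, re-index
        else:
            plan.unchanged.append(path)    # skipped for efficiency
    # Files present last time but missing now are removed from the index.
    plan.deleted = sorted(p for p in previous if p not in current)
    return plan


previous = {"a.pdf": "v1", "b.md": "v1", "c.txt": "v1"}
current = {"a.pdf": "v1", "b.md": "v2", "d.csv": "v1"}
plan = plan_incremental_sync(previous, current)
print(plan.new, plan.modified, plan.deleted)
```

Only the new and modified groups need processing, which is why an incremental sync is typically cheaper than re-indexing every file.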

Create Knowledge Base

To create a Knowledge Base:

  1. Navigate to the Explore section in the left sidebar
  2. Click on AI Space to expand the menu
  3. Select Knowledge Bases from the available options
  4. Click on + Create Knowledge Base.
  5. Fill in the details shown in the table:
Attribute | Description
Knowledge Base Name | Give your knowledge base a unique name.
Description | Describe the knowledge base's purpose and relevant details.
Models | Select the Embedding Models enabled in your account. These models convert text into numerical vectors for semantic search.
Keywords | Add relevant keywords to the knowledge base.
Guardrail | Select a relevant guardrail for the knowledge base. If no guardrail is specified, the system will apply a default guardrail automatically.
Access Control | Configure access permissions for the knowledge base. By default, the creator has full access.

Note
  • Provide a clear and detailed description that accurately reflects the content and purpose of your knowledge base sources
  • A well-written description helps users understand the knowledge base scope and enables agents to effectively access and utilize the information
  • Include key topics, data types, and intended use cases in the description for better discoverability
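
The Models attribute above notes that embedding models convert text into numerical vectors for semantic search. As a rough illustration of what that enables, the sketch below ranks stored chunks by cosine similarity to a query vector. The three-dimensional vectors are made up for readability (real embedding models produce far longer vectors), and the function names are hypothetical rather than part of Amorphic.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the names of the k chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy embeddings, invented for this example.
chunks = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.0],
    "office hours": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]
print(top_k(query, chunks))
```

Chunks whose vectors point in nearly the same direction as the query rank first, even when they share no exact keywords with it.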

View Knowledge Base

The Knowledge Base details page organizes its information into the following tabs and sections:

Tab | Component | Description
Overview | Basic Information | Knowledge Base Name: Unique identifier; Description: Purpose and content details; Created: Creator and creation date; Updated: Last modifier and modification date
Overview | Model Information | Model: Embedding model (e.g., amazon.titan-embed-text-v2:0); Last Synced: Most recent sync timestamp; Last Synced Status: Current sync state (SUCCEEDED/FAILED/IN_PROGRESS)
Overview | Keywords | Associated tags and owner information
Overview | Donut Chart Metrics | Sources Attached: Connected datasets/domains; Files Scanned: Total processed files; Files Deleted: Removed files; Files Failed: Failed indexing attempts; Metadata Files Scanned: Processed metadata files; Metadata Files Modified: Updated metadata files; Modified Files Indexed: Re-indexed existing files; New Files Indexed: Successfully indexed new files
Overview | Summary Cards | Sources Added: Total attached sources; Latest Files Processed: Recent processing count; Latest Files Indexed: Recent indexing success count; Latest Files Failed: Recent indexing failure count
Sources | Source Management | Information about connected data sources and their status
Runs | Sync Operations | Details about synchronization operations and their outcomes
Activity Logs | Timeline Events | Creation events; source addition records; sync operation history; knowledge base modifications. Each log entry includes the user who performed the action, the action description, and a timestamp

The details page covers your knowledge base's configuration, metrics, and activity history. Its three main tabs are Overview, Sources, and Runs.

Note
  • The Overview tab provides the most comprehensive view of your knowledge base status and performance
  • Use the Test Knowledge Base button to verify your knowledge base is working correctly
  • Monitor the Activity Logs to track all changes and operations performed on your knowledge base
  • The metrics help you understand the scope and health of your indexed content

Update Knowledge Base

To update a Knowledge Base description:

  1. Navigate to the Knowledge Base details page
  2. Click on the Edit action button
  3. Update the description field with your new text
  4. Click Save to apply the changes
Note

Only the description field can be modified after a knowledge base is created. The name and model configuration cannot be changed.

Add Sources

To add sources to your Knowledge Base:

  1. Navigate to the Knowledge Base details page
  2. Click the Add Source button
  3. Select your source type (Dataset or Domain)
  4. Configure the required fields
  5. Click Save to attach the source

The following fields need to be configured when adding a source:

Field | Description
Source Type | Select between Dataset or Domain as the source type
Name | Provide a unique identifier for the source
Description | Add details about the source content and purpose
Chunking Strategy | One of FIXED_SIZE, HIERARCHICAL, SEMANTIC, or NONE. Additional parameters will be required based on the selected strategy
Parsing Strategy | Select the appropriate parsing strategy for your content type
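
To make the FIXED_SIZE chunking strategy more concrete, here is a minimal sketch of splitting text into fixed-size, overlapping chunks. It is an illustration under stated assumptions, not the platform's chunker: whitespace-separated words stand in for model tokens, and the parameter names `max_tokens` and `overlap_percentage` are hypothetical stand-ins for the additional parameters the strategy asks for.

```python
def fixed_size_chunks(text, max_tokens=300, overlap_percentage=20):
    """Split text into chunks of at most max_tokens words.

    Consecutive chunks share roughly overlap_percentage of their words,
    so sentences cut at a boundary still appear whole in one chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in [0, 100)")

    tokens = text.split()  # assumption: words approximate tokens
    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # always >= 1 given the checks above

    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in fixed_size_chunks(sample, max_tokens=4, overlap_percentage=50):
    print(chunk)
```

Smaller chunks give more precise source references at query time, while larger chunks keep more surrounding context together; the overlap softens the effect of cutting mid-thought.
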
Note

Important limitations:

  • In version 3.1, only the Default Parsing Strategy is supported for processing source content
  • A maximum of 5 sources can be attached per knowledge base
  • If a domain is selected as a source, individual datasets from that domain cannot be added separately
  • This limitation helps optimize query performance across the knowledge base
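
The source limits above can be expressed as a small pre-check before attaching a source. This is only a sketch: the `Source` record, its field names, and the `check_attach` function are hypothetical and do not reflect Amorphic's data model. It encodes just the two rules stated in the note (the 5-source cap and the domain/dataset exclusion).

```python
from typing import NamedTuple, Optional

MAX_SOURCES = 5  # limit stated in the note above


class Source(NamedTuple):
    kind: str                      # "dataset" or "domain"
    name: str
    domain: Optional[str] = None   # owning domain, for datasets only


def check_attach(existing, candidate):
    """Return None if the candidate may be attached, else a reason string."""
    if len(existing) >= MAX_SOURCES:
        return "maximum of 5 sources already attached"
    if candidate.kind == "dataset":
        attached_domains = {s.name for s in existing if s.kind == "domain"}
        if candidate.domain in attached_domains:
            return "the dataset's domain is already attached as a source"
    return None


attached = [Source("domain", "finance")]
print(check_attach(attached, Source("dataset", "invoices", domain="finance")))
print(check_attach(attached, Source("dataset", "tickets", domain="support")))
```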

Sync Knowledge Base and Sources

Individual Source Sync

Step | Action | Details
1 | Navigate | Go to the Sources tab
2 | Initiate | Click Sync on your target source
3 | Monitor | Track progress in the Runs tab
4 | Review Metrics | View detailed source metrics: files scanned, deleted, and failed; metadata files processed and indexed; new files indexed; latest processing status
5 | Verify | Check file status (INDEXED or FAILED)

Complete Knowledge Base Sync

Step | Action | Details
1 | Initiate | Click Sync at the knowledge base level
2 | Monitor | Track progress in the Runs tab
3 | Review | Check metrics for all sources

Note

Important considerations:

  • Only one sync operation can run at a time per knowledge base
  • Sync duration depends on file count and size
  • Sync operations run sequentially to prevent conflicts
  • Large files require more processing time
  • Email notifications confirm completion
  • Failed syncs automatically retry with exponential backoff
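
The last point above mentions retries with exponential backoff. The sketch below shows the general technique, assuming a doubling delay with an upper cap; the retry count, base delay, and cap are illustrative values, since the actual settings used by the platform are not documented here, and the function names are hypothetical.

```python
import time


def backoff_delays(max_retries=5, base_seconds=1.0, cap_seconds=60.0):
    """Delay before each retry: base * 2**attempt, capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** attempt)
            for attempt in range(max_retries)]


def run_with_retries(operation, max_retries=5, base_seconds=1.0,
                     cap_seconds=60.0, sleep=time.sleep):
    """Call operation(), retrying failures with exponentially growing waits."""
    delays = backoff_delays(max_retries, base_seconds, cap_seconds)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delays[attempt])


# Demo: a stand-in sync that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_sync():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("sync failed")
    return "SUCCEEDED"


waits = []  # record the waits instead of actually sleeping
result = run_with_retries(flaky_sync, sleep=waits.append)
print(result, waits)
```

Doubling the wait after each failure gives a struggling service progressively more time to recover instead of hammering it at a fixed interval.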

View Sync Status

Monitoring Dashboard

Navigate to the Runs tab to view comprehensive sync details:

Information | Description
Source Name | Individual source or knowledge base
Execution Scope | Datasource/KnowledgeBase
Status | Current sync status
Start Time | Operation start timestamp
End Time | Operation completion timestamp
Synced By | User who initiated the sync

Detailed Metrics

Each sync operation provides:

Metric Type | Details Tracked
File Processing | Files scanned; files deleted; failed files
Metadata Status | Files scanned; files modified; files indexed
Index Updates | New files indexed; processing status; latest results

Query Knowledge Base

The Knowledge Base provides an intuitive interface for querying your indexed content using natural language.

Step | Field | Description
1 | Access Query Interface | Select your target knowledge base from the list. Click the Test Knowledge Base button in the top right. A chat interface window will appear.
2 | Configure Query Scope | Choose your preferred scope: query the entire knowledge base, select a specific data source, target an individual file, or combine source and file selection.
3 | Select AI Model | Choose an appropriate AI model for your query. Recommended models for optimal results: Claude-3.5-Sonnet v2, Haiku, or other advanced models.
4 | Submit and Review | Enter your natural language query. Click Submit to process. Review the AI-generated response. Examine the source references provided as chunks below each response.

Note

Important considerations:

  • Queries only work on successfully indexed content
  • Use specific, well-formed questions for better accuracy
  • All responses include source references for verification
  • Access control ensures users only receive information from files they have permission to view
Best Practices

For optimal results:

  • Use advanced models like Claude-3.5-Sonnet v2 or Haiku
  • Craft clear and specific prompts
  • Review source references to validate responses
  • Start with broader queries, then refine as needed

Remove Sources

To remove sources from a knowledge base:

  1. Navigate to Knowledge Base: Go to the knowledge base details page
  2. Select Remove Sources: Choose the option to remove data sources
  3. Confirm Removal: The data sources will be detached from the knowledge base
  4. Clean Up: Associated metadata will be cleaned up automatically
Note
  • Removing sources will make their content unavailable for querying
  • The operation cannot be undone
  • Associated metadata will be cleaned up automatically

Delete Knowledge Base

To delete a knowledge base:

  1. Navigate to Knowledge Base: Go to the knowledge base details page
  2. Select Delete Option: Click the delete button in the top right
  3. Confirm Deletion: Review the warning message and confirm deletion
  4. Automatic Cleanup: The system will automatically:
    • Remove all associated data sources
    • Delete corresponding indexed files
    • Clean up related metadata
Warning

This action is permanent and cannot be undone. Make sure you want to delete the knowledge base before confirming.

Note

The knowledge base must be in the Active state before it can be deleted.

Access Control and Permissions

The system implements robust access control:

  • Owner Access: Full control over knowledge base operations
  • Editor Access: Can modify and sync knowledge bases, including adding/removing sources and updating settings, but cannot delete knowledge bases
  • Reader Access: Can query knowledge bases
  • Resource-level Permissions: Inherited from underlying data sources
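
The three roles above can be summarized as a simple permission matrix. The sketch below is illustrative only: the action names are invented for this example, and it assumes that editors can also query, which the list above implies but does not state outright. Resource-level permissions inherited from the data sources are not modeled.

```python
# Hypothetical action names; role capabilities follow the list above.
PERMISSIONS = {
    "owner": {"query", "sync", "update", "add_source", "remove_source", "delete"},
    # Assumption: editors can query in addition to modifying and syncing.
    "editor": {"query", "sync", "update", "add_source", "remove_source"},
    "reader": {"query"},
}


def is_allowed(role, action):
    """Return True if the role may perform the action on a knowledge base."""
    return action in PERMISSIONS.get(role.lower(), set())


for role in ("Owner", "Editor", "Reader"):
    print(role, "can delete:", is_allowed(role, "delete"))
```

Keeping the matrix in one place makes it easy to review against the least-privilege principle: each role holds only the actions it needs.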

Best Practices

To get the most out of KnowledgeBase, consider these best practices:

  1. Organize Sources

    • Group related datasets and files logically
    • Use descriptive names for knowledge bases
    • Consider domain-based organization
  2. Optimize Sync Operations

    • Monitor sync status and address failures promptly
    • Use incremental syncs when possible
  3. Query Optimization

    • Be specific in your questions for better results
    • Use context from previous queries when relevant
    • Review source attribution for accuracy verification
  4. Access Management

    • Regularly review and update access permissions
    • Monitor usage patterns and adjust accordingly
    • Implement least-privilege access principles
Improving Query Quality

Using clear and descriptive file names and metadata will significantly improve KnowledgeBase's ability to provide accurate responses. Consider adding tags and descriptions to your data sources when possible.

Supported File Types
File Type | Extension
Plain text (ASCII only) | .txt
Markdown | .md
HyperText Markup Language | .html
Microsoft Word document | .doc/.docx
Comma-separated values | .csv
Microsoft Excel spreadsheet | .xls/.xlsx
Portable Document Format | .pdf
Current Limitations
  • Maximum of 100 Knowledge Bases can be created per account
  • Individual files must not exceed 50 MB
  • Only S3 datasets are currently supported as data sources
  • Maximum 5 data sources can be attached per Knowledge Base
  • Knowledge Base names must be unique within your account
  • Sync operations run sequentially - only one sync can be active at a time
  • Knowledge base queries are limited to indexed content only
  • Large files may take significant time to index
  • Query responses are based on indexed chunks and may not include full context
  • Real-time updates require manual sync operations
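
The supported file types and the per-file size limit above lend themselves to a quick check before files are added to a dataset. The following is a minimal sketch, assuming that 50 MB means 50 × 1024 × 1024 bytes; the function name `indexing_problems` is hypothetical and the check is not part of the platform.

```python
from pathlib import Path

# Extensions from the Supported File Types table.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".html", ".doc", ".docx", ".csv", ".xls", ".xlsx", ".pdf",
}
# Assumption: the 50 MB limit is interpreted as 50 MiB.
MAX_FILE_BYTES = 50 * 1024 * 1024


def indexing_problems(filename, size_bytes):
    """Return a list of reasons the file would likely fail to index."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 50 MB")
    return problems


print(indexing_problems("report.PDF", 1024))
print(indexing_problems("video.mp4", 10))
print(indexing_problems("big.csv", MAX_FILE_BYTES + 1))
```

Running a check like this ahead of a sync can reduce the number of entries that end up in the Files Failed metric.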