Local Data File Connector
Upload CSV or JSON data files directly to create searchable content, with flexible schema support.
Input Requirements
The Local Data File connector lets you upload structured data files directly to Atri AI Search. The key requirements are:
The connector supports CSV and JSON files, with a maximum file size of 100MB per upload. You can upload files by dragging and dropping them onto the interface or by selecting them in a file browser.
CSV files should have headers in the first row. JSON files should contain an array of objects with a consistent field structure throughout the dataset.
Data Structure Requirements
To ensure optimal search functionality, your data files must contain three required fields and follow the structural guidelines in this section. Additional optional fields can provide richer context that enhances the search experience.
Required Fields
Every record in your data file must contain these essential fields to ensure proper indexing and search functionality:
- id: Unique identifier for each record (String or Number format, such as 1, 2, 3 or 'prod_001', 'prod_002')
- title: Primary title or name of the item (String format, such as 'Premium Wireless Headphones', 'Organic Cotton T-Shirt')
- variant_title: Variant or subtitle information (String format, such as 'Black / Large', 'Blue / Medium', 'Default Title'). Leave this as an empty string if not applicable.
In addition to these required fields, you can include any other fields that are relevant to your data. Additional fields provide extra context and enable filtering and faceting for your use case.
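The required-field rules above can be checked locally before uploading. The following is a minimal sketch, not part of the connector itself: the function names `find_missing_fields` and `validate_records` are hypothetical, and the check assumes that `id` and `title` must carry a value while `variant_title` may be an empty string, as described above.

```python
# Hypothetical pre-upload check; these helpers are illustrative only.
REQUIRED_FIELDS = ("id", "title", "variant_title")


def find_missing_fields(record):
    """Return the required field names that are absent from one record."""
    return [name for name in REQUIRED_FIELDS if name not in record]


def validate_records(records):
    """Return (index, problem) tuples; an empty list means no problems were found."""
    problems = []
    for index, record in enumerate(records):
        missing = find_missing_fields(record)
        if missing:
            problems.append((index, "missing fields: " + ", ".join(missing)))
            continue
        # id and title need a value; variant_title may be an empty string.
        if record["id"] in (None, ""):
            problems.append((index, "empty id"))
        if record["title"] in (None, ""):
            problems.append((index, "empty title"))
        if record["variant_title"] is None:
            problems.append((index, "variant_title is null (use an empty string)"))
    return problems
```

Running `validate_records` on the parsed rows of a file gives a list of record positions to fix before the upload is attempted.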
Data Structure Examples
Below are examples of properly formatted data files that demonstrate the required structure for both CSV and JSON formats. These examples show how to organize your data to ensure optimal search performance and compatibility with the system.
CSV Example
id,title,variant_title,description,category,price,tags
1,"Premium Wireless Headphones","Black","High-quality wireless headphones with noise cancellation","Electronics",199.99,"audio,wireless,premium"
2,"Organic Cotton T-Shirt","Blue / Medium","Comfortable organic cotton t-shirt","Clothing",29.99,"clothing,organic,cotton"
3,"Stainless Steel Water Bottle","500ml","Durable stainless steel water bottle","Home & Garden",24.99,"bottle,steel,eco-friendly"
JSON Example
[
{
"id": "1",
"title": "Premium Wireless Headphones",
"variant_title": "Black",
"description": "High-quality wireless headphones with noise cancellation",
"category": "Electronics",
"price": 199.99,
"tags": "audio,wireless,premium"
},
{
"id": "2",
"title": "Organic Cotton T-Shirt",
"variant_title": "Blue / Medium",
"description": "Comfortable organic cotton t-shirt",
"category": "Clothing",
"price": 29.99,
"tags": "clothing,organic,cotton"
}
]
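To inspect a file in either format before uploading, both can be read into the same in-memory shape. The sketch below uses Python's standard `csv` and `json` modules; the helper name `load_records` is hypothetical and this is not how Atri AI Search itself parses files. Note that `csv.DictReader` returns every value as a string, so numeric fields such as price arrive as text when read from CSV.

```python
import csv
import io
import json


def load_records(text, file_format):
    """Parse CSV or JSON text into a list of dictionaries (hypothetical helper)."""
    if file_format == "csv":
        # The first row is treated as the header row; all values are strings.
        return list(csv.DictReader(io.StringIO(text)))
    if file_format == "json":
        data = json.loads(text)
        # The documented JSON layout is an array of objects.
        if not isinstance(data, list) or not all(isinstance(item, dict) for item in data):
            raise ValueError("JSON files should contain an array of objects")
        return data
    raise ValueError("unsupported format: " + file_format)
```

Quoted CSV values that contain commas, such as the tags column in the example above, are kept as a single field by the `csv` module.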
Best Practices and Guidelines
Follow the recommendations below to ensure your data is processed efficiently and delivers the best possible search experience for your users.
Data Quality
Maintaining high data quality is essential for optimal search performance. Focus on these key areas:
- Ensure all records have unique IDs to prevent data conflicts
- Use consistent field names and data types throughout your file
- Include descriptive titles and variant information for better search results
- Populate description fields with rich, searchable content
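Duplicate IDs are straightforward to detect before uploading. The sketch below is one possible check, with a hypothetical helper name `find_duplicate_ids`. It compares IDs as strings, which assumes that the number 1 and the string '1' would collide; whether the system treats them as the same ID is not stated here, so this is the cautious reading.

```python
from collections import Counter


def find_duplicate_ids(records):
    """Return the ID values (as strings) that appear more than once."""
    # Assumption: 1 and "1" are treated as the same identifier.
    counts = Counter(str(record["id"]) for record in records)
    return sorted(id_value for id_value, count in counts.items() if count > 1)
```

An empty result means every record has a distinct ID under that assumption.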
Performance Optimization
These practices will help ensure your files process quickly and efficiently:
- Keep files within the 100MB upload limit; smaller files process faster
- Use consistent formatting for dates, numbers, and categorical data
- Avoid empty or null values in required fields (variant_title may be an empty string when it does not apply)
- Consider splitting very large datasets into multiple files
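Splitting a large dataset can be done by grouping records until a size budget is reached. The following sketch, with a hypothetical helper name `split_records`, groups records so that each group's JSON encoding (as produced by `json.dumps` with default separators) stays within `max_bytes`. The budget is a parameter, so any limit can be passed in; a single record larger than the budget still ends up in a chunk of its own.

```python
import json


def split_records(records, max_bytes):
    """Group records into chunks whose JSON encoding stays within max_bytes."""
    chunks = []
    current = []
    current_size = 2  # bytes for the enclosing "[" and "]"
    for record in records:
        encoded = json.dumps(record, ensure_ascii=False).encode("utf-8")
        # Every record after the first in a chunk adds a ", " separator.
        extra = len(encoded) + (2 if current else 0)
        if current and current_size + extra > max_bytes:
            chunks.append(current)
            current = []
            current_size = 2
            extra = len(encoded)
        current.append(record)
        current_size += extra
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be written to its own file with `json.dump(chunk, f, ensure_ascii=False)` and uploaded separately.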
Search Enhancement
Improve discoverability and search relevance with these content strategies:
- Include relevant tags and categories to improve discoverability
- Use clear, descriptive language in titles and descriptions
- Ensure price and numerical fields are properly formatted
- Consider SEO-friendly titles and descriptions for web applications
File Preparation
Prepare your files properly to avoid common issues during upload:
- Validate your data structure before uploading
- Remove any sensitive or unnecessary information
- Use UTF-8 encoding for proper character support
- Test with a small sample file first to verify structure
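Two of these preparation steps, checking the encoding and building a small sample file, can be scripted. The sketch below uses only the Python standard library; the helper names `is_valid_utf8` and `make_csv_sample` are hypothetical and are not part of the connector.

```python
import csv
import io


def is_valid_utf8(raw_bytes):
    """Return True if the raw file bytes decode cleanly as UTF-8."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def make_csv_sample(csv_text, row_count):
    """Return CSV text holding the header row plus the first row_count data rows."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for index, row in enumerate(reader):
        if index > row_count:  # index 0 is the header row
            break
        writer.writerow(row)
    return output.getvalue()
```

For a file on disk, read it with `open(path, "rb").read()` to obtain the bytes for `is_valid_utf8`, and decode those bytes as UTF-8 before passing the text to `make_csv_sample`.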
Upload Process
The upload process follows five distinct steps:
- First, select 'Upload Data File' from the data source options in your configuration.
- Next, upload your file by dragging and dropping it onto the interface or clicking 'Select File' to browse and choose your data file.
- The system then automatically processes your file, detecting field types and structure without requiring manual configuration.
- After processing, you can configure searchable, embedding, and filterable attributes based on your detected fields.
- Finally, your data is indexed and ready for search within minutes of upload completion.
Important Limitations
Please be aware of these limitations when using the Local Data File connector:
- No automatic updates: This is the most significant limitation. Unlike connected data sources, local files do not update automatically, so you must manually re-upload files when your data changes.
- File size: There is a maximum file size limit of 100MB per upload. For larger datasets, consider using a connected data source.
- Backups: Your uploaded data is stored securely, but you should maintain backups of your original files for reference and updates.
Common Issues and Solutions
Most issues can be resolved by ensuring proper file formatting and field configuration. If you encounter a problem during upload or configuration, the solutions below may help.
- File upload fails with 'Invalid format' error: Ensure your file is in CSV or JSON format and contains the required fields (id, title, variant_title). Check that your file structure matches the examples provided above.
- Search results are not appearing: Check that your searchable attributes are properly configured and that your data contains text content in the selected fields. Verify that the fields you want to search are marked as searchable in your configuration.
- Special characters not displaying correctly: Ensure your file is saved with UTF-8 encoding to support international characters and symbols. Most modern text editors save as UTF-8 by default; when exporting from a spreadsheet application, choose a UTF-8 CSV option if one is offered.
- Numeric fields not filtering properly: Verify that numeric fields contain only numbers (no currency symbols or text) and are configured as filterable attributes. Format numbers consistently throughout your dataset to ensure proper filtering functionality.
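For the numeric filtering issue, values that contain currency symbols or text can be located before re-uploading. The following is a minimal sketch with a hypothetical helper name `find_non_numeric`; it relies on Python's `float()` to decide what counts as a number, which is an assumption and may be more permissive than the system's own field detection.

```python
def find_non_numeric(records, field):
    """Return (index, value) pairs whose field value float() cannot parse."""
    bad = []
    for index, record in enumerate(records):
        value = record.get(field)
        if isinstance(value, (int, float)):
            continue  # already a number, as in the JSON example
        try:
            float(value)  # CSV values arrive as strings such as "24.99"
        except (TypeError, ValueError):
            bad.append((index, value))
    return bad
```

Records reported by this check, for example a price stored as '$29.99', can be corrected in the source file so that the field contains only numbers.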