Feature: Crawler
The “Crawler” feature builds dynamic knowledge bases for your AI chatbot directly from web pages. It automatically extracts and processes information from the URLs you specify and turns it into a structured knowledge base that the AI can use to generate responses.
Click “Get Started” below to learn how to create a knowledge base with the crawler.
...
To create a new knowledge base from one or more web pages, select the “Crawler” mode on the “Create Knowledge Base” page.
...
Name: Enter a unique name for your knowledge base in the “Knowledge Base” text field.
Description: Provide a brief description of your knowledge base in the “Description” text field.
URL: Input the starting URL of the web page you want the crawler to analyze in the “Initial URL” text field.
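Before filling in the three fields above, it can help to sanity-check the values. The following is a minimal sketch, not an Await Cortex API: the `CrawlerKnowledgeBaseForm` class and its `problems` method are hypothetical, and the rule that the initial URL must be an absolute `http` or `https` address is an assumption rather than a documented requirement.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlerKnowledgeBaseForm:
    """Hypothetical container mirroring the three form fields; not an Await Cortex API."""
    name: str
    description: str
    initial_url: str

    def problems(self):
        """Return a list of problems; an empty list means the values look usable."""
        found = []
        if not self.name.strip():
            found.append("name is empty")
        # Assumption: the crawler needs an absolute URL with an http(s) scheme.
        parts = urlparse(self.initial_url.strip())
        if parts.scheme not in ("http", "https") or not parts.netloc:
            found.append("initial URL must be an absolute http(s) URL")
        return found


form = CrawlerKnowledgeBaseForm("Product docs", "Pages from our help center", "example.com/help")
print(form.problems())
```

In this example the URL is missing its scheme, so the check reports one problem; adding `https://` in front would clear it.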
...
Configuring Your Crawler
...
Bypass Settings: Choose whether the crawler should “Comply” with or “Bypass” the restrictions of websites that normally block crawlers.
Comply: The crawler respects website settings that block crawling and does not crawl blocked pages.
Bypass: The crawler attempts to access and crawl websites even when they have anti-crawling measures in place. (*Bypassing a blocked page consumes 50 more tokens per page.)
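The difference between the two modes can be sketched as follows. This illustration assumes that a site's blocking rules are expressed in a robots.txt file, which is only one of several ways a site can block crawlers; how Await Cortex detects blocked pages is not stated here. The `plan_crawl` function, the mode names, and the sample rules are hypothetical. Only the 50-token figure comes from the note above, read here as a flat surcharge per bypassed page.

```python
from urllib.robotparser import RobotFileParser

# From the note above: bypassing a blocked page costs 50 more tokens per page.
BYPASS_SURCHARGE_TOKENS = 50


def plan_crawl(urls, robots_lines, mode, user_agent="*"):
    """Split urls into pages to crawl and pages to skip, and total the bypass surcharge.

    mode is "comply" or "bypass" (hypothetical names mirroring the UI options).
    """
    if mode not in ("comply", "bypass"):
        raise ValueError(f"unknown mode: {mode!r}")

    parser = RobotFileParser()
    parser.parse(robots_lines)  # rules supplied as a list of robots.txt lines

    crawled, skipped, surcharge = [], [], 0
    for url in urls:
        if parser.can_fetch(user_agent, url):
            crawled.append(url)  # not blocked: crawled in either mode
        elif mode == "bypass":
            crawled.append(url)  # blocked, but bypassed at extra cost
            surcharge += BYPASS_SURCHARGE_TOKENS
        else:
            skipped.append(url)  # blocked and respected
    return crawled, skipped, surcharge


rules = ["User-agent: *", "Disallow: /private/"]
pages = ["https://example.com/docs", "https://example.com/private/a"]

print(plan_crawl(pages, rules, "comply"))
print(plan_crawl(pages, rules, "bypass"))
```

With these sample rules, “Comply” crawls one page and skips the blocked one at no extra cost, while “Bypass” crawls both pages and adds a 50-token surcharge for the single blocked page.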
...
Setting Limits for Crawling
...
The “Crawler” feature is a versatile and robust tool that significantly enhances your AI chatbot's ability to access and deliver accurate information.
...
Seeing the Data You’ve Stored in a Crawler Knowledge Base
Info: In Await Cortex, you can view the data you’ve crawled after creating a knowledge base with the crawler.
Video Example:
...
Instructions
Click a crawler knowledge base.
Click the expand button.
Click the data symbol on one of the URLs.
Now you can see the data that has been stored from the crawled URL.
Click the URL dropdown to switch between different crawled URLs.
...
Updating a Crawler Knowledge Base
Info: You can re-crawl previously crawled knowledge bases with the click of a button.
Instructions
Click a crawler knowledge base.
Click the update button on your knowledge base.
Agree to the disclaimer and click “update”.