Feature: Crawler

The “Crawler” feature builds dynamic knowledge bases for your AI chatbot directly from web pages. It automatically extracts and processes information from the URLs you specify and turns it into a structured knowledge base that the AI can use to generate responses.
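To make the idea of "extracting and processing" a page more concrete, here is a minimal sketch in Python of how a single page could be turned into one structured entry. It is an illustration only, not how Await Cortex is implemented: the class name PageExtractor, the function extract, and the shape of the returned entry are all hypothetical, and only Python's standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and links from one HTML page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a script or style block.
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract(url: str, html: str) -> dict:
    """Return one illustrative knowledge-base entry: URL, text, and links."""
    parser = PageExtractor(url)
    parser.feed(html)
    parser.close()
    return {"url": url, "text": " ".join(parser.text_parts), "links": parser.links}
```

In a sketch like this, the collected links are what would let a crawler move on from the starting URL to further pages, and the text is what would end up stored for each crawled URL.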

Click “Get Started” below to learn how to create a knowledge base with the crawler.

...

To create a new knowledge base from one or more web pages, select the “Crawler” mode on the “Create Knowledge Base” page.

...

Name: Enter a unique name for your knowledge base in the “Knowledge Base” text field.
Description: Provide a brief description of your knowledge base in the “Description” text field.
URL: Input the starting URL of the web page you want the crawler to analyze in the “Initial URL” text field.
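The three fields above can be thought of as one small record. The following Python sketch mirrors them and checks that the form looks complete before submission. The names CrawlerKnowledgeBase, initial_url, and validate are hypothetical and chosen for illustration; this page describes a form, not a programming interface, and the rule that the URL must be a full http(s) address is an assumption.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlerKnowledgeBase:
    """Hypothetical record mirroring the three form fields (not a product API)."""
    name: str
    description: str
    initial_url: str

    def validate(self) -> list:
        """Return a list of problems; an empty list means the form looks complete."""
        problems = []
        if not self.name.strip():
            problems.append("Name is required.")
        parsed = urlparse(self.initial_url)
        # Assumption: the starting URL must include a scheme and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("Initial URL must be a full http(s) address.")
        return problems


kb = CrawlerKnowledgeBase(
    name="Product Docs",
    description="Pages from the public documentation site.",
    initial_url="https://example.com/docs",
)
print(kb.validate())
```

Uniqueness of the name cannot be checked locally in a sketch like this, since it depends on the knowledge bases that already exist.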

...

Configuring Your Crawler

Privacy Settings: Use the toggle to choose between “Private” and “Public”. A “Public” knowledge base is accessible to all users, while a “Private” knowledge base is accessible only to its owner.
Bypass Settings: Choose whether the crawler should “Comply” with or “Bypass” the restrictions of websites that normally block crawlers.
- Comply: The crawler respects website settings that block crawling and does not crawl blocked pages.
- Bypass: The crawler attempts to access and crawl websites even if they have anti-crawling measures. Note: bypassing blocked pages consumes more tokens.
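The difference between the two modes can be sketched in Python. This example assumes that a site signals its crawling restrictions through a robots.txt file, which is a common convention but is not confirmed by this page as the mechanism Await Cortex checks. The function may_crawl, the mode strings, and the user agent name ExampleCrawler are hypothetical; real anti-crawling measures can go well beyond robots.txt.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt: str, url: str, mode: str = "comply",
              user_agent: str = "ExampleCrawler") -> bool:
    """Decide whether a URL would be fetched under the given mode.

    Assumption: blocking is expressed only through robots.txt rules.
    """
    if mode == "bypass":
        # Bypass: attempt the page regardless of the site's rules.
        return True
    # Comply: follow the site's robots.txt rules for this user agent.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: every crawler is asked to stay out of /private/.
ROBOTS = """User-agent: *
Disallow: /private/
"""

print(may_crawl(ROBOTS, "https://example.com/private/page", mode="comply"))
print(may_crawl(ROBOTS, "https://example.com/private/page", mode="bypass"))
```

Under these assumptions, “Comply” skips the blocked page while “Bypass” still attempts it, which is the case the note about higher token consumption applies to.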

...

The “Crawler” feature is a versatile and robust tool that significantly enhances your AI chatbot's ability to access and deliver accurate information.

...

Seeing the Data You’ve Stored in a Crawler Knowledge Base

Info
In Await Cortex, you can view the data you’ve crawled after creating a knowledge base with the crawler.

Video Example:

...

Instructions

Click a crawler knowledge base.
Click the expand button.
Click the data symbol on one of the URLs.
Now you can see the data that has been stored from the crawled URL.
Click the URL dropdown to switch between different crawled URLs.