Feature: Crawler

The "Crawler" feature extends your AI chatbot's capabilities by creating dynamic knowledge bases directly from web pages. It automatically extracts and processes information from the URLs you specify and turns it into a structured knowledge base that the AI can use to generate responses.

Click “Get Started” below to learn how to create a knowledge base with the crawler.

...

To create a new knowledge base from one or more web pages, select the “Crawler” mode on the “Create Knowledge Base” page.

...

  1. Name: Enter a unique name for your knowledge base in the “Knowledge Base” text field.

  2. Description: Provide a brief description of your knowledge base in the “Description” text field.

  3. URL: Input the starting URL of the web page you want the crawler to analyze in the “Initial URL” text field.

...

Configuring Your Crawler

...

  1. Bypass Settings: Choose whether the crawler should “Comply” with or “Bypass” the restrictions of websites that normally block crawlers.

    • Comply: The crawler will respect website settings that block crawling and won’t be able to crawl blocked pages.

    • Bypass: The crawler attempts to access and crawl websites even when they have anti-crawling measures in place. (*Bypassing a blocked page consumes 50 more tokens per page.)

...
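
The difference between “Comply” and “Bypass” can be pictured with a short sketch. It is only an illustration and makes several assumptions: that the "website settings that block crawling" are expressed in a robots.txt file (the product may rely on other signals), that the surcharge is 50 additional tokens for each blocked page that is bypassed, and that the names plan_crawl, build_parser, and EXAMPLE_ROBOTS_TXT are hypothetical and not part of the product. The sketch does not bypass anything itself; it only shows which pages each mode would select and the extra token cost that results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (not taken from a real website).
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Assumption based on the note above: each bypassed blocked page costs 50 more tokens.
BYPASS_SURCHARGE_TOKENS = 50


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text locally, without any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def plan_crawl(urls, parser, mode, user_agent="example-crawler"):
    """Return (urls_to_crawl, extra_tokens) for mode "comply" or "bypass"."""
    if mode not in ("comply", "bypass"):
        raise ValueError(f"unknown mode: {mode!r}")
    to_crawl = []
    extra_tokens = 0
    for url in urls:
        blocked = not parser.can_fetch(user_agent, url)
        if blocked and mode == "comply":
            continue  # Comply: blocked pages are skipped.
        if blocked:
            extra_tokens += BYPASS_SURCHARGE_TOKENS  # Bypass: surcharge applies.
        to_crawl.append(url)
    return to_crawl, extra_tokens


if __name__ == "__main__":
    rules = build_parser(EXAMPLE_ROBOTS_TXT)
    pages = [
        "https://example.com/docs",
        "https://example.com/private/report",
    ]
    print(plan_crawl(pages, rules, "comply"))
    print(plan_crawl(pages, rules, "bypass"))
```

In this sketch, “Comply” drops the page under /private/ and adds no surcharge, while “Bypass” keeps both pages and adds the surcharge once for the blocked page.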

Setting Limits for Crawling

...

The “Crawler” feature improves your AI chatbot's ability to access and deliver accurate information.

...

Seeing the Data You’ve Stored in a Crawler Knowledge Base

Info

In Await Cortex, you can view the data you’ve crawled after creating a knowledge base with the crawler.

Video Example:

...

Instructions

  1. Click a crawler knowledge base.

  2. Click the expand button.

  3. Click the data symbol on one of the URLs.

  4. Now you can see the data that has been stored from the crawled URL.

  5. Click the URL dropdown to switch between different crawled URLs.


...

Updating a Crawler Knowledge Base

Info

You can re-crawl a previously crawled knowledge base with the click of a button.

Instructions

  1. Click a crawler knowledge base.

  2. Click the update button on your knowledge base.

  3. Agree to the disclaimer and click “Update”.
