LlamaIndex Chat Engine
LlamaIndex is a toolkit for easily connecting Large Language Models (LLMs) to external data sources, including documents, web pages and more. By default LlamaIndex uses the OpenAI GPT-3 (text-davinci-003) model, and there are underlying features which leverage LangChain.
Underlying LLMs
By default, LlamaIndex uses the OpenAI GPT-3 text-davinci-003 model. To make use of LlamaIndex in its default install state, you will need to define your OpenAI API Key:
os.environ['OPENAI_API_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxx"
You can change the underlying LLM and its configuration; LlamaIndex achieves this by leveraging LangChain. You will, of course, need to define the environment keys and tokens required by whichever LLM you use.
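As a sketch of what this looks like, assuming the LLMPredictor and ServiceContext abstractions from the LlamaIndex versions current at the time of writing, a LangChain LLM can be swapped in like so:
from langchain.llms import OpenAI
from llama_index import LLMPredictor, ServiceContext
# Wrap any LangChain LLM; here OpenAI with explicit settings, purely as an illustration
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-davinci-003"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor)
# Pass the service context when building an index so the chat engine uses this LLM
# index = VectorStoreIndex.from_documents(documents, service_context=service_context)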
LlamaIndex At Its Core
LlamaIndex is a toolkit to easily connect LLMs with external data.
Data connectors link to documents, web pages, Slack, Discord, and more.
LlamaIndex Chat Engine
The LlamaIndex Chat Engine is an interface which enables you to have a conversation with your data.
The conversation enabled by the LlamaIndex Chat Engine is not merely a single dialog turn of question and answer.
It allows for a multi-turn, contextually aware conversation in which follow-up questions can implicitly reference earlier turns via conversation memory.
There are two modes:
- Condense Question Mode
- Agent Mode
Chat Engine — Condense Question Mode
Condense question and answer mode is a simple chat interface built on top of a query engine.
For each interaction:
- A standalone question is condensed from the conversation context and the last user message.
- The query engine is queried with the condensed question for a response.
This approach is simple, and works for questions directly related to the knowledge base.
Below is the code to install and run the LlamaIndex application within a notebook. You will see that I needed to install html2text, since the document I reference is a web URL.
pip install llama_index
pip install html2text
import os
import openai
# Set the OpenAI API key used by LlamaIndex
os.environ['OPENAI_API_KEY'] = "xxxxxxxxxxxxxx"
from llama_index import VectorStoreIndex, SimpleWebPageReader
# Read the web page and convert its HTML to text (this is why html2text is needed)
data = SimpleWebPageReader(html_to_text=True).load_data(["https://en.wikipedia.org/wiki/South_Africa"])
# Build a vector index over the document and expose it as a chat engine
index = VectorStoreIndex.from_documents(data)
chat_engine = index.as_chat_engine(verbose=True)
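The line above relies on the default chat mode. To be explicit, the chat_mode argument can be set; the 'condense_question' value below is based on the LlamaIndex versions current at the time of writing:
# Explicitly request condense-question mode instead of relying on the default
chat_engine = index.as_chat_engine(chat_mode='condense_question', verbose=True)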
Below is a highly contextual question based on the document provided:
response = chat_engine.chat('What languages are spoken there?')
And below is the response:
The languages spoken in South Africa are Zulu, Xhosa, Afrikaans, English,
Pedi, Tswana, Southern Sotho, Tsonga, Swazi, Venda, and Southern Ndebele.
Additionally, Fanagalo, Khoe, Lobedu, Nama, Northern Ndebele, Phuthi,
and South African Sign Language are also spoken, as well as
European languages such as Italian, Portuguese, Dutch, German, and Greek,
and Indian languages such as Gujarati, Hindi, Tamil, Telugu, and Urdu.
French is spoken by migrants from Francophone Africa.
Here a highly contextual follow-up question is asked; it is contextual not only with reference to the supplied document, but also to the previous question:
response = chat_engine.chat('Of those, which are the two minorities?')
And the correct response is received:
Fanagalo and Khoe are two languages spoken by minorities in South Africa.
Agent Mode
ReAct is an agent-based chat mode built on top of a query engine over your data, implemented via a LangChain agent.
The two lines of code further below can simply be added at the bottom of the existing code.
For each chat interaction, the agent enters a ReAct loop:
- Decide whether the query engine tool should be used.
- (Optional) Use the query engine tool and observe its output.
- Decide whether to repeat or give a final response.
Agent mode is flexible, as the agent can choose whether or not to query the knowledge base.
The performance is also more dependent on the quality of the LLM.
chat_engine = index.as_chat_engine(chat_mode='react', verbose=True)
response = chat_engine.chat('Use the tool to answer: What happened in the year 1652?')
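When experimenting with multi-turn behaviour, it can be useful to clear the conversation memory between runs; a minimal usage note, assuming the reset() method exposed on the chat engine interface:
# Clear the conversation memory so the next chat starts a fresh dialog
chat_engine.reset()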
Lastly
Below are three lines of code which can be run in the notebook to create an interactive, looping chat interface. It is quite a neat way to test a conversational interface:
from llama_index.chat_engine import SimpleChatEngine
chat_engine = SimpleChatEngine.from_defaults()
chat_engine.chat_repl()
And below is the chat interface view as seen in a Colab notebook:
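Note that SimpleChatEngine.from_defaults() converses directly with the underlying LLM and does not reference the index created earlier. To run the same looping interface over your own data, the REPL can also be started from the index-based chat engine; a minimal sketch, assuming chat_repl() is available on that engine as well:
# Run the interactive REPL on the chat engine that is backed by the index
index.as_chat_engine(verbose=True).chat_repl()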
⭐️ Please follow me on LinkedIn for updates on LLMs ⭐️
I’m currently the Chief Evangelist @ HumanFirst. I explore and write about all things at the intersection of AI and language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces and more.