Vellum AI Has Introduced Workflows To Their Platform

Vellum AI can be described as a no-code to low-code LLM based UI with a playground, semantic search, testing, and now workflows.

Cobus Greyling
6 min read · Aug 22, 2023


I’m currently the Chief Evangelist @ HumanFirst. I explore & write about all things at the intersection of AI & language; ranging from LLMs, Chatbots, Voicebots, Development Frameworks, Data-Centric latent spaces & more.

There has been a shift in LLM-based generative applications, away from fine-tuning and towards a RAG approach.

With Retrieval Augmented Generation (RAG), a contextual reference is injected into the prompt for the LLM to draw on.

This can be considered a prompt pipeline, delivering the correct, contextual information, in the right size, at the right time at inference.

So it stands to reason that focus has shifted away from LLM fine-tuning and towards enriching Prompts, on the fly, to act as a contextual reference.

Enrichment takes place from sources like uploaded documents, vector databases or tools with access to other sources, like the web, APIs, etc.
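As a rough illustration of enriching a prompt "in the right size", here is a minimal sketch in plain Python. The function name `build_prompt`, the prompt wording and the character budget are all assumptions made for illustration; this is not Vellum's API, and a real pipeline would more likely budget by tokens than by characters.

```python
def build_prompt(question: str, passages: list[str], max_chars: int = 500) -> str:
    """Inject retrieved passages into a prompt while staying within a size budget.

    Hypothetical helper: passages are assumed to arrive ordered best-first.
    """
    context_parts: list[str] = []
    used = 0
    for passage in passages:
        if used + len(passage) > max_chars:
            break  # stop before the context grows past the budget
        context_parts.append(passage)
        used += len(passage)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The passages themselves could come from any of the sources mentioned above; the sketch only shows the step where they are trimmed and placed in the prompt.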

Hence the importance of a flow or logic path: a chain or sequence of steps that together forms a workflow. This workflow is tasked with producing the right output under given circumstances.

A number of players have entered the LLM workflow arena. Voiceflow and Botpress pivoted from traditional voicebot/chatbot development into LLM workflows and chaining.

LangFlow and Flowise are two LangChain-based implementations. LangFlow is launching a hosted web-based version. Another notable commercial offering is Stack AI.

This raises the question: is the flow builder market turning from a blue ocean into a red ocean as more companies enter this space?

Back to Vellum AI

As seen below, the Vellum AI UI can be divided into seven components. The ideal approach would be to build prompts in the playground, then compare prompt variations and prompt performance across different LLMs.

Large Language Models (LLMs) available in Vellum come from OpenAI, Anthropic, Cohere, Google and MosaicML. You will need to provide an API key for Anthropic, Cohere and Google.

The Vellum-hosted LLMs are falcon-40b-instruct, llama2-7B, llama2-13B, llama2-7B-chat and llama2-13B-chat.

Even though the Vellum UI is minimalistic, this wide variety of LLMs makes for a powerful playground. Being able to base prompts on a wide array of models helps in managing cost and performance.

Vellum Workflows

The Vellum Workflows UI has only five nodes, in stark contrast with other workflow builders in terms of sheer number of nodes. It does seem like Vellum is focussed on a RAG approach, considering the ease of use of their Documents section.

The workflow UI is more focussed on workflows than conversations, at this stage. There is no customary chat window on the bottom right to perform a few dialog turns.

Input variables can be defined, and flows can split and merge based on set conditions, ending in a final output node.
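The split-and-merge idea can be sketched in a few lines of plain Python. Everything here is hypothetical (the names `run_workflow`, `mentions_year`, `with_search` and `direct`, as well as the routing rule); it is not how Vellum implements its nodes, only a way to picture input variables, a condition, two branches and a single output node.

```python
from typing import Callable, Dict

Inputs = Dict[str, str]


def run_workflow(
    inputs: Inputs,
    condition: Callable[[Inputs], bool],
    if_true: Callable[[Inputs], str],
    if_false: Callable[[Inputs], str],
) -> Dict[str, str]:
    """Split on a condition, run one branch, then merge into a single output."""
    branch = if_true if condition(inputs) else if_false
    return {"output": branch(inputs)}  # both branches feed the same output node


def mentions_year(inputs: Inputs) -> bool:
    # Hypothetical condition: does any word in the question look like a number?
    return any(tok.strip("?.,").isdigit() for tok in inputs["question"].split())


def with_search(inputs: Inputs) -> str:
    return f"search documents for: {inputs['question']}"


def direct(inputs: Inputs) -> str:
    return f"ask the LLM directly: {inputs['question']}"


result = run_workflow(
    {"question": "What happened in 1934?"}, mentions_year, with_search, direct
)
```

Passing the branches in as plain callables keeps the sketch independent of any particular LLM or search backend.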

The workflow acts more as a space to take prompts a step further with orchestration, conditions and semantic search.

RAG Example

To illustrate RAG prompt engineering, the question below is asked: What happened in 1934?

The question is asked against the falcon-40b-instruct model, and a relevant and succinct answer is given by the LLM. However, how do we introduce a contextual reference for this question?

Document Upload

I took a Wikipedia article on South Africa and uploaded the text to Vellum. The embeddings model used is intfloat/multilingual-e5-large; the settings are all visible below.

This is the view of the document management interface. Uploaded documents are all listed on the left and can be visually inspected. Documents can also be uploaded programmatically.

Below you can see how the document can be searched. The same phrase is used, and the contextually and semantically relevant text is returned.
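For readers unfamiliar with semantic search, the ranking step can be sketched as a cosine-similarity lookup over embedding vectors. This is a toy example under stated assumptions: the three-dimensional vectors are made up by hand (a real model such as intfloat/multilingual-e5-large produces far larger vectors), the chunk labels are placeholders, and the function names are hypothetical rather than part of Vellum.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], index: list[tuple[str, list[float]]], top_k: int = 1) -> list[str]:
    """Return the top_k chunk texts whose vectors are closest to the query."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


# Hand-made toy index: (chunk text, embedding vector) pairs.
toy_index = [
    ("chunk about history", [0.9, 0.1, 0.0]),
    ("chunk about geography", [0.0, 0.2, 0.9]),
]
best = search([1.0, 0.0, 0.0], toy_index)
```

In this toy setup the query vector points in almost the same direction as the first chunk's vector, so that chunk is ranked first.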

Bringing Search and Prompt Engineering Together

Below is a simple RAG flow, with the Entry point defined and the input as: What happened in 1934?

The document is searched, and an extract is returned. In turn, the extract is submitted as context to the LLM with the same question, but this time the document extract serves as a contextual reference for the prompt. And finally the correct answer is given.
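The sequence just described (search, then prompt, then LLM) can be pictured as a small chain of functions. This is only a sketch under assumptions: `rag_flow` is a hypothetical name, the prompt layout is invented, and the search and LLM steps are passed in as callables so that no real Vellum or model API is implied.

```python
from typing import Callable


def rag_flow(
    question: str,
    search: Callable[[str], str],
    llm: Callable[[str], str],
) -> str:
    """Search node -> prompt node -> LLM node, returning the final output."""
    extract = search(question)  # retrieve a relevant document extract
    prompt = f"Context:\n{extract}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)  # the LLM answers with the extract as its reference


# Stand-ins for the real search and model calls, used only to show the wiring.
def fake_search(question: str) -> str:
    return "EXTRACT FROM UPLOADED DOCUMENT"


def echo_llm(prompt: str) -> str:
    return prompt  # a real model would return an answer, not the prompt


wired = rag_flow("What happened in 1934?", fake_search, echo_llm)
```

Swapping `fake_search` and `echo_llm` for real retrieval and model calls would leave the shape of the chain unchanged.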

In Closing

Near-future developments within the workflow environment are sure to include an increase in nodes, especially for making calls to third party APIs, executing code and nested workflows.

I would love to have horizontal tabs to switch between workflows, and even prompts and documents.

Being able to define, view and edit the prompt within the workflow will also be handy.

Finally, Vellum AI’s value proposition is centred around custom-tailored advice, direct access to founders and rapid onboarding.

Will a Vellum workflow be used as a standalone application? Most probably not (for now). But it is a good route for creating smart APIs to use in a whole host of Generative Apps.





