Example Notebooks

  • TruthfulQA

    Explores the performance of an open-source LLM on the TruthfulQA benchmark, identifying specific types of questions the model struggles to answer correctly. This example is also covered in the Tutorial.

  • Exploring a Finance RAG Dataset

    Explores the FinDER benchmark dataset for retrieval-augmented generation (RAG). Uses dataset linking to help understand the relationship between queries and retrieved evidence.

  • Exploring LLM Preference Data

    Explores a human preference dataset from LMArena at different levels of abstraction: comparisons between model pairs, conversations, and individual turns in a conversation. Uses advanced dataset linking.

  • Text Classification

    Uses Cobalt to explore and debug a transformer-based text classification model from Hugging Face. Requires the transformers package to be installed.

  • Basic Tabular Tutorial

    A simple example that uses a synthetic tabular dataset and a scikit-learn-based model to illustrate the main parts of the Cobalt interface.

  • Image Clustering with CLIP

    Uses Cobalt to explore the ImageNette dataset with embeddings generated by the CLIP model.

    This notebook requires the OpenAI clip package to be installed, which brings in torch, torchvision, and other dependencies.

    You can watch a live walkthrough of this example here.
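
Two of the notebooks above need extra packages. The following is a minimal sketch, not part of Cobalt, for checking whether those packages can be imported before opening a notebook. The helper `missing_modules` and the `NOTEBOOK_REQUIREMENTS` mapping are hypothetical names introduced for illustration, and the import names (`transformers`, `clip`, `torch`, `torchvision`) are assumed from the package names mentioned in the descriptions.

```python
from importlib.util import find_spec

# Hypothetical mapping of notebook titles to the top-level modules they need.
# The module names are assumptions based on the packages named in the list above.
NOTEBOOK_REQUIREMENTS = {
    "Text Classification": ["transformers"],
    "Image Clustering with CLIP": ["clip", "torch", "torchvision"],
}


def missing_modules(module_names):
    """Return the top-level module names that cannot be found on this system."""
    # find_spec returns None when no importable module with that name exists.
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    for notebook, modules in NOTEBOOK_REQUIREMENTS.items():
        missing = missing_modules(modules)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{notebook}: {status}")
```

Running the script prints one line per notebook, either "ok" or the list of modules that still need to be installed.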