Discovering Large Language Models: A Mini Workshop Series
By: Ivy Sandberg and Wyatt Shabe
The world of large language models (LLMs) is expanding at an unprecedented pace, opening up exciting opportunities for businesses and researchers alike. Those of us at Data Machines (DMC) are eager to harness the full potential of LLMs. To quickly upskill our teams and customers, our summer intern worked with professionals to put together a mini workshop series on language model methods and capabilities. We named the workshop "Discovering-LLMs!"
To showcase the methods and capabilities covered in this workshop, we created a short Python notebook for each topic to help folks learn through simple, hands-on examples. Folks are welcome to contact us to learn more about the code samples.
What's in Store
By the end of this mini workshop series, learners will be equipped with an array of skills and knowledge to unlock the capabilities of language models. Here's a glimpse of the topics covered:
Prompt Engineering
Crafting Effective Prompts: Learn the art of formulating prompts that yield accurate and insightful responses from LLMs.
Types of Prompting Tasks: Explore a variety of tasks where you can leverage LLMs through prompting, such as summarization and transformation.
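To make the idea concrete, here is a minimal sketch of prompt construction for a summarization task. The template wording and function name are illustrative assumptions, not taken from the workshop notebooks.

```python
def build_summarization_prompt(text: str, max_sentences: int = 2) -> str:
    """Assemble a prompt asking an LLM to summarize `text`.

    The exact phrasing of the instruction is a design choice; being explicit
    about the output constraint (length, format) tends to yield more
    consistent responses.
    """
    return (
        f"Summarize the following passage in at most {max_sentences} sentences.\n\n"
        f"Passage:\n{text}\n\n"
        f"Summary:"
    )

prompt = build_summarization_prompt(
    "Large language models are trained on vast text corpora and can perform "
    "many tasks when given well-crafted prompts."
)
print(prompt)
```

The resulting string would then be sent to a model through, for example, the OpenAI API or a HuggingFace pipeline.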
Training and Fine-Tuning
Techniques and Strategies: Survey the basic steps and techniques for building and tuning language models, such as Pre-Training, Reinforcement Learning from Human Feedback, Fine-Tuning, Instruct-Tuning, and In-Context Learning.
Tuning vs. Prompting: Compare and understand tuning methods (e.g., Fine-Tuning and Instruct-Tuning), which update model weights, versus prompting methods (e.g., In-Context Learning or Few-Shot Prompting), which leave the weights unchanged.
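The contrast above can be illustrated with a few-shot (in-context) prompt: instead of updating any weights, the task examples are placed directly in the prompt. The sentiment-classification framing and labels below are illustrative assumptions.

```python
# Few-shot prompting: demonstrations go in the prompt itself; the model's
# weights are never modified, unlike fine-tuning or instruct-tuning.
examples = [
    ("I loved this movie!", "positive"),
    ("Terrible plot and acting.", "negative"),
]

def few_shot_prompt(examples, query):
    """Build a prompt from labeled demonstrations plus an unlabeled query."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    blocks.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(blocks)

print(few_shot_prompt(examples, "An absolute delight from start to finish."))
```

A fine-tuning run would instead turn these same pairs into a training dataset and update the model, which is costlier but bakes the behavior into the weights.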
Using Your Own Data
Tailored to Your Data: Find out how to make an LLM work with your data, including vector databases, LangChain document loaders, and questioning the model on your unique dataset.
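The core idea behind vector-database retrieval can be sketched in a few lines: embed documents and a query, then return the closest documents by cosine similarity. The bag-of-words "embedding" below is a toy stand-in (real systems use a learned embedding model and a vector store such as those LangChain integrates with); all names here are illustrative.

```python
import math
import re
from collections import Counter

docs = [
    "LLMs can be fine-tuned on domain-specific data.",
    "Vector databases store embeddings for similarity search and retrieval.",
    "LangChain provides document loaders for many file formats.",
]

def embed(text):
    # Toy bag-of-words vector; a production system would call an
    # embedding model here instead.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

print(retrieve("how do embeddings and retrieval work?", docs))
```

The retrieved passages are then inserted into the prompt so the model can answer questions grounded in your own dataset.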
Evaluating Model Outputs
Quality Assessment: Learn methods for evaluating your model's output, understanding when it's performing well and when improvements are needed.
Benchmarking and Testing: Dive into benchmarks and testing frameworks for evaluating LLMs.
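Two of the simplest automatic metrics used in LLM benchmarks are exact match and token-level F1 (familiar from question-answering benchmarks such as SQuAD). A minimal sketch, with function names of our own choosing:

```python
from collections import Counter

def exact_match(pred: str, ref: str) -> bool:
    """Strict match after trivial normalization (case, surrounding space)."""
    return pred.strip().lower() == ref.strip().lower()

def token_f1(pred: str, ref: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    p, r = pred.lower().split(), ref.lower().split()
    overlap = sum((Counter(p) & Counter(r)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(r)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the capital is Paris", "Paris is the capital"))
```

Metrics like these catch obvious regressions cheaply; judging fluency or factuality usually requires human review or model-based evaluation on top.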
Model Alignment
Ethical AI: Explore methods to detect and limit bias in your language model, such as blind "taste test" comparisons of model outputs, and alignment techniques like RLHF (Reinforcement Learning from Human Feedback) and Constitutional AI.
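A "blind taste test" of model outputs can be set up by shuffling paired responses so the reviewer cannot tell which model produced which, then unblinding the preferences afterward. This harness is a hypothetical sketch of that idea, not code from the workshop.

```python
import random

def blind_pairs(outputs_a, outputs_b, seed=0):
    """Pair up outputs from two models, shuffling each pair so the
    reviewer sees them in random order; hidden keys allow unblinding."""
    rng = random.Random(seed)  # fixed seed so the unblinding is reproducible
    pairs = []
    for a, b in zip(outputs_a, outputs_b):
        items = [("A", a), ("B", b)]
        rng.shuffle(items)
        pairs.append({
            "shown": [text for _, text in items],  # what the reviewer sees
            "key": [label for label, _ in items],  # kept hidden until scoring
        })
    return pairs

pairs = blind_pairs(["answer a1", "answer a2"], ["answer b1", "answer b2"])
for p in pairs:
    print(p["shown"])
```

After reviewers pick their preferred response in each pair, the hidden keys reveal which model was favored, free of branding bias.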
Tools Used
To navigate this series, we leveraged the following essential tools and resources:
Python: You'll need to understand how to interact with LLMs using Python code.
HuggingFace: HuggingFace grants you access to a rich repository of open-source models and datasets, which you can make use of through self-hosting or through their API.
OpenAI: Run and compare state-of-the-art proprietary models from OpenAI via their API integration.
Jupyter Notebooks: We'll provide demo notebooks for interactive learning, ensuring an engaging and hands-on learning experience.
LangChain: A majority of the example notebooks leverage the LangChain Python package, which provides methods and frameworks for developing language-model-based applications.
Summary
The Discovering-LLMs Mini Workshop Series is intended to be a gateway to understanding and leveraging LLM capabilities. If you are interested in DMC supporting you in applying LLMs effectively to your specific use cases, don't hesitate to reach out.