Category: Tutorials

What Is Self-Attention? Simply Explained

The self-attention mechanism lies at the core of the transformer architecture, a breakthrough innovation responsible for the remarkable success of modern large language models. In fact, understanding self-attention gets you roughly 80% of the way to understanding what makes transformers so effective. What…
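
As a rough illustration of the mechanism this post covers, here is a minimal NumPy sketch of scaled dot-product self-attention. The function name, tensor shapes, and random projection matrices are assumptions made for this sketch, not code from the post itself.

```python
# A minimal sketch of scaled dot-product self-attention (illustrative, not from the post).
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_k) projection matrices."""
    q = x @ w_q                      # queries
    k = x @ w_k                      # keys
    v = x @ w_v                      # values
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # similarity of every token with every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ v               # each output token is a weighted mix of value vectors

# Tiny usage example with random embeddings and projections.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # 4 tokens, 8-dimensional embeddings
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)   # (4, 8)
```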

RAG vs Fine-Tuning: A Practical Case Study

In this blog, we dive deeper into the practical application of RAG (Retrieval-Augmented Generation) and fine-tuning by exploring real-world scenarios. If you’re deciding between these approaches for your AI solutions, this breakdown will clarify which to use based on specific…

What is Temperature in LLM

In this blog, we explain how temperature influences large language models by controlling token sampling probabilities, balancing randomness, and improving output consistency. For detailed information, please watch our YouTube video: What is Temperature in LLM: Simply Explained When working with…
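
As a quick illustration of the idea this post describes, here is a minimal sketch of temperature scaling applied to a toy logit vector. The function name and the logit values are made-up assumptions for illustration, not taken from the post or the video.

```python
# A minimal sketch of temperature scaling over hypothetical next-token logits.
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Lower temperature sharpens the distribution; higher temperature flattens it."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                  # subtract max for numerical stability
    probs = np.exp(scaled)
    return probs / probs.sum()

logits = [2.0, 1.0, 0.5, -1.0]              # hypothetical next-token logits
for t in (0.2, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# At t=0.2 almost all probability mass sits on the top token (near-deterministic output);
# at t=2.0 the distribution is much flatter, so sampling becomes more random.
```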

What are Top-K & Top-P in LLM?

In this blog, we explain how top-k and top-p influence large language models by controlling token sampling probabilities, balancing randomness, and improving output consistency. For detailed information, please watch the YouTube video: What are Top-K & Top-P in LLM?: Simply Explained…
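
As a quick illustration of the two strategies this post covers, here is a minimal sketch of top-k and top-p (nucleus) filtering over a toy next-token distribution. The function names and probability values are assumptions for illustration only, not code from the post.

```python
# A minimal sketch of top-k and top-p (nucleus) filtering of token probabilities.
import numpy as np

def top_k_filter(probs, k):
    """Keep only the k most probable tokens, then renormalize."""
    kept = np.argsort(probs)[::-1][:k]
    out = np.zeros_like(probs)
    out[kept] = probs[kept]
    return out / out.sum()

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = np.argsort(probs)[::-1]
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1   # number of tokens to keep
    out = np.zeros_like(probs)
    out[order[:cutoff]] = probs[order[:cutoff]]
    return out / out.sum()

probs = np.array([0.5, 0.25, 0.15, 0.07, 0.03])   # hypothetical next-token distribution
print(top_k_filter(probs, k=2))    # only the two likeliest tokens survive
print(top_p_filter(probs, p=0.9))  # enough tokens to cover 90% of the probability mass
```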