11-29-2024, 10:42 AM
GPT Vs Gemini For Structured Information Extraction
Published 11/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 kHz
Language: English | Size: 856.27 MB | Duration: 0h 34m
A systematic approach for evaluating the Structured Output accuracy of Large Language Models
[b]What you'll learn[/b]
How to use the Structured Output feature in GPT
How to use the Structured Output feature in Gemini
How to extract different data types such as numerical values, dates, and booleans
How to measure the accuracy of the structured information you extracted
[b]Requirements[/b]
Fairly proficient in Python
You should already know how to use Jupyter
Preferable: basic knowledge of the spaCy NLP library
[b]Description[/b]
Natural Language Processing (NLP) is often* considered to be the combination of two branches of study: Natural Language Understanding (NLU) and Natural Language Generation (NLG). Large Language Models can do both NLU and NLG. In this course we are primarily interested in the NLU aspect, specifically how to extract structured information from free-form text. (There is also an NLG aspect to the course, which you will notice as you watch the video lessons.)
Recently both GPT and Gemini introduced the ability to extract structured output from the prompt text. As of this writing (November 2024), they are the only LLMs that provide native support for this feature through the API itself. In other words, you can simply specify the response schema as a Python class, and the LLM will give you a "best effort" response that is guaranteed to follow the schema. It is "best effort" because, although the response always follows the schema, some fields are sometimes left empty. How can we assess the accuracy of this structured information extraction?
This course provides a practical and systematic approach for assessing the accuracy of LLM Structured Output responses. So which one is better, GPT or Gemini? Watch the course to find out :-)
*For example, that is how Ines Montani, co-founder of spaCy, recently described the fields in a podcast interview.
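To make the idea of a schema class and an accuracy check concrete, here is a minimal sketch. It is not taken from the course notebook: the `Invoice` schema, the sample replies, and the helper names `parse_response` and `field_accuracy` are all hypothetical. It uses only the Python standard library and makes no API calls. The JSON strings stand in for schema-conforming replies from either model.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Invoice:
    """Hypothetical response schema; any field may come back empty (None)."""
    total: Optional[float] = None      # numerical value
    due_date: Optional[str] = None     # date as an ISO 8601 string
    is_paid: Optional[bool] = None     # boolean value


def parse_response(raw_json: str) -> Invoice:
    """Load a schema-conforming JSON reply into the schema class."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(Invoice)}
    return Invoice(**{k: v for k, v in data.items() if k in known})


def field_accuracy(predicted, gold):
    """Fraction of exact matches per field.

    An empty prediction fails to match a labelled value, so it counts as a miss.
    """
    scores = {}
    for f in fields(Invoice):
        hits = sum(
            getattr(p, f.name) == getattr(g, f.name)
            for p, g in zip(predicted, gold)
        )
        scores[f.name] = hits / len(gold)
    return scores


# Stand-ins for model replies (assumed data, not real API output).
replies = [
    '{"total": 120.5, "due_date": "2024-11-30", "is_paid": false}',
    '{"total": 80.0, "due_date": null, "is_paid": true}',
]
# Hand-labelled ground truth for the same two records.
gold = [
    Invoice(120.5, "2024-11-30", False),
    Invoice(80.0, "2024-12-15", True),
]

predicted = [parse_response(r) for r in replies]
scores = field_accuracy(predicted, gold)
print(scores)
```

Scoring each field separately is one possible design choice. It lets numerical, date, and boolean fields be compared on their own, and it makes empty fields visible as lost accuracy instead of hiding them behind a reply that is still valid under the schema.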
Overview
Section 1: Introduction
Lecture 1 Is this meme still true?
Lecture 2 About this course
Lecture 3 Why not use client libraries
Section 2: Getting started
Lecture 4 Install libraries
Lecture 5 Set environment variables
Lecture 6 Download the Jupyter notebook
Section 3: Numerical values
Lecture 7 Exploring numerical values in the dataset
Lecture 8 Extracting numerical values using Gemini
Lecture 9 Measuring Gemini accuracy for numerical values
Lecture 10 Extracting numerical values using GPT
Lecture 11 Measuring GPT accuracy for numerical values
Lecture 12 Comparing Gemini and GPT accuracy for numerical values
Section 4: Date values
Lecture 13 Exploring date values in the dataset
Lecture 14 Extracting date values using Gemini
Lecture 15 Measuring Gemini accuracy for date values
Lecture 16 Extracting date values using GPT
Lecture 17 Measuring GPT accuracy for date values
Lecture 18 Comparing GPT and Gemini accuracy for date values
Section 5: Boolean values
Lecture 19 Exploring boolean values in the dataset
Lecture 20 Extracting boolean values using Gemini
Lecture 21 Measuring Gemini accuracy for boolean values
Lecture 22 Extracting boolean values using GPT
Lecture 23 Measuring GPT accuracy for boolean values
Lecture 24 Comparing GPT and Gemini accuracy for boolean values
Section 6: Why use an Explanation
Lecture 25 Downsides of using the Explanation class
Lecture 26 Explanation provides a future reference
Lecture 27 Explanation can speed up annotation for spaCy Prodigy
Lecture 28 Explanation can provide more accurate responses
Lecture 29 Better responses: an example
Lecture 30 What we can infer from the quality of GPT and Gemini explanations
[b]Who this course is for[/b]
Intermediate Python developers who want to learn how to use GPT and Gemini to extract structured information from any dataset