Udemy - Practical Text Analytics Using spaCy V3.0
Posted by OneDDL, 10-11-2024

Published 10/2024
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 1.16 GB | Duration: 2h 0m

How to extract information WITHOUT building custom Machine Learning models

What you'll learn
Understand the spaCy document object
How spaCy pipelines work
How to use rule-based matching for information extraction
A system for practical, iterative text analytics using the itables library

Requirements
Intermediate knowledge of Python programming
Basic knowledge of the pandas dataframe library

Description
What is text analytics?
I like this definition: "Text analytics is the process of transforming unstructured text documents into usable, structured data. Text analysis works by breaking apart sentences and phrases into their components, and then evaluating each part's role and meaning using complex software rules and machine learning algorithms." [Source: Lexalytics website]

In spaCy, you can use machine learning algorithms in two ways:
1) pretrained models provided by spaCy and other organizations. For example, en_core_web_md, which I use in this course, is a pretrained model provided by Explosion, the company which created spaCy
2) custom machine learning models that you train on your data, which are often referred to in the documentation as "statistical models"

Why not statistical models?
This is what the makers of spaCy say in their documentation:
"For complex tasks, it's usually better to train a statistical entity recognition model. However, statistical models require training data, so for many situations, rule-based approaches are more practical.
This is especially true at the start of a project: you can use a rule-based approach as part of a data collection process, to help you "bootstrap" a statistical model.

Training a model is useful if you have some examples and you want your system to be able to generalize based on those examples. It works especially well if there are clues in the local context. For instance, if you're trying to detect person or company names, your application may benefit from a statistical named entity recognition model.

Rule-based systems are a good choice if there's a more or less finite number of examples that you want to find in the data, or if there's a very clear, structured pattern you can express with token rules or regular expressions. For instance, country names, IP addresses or URLs are things you might be able to handle well with a purely rule-based approach."

Just to clarify, I am not against developing statistical models, but as the documentation states quite clearly, it is often more practical to start with rule-based systems. One of my main aims in this course is to provide a solid understanding of what you can and cannot do using just a rule-based system. In fact, I use only one dataset in this entire course, so it is a lot easier for students to make this distinction.

When you combine a rule-based system with the data visualization technique I describe in this course, you will also gain a very good understanding of your dataset. You can then use this understanding to improve your statistical model if you choose to build one.

In my view, most people barely scratch the surface when it comes to using spaCy rules for text analytics. I hope this course will give them a lot of new insight into how to approach this task.
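To make the quoted distinction concrete, here is a minimal sketch of the two cases the documentation names: a structured pattern expressed as a token rule (URLs) and a finite list of examples (country names). It is not taken from the course notebook. It assumes spaCy v3 is installed and uses a blank English pipeline instead of the en_core_web_md model mentioned above, so no model download is needed. The country list, the sample sentence and the extract helper are made up for illustration.

```python
import spacy
from spacy.matcher import Matcher, PhraseMatcher

# Blank English pipeline: tokenizer only, no pretrained or custom model.
# (Assumption: the course itself uses en_core_web_md; this sketch does not need it.)
nlp = spacy.blank("en")

# Structured pattern as a token rule: any single token that looks like a URL.
matcher = Matcher(nlp.vocab)
matcher.add("URL", [[{"LIKE_URL": True}]])

# Finite list of examples: a few country names (illustrative, not exhaustive).
# attr="LOWER" makes the comparison case-insensitive.
phrase_matcher = PhraseMatcher(nlp.vocab, attr="LOWER")
countries = ["India", "Germany", "United States"]
phrase_matcher.add("COUNTRY", [nlp.make_doc(name) for name in countries])


def extract(text):
    """Hypothetical helper: return the URLs and country names found in text."""
    doc = nlp(text)
    urls = [doc[start:end].text for _, start, end in matcher(doc)]
    found = [doc[start:end].text for _, start, end in phrase_matcher(doc)]
    return {"urls": urls, "countries": found}


result = extract(
    "Our office in the United States has moved, see https://example.com for details."
)
print(result)
```

Neither matcher needs training data, which is the practical advantage the description argues for. Detecting person or company names the same way would require either a much longer list or a statistical model.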
Overview

Section 1: About this course
Lecture 1 How this course is different from other spaCy courses
Lecture 2 The best dataset for learning text analytics

Section 2: Exploring spaCy document objects
Lecture 3 Import libraries
Lecture 4 Splitting text into sentences
Lecture 5 Splitting text into words
Lecture 6 Part-of-speech tagging
Lecture 7 Stop words and punctuation
Lecture 8 Text spans
Lecture 9 Dependency Parse Tree
Lecture 10 Named Entity Recognition
Lecture 11 Token is_ attributes
Lecture 12 Token like_ attributes
Lecture 13 More token attributes
Lecture 14 Remaining token attributes
Lecture 15 Visualizing the Subtree
Lecture 16 Visualizing the token head

Section 3: spaCy pipelines
Lecture 17 Display pipeline
Lecture 18 Tokenizer is unique
Lecture 19 tagger
Lecture 20 parser
Lecture 21 attribute_ruler
Lecture 22 lemmatizer
Lecture 23 ner

Section 4: Rule based matching
Lecture 24 Token matcher
Lecture 25 Dependency Matcher based on position
Lecture 26 Dependency Matcher based on the parse tree
Lecture 27 Phrase matcher

Section 5: Download the Jupyter notebook
Lecture 28 Download the Jupyter notebook used in this course

Who this course is for
Data Science practitioners who want to use spaCy and Natural Language Processing
Anyone who has a spreadsheet where one of the columns is a paragraph of text and wants to know how to extract useful information from that text to use with the filters you can apply on the OTHER columns (sort, less than, greater than, etc.) in spreadsheet tools like Excel and Airtable
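As a rough idea of what Sections 2 and 3 of the overview cover, the sketch below splits text into sentences and words, reads a few token is_ and like_ attributes, and displays the pipeline. It is not the course notebook. It assumes spaCy v3 and uses a blank English pipeline with the rule-based sentencizer component, so it runs without downloading a model. The sample text is made up. Part-of-speech tags, the dependency parse and named entities are not available in this setup, because they need a pretrained pipeline such as en_core_web_md.

```python
import spacy

# Blank English pipeline plus a rule-based sentence splitter.
# (Assumption: chosen so the sketch runs without a pretrained model.)
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Made-up sample text.
doc = nlp("The course has 28 lectures. It runs for two hours!")

# Splitting text into sentences and words.
sentences = [sent.text for sent in doc.sents]
words = [token.text for token in doc]

# One row per token with a few lexical is_ / like_ attributes.
rows = [
    {
        "text": token.text,
        "is_stop": token.is_stop,
        "is_punct": token.is_punct,
        "like_num": token.like_num,
    }
    for token in doc
]

# Tokens that look like numbers, whether written as digits or as words.
numbers = [row["text"] for row in rows if row["like_num"]]

# Display the pipeline: the tokenizer always runs first and is not listed here.
print(nlp.pipe_names)
print(sentences)
print(numbers)
```

A list of dictionaries like rows can be passed directly to pandas.DataFrame, which is one way to get the extracted attributes into a table that can be sorted and filtered alongside the other columns.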