expert.ai NL Platform eLearning → Masterclasses

Annotations for NLP: Challenges and Best Practices (MC02)


Description

In this course, you will learn:

- why high-quality annotations matter so much to the training of an effective machine learning model

- the most common kinds of labels used for categorization and for extraction projects

- the factors that you should take into account while designing the annotation task

- the practical concerns that determine how the annotation project should be managed

- a few tips on how to write effective annotation guidelines

- the most common gray areas that might cause inconsistencies in the annotations

- the metrics that might be used to assess the quality of the annotations (one such metric is sketched after this list)

- the techniques that can be used to facilitate the task of manual annotation
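
To give a flavor of the metrics covered in section 9, below is a minimal sketch of Cohen's kappa, a common inter-annotator agreement measure for categorization labels. This is an illustration only, not material taken from the course; the function name `cohen_kappa` and the toy label sets are made up for the example.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Agreement between two annotators who labeled the same items, corrected for chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labeled identically.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, based on each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Toy example: two annotators categorizing the same five documents.
annotator_1 = ["sports", "finance", "sports", "politics", "finance"]
annotator_2 = ["sports", "finance", "politics", "politics", "finance"]
print(f"Cohen's kappa: {cohen_kappa(annotator_1, annotator_2):.2f}")
```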

Content
  • Welcome! Course Overview and Objectives
  • 1 Why Good Annotations Matter
  • 1.1 Annotations and Supervised Machine Learning
  • 1.2 Attribute Noise and Class Noise
  • 1.3 Random Class Noise and Structured Class Noise
  • 1.4 The Impact of Annotations on a Machine Learning Project
  • 2 The Challenges of Manual Annotation
  • 2.1 Linguistic Ambiguity and Ground Truth
  • 2.2 Deciding What to Annotate
  • 2.3 The Criterion of Explicit Linguistic Evidence
  • 2.4 Deciding When to Annotate
  • 2.5 Deciding How Much to Annotate
  • 2.6 Repetitiveness and Error
  • 3 A Breakdown of Categorization and Extraction Labels
  • 3.1 Introduction to Categorization and Extraction Labels
  • 3.2 Stand-off Annotations
  • 3.3 Labels for Categorization Projects
  • 3.4 Handling Complex Categorization Taxonomies
  • 3.5 Labels for Extraction Projects
  • 3.6 Attributes and Links in Extraction Annotations
  • 3.7 Rendering Attributes and Links in the expert.ai Platform
  • 4 Designing the Annotation Task
  • 4.1 Identifying Goals and Constraints
  • 4.2 Drafting the List of Labels
  • 4.3 Understanding Correctness and Informativity
  • 5 Practical Concerns
  • 5.1 Practical Concerns in Annotation Projects
  • 5.2 Picking the Annotation Methods
  • 5.3 Selecting the Workforce
  • 5.4 Sizing the Annotation Team
  • 5.5 Resorting to Microtasks
  • 6 The Annotation Guidelines
  • 6.1 Introduction to the Annotation Guidelines
  • 6.2 Tailoring the Annotation Guidelines
  • 6.3 Summarizing the End Goal
  • 6.4 Explaining Each Target Label
  • 6.5 Addressing Potential Doubts
  • 6.6 Creating an "Annotation Algorithm"
  • 6.7 Providing Clear Criteria
  • 6.8 Researching Other Annotation Guidelines
  • 7 Handling Gray Areas
  • 7.1 Introduction to Common Gray Areas
  • 7.2 Co-occurring Labels
  • 7.3 The Span of the Tag
  • 7.4 Anaphoric References
  • 7.5 Defining Explicit Linguistic Evidence
  • 7.6 Keeping It Simple
  • 8 Best Practices for Annotation
  • 9 Assessing the Quality of Annotations: The Metrics
  • 10 Beyond Manual Annotation
  • 10.1 The Alternatives to Manual Annotation
  • 10.2 Weak Supervision
  • 10.3 Semi-supervision
  • 10.4 Active Learning
  • Appendix
  • Thank You & Main References
  • Course Feedback Survey
  • TEST - Final Test
Completion rules
  • All units must be completed
  • Leads to a certificate valid for 1 year