
Start a Classification in Weaviate

Classification in Weaviate assigns a predefined label or category to a data object based on its features. It uses machine learning algorithms or statistical models to analyze the properties of the data and predict which class each object belongs to.

Weaviate’s classification framework involves feature extraction, dimensionality reduction, and model training, in which the system learns patterns and relationships within the data so that it can generalize and classify new, unseen instances.

The resulting classifier can then label incoming data objects based on their similarity to the learned patterns, which enables efficient, automated categorization in the Weaviate knowledge graph.
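To make the idea of classifying by similarity more concrete, here is a minimal sketch of k-nearest-neighbors voting in plain Python. It does not use Weaviate and is not how Weaviate implements classification internally; the two-dimensional vectors, the labels, and the helper names cosine_distance and knn_classify are made up for illustration.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_classify(query, labeled_vectors, k=3):
    """Label `query` by majority vote among its k nearest labeled vectors."""
    nearest = sorted(
        labeled_vectors,
        key=lambda item: cosine_distance(query, item[0]),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up vectors standing in for film embeddings (assumption, not real data).
training = [
    ([0.9, 0.1], "action"),
    ([0.8, 0.2], "action"),
    ([0.1, 0.9], "drama"),
    ([0.2, 0.8], "drama"),
    ([0.7, 0.3], "action"),
]

predicted = knn_classify([0.85, 0.15], training, k=3)
print(predicted)  # prints "action"
```

The value of k controls how many neighbors vote; the Weaviate classification later in this tutorial sets the same parameter through its settings.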

This tutorial shows how to use the Weaviate Python client to start a k-nearest-neighbors (KNN) classification in Weaviate.

Create a Class in Weaviate

Let us start by creating a class to store the film data. We use this class to demonstrate the classification and to retrieve details about it.

import weaviate
import json

# Connect to a local Weaviate instance (v3 Python client API).
client = weaviate.Client("http://localhost:8080")

# Schema definition for the "Film" class.
class_obj = {
    "class": "Film",
    "description": "A class representing a film.",
    # Use an HNSW vector index with cosine distance.
    "vectorIndexType": "hnsw",
    "vectorIndexConfig": {
        "distance": "cosine",
        "efConstruction": 128,
        "maxConnections": 32
    },
    "properties": [
        {
            "name": "title",
            "dataType": ["string"],
            "description": "The title of the film."
        },
        {
            "name": "director",
            "dataType": ["string"],
            "description": "The director of the film."
        },
        {
            "name": "year",
            "dataType": ["int"],
            "description": "The year the film was released."
        },
        {
            "name": "genre",
            "dataType": ["string"],
            "description": "The genre of the film."
        },
        {
            "name": "actors",
            "dataType": ["string"],
            "description": "The actors in the film."
        }
    ]
}

# Create the class, then fetch and print the schema to confirm it exists.
client.schema.create_class(class_obj)
schema = client.schema.get()
print(json.dumps(schema, indent=2))

This should create a “Film” class with the properties defined above.
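The class is still empty at this point, and a KNN classification needs existing objects to learn from. The following is a minimal sketch of loading a few films with the v3 client's client.data_object.create call. The SAMPLE_FILMS list and the import_films helper are illustrative names that are not part of the tutorial's own code, and the genre labels are only examples. Whether vectors are generated for these objects depends on the vectorizer configured on your Weaviate instance.

```python
# Illustrative sample data matching the Film properties (assumption: any
# small, labeled set of films would work equally well here).
SAMPLE_FILMS = [
    {
        "title": "Inception",
        "director": "Christopher Nolan",
        "year": 2010,
        "genre": "Science Fiction",
        "actors": "Leonardo DiCaprio",
    },
    {
        "title": "The Godfather",
        "director": "Francis Ford Coppola",
        "year": 1972,
        "genre": "Crime",
        "actors": "Marlon Brando",
    },
    {
        "title": "Parasite",
        "director": "Bong Joon-ho",
        "year": 2019,
        "genre": "Thriller",
        "actors": "Song Kang-ho",
    },
]


def import_films(client, films, class_name="Film"):
    """Create one Weaviate object per film and return the new object IDs."""
    ids = []
    for film in films:
        # data_object.create returns the ID of the created object.
        ids.append(client.data_object.create(film, class_name))
    return ids
```

Calling import_films(client, SAMPLE_FILMS) with the client created earlier should add three objects to the “Film” class and return their IDs.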

Start a Classification in Weaviate

Once we have the class and sample data, we can start a classification. We use a KNN classification for this post.

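# Only objects that match this filter are used as the training set.
# The path must point to a property of the Film class, such as "year".
trainingSetWhere = {
  "path": ["year"],
  "operator": "GreaterThan",
  "valueInt": 1
}

# The client requires a property to classify, and Weaviate expects it to be
# a reference property. "hasGenre" is an assumed Film -> Genre reference that
# is not defined in the Film class above; add it before running this step.
classification = client.classification.schedule()\
            .with_type("knn")\
            .with_class_name("Film")\
            .with_classify_properties(["hasGenre"])\
            .with_based_on_properties(["title"])\
            .with_training_set_where_filter(trainingSetWhere)\
            .with_settings({"k":3})\
            .do()

# The response contains the ID of the scheduled classification.
print(classification["id"])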

This should start a KNN classification on the target class. Once you get the classification ID, use it to gather the details about the classification.
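One way to gather those details is to poll the classification status until it finishes. The sketch below assumes the v3 Python client, whose client.classification object provides is_complete, is_failed, and get. The helper name wait_for_classification and its timeout and poll_interval parameters are hypothetical and chosen only for this example.

```python
import time


def wait_for_classification(client, classification_id, timeout=60.0, poll_interval=1.0):
    """Poll until the classification completes or fails, then return its details."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if client.classification.is_complete(classification_id):
            # get() returns the stored details of the classification.
            return client.classification.get(classification_id)
        if client.classification.is_failed(classification_id):
            raise RuntimeError(f"Classification {classification_id} failed")
        time.sleep(poll_interval)
    raise TimeoutError(
        f"Classification {classification_id} did not finish within {timeout} seconds"
    )
```

Pass the ID returned by the scheduling call to this helper; the returned details should include the classification's status and settings.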

Conclusion

We demonstrated the basics of working with the classification API in Weaviate to create a basic KNN classification. Refer to the Weaviate documentation for the other supported classification types and to decide which one suits your data.

About the author

John Otieno

My name is John and I am a fellow geek like you. I am passionate about all things computers, from hardware and operating systems to programming. My dream is to share my knowledge with the world and help out fellow geeks. Follow my content by subscribing to the LinuxHint mailing list.