
Weaviate List Data Objects

Weaviate is a free, open-source vector database with GraphQL and RESTful APIs that lets you take advantage of data linkage and automatic classification. It combines machine learning with vector search to make information from different sources easily accessible and to connect the dots between your data objects.

In Weaviate, an object is the central data unit of the ecosystem: an instance of a specific schema class. It comprises four main components:

  1. Id – This is a unique identifier for the data object, typically a UUID value.
  2. Class – The class defines the schema to which the data object belongs. The class determines the properties of the object.
  3. Properties – The properties part of the object stores the actual object data. Properties correspond to the properties that are defined in the schema class and can be of various supported data types such as text, numeric, date, Boolean, etc.
  4. Meta – This component includes the metadata about the object such as the last update time, creation time, vector representation, etc.
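As a rough illustration, the four components can be pictured as the fields of a single object as it appears in a REST response. The sketch below uses a plain Python dictionary. The field names (id, class, properties, creationTimeUnix, lastUpdateTimeUnix, vector) are assumed to match the REST response format, while the "Article" class, the UUID, the timestamps, and the vector are made-up placeholders, not real output.

```python
# Illustrative shape of a single Weaviate data object (all values are placeholders).
example_object = {
    "id": "00000000-0000-0000-0000-000000000001",  # 1. Id: a UUID
    "class": "Article",                             # 2. Class: hypothetical schema class
    "properties": {                                 # 3. Properties: the actual data
        "title": "Hello Weaviate",
        "wordCount": 120,
    },
    # 4. Meta: timestamps and the vector representation
    "creationTimeUnix": 1700000000000,
    "lastUpdateTimeUnix": 1700000000000,
    "vector": [0.12, -0.03, 0.88],
}

META_KEYS = ("creationTimeUnix", "lastUpdateTimeUnix", "vector")


def split_components(obj):
    """Group a raw object dictionary into the four components described above."""
    return {
        "id": obj["id"],
        "class": obj["class"],
        "properties": obj.get("properties", {}),
        "meta": {key: obj[key] for key in META_KEYS if key in obj},
    }


components = split_components(example_object)
print(components["class"], sorted(components["meta"]))
```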

This tutorial explores the fundamentals of working with data objects in Weaviate. We will create a basic data object and then learn how to list the data objects in a Weaviate instance.

Create a Data Object in Weaviate

The first step is to set up a basic data object for demonstration. This post does not cover the full concept of creating data objects in Weaviate; you can check our tutorial on that topic or dive into the documentation to learn more.

We start by configuring the schema that represents the data structure that we wish to store. The schema itself comprises classes that represent the types of objects that we wish to create.

Suppose we wish to create a schema that stores information about database servers. We can start by creating a class called “DatabaseServer” with the properties as follows:

  1. Name – This property specifies the name of the database server.
  2. Type – It defines the type of database, e.g. MySQL, PostgreSQL, MongoDB, etc.
  3. Version – We can also have a property that stores the version of the running database server.
  4. Host – This property stores the host address of the database server.
  5. Port – Finally, we have the port property which stores the port on which the server is running.

Once we define the schema class, we can create the data objects that represent the instances of “DatabaseServer”.

We can use the Weaviate SDK for the Python programming language to create such a class as shown in the following example code:

import weaviate

# Connect to a local Weaviate instance (v3 Python client API).
client = weaviate.Client("http://localhost:8080")

class_obj = {
    "class": "DatabaseServer",
    "description": "Information about a database server",
    "properties": [
        {
            "dataType": ["string"],
            "description": "Name of the database server",
            "name": "name",
        },
        {
            "dataType": ["string"],
            "description": "Type of the database server (MySQL, PostgreSQL, MongoDB, etc.)",
            "name": "type"
        },
        {
            "dataType": ["string"],
            "description": "Version of the database server",
            "name": "version"
        },
        {
            "dataType": ["string"],
            "description": "Host of the database server",
            "name": "host"
        },
        {
            "dataType": ["int"],
            "description": "Port of the database server",
            "name": "port"
        }
    ]
}

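# Register the class in the Weaviate schema.
client.schema.create_class(class_obj)

# The keys must match the property names defined in the class.
data_obj = {
    "name": "Primary Database",
    "type": "MySQL",
    "version": "8.0.23",
    "host": "192.168.1.100",
    "port": 3306
}

# Create the object; ConsistencyLevel.ALL asks every replica to acknowledge the write.
data_uuid = client.data_object.create(
    data_obj,
    "DatabaseServer",
    consistency_level=weaviate.data.replication.ConsistencyLevel.ALL
)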

This should create the schema class and the data object defined above. The create call returns the UUID of the new object, which we store in data_uuid.

List the Data Objects in Weaviate

There are various methods of fetching the data objects. The first method is using the API endpoint.

Using the API Endpoint

We can simply send a GET request to the /v1/objects endpoint to retrieve the data objects in the Weaviate instance.

NOTE: A request to this endpoint without any parameters applies no class filter. Hence, it returns the data objects across all the classes. However, it has a default limit of 25 objects.

You can also perform a more granular filtering as shown in the following example syntax:

GET /v1/objects?class={ClassName}&limit={limit}&include={include}

This allows you to specify which class you wish to target and the maximum number of data objects that you wish the request to return. The include parameter requests additional information for each object, such as its vector.
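As a minimal sketch, the same request can be assembled in Python with only the standard library. The helper names build_objects_url and fetch_objects are hypothetical, and the base URL assumes a local instance on port 8080 as in the earlier example. Only the URL builder runs here; fetch_objects needs a reachable Weaviate instance, so it is defined but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_objects_url(base_url, class_name=None, limit=None, include=None):
    """Return the /v1/objects URL, adding only the parameters that were given."""
    params = {}
    if class_name is not None:
        params["class"] = class_name
    if limit is not None:
        params["limit"] = limit
    if include is not None:
        params["include"] = include
    query = urlencode(params)
    url = base_url.rstrip("/") + "/v1/objects"
    return f"{url}?{query}" if query else url


def fetch_objects(base_url, **filters):
    """Send the GET request and decode the JSON body (needs a running instance)."""
    with urlopen(build_objects_url(base_url, **filters)) as response:
        return json.load(response)


# Print the URL that would be requested for a hypothetical class "MyClass".
print(build_objects_url("http://localhost:8080", class_name="MyClass", limit=5))
```

With an instance running, a call such as fetch_objects("http://localhost:8080", class_name="MyClass", limit=5) should return the decoded response as a dictionary.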

You can also use the offset parameter to perform paging. The offset parameter defines at which position you wish to start fetching the data objects.

For example, to fetch the first 10 data objects, you can run the request as follows:

GET /v1/objects?class=MyClass&limit=10

To fetch the next 10, run the request as follows:

GET /v1/objects?class=MyClass&limit=10&offset=10

Similarly, to fetch the next 10 objects after that, run the request as follows:

GET /v1/objects?class=MyClass&limit=10&offset=20
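The three requests above follow a simple pattern: keep the limit fixed and raise the offset by the limit on every step. A rough sketch of that loop is shown below. The iter_objects helper is hypothetical, and it assumes that the response carries the results under an "objects" key. To keep the example runnable without a server, the HTTP call is replaced by a stand-in function that serves fake objects from memory.

```python
def iter_objects(fetch_page, limit=10):
    """Yield objects page by page until a page comes back short or empty.

    fetch_page is a caller-supplied function taking (limit, offset) and
    returning the decoded JSON response of GET /v1/objects.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset).get("objects") or []
        yield from page
        if len(page) < limit:
            break  # a short page means there is nothing left to fetch
        offset += limit


# Stand-in for a real HTTP call: serves 25 fake objects from memory.
fake_store = [{"id": str(n)} for n in range(25)]


def fake_fetch(limit, offset):
    return {"objects": fake_store[offset:offset + limit]}


all_objects = list(iter_objects(fake_fetch, limit=10))
print(len(all_objects))
```

In real use, fake_fetch would be swapped for a function that sends the GET request with the given limit and offset.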

For example, to fetch the data objects from the DatabaseServer class that we created in the previous example, we can run a query as follows:

curl -X GET "http://localhost:8080/v1/objects?class=DatabaseServer&limit=10" | jq

The previous command performs a request to the /v1/objects endpoint to fetch the first 10 data objects from the “DatabaseServer” class. We also pipe the output to jq to format it more readably.

The query returns the data objects with detailed information such as the source class, the creation time as a UNIX timestamp, the last update time, the total number of results, and more.
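To make these fields easier to read, the response can be post-processed in a few lines of Python. The sketch below is only an illustration: summarize_response is a hypothetical helper, the sample dictionary is written by hand to resemble a /v1/objects response (it is not real server output), and the timestamps are assumed to be in milliseconds since the UNIX epoch.

```python
from datetime import datetime, timezone


def summarize_response(response):
    """Return (total, rows) where each row is (class, id, creation time)."""
    rows = []
    for obj in response.get("objects", []):
        created_ms = obj.get("creationTimeUnix")
        # Assumption: the timestamp is in milliseconds, so divide by 1000.
        created = (
            datetime.fromtimestamp(created_ms / 1000, tz=timezone.utc)
            if created_ms is not None
            else None
        )
        rows.append((obj.get("class"), obj.get("id"), created))
    return response.get("totalResults"), rows


# Hand-written sample shaped like a /v1/objects response; not real server output.
sample = {
    "objects": [
        {
            "class": "DatabaseServer",
            "id": "00000000-0000-0000-0000-000000000001",
            "creationTimeUnix": 1700000000000,
            "lastUpdateTimeUnix": 1700000000000,
            "properties": {"name": "Primary Database", "port": 3306},
        }
    ],
    "totalResults": 1,
}

total, rows = summarize_response(sample)
print(total, rows[0][0], rows[0][2].isoformat())
```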

Conclusion

We explored the fundamentals of working with the /v1/objects API endpoint in Weaviate to gather the information about the data objects of a given class.

About the author

John Otieno
