Image Processing (MNIST Example)

import io
 
import pandas as pd
import turboml as tb
from PIL import Image
from torchvision import datasets, transforms
 
 
class PILToBytes:
    """Transform that encodes a PIL image as bytes in the given format."""
 
    def __init__(self, format="JPEG"):
        self.format = format
 
    def __call__(self, img):
        if not isinstance(img, Image.Image):
            raise TypeError(f"Input should be a PIL Image, but got {type(img)}.")
        # Serialize the image into an in-memory buffer and return the raw bytes.
        buffer = io.BytesIO()
        img.save(buffer, format=self.format)
        return buffer.getvalue()
 
 
transform = transforms.Compose(
    [
        transforms.Resize((28, 28)),
        PILToBytes(format="PNG"),
    ]
)
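
Because the transform replaces each image with encoded bytes, a quick header check can confirm the format and size before anything is uploaded. The sketch below uses only the standard library and reads the width and height from the PNG IHDR chunk. `png_dimensions` is a hypothetical helper introduced here for illustration; it is not part of TurboML, torchvision, or Pillow.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Return (width, height) read from the IHDR chunk of PNG-encoded bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        raise ValueError("not PNG data")
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type.
    if data[12:16] != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    # Width and height are big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```

With the transform above, calling `png_dimensions` on any image taken from the dataset should give `(28, 28)`, since the resize step runs before the PNG encoding.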

Data Inspection

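Downloading the MNIST training and test splits. The transform is applied on access, so each image is returned as 28x28 PNG bytes; the loops then collect the images and labels into lists.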

mnist_dataset_train = datasets.MNIST(
    root="./data", train=True, download=True, transform=transform
)
mnist_dataset_test = datasets.MNIST(
    root="./data", train=False, download=True, transform=transform
)
images_train = []
images_test = []
labels_train = []
labels_test = []
 
for image, label in mnist_dataset_train:
    images_train.append(image)
    labels_train.append(label)
 
for image, label in mnist_dataset_test:
    images_test.append(image)
    labels_test.append(label)
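
The loops above walk each split in full and encode every image, although only the first five test rows are kept further down. A minimal sketch of collecting just the first few samples with `itertools.islice` follows. `take_pairs` is a hypothetical helper, and `toy_dataset` is made-up stand-in data rather than real MNIST samples.

```python
from itertools import islice


def take_pairs(dataset, limit=None):
    """Collect (image, label) pairs into two parallel lists.

    limit=None consumes the whole dataset; an integer stops early.
    """
    images, labels = [], []
    for image, label in islice(dataset, limit):
        images.append(image)
        labels.append(label)
    return images, labels


# Stand-in data: bytes payloads paired with integer labels.
toy_dataset = [(b"img0", 7), (b"img1", 2), (b"img2", 1)]
images, labels = take_pairs(toy_dataset, limit=2)
print(images, labels)
```

Assuming the dataset objects above are iterated the same way as in the loops, `take_pairs(mnist_dataset_test, limit=5)` would encode only the five images that are used for inference.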

Transforming the lists into Pandas DataFrames.

image_dict_train = {"images": images_train}
label_dict_train = {"labels": labels_train}
image_df_train = pd.DataFrame(image_dict_train)
label_df_train = pd.DataFrame(label_dict_train)
 
image_dict_test = {"images": images_test}
label_dict_test = {"labels": labels_test}
image_df_test = pd.DataFrame(image_dict_test)
label_df_test = pd.DataFrame(label_dict_test)

Adding index columns to the DataFrames to act as primary keys for the datasets.

image_df_train.reset_index(inplace=True)
label_df_train.reset_index(inplace=True)
 
image_df_test.reset_index(inplace=True)
label_df_test.reset_index(inplace=True)
image_df_train.head()
image_df_test.head()
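# Keep only the first five test rows. The "index" column created above is
# retained as the key; drop=True just avoids adding a second index column.
image_df_test = image_df_test[:5].reset_index(drop=True)
label_df_test = label_df_test[:5].reset_index(drop=True)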

Using PandasDataset class for compatibility with the TurboML platform.

images_train = tb.PandasDataset(
    dataframe=image_df_train, key_field="index", streaming=False
)
labels_train = tb.PandasDataset(
    dataframe=label_df_train, key_field="index", streaming=False
)
 
images_test = tb.PandasDataset(
    dataframe=image_df_test, key_field="index", streaming=False
)
labels_test = tb.PandasDataset(
    dataframe=label_df_test, key_field="index", streaming=False
)

Extracting the features and the targets from the TurboML-compatible datasets.

imaginal_fields = ["images"]
 
features_train = images_train.get_input_fields(imaginal_fields=imaginal_fields)
targets_train = labels_train.get_label_field(label_field="labels")
 
features_test = images_test.get_input_fields(imaginal_fields=imaginal_fields)
targets_test = labels_test.get_label_field(label_field="labels")

CLIP Model Initialization

We create a ClipEmbedding model from gguf_model. The CLIP model is pulled from the Hugging Face repository. As it is already quantized, we can pass the model file name directly in the 'select_model_file' parameter.

gguf_model = tb.acquire_hf_model_as_gguf(
    "xtuner/llava-llama-3-8b-v1_1-gguf", "auto", "llava-llama-3-8b-v1_1-mmproj-f16.gguf"
)
gguf_model
model = tb.ClipEmbedding(gguf_model_id=gguf_model)

Model Training

Calling learn on the ClipEmbedding model with the training features and targets.

model = model.learn(features_train, targets_train)

Model Inference

Performing inference on the trained model using the test data.

outputs_test = model.predict(features_test)
outputs_test
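
One common way to sanity-check embeddings is to compare pairs of them with cosine similarity. The sketch below assumes each embedding has already been extracted as a plain list of floats; how to pull the vectors out of `outputs_test` depends on its layout, which this page does not show. `cosine_similarity` is a hypothetical helper written for illustration, and the vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up vectors for illustration only, not real CLIP embeddings.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate nearly orthogonal ones.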