Introduction to the DecentralML Pallet
In a landscape dominated by tech behemoths such as OpenAI, Google, Microsoft, and Amazon, which invest colossal sums (tens, and soon hundreds, of billions of USD) in foundational model training, DecentralML stands out as a pioneer of decentralized machine learning. It helps protect privacy while offering an alternative, more specifically focused set of models (think individuals or brands) compared to centralised foundational models. This approach is designed to address the significant financial and infrastructural challenges of datacenter usage and human data labeling, both of which are costly and give the tech giants a significant competitive advantage. DecentralML distributes the task of training across a diverse array of machines, devices, or nodes, thereby challenging the supremacy of traditional data centers.
A key aspect of DecentralML is its empowerment of data labelers, who are often located in economically disadvantaged regions. These individuals are crucial to building models for companies such as OpenAI and Google, yet typically receive meager compensation. DecentralML transforms this dynamic by offering direct cryptocurrency payments, which also sidesteps the banking issues prevalent in these regions and provides the foundation for a new set of model-labeling dApps. This approach not only democratizes data labeling but also tackles many of the MLOps challenges faced by model engineers (see the problem-space study Hidden Technical Debt in Machine Learning Systems).
Through the introduction of federated machine learning tasks, DecentralML presents a decentralized alternative for the development of foundational models, challenging the existing paradigm. This initiative marks a significant shift towards a more equitable and decentralized approach in machine learning development, where tasks are not only distributed but also fairly compensated, ensuring a more balanced and inclusive future in the field.
4 Key Roles in DecentralML Tasks
The DecentralML ecosystem assigns tasks to four key roles, each contributing to the success of a machine learning project (a sketch of how these roles might be modelled on-chain follows the list):
1. Model Creator
The Model Creator plays a pivotal role in orchestrating tasks within the DecentralML project. Their responsibilities include defining the tasks, building the initial model, setting federated machine learning parameters, and coordinating the update of the model from task results produced by Model Engineers and Data Contributors.
2. Model Engineer
Model Engineers are crucial for the technical development and refinement of machine learning models. As team members designated by the Model Creator, they are responsible for creating project tasks, refining models, and managing the model's technical evolution through Tasks recorded on-chain (an audit trail useful for MLOps and model versioning).
3. Data Contributor
Data Contributors are tasked with labeling data, a critical process for training and refining machine learning models. Their work underpins the accuracy of model predictions and helps improve the effectiveness of the model's algorithms.
4. Model Contributor
Model Contributors engage in federated machine learning tasks, downloading and executing data science tasks within dockerised environments and submitting the results for integration. They are compensated by the Model Creator, and their work is validated through strategies the Model Creator establishes.
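As a rough illustration, the four roles described above could be modelled as a simple Rust enum. The variant names below mirror the role names in this section; the actual types used by pallet_decentralml may differ.

```rust
// Illustrative sketch only: the four DecentralML participant roles as an enum.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum ParticipantRole {
    /// Defines tasks, builds the initial model, and coordinates model updates.
    ModelCreator,
    /// Refines models and manages technical evolution via on-chain Tasks.
    ModelEngineer,
    /// Labels data used to train and refine models.
    DataContributor,
    /// Runs federated learning tasks in dockerised environments.
    ModelContributor,
}
```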
DecentralML Tasks
Tasks are central to how DecentralML functions. Recorded on-chain and auditable, they facilitate direct payment in any cryptocurrency for specific contributions from Model Contributors, Data Contributors, and Model Engineers. Tasks are tailored to machine learning projects and support a wide variety of federated machine learning frameworks through the use of docker and associated files.
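To make this concrete, the following is a minimal sketch of what an on-chain task record could look like, building on the ParticipantRole enum sketched above. The field names here are hypothetical; pallet_decentralml's real storage layout may differ.

```rust
// Hypothetical task record: who created it, which role it targets,
// what it pays, and where the docker image / associated files live.
#[derive(Clone, Debug)]
pub struct Task<AccountId, Balance> {
    /// Account of the Model Creator who defined the task.
    pub creator: AccountId,
    /// The role the task is intended for (see ParticipantRole above).
    pub role: ParticipantRole,
    /// Reward paid out once the task result is validated.
    pub reward: Balance,
    /// Reference (e.g. a content hash or URI) to the docker image and
    /// associated files describing the machine learning job.
    pub artifact_uri: Vec<u8>,
}
```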
Overview of pallet_decentralml
pallet_decentralml orchestrates tasks that provide MLOps, data labeling, and federated learning capabilities. These tasks are defined by the following key methods (a sketch of the validation strategies and typical call order follows the list):
Methods
create_task: Enables the Model Creator to define new tasks within the project.
assign_task: Assigns tasks to Data Contributors, Model Contributors, or Model Engineers.
send_task_result: Allows contributors and engineers to submit the results of their assigned tasks.
validate_task_result: Implements validation strategies for task results, including AutoAccept, ManualAccept, and CustomAccept.
accept_task_result and reject_task_result: These methods are used for validating task results, either as part of a strategy or manually by the Model Creator.
list_tasks and list_task_results: Provide overviews of available tasks and their respective outcomes.
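The sketch below illustrates the validation strategies named above (AutoAccept, ManualAccept, CustomAccept) and the typical order in which the extrinsics would be called. The parameter lists are hypothetical and not the pallet's exact API.

```rust
// Illustrative sketch of the result-validation strategies.
pub enum ValidationStrategy {
    /// Results are accepted as soon as they are submitted.
    AutoAccept,
    /// The Model Creator reviews each result via accept_task_result
    /// or reject_task_result.
    ManualAccept,
    /// A custom, creator-supplied rule decides acceptance.
    CustomAccept,
}

// Typical task lifecycle (pseudocode; parameter lists are hypothetical):
//   create_task(origin, task_spec, reward, ValidationStrategy::ManualAccept)
//   assign_task(origin, task_id, contributor)
//   send_task_result(origin, task_id, result_uri)
//   accept_task_result(origin, task_id)   // or reject_task_result(origin, task_id)
```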
Next, we will examine each of the methods defined in pallet_decentralml.