Model Ownership: Vana's Decentralized Data Co-operative
Data is a strategic asset that plays a crucial role in the development of AI technology. However, the owners and creators of this data are often not fairly compensated for its economic value. Anna Kazlauskas, creator of Vana, initiated a project to change this structure. Her goal is to build an internet ecosystem where users can own and have a stake in the AI models they help develop, ensuring that the benefits are not monopolized by large tech companies.
At REDeFiNE TOMORROW 2025, Anna Kazlauskas shared her vision in a conversation with Danny Sursock, a Partner at Archetype, about the future of Decentralized AI founded on the principle of Collective Data Ownership. VANA is laying the infrastructure for this, enabling users' digital footprints to be transformed into controllable, value-generating assets and helping to drive a more equitable and transparent technological society.
The Data Wall and the Overlooked Treasure
The AI industry is currently facing a significant problem known as the “data wall.” This has occurred because leading AI companies like OpenAI and Google have already trained their models on nearly all of the public data available on the internet. Anna Kazlauskas points out that this vast amount of data is just a small fraction of the digital world, accounting for only 0.1% to 4% of all existing data.
The real treasure is the private data that people create every day, whether it's text messages, emails, or documents. This data is far superior to public data in both quantity and quality but remains locked inside the platforms of giant corporations. VANA's core mission is therefore to unlock this data by empowering people to reclaim and access their personal information, opening the door to this long-overlooked treasure.
The Legal Framework for Data Sovereignty
One of the most important yet least understood truths of the digital age is that users legally own their data. Most people mistakenly believe that using various platforms means relinquishing ownership of their content to those platforms.
Kazlauskas made an easy-to-understand comparison:
“When you park your car in a parking lot, the parking lot doesn't become the owner of your car just because you parked it there. It's the same when you put your data on a platform; that data is still completely and legally yours.”
A platform's terms of service only grant the company a license to use data for its operations, not to seize ownership. This principle is being enshrined in increasingly clear laws such as Europe's GDPR, California’s CCPA, and Utah’s Digital Choice Act. These laws have also created a framework for data portability, compelling companies to provide a user's data upon request within a fixed window, typically around 30 days (GDPR sets a one-month deadline, while CCPA allows 45 days).
VANA is built on this legal foundation, acting as a tool for users to exercise their rights. Through a decentralized mechanism, it pulls data from various platforms to be pooled for collective benefit. This turns each individual into their own "data processor," much like how Bitcoin allows you to be your own bank.
Building on the Success of the Reddit Data DAO
VANA's concept of data ownership was concretely proven by the success of the Reddit Data DAO (Decentralized Autonomous Organization), which invited Reddit users to pool their posts and comment history to create a collectively owned dataset.
The project received an overwhelming response, with 140,000 participants in the first week, and users with the highest Karma scores were able to earn $300-$400 from their data. This success not only validated VANA's concept but also prompted Reddit to reconsider its own data policies.
Kazlauskas stated that the key was creating clear economic value for something users never realized they owned. This financial incentive was powerful enough to attract new users to the crypto world, many of whom were willing to set up a crypto wallet specifically to join the DAO.
Furthermore, the project's success revealed that cross-platform data is significantly more valuable than data from a single source. For example, data from Spotify alone might be worth only $0.30 per user, but when combined with fashion and demographic data, its value can soar to $25. VANA is perfectly positioned to facilitate this data aggregation because only the users themselves can collect data from all the services they use.
A New Foundation: The Technical Architecture of VANA
To support this new data economy, VANA couldn't rely on existing blockchain infrastructure. The complex regulatory and privacy requirements of handling personal data necessitated a custom-built solution. VANA operates as a Layer-1 blockchain, a decision driven by the need for true decentralization.
Centralized solutions, such as Layer-2 networks that use a centralized sequencer, risk being classified as “data processors” under GDPR. This would lead to impossible compliance burdens, such as honoring a user's “right to be forgotten” by deleting data from an immutable blockchain. “We can’t make that trade-off,” Kazlauskas affirms. “It has to be a system where each user has the right to do this themselves through a decentralized platform.”
VANA's architecture features a Dual Validator System, which combines traditional L1 validators for network security with Data Validators that operate within Trusted Execution Environments (TEEs). TEEs are secure, isolated hardware environments that allow private data to be processed without exposing the raw data itself. This enables what VANA calls “Programmable Privacy”: Data DAOs approve specific code to run on their data, while the raw data remains encrypted and under the user's control, a concept known as “Non-custodial Data.”
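The approval flow described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not VANA's implementation: `DataDAO`, `MockEnclave`, and `code_fingerprint` are hypothetical names, the "enclave" is an ordinary Python object with no hardware isolation or attestation, and the records are held in plain memory rather than encrypted. It shows the core idea only: code runs on the data only if its hash matches one the DAO approved, and the caller receives the computed result rather than the raw records.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


def code_fingerprint(source: str) -> str:
    """Return a SHA-256 hex digest that identifies a piece of code."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


@dataclass
class DataDAO:
    """Hypothetical Data DAO that records which code its members approved."""
    name: str
    approved: set[str] = field(default_factory=set)

    def approve(self, source: str) -> str:
        digest = code_fingerprint(source)
        self.approved.add(digest)
        return digest


class MockEnclave:
    """Stand-in for a TEE-backed Data Validator (no real isolation here)."""

    def __init__(self, dao: DataDAO, records: list[str]) -> None:
        self._dao = dao
        self._records = list(records)  # never handed back to callers directly

    def run(self, source: str):
        # Refuse any code whose fingerprint the DAO has not approved.
        if code_fingerprint(source) not in self._dao.approved:
            raise PermissionError("code was not approved by the DAO")
        namespace: dict = {}
        exec(source, namespace)  # the approved code must define job(records)
        return namespace["job"](self._records)


# An approved job that returns only an aggregate (total word count).
WORD_COUNT_JOB = "def job(records):\n    return sum(len(r.split()) for r in records)\n"

dao = DataDAO("example-dao")
dao.approve(WORD_COUNT_JOB)
enclave = MockEnclave(dao, ["hello world", "private message here"])
total = enclave.run(WORD_COUNT_JOB)
print(total)  # 5
```

In this sketch, submitting unapproved code (for example, a job that simply returns the records) raises `PermissionError`. In a real deployment the guarantee would come from the TEE hardware and its attestation, not from a Python check.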
Data quality is maintained through a “Proof of Contribution” system, which scores incoming data based on metrics like account age and data volume, and even uses an LLM for additional quality checks. The verified data is then organized into a structured, SQL-queryable format to allow for granular access control for AI training.
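A scoring rule of this kind might look roughly like the following. The weights, caps, and the names `Contribution` and `contribution_score` are illustrative assumptions; the source only says that account age and data volume are among the metrics, and the LLM-based quality check is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    account_age_days: int
    item_count: int  # e.g. number of posts or comments submitted


# Hypothetical parameters, chosen only for illustration.
MAX_AGE_DAYS = 3650   # account age is capped at ten years
MAX_ITEMS = 10_000    # volume is capped at this many items
AGE_WEIGHT = 0.4
VOLUME_WEIGHT = 0.6


def contribution_score(c: Contribution) -> float:
    """Return a score in [0, 1] from capped, normalized age and volume."""
    age = min(max(c.account_age_days, 0), MAX_AGE_DAYS) / MAX_AGE_DAYS
    volume = min(max(c.item_count, 0), MAX_ITEMS) / MAX_ITEMS
    return AGE_WEIGHT * age + VOLUME_WEIGHT * volume


veteran = Contribution(account_age_days=3650, item_count=10_000)
newcomer = Contribution(account_age_days=30, item_count=50)
print(contribution_score(veteran))
print(contribution_score(newcomer))
```

Capping each metric keeps a single very old or very large account from dominating, which is one plausible design choice among many.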
The ecosystem is powered by a Dual Token System. The native VANA token secures the network and facilitates transactions, while VRC-20 tokens, created specifically for each dataset, represent ownership in that Data DAO. This tokenizes data into a new asset class, creating a liquid market and opening the door to new DeFi applications where data can be used as collateral.
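To make the idea of dataset tokens representing ownership concrete, here is a minimal sketch of a pro-rata split. It assumes, purely for illustration, that a fixed supply of a dataset's token is divided in proportion to each contributor's score; the source does not describe how VRC-20 tokens are actually distributed, and `allocate_dataset_tokens` is a hypothetical helper.

```python
def allocate_dataset_tokens(scores: dict[str, float], supply: int) -> dict[str, int]:
    """Split `supply` whole tokens across contributors in proportion to score."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("at least one contributor needs a positive score")
    # Round each share down, then hand out the leftover tokens one by one
    # to the highest-scoring contributors so the full supply is distributed.
    allocation = {user: int(supply * score / total) for user, score in scores.items()}
    remainder = supply - sum(allocation.values())
    for user in sorted(scores, key=scores.get, reverse=True)[:remainder]:
        allocation[user] += 1
    return allocation


shares = allocate_dataset_tokens({"alice": 3.0, "bob": 1.0}, supply=1000)
print(shares)  # {'alice': 750, 'bob': 250}
```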
User-Owned AI
VANA's ultimate goal isn't just to monetize data, but to forever change the ownership of AI itself. In a historic collaboration with Flower Labs, VANA helped build COLLECTIVE-1, the world's first user-owned foundation model. This 7-billion-parameter model was trained from scratch using text data contributed by users from VANA's various Data DAOs.
This achievement marks a major milestone for decentralized AI. “If you had asked someone 18 months ago if it was possible to do pre-training on a decentralized system, most would have said no,” Kazlauskas emphasizes. This breakthrough proves that community-driven AI development is not only possible but can also compete at the forefront of the industry.
The implications are profound. In a world where a few corporations control the AI models that increasingly shape our perception of reality, user-owned models provide a critical counterbalance. They transform AI from a potential threat that could take our jobs into a collectively owned tool where the contributors are also the beneficiaries. If you own a piece of the AI you helped teach, its success is your success.
Future Challenges and Opportunities
VANA's ambition is to attract 100 million users to create a massive dataset of 450 trillion tokens, which is 30 times larger than any dataset held by a single AI company. To achieve this, it will be necessary to create user-friendly products that go viral, simplifying the complexities of the crypto world for the average user.
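A quick back-of-the-envelope check shows what these targets imply. The only inputs are the three figures quoted above; the per-user average and the implied size of the largest single-company dataset are derived values, not numbers stated in the talk.

```python
TARGET_USERS = 100_000_000     # 100 million users
TARGET_TOKENS = 450 * 10**12   # 450 trillion tokens
SCALE_FACTOR = 30              # "30 times larger" than any single company's dataset

# Average contribution each user would need to make.
tokens_per_user = TARGET_TOKENS // TARGET_USERS

# Size of the largest single-company dataset implied by the 30x comparison.
implied_largest_single_company = TARGET_TOKENS // SCALE_FACTOR

print(f"{tokens_per_user:,} tokens per user")                        # 4,500,000
print(f"{implied_largest_single_company:,} tokens (implied baseline)")  # 15,000,000,000,000
```

In other words, the goal works out to an average of about 4.5 million tokens per user, measured against an implied baseline of roughly 15 trillion tokens.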
Of course, the path isn't without its challenges. VANA must navigate the regulatory landscapes of both crypto and data privacy, and it challenges the multi-trillion-dollar business models of tech giants, which could create significant resistance.
However, where there are challenges, there are also opportunities. The VANA ecosystem is growing rapidly, with the number of full-time Data DAO builders now exceeding the size of VANA's core team. Through initiatives like the VANA Academy, the team is cultivating a new generation of “data entrepreneurs” to build businesses in this new economy. The potential applications are wide-ranging, and Kazlauskas is particularly excited about the future of Data DAOs in several areas:
- Robotics: An industry severely limited by the lack of real-world training data.
- Healthcare: Where user-owned data can break down regulatory barriers to create models for personalized medicine and longevity science.
- Bio-Research: Where AI can analyze real-world data contributed by researchers to accelerate scientific discovery.
For the builders, developers, and investors watching the AI space, VANA represents a crucial opportunity to get in on the ground floor. It's a high-reward bet on a future where the internet's most valuable resource is controlled by those who create it. Whether it's by starting a new Data DAO, training a unique AI model on a distinctive dataset, or building DeFi applications with data as collateral, VANA is providing the tools to build a more equitable and decentralized digital world.





