Abstract
Generative artificial intelligence (AI) offers numerous opportunities for research and innovation, but its commercialization has raised concerns about the transparency and safety of frontier AI models. Most models lack the necessary components for full understanding, auditing, and reproducibility, and some model producers use restrictive licenses whilst claiming that their models are “open source”. To address these concerns, we introduce the Model Openness Framework (MOF), a three-tiered ranked classification system that rates machine learning models based on their completeness and openness, following open science principles. For each MOF class, we specify code, data, and documentation components of the model development lifecycle that must be released and under which open licenses. In addition, the Model Openness Tool (MOT) provides a user-friendly reference implementation to evaluate the openness and completeness of models against the MOF classification system. Together, the MOF and MOT provide timely practical guidance for (i) model producers to enhance the openness and completeness of their publicly-released models, and (ii) model consumers to identify open models and their constituent components that can be permissively used, studied, modified, and redistributed. Through the MOF, we seek to establish completeness and openness as core tenets of responsible AI research and development, and to promote best practices in the burgeoning open AI ecosystem.
Keywords Artificial intelligence · machine learning · open science · open source software · open data
1 Introduction
Artificial intelligence (AI) has seen remarkable advances in recent years [1] , driven by the growth in computational capabilities [2, 3] , available training data [4, 5] , and improved deep learning algorithms [6, 7, 8] . However, as AI systems have become more advanced, concerns have also grown regarding their transparency, reproducibility, and safety [9, 10, 11] . Most state-of-the-art (SOTA) models are black boxes, making it hard to explain their internal logic or to ensure fairness [12, 13] . While the number of publicly available models has been growing, many of these models are falsely being promoted as “open-source”, a practice that has been characterized as “openwashing” [14, 15, 16, 17] . The lack of transparency and reproducibility in AI models hinders sc...
To address these concerns, we introduce the Model Openness Framework (MOF) for evaluating and classifying the completeness and openness of machine learning (ML) models across their development process. Model producers must go beyond releasing models and trained weights; they should include all artifacts involved in the model development lifecycle. The MOF contributes to broader efforts that seek to promote transparency, reproducibility, and responsibility in AI R&D, including reproducibility checklists [19], ethical AI guidelines [20], and model and data cards [21, 22]. By adopting the MOF, the AI community can create a more open, accountable, and trustworthy ecosystem.
For the sake of simplicity in nomenclature, this paper refers to any person or entity that develops and trains a first-generation model as a “model producer” or simply a “producer”. This encompasses AI researchers, developers, AI hobbyists, or anyone who trains a model in some form or fashion, including fine-tuning and alignment, as long as they are the originator of the (foundation) model. Similarly, any person or entity that adopts, consumes, alters, or uses a model and corresponding artifacts for any purpose including modifying weights through fine-tuning is referred to as a “model consumer” or simply a “consumer”. This includes end users, researchers, developers, or anyone that uses an ML model and is not its producer. We also use the terms “ML” and “ML model” to broadly describe any model, whether classical machine learning or deep learning and both generative and discriminative.
The paper has the following structure. It begins with a discussion of related work and how the MOF builds on prior approaches to evaluating the openness of models (Section 2). Next, it reviews the concepts of openness and completeness in science and technology (Section 3). Then, it introduces the three classes of the MOF classification system (Section 4), as well as the 17 components (Section 5) and acceptable licenses per component (Section 6). Then, it discusses how to adopt the framework in practice (Section 7), as well as the benefits (Section 8) and limitations (Section 9) of the MOF. Finally, it concludes with a summary of the key contributions for both model producers and consumers (Section 10).
2 Related Work
2.1 Opening the Black Box: Benefits and Risks of Openness in AI
While AI has seen remarkable advances in recent years [1] , most SOTA foundation models are black boxes, making it hard to audit or explain their logic or behavior [12, 13] . Large language model services like OpenAI’s GPT-4 hide opaque models behind cloud-based APIs, providing no insight into the inner workings [23] . To address these concerns, there has been a growing movement towards the openness of models with companies, research organizations, and individuals sharing models on platforms like Hugging Face Hub, GitHub, and Kaggle [24, 25, 26] . Furthermore, grassroots initiatives have emerged as the early leaders in the open development of open foundation models [27, 28] , such as GPT-Neo by EleutherAI [29] , BLOOM by BigScience [30] , and SantaCoder by BigCode [31] . This shift towards the open development of AI models is increasingly viewed as a credible alternative to closed-source development [32] .
There has been much debate about the benefits and risks of releasing models [33, 34, 35, 36, 37, 38] . On the one hand, the accessibility and transparency of open models can concurrently deliver advantages over closed source models, including security and performance advantages through distributed development and auditing [39, 40] , adaptability and customization for diverse domains and languages [35, 41] , as well as advances in science [42, 43, 44] . On the other hand, the openness of models introduces a number of risks, such as the generation of disinformation [45, 46] , illegal content [47] , as well as security vulnerabilities [48] . Open foundational models are understood to have five distinctive properties that present both benefits and risks: broade...
2.2 Lack of Openness in “Open Source” AI
ML models whose weights are made publicly-available for download and downstream use are being falsely promoted as “open-source” [15, 49, 16] . Such models may more accurately be described as “open-weight models” [17] . While there is a fast-growing number of open models and open datasets shared on online platforms, a concerning number of models and datasets are shared either without licenses—for example, 64.67% of models and 72.13% of datasets on Hugging Face Hub are unlicensed [26] —or with restrictive licenses that do not meet the standards required of open licenses [15, 49] . In most cases, pretraining data or human feedback data collected from usage of models in, for example, chatbots are not released [50] . Some model producers even add conditions that stipulate that their model outputs cannot be used to train subsequent models or add trigger conditions that would require a model consumer to negotiate a new license when some condition is met. In addition, fine-tuned models based on foundation models with restrictive licenses are being released with open-source licenses, such as Apache 2.0, even though altering the original license is not legally permitted. This creates confusion in th...
Many open (foundation) models are released with technical reports and model cards that provide limited information on the source and treatment of their training data, fine-tuning, or alignment methods [23, 51] , and evaluation results often cannot be reproduced independently due to the lack of their disclosure [52] . Furthermore, few disclosures are made about guardrails and if prompts and outputs are altered, filtered, or replaced [53, 54] . Overall, the lack of openness leaves downstream model consumers to rely on limited claims reported by the model producers.
The misrepresentation of models as “open source” is in part due to confusion about the appropriate use of open-source licenses. Many developers do not realize that open-source licenses were designed to cover conventional software code and are not appropriate for the intricacies of ML models [55, 56] . As we discuss in Section 6, open-source licenses cover the model architecture, which is defined in software code, but not the corresponding model parameters. By contrast, model parameters are data and are more aptly governed by open-data licenses than by open-source licenses [57] . Meanwhile the misrepresentation of models as “open source” by companies has also been characterized as “openwashing” [15, 16, 17] , where “open” has been used imprecisely and loosely to describe “systems that offer minimal transparency or reusability…alongside those that offer maximal transparency, reusability, and extensibility” [14] . This problem motivates our creation of a ranking system to promote openness and completeness.
Another challenge is that most models have fallen short in their completeness (i.e. the full availability of components from the model development lifecycle, see list in Section 5), only releasing model architectures and final trained parameters. To achieve full transparency, reproducibility, and extensibility, we argue that model producers must go beyond just releasing their model and the trained weights and biases, which is currently the norm. Instead, they should include all artifacts of their work, including datasets for training, validation, testing, and benchmarking, as well as detailed documentation, such as research papers, model cards, data cards, and any usage documentation. Completeness also requires all code used to parse and process data, the code used for training and inference, and any code used in benchmark tests, along with any libraries or other code artifacts that were a part of the model development lifecycle.
2.3 MOF: A Novel Approach to Evaluating Model Openness & Completeness
There is not yet a formally agreed-upon definition of “open source AI” [56] . Broadly, open AI refers to the concept of transparency and accessibility in AI R&D. It entails the sharing of key artifacts associated with the development of models, including data, code, models, and publications, under both open and restrictive licenses, which allow access, inspection, modification, or distribution of models. As mentioned above, open AI also entails grassroots initiatives that have used open collaboration approaches to develop open-weight models [27, 28] . The sharing of models grants the community the freedoms to transparently review capabilities and limitations, identify issues, reuse or extend functionality, and participate in collective advancement. This is enabled through open licenses applied judiciously to key model components, including datasets, model architectures, and trained parameters, which facilitates attribution, safeguards model consumers, and maintains community norms while removing barriers to adoption [58, 59] .
The combination of open source, open data, open access, and open science is a powerful and effective way of solving the most pressing issues in AI R&D, including access, explainability, transparency, reproducibility, and safety. The goals of open AI are to accelerate progress through open collaboration, establish trust by allowing system inspection, enable diverse perspectives, and align AI advancement with social benefits [24] . Due to the nascent nature of the open AI movement, new standards are being developed to address shortcomings, including the draft Open Source AI Definition [56] ; tools for auditing model explainability, fairness, and robustness [60, 61, 62, 63, 64, 65] ; frameworks to evaluate model openness, such as the AAAI Reproducibility Checklist [66] and the NeurIPS 2019 ML Reproducibility Checklist [19] ; the establishment of ethics review boards in AI research labs [67] ; as well as work by government agencies, including NIST and NTIA in the USA [68] and the AI Safety Institute i...
However, prior approaches do not evaluate both the completeness and openness of models. The MOF reinforces existing approaches by objectively evaluating and classifying models based on which components of the development lifecycle are released under open-source licenses. It codifies openness across model development pipelines with informative guidelines, a classification system, and a method for assigning badges to qualified models. Models with licenses that do not impose downstream restrictions are considered open, while restrictive ones are source-available. This differs from the gradient approach to model openness [36] , which classifies BLOOM by BigScience [30] and GPT-J by EleutherAI [70] as open. We would classify GPT-J as open because it was released under the OSI-approved Apache 2.0 license, while BLOOM is source-available due to its restrictive, non-OSI-approved OpenRAIL license [71] . Overall, the MOF encourages model producers to strive for complete transparency and usability without restrictions.
3 Understanding the Concepts and Culture of Openness and Completeness
Before presenting the details of the MOF, we review the concepts of openness and completeness in science and technology. These core tenets form the basis of open science, open source, open data, and open access, which enable transparency, reproducibility, and collaboration in research, and are part of the wider open knowledge movement that believes all knowledge should be shared freely [72] . This section provides an overview of each domain and how they connect to the framework’s goals. Understanding the motivations behind openness clarifies why it is vital to extend these concepts to AI R&D: it facilitates the democratization of AI, which is essential for advancing AI research and innovation, as well as responsibility in AI R&D, including transparency, accessibility, and inclusivity [73] .
3.1 Openness
Openness is the practice of freely sharing the methodology, progress, and products of R&D with the public without restrictions on access, inspection, modification, or distribution [74] . Instead of limiting transparency through proprietary terms, openness concerns the release of materials under permissive open licenses tailored to the type of content. This upholds scientific ideals around reproducibility, accountability, and cumulative innovation, while empowering research and developer communities to meaningfully review, discuss, reuse, and extend upon prior work [75] . As we elaborate in Section 3.10, the careful selection of appropriate open licenses facilitates attribution, protects downstream consumers, maintains community ethical norms, and facilitates adoption and impact [76, 59] .
We also seek to differentiate between the terms “open” and “complete” in order to make it clear to model consumers exactly what model producers are providing and under what conditions when they say their model is open. Openness is not just about what is included, but importantly under what license each component is released. We believe opening the “black box” of AI will be crucial for continued advances and responsible use [77]. Although open-source licensing is imperative for the code components that are provided for the MOF, our approach aligns with wider open science principles and the vision of open AI, which requires more than open-source licenses for code components for models to be considered open. For example, non-code elements like datasets and research papers need appropriate licenses that suit their format, such as open-data or open-content licenses, which are not currently OSI-approved licenses.
3.2 Completeness
Completeness is a core tenet of open science [75] . We define completeness as the availability of key artifacts produced during the full lifecycle of conducting research or the engineering of a technical product, enabling comprehensive transparency, inspection, evaluation, and reproducibility. In the context of ML, completeness entails releasing all the key components associated with developing an ML model rather than just selected artifacts. It entails sharing the full pipeline that produced a model in a usable form. Comprehensive releases empower unfettered scrutiny into model genetics: curation and treatment of training data, feature engineering, neural architectures, weight evolution, training configurations, model performance across diverse benchmarks, replication of model producer claims, and other byproducts of the model development lifecycle. The MOF encourages model producers to exhibit full completeness, providing all artifacts involved in the model development lifecycle when distributing models. It defines an ascending hierarchy of criteria for releasing key artifacts with the highest bar aligned with open science paradigms. Completeness combined with openness (open licensing) accelerates collective advancement of trustworthy and innovative AI.
We use the term “completeness” borrowed from open science to disambiguate from the multiple uses of the word “openness”, which has unfortunately become a vague and confusing term [14, 16]. Openness is often used to describe not only the licensing used for artifacts but also the availability of artifacts and even the thoroughness of those artifacts. The term “open” continues to be used in ways that are misleading or that do not reveal the specifics of its usage [68]. Packing the term “openness” with multiple definitions, uses, or dimensions does not clearly articulate what aspect of the model is open. For instance, a model producer may claim that their model is “open” but model consumers may not know if it is open because it employs open licenses, because it is made publicly available, because it provides additional components like datasets, or because the components released are thorough or usable. For this reason, we use the term “completeness” to measure the availability of components that are released with models (with the goal of full completeness) and the term “openness” to describe the usage of permissive licenses for components.
3.3 Open Knowledge
Open knowledge is an overarching philosophy and larger movement that encompasses all the preceding areas of openness, revolving around the free and public sharing of information and insights across various domains [78, 79] . This entails making knowledge resources accessible to everyone and contributing to a wider pool of shared understanding. Open knowledge practices also involve ensuring that the information is ethically curated and disseminated, upholding principles of integrity and respect for intellectual property. The Wikimedia Foundation, Open Knowledge Foundation, and Science Commons are leading organizations in the open knowledge community.
3.4 Open Science
Open science refers to the practice of making all stages of the scientific process transparent and accessible to others [75, 80]. This includes publishing research papers, data, source code, code notebooks, and any information or tools needed to replicate research. The goals of open science are to enable reproducibility and collaboration and to facilitate building on previous knowledge to advance scientific research [75]. Open science is critical for credible, ethical, and accessible scientific research that can be reviewed, validated, replicated, and built upon. Open science in AI is sometimes referred to as “open science AI” and is the gold standard for ensuring reproducibility and transparency.
Advances in AI R&D are in part attributed to the sharing of preprints on platforms like arXiv, but much of the training data, model details, and code of SOTA AI systems remain proprietary. The opaque nature of many AI systems limits reproducibility, hinders research, and increases concerns around bias and safety. Transitioning to open datasets, architectures, weights, and code promises to facilitate AI research, innovation, and adoption across the private and public sectors. Overall, openness has repeatedly shown immense power to advance progress, equity, and opportunity across endeavors. The MOF aims to promote the spirit and methodology of open science in the AI R&D community.
3.5 Open Access
Open access is the process of making research outputs like publications freely available to read without subscriptions or paywalls, enabling broad dissemination of knowledge [81, 82]. There are various open-access platforms like Cornell University’s arXiv, which make publications, often distributed under an open license, freely available for review. Furthermore, the adoption of open access policies, mandates, and licenses by journals and conferences has contributed to greater access to research. Before open access, research publications were mostly locked behind expensive journal subscriptions and paywalls, which limited the discoverability and use of knowledge. The open access movement has made more research freely available to all. Open access speeds the dissemination of discoveries to scientists and the public, and it facilitates reproducibility and meta research. As a result, entry barriers to accessing research have been greatly reduced, and public access to AI research papers has helped advance the field, including many of the developments and enhancements to the transformer architecture that powers the latest highly-capable LLMs.
3.6 Open Collaboration and Open Community
Open collaboration encourages cooperative efforts across institutions, disciplines, and borders, involving more inclusive and diverse participation in the development of science and technology [83, 80, 84] . Open community goes beyond open collaboration, and it concerns the creation and sustainability of a shared community with neutral governance, where projects can be worked on collaboratively in an equitable environment that embraces principles of openness. The LF AI & Data and Generative AI Commons are examples of open communities [85] .
3.7 Open Source Software
Open source software (OSS) involves publishing software code under licenses that grant users independence and control over the technology by allowing inspection, modification, and redistribution of the code without restrictions [55]. The OSS movement has transformed software development over the past few decades: while early closed and proprietary systems limited access, locked in users, and stagnated innovation [86], OSS is now estimated to be used in 96% of global code bases [87] and to constitute up to 90% of software stacks [88]. It is increasingly being recognized as digital infrastructure [89, 90]. OSI-approved licenses like Apache 2.0 and MIT have been key to enabling worldwide collaborative development, freedom of choice, and accelerated progress [58].
OSS has emerged as an indispensable component of AI R&D [91, 92] and open science at large [93, 94] . OSS presents a myriad of benefits for individuals [95, 96] and enterprises [97, 98] . It encourages the sharing of code and software development methodologies [99] , providing a basis for building upon existing work and contributing to the advancement and democratization of science [100, 101, 102] ; it provides learning, skill development, and career development opportunities [27, 103] ; it reduces software development and testing costs [104, 105, 106] ; and it facilitates the development of open standards [99, 107] , among others. The benefits of OSS d...
3.8 Source Available
Source available should not be confused with open source. The term originated in conventional software development, where a developer provides access to the source code under licenses that are not open-source. Such licenses include restrictions that consumers must fully understand before agreeing to use the software. Some have referred to these projects as open access, but this is a misnomer since open access applies to documentation without paywalls. Most openwashed projects are examples of source available due to their restrictive licensing [15, 49].
3.9 Open Data
Open data refers to the public release of datasets, databases, and other structured data used for research, enabling access and reuse [112, 113] . This practice upholds scientific reproducibility, allows reanalysis, and spurs innovation [114] . Standard policies and formats are often employed to ensure quality and usable data sharing. Open content, on the other hand, refers to the sharing of creative materials and unstructured data. Both open-data and open-content licenses exist, with open-data licenses often applicable to both data and content. Open data emphasizes the standardization of datasets, addressing transparency and requiring comprehensive descriptions of data collection methods and assessments for intrinsic bias. Furthermore, accessibility is a cornerstone of open data, with datasets expected to be readily available without personal requests or paywalls, promoting transparency and enabling scrutiny.
Historically, many research fields had cultures of data secrecy that impaired reproducibility and knowledge building. Openly sharing data enables reanalysis, reproducibility, and new applications [115, 112]. For instance, government open data initiatives provide transparency of government operations and promote innovation in and for the public sector [116, 117]; opening clinical trial data facilitates pharmaceutical research [118] and open genomic databases enable bioinformatics breakthroughs [119]; and open climate data [120, 121] have fuelled research and innovation to combat climate change. While better standards and tooling around open data publishing are still needed, the value of open data is clear. In the context of AI R&D, the Datasets and Benchmarks track at NeurIPS underscores the paramount importance of openly releasing machine learning datasets [22].
3.10 Open Licenses
Open licenses are legal mechanisms that allow content and artifacts to be freely accessed, used, modified, and shared under permissive terms. They are essential for operationalizing open science, open data, and open-source ideals [58] . Different licenses have emerged for addressing rights, responsibilities, and permissible usage for data, publications, code, and other research outputs. Open licenses solve key problems with closed, restricted systems, including:
- Enabling free access without paywalls or subscriptions
- Allowing reproduction, analysis, and extension of work
- Disseminating contributions back to the community
- Progressing cumulatively by building on prior ideas
- Fostering collaboration across organizational and geographic boundaries
- Promoting transparency and accountability
- Mitigating anti-competitive behavior or rent-seeking
For research papers and scholarly works, Creative Commons (CC) licenses are widely adopted. They allow free distribution and reuse with conditions, such as requiring attribution, while allowing commercial use and derivative works. Common choices for open licenses are CC-BY (Attribution) and CC-BY-SA (Attribution-ShareAlike). Using permissive CC licenses for papers, technical reports, and documentation provides rights to reproduce, expand, and translate the works [59].
For software code, many open-source licenses have been developed. The Open Source Definition and the list of approved open-source licenses is maintained by the OSI [55] . Using OSI-approved open-source licenses encourages community review and contributions to code, promoting quality and shared progress [76] . Prominent examples include the MIT, Apache 2.0, and the 3-Clause BSD license, which allow inspection, modification, and redistribution of code while requiring preservation of copyright and license terms. Alternative licenses, such as the Llama 2 license, OpenRAIL, and AI2 ImpACT licenses, are not considered open-source licenses due to their restrictions on usage [49] .
For datasets, typical licenses are Creative Commons licenses, particularly Creative Commons Zero (CC0), CC BY (Attribution), and CC BY-SA (Attribution-ShareAlike), as well as the Linux Foundation’s Community Data License Agreement (CDLA-Permissive) and the Open Data Commons licenses like the Public Domain Dedication and License (PDDL) and the Open Data Commons Attribution License (ODC-By). They provide terms for sharing data openly while addressing concerns, such as attribution, permissive usage, and liability [59].
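The pairing of artifact types with license families described in this section can be sketched as a small lookup. This is only an illustration: the short identifiers, the dictionary names, and the function `license_status` are assumptions made for this sketch, and the lists contain only the licenses named in the text above. A license that returns "unknown" here is not necessarily unacceptable under the framework.

```python
# License names below are the ones mentioned in this section, written as
# short identifiers chosen for this sketch (an assumption, not an official list).
NAMED_OPEN_LICENSES = {
    "code": {"MIT", "Apache-2.0", "BSD-3-Clause"},
    "documentation": {"CC-BY", "CC-BY-SA"},
    "data": {"CC0", "CC-BY", "CC-BY-SA", "CDLA-Permissive", "PDDL", "ODC-By"},
}

# Licenses this section explicitly describes as not open source.
NAMED_RESTRICTIVE_LICENSES = {"Llama-2", "OpenRAIL", "AI2-ImpACT"}


def license_status(artifact_type: str, license_id: str) -> str:
    """Return 'named-open', 'named-restrictive', or 'unknown' for a license."""
    if artifact_type not in NAMED_OPEN_LICENSES:
        raise ValueError(f"unknown artifact type: {artifact_type}")
    if license_id in NAMED_RESTRICTIVE_LICENSES:
        return "named-restrictive"
    if license_id in NAMED_OPEN_LICENSES[artifact_type]:
        return "named-open"
    return "unknown"


print(license_status("code", "Apache-2.0"))       # named-open
print(license_status("data", "CDLA-Permissive"))  # named-open
print(license_status("code", "OpenRAIL"))         # named-restrictive
```

Keeping a separate "unknown" result, rather than treating every unlisted license as restrictive, reflects that this sketch covers only the examples given in the text.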
4 Model Openness Framework Classes
4.1 MOF Structure
The MOF proposes a three-tier classification system (see Table 1) to classify the degree of completeness and openness of ML models across all aspects of a model’s development lifecycle. The MOF has 17 components to fulfil completeness of model artifacts, which cover the code, data, and documentation that are part of the model development lifecycle (see definitions of each component in Section 5). The distribution includes an additional component, the MOF configuration file, to comply with the MOF requirements.
The 17 components are categorized into three distinct classes. Each class builds upon the previous one, with Class III being the least complete and Class I being the most complete (see Table 1). A higher class indicates a more complete distribution that promotes more transparency and enables reproducibility, auditing, and downstream use. This approach is more meaningful than a calculated index, as it guides model producers in providing essential components released under open licenses for each tier of the framework. As the class of the MOF increases, the producer moves closer to a complete distribution that best aligns with the principles of open science in AI. To qualify for a particular class, the producer must provide every required component for that class. Each component must be released using an appropriate open license from Table 2 to qualify the entire project at the specified class level.
4.2 MOF Class Descriptions
The 3 classes of the MOF represent ascending levels of model completeness and openness. We describe the distinguishing aspects of each tier beginning with the lowest class.
4.2.1 Class III. Open Model
In the MOF, Class III is the entry point and contains the minimum required components that must be released using open licenses. If any of these components is missing from a release, or any of them is not released under an open license, then the entire release cannot be considered open under the MOF. The Open Model class covers the following:
- Core model architecture and the final set of parameters
- Light documentation conveying capabilities and characterization of the model and data
Class III contains all the components required to study, modify, redistribute, and build upon a model without restrictions, including for commercial and educational purposes. The inclusion of the model architecture, final weights and biases, and documentation (including the technical report, evaluation results, model and data cards) provides the necessary information to work with the model and understand its capabilities, constraints, and the nature of the training data. However, this class lacks completeness and robustness for full reproducibility and the transparency needed to confirm all claims made by the producer. It also lacks sufficient components to evaluate the model, including the training data.
4.2.2 Class II. Open Tooling
Building upon Class III, Class II provides model consumers the complete codebase including libraries and tools needed for training, assessing and testing models themselves. Added elements include:
- Full training and inference code
- Benchmark tests to validate and quantify performance
- Libraries and tools to ease integration and to complete the codebase (optional)
This tier is an intermediate step between an open model and open science, providing a model consumer with information to test a model producer’s assertions. It also allows a model consumer to perform debugging, and it allows for enhancements to model functionality. Although it does provide insights into the training process, it does not include the actual datasets. It is also lighter on documentation, which limits a deeper understanding of the model’s intricacies.
4.2.3 Class I. Open Science
The top tier aligns with ideals of open science: the sharing of all artifacts needed for end-to-end transparency, reproducibility, and collaboration. This includes:
- A detailed research paper conveying the genesis of the model and its evolution
- Raw training datasets used in the training of the model (any license or unlicensed)
- Checkpoint weights showcasing full model evolution
- Log files providing yet more low-level insights
Fulfilling Class I empowers the community to inspect models through the model lifecycle along multiple fronts, representing the gold standard for completeness and openness rooted in scientific principles.
4.3 Hybrid Releases
Openness has always been a binary decision in the open-source movement; software is either open-source or not, with no in-between [55] . A developer either released their software under an OSI-approved license or they did not. If any essential component was not released under an open-source license, the entire release was no longer considered open source. The MOF follows this principle. When any component is not released using an open license as described in Table 1, that component is not deemed open and does not qualify for an MOF class. Removing a component that moves the project into a lesser class is acceptable if all remaining components are released with open licenses.
To qualify as a Class III project, the model, its parameters, and a technical report that describes the work, along with evaluation results and model and data cards, must be released with open licenses. If not, the project cannot be considered open. This exclusion also applies to projects that use modified open licenses that add usage restrictions or acceptable-use policies.
It should be noted that the MOF classifies models and their components on completeness when they are open. The reader should not confuse the classification system as being a gradient measure of openness [36] , but rather a measurement of the completeness of a release in adherence with open science principles [24, 80, 93] .
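The cumulative class rule and the binary treatment of openness can be sketched in a few lines of Python. This is a minimal illustration, not the logic of the MOT: the component codes are taken from the Section 5 headings, the `classify` function and its input format are hypothetical, and treating every component except II.6 as mandatory is an assumption (Table 1 is authoritative).

```python
from typing import Dict, Optional

# Component codes follow the Section 5 headings (e.g. "III.1" = Model Architecture).
# ASSUMPTION: every component except II.6 is treated as mandatory; Table 1 governs.
CLASS_III = {"III.1", "III.2", "III.3", "III.4", "III.5", "III.6", "III.7"}
CLASS_II = CLASS_III | {"II.2", "II.3", "II.4", "II.5"}  # II.6 is listed as optional
CLASS_I = CLASS_II | {"I.2", "I.3", "I.4", "I.5", "I.6"}


def classify(release: Dict[str, bool]) -> Optional[str]:
    """Return the highest MOF class met by a release, or None.

    `release` maps a component code to True when that component is provided
    under an acceptable license. For I.3 (datasets) the text accepts any
    license, so True there only means "provided".
    """
    # A component without an acceptable license is treated as absent.
    open_components = {code for code, is_open in release.items() if is_open}

    # Section 5.3: the technical report MAY be omitted if a research paper exists.
    if "I.2" in open_components:
        open_components.add("III.3")

    for name, required in (
        ("Class I", CLASS_I),
        ("Class II", CLASS_II),
        ("Class III", CLASS_III),
    ):
        if required <= open_components:  # every required component is open
            return name
    return None  # not open under the MOF
```

Under these assumptions, a release that ships every Class I component but withholds an open license for the final parameters (III.2) receives no class at all, which reflects the binary principle described above rather than a gradient score.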
5 MOF Components
The following defines the 17 components included in the above three-tier classification system of models (see Table 1). They cover the degree of completeness and openness across all aspects of the development process of an ML model, including training data, model architecture, model parameters, evaluation benchmarks, and documentation.
Note that not all components are required for all classes. Each component section below specifies the class it applies to, and Table 1 lists the components required for each class.
Note also that not all components need to be distributed separately; some MAY be combined. For example, evaluation results MAY be included in the research paper, technical report, or model card rather than published as a standalone artifact. This sort of combination SHOULD, however, be limited to component types that are covered by the LICENSE file for that component or for the whole distribution.
5.1 Model Architecture (III.1)
The model architecture is the core of any ML project. It can include the ML algorithms, neural network layout, connectivity, activations, and other architectural elements. Examples include transformers (e.g., GPT, BERT), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and graph neural networks (GNNs). While the model architecture is often closely tied to the trained model parameters, sharing the architecture alone allows others to understand the structure of the model without necessitating the release of the fully trained model. The model architecture SHOULD be fully described in the research paper, technical report, or model card, and MUST be distributed as open source code under an OSI-approved open source software license that does not limit its usage and derivative works.
5.2 Model Parameters – Final Checkpoints (III.2)
Trained model parameters MUST be released under an open license. In the case of deep learning models, checkpoints from key intermediate stages of training as well as the final optimizer state SHOULD be included. At a minimum the final model parameters and optimizer state (when applicable) MUST be distributed, whether compressed or uncompressed, in a format compatible with popular deep learning frameworks such as TensorFlow, Keras, PyTorch or the framework independent ONNX file format.
To date, model producers have been releasing model parameters (i.e., weights and biases) under open source software licenses such as Apache 2.0 and MIT, even though model parameters are not compatible with such licenses. Since model parameters are in fact data, model parameters SHOULD be distributed under an open data license, like CDLA-Permissive-2.0. Although licenses designed for OSS are permissive and indemnify the developer from liability, open data licenses are better suited to data-specific considerations such as privacy, ethics, and data rights. Most permissive licenses do not refer to data directly and do not address the ability to modify and redistribute model parameters. This gap could result in a legal obligation for any model consumer if the model producer were to implement royalties after widespread adoption of their model. This is a legal gray area that remains untested. The model architecture and model parameters SHOULD be saved independently in different files for distribution, as each one requires a different format-appropriate open license. This separation allows each component to be studied, modified, redistributed, and used independently of the other.
5.3 Technical Report (III.3)
The technical report, which MAY be in the form of a white paper, provides the necessary documentation for the model consumer to understand the performance, usage, and implications of using the model, but not necessarily enough details to reproduce the model and replicate its results. The technical report MUST be included in the distribution or made available on a permanent open access platform such as arXiv, and SHOULD be linked from the model card. The technical report MAY be omitted if a research paper is provided. The technical report MUST be distributed under an open license that SHOULD be appropriate for documentation, ideally CC-BY-4.0 or CC0.
5.4 Evaluation Results (III.4)
Detailed quantitative metrics and qualitative results from evaluating the model MUST be reported. They MAY be included in the technical report, the research paper, or the model card. Tests can evaluate any factor, including but not limited to model efficiency, accuracy, performance, fairness and bias, toxicity, and truthfulness. Producers MUST include benchmark test results, whether from industry standard benchmarks or from custom benchmark tests that were developed. If industry standard benchmark tests or test suites are used, the test suite name, test name, and version number MUST be included with the results. If custom benchmarks were developed, whether in code or in any form of media such as text or images, the custom benchmarks MUST be included in full for validation. The raw outputs of the model evaluation MUST be distributed under an open license that SHOULD be appropriate for content, like CC-BY-4.0.
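As an illustration of the reporting requirements above, a producer could keep each result in a small record and check it before release. The following is a hedged sketch only: the `BenchmarkResult` record, its field names, and the `missing_fields` helper are hypothetical and are not defined by the MOF.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BenchmarkResult:
    """One reported evaluation result (hypothetical record layout)."""
    metric: str
    value: float
    custom: bool = False               # True for producer-developed benchmarks
    suite_name: Optional[str] = None   # required for standard benchmarks
    test_name: Optional[str] = None    # required for standard benchmarks
    version: Optional[str] = None      # required for standard benchmarks
    benchmark_included: bool = False   # custom benchmarks must ship in full


def missing_fields(result: BenchmarkResult) -> List[str]:
    """List the MUST-level reporting details that are absent from a result."""
    problems: List[str] = []
    if result.custom:
        if not result.benchmark_included:
            problems.append("custom benchmark not included in full")
    else:
        for field_name in ("suite_name", "test_name", "version"):
            if not getattr(result, field_name):
                problems.append(f"{field_name} missing")
    return problems
```

An empty list means the record carries the details the text requires for its kind of benchmark; it says nothing about the license of the raw outputs, which would need a separate check.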
5.5 Model Card (III.5)
A model card provides metrics, usage guidance, and details about a model [21] . Model cards SHOULD cover model details, intended uses, factors, evaluation, risks, and mitigations related to the model. The model card itself MUST be distributed under an open license that SHOULD be appropriate for documentation, ideally CC-BY-4.0.
5.6 Data Card (III.6)
A data card provides summary statistics and key information about a dataset to enhance understanding of its composition [22] . Following guidelines from the Data Nutrition Project [123] , data cards SHOULD describe metrics about the features, instances, intended uses, motivation, and collection process. Data cards help identify potential biases in datasets and guide proper usage by downstream users. They also contribute to reproducibility and transparency by detailing the entire data preparation process. The data card MUST be distributed under an open license that SHOULD be appropriate for documentation, ideally CC-BY-4.0.
5.7 Sample Model Outputs (III.7)
5.8 Training, Validation, and Testing Code (II.2)
5.9 Inference Code (II.3)
The availability of inference code facilitates complete replication of the performance of the model, and it informs the model consumer about how to use the model most effectively for their applications. Code for performing inference includes any data preprocessing or postprocessing required during inference and possibly any model optimizations and dependencies like external libraries. It MUST include any code required to fully replicate the benchmark results for the model. The inference code MUST be released under an OSI-approved open source software license.
5.10 Evaluation Code (II.4)
Evaluation code, evaluation data, and evaluation results are separate components in the MOF. This is due to the fact that some benchmarks are written in code and other benchmarks only use data; for instance, text used to evaluate an LLM or images used to evaluate a computer vision model. Many benchmark tests are a combination of both code and data used to evaluate a model, which includes the scripts needed to load the data and run benchmark tests. Since code and data require different licenses, they are separate components. Depending on the nature of the model and the methods used to evaluate it, the distribution MAY include one or both of evaluation code and data. Any code used for model evaluation and benchmarking MUST be included and distributed under an OSI-approved open source software license.
5.11 Evaluation Data (II.5)
When the model is evaluated with data (be it any media format including text, images, videos, audio, 3D data, and so forth), that evaluation data MUST be included with the distribution. Where the model producer relies on standard benchmark tests that are widely disseminated, they MAY be omitted from the distribution, but they MUST be described in the technical report, research paper, or model card, along with the version of the test. The evaluation data MUST be released under a data or content appropriate open license like CDLA-Permissive-2.0, CC-BY-4.0 or CC0.
5.12 Supporting Libraries and Tools (II.6)
Any supporting code libraries, utilities, or tools developed in the course of the development of the model SHOULD be distributed under an OSI-approved open source software license. This includes data loaders, visualization code, simulation environments, etc. Use of existing and custom open source tools SHOULD also be documented. Any of the following tools and libraries SHOULD also be included:
- Software libraries and frameworks used in model development along with version details.
- Tokenizers: Code used to tokenize text and any data used to train the tokenizer (if used).
- Hyperparameter search code: Code for automating hyperparameter tuning (if used).
- Compute infrastructure code: If specialized compute infrastructure was built to scale training, the setup code could be released.
- Monitoring code: Code for tracking experiments, metrics, artifacts etc. during model development is often useful to open source as well.
- Containerization files: Dockerfiles or other container packaging to distribute the model could be shared.
- Frontend/visualization: Any web/mobile frontends or visualizations built on top of the model outputs could be released as open source.
- Deployment orchestration: Infrastructure-as-Code templates for deploying the model to production.
- Model integration code: Wrapper code/SDKs to integrate the model into downstream applications.
- Interactive demos: Links to hosted interactive demos of the model through Jupyter, Streamlit, etc.
Presumably most libraries and tools used already have their own licenses, but if the model producer created their own libraries or tools they MUST include them with the distribution under an OSI-approved open source software license.
5.13 Research Paper (I.2)
5.14 Datasets (I.3)
Data is the lifeblood of ML models and is the element most often held back in the release of a model. Datasets include training data, which is data used for any form of model training, including pre-training, fine-tuning, alignment using reinforcement learning techniques, or any other method that modifies the weights of the model. Datasets also include data used for model validation and testing as well as data used with benchmark tests. The datasets component also includes tokenized datasets when present. Data can be any form or combination of media, whether text, code, images, videos, audio, 3D objects, URIs, or any other data used for training, validation, and testing purposes. Datasets also include any metadata. This includes anything from annotation data like labels, bounding boxes, and key points to attribution, bitrates, resolution, and other metadata relevant to a dataset used in the model development process. The datasets used to develop the model MUST be provided, whether in the public domain, as copyrighted data, or under any form of license. They SHOULD be released under an open license, preferably CC-BY-4.0 or CC0. Any limits on sharing due to privacy or sensitivity SHOULD be documented. Both pre- and post-processed data SHOULD be supplied; however, producers MAY instead provide links to any curated raw datasets online if they are accompanied by data preprocessing code.
5.15 Data Preprocessing Code (I.4)
The data preprocessing code is all code used for preprocessing, cleaning, and formatting the training, validation, and testing data for a model. It also includes code used to transform fine-tuning data and code that is used for alignment tasks like Reinforcement Learning from Human Feedback (RLHF). Other data preprocessing code such as code for data ingestion when appropriate, feature engineering, data augmentation and tokenization is also included. The data preprocessing code MUST be released using an OSI-approved open source software license.
5.16 Model Parameters – Intermediate Checkpoints (I.5)
In addition to the final checkpoints and optimizer states, for Class I models, the checkpoints and optimizer states (when applicable) from key intermediate stages of training along with the log files MUST be included and distributed under an open license. Intermediate model parameters SHOULD be distributed under an open data license, such as CDLA-Permissive-2.0.
5.17 Model Metadata (I.6)
There are other forms of metadata that can provide additional context about the model, such as the version of the framework used to create it and custom tags or descriptions provided by the developer including model and data lineage information. There is no particular requirement or profile for this type of metadata and it can reveal anything the developer would like to include with the shipped model. This information can help with model management, especially when working with multiple versions of models or conducting experiments. Often the metadata is exported from or loaded by a metadata store. The model metadata MAY be included in the model card, research paper, or technical report. Any model metadata SHOULD be covered by an open data license such as CDLA-Permissive-2.0.
5.18 Model Openness Configuration File
The MOF configuration file MUST be included in any distribution. It describes what model components are included in the release and what license covers each component. The file itself MUST be distributed under an open license and SHOULD be distributed under the CC-BY-4.0 license.
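To make the role of this file concrete, the sketch below builds a record of components and their licenses and serializes it. The structure is purely hypothetical: this section does not specify the file format or its keys, so the `build_config` helper, the key names, the example paths, and the choice of JSON are all assumptions made for illustration.

```python
import json
from typing import Dict, Iterable, Tuple


def build_config(model_name: str,
                 components: Iterable[Tuple[str, str, str, str]]) -> Dict:
    """Build a hypothetical openness record.

    `components` yields (code, name, license_id, location) tuples, where
    `code` follows the Section 5 headings (e.g. "III.1").
    """
    return {
        "model": model_name,
        # The text says the file itself SHOULD be under CC-BY-4.0.
        "config_license": "CC-BY-4.0",
        "components": [
            {"code": code, "name": name, "license": license_id, "location": location}
            for code, name, license_id, location in components
        ],
    }


# Example values are placeholders, not a real release.
config = build_config("example-model", [
    ("III.1", "Model Architecture", "Apache-2.0", "src/model.py"),
    ("III.2", "Model Parameters (Final)", "CDLA-Permissive-2.0", "weights/final.bin"),
    ("III.5", "Model Card", "CC-BY-4.0", "MODEL_CARD.md"),
])
text = json.dumps(config, indent=2)  # what would be written to disk
```

Recording one license per component, rather than one license for the whole release, mirrors the requirement that each component carry a format-appropriate open license.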
6 Model Openness Framework Acceptable Licenses
Table 2 provides an overview of acceptable licenses for each component. The table categorizes each component into one of three domains: Data, Model, or both. Additionally, the content type of each component is classified as data, code, or documentation. The table specifies standard open licenses that should be used for releasing each component, while allowing some flexibility for equivalent licenses. By providing a comprehensive scope, the MOF encourages opening the entire pipeline that produces, evaluates, and applies a model. This approach offers multiple perspectives into the model’s inner workings, promoting transparency and reproducibility in open model development.
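The pairing of content type and license family can be illustrated with a small lookup. This is a sketch under stated assumptions: the license sets below contain only licenses named in the text (written as SPDX identifiers), Table 2 remains the authoritative list, and the `check_license` function and its return strings are hypothetical.

```python
CODE, DATA, DOCS = "code", "data", "documentation"

# ASSUMPTION: illustrative subsets only; Table 2 is authoritative and also
# allows equivalent licenses that are not listed here.
PREFERRED = {
    CODE: {"Apache-2.0", "MIT"},                            # OSI-approved licenses
    DATA: {"CDLA-Permissive-2.0", "CC-BY-4.0", "CC0-1.0"},  # open data / content
    DOCS: {"CC-BY-4.0", "CC0-1.0"},                         # documentation
}


def check_license(content_type: str, license_id: str) -> str:
    """Classify a license choice for a component of the given content type."""
    if license_id in PREFERRED[content_type]:
        return "ok"
    if any(license_id in ids for ids in PREFERRED.values()):
        # An open license, but one designed for a different content type,
        # e.g. model weights (data) released under a software license.
        return "open, but not format-appropriate"
    return "not in illustrative list"
```

For example, `check_license("data", "Apache-2.0")` reports a mismatch, matching the observation in Section 5.2 that model parameters are data and SHOULD use an open data license rather than a software license.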
7 Adopting the Model Openness Framework
7.1 MOF Process Overview
Unlike other frameworks that attempt to dictate how model producers should build and train their models or prescribe a path for how models should be released, we take a more objective approach by evaluating models based on their completeness and openness. This approach does not constrain model producers to a single methodology but rather lays out a flexible process that acts as a guideline to help model producers create the most complete and open models. At the completion of the process, the MOF provides model producers with a badge for their MOF class that clearly demonstrates to the public their commitment to both completeness and openness.
The MOF process generally follows these steps:
1. Inventory of artifacts
   (a) Comprehensively list all artifacts involved in creating the model (data, code, documentation, etc.).
   (b) Capture details like component names, component locations, versions and licenses.
2. Map to MOF components
   (a) Align inventory items to the 17 components defined in Section 5.
   (b) Multiple inventory elements may map to a single standard component.
3. Verify licenses
   (a) For each MOF component present, check if it uses an acceptable open license from Table 2.
   (b) If licenses are incompatible, the model cannot be classified.
4. Determine completeness
   (a) Check inventory against the component list for...