Balancing openness with Indigenous data sovereignty: An opportunity to leave no one behind in the journey to sequence all of life – pnas.org

Posted: January 24, 2022 at 9:48 am

Abstract

The field of genomics has benefited greatly from its open approach to data sharing. However, with the increasing volume of sequence information being created and stored and the growing number of international genomics efforts, the equity of openness is under question. The United Nations Convention on Biological Diversity aims to develop and adopt a standard policy on access and benefit-sharing for sequence information across signatory parties. This standardization will have profound implications for genomics research, requiring a new definition of open data sharing. The redefinition of openness is not unwarranted, as its limitations have unintentionally introduced barriers to engagement for some, including Indigenous Peoples. This commentary provides insight into the key challenges of openness faced by researchers who aspire to protect and conserve global biodiversity, including Indigenous flora and fauna, and presents immediate, practical solutions that, if implemented, will equip the genomics community with both the diversity and inclusivity required to respectfully protect global biodiversity.

Since the early days of the Bermuda Accord (1), Human Genome Project (2), and the Fort Lauderdale Agreement (3), the field of genomics has been strongly committed to open data sharing, and the calls for improved data-sharing approaches have only grown louder in the recent response to the COVID-19 outbreak (4). Rapid sequencing and open release of SARS-CoV-2 viral genome sequences throughout the outbreak have aided vaccine development, efficacy assessments, and continual monitoring of the virus's evolution in ways unimaginable a few decades ago (5). Similarly, the open release of the human reference genome and follow-up studies, such as the 1000 Genomes and the gnomAD data resource, have transformed our understanding of human genomic variation and disease and are exemplars of successful community resource-building projects. Now, new projects, such as the Earth BioGenome Project (6), aim to sequence the genomes of all living eukaryotic species to further understand molecular evolution, catalog the world's biodiversity, and inform future conservation efforts. Such projects have the potential to bring the benefits of genomics to all people and species, but the past model of large consortia generating vast troves of data, favoring the inclusion of some over the exclusion of others, is both damaging and inequitable, requiring movement beyond the principles defined in Bermuda and updated in Toronto (7). These ambitious projects will require contributions from community and academic partners around the globe, and so the genomics community must develop and implement inclusive data-sharing policies and infrastructure that respect the rights and interests of all people.

Unfettered openness of genomic data, and the ways in which open-science norms are enforced, impinge on the rights of Indigenous Peoples. As one example, the Navajo Nation became rightfully wary of freely contributing samples and genomic data and, in 2002, placed a tribal-wide Banishment Order on genetics research (8). In Canada, the three councils that fund research have formally adopted policies that were developed by Indigenous Peoples and scholars. These policies require that data and samples from Indigenous communities be collected, analyzed, and disseminated under the terms of a mutually determined research agreement that respects community preferences to maintain control over, and access to, data and human biological materials collected for research (9). Only by reconsidering the definition of openness and who it benefits within the context of the current inequitable infrastructures can a more inclusive genomics community be built to responsibly sequence all of life for the future of life (6).

The prospect of cataloging the genome reference sequences for a huge number of representative species is possible only thanks to the exponential technological advances of the genomics community over the past 40 years. Whereas the initial Human Genome Project cost several billion in today's dollars (USD), the sequencing and assembly of high-quality vertebrate reference genomes now costs under $10,000 and continues to drop rapidly. Leveraging these new sequencing technologies, the Vertebrate Genomes Project has now generated over 100 new vertebrate reference genomes (10), and in the coming year, the Human Pangenome Reference Consortium (https://humanpangenome.org/) aims to create hundreds of new reference genomes that will better represent human genetic diversity. Along with reductions in sequencing costs, the underlying technologies are also becoming increasingly portable, with nanopore-based technologies now enabling on-site sequencing in the most remote corners of the world (11).

This genomics revolution is timely, coming in the midst of the Earth's sixth mass extinction, with 35,500 species on the International Union for Conservation of Nature Red (threatened) List (https://www.iucnredlist.org/en). Unlike the mass extinctions of the past, the sixth has been caused by the actions of just one species, humans, and as a species we must act swiftly to halt the dangerous loss of biodiversity and extensively catalog what remains. Providing a catalog of genomic sequences for all life will be important for informing decisions about the effects of climate change on species diversity (12), developing conservation strategies for threatened and endangered flora and fauna (13), assessing the success of ongoing conservation efforts, and preserving genomic biodiversity before it is lost forever to extinction (6).

The importance of conserving biodiversity is universally recognized, but Earth's biodiversity is not uniformly distributed. The Critical Ecosystem Partnership Fund currently recognizes 36 biodiversity hotspots, defined as regions with over 1,500 endemic vascular plant species. These hotspots have suffered a 70% loss of their native vegetation (14). Hotspots will be a top priority for any genomic conservation project, but many of these hotspots overlap Indigenous lands. Indigenous Peoples and lands historically have been exploited and excluded, and not engaged by the genomics community (15). Thus, it is imperative for the genomics community to work as equal partners with Indigenous Peoples going forward. To do so, however, new infrastructure and policies are required to facilitate alternative modes of data sharing that can coexist with the current open-sharing policies of international genomics consortia. Current blanket open data-sharing policies override the rights of Indigenous Peoples, specifically the right to determine the use and mode of sharing Indigenous resources, which includes data. This contravenes the United Nations (UN) Convention on Biological Diversity (CBD) as a matter of international law (16), violates several rights stipulated in the UN Declaration on the Rights of Indigenous Peoples (17), and perpetuates the marginalization of these Indigenous Peoples (18).

Open genomic data are defined here as genomic sequence information that is made freely available without restrictions on use, copying, or distribution. The world's most popular molecular sequence databases, such as the National Center for Biotechnology Information's GenBank, the European Nucleotide Archive, and DNA Database of Japan, strictly adhere to this model. Furthermore, in 2011 a Joint Data Archive Policy was drafted and adopted by many leading journals that reinforced open data sharing (19). Open data sharing in genomics has fostered a productive and collaborative international research community; it aspires to reduce systematic wealth and power inequalities by extending research opportunities from partners with a large investment in genomics capacity and capability to those partners with lower investment. In addition, open data sharing has provided knowledge that is more transparent, accessible, and verifiable, which has improved the efficiency and reliability of genomic research (20). However, despite its success, by negating local and regional representation and participation in governance, it has also resulted in the development of data-sharing policies that do not maximize opportunities for all participants in an equitable manner (21).

Moreover, when strictly mandated, open data policies can have the unintended consequence of excluding many minority communities, including those Indigenous Peoples who wish, for a variety of legitimate reasons, to retain control over the resources and data derived from their lands, species, and waters. The lack of clear and operational policy that respects Indigenous rights breeds mistrust among Indigenous partners and not only hinders the inclusion of Indigenous science in international biodiversity and conservation efforts, but can also build opposition that results in the stagnation and reversal of Indigenous genomics projects (22). By demanding rigid policies on data sharing, the genomics community has forged rules premised on a single worldview. This undermines the rights and interests associated with traditional knowledge, a phenomenon scholars of Indigenous communities call "epistemicide" (23). Despite international consortia recognizing the rights of Indigenous Peoples, a lack of accountability and clarity for implementation of appropriate policies has exacerbated tensions between Indigenous communities and international genomic efforts (21).

In the past, the worlds of genomic science and Indigenous communities intersected mainly through Indigenous Peoples being used as subjects of research conducted by non-Indigenous researchers. Research was done on Indigenous Peoples, not by them and very rarely for them. The mistrust of the scientific community among Indigenous communities is well-earned: it has been caused by years of exploitation, mistrust, power imbalances, and inequality (24). It has included decades of taking and using Indigenous samples and data without adequate consent and consultation (24, 25); Indigenous data and samples not being properly attributed or acknowledged as coming from Indigenous lands and waters; Indigenous data being misused through bioprospecting and biopiracy (26–28); Indigenous data being scientifically interpreted without cultural or contextual knowledge (29); and researchers who have claimed authority over the Indigenous world by relying on quantitative data rather than traditional knowledge and lived experience (30). Furthermore, the failure of researchers to disseminate research outcomes respectfully through mechanisms that are meaningful and applicable to Indigenous partners, such as asset-based approaches (31), has fomented a sense of a lack of control, lack of access, lack of opportunities to derive benefits from the use of traditional knowledge and genetic resources, and a lack of opportunity to integrate traditional ways of knowing into research plans (32). Through asset-based approaches, results can be communicated more meaningfully and can ameliorate the five Ds of statistical data on Indigenous Peoples: disparity, deprivation, disadvantage, dysfunction, and difference (33).

Indigenous Peoples are the guardians and sovereign authorities of their lands and have been since time immemorial. Indigenous Peoples have their own unique beliefs, values, and worldviews. They are highly diverse; however, a commonality shared among many is a deep interconnectedness, interdependence, and intimate connection to their lands and waters (34). In regions of Africa, for example, life is not perceived through an individualistic lens but is experienced as relational and collective; this worldview is known as Ubuntu (35), an example of Indigenous or traditional knowledge that is based upon lived experience extending as far back as the Pleistocene era (36). It has been developed over time, informed by an extensive system of principles, beliefs, and traditions. In New Zealand, a governmental inquiry into the Māori knowledge system, or Mātauranga Māori, concluded that this system of knowledge is fundamentally different from Western science. The Māori knowledge framework has evolved through its own cultural context and evolutionary pathway (37). These epistemological differences in knowledge sharing and individual possession are largely incommensurate with existing intellectual property rights, which privilege and support Eurocentric notions of knowledge commons with no or limited rules around access to knowledge and property. However, rather than being treated as outdated or inferior (attitudes that embody cognitive imperialism and epistemic violence), traditional knowledge systems should be acknowledged, integrated, treated as coequal, and considered when interpreting findings. One system of knowledge should not eclipse the other. When recognized in this way, traditional knowledge is integral to knowledge production, contributing both technically and scientifically to the protection and sustainable development of Indigenous lands, resources, and data through an intrinsic understanding of the interdependence of land and its inhabitants (38).

Any complete catalog of Earth's biodiversity must necessarily include species on the lands of Indigenous Peoples. Thus, for global genomic conservation efforts to succeed, the genomics community will need to adapt its open data policies to Indigenous data sovereignty and knowledge systems. To achieve this, policies must be operationalized that embrace multiparadigmatic research approaches (39, 40), that recognize the inherent sovereignty of Indigenous Peoples, and that remove barriers to those Indigenous communities who wish to contribute to bioconservation as equal partners.

Over the past two decades there has been an international call for the recognition and protection of Indigenous data rights. Indigenous data sovereignty (IDSov) refers to the individual and collective rights of Indigenous Peoples to control data from and about their communities, land, species, and waters (30).

In 2010, the Nagoya Protocol was established and adopted by the UN CBD (41) to protect, promote, and fulfill this right. It has been fundamental in providing guidance on access and benefit-sharing of Indigenous resources and data. Article 12 states that parties shall, in accordance with domestic law, take into consideration Indigenous and local communities' customary laws, community protocols, and procedures. The Nagoya Protocol now has 2,000 internationally recognized certificates of compliance, but notably does not include some nations that have both Indigenous Peoples and a large genomic research program (e.g., the United States, Canada, New Zealand, and Australia). Despite this, domestic legislation over a sample/genetic resource from a signatory nation extends to where that sample/genetic resource is housed or used. Thus, nonsignatory countries are expected to implement Nagoya legislation if resources have been obtained from a country where the Nagoya Protocol is enforced.

In 2007, the UN's General Assembly adopted the United Nations Declaration on the Rights of Indigenous Peoples (17), which affirms the right of Indigenous Peoples to control, protect, and develop manifestations of their sciences, technologies, and cultures, including human and genetic resources (Article 31), the right to the conservation and protection of the environment and the productive capacity of their lands (Article 29), as well as the right to participate in decision-making in matters which would affect their rights (Article 18). Furthermore, the UN has also developed 17 Sustainable Development Goals (SDG) to be achieved by 2030. In 2015, these were agreed upon and adopted by 193 countries worldwide, including the United States, Canada, New Zealand, and Australia (42). SDG 15 aims to "Protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and halt biodiversity loss" (42). Its associated Sustainable Development Solutions Network Target 15.6 aims to ensure fair and equitable sharing of the benefits arising from the utilization of genetic resources, and promote appropriate access to genetic resources (42), a provision that has particular importance for marginalized communities, including Indigenous Peoples. Additionally, many individual nations have binding legislation covering their own Indigenous populations. For example, in New Zealand, the founding charter, subsequent legislation, and other policies covering Indigenous species require that all data and intellectual property be retained by the government within New Zealand (43, 44). Indigenous claims to cultural and intellectual property are also being addressed in New Zealand, where a work program to address the issues identified in WAI262 Report Ko Aotearoa Tenei has just been developed and some projects have been initiated (45, 46).

Rights secured through IDSov can be at odds with the "open by default" culture of the genomics field, leaving Indigenous genomic data unsupported by the decades of open infrastructure that has been built by the genomics community. In an effort to close the gap, higher-income countries, such as Australia, Canada, and New Zealand, have established national Indigenous-driven human genomic efforts, including the work of the National Centre for Indigenous Genomics (https://ncig.anu.edu.au/), the Silent Genomes project, and the Aotearoa Variome, respectively (47). These national efforts are examples of Indigenous-driven human genomics research programs intended to directly benefit Indigenous Peoples. In Canada, protocols have also been established for the protection of nonhuman data, specifically through the Tri-Council Policy Statement (48) on research ethics that provides protection over Indigenous samples. Furthermore, research licensing in the three territories of Canada protects samples and data collected on Indigenous lands (49–51).

To date, three national-level IDSov networks provide processes and protocols to enable Indigenous data governance (SI Appendix, Table S1): Te Mana Raraunga Māori Data Sovereignty Network, the United States Indigenous Data Sovereignty Network, and the Maiam nayri Wingara Aboriginal and Torres Strait Islander Data Sovereignty Group in Australia. However, blanket adoption of national efforts is not feasible in countries that lack substantial genomics investment or in which Indigenous governance structures are less established or respected.

Alongside the national efforts, IDSov is also gaining recognition on an international level through a variety of initiatives. For example, in 2019 the Global Indigenous Data Alliance (GIDA) (https://www.gida-global.org) was established to build a global community for the development of data-sharing infrastructure, data-driven research, and data use policies. In 2020, ENRICH (Equity in Indigenous Research and Innovation Co-ordinating Hub) was established in a collaboration between New York University and the University of Waikato. ENRICH supports IDSov-based protocols, Indigenous-centered standard-setting mechanisms, and machine-focused technology that informs policy and transforms institutional and research practices (https://www.enrich-hub.org/bc-labels). Platforms such as the International IDSov Interest Group have also been set up under the Research Data Alliance (https://www.rd-alliance.org/groups/international-indigenous-data-sovereignty-ig). These initiatives include the development of specific tools and practical mechanisms alongside education and training to provide a foundation for further development of ethical research guidelines that address Indigenous rights and interests.

The FAIR principles are a common refrain of open data efforts that encourage data to be Findable, Accessible, Interoperable, and Reusable (52). In 2019, GIDA released a set of complementary CARE Principles (53) that highlight the core values and expectations of Indigenous Peoples when engaging with the scientific community. These principles encourage the consideration of collective benefit, authority to control, responsibility, and ethics in Indigenous data governance. Such efforts toward developing new policies to respect and promote IDSov are essential; however, there is now the difficult challenge of informing and implementing IDSov principles, policy, and mechanisms within the global field of genomics (54).

A brief inspection of the publicly available data access and governance policies of international genomics-based consortia showcases where progress has been made and where it is needed the most. Notable examples of progress include the H3Africa Consortium (55), which has led the way in the adoption of Indigenous policies for human genomics, providing clarity to researchers through an in-depth set of principles and guidelines that hold participating researchers accountable for their implementation. At present, many nonhuman-focused consortia lack governance and data policy information. Some claim to recognize the rights of Indigenous Peoples but provide no pragmatic implementation plan or accountability measures. Exceptions in the nonhuman space include Genomics Aotearoa (56), which has actively developed engagement and biobanking frameworks in partnership with Māori to guide all consortium members while engaging with Indigenous data. However, for many other efforts, the lack of clear and transparent adoption of IDSov policy is problematic for a successful engagement between genomic researchers and Indigenous partners, given the incompatibility of unfettered open data and IDSov. Moreover, there remain ongoing practical challenges in keeping provenance and cultural connections between Indigenous communities and the data generated from their lands and waters transparent and clear within the databases themselves. Open data have successfully encouraged transparency and inclusion among international genomic research collaborations, but it is now time to ensure such success extends to including Indigenous partners and IDSov in these collaborative infrastructures.

The conflicts between IDSov and open data in genomics research are not new and have been extensively discussed (18). Progress, although slow, is being made to identify and provide solutions to these incompatibilities. Local Contexts is a key international initiative that recognizes and advances the rights of Indigenous Peoples in museum collections and their data through a unique set of traditional knowledge and biocultural labels and notices (with licenses under development) (57). Inspired by the Creative Commons licensing structure (https://creativecommons.org/), Local Contexts initiated this work in 2010, producing a suite of practical mechanisms designed to enhance the protection of Indigenous communities and hold researchers accountable. That process entailed community partnership and collaboration, as will scientific projects that follow its precepts. As durable digital tags with unique IDs, the labels (for communities) and the notices (58) (for researchers and institutions) provide an opportunity to include Indigenous protocols and expectations around the sharing of knowledge as metadata within the data infrastructures. As a result, this information, such as the origin of samples and data, travels with the data across platforms. Through this mechanism, Indigenous partners are given a voice, and future research engagement is encouraged; its aspiration is to leave no one behind.

The field of genomics is operating under data-sharing practices established decades ago. This status quo began with the Bermuda Principles, which defined the best mode of data sharing with respect to human data; these principles were then extended by the Fort Lauderdale Agreement to include nonhuman data and further updated in Toronto (59). Since Toronto, community-based efforts such as the Global Alliance for Genomics and Health (https://www.ga4gh.org) have reconsidered these data-sharing frameworks, developing responsible and inclusive human data-sharing policies and toolkits for genomics researchers.

An equal effort is now needed for nonhuman data, as nonhuman genomics continues to embed inherent biases and inequality, doing little to address existing disparities. Indigenous Peoples are part of contemporary life; they are not outside of modernity. Indigenous voices need to be heard. It is both a moral responsibility and a legal obligation to share benefits of research fairly and to respect traditional knowledge derived from their lands and waters. Genomics research needs to implement a future that has hitherto been mainly aspirational, a future that builds intellectual bridges between different ways of knowing and being. The appropriate acknowledgment, understanding, and implementation of Indigenous Peoples' rights while conducting genomic research provide a foundation to reach this goal.

Change must happen at both the individual and institutional levels to ensure that Earth's genomic biodiversity can be ethically cataloged. Several suggestions, references, and resources are provided to aid this transformation.

Operationalizing clear policies that respect Indigenous rights will communicate to potential Indigenous research partners what principles guide the research activity, the manner in which the researchers will conduct themselves, and the standards enforced and upheld. By providing clarity and increasing transparency, researchers can build trust and remove potential impediments to building relationships with Indigenous partners. When implementing these policies, inclusion does not equal assimilation. Respecting and cultivating divergent practices and beliefs is important to avoid monoculturalization. Indigenous Peoples' wishes regarding data access and benefit-sharing must be honored, making one-size-fits-all open data licenses inappropriate. International consortia seeking to perform Indigenous research must implement IDSov policies and engage with Indigenous communities in a manner that allows them to contribute on mutually agreed terms.

To change the culture from research that is done to Indigenous Peoples rather than by or for them, researchers, institutes, scientific journals, repositories, and funding bodies must change the status quo. Researchers must reflect upon their personal assumptions and biases and listen attentively to alternative frameworks. This can be done through questioning scientific orthodoxies and recognizing that research, even when good is intended for all humanity, can create power and benefit imbalances. In beginning a new project, researchers must question the expectations of each research partner, the genomics community, the institutions, the funding bodies, the ethics review boards, the Indigenous partners, and the Indigenous communities who have provenance over the data and organisms in question. Rather than pushing the boundaries, attempt to foresee the consequences and deeply consider at the outset of each research project its social license and duty to diverse societies.

Although significant progress toward policy development has been made, further clarity is particularly needed for nonhuman Indigenous data. As species do not respect country or land borders, policy is required to provide clarity to researchers regarding species that straddle the borders of Indigenous and non-Indigenous lands, and those species that are of special importance to Indigenous Peoples but are found also on non-Indigenous lands.

To ensure an even distribution of power, financial resourcing, and benefit, researchers who wish to partner with Indigenous communities must first ensure their own cultural competency while also prioritizing engagement with Indigenous communities at the onset of the study. This allows the necessary time for a partner relationship to be built from mutual agreement as to the role and responsibilities of both groups, the community and the researchers. Early engagement also provides Indigenous communities with relevant details pertaining to all aspects of the project, from sample collection to potential research publications and intellectual property development and benefit-sharing in a clear, transparent, and accessible fashion, including: the background, the scope of the research, potential outcomes of the project, and any foreseen risks associated with the research. By doing so, both researchers and Indigenous partners have all of the necessary information and education to conceptualize and design the research project in a concerted fashion that acknowledges the communities' long-standing relationship with local species and greater breadth of knowledge of the ecological systems and how they are changing (60, 61). This equips all parties with a fair and equal voice in setting research goals, understanding and contextualizing data, and planning of the time and budgetary requirements needed to achieve research goals ethically. Early engagement also allows project outcomes to be jointly interpreted, drafted, and disseminated by multiple parties, rather than the typical one-sided reporting driven by research institutions. Furthermore, the dissemination of outcomes in the Indigenous local languages will enhance accessibility for Indigenous community partners so that the community can relay the outcomes to others, and this process does not depend on an external scientist. This joint dissemination of research outcomes is extremely important for maintaining trust, communicating mutual benefits, and ensuring that Indigenous knowledge is not misappropriated. Indigenous partners should also be included in the evaluation phases of a project to include Indigenous best practice and better understand research impacts in an Indigenous context.

Projects that have been conceptualized and funded prior to engagement already fall outside the best practices for engagement with Indigenous Peoples. Here, other considerations are crucial for a successful partnership, such as minimizing power inequalities throughout the remaining research period. Indigenous Peoples, such as the African San tribe, Māori in New Zealand, and the Australian Institute of Aboriginal and Torres Strait Islander Studies in Australia, have considered and documented the best practices and expectations for engagement in these circumstances (60, 62, 63). These best practices include understanding and incorporating the expectations of Indigenous communities into the research plan; clearly communicating the scope of research, timelines, funding, methods of consent as informed by the Indigenous research partners, and all potential research outcomes; identifying short- and long-term risks and benefits and how they will be shared; building sustainable long-term governance and communication frameworks; discussing potential barriers to project completion and the impacts of project incompletion on partners; and evaluating the cultural competency of the research team. A focus on the process rather than the product is also helpful in assuring that the project has an adequate timeframe and budget to achieve its stated outcomes in a respectful manner, keeping in mind that fast-paced, product-oriented, and extractive strategies are not compatible with Indigenous cultures and may lead to irrevocable harm (24).

The fully open model of sharing must be challenged; the inclusion of some should not be valued over the exclusion of others. Policies need to be cognizant of the history, needs, and worldviews distinct to each Indigenous community (64). To operationalize situated openness, a pragmatic implementation of IDSov policies and licenses is necessary. As it stands, IDSov policies are being actively developed and adopted; however, progress depends on implementing and enforcing these policies by the genomics research community. Ambitious international goals, such as the push to catalog all genomic information on Earth, sit at the interface of genomic science and Indigenous ways of knowing. Effective implementation of IDSov policies and power sharing between communities is necessary to ethically realize such visions. This will require multiparadigm research methodologies built upon commonalities, but also accepting of divergent beliefs and practices, to move away from the extractive and exploitative strategies of past research on Indigenous Peoples. The task is hard, but eminently achievable, as recently demonstrated by more inclusive, diverse, and political research paradigms developed by researchers in New Zealand, Australia, North America, Africa, Central and South America, and the Pacific (40). These stand as positive examples for how to best champion polycultural expression and establish a new status quo for the genomics community.

Open data sharing in genomics has fueled progress and brought benefits to a field that continues to grow, even as it ramifies into many different fields of research and application. However, it is evident that those doing the sharing, to date, have taken on very little risk, and in many cases stand to benefit, from the act of openly sharing. To impose the same open data requirements on those with the most to lose by relinquishing control over use of resources and data is unfair, and when openness is stated as a prerequisite for participation, it can have the unintended effect of excluding marginalized communities. An infrastructure that allows for multiple modes of data sharing is needed, particularly modes that allow for materials and data over which Indigenous communities exert stewardship to remain under their control, and with respectful communication of findings and sharing of benefits with Indigenous communities. The Native BioData Consortium (NBDC; https://nativebio.org/) is the first tribal-driven biobank in the United States and provides a model of how to facilitate the flexibility needed to share data in a manner respectful of all parties and worldviews. In an Aboriginal and Torres Strait Islander context, the idea of kinship speaks toward the interconnectedness and interdependence of all life (65), as well as water and geographical features. This relationship to land is shared among Māori (66), and First Nations and Inuit Peoples (67). Adequate time and resources must be assigned to directly coordinate conservation efforts with Indigenous partners, who are the experts on implementing systems-thinking approaches within their own lands.

To sequence everything requires the help and participation of everyone on equal and mutually agreed terms. Ultimately, genomic technologies can be advanced to the point of becoming commonplace, and initiatives are already under way to bring DNA sequencing into classrooms (68). As the field of genomics progresses, all research partners have the responsibility and opportunity to build a trustworthy and inclusive research community. Investing in outreach programs that pass on the latest technologies and methods, such as the SING Consortium (https://www.singconsortium.org/) and IndigiData (https://indigidata.nativebio.org/) workshops, builds capacity that will facilitate local research, fueled by local priorities and guided by local best practice. Graduate and undergraduate genomics courses should also include training in ethics and engagement best practices to improve the cultural competency of non-Indigenous researchers who may enter this space. This provides cultural safety but also alleviates expectations and responsibilities resting solely on Indigenous researchers' shoulders (47). Infrastructure and opportunities for media producers local to the study should also be developed for the dissemination of genomic research findings in multiple languages, regions, and formats. These efforts will enable all partners, including Indigenous and other marginalized communities, to directly contribute to ongoing international genomics efforts. By fostering diversity within the field, they can help ensure that genomics infrastructure is accessible and beneficial for all and that practices are put in place to foster trust over the long haul.

Parties to the UN CBD and its Nagoya Protocol are currently reviewing the meaning of digital sequence information (DSI) and the requirement for a change to access and benefit-sharing policies under the convention that pertain to such DSI (41). As it stands, the term DSI is a placeholder used to facilitate discussions surrounding three data types: 1) DNA and RNA; 2) DNA, RNA nucleotide sequences, and protein-peptide amino acid sequences; and 3) DNA, RNA, and protein sequences as well as digital information pertaining to metabolites and macromolecules. All three of these definitions would include data contributing to reference genome sequences for nonhuman organisms. Prior to these discussions, there had been a fourth option for associated information, including traditional knowledge (69), but this was removed during the revision.

Despite the Nagoya Protocol calling for access and benefit-sharing, to date only 16 signatory countries have domestic legislation regarding DSI. Eighteen additional signatories are planning to or are in the process of drafting such legislation (70). The United States is not a signatory to the Convention, but United States representatives attended the November 2021 review conference in China and will attend further discussions in 2022. Many nations involved in the Earth BioGenome Project, European Reference Genome Atlas (https://vertebrategenomesproject.org/erga), the Human Pangenome Reference Consortium, and other international genomic collaborations are signatories. The ongoing CBD review has the goal of standardizing terms for access and benefit-sharing among all signatories, and discussions continue to include DSI. The international committee overseeing the CBD has expressed discontent with the status quo. Disparate policies among signatories and other major nations have led to the interpretation of open access to DSI as sufficient to fulfill access and benefit-sharing requirements in some cases, while in other cases formal agreements are required to share samples or sequence data. The review considers 13 recent publications relevant to access, benefit-sharing, and sequence data that have been categorized into five policy archetypes, some of which are mutually exclusive, while others can be combined (Table 1). Each archetype will be considered for cost-effectiveness, feasibility, and practicality, as well as uses of traditional knowledge. Access and benefit-sharing standards will be addressed again before a standardized policy is agreed upon and incorporated into the convention framework.

Table 1. Potential policy options under review by the Convention on Biological Diversity, with respect to access and benefit-sharing and digital sequence information

The lack of infrastructure to trace the geographic origin of samples and DSI is readily apparent: only 12% of the sequence data in publicly available databases specifies a country of origin. The lack of proper infrastructure to monitor compliance with access, benefit-sharing, and sharing of DSI at each point in the value chain has also been flagged as a potential barrier to agreement, with blockchain smart contracts highlighted as a potential solution (71).
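As a rough illustration of how such a coverage figure could be audited for a given collection, the following minimal Python sketch counts the share of records that name a usable country of origin. The record layout, the `country` field name, the list of placeholder values, and the example accessions are all assumptions made for illustration; they do not describe the schema of any particular public database.

```python
from typing import Iterable, Mapping, Optional

# Placeholder values treated as "no origin given" (an assumed, illustrative list).
MISSING_VALUES = {"", "missing", "unknown", "not applicable", "not collected"}


def has_country(record: Mapping[str, Optional[str]], field: str = "country") -> bool:
    """Return True if the record names a usable country of origin."""
    value = record.get(field)
    if value is None:
        return False
    return value.strip().lower() not in MISSING_VALUES


def country_coverage(records: Iterable[Mapping[str, Optional[str]]]) -> float:
    """Fraction of records (0.0 to 1.0) that specify a country of origin."""
    total = 0
    with_country = 0
    for record in records:
        total += 1
        if has_country(record):
            with_country += 1
    # An empty collection is reported as zero coverage rather than an error.
    return with_country / total if total else 0.0


# Hypothetical example records, not drawn from any real database.
example_records = [
    {"accession": "EX0001", "country": "New Zealand"},
    {"accession": "EX0002", "country": None},
    {"accession": "EX0003", "country": "unknown"},
    {"accession": "EX0004"},
]

coverage = country_coverage(example_records)
print(f"{coverage:.0%} of example records specify a country of origin")
```

A check like this only measures whether an origin is recorded at all; it says nothing about whether the recorded origin is accurate or whether the relevant community consented to its use.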

Policies about access and benefit-sharing, and about sharing of DSI, are in flux, but it is clear that unfettered open access to data and materials, including sharing of sequence data, is being questioned when it comes into conflict with Indigenous rights. National and international law are likely to evolve, and the scientific community would be wise both to engage directly in helping set the standards and practices and to comply with the emerging laws, norms, and practices governed by national and international law.

Following basic principles in a transparent manner, with all parties having access to and an equal understanding of the research project, will help remove the barriers between the genomics community and Indigenous partners, and will facilitate a long-term partnership founded on trust, safety, honesty, and accountability. The genomics community must engage with each Indigenous partner in accordance with that community's specific traditional beliefs, practices, and connections to the organisms being studied and the appropriate way to engage with other people in discussions of other organisms. As Chip Colwell, previous senior curator of anthropology at the Denver Museum of Nature and Science, stated during SING Aotearoa (https://www.singaotearoa.nz), "Indigenous People are not anti-science [but] they demand a science that restores the dignity of Indigenous Peoples and is carried out with fundamental respect" (72). This is now the responsibility of each researcher, consortium, journal, data repository, and funding body that seeks engagement with data or resources derived from Indigenous lands. Practical mechanisms like the traditional knowledge and biocultural labels and notices, and Indigenous-driven biobanks such as the Native BioData Consortium, provide proven models. The field has come a long way in working toward diversity, and the wind is at our back. Indigenous researchers have already put great effort into developing guidelines, best practices, legal and extralegal tools, and new research paradigms (SI Appendix, Table S1). Equipped with this knowledge, the community must now capitalize on the opportunity to build an inclusive, respectful, and mutually beneficial future for genomics.

There are no data underlying this work.

We thank Carla Easter (Education and Outreach Department of the National Human Genome Research Institute, NIH), Jenny Reardon (University of California, Santa Cruz), Harris Lewin (University of California, Davis), and Jacob S. Sherkow (University of Illinois) for their time in reviewing and consulting in preparation of this manuscript; and IndigiData and SING USA, Canada, and Aotearoa for their support and guidance throughout the manuscript-drafting process. This work was supported, in part, by the Intramural Research Program of the National Human Genome Research Institute, NIH (A.M.M.C. and A.M.P.). J.G. is funded by NIH Grant 5R01CA237118-02 and a Canadian Institutes of Health Research Fellowship (202012MFE-459170-174211). Development of the Biocultural Label Initiative has been supported by Catalyst Seeding funds for the project Te Tuākiri o te Tāonga: Recognizing Indigenous Interests in Genetic Resources provided by the New Zealand Ministry of Business, Innovation and Employment and administered by the Royal Society Te Apārangi (19UOW008CSG to M.L.H. and J.A.), leveraging the existing Local Contexts (https://localcontexts.org/) platform supported by the National Endowment for the Humanities (PR 234372-16 and PE 263553-19 to J.A.) and the Institute of Museums and Library Services in the United States (RE-246475-OLS-20 to J.A.), New York University Graduate School of Arts and Sciences, and the University of Waikato. Continuing infrastructure development is supported through the Equity for Indigenous Research and Innovation Co-ordinating Hub based at New York University and University of Waikato (https://www.enrich-hub.org/).
The Biocultural Label Initiative is extended through use cases, supported and refined by the Aotearoa Biocultural Label Working Group, Federation of Māori Authorities Innovation (https://www.foma.org.nz/), Te Mana Raraunga (https://www.temanararaunga.maori.nz/), Genomics Aotearoa (https://www.genomics-aotearoa.org.nz/), Indigenous Design and Innovation Aotearoa (https://www.idia.nz/), the Genomics Observatories Metadatabase (https://geome-db.org/), the Ira Moana Genes of the Sea Project (https://sites.massey.ac.nz/iramoana/; supported by Catalyst Seeding funds provided by the New Zealand Ministry of Business, Innovation and Employment and administered by the Royal Society Te Apārangi, 17MAU309CSG to L.L.), and a Massey University Research Fund to L.L. L.L. is supported by a Rutherford Foundation Discovery Fellowship. J.G. and R.C.-D. are funded by the US National Cancer Institute through Grant R01 CA227118 (sulstonproject.org). M.Z.A. is funded by NIH Grant R01AI148788 and NSF CAREER 2046863.

Author contributions: A.M.M.C., J.A., L.L., M.L.H., M.Z.A., B.T., J.G., R.C.-D., and H.R.P. designed research; A.M.M.C. and A.M.P. wrote the paper; and J.A., L.L., M.L.H., M.Z.A., B.T., J.G., R.C.-D., and H.R.P. contributed to drafting text.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2115860119/-/DCSupplemental.
