Compiled by @KatLH@scholar.social with input from many others. I have been regularly using the fedi since 2018. Thank you to those who responded to my call-out, and please get in touch if you’d like to be credited.
I tooted a call-out seeking different people’s views on the ethical complexities of academics treating the fediverse as a source of research data (I hesitate to even use the term ‘data’ in case it implies a resource or commodity). By ‘the fediverse’, I am mainly referring to Mastodon and its forks.
My intent is to provide a non-exhaustive list of ethical issues for researchers. I have written this in plain language for broad accessibility. Those contemplating research should not assume that something that might be seen as acceptable and innocuous on social media platforms such as Twitter and Facebook is acceptable or innocuous within the fedi. Skilled social media scholars will already have good practices with these kinds of things, but experience tells us that many researchers still do not.
This is not a new topic and others have tackled it in far more detail and with greater insight and consultation than I have here – please see the list of links at the end, most of which I consider essential reading. I hope that this document is a useful summary and helps avert harm.
Terms of Service
At the most basic level, researchers need to be aware that every server has its own ToS that defines what is allowed. There is no central authority granting access. ToS are not consistent between different servers, and furthermore they may change over time. Fediverse content can involve many, many different servers.
While it is a very obvious point that researchers need to comply with all relevant ToS, unfortunately there are already egregious examples of this not being done.
Some terms may explicitly address the use of data for research, while others might be more general about the use of content hosted by that server (e.g. the mastodon.nz server rules linked at the end). I suggest that anyone contemplating research with the fedi starts by getting in touch with the relevant server admin(s).
Many people on the fediverse explicitly opt out of data collection via the bio in their profile, which might contain something like “I do not consent to my data being used […]”. However, there is a general expectation that the minimum bar to clear should be opt-in via informed consent. There are several reasons for this, some of which I outline below. Unfortunately, there is a long history of private data being leaked from the fediverse simply because it’s technically possible.
Server admins will actively defend safety and privacy. Actions by anyone that compromise safety will almost certainly result in moderation actions, such as suspending the offending account or, if the researcher’s own admin does not take action, defederating from the researcher’s server. This means that a server that aims to be ‘open’ or ‘objective’ will achieve the opposite, as its reluctance to defederate will be met by defederation by others.
There are communities within the fedi that have been directly harmed by unethical research in the past (for example, First Nations and transgender communities). Hence there is a pre-existing trust deficit.
Furthermore, the historically extractive nature of research and the impetus to ‘publish or perish’ are well understood by many people on the fediverse. While in some areas academia is getting better at foregrounding the needs and desires of communities through participant-led research, most would argue this is still the exception rather than the rule.
Researchers might therefore encounter a general sense of wariness before earning trust and obtaining informed consent.
Audience and privacy
Fediverse software offers a suite of privacy controls, such as per-post privacy settings, alongside the different moderation policies of each server. What is visible to whom can get quite complex depending on federation and vantage point.
The visibility of content to a particular user is not necessarily indicative of the intentions of the content’s author. For example, posts on specific servers (such as those run by and for marginalised groups) may well only be intended to be read or shared by that community and not by outsiders, regardless of technical visibility.
A fediverse user’s understanding and preferences relating to privacy may be in flux as they interact with the fedi. Once again this means that researchers shouldn’t make assumptions about what is acceptable use of a person’s content.
Many people on the fedi are members of communities historically targeted by hate groups – this includes some academic communities, too. Defederation from untrustworthy servers is an important safety measure, but data collection can compromise defederation and create risks.
As one person who responded to me explained:
I am on server A. I have preemptively suspended server C for safety. An academic (or journalist) comes along from server B without any knowledge of this and puts A and C together into some dataset. Now C can know of my existence.
On this note, academics rebuilding their networks post-Twitter should remember to ask permission before adding anyone to public lists.
Much of the fediverse has a strong sense of relationships, mutual aid and a duty not to betray one another. Researchers should not assume that their understanding of social relationships gleaned from other social media platforms or communities is transferable to the fediverse. Incentives or reasons to participate in research might be different as a result.
Nobody can see the entire fediverse. Different servers can have very different stances on moderation ranging from heavily defederated to completely unmoderated. As a result, what is visible can vary wildly. This means aggregate analysis is highly contextual and research purporting to offer an objective view is junk.
Despite everything mentioned in this document regarding marginalised communities, much of the fedi is hegemonically white, tech-oriented, able-bodied and rich.
I highly recommend this blog post, which goes into more detail about how researchers can misinterpret social worlds and cause harm.
It’s not all negative
Compared to the large social media platforms, the fedi can be a great site of dialogue and engagement. Opt-in research tends to get a lot of traction, provided researchers are trusted and the framing doesn’t grate or harm.
Server admins and many other people on the fedi are excellent sources of guidance and could even become potential collaborators. I imagine that open source research tools and creative participatory methods will lead to all kinds of cool things.
Links and further reading
An Open Letter from the Mastodon Community. https://www.sunclipse.org/wp-content/downloads/2020/01/open-letter.html
Server Rules: use of content hosted by mastodon.nz. https://mastodon.nz/about
maloki, On Scraping Mastodon. https://blogghoran.se/2020/01/27/on-scraping-mastodon/
Robert W. Gehl, FOSS Academic. https://fossacademic.tech/2022/10/18/notesOnNobreEtAl.html
Elias, T., Ritchie, L., Gevalt, G., & Bowles, K. (2020). A pedagogy of ‘small’: Principles and values in small, open, online communities. In Open(ing) Education (pp. 364-389). Brill. (Thoughts from 2018 on what to consider.)